Science.gov

Sample records for conserved sequence motif

  1. BlockLogo: visualization of peptide and sequence motif conservation.

    PubMed

    Olsen, Lars Rønn; Kudahl, Ulrich Johan; Simon, Christian; Sun, Jing; Schönbach, Christian; Reinherz, Ellis L; Zhang, Guang Lan; Brusic, Vladimir

    2013-12-31

    BlockLogo is a web-server application for the visualization of protein and nucleotide fragments, continuous protein sequence motifs, and discontinuous sequence motifs using calculation of block entropy from multiple sequence alignments. The user input consists of a multiple sequence alignment, selection of motif positions, type of sequence, and output format definition. The output has BlockLogo along with the sequence logo, and a table of motif frequencies. We deployed BlockLogo as an online application and have demonstrated its utility through examples that show visualization of T-cell epitopes and B-cell epitopes (both continuous and discontinuous). Our additional example shows a visualization and analysis of structural motifs that determine the specificity of peptide binding to HLA-DR molecules. The BlockLogo server also employs selected experimentally validated prediction algorithms to enable on-the-fly prediction of MHC binding affinity to 15 common HLA class I and class II alleles as well as visual analysis of discontinuous epitopes from multiple sequence alignments. It enables the visualization and analysis of structural and functional motifs that are usually described as regular expressions. It provides a compact view of discontinuous motifs composed of distant positions within biological sequences. BlockLogo is available at: http://research4.dfci.harvard.edu/cvc/blocklogo/ and http://met-hilab.bu.edu/blocklogo/. PMID:24001880
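
    The block entropy mentioned in the abstract can be illustrated with a small calculation. The following is a minimal sketch, not BlockLogo's actual implementation: it assumes that a "block" is the string of residues found at the user-selected alignment columns and that block entropy is the Shannon entropy (in bits) of the block frequencies. The function names and the toy alignment are hypothetical.

```python
from collections import Counter
from math import log2


def block_frequencies(alignment, positions):
    """Count each block, i.e. the residues at the chosen 0-based columns, across sequences."""
    blocks = ["".join(seq[i] for i in positions) for seq in alignment]
    return Counter(blocks)


def block_entropy(alignment, positions):
    """Shannon entropy (bits) of the block distribution (assumed definition)."""
    counts = block_frequencies(alignment, positions)
    total = sum(counts.values())
    return -sum((n / total) * log2(n / total) for n in counts.values())


# Toy alignment (made up); columns 0, 2 and 4 form a discontinuous motif.
alignment = ["ACDEF", "ACDEF", "ACNEF", "GCNEY"]
positions = [0, 2, 4]

print(block_frequencies(alignment, positions))  # ADF twice, ANF once, GNY once
print(block_entropy(alignment, positions))      # 1.5
```

    Because the columns are passed as an arbitrary list, the same function covers both continuous and discontinuous motifs.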

  2. The BsaHI restriction-modification system: Cloning, sequencing and analysis of conserved motifs

    PubMed Central

    Neely, Robert K; Roberts, Richard J

    2008-01-01

    Background Restriction and modification enzymes typically recognise short DNA sequences of between two and eight bases in length. Understanding the mechanism of this recognition represents a significant challenge that we begin to address for the BsaHI restriction-modification system, which recognises the six base sequence GRCGYC. Results The DNA sequences of the genes for the BsaHI methyltransferase, bsaHIM, and restriction endonuclease, bsaHIR, have been determined (GenBank accession #EU386360), cloned and expressed in E. coli. Both the restriction endonuclease and methyltransferase enzymes share significant similarity with a group of 6 other enzymes comprising the restriction-modification systems HgiDI and HgiGI and the putative HindVP, NlaCORFDP, NpuORFC228P and SplZORFNP restriction-modification systems. A sequence alignment of these homologues shows that their amino acid sequences are largely conserved and highlights several motifs of interest. We target one such conserved motif, reading SPERRFD, at the C-terminal end of the bsaHIR gene. A mutational analysis of these amino acids indicates that the motif is crucial for enzymatic activity. Sequence alignment of the methyltransferase gene reveals a short motif within the target recognition domain that is conserved among enzymes recognising the same sequences. Thus, this motif may be used as a diagnostic tool to define the recognition sequences of the cytosine C5 methyltransferases. Conclusion We have cloned and sequenced the BsaHI restriction and modification enzymes. We have identified a region of the R. BsaHI enzyme that is crucial for its activity. Analysis of the amino acid sequence of the BsaHI methyltransferase enzyme led us to propose two new motifs that can be used in the diagnosis of the recognition sequence of the cytosine C5-methyltransferases. PMID:18479503
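
    The recognition sequence GRCGYC uses IUPAC ambiguity codes (R = A or G, Y = C or T). As a rough illustration of how such a degenerate site can be located in a DNA string, the sketch below expands the codes into a regular expression. The helper names and the test sequence are hypothetical and are not taken from the paper.

```python
import re

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def site_regex(site):
    """Expand a degenerate recognition sequence into a regular expression string."""
    return "".join(IUPAC[base] for base in site.upper())


def find_sites(site, dna):
    """Return 0-based start positions of all (possibly overlapping) matches."""
    lookahead = re.compile(f"(?=({site_regex(site)}))")
    return [m.start() for m in lookahead.finditer(dna.upper())]


print(site_regex("GRCGYC"))                        # G[AG]CG[CT]C
print(find_sites("GRCGYC", "TTGACGTCAAGGCGCCTT"))  # [2, 10]
```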

  3. A search for small noncoding RNAs in Staphylococcus aureus reveals a conserved sequence motif for regulation

    PubMed Central

    Geissmann, Thomas; Chevalier, Clément; Cros, Marie-Josée; Boisset, Sandrine; Fechter, Pierre; Noirot, Céline; Schrenzel, Jacques; François, Patrice; Vandenesch, François; Gaspin, Christine; Romby, Pascale

    2009-01-01

    Bioinformatic analysis of the intergenic regions of Staphylococcus aureus predicted multiple regulatory regions. From this analysis, we characterized 11 novel noncoding RNAs (RsaA-K) that are expressed in several S. aureus strains under different experimental conditions. Many of them accumulate in the late-exponential phase of growth. All ncRNAs are stable and their expression is Hfq-independent. The transcription of several of them is regulated by the alternative sigma B factor (RsaA, D and F) while the expression of RsaE is agrA-dependent. Six of these ncRNAs are specific to S. aureus, four are conserved in other Staphylococci, and RsaE is also present in Bacillaceae. Transcriptomic and proteomic analysis indicated that RsaE regulates the synthesis of proteins involved in various metabolic pathways. Phylogenetic analysis combined with RNA structure probing, searches for RsaE-mRNA base pairing, and toeprinting assays indicate that a conserved and unpaired UCCC sequence motif of RsaE binds to target mRNAs and prevents the formation of the ribosomal initiation complex. This study unexpectedly shows that most of the novel ncRNAs carry the conserved C-rich motif, suggesting that they are members of a class of ncRNAs that target mRNAs by a shared mechanism. PMID:19786493

  4. An approach to delineate primers for a group of poorly conserved sequences incorporating the common motif region.

    PubMed

    Sahu, Mousumi; Sahu, Jagajjit; Sahoo, Smita; Dehury, Budheswar; Sarma, Kishore; Sarmah, Ranjan; Sen, Priyabrata; Modi, Mahendra Kumar; Barooah, Madhumita

    2012-01-01

    Glutathione synthetase (gshB) has previously been reported to confer tolerance to acidic soil condition in Rhizobium species. Cloning the gene coding for this enzyme necessitates the designing of proper primer sets which in turn depends on the identification of high quality sequence similarity in multiple global alignments. In this experiment, a group of homologous gene sequences related to gshB gene (accession no: gi-86355669:327589-328536) of Rhizobium etli CFN 42, were extracted from NCBI nucleotide sequence databases using BLASTN and were analyzed for designing degenerate primers. However, the T-coffee multiple global alignment results did not show any block of conserved region for the above sequence set to design the primers. Therefore, we attempted to identify the location of common motif region based on multiple local alignments employing the MEME algorithm supported with MAST and Primer3. The results revealed some common motif regions that enabled us to design the primer sets for related gshB gene sequences. The result will be validated in wet lab. PMID:22419837

  5. Mining protein sequences for motifs.

    PubMed

    Narasimhan, Giri; Bu, Changsong; Gao, Yuan; Wang, Xuning; Xu, Ning; Mathee, Kalai

    2002-01-01

    We use methods from Data Mining and Knowledge Discovery to design an algorithm for detecting motifs in protein sequences. The algorithm assumes that a motif is constituted by the presence of a "good" combination of residues in appropriate locations of the motif. The algorithm attempts to compile such good combinations into a "pattern dictionary" by processing an aligned training set of protein sequences. The dictionary is subsequently used to detect motifs in new protein sequences. Statistical significance of the detection results is ensured by statistically determining the various parameters of the algorithm. Based on this approach, we have implemented a program called GYM. The Helix-Turn-Helix motif was used as a model system on which to test our program. The program was also extended to detect Homeodomain motifs. The detection results for the two motifs compare favorably with existing programs. In addition, the GYM program provides a lot of useful information about a given protein sequence. PMID:12487759

  6. [Conserved motifs in voltage sensing proteins].

    PubMed

    Wang, Chang-He; Xie, Zhen-Li; Lv, Jian-Wei; Yu, Zhi-Dan; Shao, Shu-Li

    2012-08-25

    This paper aimed to study conserved motifs of voltage sensing proteins (VSPs) and establish a voltage sensing model. All VSPs were collected from the UniProt database using a comprehensive keyword search followed by manual curation, and the results indicated that there are only two types of known VSPs, voltage gated ion channels and voltage dependent phosphatases. All the VSPs have a common domain of four helical transmembrane segments (TMS, S1-S4), which constitute the voltage sensing module of the VSPs. The S1 segment was shown to be responsible for membrane targeting and insertion of these proteins, while the S2-S4 segments, which can sense membrane potential, were responsible for protein properties. Conserved motifs/residues of each TMS and their functional significance were identified using profile-to-profile sequence alignments. Conserved motifs in these four segments are strikingly similar for all VSPs; in particular, the conserved motif [RK]-X(2)-R-X(2)-R-X(2)-[RK] was present in all the S4 segments, with positively charged arginine (R) alternating with two hydrophobic or uncharged residues. Movement of these arginines across the membrane electric field is the core mechanism by which the VSPs detect changes in membrane potential. The negatively charged aspartate (D) in the S3 segment is universally conserved in all the VSPs, suggesting that the aspartate residue may be involved in voltage sensing properties of VSPs as well as the electrostatic interactions with the positively charged residues in the S4 segment, which may enhance the thermodynamic stability of the S4 segments in plasma membrane. PMID:22907298
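
    The S4 motif [RK]-X(2)-R-X(2)-R-X(2)-[RK] is written in a PROSITE-like notation. As an illustration only, the sketch below converts such a pattern into a regular expression and searches a sequence with it. The converter handles just the elements used in this motif (single residues, bracketed alternatives, X, and fixed repeat counts), and the test sequence is invented rather than a real S4 segment.

```python
import re

# One pattern element: a residue, X, or a [..] class, optionally followed by (n).
ELEMENT = re.compile(r"(\[[A-Z]+\]|[A-Za-z])(?:\((\d+)\))?")


def prosite_to_regex(pattern):
    """Convert a simple PROSITE-style pattern into a regular expression string."""
    parts = []
    for element in pattern.split("-"):
        match = ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        residue, count = match.groups()
        if residue.upper() == "X":  # X stands for any residue
            residue = "."
        parts.append(residue + (f"{{{count}}}" if count else ""))
    return "".join(parts)


s4_regex = prosite_to_regex("[RK]-X(2)-R-X(2)-R-X(2)-[RK]")
print(s4_regex)  # [RK].{2}R.{2}R.{2}[RK]

# Invented sequence with basic residues at every third position.
hit = re.search(s4_regex, "AILRVIRLVRVFRIFKL")
print(hit.start(), hit.group())  # 3 RVIRLVRVFR
```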

  7. Distance conservation of transcriptional and splicing regulatory motifs

    NASA Astrophysics Data System (ADS)

    Lu, Jun; Ding, Changjiang

    2012-09-01

    Distance conservation is a new kind of genomic evolutionary conservation. Transcriptional and splicing regulatory k-mer motifs are functionally important DNA sequence elements. We demonstrated that the distance between these k-mer pairs is evolutionarily conserved in genomic sequences. This kind of conservation is not based on the strict location of bases in genome sequences and does not depend on an excess frequency of occurrence of k-mers. By utilizing the conservation of k-mer distance, it is possible to design a non-alignment-based approach to quickly identify transcriptional or splicing regulatory motifs on a genome-wide scale. In this paper we summarize our previous studies on distance conservation, introduce the method of distance conservation, and indicate the prospects of its application.

  8. A Gibbs sampler for motif detection in phylogenetically close sequences

    NASA Astrophysics Data System (ADS)

    Siddharthan, Rahul; van Nimwegen, Erik; Siggia, Eric

    2004-03-01

    Genes are regulated by transcription factors that bind to DNA upstream of genes and recognize short conserved "motifs" in a random intergenic "background". Motif-finders such as the Gibbs sampler compare the probability of these short sequences being represented by "weight matrices" to the probability of their arising from the background "null model", and explore this space (analogous to a free-energy landscape). But closely related species may show conservation not because of functional sites but simply because they have not had sufficient time to diverge, so conventional methods will fail. We introduce a new Gibbs sampler algorithm that accounts for common ancestry when searching for motifs, while requiring minimal "prior" assumptions on the number and types of motifs, assessing the significance of detected motifs by "tracking" clusters that stay together. We apply this scheme to motif detection in sporulation-cycle genes in the yeast S. cerevisiae, using recent sequences of other closely-related Saccharomyces species.
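
    The comparison described in this abstract, between a weight-matrix model and a background null model, is commonly expressed as a log-odds score. The sketch below illustrates only that scoring step. It is not the authors' Gibbs sampler and ignores their phylogenetic correction. The 3-position matrix and the uniform background are made up for the example.

```python
from math import log2

# Hypothetical weight matrix: one probability distribution per motif position.
WEIGHT_MATRIX = [
    {"A": 0.70, "C": 0.10, "G": 0.10, "T": 0.10},
    {"A": 0.10, "C": 0.10, "G": 0.70, "T": 0.10},
    {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
]
# Assumed uniform background ("null model").
BACKGROUND = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}


def log_odds(site, weight_matrix=WEIGHT_MATRIX, background=BACKGROUND):
    """log2 of P(site | weight matrix) / P(site | background)."""
    return sum(
        log2(weight_matrix[i][base] / background[base])
        for i, base in enumerate(site)
    )


print(log_odds("AGT"))  # positive: the site looks more like the motif
print(log_odds("CCT"))  # negative: the site looks more like background
```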

  9. Short sequence motifs, overrepresented in mammalian conserved non-coding sequences

    SciTech Connect

    Minovitsky, Simon; Stegmaier, Philip; Kel, Alexander; Kondrashov, Alexey S.; Dubchak, Inna

    2007-02-21

    Background: A substantial fraction of non-coding DNA sequences of multicellular eukaryotes is under selective constraint. In particular, ~5 percent of the human genome consists of conserved non-coding sequences (CNSs). CNSs differ from other genomic sequences in their nucleotide composition and must play important functional roles, which mostly remain obscure. Results: We investigated relative abundances of short sequence motifs in all human CNSs present in the human/mouse whole-genome alignments vs. three background sets of sequences: (i) weakly conserved or unconserved non-coding sequences (non-CNSs); (ii) near-promoter sequences (located between nucleotides -500 and -1500, relative to a start of transcription); and (iii) random sequences with the same nucleotide composition as that of CNSs. When compared to non-CNSs and near-promoter sequences, CNSs possess an excess of AT-rich motifs, often containing runs of identical nucleotides. In contrast, when compared to random sequences, CNSs contain an excess of GC-rich motifs which, however, lack CpG dinucleotides. Thus, abundance of short sequence motifs in human CNSs, taken as a whole, is mostly determined by their overall compositional properties and not by overrepresentation of any specific short motifs. These properties are: (i) high AT-content of CNSs, (ii) a tendency, probably due to context-dependent mutation, of A's and T's to clump, (iii) presence of short GC-rich regions, and (iv) avoidance of CpG contexts, due to their hypermutability. Only a small number of short motifs, overrepresented in all human CNSs are similar to binding sites of transcription factors from the FOX family. Conclusion: Human CNSs as a whole appear to be too broad a class of sequences to possess strong footprints of any short sequence-specific functions. Such footprints should be studied at the level of functional subclasses of CNSs, such as those which flank genes with a particular pattern of expression. Overall properties of CNSs are affected by patterns in

  10. Comparison of SIV and HIV-1 genomic RNA structures reveals impact of sequence evolution on conserved and non-conserved structural motifs.

    PubMed

    Pollom, Elizabeth; Dang, Kristen K; Potter, E Lake; Gorelick, Robert J; Burch, Christina L; Weeks, Kevin M; Swanstrom, Ronald

    2013-01-01

    RNA secondary structure plays a central role in the replication and metabolism of all RNA viruses, including retroviruses like HIV-1. However, structures with known function represent only a fraction of the secondary structure reported for HIV-1(NL4-3). One tool to assess the importance of RNA structures is to examine their conservation over evolutionary time. To this end, we used SHAPE to model the secondary structure of a second primate lentiviral genome, SIVmac239, which shares only 50% sequence identity at the nucleotide level with HIV-1(NL4-3). Only about half of the paired nucleotides are paired in both genomic RNAs and, across the genome, just 71 base pairs form with the same pairing partner in both genomes. On average the RNA secondary structure is thus evolving at a much faster rate than the sequence. Structure at the Gag-Pro-Pol frameshift site is maintained but in a significantly altered form, while the impact of selection for maintaining a protein binding interaction can be seen in the conservation of pairing partners in the small RRE stems where Rev binds. Structures that are conserved between SIVmac239 and HIV-1(NL4-3) also occur at the 5' polyadenylation sequence, in the plus strand primer sites, PPT and cPPT, and in the stem-loop structure that includes the first splice acceptor site. The two genomes are adenosine-rich and cytidine-poor. The structured regions are enriched in guanosines, while unpaired regions are enriched in adenosines, and functionally important structures have stronger base pairing than nonconserved structures. We conclude that much of the secondary structure is the result of fortuitous pairing in a metastable state that reforms during sequence evolution. However, secondary structure elements with important function are stabilized by higher guanosine content that allows regions of structure to persist as sequence evolution proceeds, and, within the confines of selective pressure, allows structures to evolve. PMID:23593004

  11. Detecting correlations among functional-sequence motifs

    NASA Astrophysics Data System (ADS)

    Pirino, Davide; Rigosa, Jacopo; Ledda, Alice; Ferretti, Luca

    2012-06-01

    Sequence motifs are words of nucleotides in DNA with biological functions, e.g., gene regulation. Identification of such words proceeds through rejection of Markov models on the expected motif frequency along the genome. Additional biological information can be extracted from the correlation structure among patterns of motif occurrences. In this paper a log-linear multivariate intensity Poisson model is estimated via expectation maximization on a set of motifs along the genome of E. coli K12. The proposed approach allows for excitatory as well as inhibitory interactions among motifs and between motifs and other genomic features like gene occurrences. Our findings confirm previous stylized facts about such types of interactions and shed new light on genome-maintenance functions of some particular motifs. We expect these methods to be applicable to a wider set of genomic features.

  12. Detecting correlations among functional-sequence motifs.

    PubMed

    Pirino, Davide; Rigosa, Jacopo; Ledda, Alice; Ferretti, Luca

    2012-06-01

    Sequence motifs are words of nucleotides in DNA with biological functions, e.g., gene regulation. Identification of such words proceeds through rejection of Markov models on the expected motif frequency along the genome. Additional biological information can be extracted from the correlation structure among patterns of motif occurrences. In this paper a log-linear multivariate intensity Poisson model is estimated via expectation maximization on a set of motifs along the genome of E. coli K12. The proposed approach allows for excitatory as well as inhibitory interactions among motifs and between motifs and other genomic features like gene occurrences. Our findings confirm previous stylized facts about such types of interactions and shed new light on genome-maintenance functions of some particular motifs. We expect these methods to be applicable to a wider set of genomic features. PMID:23005179

  13. QGRS-Conserve: a computational method for discovering evolutionarily conserved G-quadruplex motifs

    PubMed Central

    2014-01-01

    Background Nucleic acids containing guanine tracts can form quadruplex structures via non-Watson-Crick base pairing. Formation of G-quadruplexes is associated with the regulation of important biological functions such as transcription, genetic instability, DNA repair, DNA replication, epigenetic mechanisms, regulation of translation, and alternative splicing. G-quadruplexes play important roles in human diseases and are being considered as targets for a variety of therapies. Identification of functional G-quadruplexes and the study of their overall distribution in genomes and transcriptomes is an important pursuit. Traditional computational methods map sequence motifs capable of forming G-quadruplexes but have difficulty in distinguishing motifs that occur by chance from ones which fold into G-quadruplexes. Results We present Quadruplex forming ‘G’-rich sequences (QGRS)-Conserve, a computational method that calculates motif conservation across exomes and supports filtering to provide researchers with more precise methods of studying G-quadruplex distribution patterns. Our method quantitatively evaluates conservation between quadruplexes found in homologous nucleotide sequences based on several motif structural characteristics. QGRS-Conserve also efficiently manages overlapping G-quadruplex sequences such that the resulting datasets can be analyzed effectively. Conclusions We have applied QGRS-Conserve to identify a large number of G-quadruplex motifs in the human exome conserved across several mammalian and non-mammalian species. We have successfully identified multiple homologs of many previously published G-quadruplexes that play post-transcriptional regulatory roles in human genes. Preliminary large-scale analysis identified many homologous G-quadruplexes in the 5′- and 3′-untranslated regions of mammalian species. An expectedly smaller set of G-quadruplex motifs was found to be conserved across larger phylogenetic distances. QGRS-Conserve provides means

  14. Detecting seeded motifs in DNA sequences.

    PubMed

    Pizzi, Cinzia; Bortoluzzi, Stefania; Bisognin, Andrea; Coppe, Alessandro; Danieli, Gian Antonio

    2005-01-01

    The problem of detecting DNA motifs with functional relevance in real biological sequences is difficult due to a number of biological, statistical and computational issues and also because of the lack of knowledge about the structure of searched patterns. Many algorithms are implemented in fully automated processes, which are often based upon a guess of input parameters from the user at the very first step. In this paper, we present a novel method for the detection of seeded DNA motifs, composed of regions with a different extent of variability. The method is based on a multi-step approach, which was implemented in a motif searching web tool (MOST). Overrepresented exact patterns are extracted from input sequences and clustered to produce motif core regions, which are then extended and scored to generate seeded motifs. The combination of automated pattern discovery algorithms and different display tools for the evaluation and selection of results at several analysis steps can potentially lead to much more meaningful results than complete automation can produce. Experimental results on different real yeast and human datasets proved the methodology to be a promising solution for finding seeded motifs. The MOST web tool is freely available at http://telethon.bio.unipd.it/bioinfo/MOST. PMID:16141193

  15. Detecting seeded motifs in DNA sequences

    PubMed Central

    Pizzi, Cinzia; Bortoluzzi, Stefania; Bisognin, Andrea; Coppe, Alessandro; Danieli, Gian Antonio

    2005-01-01

    The problem of detecting DNA motifs with functional relevance in real biological sequences is difficult due to a number of biological, statistical and computational issues and also because of the lack of knowledge about the structure of searched patterns. Many algorithms are implemented in fully automated processes, which are often based upon a guess of input parameters from the user at the very first step. In this paper, we present a novel method for the detection of seeded DNA motifs, composed of regions with a different extent of variability. The method is based on a multi-step approach, which was implemented in a motif searching web tool (MOST). Overrepresented exact patterns are extracted from input sequences and clustered to produce motif core regions, which are then extended and scored to generate seeded motifs. The combination of automated pattern discovery algorithms and different display tools for the evaluation and selection of results at several analysis steps can potentially lead to much more meaningful results than complete automation can produce. Experimental results on different real yeast and human datasets proved the methodology to be a promising solution for finding seeded motifs. The MOST web tool is freely available at . PMID:16141193

  16. Discovering Motifs in Biological Sequences Using the Micron Automata Processor.

    PubMed

    Roy, Indranil; Aluru, Srinivas

    2016-01-01

    Finding approximately conserved sequences, called motifs, across multiple DNA or protein sequences is an important problem in computational biology. In this paper, we consider the (l, d) motif search problem of identifying one or more motifs of length l present in at least q of the n given sequences, with each occurrence differing from the motif in at most d substitutions. The problem is known to be NP-complete, and the largest solved instance reported to date is (26,11). We propose a novel algorithm for the (l, d) motif search problem using streaming execution over a large set of non-deterministic finite automata (NFA). This solution is designed to take advantage of the Micron Automata Processor, a new technology close to deployment that can execute multiple NFA in parallel. We demonstrate the capability for solving much larger instances of the (l, d) motif search problem using the resources available within a single automata processor board, by estimating run-times for problem instances (39,18) and (40,17). The paper serves as a useful guide to solving problems using this new accelerator technology. PMID:26886735
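
    To make the (l, d) problem statement concrete, the sketch below solves tiny instances by exhaustive enumeration. This is a plain brute-force baseline, not the NFA-based streaming algorithm proposed in the paper. It tests every one of the 4^l candidate strings and so is only usable for very small l. The function names and input sequences are hypothetical.

```python
from itertools import product


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def occurs(motif, seq, d):
    """True if some window of seq differs from motif in at most d substitutions."""
    l = len(motif)
    return any(hamming(motif, seq[i:i + l]) <= d for i in range(len(seq) - l + 1))


def ld_motifs(seqs, l, d, q, alphabet="ACGT"):
    """All length-l strings occurring (with <= d substitutions) in at least q sequences."""
    found = []
    for candidate in product(alphabet, repeat=l):
        motif = "".join(candidate)
        if sum(occurs(motif, s, d) for s in seqs) >= q:
            found.append(motif)
    return found


# Made-up input: ACGT is planted exactly once and with one substitution twice.
seqs = ["TTACGTTT", "GGACCTGG", "CCAGGTCC"]
print("ACGT" in ld_motifs(seqs, l=4, d=1, q=3))  # True
print(ld_motifs(seqs, l=4, d=0, q=3))            # []
```

    The exponential candidate space is what makes larger instances, such as those quoted in the abstract, infeasible for this naive approach.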

  17. Fast, Sensitive Discovery of Conserved Genome-Wide Motifs

    PubMed Central

    Ihuegbu, Nnamdi E.; Buhler, Jeremy

    2012-01-01

    Abstract Regulatory sites that control gene expression are essential to the proper functioning of cells, and identifying them is critical for modeling regulatory networks. We have developed Magma (Multiple Aligner of Genomic Multiple Alignments), a software tool for multiple species, multiple gene motif discovery. Magma identifies putative regulatory sites that are conserved across multiple species and occur near multiple genes throughout a reference genome. Magma takes as input multiple alignments that can include gaps. It uses efficient clustering methods that make it about 70 times faster than PhyloNet, a previous program for this task, with slightly greater sensitivity. We ran Magma on all non-coding DNA conserved between Caenorhabditis elegans and five additional species, about 70 Mbp in total, in <4 h. We obtained 2,309 motifs with lengths of 6–20 bp, each occurring at least 10 times throughout the genome, which collectively covered about 566 kbp of the genomes, approximately 0.8% of the input. Predicted sites occurred in all types of non-coding sequence but were especially enriched in the promoter regions. Comparisons to several experimental datasets show that Magma motifs correspond to a variety of known regulatory motifs. PMID:22300316

  18. Genomic analysis of membrane protein families: abundance and conserved motifs

    PubMed Central

    Liu, Yang; Engelman, Donald M; Gerstein, Mark

    2002-01-01

    Background Polytopic membrane proteins can be related to each other on the basis of the number of transmembrane helices and sequence similarities. Building on the Pfam classification of protein domain families, and using transmembrane-helix prediction and sequence-similarity searching, we identified a total of 526 well-characterized membrane protein families in 26 recently sequenced genomes. To this we added a clustering of a number of predicted but unclassified membrane proteins, resulting in a total of 637 membrane protein families. Results Analysis of the occurrence and composition of these families revealed several interesting trends. The number of assigned membrane protein domains has an approximately linear relationship to the total number of open reading frames (ORFs) in 26 genomes studied. Caenorhabditis elegans is an apparent outlier, because of its high representation of seven-span transmembrane (7-TM) chemoreceptor families. In all genomes, including that of C. elegans, the number of distinct membrane protein families has a logarithmic relation to the number of ORFs. Glycine, proline, and tyrosine locations tend to be conserved in transmembrane regions within families, whereas isoleucine, valine, and methionine locations are relatively mutable. Analysis of motifs in putative transmembrane helices reveals that GxxxG and GxxxxxxG (which can be written GG4 and GG7, respectively; see Materials and methods) are among the most prevalent. This was noted in earlier studies; we now find these motifs are particularly well conserved in families, however, especially those corresponding to transporters, symporters, and channels. Conclusions We carried out a genome-wide analysis on patterns of the classified polytopic membrane protein families and analyzed the distribution of conserved amino acids and motifs in the transmembrane helix regions in these families. PMID:12372142
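
    The GG4 and GG7 notation counts residue pairs separated by a fixed distance: GxxxG is a G at position i and another at i + 4, and GxxxxxxG is a G at i and another at i + 7. The sketch below shows one way such pairs could be counted in a helix sequence, including overlapping occurrences. The function name and the helix string are invented for illustration and do not come from the paper.

```python
def count_pairs(helix, first, second, spacing):
    """Count positions i where helix[i] == first and helix[i + spacing] == second."""
    return sum(
        1
        for i in range(len(helix) - spacing)
        if helix[i] == first and helix[i + spacing] == second
    )


# Invented transmembrane-like segment with glycines at positions 1, 5, 8 and 12.
helix = "LGAVLGIIGAVLG"

gg4 = count_pairs(helix, "G", "G", 4)  # GxxxG
gg7 = count_pairs(helix, "G", "G", 7)  # GxxxxxxG
print(gg4, gg7)  # 2 2
```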

  19. Sequence-Based Screening for Rare Enzymes: New Insights into the World of AMDases Reveal a Conserved Motif and 58 Novel Enzymes Clustering in Eight Distinct Families

    PubMed Central

    Maimanakos, Janine; Chow, Jennifer; Gaßmeyer, Sarah K.; Güllert, Simon; Busch, Florian; Kourist, Robert; Streit, Wolfgang R.

    2016-01-01

    Arylmalonate Decarboxylases (AMDases, EC 4.1.1.76) are very rare and mostly underexplored enzymes. Currently only four known and biochemically characterized representatives exist. However, their ability to decarboxylate α-disubstituted malonic acid derivatives to optically pure products without cofactors makes them attractive and promising candidates for the use as biocatalysts in industrial processes. Until now, AMDases could not be separated from other members of the aspartate/glutamate racemase superfamily based on their gene sequences. Within this work, a search algorithm was developed that enables a reliable prediction of AMDase activity for potential candidates. Based on specific sequence patterns and screening methods 58 novel AMDase candidate genes could be identified in this work. Thereby, AMDases with the conserved sequence pattern of Bordetella bronchiseptica’s prototype appeared to be limited to the classes of Alpha-, Beta-, and Gamma-proteobacteria. Amino acid homologies and comparison of gene surrounding sequences enabled the classification of eight enzyme clusters. Particularly striking is the accumulation of genes coding for different transporters of the tripartite tricarboxylate transporters family, TRAP transporters and ABC transporters as well as genes coding for mandelate racemases/muconate lactonizing enzymes that might be involved in substrate uptake or degradation of AMDase products. Further, three novel AMDases were characterized which showed a high enantiomeric excess (>99%) of the (R)-enantiomer of flurbiprofen. These are the recombinant AmdA and AmdV from Variovorax sp. strains HH01 and HH02, originated from soil, and AmdP from Polymorphum gilvum found by a data base search. Altogether our findings give new insights into the class of AMDases and reveal many previously unknown enzyme candidates with high potential for bioindustrial processes. PMID:27610105

  20. Sequence-Based Screening for Rare Enzymes: New Insights into the World of AMDases Reveal a Conserved Motif and 58 Novel Enzymes Clustering in Eight Distinct Families.

    PubMed

    Maimanakos, Janine; Chow, Jennifer; Gaßmeyer, Sarah K; Güllert, Simon; Busch, Florian; Kourist, Robert; Streit, Wolfgang R

    2016-01-01

    Arylmalonate Decarboxylases (AMDases, EC 4.1.1.76) are very rare and mostly underexplored enzymes. Currently only four known and biochemically characterized representatives exist. However, their ability to decarboxylate α-disubstituted malonic acid derivatives to optically pure products without cofactors makes them attractive and promising candidates for the use as biocatalysts in industrial processes. Until now, AMDases could not be separated from other members of the aspartate/glutamate racemase superfamily based on their gene sequences. Within this work, a search algorithm was developed that enables a reliable prediction of AMDase activity for potential candidates. Based on specific sequence patterns and screening methods 58 novel AMDase candidate genes could be identified in this work. Thereby, AMDases with the conserved sequence pattern of Bordetella bronchiseptica's prototype appeared to be limited to the classes of Alpha-, Beta-, and Gamma-proteobacteria. Amino acid homologies and comparison of gene surrounding sequences enabled the classification of eight enzyme clusters. Particularly striking is the accumulation of genes coding for different transporters of the tripartite tricarboxylate transporters family, TRAP transporters and ABC transporters as well as genes coding for mandelate racemases/muconate lactonizing enzymes that might be involved in substrate uptake or degradation of AMDase products. Further, three novel AMDases were characterized which showed a high enantiomeric excess (>99%) of the (R)-enantiomer of flurbiprofen. These are the recombinant AmdA and AmdV from Variovorax sp. strains HH01 and HH02, originated from soil, and AmdP from Polymorphum gilvum found by a data base search. Altogether our findings give new insights into the class of AMDases and reveal many previously unknown enzyme candidates with high potential for bioindustrial processes. PMID:27610105

  1. Identification of imine reductase-specific sequence motifs.

    PubMed

    Fademrecht, Silvia; Scheller, Philipp N; Nestl, Bettina M; Hauer, Bernhard; Pleiss, Jürgen

    2016-05-01

    Chiral amines are valuable building blocks for the production of a variety of pharmaceuticals, agrochemicals and other specialty chemicals. Only recently, imine reductases (IREDs) were discovered which catalyze the stereoselective reduction of imines to chiral amines. Although several IREDs were biochemically characterized in the last few years, knowledge of the reaction mechanism and the molecular basis of substrate specificity and stereoselectivity is limited. To gain further insights into the sequence-function relationships, the Imine Reductase Engineering Database (www.IRED.BioCatNet.de) was established and a systematic analysis of 530 putative IREDs was performed. A standard numbering scheme based on R-IRED-Sk was introduced to facilitate the identification and communication of structurally equivalent positions in different proteins. A conservation analysis revealed a highly conserved cofactor binding region and a predominantly hydrophobic substrate binding cleft. Two IRED-specific motifs were identified, the cofactor binding motif GLGxMGx5 [ATS]x4 Gx4 [VIL]WNR[TS]x2 [KR] and the active site motif Gx[DE]x[GDA]x[APS]x3 {K}x[ASL]x[LMVIAG]. Our results indicate a preference toward NADPH for all IREDs and explain why, despite their sequence similarity to β-hydroxyacid dehydrogenases (β-HADs), no conversion of β-hydroxyacids has been observed. Superfamily-specific conservations were investigated to explore the molecular basis of their stereopreference. Based on our analysis and previous experimental results on IRED mutants, an exclusive role of standard position 187 for stereoselectivity is excluded. Alternatively, two standard positions 139 and 194 were identified which are superfamily-specifically conserved and differ in R- and S-selective enzymes. Proteins 2016; 84:600-610. © 2016 Wiley Periodicals, Inc. PMID:26857686
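
    The two IRED motifs above are written in a compact PROSITE-like shorthand. As a minimal sketch, assuming that "x" is any residue, "xN" is N arbitrary residues, "[...]" is a set of allowed residues and "{...}" is a set of excluded residues (the usual PROSITE reading, not something the abstract spells out), the motifs could be translated into regular expressions as follows. The function name motif_to_regex and the test peptide are illustrative only.

```python
import re

def motif_to_regex(motif: str) -> str:
    """Translate a PROSITE-like motif string into a Python regular expression."""
    motif = motif.replace(" ", "")           # drop layout spaces
    out = []
    i = 0
    while i < len(motif):
        ch = motif[i]
        if ch == "[":                         # allowed set, e.g. [ATS]
            j = motif.index("]", i)
            out.append("[" + motif[i + 1:j] + "]")
            i = j + 1
        elif ch == "{":                       # excluded set, e.g. {K} (assumed meaning)
            j = motif.index("}", i)
            out.append("[^" + motif[i + 1:j] + "]")
            i = j + 1
        elif ch == "x":                       # wildcard with optional repeat count
            j = i + 1
            while j < len(motif) and motif[j].isdigit():
                j += 1
            count = motif[i + 1:j]
            out.append("." + ("{" + count + "}" if count else ""))
            i = j
        else:                                 # literal residue
            out.append(ch)
            i += 1
    return "".join(out)

# Motif strings copied from the abstract.
COFACTOR = "GLGxMGx5 [ATS]x4 Gx4 [VIL]WNR[TS]x2 [KR]"
ACTIVE_SITE = "Gx[DE]x[GDA]x[APS]x3 {K}x[ASL]x[LMVIAG]"

cofactor_re = re.compile(motif_to_regex(COFACTOR))
active_site_re = re.compile(motif_to_regex(ACTIVE_SITE))

# Synthetic peptide built to satisfy the active site motif (not a real IRED).
synthetic = "GADAG" + "A" * 9 + "L"
print(bool(active_site_re.fullmatch(synthetic)))
```

    Scanning a real protein would use cofactor_re.search(sequence) rather than fullmatch; the sketch does not use the standard numbering scheme described in the abstract.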

  2. D-MATRIX: A web tool for constructing weight matrix of conserved DNA motifs

    PubMed Central

    Sen, Naresh; Mishra, Manoj; Khan, Feroz; Meena, Abha; Sharma, Ashok

    2009-01-01

    Despite considerable efforts to date, DNA motif prediction in whole genomes remains a challenge for researchers. Current genome-wide motif prediction tools require either a direct pattern sequence (for a single motif) or a weight matrix (for multiple motifs). Although there are known motif pattern databases and tools for genome-level prediction, there is no tool for weight matrix construction. Considering this, we developed the D-MATRIX tool, which predicts the different types of weight matrix based on a user-defined aligned motif sequence set and motif width. For retrieval of known motif sequences, the user can access commonly used databases such as TFD, RegulonDB, DBTBS and Transfac. The D-MATRIX program uses a simple statistical approach for weight matrix construction, and the matrix can be converted into different file formats according to user requirements. It provides the possibility to identify conserved motifs in co-regulated genes or the whole genome. As an example, we successfully constructed the weight matrix of the LexA transcription factor binding site with the help of known SOS-box cis-regulatory elements in the Deinococcus radiodurans genome. The algorithm is implemented in C-Sharp and wrapped in ASP.Net to maintain a user-friendly web interface. The D-MATRIX tool is accessible through the CIMAP domain network. Availability: http://203.190.147.116/dmatrix/ PMID:19759861
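
    The abstract does not give the formula D-MATRIX uses, so the following is only a generic sketch of how a weight matrix can be built from an aligned set of motif sequences of fixed width. It assumes a log2-odds matrix with a pseudocount and a uniform background of 0.25 per base; the function names and the four example sites are hypothetical and are not LexA binding sites.

```python
import math
from collections import Counter

def weight_matrix(aligned, pseudocount=1.0, background=0.25):
    """Return one dict per motif position mapping A/C/G/T to a log2-odds score."""
    width = len(aligned[0])
    if any(len(s) != width for s in aligned):
        raise ValueError("all motif sequences must have the same width")
    total = len(aligned) + 4 * pseudocount
    matrix = []
    for pos in range(width):
        counts = Counter(seq[pos].upper() for seq in aligned)
        column = {}
        for base in "ACGT":
            freq = (counts[base] + pseudocount) / total   # smoothed frequency
            column[base] = math.log2(freq / background)   # log-odds vs background
        matrix.append(column)
    return matrix

def score(matrix, site):
    """Sum the per-position scores of a candidate site of the same width."""
    return sum(col[base] for col, base in zip(matrix, site.upper()))

# Hypothetical aligned sites, for illustration only.
sites = ["TTGACA", "TTGACT", "TTTACA", "TAGACA"]
pwm = weight_matrix(sites)
print(round(score(pwm, "TTGACA"), 3), round(score(pwm, "CCCCCC"), 3))
```

    A site resembling the aligned set scores well above an unrelated one; other matrix types (raw counts, frequencies) would drop the logarithm or the pseudocount.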

  3. CodingMotif: exact determination of overrepresented nucleotide motifs in coding sequences

    PubMed Central

    2012-01-01

    Background It has been increasingly appreciated that coding sequences harbor regulatory sequence motifs in addition to encoding protein. These sequence motifs are expected to be overrepresented in nucleotide sequences bound by a common protein or small RNA. However, detecting overrepresented motifs has been difficult because of interference by constraints at the protein level. Sampling-based approaches to solve this problem based on codon-shuffling have been limited to exploring only an infinitesimal fraction of the sequence space and by their use of parametric approximations. Results We present a novel O(N (log N)^2)-time algorithm, CodingMotif, to identify nucleotide-level motifs of unusual copy number in protein-coding regions. Using a new dynamic programming algorithm we are able to exhaustively calculate the distribution of the number of occurrences of a motif over all possible coding sequences that encode the same amino acid sequence, given a background model for codon usage and dinucleotide biases. Our method takes advantage of the sparseness of loci where a given motif can occur, greatly speeding up the required convolution calculations. Knowledge of the distribution allows one to assess the exact non-parametric p-value of whether a given motif is over- or under-represented. We demonstrate that our method identifies known functional motifs more accurately than sampling and parametric-based approaches in a variety of coding datasets of various sizes, including ChIP-seq data for the transcription factors NRSF and GABP. Conclusions CodingMotif provides a theoretically and empirically demonstrated advance for the detection of motifs overrepresented in coding sequences. We expect CodingMotif to be useful for identifying motifs in functional genomic datasets such as DNA-protein binding, RNA-protein binding, or microRNA-RNA binding within coding regions. A software implementation is available at http://bioinformatics.bc.edu/chuanglab/codingmotif.tar PMID

  4. Conservation defines functional motifs in the squint/nodal-related 1 RNA dorsal localization element

    PubMed Central

    Gilligan, Patrick C.; Kumari, Pooja; Lim, Shimin; Cheong, Albert; Chang, Alex; Sampath, Karuna

    2011-01-01

    RNA localization is emerging as a general principle of sub-cellular protein localization and cellular organization. However, the sequence and structural requirements in many RNA localization elements remain poorly understood. Whereas transcription factor-binding sites in DNA can be recognized as short degenerate motifs, and consensus binding sites readily inferred, protein-binding sites in RNA often contain structural features, and can be difficult to infer. We previously showed that zebrafish squint/nodal-related 1 (sqt/ndr1) RNA localizes to the future dorsal side of the embryo. Interestingly, mammalian nodal RNA can also localize to dorsal when injected into zebrafish embryos, suggesting that the sequence motif(s) may be conserved, even though the fish and mammal UTRs cannot be aligned. To define potential sequence and structural features, we obtained ndr1 3′-UTR sequences from approximately 50 fishes that are closely, or distantly, related to zebrafish, for high-resolution phylogenetic footprinting. We identify conserved sequence and structural motifs within the zebrafish/carp family and catfish. We find that two novel motifs, a single-stranded AGCAC motif and a small stem-loop, are required for efficient sqt RNA localization. These findings show that comparative sequencing in the zebrafish/carp family is an efficient approach for identifying weak consensus binding sites for RNA regulatory proteins. PMID:21149265

  5. The highly conserved amino acid sequence motif Tyr-Gly-Asp-Thr-Asp-Ser in alpha-like DNA polymerases is required by phage phi 29 DNA polymerase for protein-primed initiation and polymerization.

    PubMed Central

    Bernad, A; Lázaro, J M; Salas, M; Blanco, L

    1990-01-01

    The alpha-like DNA polymerases from bacteriophage phi 29 and other viruses, prokaryotes and eukaryotes contain an amino acid consensus sequence that has been proposed to form part of the dNTP binding site. We have used site-directed mutants to study five of the six highly conserved consecutive amino acids corresponding to the most conserved C-terminal segment (Tyr-Gly-Asp-Thr-Asp-Ser). Our results indicate that in phi 29 DNA polymerase this consensus sequence, although irrelevant for the 3'→5' exonuclease activity, is essential for initiation and elongation. Based on these results and on its homology with known or putative metal-binding amino acid sequences, we propose that in phi 29 DNA polymerase the Tyr-Gly-Asp-Thr-Asp-Ser consensus motif is part of the dNTP binding site, involved in the synthetic activities of the polymerase (i.e., initiation and polymerization), and that it is involved particularly in the metal binding associated with the dNTP site. PMID:2191296

  6. Occurrence probability of structured motifs in random sequences.

    PubMed

    Robin, S; Daudin, J-J; Richard, H; Sagot, M-F; Schbath, S

    2002-01-01

    The problem of extracting from a set of nucleic acid sequences motifs which may have biological function is more and more important. In this paper, we are interested in particular motifs that may be implicated in the transcription process. These motifs, called structured motifs, are composed of two ordered parts separated by a variable distance and allowing for substitutions. In order to assess their statistical significance, we propose approximations of the probability of occurrences of such a structured motif in a given sequence. An application of our method to evaluate candidate promoters in E. coli and B. subtilis is presented. Simulations show the goodness of the approximations. PMID:12614545
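
    To make the definition of a structured motif concrete (two ordered boxes, a variable distance between them, substitutions allowed), the following sketch locates such motifs in a sequence. It only finds occurrences and does not implement the probability approximations the paper proposes. The function names are hypothetical, and the example boxes TTGACA and TATAAT with a spacer of 16 to 18 nt are the textbook E. coli promoter consensus, used here only for illustration.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def find_structured_motif(seq, box1, box2, min_gap, max_gap, max_subs=1):
    """Return (start, gap) for each match of box1, a gap, then box2 in seq.

    Each box may differ from its pattern by at most max_subs substitutions.
    """
    hits = []
    for start in range(len(seq) - len(box1) + 1):
        if hamming(seq[start:start + len(box1)], box1) > max_subs:
            continue
        for gap in range(min_gap, max_gap + 1):
            s2 = start + len(box1) + gap
            window = seq[s2:s2 + len(box2)]
            if len(window) == len(box2) and hamming(window, box2) <= max_subs:
                hits.append((start, gap))
    return hits

# Synthetic sequence: exact boxes separated by a 17-nt spacer.
seq = "GG" + "TTGACA" + "C" * 17 + "TATAAT" + "GG"
print(find_structured_motif(seq, "TTGACA", "TATAAT", 16, 18, max_subs=0))
```

    Counting such hits in real promoters and comparing the count with its expectation under a random sequence model is the step the paper's approximations address.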

  7. Comparative analysis of the full genome sequence of European bat lyssavirus type 1 and type 2 with other lyssaviruses and evidence for a conserved transcription termination and polyadenylation motif in the G-L 3' non-translated region.

    PubMed

    Marston, D A; McElhinney, L M; Johnson, N; Müller, T; Conzelmann, K K; Tordo, N; Fooks, A R

    2007-04-01

    We report the first full-length genomic sequences for European bat lyssavirus type-1 (EBLV-1) and type-2 (EBLV-2). The EBLV-1 genomic sequence was derived from a virus isolated from a serotine bat in Hamburg, Germany, in 1968 and the EBLV-2 sequence was derived from a virus isolate from a human case of rabies that occurred in Scotland in 2002. A long-distance PCR strategy was used to amplify the open reading frames (ORFs), followed by standard and modified RACE (rapid amplification of cDNA ends) techniques to amplify the 3' and 5' ends. The lengths of each complete viral genome for EBLV-1 and EBLV-2 were 11 966 and 11 930 base pairs, respectively, and follow the standard rhabdovirus genome organization of five viral proteins. Comparison with other lyssavirus sequences demonstrates variation in degrees of homology, with the genomic termini showing a high degree of complementarity. The nucleoprotein was the most conserved, both intra- and intergenotypically, followed by the polymerase (L), matrix and glyco- proteins, with the phosphoprotein being the most variable. In addition, we have shown that the two EBLVs utilize a conserved transcription termination and polyadenylation (TTP) motif, approximately 50 nt upstream of the L gene start codon. All available lyssavirus sequences to date, with the exception of Pasteur virus (PV) and PV-derived isolates, use the second TTP site. This observation may explain differences in pathogenicity between lyssavirus strains, dependent on the length of the untranslated region, which might affect transcriptional activity and RNA stability. PMID:17374776

  8. Identifying novel sequence variants of RNA 3D motifs

    PubMed Central

    Zirbel, Craig L.; Roll, James; Sweeney, Blake A.; Petrov, Anton I.; Pirrung, Meg; Leontis, Neocles B.

    2015-01-01

    Predicting RNA 3D structure from sequence is a major challenge in biophysics. An important sub-goal is accurately identifying recurrent 3D motifs from RNA internal and hairpin loop sequences extracted from secondary structure (2D) diagrams. We have developed and validated new probabilistic models for 3D motif sequences based on hybrid Stochastic Context-Free Grammars and Markov Random Fields (SCFG/MRF). The SCFG/MRF models are constructed using atomic-resolution RNA 3D structures. To parameterize each model, we use all instances of each motif found in the RNA 3D Motif Atlas and annotations of pairwise nucleotide interactions generated by the FR3D software. Isostericity relations between non-Watson–Crick basepairs are used in scoring sequence variants. SCFG techniques model nested pairs and insertions, while MRF ideas handle crossing interactions and base triples. We use test sets of randomly-generated sequences to set acceptance and rejection thresholds for each motif group and thus control the false positive rate. Validation was carried out by comparing results for four motif groups to RMDetect. The software developed for sequence scoring (JAR3D) is structured to automatically incorporate new motifs as they accumulate in the RNA 3D Motif Atlas when new structures are solved and is available free for download. PMID:26130723

  9. Identifying novel sequence variants of RNA 3D motifs.

    PubMed

    Zirbel, Craig L; Roll, James; Sweeney, Blake A; Petrov, Anton I; Pirrung, Meg; Leontis, Neocles B

    2015-09-01

    Predicting RNA 3D structure from sequence is a major challenge in biophysics. An important sub-goal is accurately identifying recurrent 3D motifs from RNA internal and hairpin loop sequences extracted from secondary structure (2D) diagrams. We have developed and validated new probabilistic models for 3D motif sequences based on hybrid Stochastic Context-Free Grammars and Markov Random Fields (SCFG/MRF). The SCFG/MRF models are constructed using atomic-resolution RNA 3D structures. To parameterize each model, we use all instances of each motif found in the RNA 3D Motif Atlas and annotations of pairwise nucleotide interactions generated by the FR3D software. Isostericity relations between non-Watson-Crick basepairs are used in scoring sequence variants. SCFG techniques model nested pairs and insertions, while MRF ideas handle crossing interactions and base triples. We use test sets of randomly-generated sequences to set acceptance and rejection thresholds for each motif group and thus control the false positive rate. Validation was carried out by comparing results for four motif groups to RMDetect. The software developed for sequence scoring (JAR3D) is structured to automatically incorporate new motifs as they accumulate in the RNA 3D Motif Atlas when new structures are solved and is available free for download. PMID:26130723

  10. Over-represented localized sequence motifs in ribosomal protein gene promoters of basal metazoans.

    PubMed

    Perina, Drago; Korolija, Marina; Roller, Maša; Harcet, Matija; Jeličić, Branka; Mikoč, Andreja; Cetković, Helena

    2011-07-01

    Equimolecular presence of ribosomal proteins (RPs) in the cell is needed for ribosome assembly and is achieved by synchronized expression of ribosomal protein genes (RPGs) with promoters of similar strengths. Over-represented motifs of RPG promoter regions are identified as targets for specific transcription factors. Unlike RPs, those motifs are not conserved between mammals, Drosophila, and yeast. We analyzed the proximal promoter regions of RPGs from three basal metazoans with sequenced genomes (sponge, cnidarian, and placozoan) and found common features, such as 5'-terminal oligopyrimidine tracts and TATA-boxes. Furthermore, we identified over-represented motifs, some of which displayed the highest similarity to motifs abundant in human RPG promoters and not present in Drosophila or yeast. Our results indicate that the human over-represented motifs, as well as the corresponding domains of transcription factors, were established very early in metazoan evolution. The fast-evolving nature of the RPG regulatory network leads to the formation of other, lineage-specific, over-represented motifs. PMID:21457775

  11. Function-based classification of carbohydrate-active enzymes by recognition of short, conserved peptide motifs.

    PubMed

    Busk, Peter Kamp; Lange, Lene

    2013-06-01

    Functional prediction of carbohydrate-active enzymes is difficult due to low sequence identity. However, similar enzymes often share a few short motifs, e.g., around the active site, even when the overall sequences are very different. To exploit this notion for functional prediction of carbohydrate-active enzymes, we developed a simple algorithm, peptide pattern recognition (PPR), that can divide proteins into groups of sequences that share a set of short conserved sequences. When this method was used on 118 glycoside hydrolase 5 proteins with 9% average pairwise identity and representing four characterized enzymatic functions, 97% of the proteins were sorted into groups correlating with their enzymatic activity. Furthermore, we analyzed 8,138 glycoside hydrolase 13 proteins including 204 experimentally characterized enzymes with 28 different functions. There was a 91% correlation between group and enzyme activity. These results indicate that the function of carbohydrate-active enzymes can be predicted with high precision by finding short, conserved motifs in their sequences. The glycoside hydrolase 61 family is important for fungal biomass conversion, but only a few proteins of this family have been functionally characterized. Interestingly, PPR divided 743 glycoside hydrolase 61 proteins into 16 subfamilies useful for targeted investigation of the function of these proteins and pinpointed three conserved motifs with putative importance for enzyme activity. Furthermore, the conserved sequences were useful for cloning of new, subfamily-specific glycoside hydrolase 61 proteins from 14 fungi. In conclusion, identification of conserved sequence motifs is a new approach to sequence analysis that can predict carbohydrate-active enzyme functions with high precision. PMID:23524681

  12. [Conserved motifs in the primary and secondary ITS1 structures in bryophytes].

    PubMed

    Milyutina, I A; Ignatov, M S

    2015-01-01

    A study of the ITS1 nucleotide sequences of 1000 moss species of 62 families, 11 liverwort species from five orders, and one hornwort, Anthoceros agrestis, identified five highly conserved motifs (CM1-CM5), which are presumably involved in pre-rRNA processing. Although the ITS1 sequences differ substantially in length and in the extent of divergence, the conserved motifs are found in all of them. ITS1 secondary structures were constructed for 76 mosses, and the main regularities in the positioning of the conserved motifs were observed. The positions of processing sites in the ITS1 secondary structure of the yeast Saccharomyces cerevisiae were found to be similar to the positions of the conserved motifs in the ITS1 secondary structures of mosses and liverworts. In addition, potential hairpin formation in the putative secondary structure of a pre-rRNA fragment was considered for the region between ITS1 CM4-CM5 and a highly conserved region between hairpins 49 and 50 (H49 and H50) of the 18S rRNA. PMID:26107892

  13. Bioinformatic identification of novel regulatory DNA sequence motifs in Streptomyces coelicolor

    PubMed Central

    Studholme, David J; Bentley, Stephen D; Kormanec, Jan

    2004-01-01

    Background Streptomyces coelicolor is a bacterium with a vast repertoire of metabolic functions and complex systems of cellular development. Its genome sequence is rich in genes that encode regulatory proteins to control these processes in response to its changing environment. We wished to apply a recently published bioinformatic method for identifying novel regulatory sequence signals to gain new insights into regulation in S. coelicolor. Results The method involved production of position-specific weight matrices from alignments of over-represented words of DNA sequence. We generated 2497 weight matrices, each representing a candidate regulatory DNA sequence motif. We scanned the genome sequence of S. coelicolor against each of these matrices. A DNA sequence motif represented by one of the matrices was found preferentially in non-coding sequences immediately upstream of genes involved in polysaccharide degradation, including several that encode chitinases. This motif (TGGTCTAGACCA) was also found upstream of genes encoding components of the phosphoenolpyruvate phosphotransfer system (PTS). We hypothesise that this DNA sequence motif represents a regulatory element that is responsive to availability of carbon-sources. Other motifs of potential biological significance were found upstream of genes implicated in secondary metabolism (TTAGGTtAGgCTaACCTAA), sigma factors (TGACN19TGAC), DNA replication and repair (ttgtCAGTGN13TGGA), nucleotide conversions (CTACgcNCGTAG), and ArsR (TCAGN12TCAG). A motif found upstream of genes involved in chromosome replication (TGTCagtgcN7Tagg) was similar to a previously described motif found in UV-responsive promoters. Conclusions We successfully applied a recently published in silico method to identify conserved sequence motifs in S. coelicolor that may be biologically significant as regulatory elements. Our data are broadly consistent with and further extend data from previously published studies. We invite experimental testing of
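
    Several of the motifs listed above are gapped (for example TGACN19TGAC), and TGGTCTAGACCA reads the same on both strands. As a small sketch, assuming that "N" followed by a number means that many arbitrary bases and ignoring letter case (which in the abstract presumably marks less conserved positions; that reading is an assumption), the notation can be turned into a regular expression and a motif can be tested for reverse-complement symmetry. The helper names are hypothetical.

```python
import re

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq):
    """Reverse complement of a DNA string, preserving letter case."""
    return seq.translate(COMPLEMENT)[::-1]

def is_palindromic(seq):
    """True if the sequence equals its own reverse complement (case ignored)."""
    return seq.upper() == reverse_complement(seq).upper()

def spaced_motif_regex(motif):
    """Turn a motif such as 'TGACN19TGAC' into 'TGAC[ACGT]{19}TGAC'."""
    def expand(match):
        count = match.group(1) or "1"     # a bare N means one arbitrary base
        return "[ACGT]{" + count + "}"
    return re.sub(r"N(\d*)", expand, motif.upper())

print(is_palindromic("TGGTCTAGACCA"))
print(spaced_motif_regex("TGACN19TGAC"))
```

    A genome scan would apply re.finditer with the resulting pattern to both the sequence and its reverse complement; the paper itself scans with weight matrices, which tolerate mismatches that an exact pattern does not.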

  14. Characterization of the tandem CWCH2 sequence motif: a hallmark of inter-zinc finger interactions

    PubMed Central

    2010-01-01

    Background The C2H2 zinc finger (ZF) domain is widely conserved among eukaryotic proteins. In Zic/Gli/Zap1 C2H2 ZF proteins, the two N-terminal ZFs form a single structural unit by sharing a hydrophobic core. This structural unit defines a new motif comprised of two tryptophan side chains at the center of the hydrophobic core. Because each tryptophan residue is located between the two cysteine residues of the C2H2 motif, we have named this structure the tandem CWCH2 (tCWCH2) motif. Results Here, we characterized 587 tCWCH2-containing genes using data derived from public databases. We categorized genes into 11 classes including Zic/Gli/Glis, Arid2/Rsc9, PacC, Mizf, Aebp2, Zap1/ZafA, Fungl, Zfp106, Twincl, Clr1, and Fungl-4ZF, based on sequence similarity, domain organization, and functional similarities. tCWCH2 motifs are mostly found in organisms belonging to the Opisthokonta (metazoa, fungi, and choanoflagellates) and Amoebozoa (amoeba, Dictyostelium discoideum). By comparison, the C2H2 ZF motif is distributed widely among the eukaryotes. The structure and organization of the tCWCH2 motif, its phylogenetic distribution, and molecular phylogenetic analysis suggest that prototypical tCWCH2 genes existed in the Opisthokonta ancestor. Within-group or between-group comparisons of the tCWCH2 amino acid sequence identified three additional sequence features (site-specific amino acid frequencies, longer linker sequence between two C2H2 ZFs, and frequent extra-sequences within C2H2 ZF motifs). Conclusion These features suggest that the tCWCH2 motif is a specialized motif involved in inter-zinc finger interactions. PMID:20167128

  15. Fast and Accurate Discovery of Degenerate Linear Motifs in Protein Sequences

    PubMed Central

    Levy, Emmanuel D.; Michnick, Stephen W.

    2014-01-01

    Linear motifs mediate a wide variety of cellular functions, which makes their characterization in protein sequences crucial to understanding cellular systems. However, the short length and degenerate nature of linear motifs make their discovery a difficult problem. Here, we introduce MotifHound, an algorithm particularly suited for the discovery of small and degenerate linear motifs. MotifHound performs an exact and exhaustive enumeration of all motifs present in proteins of interest, including all of their degenerate forms, and scores the overrepresentation of each motif based on its occurrence in proteins of interest relative to a background (e.g., proteome) using the hypergeometric distribution. To assess MotifHound, we benchmarked it together with state-of-the-art algorithms. The benchmark consists of 11,880 sets of proteins from S. cerevisiae; in each set, we artificially spiked-in one motif varying in terms of three key parameters, (i) number of occurrences, (ii) length and (iii) the number of degenerate or “wildcard” positions. The benchmark enabled the evaluation of the impact of these three properties on the performance of the different algorithms. The results showed that MotifHound and SLiMFinder were the most accurate in detecting degenerate linear motifs. Interestingly, MotifHound was 15 to 20 times faster at comparable accuracy and performed best in the discovery of highly degenerate motifs. We complemented the benchmark by an analysis of proteins experimentally shown to bind the FUS1 SH3 domain from S. cerevisiae. Using the full-length protein partners as sole information, MotifHound recapitulated most experimentally determined motifs binding to the FUS1 SH3 domain. Moreover, these motifs exhibited properties typical of SH3 binding peptides, e.g., high intrinsic disorder and evolutionary conservation, despite the fact that none of these properties were used as prior information. MotifHound is available (http://michnick.bcm.umontreal.ca or http
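
    The abstract states that over-representation is scored with the hypergeometric distribution. The sketch below shows that calculation in isolation, assuming the score is the upper-tail probability of seeing at least k motif-containing proteins in a set of n drawn from a background of N proteins, K of which contain the motif. The function name and the numbers are illustrative and are not taken from MotifHound.

```python
from math import comb

def hypergeom_tail(k, n, K, N):
    """P(X >= k) when n proteins are drawn from N, K of which carry the motif."""
    upper = min(n, K)
    total = comb(N, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total

# Hypothetical numbers: 6000-protein background, 120 proteins with the motif,
# 30 proteins of interest, 8 of which contain the motif.
print(hypergeom_tail(8, 30, 120, 6000))
```

    A small tail probability means the motif occurs in the set more often than random sampling from the background would explain; enumerating every degenerate motif then requires a correction for multiple testing, which this sketch omits.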

  16. Phosphatidylinositol transfer proteins: sequence motifs in structural and evolutionary analyses

    PubMed Central

    Wyckoff, Gerald J.; Solidar, Ada; Yoden, Marilyn D.

    2016-01-01

    Phosphatidylinositol transfer proteins (PITP) are a family of monomeric proteins that bind and transfer phosphatidylinositol and phosphatidylcholine between membrane compartments. They are required for production of inositol and diacylglycerol second messengers, and are found in most metazoan organisms. While PITPs are known to carry out crucial cell-signaling roles in many organisms, the structure, function and evolution of the majority of family members remain unexplored, primarily because the ubiquity and diversity of the family thwart traditional methods of global alignment. To surmount this obstacle, we instead took a novel approach, using MEME and a parsimony-based analysis to create a cladogram of conserved sequence motifs in 56 PITP family proteins from 26 species. In keeping with previous functional annotations, three clades were supported within our evolutionary analysis: two classes of soluble proteins and a class of membrane-associated proteins. By focusing on conserved regions, the analysis allowed for in-depth queries regarding possible functional roles of PITP proteins in both intra- and extracellular signaling.

  17. Predicting candidate genomic sequences that correspond to synthetic functional RNA motifs

    PubMed Central

    Laserson, Uri; Gan, Hin Hark; Schlick, Tamar

    2005-01-01

    Riboswitches and RNA interference are important emerging mechanisms found in many organisms to control gene expression. To enhance our understanding of such RNA roles, finding small regulatory motifs in genomes presents a challenge on a wide scale. Many simple functional RNA motifs have been found by in vitro selection experiments, which produce synthetic target-binding aptamers as well as catalytic RNAs, including the hammerhead ribozyme. Motivated by the prediction of Piganeau and Schroeder [(2003) Chem. Biol., 10, 103–104] that synthetic RNAs may have natural counterparts, we develop and apply an efficient computational protocol for identifying aptamer-like motifs in genomes. We define motifs from the sequence and structural information of synthetic aptamers, search for sequences in genomes that will produce motif matches, and then evaluate the structural stability and statistical significance of the potential hits. Our application to aptamers for streptomycin, chloramphenicol, neomycin B and ATP identifies 37 candidate sequences (in coding and non-coding regions) that fold to the target aptamer structures in bacterial and archaeal genomes. Further energetic screening reveals that several candidates exhibit energetic properties and sequence conservation patterns that are characteristic of functional motifs. Besides providing candidates for experimental testing, our computational protocol offers an avenue for expanding natural RNA's functional repertoire. PMID:16254081

  18. A Conserved Motif Provides Binding Specificity to the PP2A-B56 Phosphatase.

    PubMed

    Hertz, Emil Peter Thrane; Kruse, Thomas; Davey, Norman E; López-Méndez, Blanca; Sigurðsson, Jón Otti; Montoya, Guillermo; Olsen, Jesper V; Nilsson, Jakob

    2016-08-18

    Dynamic protein phosphorylation is a fundamental mechanism regulating biological processes in all organisms. Protein phosphatase 2A (PP2A) is the main source of phosphatase activity in the cell, but the molecular details of substrate recognition are unknown. Here, we report that a conserved surface-exposed pocket on PP2A regulatory B56 subunits binds to a consensus sequence on interacting proteins, which we term the LxxIxE motif. The composition of the motif modulates the affinity for B56, which in turn determines the phosphorylation status of associated substrates. Phosphorylation of amino acid residues within the motif increases B56 binding, allowing integration of kinase and phosphatase activity. We identify conserved LxxIxE motifs in essential proteins throughout the eukaryotic domain of life and in human viruses, suggesting that the motifs are required for basic cellular function. Our study provides a molecular description of PP2A binding specificity with broad implications for understanding signaling in eukaryotes. PMID:27453045

  19. Discovering Motifs in Ranked Lists of DNA Sequences

    PubMed Central

    Eden, Eran; Lipson, Doron; Yogev, Sivan; Yakhini, Zohar

    2007-01-01

    Computational methods for discovery of sequence elements that are enriched in a target set compared with a background set are fundamental in molecular biology research. One example is the discovery of transcription factor binding motifs that are inferred from ChIP–chip (chromatin immuno-precipitation on a microarray) measurements. Several major challenges in sequence motif discovery still require consideration: (i) the need for a principled approach to partitioning the data into target and background sets; (ii) the lack of rigorous models and of an exact p-value for measuring motif enrichment; (iii) the need for an appropriate framework for accounting for motif multiplicity; (iv) the tendency, in many of the existing methods, to report presumably significant motifs even when applied to randomly generated data. In this paper we present a statistical framework for discovering enriched sequence elements in ranked lists that resolves these four issues. We demonstrate the implementation of this framework in a software application, termed DRIM (discovery of rank imbalanced motifs), which identifies sequence motifs in lists of ranked DNA sequences. We applied DRIM to ChIP–chip and CpG methylation data and obtained the following results. (i) Identification of 50 novel putative transcription factor (TF) binding sites in yeast ChIP–chip data. The biological function of some of them was further investigated to gain new insights on transcription regulation networks in yeast. For example, our discoveries enable the elucidation of the network of the TF ARO80. Another finding concerns a systematic TF binding enhancement to sequences containing CA repeats. (ii) Discovery of novel motifs in human cancer CpG methylation data. Remarkably, most of these motifs are similar to DNA sequence elements bound by the Polycomb complex that promotes histone methylation. Our findings thus support a model in which histone methylation and CpG methylation are mechanistically linked. Overall
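
    One way to avoid fixing the target/background cutoff in advance, which is issue (i) above, is to evaluate a hypergeometric tail at every possible cutoff of the ranked list and keep the smallest value. The sketch below illustrates that idea under stated assumptions; it is not presented as the DRIM implementation, and the minimum it returns is a score rather than an exact p-value because it is taken over many cutoffs. The function names and the toy ranked list are hypothetical.

```python
from math import comb

def hypergeom_tail(k, n, K, N):
    """P(X >= k) when n items are drawn from N, K of which carry the motif."""
    total = comb(N, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return favourable / total

def min_hypergeom(hits):
    """hits: 0/1 flags in rank order. Return (smallest tail value, its cutoff)."""
    N, K = len(hits), sum(hits)
    best, best_n, seen = 1.0, 0, 0
    for n, flag in enumerate(hits, start=1):
        seen += flag                       # motif occurrences among the top n
        p = hypergeom_tail(seen, n, K, N)
        if p < best:
            best, best_n = p, n
    return best, best_n

# Toy ranked list: the motif occurs in the three top-ranked sequences only.
ranked = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
print(min_hypergeom(ranked))
```

    Here the best cutoff falls after the third sequence, where all three occurrences have been seen; turning the minimum into a calibrated p-value requires an additional correction that the sketch leaves out.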

  20. Oligonucleotide Sequence Motifs as Nucleosome Positioning Signals

    PubMed Central

    Collings, Clayton K.; Fernandez, Alfonso G.; Pitschka, Chad G.; Hawkins, Troy B.; Anderson, John N.

    2010-01-01

    To gain a better understanding of the sequence patterns that characterize positioned nucleosomes, we first performed an analysis of the periodicities of the 256 tetranucleotides in a yeast genome-wide library of nucleosomal DNA sequences that was prepared by in vitro reconstitution. The approach entailed the identification and analysis of 24 unique tetranucleotides that were defined by 8 consensus sequences. These consensus sequences were shown to be responsible for most if not all of the tetranucleotide and dinucleotide periodicities displayed by the entire library, demonstrating that the periodicities of dinucleotides that characterize the yeast genome are, in actuality, due primarily to the 8 consensus sequences. A novel combination of experimental and bioinformatic approaches was then used to show that these tetranucleotides are important for preferred formation of nucleosomes at specific sites along DNA in vitro. These results were then compared to tetranucleotide patterns in genome-wide in vivo libraries from yeast and C. elegans in order to assess the contributions of DNA sequence in the control of nucleosome residency in the cell. These comparisons revealed striking similarities in the tetranucleotide occurrence profiles that are likely to be involved in nucleosome positioning in both in vitro and in vivo libraries, suggesting that DNA sequence is an important factor in the control of nucleosome placement in vivo. However, the strengths of the tetranucleotide periodicities were 3–4 fold higher in the in vitro as compared to the in vivo libraries, which implies that DNA sequence plays less of a role in dictating nucleosome positions in vivo. The results of this study have important implications for models of sequence-dependent positioning since they suggest that a defined subset of tetranucleotides is involved in preferred nucleosome occupancy and that these tetranucleotides are the major source of the dinucleotide periodicities that are characteristic of
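
    One simple way to look for a periodicity of a given tetranucleotide is to histogram the distances between its occurrences; a recurring spacing shows up as peaks at multiples of the period. The sketch below is an assumption-laden toy (hypothetical function names, a synthetic sequence, exact string matching) and is not the analysis used in the study.

```python
from collections import Counter

def occurrence_positions(seq, kmer):
    """Start positions of every (possibly overlapping) occurrence of kmer."""
    k = len(kmer)
    return [i for i in range(len(seq) - k + 1) if seq[i:i + k] == kmer]

def distance_histogram(seq, kmer, max_dist=50):
    """Count distances between all pairs of occurrences up to max_dist."""
    pos = occurrence_positions(seq, kmer)
    hist = Counter()
    for i, p in enumerate(pos):
        for q in pos[i + 1:]:
            d = q - p
            if d > max_dist:
                break  # positions are sorted, so later ones are even farther
            hist[d] += 1
    return hist

if __name__ == "__main__":
    # Synthetic sequence in which AAAT recurs every 10 bases.
    toy = ("AAAT" + "GCGCGC") * 5
    print(sorted(distance_histogram(toy, "AAAT").items()))
```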

  1. A conserved heptamer motif for ribosomal RNA transcription termination in animal mitochondria.

    PubMed Central

    Valverde, J R; Marco, R; Garesse, R

    1994-01-01

    A search of sequence data bases for a tridecamer transcription termination signal, previously described in human mtDNA as being responsible for the accumulation of mitochondrial ribosomal RNAs (rRNAs) in excess over the rest of mitochondrial genes, has revealed that this termination signal occurs in equivalent positions in a wide variety of organisms from protozoa to mammals. Due to the compact organization of the mtDNA, the tridecamer motif usually appears as part of the 3' adjacent gene sequence. Because in phylogenetically widely separated organisms the mitochondrial genome has experienced many rearrangements, it is interesting that its occurrence near the 3' end of the large rRNA is independent of the adjacent gene. The tridecamer sequence has diverged in phylogenetically widely separated organisms. Nevertheless, a well-conserved heptamer--TGGCAGA, the mitochondrial rRNA termination box--can be defined. Although extending the experimental evidence of its role as a transcription termination signal in humans will be of great interest, its evolutionary conservation strongly suggests that mitochondrial rRNA transcription termination could be a widely conserved mechanism in animals. Furthermore, the conservation of a homologous tridecamer motif in one of the last 3' secondary loops of nonmitochondrial 23S-like rRNAs suggests that the role of the sequence has changed during mitochondrial evolution. PMID:7515499
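
    A database search for the conserved heptamer can be approximated by an exact-match scan of both strands. The sketch below assumes plain ACGT input and hypothetical helper names; it ignores the diverged tridecamer variants and any positional requirement relative to the large rRNA gene.

```python
def reverse_complement(seq):
    """Reverse complement of an uppercase ACGT string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_motif(seq, motif="TGGCAGA"):
    """Return sorted (position, strand) pairs for exact matches on either strand."""
    seq = seq.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)  # allow overlapping matches
    return sorted(hits)

if __name__ == "__main__":
    # Hypothetical sequence with one match on each strand.
    print(find_motif("AATGGCAGATTTCTGCCA"))
```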

  2. Classification of protein motifs based on subcellular localization uncovers evolutionary relationships at both sequence and functional levels

    PubMed Central

    2013-01-01

    Background Most proteins have evolved in specific cellular compartments that limit their functions and potential interactions. On the other hand, motifs define amino acid arrangements conserved between protein family members and represent powerful tools for assigning function to protein sequences. The ideal motif would identify all members of a protein family but in practice many motifs identify both family members and unrelated proteins, referred to as True Positive (TP) and False Positive (FP) sequences, respectively. Results To address the relationship between protein motifs, protein function and cellular localization, we systematically assigned subcellular localization data to motif sequences from the comprehensive PROSITE sequence motif database. Using this data we analyzed relationships between localization and function. We find that TPs and FPs have a strong tendency to localize in different compartments. When multiple localizations are considered, TPs are usually distributed between related cellular compartments. We also identified cases where FPs are concentrated in particular subcellular regions, indicating possible functional or evolutionary relationships with TP sequences of the same motif. Conclusions Our findings suggest that the systematic examination of subcellular localization has the potential to uncover evolutionary and functional relationships between motif-containing sequences. We believe that this type of analysis complements existing motif annotations and could aid in their interpretation. Our results shed light on the evolution of cellular organelles and potentially establish the basis for new subcellular localization and function prediction algorithms. PMID:23865897

  3. Functional Analysis of Semi-conserved Transit Peptide Motifs and Mechanistic Implications in Precursor Targeting and Recognition.

    PubMed

    Holbrook, Kristen; Subramanian, Chitra; Chotewutmontri, Prakitchai; Reddick, L Evan; Wright, Sarah; Zhang, Huixia; Moncrief, Lily; Bruce, Barry D

    2016-09-01

    Over 95% of plastid proteins are nuclear-encoded as precursors containing an N-terminal extension known as the transit peptide (TP). Although highly variable, TPs direct the precursors through a conserved, posttranslational mechanism involving translocons in the outer (TOC) and inner envelope (TIC). The organelle import specificity is mediated by one or more components of the TOC complex. However, the high TP diversity creates a paradox on how the sequences can be specifically recognized. An emerging model of TP design is that they contain multiple loosely conserved motifs that are recognized at different steps in the targeting and transport process. Bioinformatics has demonstrated that many TPs contain semi-conserved physicochemical motifs, termed FGLK. In order to characterize FGLK motifs in TP recognition and import, we have analyzed two well-studied TPs from the precursor of RuBisCO small subunit (SStp) and ferredoxin (Fdtp). Both SStp and Fdtp contain two FGLK motifs. Analysis of a large set of mutations (∼85) in these two motifs using in vitro, in organello, and in vivo approaches supports a model in which the FGLK domains mediate interaction with TOC34 and possibly other TOC components. In vivo import analysis suggests that multiple FGLK motifs are functionally redundant. Furthermore, we discuss how FGLK motifs are required for efficient precursor protein import and how these elements may permit a convergent function of this highly variable class of targeting sequences. PMID:27378725

  4. Discovering common stem–loop motifs in unaligned RNA sequences

    PubMed Central

    Gorodkin, Jan; Stricklin, Shawn L.; Stormo, Gary D.

    2001-01-01

    Post-transcriptional regulation of gene expression is often accomplished by proteins binding to specific sequence motifs in mRNA molecules, to affect their translation or stability. The motifs are often composed of a combination of sequence and structural constraints such that the overall structure is preserved even though much of the primary sequence is variable. While several methods exist to discover transcriptional regulatory sites in the DNA sequences of coregulated genes, the RNA motif discovery problem is much more difficult because of covariation in the positions. We describe the combined use of two approaches for RNA structure prediction, FOLDALIGN and COVE, that together can discover and model stem–loop RNA motifs in unaligned sequences, such as UTRs from post-transcriptionally coregulated genes. We evaluate the method on two datasets, one a section of rRNA genes with randomly truncated ends so that a global alignment is not possible, and the other a hyper-variable collection of IRE-like elements that were inserted into randomized UTR sequences. In both cases the combined method identified the motifs correctly, and in the rRNA example we show that it is capable of determining the structure, which includes bulge and internal loops as well as a variable length hairpin loop. Those automated results are quantitatively evaluated and found to agree closely with structures contained in curated databases, with correlation coefficients up to 0.9. A basic server, Stem–Loop Align SearcH (SLASH), which will perform stem–loop searches in unaligned RNA sequences, is available at http://www.bioinf.au.dk/slash/. PMID:11353083

  5. An evolutionary analysis of flightin reveals a conserved motif unique and widespread in Pancrustacea.

    PubMed

    Soto-Adames, Felipe N; Alvarez-Ortiz, Pedro; Vigoreaux, Jim O

    2014-01-01

    Flightin is a thick filament protein that in Drosophila melanogaster is uniquely expressed in the asynchronous, indirect flight muscles (IFM). Flightin is required for the structure and function of the IFM and is indispensable for flight in Drosophila. Given the importance of flight acquisition in the evolutionary history of insects, here we study the phylogeny and distribution of flightin. Flightin was identified in 69 species of hexapods in classes Collembola (springtails), Protura, Diplura, and insect orders Thysanura (silverfish), Dictyoptera (roaches), Orthoptera (grasshoppers), Phthiraptera (lice), Hemiptera (true bugs), Coleoptera (beetles), Neuroptera (green lacewing), Hymenoptera (bees, ants, and wasps), Lepidoptera (moths), and Diptera (flies and mosquitoes). Flightin was also found in 14 species of crustaceans in orders Anostraca (brine shrimp), Cladocera (water fleas), Isopoda (pill bugs), Amphipoda (scuds, sideswimmers), and Decapoda (lobsters, crabs, and shrimps). Flightin was not identified in representatives of chelicerates, myriapods, or any species outside Pancrustacea (Tetraconata, sensu Dohle). Alignment of amino acid sequences revealed a conserved region of 52 amino acids, referred to herein as WYR, that is bounded by strictly conserved tryptophan (W) and arginine (R) and an intervening sequence with a high content of tyrosines (Y). This motif has no homologs in GenBank or PROSITE and is unique to flightin and paraflightin, a putative flightin paralog identified in decapods. A third motif of unclear affinities to pancrustacean WYR was observed in chelicerates. Phylogenetic analysis of amino acid sequences of the conserved motif suggests that paraflightin originated before the divergence of amphipods, isopods, and decapods. We conclude that flightin originated de novo in the ancestor of Pancrustacea > 500 MYA, well before the divergence of insects (~400 MYA) and the origin of flight (~325 MYA), and that its IFM-specific function in Drosophila is a more

  6. Conserved motif of CDK5RAP2 mediates its localization to centrosomes and the Golgi complex.

    PubMed

    Wang, Zhe; Wu, Tao; Shi, Lin; Zhang, Lin; Zheng, Wei; Qu, Jianan Y; Niu, Ruifang; Qi, Robert Z

    2010-07-16

    As the primary microtubule-organizing centers, centrosomes require gamma-tubulin for microtubule nucleation and organization. Located in close vicinity to centrosomes, the Golgi complex is another microtubule-organizing organelle in interphase cells. CDK5RAP2 is a gamma-tubulin complex-binding protein and functions in gamma-tubulin attachment to centrosomes. In this study, we find that CDK5RAP2 localizes to the Golgi complex in an ATP- and centrosome-dependent manner and associates with Golgi membranes independently of microtubules. CDK5RAP2 contains a centrosome-targeting domain with its core region highly homologous to the Motif 2 (CM2) of centrosomin, a functionally related protein in Drosophila. This sequence, referred to as the CM2-like motif, is also conserved in related proteins in chicken and zebrafish. Therefore, CDK5RAP2 may undertake a conserved mechanism for centrosomal localization. Using a mutational approach, we demonstrate that the CM2-like motif plays a crucial role in the centrosomal and Golgi localization of CDK5RAP2. Furthermore, the CM2-like motif is essential for the association of the centrosome-targeting domain to pericentrin and AKAP450. The binding with pericentrin is required for the centrosomal and Golgi localization of CDK5RAP2, whereas the binding with AKAP450 is required for the Golgi localization. Although the CM2-like motif possesses the activity of Ca(2+)-independent calmodulin binding, binding of calmodulin to this sequence is dispensable for centrosomal and Golgi association. Altogether, CDK5RAP2 may represent a novel mechanism for centrosomal and Golgi localization. PMID:20466722

  7. Computing distribution of scale independent motifs in biological sequences

    PubMed Central

    Almeida, Jonas S; Vinga, Susana

    2006-01-01

    The use of Chaos Game Representation (CGR) or its generalization, Universal Sequence Maps (USM), to describe the distribution of biological sequences has been found objectionable because of the fractal structure of that coordinate system. Consequently, the investigation of distribution of symbolic motifs at multiple scales is hampered by an inexact association between distance and sequence dissimilarity. A solution to this problem could unleash the use of iterative maps as phase-state representation of sequences where its statistical properties can be conveniently investigated. In this study a family of kernel density functions is described that accommodates the fractal nature of iterative function representations of symbolic sequences and, consequently, enables the exact investigation of sequence motifs of arbitrary lengths in that scale-independent representation. Furthermore, the proposed kernel density includes both Markovian succession and currently used alignment-free sequence dissimilarity metrics as special solutions. Therefore, the fractal kernel described is in fact a generalization that provides a common framework for a diverse suite of sequence analysis techniques. PMID:17049089
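
    For readers unfamiliar with the iterative map behind CGR, the sketch below computes the standard chaos-game coordinates of a DNA sequence: each base moves the current point halfway toward that base's corner of the unit square. The corner assignment and starting point are one common convention and are assumptions here; the kernel density functions proposed in the paper are not implemented.

```python
# Assumed corner convention for the unit square (other layouts are also in use).
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}

def cgr_points(seq, start=(0.5, 0.5)):
    """Return the CGR coordinate reached after each base of seq."""
    x, y = start
    points = []
    for base in seq.upper():
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2, (y + cy) / 2  # move halfway toward the corner
        points.append((x, y))
    return points

if __name__ == "__main__":
    # Two sequences that share their last three bases end close together.
    print(cgr_points("CCCACG")[-1], cgr_points("TTTACG")[-1])
```

    Because every step halves the distance to the previous point, two sequences that share a suffix of length k end within 2^-k of each other in each coordinate, which is the link between map distance and sequence similarity that the abstract refers to.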

  8. Do short, frequent DNA sequence motifs mould the epigenome?

    PubMed

    Quante, Timo; Bird, Adrian

    2016-04-01

    'Epigenome' refers to the panoply of chemical modifications borne by DNA and its associated proteins that locally affect genome function. Epigenomic patterns are thought to be determined by external constraints resulting from development, disease and the environment, but DNA sequence is also a potential influence. We propose that domains of relatively uniform DNA base composition may modulate the epigenome through cell type-specific proteins that recognize short, frequent sequence motifs. Differential recruitment of epigenomic modifiers may adjust gene expression in multigene blocks as an alternative to tuning the activity of each gene separately, thus simplifying gene expression programming. PMID:26837845

  9. Sequence-motif Detection of NAD(P)-binding Proteins: Discovery of a Unique Antibacterial Drug Target

    NASA Astrophysics Data System (ADS)

    Hua, Yun Hao; Wu, Chih Yuan; Sargsyan, Karen; Lim, Carmay

    2014-09-01

    Many enzymes use nicotinamide adenine dinucleotide or nicotinamide adenine dinucleotide phosphate (NAD(P)) as essential coenzymes. These enzymes often do not share significant sequence identity and cannot be easily detected by sequence homology. Previously, we determined all distinct locally conserved pyrophosphate-binding structures (3d motifs) from NAD(P)-bound protein structures, from which 1d sequence motifs were derived. Here, we aim to establish the precision of these 3d and 1d motifs to annotate NAD(P)-binding proteins. We show that the pyrophosphate-binding 3d motifs are characteristic of NAD(P)-binding proteins, as they are rarely found in non-NAD(P)-binding proteins. Furthermore, several 1d motifs could distinguish between proteins that bind only NAD and those that bind only NADP. They could also distinguish NAD(P)-binding proteins from non-NAD(P)-binding ones. Interestingly, one of the pyrophosphate-binding 3d and corresponding 1d motifs was found only in enoyl-acyl carrier protein reductases, which are enzymes essential for bacterial fatty acid biosynthesis. This unique 3d motif serves as an attractive novel drug target, as it is conserved across many bacterial species and is not found in human proteins.

  10. Sequence-motif Detection of NAD(P)-binding Proteins: Discovery of a Unique Antibacterial Drug Target

    PubMed Central

    Hua, Yun Hao; Wu, Chih Yuan; Sargsyan, Karen; Lim, Carmay

    2014-01-01

    Many enzymes use nicotinamide adenine dinucleotide or nicotinamide adenine dinucleotide phosphate (NAD(P)) as essential coenzymes. These enzymes often do not share significant sequence identity and cannot be easily detected by sequence homology. Previously, we determined all distinct locally conserved pyrophosphate-binding structures (3d motifs) from NAD(P)-bound protein structures, from which 1d sequence motifs were derived. Here, we aim to establish the precision of these 3d and 1d motifs to annotate NAD(P)-binding proteins. We show that the pyrophosphate-binding 3d motifs are characteristic of NAD(P)-binding proteins, as they are rarely found in non-NAD(P)-binding proteins. Furthermore, several 1d motifs could distinguish between proteins that bind only NAD and those that bind only NADP. They could also distinguish NAD(P)-binding proteins from non-NAD(P)-binding ones. Interestingly, one of the pyrophosphate-binding 3d and corresponding 1d motifs was found only in enoyl-acyl carrier protein reductases, which are enzymes essential for bacterial fatty acid biosynthesis. This unique 3d motif serves as an attractive novel drug target, as it is conserved across many bacterial species and is not found in human proteins. PMID:25253464

  11. Conserved rhodopsin intradiscal structural motifs mediate stabilization: effects of zinc.

    PubMed

    Gleim, Scott; Stojanovic, Aleksandar; Arehart, Eric; Byington, Daniel; Hwa, John

    2009-03-01

    Retinitis pigmentosa (RP), a neurodegenerative disorder, can arise from single point mutations in rhodopsin, leading to a cascade of protein instability, misfolding, aggregation, rod cell death, retinal degeneration, and ultimately blindness. Divalent cations, such as zinc and copper, have allosteric effects on misfolded aggregates of comparable neurodegenerative disorders including Alzheimer disease, prion diseases, and ALS. We report that two structurally conserved low-affinity zinc coordination motifs, located among a cluster of RP mutations in the intradiscal loop region, mediate dose-dependent rhodopsin destabilization. Disruption of native interactions involving histidines 100 and 195, through site-directed mutagenesis or exogenous zinc coordination, results in significant loss of receptor stability. Furthermore, chelation with EDTA stabilizes the structure of both wild-type rhodopsin and the most prevalent rhodopsin RP mutation, P23H. These interactions suggest that homeostatic regulation of trace metal concentrations in the rod outer segment of the retina may be important both physiologically and for an important cluster of RP mutations. Furthermore, with a growing awareness of allosteric zinc binding domains on a diverse range of GPCRs, such principles may apply to many other receptors and their associated diseases. PMID:19206210

  12. Conserved rhodopsin intradiscal structural motifs mediate stabilization; effects of zinc

    PubMed Central

    Gleim, Scott; Stojanovic, Aleksandar; Arehart, Eric; Byington, Daniel; Hwa, John

    2009-01-01

    Retinitis pigmentosa (RP), a neurodegenerative disorder, can arise from single point mutations in rhodopsin, leading to a cascade of protein instability, misfolding, aggregation, rod cell death, retinal degeneration, and ultimately blindness. Divalent cations, such as zinc and copper, have allosteric effects on misfolded aggregates of comparable neurodegenerative disorders including Alzheimer disease, prion diseases, and ALS. We report that two structurally conserved low-affinity zinc coordination motifs, located among a cluster of RP mutations in the intradiscal loop region, mediate dose-dependent rhodopsin destabilization. Disruption of native interactions involving histidines 100 and 195, through site-directed mutagenesis or exogenous zinc coordination, results in significant loss of receptor stability. Furthermore, chelation with EDTA stabilizes the structure of both wild type rhodopsin and the most prevalent rhodopsin RP mutation, P23H. These interactions suggest that homeostatic regulation of trace metal concentrations in the rod outer segment of the retina may be important both physiologically and for an important cluster of RP mutations. Furthermore, with a growing awareness of allosteric zinc binding domains on a diverse range of GPCRs, such principles may apply to many other receptors and their associated diseases. PMID:19206210

  13. Gapped alignment of protein sequence motifs through Monte Carlo optimization of a hidden Markov model

    PubMed Central

    Neuwald, Andrew F; Liu, Jun S

    2004-01-01

    Background Certain protein families are highly conserved across distantly related organisms and belong to large and functionally diverse superfamilies. The patterns of conservation present in these protein sequences presumably are due to selective constraints maintaining important but unknown structural mechanisms with some constraints specific to each family and others shared by a larger subset or by the entire superfamily. To exploit these patterns as a source of functional information, we recently devised a statistically based approach called contrast hierarchical alignment and interaction network (CHAIN) analysis, which infers the strengths of various categories of selective constraints from co-conserved patterns in a multiple alignment. The power of this approach strongly depends on the quality of the multiple alignments, which thus motivated development of theoretical concepts and strategies to improve alignment of conserved motifs within large sets of distantly related sequences. Results Here we describe a hidden Markov model (HMM), an algebraic system, and Markov chain Monte Carlo (MCMC) sampling strategies for alignment of multiple sequence motifs. The MCMC sampling strategies are useful both for alignment optimization and for adjusting position specific background amino acid frequencies for alignment uncertainties. Associated statistical formulations provide an objective measure of alignment quality as well as automatic gap penalty optimization. Improved alignments obtained in this way are compared with PSI-BLAST based alignments within the context of CHAIN analysis of three protein families: Giα subunits, prolyl oligopeptidases, and transitional endoplasmic reticulum (p97) AAA+ ATPases. Conclusion While not entirely replacing PSI-BLAST based alignments, which likewise may be optimized for CHAIN analysis using this approach, these motif-based methods often more accurately align very distantly related sequences and thus can provide a better measure of

  14. Characterization of evolutionarily conserved motifs involved in activity and regulation of the ABA-INSENSITIVE (ABI) 4 transcription factor.

    PubMed

    Gregorio, Josefat; Hernández-Bernal, Alma Fabiola; Cordoba, Elizabeth; León, Patricia

    2014-02-01

    In recent years, the transcription factor ABI4 has emerged as an important node of integration for external and internal signals such as nutrient status and hormone signaling that modulates critical transitions during the growth and development of plants. For this reason, understanding the mechanism of action and regulation of this protein represents an important step towards the elucidation of crosstalk mechanisms in plants. However, this understanding has been hindered due to the negligible levels of this protein as a result of multiple posttranscriptional regulations. To better understand the function and regulation of the ABI4 protein in this work, we performed a functional analysis of several evolutionarily conserved motifs. Based on these conserved motifs, we identified ortholog genes of ABI4 in different plant species. The functionality of the putative ortholog from Theobroma cacao was demonstrated in transient expression assays and in complementation studies in plants. The function of the highly conserved motifs was analyzed after their deletion or mutagenesis in the Arabidopsis ABI4 sequence using mesophyll protoplasts. This approach permitted us to immunologically detect the ABI4 protein and identify some of the mechanisms involved in its regulation. We identified sequences required for the nuclear localization (AP2-associated motif) as well as those for transcriptional activation function (LRP motif). Moreover, this approach showed that the protein stability of this transcription factor is controlled through protein degradation and subcellular localization and involves the AP2-associated and the PEST motifs. We demonstrated that the degradation of ABI4 protein through the PEST motif is mediated by the 26S proteasome in response to changes in the sugar levels. PMID:24046063

  15. Sequence-Based Classification Using Discriminatory Motif Feature Selection

    PubMed Central

    Xiong, Hao; Capurso, Daniel; Sen, Śaunak; Segal, Mark R.

    2011-01-01

    Most existing methods for sequence-based classification use exhaustive feature generation, employing, for example, all k-mer patterns. The motivation behind such (enumerative) approaches is to minimize the potential for overlooking important features. However, there are shortcomings to this strategy. First, practical constraints limit the scope of exhaustive feature generation to patterns of length ≤ k, such that potentially important, longer (> k) predictors are not considered. Second, features so generated exhibit strong dependencies, which can complicate understanding of derived classification rules. Third, and most importantly, numerous irrelevant features are created. These concerns can compromise prediction and interpretation. While remedies have been proposed, they tend to be problem-specific and not broadly applicable. Here, we develop a generally applicable methodology, and an attendant software pipeline, that is predicated on discriminatory motif finding. In addition to the traditional training and validation partitions, our framework entails a third level of data partitioning, a discovery partition. A discriminatory motif finder is used on sequences and associated class labels in the discovery partition to yield a (small) set of features. These features are then used as inputs to a classifier in the training partition. Finally, performance assessment occurs on the validation partition. Important attributes of our approach are its modularity (any discriminatory motif finder and any classifier can be deployed) and its universality (all data, including sequences that are unaligned and/or of unequal length, can be accommodated). We illustrate our approach on two nucleosome occupancy datasets and a protein solubility dataset, previously analyzed using enumerative feature generation. Our method achieves excellent performance results, with and without optimization of classifier tuning parameters. A Python pipeline implementing the approach is available at http
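
    The three-level partitioning described in this abstract can be sketched as a shuffled split into discovery, training, and validation sets. This is only an illustration: the function name, the split fractions, and the fixed seed are assumptions, and the motif finder and classifier that would consume the partitions are left out.

```python
import random

def three_way_split(items, fractions=(0.2, 0.5, 0.3), seed=0):
    """Shuffle items and cut them into (discovery, training, validation) lists.
    fractions are assumed proportions; the validation set takes the remainder."""
    items = list(items)
    random.Random(seed).shuffle(items)  # seeded for reproducibility
    n = len(items)
    n_disc = int(n * fractions[0])
    n_train = int(n * fractions[1])
    discovery = items[:n_disc]
    training = items[n_disc:n_disc + n_train]
    validation = items[n_disc + n_train:]
    return discovery, training, validation

if __name__ == "__main__":
    disc, train, valid = three_way_split(range(100))
    print(len(disc), len(train), len(valid))
```

    Keeping the discovery partition disjoint from the training and validation partitions means the motifs used as features are never selected on the sequences later used to fit or assess the classifier.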

  16. Rewiring yeast sugar transporter preference through modifying a conserved protein motif

    PubMed Central

    Young, Eric M.; Tong, Alice; Bui, Hang; Spofford, Caitlin; Alper, Hal S.

    2014-01-01

    Utilization of exogenous sugars found in lignocellulosic biomass hydrolysates, such as xylose, must be improved before yeast can serve as an efficient biofuel and biochemical production platform. In particular, the first step in this process, the molecular transport of xylose into the cell, can serve as a significant flux bottleneck and is highly inhibited by other sugars. Here we demonstrate that sugar transport preference and kinetics can be rewired through the programming of a sequence motif of the general form G-G/F-XXX-G found in the first transmembrane span. By evaluating 46 different heterologously expressed transporters, we find that this motif is conserved among functional transporters and highly enriched in transporters that confer growth on xylose. Through saturation mutagenesis and subsequent rational mutagenesis, four transporter mutants unable to confer growth on glucose but able to sustain growth on xylose were engineered. Specifically, Candida intermedia gxs1 Phe38Ile39Met40, Scheffersomyces stipitis rgt2 Phe38 and Met40, and Saccharomyces cerevisiae hxt7 Ile39Met40Met340 all exhibit this phenotype. In these cases, primary hexose transporters were rewired into xylose transporters. These xylose transporters nevertheless remained inhibited by glucose. Furthermore, in the course of identifying this motif, novel wild-type transporters with superior monosaccharide growth profiles were discovered, namely S. stipitis RGT2 and Debaryomyces hansenii 2D01474. These findings build toward the engineering of efficient pentose utilization in yeast and provide a blueprint for reprogramming transporter properties. PMID:24344268
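
    The motif form G-G/F-XXX-G translates directly into a regular expression, which makes it easy to scan candidate transporter sequences. The sketch below assumes that X stands for any single residue and uses a made-up peptide; the function name is hypothetical, and restricting the search to the first transmembrane span is not attempted.

```python
import re

# G, then G or F, then any three residues, then G.
# The lookahead lets overlapping matches be reported.
MOTIF = re.compile(r"(?=(G[GF][A-Z]{3}G))")

def find_transporter_motif(protein):
    """Return (position, matched text) for every G-G/F-XXX-G occurrence."""
    return [(m.start(), m.group(1)) for m in MOTIF.finditer(protein.upper())]

if __name__ == "__main__":
    # Hypothetical peptide, not a real transporter sequence.
    print(find_transporter_motif("MSGGFIMGLL"))
```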

  17. Rewiring yeast sugar transporter preference through modifying a conserved protein motif.

    PubMed

    Young, Eric M; Tong, Alice; Bui, Hang; Spofford, Caitlin; Alper, Hal S

    2014-01-01

    Utilization of exogenous sugars found in lignocellulosic biomass hydrolysates, such as xylose, must be improved before yeast can serve as an efficient biofuel and biochemical production platform. In particular, the first step in this process, the molecular transport of xylose into the cell, can serve as a significant flux bottleneck and is highly inhibited by other sugars. Here we demonstrate that sugar transport preference and kinetics can be rewired through the programming of a sequence motif of the general form G-G/F-XXX-G found in the first transmembrane span. By evaluating 46 different heterologously expressed transporters, we find that this motif is conserved among functional transporters and highly enriched in transporters that confer growth on xylose. Through saturation mutagenesis and subsequent rational mutagenesis, four transporter mutants unable to confer growth on glucose but able to sustain growth on xylose were engineered. Specifically, Candida intermedia gxs1 Phe38Ile39Met40, Scheffersomyces stipitis rgt2 Phe38 and Met40, and Saccharomyces cerevisiae hxt7 Ile39Met40Met340 all exhibit this phenotype. In these cases, primary hexose transporters were rewired into xylose transporters. These xylose transporters nevertheless remained inhibited by glucose. Furthermore, in the course of identifying this motif, novel wild-type transporters with superior monosaccharide growth profiles were discovered, namely S. stipitis RGT2 and Debaryomyces hansenii 2D01474. These findings build toward the engineering of efficient pentose utilization in yeast and provide a blueprint for reprogramming transporter properties. PMID:24344268

  18. Conserved Promoter Motif Is Required for Cell Cycle Timing of dnaX Transcription in Caulobacter

    PubMed Central

    Keiler, Kenneth C.; Shapiro, Lucy

    2001-01-01

    Cells use highly regulated transcriptional networks to control temporally regulated events. In the bacterium Caulobacter crescentus, many cellular processes are temporally regulated with respect to the cell cycle, and the genes required for these processes are expressed immediately before the products are needed. Genes encoding factors required for DNA replication, including dnaX, dnaA, dnaN, gyrB, and dnaK, are induced at the G1/S-phase transition. By analyzing mutations in the dnaX promoter, we identified a motif between the −10 and −35 regions that is required for proper timing of gene expression. This motif, named RRF (for repression of replication factors), is conserved in the promoters of other coordinately induced replication factors. Because mutations in the RRF motif result in constitutive gene expression throughout the cell cycle, this sequence is likely to be the binding site for a cell cycle-regulated transcriptional repressor. Consistent with this hypothesis, Caulobacter extracts contain an activity that binds specifically to the RRF in vitro. PMID:11466289

  19. Exploiting topological constraints to reveal buried sequence motifs in the membrane-bound N-linked oligosaccharyl transferases.

    PubMed

    Jaffee, Marcie B; Imperiali, Barbara

    2011-09-01

    The central enzyme in N-linked glycosylation is the oligosaccharyl transferase (OTase), which catalyzes glycan transfer from a polyprenyldiphosphate-linked carrier to select asparagines within acceptor proteins. PglB from Campylobacter jejuni is a single-subunit OTase with homology to the Stt3 subunit of the complex multimeric yeast OTase. Sequence identity between PglB and Stt3 is low (17.9%); however, both have a similar predicted architecture and contain the conserved WWDxG motif. To investigate the relationship between PglB and other Stt3 proteins, sequence analysis was performed using 28 homologues from evolutionarily distant organisms. Since detection of small conserved motifs within large membrane-associated proteins is complicated by divergent sequences surrounding the motifs, we developed a program to parse sequences according to predicted topology and then analyze topologically related regions. This approach identified three conserved motifs that served as the basis for subsequent mutagenesis and functional studies. This work reveals that several inter-transmembrane loop regions of PglB/Stt3 contain strictly conserved motifs that are essential for PglB function. The recent publication of a 3.4 Å resolution structure of full-length C. lari OTase provides clear structural evidence that these loops play a fundamental role in catalysis [Lizak, C. (2011) Nature 474, 350-355]. The current study provides biochemical support for the role of the inter-transmembrane domain loops in OTase catalysis and demonstrates the utility of combining topology prediction and sequence analysis for exposing buried pockets of homology in large membrane proteins. The described approach allowed detection of the catalytic motifs prior to availability of structural data and reveals additional catalytically relevant residues that are not predicted by structural data alone. PMID:21812456

  20. Phosphotyrosine Substrate Sequence Motifs for Dual Specificity Phosphatases

    PubMed Central

    Zhao, Bryan M.; Keasey, Sarah L.; Tropea, Joseph E.; Lountos, George T.; Dyas, Beverly K.; Cherry, Scott; Raran-Kurussi, Sreejith; Waugh, David S.; Ulrich, Robert G.

    2015-01-01

    Protein tyrosine phosphatases dephosphorylate tyrosine residues of proteins, whereas dual specificity phosphatases (DUSPs) are a subgroup of protein tyrosine phosphatases that dephosphorylate not only Tyr(P) residues but also the Ser(P) and Thr(P) residues of proteins. The DUSPs are linked to the regulation of many cellular functions and signaling pathways. Though many cellular targets of DUSPs are known, the relationship between catalytic activity and substrate specificity is poorly defined. We investigated the interactions of peptide substrates with select DUSPs of four types: MAP kinases (DUSP1 and DUSP7), atypical (DUSP3, DUSP14, DUSP22 and DUSP27), viral (variola VH1), and Cdc25 (A-C). Phosphatase recognition sites were experimentally determined by measuring dephosphorylation of 6,218 microarrayed Tyr(P) peptides representing confirmed and theoretical phosphorylation motifs from the cellular proteome. A broad continuum of dephosphorylation was observed across the microarrayed peptide substrates for all phosphatases, suggesting a complex relationship between substrate sequence recognition and optimal activity. Further analysis of peptide dephosphorylation by hierarchical clustering indicated that DUSPs could be organized by substrate sequence motifs, and peptide-specificities by phylogenetic relationships among the catalytic domains. The most highly dephosphorylated peptides represented proteins from 29 cell-signaling pathways, greatly expanding the list of potential targets of DUSPs. These newly identified DUSP substrates will be important for examining structure-activity relationships with physiologically relevant targets. PMID:26302245

  1. DoOPSearch: a web-based tool for finding and analysing common conserved motifs in the promoter regions of different chordate and plant genes

    PubMed Central

    Sebestyén, Endre; Nagy, Tibor; Suhai, Sándor; Barta, Endre

    2009-01-01

    Background The comparative genomic analysis of a large number of orthologous promoter regions of the chordate and plant genes from the DoOP databases shows thousands of conserved motifs. Most of these motifs differ from any known transcription factor binding site (TFBS). To identify common conserved motifs, we need a specific tool to be able to search amongst them. Since conserved motifs from the DoOP databases are linked to genes, the result of such a search can give a list of genes that are potentially regulated by the same transcription factor(s). Results We have developed a new tool called DoOPSearch for the analysis of the conserved motifs in the promoter regions of chordate or plant genes. We used the orthologous promoters of the DoOP database to extract thousands of conserved motifs from different taxonomic groups. The advantage of this approach is that different sets of conserved motifs might be found depending on how broad the taxonomic coverage of the underlying orthologous promoter sequence collection is (consider e.g. primates vs. mammals or Brassicaceae vs. Viridiplantae). The DoOPSearch tool allows the users to search these motif collections or the promoter regions of DoOP with user supplied query sequences or any of the conserved motifs from the DoOP database. To find overrepresented gene ontologies, the gene lists obtained can be analysed further using a modified version of the GeneMerge program. Conclusion We present here a comparative genomics based promoter analysis tool. Our system is based on a unique collection of conserved promoter motifs characteristic of different taxonomic groups. We offer both a command line and a web-based tool for searching in these motif collections using user specified queries. These can be either short promoter sequences or consensus sequences of known transcription factor binding sites. The GeneMerge analysis of the search results allows the user to identify statistically overrepresented Gene Ontology terms that

  2. A conserved motif flags Acyl Carrier Proteins for β-branching in polyketide synthesis

    PubMed Central

    Song, Zhongshu; Farmer, Rohit; Williams, Christopher; Hothersall, Joanne; Płoskoń, Eliza; Wattana-amorn, Pakorn; Stephens, Elton R.; Yamada, Erika; Gurney, Rachel; Takebayashi, Yuiko; Masschelein, Joleen; Cox, Russell J.; Lavigne, Rob; Willis, Christine L.; Simpson, Thomas J.; Crosby, John; Winn, Peter J.; Thomas, Christopher M.; Crump, Matthew P.

    2015-01-01

    Type I PKSs often utilise programmed β-branching, via enzymes of an “HMG-CoA synthase (HCS) cassette”, to incorporate various side chains at the second carbon from the terminal carboxylic acid of growing polyketide backbones. We identified a strong sequence motif in Acyl Carrier Proteins (ACPs) where β-branching is known. Substituting ACPs confirmed a correlation of ACP type with β-branching specificity. While these ACPs often occur in tandem, NMR analysis of tandem β-branching ACPs indicated no ACP-ACP synergistic effects and revealed that the conserved sequence motif forms an internal core rather than an exposed patch. Modelling and mutagenesis identified ACP Helix III as a probable anchor point of the ACP-HCS complex whose position is determined by the core. Mutating the core affects ACP functionality while ACP-HCS interface substitutions modulate system specificity. Our method for predicting β-carbon branching expands the potential for engineering novel polyketides and lays a basis for determining specificity rules. PMID:24056399

  3. Conserved Repeat Motifs and Glucan Binding by Glucansucrases of Oral Streptococci and Leuconostoc mesenteroides

    PubMed Central

    Shah, Deepan S. H.; Joucla, Gilles; Remaud-Simeon, Magali; Russell, Roy R. B.

    2004-01-01

    Glucansucrases of oral streptococci and Leuconostoc mesenteroides have a common pattern of structural organization and characteristically contain a domain with a series of tandem amino acid repeats in which certain residues are highly conserved, particularly aromatic amino acids and glycine. In some glucosyltransferases (GTFs) the repeat region has been identified as a glucan binding domain (GBD). Such GBDs are also found in several glucan binding proteins (GBP) of oral streptococci that do not have glucansucrase activity. Alignment of the amino acid sequences of 20 glucansucrases and GBP showed the widespread conservation of the 33-residue A repeat first identified in GtfI of Streptococcus downei. Site-directed mutagenesis of individual highly conserved residues in recombinant GBD of GtfI demonstrated the importance of the first tryptophan and the tyrosine-phenylalanine pair in the binding of dextran, as well as the essential contribution of a basic residue (arginine or lysine). A microplate binding assay was developed to measure the binding affinity of recombinant GBDs. GBD of GtfI was shown to be capable of binding glucans with predominantly α-1,3 or α-1,6 links, as well as alternating α-1,3 and α-1,6 links (alternan). Western blot experiments using biotinylated dextran or alternan as probes demonstrated a difference between the binding of streptococcal GTF and GBP and that of Leuconostoc glucansucrases. Experimental data and bioinformatics analysis showed that the A repeat motif is distinct from the 20-residue CW motif, which also has conserved aromatic amino acids and glycine and which occurs in the choline-binding proteins of Streptococcus pneumoniae and other organisms. PMID:15576779

  4. Bioinformatic Identification of Conserved Cis-Sequences in Coregulated Genes.

    PubMed

    Bülow, Lorenz; Hehl, Reinhard

    2016-01-01

    Bioinformatics tools can be employed to identify conserved cis-sequences in sets of coregulated plant genes because more and more gene expression and genomic sequence data become available. Knowledge on the specific cis-sequences, their enrichment and arrangement within promoters, facilitates the design of functional synthetic plant promoters that are responsive to specific stresses. The present chapter illustrates an example for the bioinformatic identification of conserved Arabidopsis thaliana cis-sequences enriched in drought stress-responsive genes. This workflow can be applied for the identification of cis-sequences in any sets of coregulated genes. The workflow includes detailed protocols to determine sets of coregulated genes, to extract the corresponding promoter sequences, and how to install and run a software package to identify overrepresented motifs. Further bioinformatic analyses that can be performed with the results are discussed. PMID:27557771

  5. SLiMPrints: conservation-based discovery of functional motif fingerprints in intrinsically disordered protein regions

    PubMed Central

    Davey, Norman E.; Cowan, Joanne L.; Shields, Denis C.; Gibson, Toby J.; Coldwell, Mark J.; Edwards, Richard J.

    2012-01-01

    Large portions of higher eukaryotic proteomes are intrinsically disordered, and abundant evidence suggests that these unstructured regions of proteins are rich in regulatory interaction interfaces. A major class of disordered interaction interfaces are the compact and degenerate modules known as short linear motifs (SLiMs). As a result of the difficulties associated with the experimental identification and validation of SLiMs, our understanding of these modules is limited, advocating the use of computational methods to focus experimental discovery. This article evaluates the use of evolutionary conservation as a discriminatory technique for motif discovery. A statistical framework is introduced to assess the significance of relatively conserved residues, quantifying the likelihood a residue will have a particular level of conservation given the conservation of the surrounding residues. The framework is expanded to assess the significance of groupings of conserved residues, a metric that forms the basis of SLiMPrints (short linear motif fingerprints), a de novo motif discovery tool. SLiMPrints identifies relatively overconstrained proximal groupings of residues within intrinsically disordered regions, indicative of putatively functional motifs. Finally, the human proteome is analysed to create a set of highly conserved putative motif instances, including a novel site on translation initiation factor eIF2A that may regulate translation through binding of eIF4E. PMID:22977176

  6. Sequence Motifs in MADS Transcription Factors Responsible for Specificity and Diversification of Protein-Protein Interaction

    PubMed Central

    van Dijk, Aalt D. J.; Morabito, Giuseppa; Fiers, Martijn; van Ham, Roeland C. H. J.; Angenent, Gerco C.; Immink, Richard G. H.

    2010-01-01

    Protein sequences encompass tertiary structures and contain information about specific molecular interactions, which in turn determine biological functions of proteins. Knowledge about how protein sequences define interaction specificity is largely missing, in particular for paralogous protein families with high sequence similarity, such as the plant MADS domain transcription factor family. In comparison to the situation in mammalian species, this important family of transcription regulators has expanded enormously in plant species and contains over 100 members in the model plant species Arabidopsis thaliana. Here, we provide insight into the mechanisms that determine protein-protein interaction specificity for the Arabidopsis MADS domain transcription factor family, using an integrated computational and experimental approach. Plant MADS proteins have highly similar amino acid sequences, but their dimerization patterns vary substantially. Our computational analysis uncovered small sequence regions that explain observed differences in dimerization patterns with reasonable accuracy. Furthermore, we show the usefulness of the method for prediction of MADS domain transcription factor interaction networks in other plant species. Introduction of mutations in the predicted interaction motifs demonstrated that single amino acid mutations can have a large effect and lead to loss or gain of specific interactions. In addition, various performed bioinformatics analyses shed light on the way evolution has shaped MADS domain transcription factor interaction specificity. Identified protein-protein interaction motifs appeared to be strongly conserved among orthologs, indicating their evolutionary importance. We also provide evidence that mutations in these motifs can be a source for sub- or neo-functionalization. The analyses presented here take us a step forward in understanding protein-protein interactions and the interplay between protein sequences and network evolution. 

  7. A Monte Carlo-based framework enhances the discovery and interpretation of regulatory sequence motifs

    PubMed Central

    2012-01-01

    Background Discovery of functionally significant short, statistically overrepresented subsequence patterns (motifs) in a set of sequences is a challenging problem in bioinformatics. Oftentimes, not all sequences in the set contain a motif. These non-motif-containing sequences complicate the algorithmic discovery of motifs. Filtering the non-motif-containing sequences from the larger set of sequences while simultaneously determining the identity of the motif is, therefore, desirable and a non-trivial problem in motif discovery research. Results We describe MotifCatcher, a framework that extends the sensitivity of existing motif-finding tools by employing random sampling to effectively remove non-motif-containing sequences from the motif search. We developed two implementations of our algorithm, each built around a commonly used motif-finding tool, and applied our algorithm to three diverse chromatin immunoprecipitation (ChIP) data sets. In each case, the motif finder with the MotifCatcher extension demonstrated improved sensitivity over the motif finder alone. Our approach organizes candidate functionally significant discovered motifs into a tree, which allowed us to make additional insights. In all cases, we were able to support our findings with experimental work from the literature. Conclusions Our framework demonstrates that additional processing at the sequence entry level can significantly improve the performance of existing motif-finding tools. For each biological data set tested, we were able to propose novel biological hypotheses supported by experimental work from the literature. Specifically, in Escherichia coli, we suggested binding site motifs for 6 non-traditional LexA protein binding sites; in Saccharomyces cerevisiae, we hypothesize 2 disparate mechanisms for novel binding sites of the Cse4p protein; and in Halobacterium sp. NRC-1, we discovered subtle differences in a general transcription factor (GTF) binding site motif across several data sets.

  8. SVM2Motif--Reconstructing Overlapping DNA Sequence Motifs by Mimicking an SVM Predictor.

    PubMed

    Vidovic, Marina M-C; Görnitz, Nico; Müller, Klaus-Robert; Rätsch, Gunnar; Kloft, Marius

    2015-01-01

    Identifying discriminative motifs underlying the functionality and evolution of organisms is a major challenge in computational biology. Machine learning approaches such as support vector machines (SVMs) achieve state-of-the-art performance in genomic discrimination tasks, but--due to their black-box character--the motifs underlying their decision functions are largely unknown. As a remedy, positional oligomer importance matrices (POIMs) allow us to visualize the significance of position-specific subsequences. Although they are a major step towards the explanation of trained SVM models, they suffer from the fact that their size grows exponentially in the length of the motif, which renders their manual inspection feasible only for comparably small motif sizes, typically k ≤ 5. In this work, we extend the work on positional oligomer importance matrices by presenting a new machine-learning methodology, entitled motifPOIM, to extract the truly relevant motifs--regardless of their length and complexity--underlying the predictions of a trained SVM model. Our framework thereby considers the motifs as free parameters in a probabilistic model, a task which can be phrased as a non-convex optimization problem. The exponential dependence of the POIM size on the oligomer length poses a major numerical challenge, which we address by an efficient optimization framework that allows us to find possibly overlapping motifs consisting of up to hundreds of nucleotides. We demonstrate the efficacy of our approach on a synthetic data set as well as a real-world human splice site data set. PMID:26690911

  9. TOPDOM: database of conservatively located domains and motifs in proteins

    PubMed Central

    Varga, Julia; Dobson, László; Tusnády, Gábor E.

    2016-01-01

    Summary: The TOPDOM database—originally created as a collection of domains and motifs located consistently on the same side of the membranes in α-helical transmembrane proteins—has been updated and extended by taking into consideration consistently localized domains and motifs in globular proteins, too. By taking advantage of the recently developed CCTOP algorithm to determine the type of a protein and predict topology in case of transmembrane proteins, and by applying a thorough search for domains and motifs as well as utilizing the most up-to-date version of all source databases, we managed to reach a 6-fold increase in the size of the whole database and a 2-fold increase in the number of transmembrane proteins. Availability and implementation: TOPDOM database is available at http://topdom.enzim.hu. The webpage utilizes the common Apache, PHP5 and MySQL software to provide the user interface for accessing and searching the database. The database itself is generated on a high performance computer. Contact: tusnady.gabor@ttk.mta.hu. Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27153630

  10. JAR3D Webserver: Scoring and aligning RNA loop sequences to known 3D motifs.

    PubMed

    Roll, James; Zirbel, Craig L; Sweeney, Blake; Petrov, Anton I; Leontis, Neocles

    2016-07-01

    Many non-coding RNAs have been identified and may function by forming 2D and 3D structures. RNA hairpin and internal loops are often represented as unstructured on secondary structure diagrams, but RNA 3D structures show that most such loops are structured by non-Watson-Crick basepairs and base stacking. Moreover, different RNA sequences can form the same RNA 3D motif. JAR3D finds possible 3D geometries for hairpin and internal loops by matching loop sequences to motif groups from the RNA 3D Motif Atlas, by exact sequence match when possible, and by probabilistic scoring and edit distance for novel sequences. The scoring gauges the ability of the sequences to form the same pattern of interactions observed in 3D structures of the motif. The JAR3D webserver at http://rna.bgsu.edu/jar3d/ takes one or many sequences of a single loop as input, or else one or many sequences of longer RNAs with multiple loops. Each sequence is scored against all current motif groups. The output shows the ten best-matching motif groups. Users can align input sequences to each of the motif groups found by JAR3D. JAR3D will be updated with every release of the RNA 3D Motif Atlas, and so its performance is expected to improve over time. PMID:27235417

  11. Ser/Thr motifs in transmembrane proteins: conservation patterns and effects on local protein structure and dynamics.

    PubMed

    Del Val, Coral; White, Stephen H; Bondar, Ana-Nicoleta

    2012-11-01

    We combined systematic bioinformatics analyses and molecular dynamics simulations to assess the conservation patterns of Ser and Thr motifs in membrane proteins, and the effect of such motifs on the structure and dynamics of α-helical transmembrane (TM) segments. We find that Ser/Thr motifs are often present in β-barrel TM proteins. At least one Ser/Thr motif is present in almost half of the sequences of α-helical proteins analyzed here. The extensive bioinformatics analyses and inspection of protein structures led to the identification of molecular transporters with noticeable numbers of Ser/Thr motifs within the TM region. Given the energetic penalty for burying multiple Ser/Thr groups in the membrane hydrophobic core, the observation of transporters with multiple membrane-embedded Ser/Thr is intriguing and raises the question of how the presence of multiple Ser/Thr affects protein local structure and dynamics. Molecular dynamics simulations of four different Ser-containing model TM peptides indicate that backbone hydrogen bonding of membrane-buried Ser/Thr hydroxyl groups can significantly change the local structure and dynamics of the helix. Ser groups located close to the membrane interface can hydrogen bond to solvent water instead of protein backbone, leading to an enhanced local solvation of the peptide. PMID:22836667

  12. Ser/Thr Motifs in Transmembrane Proteins: Conservation Patterns and Effects on Local Protein Structure and Dynamics

    PubMed Central

    del Val, Coral; White, Stephen H.

    2014-01-01

    We combined systematic bioinformatics analyses and molecular dynamics simulations to assess the conservation patterns of Ser and Thr motifs in membrane proteins, and the effect of such motifs on the structure and dynamics of α-helical transmembrane (TM) segments. We find that Ser/Thr motifs are often present in β-barrel TM proteins. At least one Ser/Thr motif is present in almost half of the sequences of α-helical proteins analyzed here. The extensive bioinformatics analyses and inspection of protein structures led to the identification of molecular transporters with noticeable numbers of Ser/Thr motifs within the TM region. Given the energetic penalty for burying multiple Ser/Thr groups in the membrane hydrophobic core, the observation of transporters with multiple membrane-embedded Ser/Thr is intriguing and raises the question of how the presence of multiple Ser/Thr affects protein local structure and dynamics. Molecular dynamics simulations of four different Ser-containing model TM peptides indicate that backbone hydrogen bonding of membrane-buried Ser/Thr hydroxyl groups can significantly change the local structure and dynamics of the helix. Ser groups located close to the membrane interface can hydrogen bond to solvent water instead of protein backbone, leading to an enhanced local solvation of the peptide. PMID:22836667

  13. False occurrences of functional motifs in protein sequences highlight evolutionary constraints

    PubMed Central

    Via, Allegra; Gherardini, Pier Federico; Ferraro, Enrico; Ausiello, Gabriele; Scalia Tomba, Gianpaolo; Helmer-Citterich, Manuela

    2007-01-01

    Background False occurrences of functional motifs in protein sequences can be considered as random events due solely to the sequence composition of a proteome. Here we use a numerical approach to investigate the random appearance of functional motifs with the aim of addressing biological questions such as: How are organisms protected from undesirable occurrences of motifs otherwise selected for their functionality? Has the random appearance of functional motifs in protein sequences been affected during evolution? Results Here we analyse the occurrence of functional motifs in random sequences and compare it to that observed in biological proteomes; the behaviour of random motifs is also studied. Most motifs exhibit a number of false positives significantly similar to the number of times they appear in randomized proteomes (=expected number of false positives). Interestingly, about 3% of the analysed motifs show a different kind of behaviour and appear in biological proteomes less than they do in random sequences. In some of these cases, a mechanism of evolutionary negative selection is apparent; this helps to prevent unwanted functionalities which could interfere with cellular mechanisms. Conclusion Our thorough statistical and biological analysis showed that there are several mechanisms and evolutionary constraints both of which affect the appearance of functional motifs in protein sequences. PMID:17331242
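
    The comparison described in this abstract, observed motif counts against counts in randomized sequences, can be sketched as below. This is an illustration under stated assumptions rather than the authors' procedure: the abstract does not say how the proteomes were randomized, so per-sequence residue shuffling is assumed here, and the names `count_hits`, `shuffled` and `observed_vs_expected` are hypothetical.

```python
import random
import re

def count_hits(pattern, seqs):
    """Total number of (possibly overlapping) matches of a regex motif across sequences."""
    rx = re.compile(f"(?={pattern})")
    return sum(len(rx.findall(s)) for s in seqs)

def shuffled(seq, rng):
    """Shuffle the residues of one sequence, preserving its length and composition."""
    chars = list(seq)
    rng.shuffle(chars)
    return "".join(chars)

def observed_vs_expected(pattern, seqs, n_shuffles=100, seed=0):
    """Compare the observed motif count with the mean count over shuffled sequence sets."""
    rng = random.Random(seed)
    observed = count_hits(pattern, seqs)
    null = [count_hits(pattern, [shuffled(s, rng) for s in seqs])
            for _ in range(n_shuffles)]
    expected = sum(null) / n_shuffles
    # Fraction of shuffled sets with at most as many hits as observed;
    # a small value suggests the motif is depleted in the real sequences.
    p_depleted = sum(c <= observed for c in null) / n_shuffles
    return observed, expected, p_depleted
```

    A call such as `observed_vs_expected("N[^P][ST]", sequences)` would then report whether a glycosylation-like pattern occurs less often than composition alone predicts; a real analysis would also need correction for testing many motifs.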

  14. Illumina MiSeq sequencing disfavours a sequence motif in the GFP reporter gene

    PubMed Central

    Van den Hoecke, Silvie; Verhelst, Judith; Saelens, Xavier

    2016-01-01

    Green fluorescent protein (GFP) is one of the most used reporter genes. We have used next-generation sequencing (NGS) to analyse the genetic diversity of a recombinant influenza A virus that expresses GFP and found a remarkable coverage dip in the GFP coding sequence. This coverage dip was present when virus-derived RT-PCR product or the parental plasmid DNA was used as starting material for NGS and regardless of whether Nextera XT transposase or Covaris shearing was used for DNA fragmentation. Therefore, the sequence coverage dip in the GFP coding sequence was not the result of emerging GFP mutant viruses or a bias introduced by Nextera XT fragmentation. Instead, we found that the Illumina MiSeq sequencing method disfavours the ‘CCCGCC’ motif in the GFP coding sequence. PMID:27193250
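
    To relate a coverage dip to the 'CCCGCC' motif, one first needs the motif's positions in the reference sequence. The sketch below is a hypothetical helper, not part of the authors' pipeline; scanning the reverse strand as well is an assumption, since the abstract does not state whether the effect is strand-specific.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def motif_positions(seq, motif="CCCGCC"):
    """Return (0-based start, strand) for each motif occurrence, in forward coordinates."""
    seq = seq.upper()
    # "+" means the motif itself; "-" means its reverse complement appears on the forward strand.
    targets = {motif: "+", reverse_complement(motif): "-"}
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        strand = targets.get(seq[i:i + len(motif)])
        if strand:
            hits.append((i, strand))
    return hits

# Toy sequence for illustration only (not the GFP coding sequence).
print(motif_positions("AACCCGCCTTGGCGGGAA"))
```

    The returned positions could then be overlaid on a per-base coverage track to check whether dips coincide with the motif.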

  15. Illumina MiSeq sequencing disfavours a sequence motif in the GFP reporter gene.

    PubMed

    Van den Hoecke, Silvie; Verhelst, Judith; Saelens, Xavier

    2016-01-01

    Green fluorescent protein (GFP) is one of the most used reporter genes. We have used next-generation sequencing (NGS) to analyse the genetic diversity of a recombinant influenza A virus that expresses GFP and found a remarkable coverage dip in the GFP coding sequence. This coverage dip was present when virus-derived RT-PCR product or the parental plasmid DNA was used as starting material for NGS and regardless of whether Nextera XT transposase or Covaris shearing was used for DNA fragmentation. Therefore, the sequence coverage dip in the GFP coding sequence was not the result of emerging GFP mutant viruses or a bias introduced by Nextera XT fragmentation. Instead, we found that the Illumina MiSeq sequencing method disfavours the 'CCCGCC' motif in the GFP coding sequence. PMID:27193250

  16. Sequence motifs and prokaryotic expression of the reptilian paramyxovirus fusion protein

    USGS Publications Warehouse

    Franke, J.; Batts, W.N.; Ahne, W.; Kurath, G.; Winton, J.R.

    2006-01-01

    Fourteen reptilian paramyxovirus isolates were chosen to represent the known extent of genetic diversity among this novel group of viruses. Selected regions of the fusion (F) gene were sequenced, analyzed and compared. The F gene of all isolates contained conserved motifs homologous to those described for other members of the family Paramyxoviridae including: signal peptide, transmembrane domain, furin cleavage site, fusion peptide, N-linked glycosylation sites, and two heptad repeats, the second of which (HRB-LZ) had the characteristics of a leucine zipper. Selected regions of the fusion gene of isolate Gono-GER85 were inserted into a prokaryotic expression system to generate three recombinant protein fragments of various sizes. The longest recombinant protein was cleaved by furin into two fragments of predicted length. Western blot analysis with virus-neutralizing rabbit-antiserum against this isolate demonstrated that only the longest construct reacted with the antiserum. This construct was unique in containing 30 additional C-terminal amino acids that included most of the HRB-LZ. These results indicate that the F genes of reptilian paramyxoviruses contain highly conserved motifs typical of other members of the family and suggest that the HRB-LZ domain of the reptilian paramyxovirus F protein contains a linear antigenic epitope. © Springer-Verlag 2005.

  17. RSAT::Plants: Motif Discovery Within Clusters of Upstream Sequences in Plant Genomes.

    PubMed

    Contreras-Moreira, Bruno; Castro-Mondragon, Jaime A; Rioualen, Claire; Cantalapiedra, Carlos P; van Helden, Jacques

    2016-01-01

    The plant-dedicated mirror of the Regulatory Sequence Analysis Tools (RSAT, http://plants.rsat.eu) offers specialized options for researchers dealing with plant transcriptional regulation. The website contains whole-sequenced genomes from species regularly updated from Ensembl Plants and other sources (currently 40), and supports an array of tasks frequently required for the analysis of regulatory sequences, such as retrieving upstream sequences, motif discovery, motif comparison, and pattern matching. RSAT::Plants also integrates the footprintDB collection of DNA motifs. This protocol explains step-by-step how to discover DNA motifs in regulatory regions of clusters of co-expressed genes in plants. It also explains how to empirically control the significance of the result, and how to associate the discovered motifs with putative binding factors. PMID:27557774

  18. Unique Structural Features and Sequence Motifs of Proline Utilization A (PutA)

    PubMed Central

    Singh, Ranjan K.; Tanner, John J.

    2013-01-01

    Proline utilization A proteins (PutAs) are bifunctional enzymes that catalyze the oxidation of proline to glutamate using spatially separated proline dehydrogenase and pyrroline-5-carboxylate dehydrogenase active sites. Here we use the crystal structure of the minimalist PutA from Bradyrhizobium japonicum (BjPutA) along with sequence analysis to identify unique structural features of PutAs. This analysis shows that PutAs have secondary structural elements and domains not found in the related monofunctional enzymes. Some of these extra features are predicted to be important for substrate channeling in BjPutA. Multiple sequence alignment analysis shows that some PutAs have a 17-residue conserved motif in the C-terminal 20–30 residues of the polypeptide chain. The BjPutA structure shows that this motif helps seal the internal substrate-channeling cavity from the bulk medium. Finally, it is shown that some PutAs have a 100–200 residue domain of unknown function in the C-terminus that is not found in minimalist PutAs. Remote homology detection suggests that this domain is homologous to the oligomerization beta-hairpin and Rossmann fold domain of BjPutA. PMID:22201760

  19. Unique structural features and sequence motifs of proline utilization A (PutA).

    PubMed

    Singh, Ranjan K; Tanner, John J

    2012-01-01

    Proline utilization A proteins (PutAs) are bifunctional enzymes that catalyze the oxidation of proline to glutamate using spatially separated proline dehydrogenase and pyrroline-5-carboxylate dehydrogenase active sites. Here we use the crystal structure of the minimalist PutA from Bradyrhizobium japonicum (BjPutA) along with sequence analysis to identify unique structural features of PutAs. This analysis shows that PutAs have secondary structural elements and domains not found in the related monofunctional enzymes. Some of these extra features are predicted to be important for substrate channeling in BjPutA. Multiple sequence alignment analysis shows that some PutAs have a 17-residue conserved motif in the C-terminal 20-30 residues of the polypeptide chain. The BjPutA structure shows that this motif helps seal the internal substrate-channeling cavity from the bulk medium. Finally, it is shown that some PutAs have a 100-200 residue domain of unknown function in the C-terminus that is not found in minimalist PutAs. Remote homology detection suggests that this domain is homologous to the oligomerization beta-hairpin and Rossmann fold domain of BjPutA. PMID:22201760

  20. A conserved motif mediates both multimer formation and allosteric activation of phosphoglycerate mutase 5.

    PubMed

    Wilkins, Jordan M; McConnell, Cyrus; Tipton, Peter A; Hannink, Mark

    2014-09-01

    Phosphoglycerate mutase 5 (PGAM5) is an atypical mitochondrial Ser/Thr phosphatase that modulates mitochondrial dynamics and participates in both apoptotic and necrotic cell death. The mechanisms that regulate the phosphatase activity of PGAM5 are poorly understood. The C-terminal phosphoglycerate mutase domain of PGAM5 shares homology with the catalytic domains found in other members of the phosphoglycerate mutase family, including a conserved histidine that is absolutely required for catalytic activity. However, this conserved domain is not sufficient for maximal phosphatase activity. We have identified a highly conserved amino acid motif, WDXNWD, located within the unique N-terminal region, which is required for assembly of PGAM5 into large multimeric complexes. Alanine substitutions within the WDXNWD motif abolish the formation of multimeric complexes and markedly reduce phosphatase activity of PGAM5. A peptide containing the WDXNWD motif dissociates the multimeric complex and reduces but does not fully abolish phosphatase activity. Addition of the WDXNWD-containing peptide in trans to a mutant PGAM5 protein lacking the WDXNWD motif markedly increases phosphatase activity of the mutant protein. Our results are consistent with an intermolecular allosteric regulation mechanism for the phosphatase activity of PGAM5, in which the assembly of PGAM5 into multimeric complexes, mediated by the WDXNWD motif, results in maximal activation of phosphatase activity. Our results suggest the possibility of identifying small molecules that function as allosteric regulators of the phosphatase activity of PGAM5. PMID:25012655

  1. A Conserved Motif Mediates both Multimer Formation and Allosteric Activation of Phosphoglycerate Mutase 5

    PubMed Central

    Wilkins, Jordan M.; McConnell, Cyrus; Tipton, Peter A.; Hannink, Mark

    2014-01-01

    Phosphoglycerate mutase 5 (PGAM5) is an atypical mitochondrial Ser/Thr phosphatase that modulates mitochondrial dynamics and participates in both apoptotic and necrotic cell death. The mechanisms that regulate the phosphatase activity of PGAM5 are poorly understood. The C-terminal phosphoglycerate mutase domain of PGAM5 shares homology with the catalytic domains found in other members of the phosphoglycerate mutase family, including a conserved histidine that is absolutely required for catalytic activity. However, this conserved domain is not sufficient for maximal phosphatase activity. We have identified a highly conserved amino acid motif, WDXNWD, located within the unique N-terminal region, which is required for assembly of PGAM5 into large multimeric complexes. Alanine substitutions within the WDXNWD motif abolish the formation of multimeric complexes and markedly reduce phosphatase activity of PGAM5. A peptide containing the WDXNWD motif dissociates the multimeric complex and reduces but does not fully abolish phosphatase activity. Addition of the WDXNWD-containing peptide in trans to a mutant PGAM5 protein lacking the WDXNWD motif markedly increases phosphatase activity of the mutant protein. Our results are consistent with an intermolecular allosteric regulation mechanism for the phosphatase activity of PGAM5, in which the assembly of PGAM5 into multimeric complexes, mediated by the WDXNWD motif, results in maximal activation of phosphatase activity. Our results suggest the possibility of identifying small molecules that function as allosteric regulators of the phosphatase activity of PGAM5. PMID:25012655

  2. Physical-chemical property based sequence motifs and methods regarding same

    DOEpatents

    Braun, Werner; Mathura, Venkatarajan S.; Schein, Catherine H.

    2008-09-09

    A data analysis system, program, and/or method, e.g., a data mining/data exploration method, using physical-chemical property motifs. For example, a sequence database may be searched for identifying segments thereof having physical-chemical properties similar to the physical-chemical property motifs.

  3. The Eps1p Protein Disulfide Isomerase Conserves Classic Thioredoxin Superfamily Amino Acid Motifs but Not Their Functional Geometries

    PubMed Central

    Biran, Shai; Gat, Yair; Fass, Deborah

    2014-01-01

    The widespread thioredoxin superfamily enzymes typically share the following features: a characteristic α-β fold, the presence of a Cys-X-X-Cys (or Cys-X-X-Ser) redox-active motif, and a proline in the cis configuration abutting the redox-active site in the tertiary structure. The Cys-X-X-Cys motif is at the solvent-exposed amino terminus of an α-helix, allowing the first cysteine to engage in nucleophilic attack on substrates, or substrates to attack the Cys-X-X-Cys disulfide, depending on whether the enzyme functions to reduce, isomerize, or oxidize its targets. We report here the X-ray crystal structure of an enzyme that breaks many of our assumptions regarding the sequence-structure relationship of thioredoxin superfamily proteins. The yeast Protein Disulfide Isomerase family member Eps1p has Cys-X-X-Cys motifs and proline residues at the appropriate primary structural positions in its first two predicted thioredoxin-fold domains. However, crystal structures show that the Cys-X-X-Cys of the second domain is buried and that the adjacent proline is in the trans, rather than the cis isomer. In these configurations, neither the “active-site” disulfide nor the backbone carbonyl preceding the proline is available to interact with substrate. The Eps1p structures thus expand the documented diversity of the PDI oxidoreductase family and demonstrate that conserved sequence motifs in common folds do not guarantee structural or functional conservation. PMID:25437863

  4. Interaction of MYC with host cell factor-1 is mediated by the evolutionarily conserved Myc box IV motif.

    PubMed

    Thomas, L R; Foshage, A M; Weissmiller, A M; Popay, T M; Grieb, B C; Qualls, S J; Ng, V; Carboneau, B; Lorey, S; Eischen, C M; Tansey, W P

    2016-07-01

    The MYC family of oncogenes encodes a set of three related transcription factors that are overexpressed in many human tumors and contribute to the cancer-related deaths of more than 70,000 Americans every year. MYC proteins drive tumorigenesis by interacting with co-factors that enable them to regulate the expression of thousands of genes linked to cell growth, proliferation, metabolism and genome stability. One effective way to identify critical co-factors required for MYC function has been to focus on sequence motifs within MYC that are conserved throughout evolution, on the assumption that their conservation is driven by protein-protein interactions that are vital for MYC activity. In addition to their DNA-binding domains, MYC proteins carry five regions of high sequence conservation known as Myc boxes (Mb). To date, four of the Mb motifs (MbI, MbII, MbIIIa and MbIIIb) have had a molecular function assigned to them, but the precise role of the remaining Mb, MbIV, and the reason for its preservation in vertebrate Myc proteins, is unknown. Here, we show that MbIV is required for the association of MYC with the abundant transcriptional coregulator host cell factor-1 (HCF-1). We show that the invariant core of MbIV resembles the tetrapeptide HCF-binding motif (HBM) found in many HCF-interaction partners, and demonstrate that MYC interacts with HCF-1 in a manner indistinguishable from the prototypical HBM-containing protein VP16. Finally, we show that rationalized point mutations in MYC that disrupt interaction with HCF-1 attenuate the ability of MYC to drive tumorigenesis in mice. Together, these data expose a molecular function for MbIV and indicate that HCF-1 is an important co-factor for MYC. PMID:26522729

  5. Evolutionarily conserved sequences on human chromosome 21

    SciTech Connect

    Frazer, Kelly A.; Sheehan, John B.; Stokowski, Renee P.; Chen, Xiyin; Hosseini, Roya; Cheng, Jan-Fang; Fodor, Stephen P.A.; Cox, David R.; Patil, Nila

    2001-09-01

    Comparison of human sequences with the DNA of other mammals is an excellent means of identifying functional elements in the human genome. Here we describe the utility of high-density oligonucleotide arrays as a rapid approach for comparing human sequences with the DNA of multiple species whose sequences are not presently available. High-density arrays representing approximately 22.5 Mb of nonrepetitive human chromosome 21 sequence were synthesized and then hybridized with mouse and dog DNA to identify sequences conserved between humans and mice (human-mouse elements) and between humans and dogs (human-dog elements). Our data show that sequence comparison of multiple species provides a powerful empiric method for identifying actively conserved elements in the human genome. A large fraction of these evolutionarily conserved elements are present in regions on chromosome 21 that do not encode known genes.

  6. Analysis of Genomic Sequence Motifs for Deciphering Transcription Factor Binding and Transcriptional Regulation in Eukaryotic Cells

    PubMed Central

    Boeva, Valentina

    2016-01-01

    Eukaryotic genomes contain a variety of structured patterns: repetitive elements, binding sites of DNA and RNA associated proteins, splice sites, and so on. Often, these structured patterns can be formalized as motifs and described using a proper mathematical model such as position weight matrix and IUPAC consensus. Two key tasks are typically carried out for motifs in the context of the analysis of genomic sequences. These are: identification in a set of DNA regions of over-represented motifs from a particular motif database, and de novo discovery of over-represented motifs. Here we describe existing methodology to perform these two tasks for motifs characterizing transcription factor binding. When applied to the output of ChIP-seq and ChIP-exo experiments, or to promoter regions of co-modulated genes, motif analysis techniques allow for the prediction of transcription factor binding events and enable identification of transcriptional regulators and co-regulators. The usefulness of motif analysis is further exemplified in this review by how motif discovery improves peak calling in ChIP-seq and ChIP-exo experiments and, when coupled with information on gene expression, allows insights into physical mechanisms of transcriptional modulation. PMID:26941778
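
    As an illustration of the position weight matrix model mentioned in this abstract, the sketch below builds a log-odds matrix from a handful of aligned sites and scans a sequence for the best-scoring window. It is a minimal sketch and not the review's own method: the function names, the pseudocount of 1, the uniform background, and the example sites are all assumptions made for illustration.

```python
import math

def build_log_odds(sites, alphabet="ACGT", pseudocount=1.0, background=None):
    """Build a log-odds position weight matrix from equal-length aligned sites."""
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all sites must have the same length")
    if background is None:
        # Assumption: uniform background frequencies.
        background = {base: 1.0 / len(alphabet) for base in alphabet}
    matrix = []
    for pos in range(width):
        # Start every count at the pseudocount so unseen bases keep a finite score.
        counts = {base: pseudocount for base in alphabet}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        matrix.append(
            {base: math.log2((counts[base] / total) / background[base]) for base in alphabet}
        )
    return matrix

def score_window(matrix, window):
    """Sum the per-position log-odds scores for one window."""
    return sum(column[base] for column, base in zip(matrix, window))

def best_hit(matrix, sequence):
    """Return (score, start) of the highest-scoring window in the sequence."""
    width = len(matrix)
    scored = [
        (score_window(matrix, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    ]
    return max(scored)

# Hypothetical aligned binding sites and a hypothetical sequence to scan.
example_sites = ["CACGTG", "CACGTG", "CACGTT", "CACGTG"]
example_matrix = build_log_odds(example_sites)
example_score, example_start = best_hit(example_matrix, "TTTCACGTGAAA")
```

    A consensus string discards the per-position frequencies that the matrix keeps, which is why the two models can rank candidate sites differently.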

  7. Analysis of Genomic Sequence Motifs for Deciphering Transcription Factor Binding and Transcriptional Regulation in Eukaryotic Cells.

    PubMed

    Boeva, Valentina

    2016-01-01

    Eukaryotic genomes contain a variety of structured patterns: repetitive elements, binding sites of DNA and RNA associated proteins, splice sites, and so on. Often, these structured patterns can be formalized as motifs and described using a proper mathematical model such as position weight matrix and IUPAC consensus. Two key tasks are typically carried out for motifs in the context of the analysis of genomic sequences. These are: identification in a set of DNA regions of over-represented motifs from a particular motif database, and de novo discovery of over-represented motifs. Here we describe existing methodology to perform these two tasks for motifs characterizing transcription factor binding. When applied to the output of ChIP-seq and ChIP-exo experiments, or to promoter regions of co-modulated genes, motif analysis techniques allow for the prediction of transcription factor binding events and enable identification of transcriptional regulators and co-regulators. The usefulness of motif analysis is further exemplified in this review by how motif discovery improves peak calling in ChIP-seq and ChIP-exo experiments and, when coupled with information on gene expression, allows insights into physical mechanisms of transcriptional modulation. PMID:26941778

  8. Repulsive parallel MCMC algorithm for discovering diverse motifs from large sequence sets

    PubMed Central

    Ikebata, Hisaki; Yoshida, Ryo

    2015-01-01

    Motivation: The motif discovery problem consists of finding recurring patterns of short strings in a set of nucleotide sequences. This classical problem is receiving renewed attention as most early motif discovery methods lack the ability to handle large data of recent genome-wide ChIP studies. New ChIP-tailored methods focus on reducing computation time and pay little regard to the accuracy of motif detection. Unlike such methods, our method focuses on increasing the detection accuracy while maintaining the computation efficiency at an acceptable level. The major advantage of our method is that it can mine diverse multiple motifs undetectable by current methods. Results: The repulsive parallel Markov chain Monte Carlo (RPMCMC) algorithm that we propose is a parallel version of the widely used Gibbs motif sampler. RPMCMC is run on parallel interacting motif samplers. A repulsive force is generated when different motifs produced by different samplers near each other. Thus, different samplers explore different motifs. In this way, we can detect much more diverse motifs than conventional methods can. Through application to 228 transcription factor ChIP-seq datasets of the ENCODE project, we show that the RPMCMC algorithm can find many reliable cofactor interacting motifs that existing methods are unable to discover. Availability and implementation: A C++ implementation of RPMCMC and discovered cofactor motifs for the 228 ENCODE ChIP-seq datasets are available from http://daweb.ism.ac.jp/yoshidalab/motif. Contact: ikebata.hisaki@ism.ac.jp, yoshidar@ism.ac.jp Supplementary information: Supplementary data are available from Bioinformatics online. PMID:25583120

  9. Characterization of a conserved C-terminal motif (RSPRR) in ribosomal protein S6 kinase 1 required for its mammalian target of rapamycin-dependent regulation.

    PubMed

    Schalm, Stefanie S; Tee, Andrew R; Blenis, John

    2005-03-25

    The mammalian target of rapamycin, mTOR, is a Ser/Thr kinase that promotes cell growth and proliferation by activating ribosomal protein S6 kinase 1 (S6K1). We previously identified a conserved TOR signaling (TOS) motif in the N terminus of S6K1 that is required for its mTOR-dependent activation. Furthermore, our data suggested that the TOS motif suppresses an inhibitory function associated with the C terminus of S6K1. Here, we have characterized the mTOR-regulated inhibitory region within the C terminus. We have identified a conserved C-terminal "RSPRR" sequence that is responsible for an mTOR-dependent suppression of S6K1 activation. Deletion or mutations within this RSPRR motif partially rescue the kinase activity of the S6K1 TOS motif mutant (S6K1-F5A), and this rescued activity is rapamycin resistant. Furthermore, we have shown that the RSPRR motif significantly suppresses S6K1 phosphorylation at two phosphorylation sites (Thr-389 and Thr-229) that are crucial for S6K1 activation. Importantly, introducing both the Thr-389 phosphomimetic and RSPRR motif mutations into the catalytically inactive S6K1 mutant S6K1-F5A completely rescues its activity and renders it fully rapamycin resistant. These data show that the N-terminal TOS motif suppresses an inhibitory function mediated by the C-terminal RSPRR motif. We propose that the RSPRR motif interacts with a negative regulator of S6K1 that is normally suppressed by mTOR. PMID:15659381

  10. An artificial intelligence approach to motif discovery in protein sequences: application to steroid dehydrogenases.

    PubMed

    Bailey, T L; Baker, M E; Elkan, C P

    1997-05-01

    MEME (Multiple Expectation-maximization for Motif Elicitation) is a unique new software tool that uses artificial intelligence techniques to discover motifs shared by a set of protein sequences in a fully automated manner. This paper is the first detailed study of the use of MEME to analyse a large, biologically relevant set of sequences, and to evaluate the sensitivity and accuracy of MEME in identifying structurally important motifs. For this purpose, we chose the short-chain alcohol dehydrogenase superfamily because it is large and phylogenetically diverse, providing a test of how well MEME can work on sequences with low amino acid similarity. Moreover, this dataset contains enzymes of biological importance, and because several enzymes have known X-ray crystallographic structures, we can test the usefulness of MEME for structural analysis. The first six motifs from MEME map onto structurally important alpha-helices and beta-strands on Streptomyces hydrogenans 20beta-hydroxysteroid dehydrogenase. We also describe MAST (Motif Alignment Search Tool), which conveniently uses output from MEME for searching databases such as SWISS-PROT and Genpept. MAST provides statistical measures that permit a rigorous evaluation of the significance of database searches with individual motifs or groups of motifs. A database search of Genpept90 by MAST with the log-odds matrix of the first six motifs obtained from MEME yields a bimodal output, demonstrating the selectivity of MAST. We show for the first time, using primary sequence analysis, that bacterial sugar epimerases are homologs of short-chain dehydrogenases. MEME and MAST will be increasingly useful as genome sequencing provides large datasets of phylogenetically divergent sequences of biomedical interest. PMID:9366496

  11. A Conserved Di-Basic Motif of Drosophila Crumbs Contributes to Efficient ER Export.

    PubMed

    Kumichel, Alexandra; Kapp, Katja; Knust, Elisabeth

    2015-06-01

    The Drosophila type I transmembrane protein Crumbs is an apical determinant required for the maintenance of apico-basal epithelial cell polarity. The level of Crumbs at the plasma membrane is crucial, but how it is regulated is poorly understood. In a genetic screen for regulators of Crumbs protein trafficking we identified Sar1, the core component of the coat protein complex II transport vesicles. sar1 mutant embryos show a reduced plasma membrane localization of Crumbs, a defect similar to that observed in haunted and ghost mutant embryos, which lack Sec23 and Sec24CD, respectively. By pulse-chase assays in Drosophila Schneider cells and analysis of protein transport kinetics based on Endoglycosidase H resistance we identified an RNKR motif in Crumbs, which contributes to efficient ER export. The motif identified fits the highly conserved di-basic RxKR motif and mediates interaction with Sar1. The RNKR motif is also required for plasma membrane delivery of transgene-encoded Crumbs in epithelial cells of Drosophila embryos. Our data are the first to show that a di-basic motif acts as a signal for ER exit of a type I plasma membrane protein in a metazoan organism. PMID:25753515

  12. Two structurally distinct κB sequence motifs cooperatively control LPS-induced KC gene transcription in mouse macrophages

    SciTech Connect

    Ohmori, Y.; Fukumoto, S.; Hamilton, T.A.

    1995-10-01

    The mouse KC gene is an α-chemokine gene whose transcription is induced in mononuclear phagocytes by LPS. DNA sequences necessary for transcriptional control of KC by LPS were identified in the region flanking the transcription start site. Transient transfection analysis in macrophages using deletion mutants of a 1.5-kb sequence placed in front of the chloramphenicol acetyl transferase (CAT) gene identified an LPS-responsive region between residues -104 and +30. This region contained two κB sequence motifs. The first motif (position -70 to -59, κB1) is highly conserved in all three human GRO genes and in the mouse macrophage inflammatory protein-2 (MIP-2) gene. The second κB motif (position -89 to -78, κB2) was conserved only between the mouse and the rat KC genes. Consistent with previous reports, the highly conserved κB site (κB1) was essential for LPS inducibility. Surprisingly, the distal κB site (κB2) was also necessary for optimal response; mutation of either κB site markedly reduced sensitivity to LPS in RAW264.7 cells and to TNF-α in NIH 3T3 fibroblasts. Although both κB1 and κB2 sequences were able to bind members of the Rel homology family, including NFκB1 (p50), RelA (p65), and c-Rel, the κB1 site bound these factors with higher affinity and functioned more effectively than the κB2 site in a heterologous promoter. These findings demonstrate that transcriptional control of the KC gene requires cooperation between two κB sites and is thus distinct from that of the three human GRO genes and the mouse MIP-2 gene. 71 refs., 8 figs.

  13. A tobacco bZip transcription activator (TAF-1) binds to a G-box-like motif conserved in plant genes.

    PubMed Central

    Oeda, K; Salinas, J; Chua, N H

    1991-01-01

    Tobacco nuclear extract contains a factor that binds specifically to the motif I sequence (5'-GTACGTGGCG-3') conserved among rice rab genes and cotton lea genes. From a tobacco cDNA expression library, we isolated a partial cDNA clone encoding a truncated derivative of a protein designated as TAF-1. The truncated TAF-1 (Mr = 26,000) contains an acidic region at its N-terminus and a bZip motif at its C-terminus. Using a panel of motif I mutants as probes, we showed that the truncated TAF-1 and the tobacco nuclear factor for motif I have similar, if not identical, binding specificities. In particular, both show high-affinity binding to the perfect palindrome 5'-GCCACGTGGC-3', which is also known as the G-box motif. TAF-1 mRNA is highly expressed in root, but the level is at least 10 times lower in stem and leaf. Consistent with this observation, we found that a motif I tetramer, when fused to the -90 derivative of the CaMV 35S promoter, is inactive in leaf of transgenic tobacco. The activity, however, can be elevated by transient expression of the truncated TAF-1. We conclude from these results that TAF-1 can bind to the G-box and related motifs and that it functions as a transcription activator. PMID:2050116
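
    A "perfect palindrome" in the DNA sense is a sequence that equals its own reverse complement. The short sketch below checks this for the two sequences quoted in the abstract; the function names are illustrative assumptions and are not taken from the paper.

```python
# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def is_palindromic(seq):
    """True when the sequence reads the same on both strands (5' to 3')."""
    return seq.upper() == reverse_complement(seq)

g_box = "GCCACGTGGC"    # G-box motif quoted in the abstract
motif_i = "GTACGTGGCG"  # motif I sequence quoted in the abstract

g_box_is_palindrome = is_palindromic(g_box)
motif_i_is_palindrome = is_palindromic(motif_i)
```

    Under this check the G-box is palindromic and motif I is not, although both share the ACGTGGC core.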

  14. Conserved structural motifs located in distal loops of aphthovirus internal ribosome entry site domain 3 are required for internal initiation of translation.

    PubMed Central

    López de Quinto, S; Martínez-Salas, E

    1997-01-01

    A comparison of picornavirus internal ribosome entry site (IRES) secondary structures revealed the existence of conserved motifs located on loops. We have carried out a mutational analysis to test their requirement for IRES-driven translation. The GUAA sequence, located in the aphthovirus 3A loop, did not tolerate substitutions that disrupt the GNRA motif. Interestingly, this motif was found at similar positions in all picornavirus IRESs, suggesting that it may form part of a tertiary-structure element. The RAAA tetranucleotide located in the 3B loop was conserved only in cardiovirus and aphthovirus. A mutational analysis of the RAAA motif revealed that activities of 3B loop mutants correlated with both the presence of a sequence close to CAAA at the new 3B loop and the absence of reorganization of the 3B and 3C stem-loops. In support of this conclusion, insertion of a large number of nucleotides close to the 3B loop, which was predicted to reorganize the 3B-3C stem-loop structure, led to defective IRES elements. We conclude that the aphthovirus IRES loops located at the most distal part of domain 3, which carries GNRA and RAAA motifs, are essential for IRES function. PMID:9094703
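
    The GNRA and RAAA motifs are written with IUPAC ambiguity codes (N for any base, R for a purine). One way to test a tetranucleotide against such a motif is to expand each code into a regular-expression character class, as in the minimal sketch below. The helper names are assumptions for illustration, and the code says nothing about loop structure, which the abstract also treats as important.

```python
import re

# IUPAC ambiguity codes for RNA, expanded to regular-expression classes.
IUPAC_RNA = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "R": "[AG]", "Y": "[CU]", "S": "[CG]", "W": "[AU]",
    "K": "[GU]", "M": "[AC]", "B": "[CGU]", "D": "[AGU]",
    "H": "[ACU]", "V": "[ACG]", "N": "[ACGU]",
}

def iupac_to_regex(motif):
    """Compile an IUPAC RNA motif into a regular expression."""
    return re.compile("".join(IUPAC_RNA[code] for code in motif.upper()))

def matches(motif, seq):
    """True when the whole sequence fits the IUPAC motif."""
    return iupac_to_regex(motif).fullmatch(seq.upper()) is not None

guaa_fits_gnra = matches("GNRA", "GUAA")  # wild-type 3A loop sequence
guca_fits_gnra = matches("GNRA", "GUCA")  # C at the R position breaks the motif
```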

  15. A conserved disulfide motif in human tear lipocalins influences ligand binding.

    PubMed

    Glasgow, B J; Abduragimov, A R; Yusifov, T N; Gasymov, O K; Horwitz, J; Hubbell, W L; Faull, K F

    1998-02-24

    Structural and functional characteristics of the disulfide motif have been determined for tear lipocalins, members of a novel group of proteins that carry lipids. Amino acid sequences for two of the six isolated isoforms were assigned by a comparison of molecular mass measurements with masses calculated from the cDNA-predicted protein sequence and available N-terminal protein sequence data. A third isoform was tentatively sequence assigned using the same criteria. The most abundant isoform has a measured mass of 17 446.3 Da, consistent with residues 19-176 of the putative precursor (calculated mass 17 445.8 Da). Chemical derivatization of native and reduced/denatured protein confirmed the presence of a single intramolecular disulfide bond in the native protein. Reactivity of native, reduced, and denatured protein with 4-pyridine disulfide and dithiobis(2-nitrobenzoic acid) indicated that access to the free cysteine is markedly restricted by the intact disulfide bridge. Mass measurements of tryptic fragments identified C119 as the free cysteine and showed that the single intramolecular disulfide bond joined residues C79 and C171. Circular dichroism indicated that tear lipocalins have a predominant beta-pleated sheet structure (44%) that is essentially retained after reduction of the disulfide bond. Circular dichroism in the far-UV showed reduced molecular asymmetry and enhanced urea-induced unfolding with disulfide reduction indicative of relaxation of protein structure. Circular dichroism in the near-UV shows that the disulfide bond contributes to the asymmetry of aromatic sites. The effect of disulfide reduction on ligand binding was monitored using the intrinsic optical activity of bound retinol. The intact disulfide bond diminishes the affinity of tear lipocalins for retinol and restricts the displacement of native lipids by retinol. Disulfide reduction is accompanied by a dramatic alteration in ligand-induced conformational changes that involves aromatic

  16. Modeling of the Ebola virus delta peptide reveals a potential lytic sequence motif.

    PubMed

    Gallaher, William R; Garry, Robert F

    2015-01-01

    Filoviruses, such as Ebola and Marburg viruses, cause severe outbreaks of human infection, including the extensive epidemic of Ebola virus disease (EVD) in West Africa in 2014. In the course of examining mutations in the glycoprotein gene associated with 2014 Ebola virus (EBOV) sequences, a differential level of conservation was noted between the soluble form of glycoprotein (sGP) and the full length glycoprotein (GP), which are both encoded by the GP gene via RNA editing. In the region of the proteins encoded after the RNA editing site sGP was more conserved than the overlapping region of GP when compared to a distant outlier species, Tai Forest ebolavirus. Half of the amino acids comprising the "delta peptide", a 40 amino acid carboxy-terminal fragment of sGP, were identical between otherwise widely divergent species. A lysine-rich amphipathic peptide motif was noted at the carboxyl terminus of delta peptide with high structural relatedness to the cytolytic peptide of the non-structural protein 4 (NSP4) of rotavirus. EBOV delta peptide is a candidate viroporin, a cationic pore-forming peptide, and may contribute to EBOV pathogenesis. PMID:25609303

  17. Motif composition, conservation and condition-specificity of single and alternative transcription start sites in the Drosophila genome

    PubMed Central

    Rach, Elizabeth A; Yuan, Hsiang-Yu; Majoros, William H; Tomancak, Pavel; Ohler, Uwe

    2009-01-01

    Background Transcription initiation is a key component in the regulation of gene expression. mRNA 5' full-length sequencing techniques have enhanced our understanding of mammalian transcription start sites (TSSs), revealing different initiation patterns on a genomic scale. Results To identify TSSs in Drosophila melanogaster, we applied a hierarchical clustering strategy on available 5' expressed sequence tags (ESTs) and identified a high quality set of 5,665 TSSs for approximately 4,000 genes. We distinguished two initiation patterns: 'peaked' TSSs, and 'broad' TSS cluster groups. Peaked promoters were found to contain location-specific sequence elements; conversely, broad promoters were associated with non-location-specific elements. In alignments across other Drosophila genomes, conservation levels of sequence elements exceeded 90% within the melanogaster subgroup, but dropped considerably for distal species. Elements in broad promoters had lower levels of conservation than those in peaked promoters. When characterizing the distributions of ESTs, 64% of TSSs showed distinct associations to one out of eight different spatiotemporal conditions. Available whole-genome tiling array time series data revealed different temporal patterns of embryonic activity across the majority of genes with distinct alternative promoters. Many genes with maternally inherited transcripts were found to have alternative promoters utilized later in development. Core promoters of maternally inherited transcripts showed differences in motif composition compared to zygotically active promoters. Conclusions Our study provides a comprehensive map of Drosophila TSSs and the conditions under which they are utilized. Distinct differences in motif associations with initiation pattern and spatiotemporal utilization illustrate the complex regulatory code of transcription initiation. PMID:19589141

  18. Identification of an Electrostatic Ruler Motif for Sequence-Specific Binding of Collagenase to Collagen.

    PubMed

    Subramanian, Sundar Raman; Singam, Ettayapuram Ramaprasad Azhagiya; Berinski, Michael; Subramanian, Venkatesan; Wade, Rebecca C

    2016-08-25

    Sequence-specific cleavage of collagen by mammalian collagenase plays a pivotal role in cell function. Collagenases are matrix metalloproteinases that cleave the peptide bond at a specific position on fibrillar collagen. The collagenase Hemopexin-like (HPX) domain has been proposed to be responsible for substrate recognition, but the mechanism by which collagenases identify the cleavage site on fibrillar collagen is not clearly understood. In this study, Brownian dynamics simulations coupled with atomic-detail and coarse-grained molecular dynamics simulations were performed to dock matrix metalloproteinase-1 (MMP-1) on a collagen IIIα1 triple helical peptide. We find that the HPX domain recognizes the collagen triple helix at a conserved R-X11-R motif C-terminal to the cleavage site to which the HPX domain of collagenase is guided electrostatically. The binding of the HPX domain between the two arginine residues is energetically stabilized by hydrophobic contacts with collagen. From the simulations and analysis of the sequences and structural flexibility of collagen and collagenase, a mechanistic scheme by which MMP-1 can recognize and bind collagen for proteolysis is proposed. PMID:27245212
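
    The R-X11-R notation describes two arginines separated by exactly eleven residues of any type. A sequence-level search for such a spacing can be sketched as follows. This is only a pattern scan, not the simulation approach used in the study; the function name and the test peptides are hypothetical.

```python
import re

def find_spaced_pairs(sequence, residue="R", gap=11):
    """Return 0-based start positions where `residue` recurs after exactly `gap` residues.

    A lookahead is used so that overlapping occurrences are all reported.
    """
    anchor = re.escape(residue)
    pattern = re.compile(f"(?=({anchor}.{{{gap}}}{anchor}))")
    return [match.start() for match in pattern.finditer(sequence.upper())]

# Hypothetical peptides, built only to exercise the search.
single_site = "GGG" + "R" + "G" * 11 + "R" + "GG"
chained_sites = "R" + "A" * 11 + "R" + "A" * 11 + "R"

single_hits = find_spaced_pairs(single_site)
chained_hits = find_spaced_pairs(chained_sites)
```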

  19. REPdenovo: Inferring De Novo Repeat Motifs from Short Sequence Reads

    PubMed Central

    Chu, Chong; Nielsen, Rasmus; Wu, Yufeng

    2016-01-01

    Repeat elements are important components of eukaryotic genomes. One limitation in our understanding of repeat elements is that most analyses rely on reference genomes that are incomplete and often contain missing data in highly repetitive regions that are difficult to assemble. To overcome this problem we develop a new method, REPdenovo, which assembles repeat sequences directly from raw shotgun sequencing data. REPdenovo can construct various types of repeats that are highly repetitive and have low sequence divergence within copies. We show that REPdenovo is substantially better than existing methods both in terms of the number and the completeness of the repeat sequences that it recovers. The key advantage of REPdenovo is that it can reconstruct long repeats from sequence reads. We apply the method to human data and discover a number of potentially new repeat sequences that have been missed by previous repeat annotations. Many of these sequences are incorporated into various parasite genomes, possibly because the filtering process for host DNA involved in the sequencing of the parasite genomes failed to exclude the host-derived repeat sequences. REPdenovo is a powerful new computational tool for annotating genomes and for addressing questions regarding the evolution of repeat families. The software tool, REPdenovo, is available for download at https://github.com/Reedwarbler/REPdenovo. PMID:26977803

  20. Identification of disease-specific motifs in the antibody specificity repertoire via next-generation sequencing.

    PubMed

    Pantazes, Robert J; Reifert, Jack; Bozekowski, Joel; Ibsen, Kelly N; Murray, Joseph A; Daugherty, Patrick S

    2016-01-01

    Disease-specific antibodies can serve as highly effective biomarkers but have been identified for only a relatively small number of autoimmune diseases. A method was developed to identify disease-specific binding motifs through integration of bacterial display peptide library screening, next-generation sequencing (NGS) and computational analysis. Antibody specificity repertoires were determined by identifying bound peptide library members for each specimen using cell sorting and performing NGS. A computational algorithm, termed Identifying Motifs Using Next-generation sequencing Experiments (IMUNE), was developed and applied to discover disease- and healthy control-specific motifs. IMUNE performs comprehensive pattern searches, identifies patterns statistically enriched in the disease or control groups and clusters the patterns to generate motifs. Using celiac disease sera as a discovery set, IMUNE identified a consensus motif (QPEQPF[PS]E) with high diagnostic sensitivity and specificity in a validation sera set, in addition to novel motifs. Peptide display and sequencing (Display-Seq) coupled with IMUNE analysis may thus be useful to characterize antibody repertoires and identify disease-specific antibody epitopes and biomarkers. PMID:27481573
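
    The consensus motif QPEQPF[PS]E uses a bracketed position to mean "either P or S", which maps directly onto a regular expression. The following is a minimal sketch of scanning peptides for that pattern; it is not the IMUNE algorithm, and the peptide strings and the helper name find_motif are hypothetical, invented for illustration.

```python
import re

# Consensus motif quoted in the abstract; [PS] accepts proline or serine.
CELIAC_MOTIF = re.compile(r"QPEQPF[PS]E")

def find_motif(peptide):
    """Return (0-based start, matched text) for each non-overlapping hit."""
    return [(m.start(), m.group()) for m in CELIAC_MOTIF.finditer(peptide)]

# Hypothetical peptides, not taken from the study.
peptides = ["AAQPEQPFPEGG", "QPEQPFSEKK", "QPEQPFAEKK"]
hits = {p: find_motif(p) for p in peptides}

for peptide, found in hits.items():
    print(peptide, found)
```

    The third peptide carries an alanine at the variable position, so it is reported with no hits.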

  1. Identification of disease-specific motifs in the antibody specificity repertoire via next-generation sequencing

    PubMed Central

    Pantazes, Robert J.; Reifert, Jack; Bozekowski, Joel; Ibsen, Kelly N.; Murray, Joseph A.; Daugherty, Patrick S.

    2016-01-01

    Disease-specific antibodies can serve as highly effective biomarkers but have been identified for only a relatively small number of autoimmune diseases. A method was developed to identify disease-specific binding motifs through integration of bacterial display peptide library screening, next-generation sequencing (NGS) and computational analysis. Antibody specificity repertoires were determined by identifying bound peptide library members for each specimen using cell sorting and performing NGS. A computational algorithm, termed Identifying Motifs Using Next-generation sequencing Experiments (IMUNE), was developed and applied to discover disease- and healthy control-specific motifs. IMUNE performs comprehensive pattern searches, identifies patterns statistically enriched in the disease or control groups and clusters the patterns to generate motifs. Using celiac disease sera as a discovery set, IMUNE identified a consensus motif (QPEQPF[PS]E) with high diagnostic sensitivity and specificity in a validation sera set, in addition to novel motifs. Peptide display and sequencing (Display-Seq) coupled with IMUNE analysis may thus be useful to characterize antibody repertoires and identify disease-specific antibody epitopes and biomarkers. PMID:27481573

  2. Identification of Promoter Motifs Involved in the Network of Phytochrome A-Regulated Gene Expression by Combined Analysis of Genomic Sequence and Microarray Data

    PubMed Central

    Hudson, Matthew E.; Quail, Peter H.

    2003-01-01

    Several hundred Arabidopsis genes, transcriptionally regulated by phytochrome A (phyA), were previously identified using an oligonucleotide microarray. We have now identified, in silico, conserved sequence motifs in the promoters of these genes by comparing the promoter sequences to those of all the genes present on the microarray from which they were sampled. This was done using a Perl script (called Sift) that identifies over-represented motifs using an enumerative approach. The utility of Sift was verified by analysis of circadian-regulated promoters known to contain a biologically significant motif. Several elements were then identified in phyA-responsive promoters by their over-representation. Five previously undescribed motifs were detected in the promoters of phyA-induced genes. Four novel motifs were found in phyA-repressed promoters, plus a motif that strongly resembles the DE1 element. The G-box, CACGTG, was a prominent hit in both induced and repressed phyA-responsive promoters. Intriguingly, two distinct flanking consensus sequences were observed adjacent to the G-box core sequence: one predominating in phyA-induced promoters, the other in phyA-repressed promoters. Such different conserved flanking nucleotides around the core motif in these two sets of promoters may indicate that different members of the same family of DNA-binding proteins mediate phyA induction and repression. An increased abundance of G-box sequences was observed in the most rapidly phyA-responsive genes and in the promoters of phyA-regulated transcription factors, indicating that G-box-binding transcription factors are upstream components in a transcriptional cascade that mediates phyA-regulated development. PMID:14681527
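
    An enumerative search for over-represented motifs can be sketched as counting every k-mer in a foreground promoter set and comparing its frequency with a background set. The sketch below is not the Sift script: the scoring (a pseudocounted ratio of the fractions of sequences containing each k-mer), the function names, and the toy sequences are all assumptions made for illustration, and a real analysis would add a statistical test.

```python
from collections import Counter

def kmer_presence(seqs, k):
    """Count, for each k-mer, how many sequences contain it at least once."""
    counts = Counter()
    for s in seqs:
        counts.update({s[i:i + k] for i in range(len(s) - k + 1)})
    return counts

def enrichment(foreground, background, k=6, pseudo=1.0):
    """Ratio of pseudocounted presence fractions, foreground over background."""
    fg = kmer_presence(foreground, k)
    bg = kmer_presence(background, k)
    scores = {}
    for kmer, n_fg in fg.items():
        fg_frac = (n_fg + pseudo) / (len(foreground) + pseudo)
        bg_frac = (bg.get(kmer, 0) + pseudo) / (len(background) + pseudo)
        scores[kmer] = fg_frac / bg_frac
    return scores

# Toy promoters (hypothetical): each foreground sequence carries the G-box CACGTG.
foreground = ["TTCACGTGAA", "GGCACGTGTT", "ACACGTGCCA"]
background = ["TTTTTTTTTT", "GGGGAAAACC", "ACACACACAC"]

scores = enrichment(foreground, background)
top = max(scores, key=scores.get)
print(top, scores[top])
```

    Counting presence per sequence rather than total occurrences keeps a single repeat-rich promoter from dominating the score; that is a design choice of this sketch, not something stated in the abstract.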

  3. Conserved Hydration Sites in Pin1 Reveal a Distinctive Water Recognition Motif in Proteins.

    PubMed

    Barman, Arghya; Smitherman, Crystal; Souffrant, Michael; Gadda, Giovanni; Hamelberg, Donald

    2016-01-25

    Structurally conserved water molecules are important for biomolecular stability, flexibility, and function. X-ray crystallographic studies of Pin1 have resolved a number of water molecules around the enzyme, including two highly conserved water molecules within the protein. The functional role of these localized water molecules remains unknown and unexplored. Pin1 catalyzes cis/trans isomerizations of peptidyl prolyl bonds that are preceded by a phosphorylated serine or threonine residue. Pin1 is involved in many subcellular signaling processes and is a potential therapeutic target for the treatment of several life-threatening diseases. Here, we investigate the significance of these structurally conserved water molecules in the catalytic domain of Pin1 using molecular dynamics (MD) simulations, free energy calculations, analysis of X-ray crystal structures, and circular dichroism (CD) experiments. MD simulations and free energy calculations suggest the tighter-binding water molecule plays a crucial role in maintaining the integrity and stability of a critical hydrogen-bonding network in the active site. The second water molecule is exchangeable with bulk solvent and is found in a distinctive helix-turn-coil motif. Structural bioinformatics analysis of nonredundant X-ray crystallographic protein structures in the Protein Data Bank (PDB) suggests this motif is present in several other proteins and can act as a water site, akin to the calcium EF hand. CD experiments suggest the isolated motif is in a distorted PII conformation and requires the protein environment to fully form the α-helix-turn-coil motif. This study provides valuable insights into the role of hydration in the structural integrity of Pin1 that can be exploited in protein engineering and drug design. PMID:26651388

  4. Novel missense mutations in a conserved loop between ERCC6 (CSB) helicase motifs V and VI: Insights into Cockayne syndrome.

    PubMed

    Wilson, Brian T; Lochan, Anneline; Stark, Zornitza; Sutton, Ruth E

    2016-03-01

    Cockayne syndrome is caused by biallelic ERCC8 (CSA) or ERCC6 (CSB) mutations and is characterized by growth restriction, microcephaly, developmental delay, and premature pathological aging. Typically affected patients also have dermal photosensitivity. Although Cockayne syndrome is considered a DNA repair disorder, patients with UV-sensitive syndrome, with ERCC8 (CSA) or ERCC6 (CSB) mutations have indistinguishable DNA repair defects, but none of the extradermal features of Cockayne syndrome. We report novel missense mutations affecting a conserved loop in the ERCC6 (CSB) protein, associated with the Cockayne syndrome phenotype. Indeed, the amino acid sequence of this loop is more highly conserved than the adjacent helicase motifs V and VI, suggesting that this is a crucial structural component of the SWI/SNF family of proteins, to which ERCC6 (CSB) belongs. These comprise two RecA-like domains, separated by an interdomain linker, which interact through helicase motif VI. As the observed mutations are likely to act through destabilizing the tertiary protein structure, this prompted us to re-evaluate ERCC6 (CSB) mutation data in relation to the structure of SWI/SNF proteins. Our analysis suggests that antimorphic mutations cause Cockayne syndrome and that biallelic interdomain linker deletions produce more severe phenotypes. Based on our observations, we propose that further investigation of the pathogenic mechanisms underlying Cockayne syndrome should focus on the effect of antimorphic rather than null ERCC6 (CSB) mutations. PMID:26749132

  5. Using a color-coded ambigraphic nucleic acid notation to visualize conserved palindromic motifs within and across genomes

    PubMed Central

    2014-01-01

    Background Ambiscript is a graphically-designed nucleic acid notation that uses symbol symmetries to support sequence complementation, highlight biologically-relevant palindromes, and facilitate the analysis of consensus sequences. Although the original Ambiscript notation was designed to easily represent consensus sequences for multiple sequence alignments, the notation’s black-on-white ambiguity characters are unable to reflect the statistical distribution of nucleotides found at each position. We now propose a color-augmented ambigraphic notation to encode the frequency of positional polymorphisms in these consensus sequences. Results We have implemented this color-coding approach by creating an Adobe Flash® application (http://www.ambiscript.org) that shades and colors modified Ambiscript characters according to the prevalence of the encoded nucleotide at each position in the alignment. The resulting graphic helps viewers perceive biologically-relevant patterns in multiple sequence alignments by uniquely combining color, shading, and character symmetries to highlight palindromes and inverted repeats in conserved DNA motifs. Conclusion Juxtaposing an intuitive color scheme over the deliberate character symmetries of an ambigraphic nucleic acid notation yields a highly-functional nucleic acid notation that maximizes information content and successfully embodies key principles of graphic excellence put forth by the statistician and graphic design theorist, Edward Tufte. PMID:24447494

  6. Sequence Motifs in Transit Peptides Act as Independent Functional Units and Can Be Transferred to New Sequence Contexts.

    PubMed

    Lee, Dong Wook; Woo, Seungjin; Geem, Kyoung Rok; Hwang, Inhwan

    2015-09-01

    A large number of nuclear-encoded proteins are imported into chloroplasts after they are translated in the cytosol. Import is mediated by transit peptides (TPs) at the N termini of these proteins. TPs contain many small motifs, each of which is critical for a specific step in the process of chloroplast protein import; however, it remains unknown how these motifs are organized to give rise to TPs with diverse sequences. In this study, we generated various hybrid TPs by swapping domains between Rubisco small subunit (RbcS) and chlorophyll a/b-binding protein, which have highly divergent sequences, and examined the abilities of the resultant TPs to deliver proteins into chloroplasts. Subsequently, we compared the functionality of sequence motifs in the hybrid TPs with those of wild-type TPs. The sequence motifs in the hybrid TPs exhibited three different modes of functionality, depending on their domain composition, as follows: active in both wild-type and hybrid TPs, active in wild-type TPs but inactive in hybrid TPs, and inactive in wild-type TPs but active in hybrid TPs. Moreover, synthetic TPs, in which only three critical motifs from RbcS or chlorophyll a/b-binding protein TPs were incorporated into an unrelated sequence, were able to deliver clients to chloroplasts with a comparable efficiency to RbcS TP. Based on these results, we propose that diverse sequence motifs in TPs are independent functional units that interact with specific translocon components at various steps during protein import and can be transferred to new sequence contexts. PMID:26149569

  7. Improved detection of helix-turn-helix DNA-binding motifs in protein sequences.

    PubMed Central

    Dodd, I B; Egan, J B

    1990-01-01

    We present an update of our method for systematic detection and evaluation of potential helix-turn-helix DNA-binding motifs in protein sequences [Dodd, I. and Egan, J. B. (1987) J. Mol. Biol. 194, 557-564]. The new method is considerably more powerful, detecting approximately 50% more likely helix-turn-helix sequences without an increase in false predictions. This improvement is due almost entirely to the use of a much larger reference set of 91 presumed helix-turn-helix sequences. The scoring matrix derived from this reference set has been calibrated against a large protein sequence database so that the score obtained by a sequence can be used to give a practical estimation of the probability that the sequence is a helix-turn-helix motif. PMID:2402433
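
    A scoring matrix derived from a reference set of aligned motif sequences can be illustrated with a small position-specific log-odds matrix. This is a generic sketch, not the published Dodd and Egan matrix: the five-residue toy alignment, the uniform background, the pseudocount, and the function names are assumptions, and the calibration step that turns a score into a probability is not reproduced.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def build_matrix(aligned, pseudo=1.0):
    """Per-position log-odds scores against a uniform amino acid background."""
    background = 1.0 / len(AMINO_ACIDS)
    total = len(aligned) + pseudo * len(AMINO_ACIDS)
    matrix = []
    for pos in range(len(aligned[0])):
        column = Counter(seq[pos] for seq in aligned)
        matrix.append({
            aa: math.log(((column[aa] + pseudo) / total) / background)
            for aa in AMINO_ACIDS
        })
    return matrix

def best_window(matrix, protein):
    """Slide the matrix along the protein; return (best score, 0-based start)."""
    width = len(matrix)
    scored = [
        (sum(matrix[i][protein[start + i]] for i in range(width)), start)
        for start in range(len(protein) - width + 1)
    ]
    return max(scored)

# Hypothetical reference set; the real method used 91 presumed HTH sequences.
reference = ["QTELA", "QRELA", "QTDLA", "QTEIA"]
matrix = build_matrix(reference)
score, start = best_window(matrix, "MKKQTELAGG")
print(round(score, 2), start)
```

    The pseudocount keeps residues never seen at a position from receiving a score of minus infinity, so a single unusual residue lowers a window's score without disqualifying it.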

  8. Sequence conservation on the Y chromosome

    SciTech Connect

    Gibson, L.H.; Yang-Feng, L.; Lau, C.

    1994-09-01

    The Y chromosome is present in all mammals and is considered to be essential to sex determination. Despite intense genomic research, only a few genes have been identified and mapped to this chromosome in humans. Several of them, such as SRY and ZFY, have been demonstrated to be conserved and Y-located in other mammals. In order to address the issue of sequence conservation on the Y chromosome, we performed fluorescence in situ hybridization (FISH) with DNA from a human Y cosmid library as a probe to study the Y chromosomes from other mammalian species. Total DNA from 3,000-4,500 cosmid pools was labeled with biotinylated-dUTP and hybridized to metaphase chromosomes. For human and primate preparations, human cot1 DNA was included in the hybridization mixture to suppress the hybridization from repeat sequences. FISH signals were detected on the Y chromosomes of human, gorilla, orangutan and baboon (Old World monkey) and were absent on those of squirrel monkey (New World monkey), Indian muntjac, wood lemming, Chinese hamster, rat and mouse. Since sequence analysis suggested that specific genes, e.g. SRY and ZFY, are conserved between these two groups, the lack of detectable hybridization in the latter group implies either that conservation of the human Y sequences is limited to the Y chromosomes of the great apes and Old World monkeys, or that the size of the syntenic segment is too small to be detected under the resolution of FISH, or that homologous sequences have undergone considerable divergence. Further studies with reduced hybridization stringency are currently being conducted. Our results provide some clues as to Y-sequence conservation across species and demonstrate the limitations of FISH across species with total DNA sequences from a particular chromosome.

  9. Conserved noncoding sequences (CNSs) in higher plants.

    PubMed

    Freeling, Michael; Subramaniam, Shabarinath

    2009-04-01

    Plant conserved noncoding sequences (CNSs)--a specific category of phylogenetic footprint--have been shown experimentally to function. No plant CNS is conserved to the extent that ultraconserved noncoding sequences are conserved in vertebrates. Plant CNSs are enriched in known transcription factor or other cis-acting binding sites, and are usually clustered around genes. Genes that encode transcription factors and/or those that respond to stimuli are particularly CNS-rich. Only rarely could this function involve small RNA binding. Some transcribed CNSs encode short translation products as a form of negative control. Approximately 4% of Arabidopsis gene content is estimated to be both CNS-rich and occupies a relatively long stretch of chromosome: Bigfoot genes (long phylogenetic footprints). We discuss a 'DNA-templated protein assembly' idea that might help explain Bigfoot gene CNSs. PMID:19249238

  10. Viroids: From Genotype to Phenotype Just Relying on RNA Sequence and Structural Motifs

    PubMed Central

    Flores, Ricardo; Serra, Pedro; Minoia, Sofía; Di Serio, Francesco; Navarro, Beatriz

    2012-01-01

    As a consequence of two unique physical properties, small size and circularity, viroid RNAs do not code for proteins and thus depend on RNA sequence/structural motifs for interacting with host proteins that mediate their invasion, replication, spread, and circumvention of defensive barriers. Viroid genomes fold up on themselves adopting collapsed secondary structures wherein stretches of nucleotides stabilized by Watson–Crick pairs are flanked by apparently unstructured loops. However, compelling data show that they are instead stabilized by alternative non-canonical pairs and that specific loops in the rod-like secondary structure, characteristic of Potato spindle tuber viroid and most other members of the family Pospiviroidae, are critical for replication and systemic trafficking. In contrast, rather than folding into a rod-like secondary structure, most members of the family Avsunviroidae adopt multibranched conformations occasionally stabilized by kissing-loop interactions critical for viroid viability in vivo. Besides these most stable secondary structures, viroid RNAs alternatively adopt during replication transient metastable conformations containing elements of local higher-order structure, prominent among which are the hammerhead ribozymes catalyzing a key replicative step in the family Avsunviroidae, and certain conserved hairpins that also mediate replication steps in the family Pospiviroidae. Therefore, different RNA structures – either global or local – determine different functions, thus highlighting the need for in-depth structural studies on viroid RNAs. PMID:22719735

  11. Membrane localization of MinD is mediated by a C-terminal motif that is conserved across eubacteria, archaea, and chloroplasts.

    PubMed

    Szeto, Tim H; Rowland, Susan L; Rothfield, Lawrence I; King, Glenn F

    2002-11-26

    MinD is a widely conserved ATPase that has been demonstrated to play a pivotal role in selection of the division site in eubacteria and chloroplasts. It is a member of the large ParA superfamily of ATPases that are characterized by a deviant Walker-type ATP-binding motif. MinD localizes to the cytoplasmic face of the inner membrane in Escherichia coli, and its association with the inner membrane is a prerequisite for membrane recruitment of the septation inhibitor MinC. However, the mechanism by which MinD associates with the membrane has proved enigmatic; it seems to lack a transmembrane domain and the amino acid sequence is devoid of hydrophobic tracts that might predispose the protein to interaction with lipids. In this study, we show that the extreme C-terminal region of MinD contains a highly conserved 8- to 12-residue sequence motif that is essential for membrane localization of the protein. We provide evidence that this motif forms an amphipathic helix that most likely mediates a direct interaction between MinD and membrane phospholipids. A model is proposed whereby the membrane-targeting motif mediates the rapid cycles of membrane attachment-release-reattachment that are presumed to occur during pole-to-pole oscillation of MinD in E. coli. PMID:12424340

  12. Stanniocalcin 1 binds hemin through a partially conserved heme regulatory motif

    SciTech Connect

    Westberg, Johan A.; Jiang, Ji; Andersson, Leif C.

    2011-06-03

    Highlights: Stanniocalcin 1 (STC1) binds heme through novel heme binding motif. Central iron atom of heme and cysteine-114 of STC1 are essential for binding. STC1 binds Fe2+ and Fe3+ heme. STC1 peptide prevents oxidative decay of heme. Abstract: Hemin (iron protoporphyrin IX) is a necessary component of many proteins, functioning either as a cofactor or an intracellular messenger. Hemoproteins have diverse functions, such as transportation of gases, gas detection, chemical catalysis and electron transfer. Stanniocalcin 1 (STC1) is a protein involved in respiratory responses of the cell but whose mechanism of action is still undetermined. We examined the ability of STC1 to bind hemin in both its reduced and oxidized states and located Cys114 as the axial ligand of the central iron atom of hemin. The amino acid sequence differs from the established (Cys-Pro) heme regulatory motif (HRM) and therefore presents a novel heme binding motif (Cys-Ser). A STC1 peptide containing the heme binding sequence was able to inhibit both spontaneous and H2O2-induced decay of hemin. Binding of hemin does not affect the mitochondrial localization of STC1.

  13. A Conserved Cysteine Motif Is Critical for Rice Ceramide Kinase Activity and Function

    PubMed Central

    Liu, Zhe; Fang, Ce; Li, Jian; Su, Jian-Bin; Greenberg, Jean T.; Wang, Hong-Bin; Yao, Nan

    2011-01-01

    Background Ceramide kinase (CERK) is a key regulator of cell survival in dicotyledonous plants and animals. Much less is known about the roles of CERK and ceramides in mediating cellular processes in monocot plants. Here, we report the characterization of a ceramide kinase, OsCERK, from rice (Oryza sativa spp. Japonica cv. Nipponbare) and investigate the effects of ceramides on rice cell viability. Principal Findings OsCERK can complement the Arabidopsis CERK mutant acd5. Recombinant OsCERK has ceramide kinase activity with Michaelis-Menten kinetics and optimal activity at pH 7.0 and 40°C. Mg2+ activates OsCERK in a concentration-dependent manner. Importantly, a CXXXCXXC motif, conserved in all ceramide kinases and important for the activity of the human enzyme, is critical for OsCERK enzyme activity and in planta function. In a rice protoplast system, inhibition of CERK leads to cell death and the ratio of added ceramide and ceramide-1-phosphate, CERK's substrate and product, respectively, influences cell survival. Ceramide-induced rice cell death has apoptotic features and is an active process that requires both de novo protein synthesis and phosphorylation. Finally, mitochondria membrane potential loss previously associated with ceramide-induced cell death in Arabidopsis was also found in rice, but it occurred with different timing. Conclusions OsCERK is a bona fide ceramide kinase with a functionally and evolutionarily conserved Cys-rich motif that plays an important role in modulating cell fate in plants. The vital function of the conserved motif in both human and rice CERKs suggests that the biochemical mechanism of CERKs is similar in animals and plants. Furthermore, ceramides induce cell death with similar features in monocot and dicot plants. PMID:21483860
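
    In the CXXXCXXC notation, X stands for any residue, so the motif fixes three cysteines at a set spacing. A minimal sketch of locating such a motif follows; the sequences are hypothetical (they are not OsCERK or human CERK fragments), and the lookahead is used only so that overlapping matches are also reported.

```python
import re

# C, three arbitrary residues, C, two arbitrary residues, C.
# The zero-width lookahead allows overlapping motif instances to be found.
CXXXCXXC = re.compile(r"(?=(C.{3}C.{2}C))")

def motif_positions(seq):
    """Return (0-based start, matched text) for every CXXXCXXC instance."""
    return [(m.start(), m.group(1)) for m in CXXXCXXC.finditer(seq)]

# Hypothetical sequences: an intact motif and a Cys-to-Ala variant.
intact = "MACGHKCLTCSA"
variant = "MACGHKALTCSA"

print(motif_positions(intact))
print(motif_positions(variant))
```

    Replacing the middle cysteine with alanine removes the match, which mirrors in a purely textual way the kind of point mutation used to test such a motif.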

  14. Phylum-Level Conservation of Regulatory Information in Nematodes despite Extensive Non-coding Sequence Divergence

    PubMed Central

    Gordon, Kacy L.; Arthur, Robert K.; Ruvinsky, Ilya

    2015-01-01

    Gene regulatory information guides development and shapes the course of evolution. To test conservation of gene regulation within the phylum Nematoda, we compared the functions of putative cis-regulatory sequences of four sets of orthologs (unc-47, unc-25, mec-3 and elt-2) from distantly-related nematode species. These species, Caenorhabditis elegans, its congeneric C. briggsae, and three parasitic species Meloidogyne hapla, Brugia malayi, and Trichinella spiralis, represent four of the five major clades in the phylum Nematoda. Despite the great phylogenetic distances sampled and the extensive sequence divergence of nematode genomes, all but one of the regulatory elements we tested are able to drive at least a subset of the expected gene expression patterns. We show that functionally conserved cis-regulatory elements have no more extended sequence similarity to their C. elegans orthologs than would be expected by chance, but they do harbor motifs that are important for proper expression of the C. elegans genes. These motifs are too short to be distinguished from the background level of sequence similarity, and while identical in sequence they are not conserved in orientation or position. Functional tests reveal that some of these motifs contribute to proper expression. Our results suggest that conserved regulatory circuitry can persist despite considerable turnover within cis elements. PMID:26020930

  15. Evolutionarily Conserved Regulatory Motifs in the Promoter of the Arabidopsis Clock Gene LATE ELONGATED HYPOCOTYL

    PubMed Central

    Spensley, Mark; Kim, Jae-Yean; Picot, Emma; Reid, John; Ott, Sascha; Helliwell, Chris; Carré, Isabelle A.

    2009-01-01

    The transcriptional regulation of the LATE ELONGATED HYPOCOTYL (LHY) gene is key to the structure of the circadian oscillator, integrating information from multiple regulatory pathways. We identified a minimal region of the LHY promoter that was sufficient for rhythmic expression. Another upstream sequence was also required for appropriate waveform of transcription and for maximum amplitude of oscillations under both diurnal and free-running conditions. We showed that two classes of protein complexes interact with a G-box and with novel 5A motifs; mutation of these sites reduced the amplitude of oscillation and broadened the peak of expression. A genome-wide bioinformatic analysis showed that these sites were enriched in phase-specific clusters of rhythmically expressed genes. Comparative genomic analyses showed that these motifs were conserved in orthologous promoters from several species. A position-specific scoring matrix for the 5A sites suggested similarity to CArG boxes, which are recognized by MADS box transcription factors. In support of this, the FLOWERING LOCUS C (FLC) protein was shown to interact with the LHY promoter in planta. This suggests a mechanism by which FLC might affect circadian period. PMID:19789276

  16. Novel hexamerization motif is discovered in a conserved cytoplasmic protein from Salmonella typhimurium.

    SciTech Connect

    Petrova, T.; Cuff, M.; Wu, R.; Kim, Y.; Holzle, D.; Joachimiak, A.; Biosciences Division; Inst. of Mathematical Problems of Biology

    2007-01-01

    The structure of the cytoplasmic protein Stm3548, of unknown function, from a strain of Salmonella typhimurium was determined by X-ray crystallography at a resolution of 2.25 Å. The asymmetric unit contains a hexamer of structurally identical monomers. The monomer is a globular domain with a long beta-hairpin protrusion that distinguishes this structure. This beta-hairpin occupies a central position in the hexamer, and its residues participate in the majority of interactions between subunits of the hexamer. We suggest that the structure of Stm3548 presents a new hexamerization motif. Because the residues participating in interdomain interactions are highly conserved among close members of protein family DUF1355 and the buried solvent-accessible area for the hexamer is significant, the hexamer is most likely conserved as well. A light scattering experiment confirmed the presence of the hexamer in solution.

  17. Proteome-Wide Discovery of Evolutionary Conserved Sequences in Disordered Regions

    PubMed Central

    Nguyen Ba, Alex N.; Yeh, Brian J.; van Dyk, Dewald; Davidson, Alan R.; Andrews, Brenda J.; Weiss, Eric L.; Moses, Alan M.

    2016-01-01

    At least 30% of human proteins are thought to contain intrinsically disordered regions, which lack stable structural conformation. Despite lacking enzymatic functions and having few protein domains, disordered regions are functionally important for protein regulation and contain short linear motifs (short peptide sequences involved in protein-protein interactions), but in most disordered regions, the functional amino acid residues remain unknown. We searched for evolutionarily conserved sequences within disordered regions according to the hypothesis that conservation would indicate functional residues. Using a phylogenetic hidden Markov model (phylo-HMM), we made accurate, specific predictions of functional elements in disordered regions even when these elements are only two or three amino acids long. Among the conserved sequences that we identified were previously known and newly identified short linear motifs, and we experimentally verified key examples, including a motif that may mediate interaction between protein kinase Cbk1 and its substrates. We also observed that hub proteins, which interact with many partners in a protein interaction network, are highly enriched in these conserved sequences. Our analysis enabled the systematic identification of the functional residues in disordered regions and suggested that at least 5% of amino acids in disordered regions are important for function. PMID:22416277

  18. Structural alphabet motif discovery and a structural motif database.

    PubMed

    Ku, Shih-Yen; Hu, Yuh-Jyh

    2012-01-01

    This study proposes a general framework for structural motif discovery. The framework is based on a modular design in which the system components can be modified or replaced independently to increase its applicability to various studies. It is a two-stage approach that first converts protein 3D structures into structural alphabet sequences, and then applies a sequence motif-finding tool to these sequences to detect conserved motifs. We named the structural motif database we built the SA-Motifbase, which provides the structural information conserved at different hierarchical levels in SCOP. For each motif, SA-Motifbase presents its 3D view; alphabet letter preference; alphabet letter frequency distribution; and the significance. SA-Motifbase is available at http://bioinfo.cis.nctu.edu.tw/samotifbase/. PMID:22099701

  19. The conserved helicase motifs of the herpes simplex virus type 1 origin-binding protein UL9 are important for function.

    PubMed Central

    Martinez, R; Shao, L; Weller, S K

    1992-01-01

    The UL9 gene of herpes simplex virus encodes a protein that specifically recognizes sequences within the viral origins of replication and exhibits helicase and DNA-dependent ATPase activities. The specific DNA binding domain of the UL9 protein was localized to the carboxy-terminal one-third of the molecule (H. M. Weir, J. M. Calder, and N. D. Stow, Nucleic Acids Res. 17:1409-1425, 1989). The N-terminal two-thirds of the UL9 gene contains six sequence motifs found in all members of a superfamily of DNA and RNA helicases, suggesting that this region may be important for helicase activity of UL9. In this report, we examined the functional significance of these six motifs for the UL9 protein through the introduction of site-specific mutations resulting in single amino acid substitutions of the most highly conserved residues within each motif. An in vivo complementation test was used to study the effect of each mutation on the function of the UL9 protein in viral DNA replication. In this assay, a mutant UL9 protein expressed from a transfected plasmid is used to complement a replication-deficient null mutant in the UL9 gene for the amplification of herpes simplex virus origin-containing plasmids. Mutations in five of the six conserved motifs inactivated the function of the UL9 protein in viral DNA replication, providing direct evidence for the importance of these conserved motifs. Insertion mutants resulting in the introduction of two alanines at 100-residue intervals in regions outside the conserved motifs were also constructed. Three of the insertion mutations were tolerated, whereas the other five abolished UL9 function. These data indicate that other regions of the protein, in addition to the helicase motifs, are important for function in vivo. Several mutations result in instability of the mutant products, presumably because of conformational changes in the protein. Taken together, these results suggest that UL9 is very sensitive to mutations with respect to both

  20. Discovering active motifs in sets of related protein sequences and using them for classification.

    PubMed Central

    Wang, J T; Marr, T G; Shasha, D; Shapiro, B A; Chirn, G W

    1994-01-01

    We describe a method for discovering active motifs in a set of related protein sequences. The method is an automatic two-step process: (1) find candidate motifs in a small sample of the sequences; (2) test whether these motifs are approximately present in all the sequences. To reduce the running time, we develop two optimization heuristics based on statistical estimation and pattern matching techniques. Experimental results obtained by running these algorithms on generated data and functionally related proteins demonstrate the good performance of the presented method compared with the visual method of O'Farrell and Leopold. By combining the discovered motifs with an existing fingerprint technique, we develop a protein classifier. When we apply the classifier to the 698 groups of related proteins in the PROSITE catalog, it gives information that is complementary to the BLOCKS protein classifier of Henikoff and Henikoff. Thus, using our classifier in conjunction with theirs, one can obtain high confidence classifications (if BLOCKS and our classifier agree) or suggest a new hypothesis (if the two disagree). PMID:8052532
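
    The two-step process described above can be illustrated with a small sketch. The abstract does not specify how motifs are represented or what "approximately present" means, so this sketch assumes fixed-length k-mers and a Hamming-distance threshold; the function names, parameters, and defaults are hypothetical and are not taken from the authors' method.

```python
from collections import Counter


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def occurs_approximately(motif: str, seq: str, max_mismatches: int) -> bool:
    """True if some window of seq is within max_mismatches of motif."""
    k = len(motif)
    return any(
        hamming(motif, seq[i:i + k]) <= max_mismatches
        for i in range(len(seq) - k + 1)
    )


def discover_motifs(seqs, sample_size=2, k=4, max_mismatches=1, min_fraction=1.0):
    """Assumed simplification of a two-step candidate/verify motif search."""
    # Step 1: candidate k-mers are those found exactly in every sampled sequence.
    sample = seqs[:sample_size]
    counts = Counter()
    for seq in sample:
        counts.update({seq[i:i + k] for i in range(len(seq) - k + 1)})
    candidates = [kmer for kmer, n in counts.items() if n == len(sample)]

    # Step 2: keep candidates approximately present in enough of the full set.
    kept = []
    for motif in candidates:
        hits = sum(occurs_approximately(motif, s, max_mismatches) for s in seqs)
        if hits / len(seqs) >= min_fraction:
            kept.append(motif)
    return sorted(kept)
```

    Restricting step 1 to a small sample keeps the candidate set small, which is the reason the approximate (and more expensive) test in step 2 is only run on a handful of motifs.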

  1. Identification of potential regulatory motifs in odorant receptor genes by analysis of promoter sequences

    PubMed Central

    Michaloski, Jussara S.; Galante, Pedro A.F.

    2006-01-01

    Mouse odorant receptors (ORs) are encoded by >1000 genes dispersed throughout the genome. Each olfactory neuron expresses one single OR gene, while the rest of the genes remain silent. The mechanisms underlying OR gene expression are poorly understood. Here, we investigated if OR genes share common cis-regulatory sequences in their promoter regions. We carried out a comprehensive analysis in which the upstream regions of a large number of OR genes were compared. First, using RLM-RACE, we generated cDNAs containing the complete 5′-untranslated regions (5′-UTRs) for a total number of 198 mouse OR genes. Then, we aligned these cDNA sequences to the mouse genome so that the 5′ structure and transcription start sites (TSSs) of the OR genes could be precisely determined. Sequences upstream of the TSSs were retrieved and browsed for common elements. We found DNA sequence motifs that are overrepresented in the promoter regions of the OR genes. Most motifs resemble O/E-like sites and are preferentially localized within 200 bp upstream of the TSSs. Finally, we show that these motifs specifically interact with proteins extracted from nuclei prepared from the olfactory epithelium, but not from brain or liver. Our results show that the OR genes share common promoter elements. The present strategy should provide information on the role played by cis-regulatory sequences in OR gene regulation. PMID:16902085

  2. Functional roles of short sequence motifs in the endocytosis of membrane receptors

    PubMed Central

    Pandey, Kailash N.

    2009-01-01

    Internalization and trafficking of cell-surface membrane receptors and proteins into subcellular compartments is mediated by specific short-sequence signal motifs, which are usually located within the cytoplasmic domains of these receptor and protein molecules. The signals usually consist of short linear amino acid sequences, which are recognized by adaptor coat proteins along the endocytic and sorting pathways. The complex arrays of signals and recognition proteins ensure the dynamic movement, accurate trafficking, and designated distribution of transmembrane receptors and ligands into intracellular compartments, particularly to the endosomal-lysosomal system. This review summarizes the new information and concepts, integrating them with the current and established views of endocytosis, intracellular trafficking, and sorting of membrane receptors and proteins. Particular emphasis has been given to the functional roles of short-sequence signal motifs responsible for the itinerary and destination of membrane receptors and proteins moving into the subcellular compartments. The specific characteristics and functions of short-sequence motifs, including various tyrosine-based, dileucine-type, and other short-sequence signals in the trafficking and sorting of membrane receptors and membrane proteins are presented and discussed. PMID:19482617

  3. Mutations in the highly conserved GGQ motif of class 1 polypeptide release factors abolish ability of human eRF1 to trigger peptidyl-tRNA hydrolysis.

    PubMed Central

    Frolova, L Y; Tsivkovskii, R Y; Sivolobova, G F; Oparina, N Y; Serpinsky, O I; Blinov, V M; Tatkov, S I; Kisselev, L L

    1999-01-01

    Although the primary structures of class 1 polypeptide release factors (RF1 and RF2 in prokaryotes, eRF1 in eukaryotes) are known, the molecular basis by which they function in translational termination remains obscure. Because all class 1 RFs promote a stop-codon-dependent and ribosome-dependent hydrolysis of peptidyl-tRNAs, one may anticipate that this common function relies on a common structural motif(s). We have compared amino acid sequences of the available class 1 RFs and found a novel, common, unique, and strictly conserved GGQ motif that should be in a loop (coil) conformation as deduced by programs predicting protein secondary structure. Site-directed mutagenesis of the human eRF1 as a representative of class 1 RFs shows that substitution of both glycyl residues in this motif, G183 and G184, causes complete inactivation of the protein as a release factor toward all three stop codons, whereas two adjacent amino acid residues, G181 and R182, are functionally nonessential. Inactive human eRF1 mutants compete in release assays with wild-type eRF1 and strongly inhibit their release activity. Mutations of the glycyl residues in this motif do not affect another function, the ability of eRF1 together with the ribosome to induce GTPase activity of human eRF3, a class 2 RF. We assume that the novel highly conserved GGQ motif is implicated directly or indirectly in the activity of class 1 RFs in translation termination. PMID:10445876

  4. In planta analysis of a cis-regulatory cytokinin response motif in Arabidopsis and identification of a novel enhancer sequence.

    PubMed

    Ramireddy, Eswarayya; Brenner, Wolfram G; Pfeifer, Andreas; Heyl, Alexander; Schmülling, Thomas

    2013-07-01

    The phytohormone cytokinin plays a key role in regulating plant growth and development, and is involved in numerous physiological responses to environmental changes. The type-B response regulators, which regulate the transcription of cytokinin response genes, are a part of the cytokinin signaling system. Arabidopsis thaliana encodes 11 type-B response regulators (type-B ARRs), and some of them were shown to bind in vitro to the core cytokinin response motif (CRM) 5'-(A/G)GAT(T/C)-3' or, in the case of ARR1, to an extended motif (ECRM), 5'-AAGAT(T/C)TT-3'. Here we obtained in planta proof for the functionality of the latter motif. Promoter deletion analysis of the primary cytokinin response gene ARR6 showed that a combination of two extended motifs within the promoter is required to mediate the full transcriptional activation by ARR1 and other type-B ARRs. CRMs were found to be over-represented in the vicinity of ECRMs in the promoters of cytokinin-regulated genes, suggesting their functional relevance. Moreover, an evolutionarily conserved 27 bp long T-rich region between -220 and -193 bp was identified and shown to be required for the full activation by type-B ARRs and the response to cytokinin. This novel enhancer is not bound by the DNA-binding domain of ARR1, indicating that additional proteins might be involved in mediating the transcriptional cytokinin response. Furthermore, genome-wide expression profiling identified genes, among them ARR16, whose induction by cytokinin depends on both ARR1 and other specific type-B ARRs. This together with the ECRM/CRM sequence clustering indicates cooperative action of different type-B ARRs for the activation of particular target genes. PMID:23620480
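
    The two motifs quoted above translate directly into regular expressions: the core CRM 5'-(A/G)GAT(T/C)-3' becomes `[AG]GAT[TC]` and the extended ECRM 5'-AAGAT(T/C)TT-3' becomes `AAGAT[TC]TT`. The following is a minimal sketch of such a scan; the function name and the example promoter fragment are made up for illustration, and only the given strand is searched (the reverse complement is not).

```python
import re

# Lookahead groups let overlapping sites be reported as separate hits.
CRM = re.compile(r"(?=([AG]GAT[TC]))")     # core cytokinin response motif
ECRM = re.compile(r"(?=(AAGAT[TC]TT))")    # extended motif

def find_sites(pattern, seq):
    """Return (0-based start, matched site) pairs on the given strand."""
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq.upper())]

# Hypothetical promoter fragment, not a real ARR6 sequence.
promoter = "ccAAGATTTTgcGGATCaa"
crm_sites = find_sites(CRM, promoter)
ecrm_sites = find_sites(ECRM, promoter)
```

    Note that every ECRM contains a CRM one base downstream of its start, so counts of CRMs near ECRMs should exclude that embedded site.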

  5. The sea anemone actinoporin (Arg-Gly-Asp) conserved motif is involved in maintaining the competent oligomerization state of these pore-forming toxins.

    PubMed

    García-Linares, Sara; Richmond, Ryan; García-Mayoral, María F; Bustamante, Noemí; Bruix, Marta; Gavilanes, José G; Martínez-Del-Pozo, Alvaro

    2014-03-01

    Sea anemone actinoporins constitute an optimum model to investigate mechanisms of membrane pore formation. All actinoporins of known structure show a general fold of a β-sandwich motif flanked by two α-helices. The crucial structure for pore formation seems to be the helix located at the N-terminal end. The role of several other protein regions in membrane attachment is also well established. However, not much is known about the protein residues involved in the oligomerization required for pore formation. Previous detailed analysis of the soluble three-dimensional structures of different wild-type and mutant actinoporins from Stichodactyla helianthus suggested residues which could be involved in this oligomerization. One of these stretches contains a conserved sequence compatible with an integrin-binding RGD motif. The results presented now deal with mutants affecting this motif in the well-characterized actinoporin sticholysin II. Small modifications along this three-residue sequence had profound effects on its solubility. Just a single methyl group yielded an RAD mutant version with a highly diminished haemolytic activity and altered oligomerization behaviour. The results obtained are discussed in terms of a key role for the RGD motif in maintaining the actinoporins' pore-competent state of protein oligomerization. PMID:24418371

  6. Using machine learning to predict gene expression and discover sequence motifs

    NASA Astrophysics Data System (ADS)

    Li, Xuejing

    Recently, large amounts of experimental data for complex biological systems have become available. We use tools and algorithms from machine learning to build data-driven predictive models. We first present a novel algorithm to discover gene sequence motifs associated with temporal expression patterns of genes. Our algorithm, which is based on partial least squares (PLS) regression, is able to directly model the flow of information, from gene sequence to gene expression, to learn cis-regulatory motifs and characterize associated gene expression patterns. Our algorithm outperforms traditional computational methods, e.g. clustering, in motif discovery. We then present a study of extending a machine learning model for transcriptional regulation predictive of genetic regulatory response to Caenorhabditis elegans. We show meaningful results both in terms of prediction accuracy on the test experiments and biological information extracted from the regulatory program. The model discovers DNA binding sites ab initio. We also present a case study where we detect a signal of lineage-specific regulation. Finally, we present a comparative study on learning predictive models for motif discovery, based on different boosting algorithms: Adaptive Boosting (AdaBoost), Linear Programming Boosting (LPBoost) and Totally Corrective Boosting (TotalBoost). We evaluate and compare the performance of the three boosting algorithms via both statistical and biological validation, for hypoxia response in Saccharomyces cerevisiae.

  7. Comparative Analysis of Evolutionarily Conserved Motifs of Epidermal Growth Factor Receptor 2 (HER2) Predicts Novel Potential Therapeutic Epitopes

    PubMed Central

    Deng, Xiaohong; Zheng, Xuxu; Yang, Huanming; Moreira, José Manuel Afonso; Brünner, Nils; Christensen, Henrik

    2014-01-01

    Overexpression of human epidermal growth factor receptor 2 (HER2) is associated with tumor aggressiveness and poor prognosis in breast cancer. With the availability of therapeutic antibodies against HER2, great strides have been made in the clinical management of HER2 overexpressing breast cancer. However, de novo and acquired resistance to these antibodies presents a serious limitation to successful HER2 targeting treatment. The identification of novel epitopes of HER2 that can be used for functional/region-specific blockade could represent a central step in the development of new clinically relevant anti-HER2 antibodies. In the present study, we present a novel computational approach as an auxiliary tool for identification of novel HER2 epitopes. We hypothesized that the structurally and linearly evolutionarily conserved motifs of the extracellular domain of HER2 (ECD HER2) contain potential druggable epitopes/targets. We employed the PROSITE Scan to detect structurally conserved motifs and PRINTS to search for linearly conserved motifs of ECD HER2. We found that the epitopes recognized by trastuzumab and pertuzumab are located in the predicted conserved motifs of ECD HER2, supporting our initial hypothesis. Considering that structurally and linearly conserved motifs can provide functional specific configurations, we propose that by comparing the two types of conserved motifs, additional druggable epitopes/targets in the ECD HER2 protein can be identified, which can be further modified for potential therapeutic application. Thus, this novel computational process for predicting or searching for potential epitopes or key target sites may contribute to epitope-based vaccine and function-selected drug design, especially when x-ray crystal structure protein data is not available. PMID:25192037

  8. Sequence-specific intramembrane proteolysis: identification of a recognition motif in rhomboid substrates.

    PubMed

    Strisovsky, Kvido; Sharpe, Hayley J; Freeman, Matthew

    2009-12-25

    Members of the widespread rhomboid family of intramembrane proteases cleave transmembrane domain (TMD) proteins to regulate processes as diverse as EGF receptor signaling, mitochondrial dynamics, and invasion by apicomplexan parasites. However, lack of information about their substrates means that the biological role of most rhomboids remains obscure. Knowledge of how rhomboids recognize their substrates would illuminate their mechanism and might also allow substrate prediction. Previous work has suggested that rhomboid substrates are specified by helical instability in their TMD. Here we demonstrate that rhomboids instead primarily recognize a specific sequence surrounding the cleavage site. This recognition motif is necessary for substrate cleavage, it determines the cleavage site, and it is more strictly required than TM helix-destabilizing residues. Our work demonstrates that intramembrane proteases can be sequence specific and that genome-wide substrate prediction based on their recognition motifs is feasible. PMID:20064469

  9. A Convex Atomic-Norm Approach to Multiple Sequence Alignment and Motif Discovery

    PubMed Central

    Yen, Ian E. H.; Lin, Xin; Zhang, Jiong; Ravikumar, Pradeep; Dhillon, Inderjit S.

    2016-01-01

    Multiple Sequence Alignment and Motif Discovery, both known to be NP-hard problems, are two fundamental tasks in bioinformatics. Existing approaches to these two problems are based on either local search methods, such as Expectation Maximization (EM) and Gibbs Sampling, or greedy heuristic methods. In this work, we develop a convex relaxation approach to both problems based on the recent concept of atomic norm and develop a new algorithm, termed Greedy Direction Method of Multiplier, for solving the convex relaxation with two convex atomic constraints. Experiments show that our convex relaxation approach produces solutions of higher quality than the standard tools widely used in the bioinformatics community on the Multiple Sequence Alignment and Motif Discovery problems. PMID:27559428

  10. Recognition of Conserved Amino Acid Motifs of Common Viruses and Its Role in Autoimmunity

    PubMed Central

    2005-01-01

    The triggers of autoimmune diseases such as multiple sclerosis (MS) remain elusive. Epidemiological studies suggest that common pathogens can exacerbate and also induce MS, but it has been difficult to pinpoint individual organisms. Here we demonstrate that in vivo clonally expanded CD4+ T cells isolated from the cerebrospinal fluid of a MS patient during disease exacerbation respond to a poly-arginine motif of the nonpathogenic and ubiquitous Torque Teno virus. These T cell clones also can be stimulated by arginine-enriched protein domains from other common viruses and recognize multiple autoantigens. Our data suggest that repeated infections with common pathogenic and even nonpathogenic viruses could expand T cells specific for conserved protein domains that are able to cross-react with tissue-derived and ubiquitous autoantigens. PMID:16362076

  11. Integrating bioinformatic resources to predict transcription factors interacting with cis-sequences conserved in co-regulated genes

    PubMed Central

    2014-01-01

    Background Using motif detection programs it is fairly straightforward to identify conserved cis-sequences in promoters of co-regulated genes. In contrast, the identification of the transcription factors (TFs) interacting with these cis-sequences is much more elaborate. To facilitate this, we explore the possibility of using several bioinformatic and experimental approaches for TF identification. This starts with the selection of co-regulated gene sets and leads first to the prediction and then to the experimental validation of TFs interacting with cis-sequences conserved in the promoters of these co-regulated genes. Results Using the PathoPlant database, 32 up-regulated gene groups were identified with microarray data for drought-responsive gene expression from Arabidopsis thaliana. Application of the binding site estimation suite of tools (BEST) discovered 179 conserved sequence motifs within the corresponding promoters. Using the STAMP web-server, 49 sequence motifs were classified into 7 motif families for which similarities with known cis-regulatory sequences were identified. All motifs were subjected to a footprintDB analysis to predict interacting DNA binding domains from plant TF families. Predictions were confirmed by using a yeast-one-hybrid approach to select interacting TFs belonging to the predicted TF families. TF-DNA interactions were further experimentally validated in yeast and with a Physcomitrella patens transient expression system, leading to the discovery of several novel TF-DNA interactions. Conclusions The present work demonstrates the successful integration of several bioinformatic resources with experimental approaches to predict and validate TFs interacting with conserved sequence motifs in co-regulated genes. PMID:24773781

  12. Evolutionary divergence and limits of conserved non-coding sequence detection in plant genomes

    PubMed Central

    Reineke, Anna R.; Bornberg-Bauer, Erich; Gu, Jenny

    2011-01-01

    The discovery of regulatory motifs embedded in upstream regions of plants is a particularly challenging bioinformatics task. Previous studies have shown that motifs in plants are short compared with those found in vertebrates. Furthermore, plant genomes have undergone several diversification mechanisms such as genome duplication events which impact the evolution of regulatory motifs. In this article, a systematic phylogenomic comparison of upstream regions is conducted to further identify features of the plant regulatory genomes, the component of genomes regulating gene expression, to enable future de novo discoveries. The findings highlight differences in upstream region properties between major plant groups and the effects of divergence times and duplication events. First, clear differences in upstream region evolution can be detected between monocots and dicots, thus suggesting that a separation of these groups should be made when searching for novel regulatory motifs, particularly since universal motifs such as the TATA box are rare. Second, investigating the decay rate of significantly aligned regions suggests that a divergence time of ∼100 mya sets a limit for reliable conserved non-coding sequence (CNS) detection. Insights presented here will set a framework to help identify embedded motifs of functional relevance by understanding the limits of bioinformatics detection for CNSs. PMID:21470961

  13. Interaction prediction using conserved network motifs in protein-protein interaction networks

    NASA Astrophysics Data System (ADS)

    Albert, Reka

    2005-03-01

    High-throughput protein interaction detection methods are strongly affected by false positive and false negative results. Focused experiments are needed to complement the large-scale methods by validating previously detected interactions but it is often difficult to decide which proteins to probe as interaction partners. Developing reliable computational methods assisting this decision process is a pressing need in bioinformatics. This talk will describe the recent developments in analyzing and understanding protein interaction networks, then present a method that uses the conserved properties of the protein network to identify and validate interaction candidates. We apply a number of machine learning algorithms to the protein connectivity information and achieve a surprisingly good overall performance in predicting interacting proteins. Using a "leave-one-out" approach we find average success rates between 20-50% for predicting the correct interaction partner of a protein. We demonstrate that the success of these methods is based on the presence of conserved interaction motifs within the network. A reference implementation and a table with candidate interacting partners for each yeast protein are available at http://www.protsuggest.org

  14. Conserved Intramolecular Interactions Maintain Myosin Interacting-Heads Motifs Explaining Tarantula Muscle Super-Relaxed State Structural Basis.

    PubMed

    Alamo, Lorenzo; Qi, Dan; Wriggers, Willy; Pinto, Antonio; Zhu, Jingui; Bilbao, Aivett; Gillilan, Richard E; Hu, Songnian; Padrón, Raúl

    2016-03-27

    Tarantula striated muscle is an outstanding system for understanding the molecular organization of myosin filaments. Three-dimensional reconstruction based on cryo-electron microscopy images and single-particle image processing revealed that, in a relaxed state, myosin molecules undergo intramolecular head-head interactions, explaining why head activity switches off. The filament model obtained by rigidly docking a chicken smooth muscle myosin structure to the reconstruction was improved by flexibly fitting an atomic model built by mixing structures from different species to a tilt-corrected 2-nm three-dimensional map of frozen-hydrated tarantula thick filament. We used heavy and light chain sequences from tarantula myosin to build a single-species homology model of two heavy meromyosin interacting-heads motifs (IHMs). The flexibly fitted model includes previously missing loops and shows five intramolecular and five intermolecular interactions that keep the IHM in a compact off structure, forming four helical tracks of IHMs around the backbone. The residues involved in these interactions are oppositely charged, and their sequence conservation suggests that IHM is present across animal species. The new model, PDB 3JBH, explains the structural origin of the ATP turnover rates detected in relaxed tarantula muscle by ascribing the very slow rate to docked unphosphorylated heads, the slow rate to phosphorylated docked heads, and the fast rate to phosphorylated undocked heads. The conservation of intramolecular interactions across animal species and the presence of IHM in bilaterians suggest that a super-relaxed state should be maintained, as it plays a role in saving ATP in skeletal, cardiac, and smooth muscles. PMID:26851071

  15. CyanoLyase: a database of phycobilin lyase sequences, motifs and functions

    PubMed Central

    Bretaudeau, Anthony; Coste, François; Humily, Florian; Garczarek, Laurence; Le Corguillé, Gildas; Six, Christophe; Ratin, Morgane; Collin, Olivier; Schluchter, Wendy M.; Partensky, Frédéric

    2013-01-01

    CyanoLyase (http://cyanolyase.genouest.org/) is a manually curated sequence and motif database of phycobilin lyases and related proteins. These enzymes catalyze the covalent ligation of chromophores (phycobilins) to specific binding sites of phycobiliproteins (PBPs). The latter constitute the building bricks of phycobilisomes, the major light-harvesting systems of cyanobacteria and red algae. Phycobilin lyases sequences are poorly annotated in public databases. Sequences included in CyanoLyase were retrieved from all available genomes of these organisms and a few others by similarity searches using biochemically characterized enzyme sequences and then classified into 3 clans and 32 families. Amino acid motifs were computed for each family using Protomata learner. CyanoLyase also includes BLAST and a novel pattern matching tool (Protomatch) that allow users to rapidly retrieve and annotate lyases from any new genome. In addition, it provides phylogenetic analyses of all phycobilin lyases families, describes their function, their presence/absence in all genomes of the database (phyletic profiles) and predicts the chromophorylation of PBPs in each strain. The site also includes a thorough bibliography about phycobilin lyases and genomes included in the database. This resource should be useful to scientists and companies interested in natural or artificial PBPs, which have a number of biotechnological applications, notably as fluorescent markers. PMID:23175607

  16. Conserved function of the lysine-based KXD/E motif in Golgi retention for endomembrane proteins among different organisms.

    PubMed

    Woo, Cheuk Hang; Gao, Caiji; Yu, Ping; Tu, Linna; Meng, Zhaoyue; Banfield, David K; Yao, Xiaoqiang; Jiang, Liwen

    2015-11-15

    We recently identified a new COPI-interacting KXD/E motif in the C-terminal cytosolic tail (CT) of Arabidopsis endomembrane protein 12 (AtEMP12) as being a crucial Golgi retention mechanism for AtEMP12. This KXD/E motif is conserved in CTs of all EMPs found in plants, yeast, and humans and is also present in hundreds of other membrane proteins. Here, by cloning selective EMP isoforms from plants, yeast, and mammals, we study the localizations of EMPs in different expression systems, since there are contradictory reports on the localizations of EMPs. We show that the N-terminal and C-terminal GFP-tagged EMP fusions are localized to Golgi and post-Golgi compartments, respectively, in plant, yeast, and mammalian cells. In vitro pull-down assay further proves the interaction of the KXD/E motif with COPI coatomer in yeast. COPI loss of function in yeast and plants causes mislocalization of EMPs or KXD/E motif-containing proteins to vacuole. Ultrastructural studies further show that RNA interference (RNAi) knockdown of coatomer expression in transgenic Arabidopsis plants causes severe morphological changes in the Golgi. Taken together, our results demonstrate that N-terminal GFP fusions reflect the real localization of EMPs, and KXD/E is a conserved motif in COPI interaction and Golgi retention in eukaryotes. PMID:26378254

  17. A dominant negative mutation in the conserved RNA helicase motif 'SAT' causes splicing factor PRP2 to stall in spliceosomes.

    PubMed Central

    Plumpton, M; McGarvey, M; Beggs, J D

    1994-01-01

    To characterize sequences in the RNA helicase-like PRP2 protein of Saccharomyces cerevisiae that are essential for its function in pre-mRNA splicing, a pool of random PRP2 mutants was generated. A dominant negative allele was isolated which, when overexpressed in a wild-type yeast strain, inhibited cell growth by causing a defect in pre-mRNA splicing. This defect was partially alleviated by simultaneous co-overexpression of wild-type PRP2. The dominant negative PRP2 protein inhibited splicing in vitro and caused the accumulation of stalled splicing complexes. Immunoprecipitation with anti-PRP2 antibodies confirmed that dominant negative PRP2 protein competed with its wild-type counterpart for interaction with spliceosomes, with which the mutant protein remained associated. The PRP2-dn1 mutation led to a single amino acid change within the conserved SAT motif that in the prototype helicase eIF-4A is required for RNA unwinding. Purified dominant negative PRP2 protein had approximately 40% of the wild-type level of RNA-stimulated ATPase activity. As ATPase activity was reduced only slightly, but splicing activity was abolished, we propose that the dominant negative phenotype is due primarily to a defect in the putative RNA helicase activity of PRP2 protein. PMID:8112301

  18. CDR3β sequence motifs regulate autoreactivity of human invariant NKT cell receptors.

    PubMed

    Chamoto, Kenji; Guo, Tingxi; Imataki, Osamu; Tanaka, Makito; Nakatsugawa, Munehide; Ochi, Toshiki; Yamashita, Yuki; Saito, Akiko M; Saito, Toshiki I; Butler, Marcus O; Hirano, Naoto

    2016-04-01

    Invariant natural killer T (iNKT) cells are a subset of T lymphocytes that recognize lipid ligands presented by monomorphic CD1d. The human iNKT T cell receptor (TCR) is largely composed of an invariant Vα24 (Vα24i) TCRα chain and a semi-variant Vβ11 TCRβ chain, where complementarity-determining region (CDR)3β is the sole variable region. One of the characteristic features of iNKT cells is that they retain autoreactivity even after thymic selection. However, the molecular features of human iNKT TCR CDR3β sequences that regulate autoreactivity remain unknown. Since the number of iNKT cells with detectable autoreactivity in peripheral blood is limited, we introduced the Vα24i gene into peripheral T cells and generated a de novo human iNKT TCR repertoire. By stimulating the transfected T cells with artificial antigen presenting cells (aAPCs) presenting self-ligands, we enriched strongly autoreactive iNKT TCRs and isolated a large panel of human iNKT TCRs with a broad range of autoreactivity. From this panel of unique iNKT TCRs, we deciphered three CDR3β sequence motifs frequently encoded by strongly autoreactive iNKT TCRs: a VD region with 2 or more acidic amino acids, usage of the Jβ2-5 allele, and a CDR3β region of 13 amino acids in length. iNKT TCRs encoding 2 or 3 sequence motifs also exhibit higher autoreactivity than those encoding 0 or 1 motifs. These data facilitate our understanding of the molecular basis for human iNKT cell autoreactivity involved in immune responses associated with human disease. PMID:26748722
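
    The three sequence features listed above lend themselves to a simple count per receptor. The sketch below is an illustration only, not the authors' analysis: it assumes the VD region has already been delimited by an upstream annotation step (the abstract does not say how), assumes the Jβ2-5 segment is labeled with the gene name "TRBJ2-5", and uses made-up sequences. The class and function names are hypothetical.

```python
from dataclasses import dataclass

ACIDIC = set("DE")  # aspartate and glutamate


@dataclass
class Cdr3Beta:
    cdr3: str       # full CDR3β amino acid sequence
    vd_region: str  # VD-encoded stretch, assumed to be annotated upstream
    j_gene: str     # J segment label, assumed naming (e.g. "TRBJ2-5")


def motif_count(tcr: Cdr3Beta) -> int:
    """Count how many of the three reported features a receptor carries (0-3)."""
    has_acidic_vd = sum(aa in ACIDIC for aa in tcr.vd_region) >= 2
    uses_jb2_5 = tcr.j_gene == "TRBJ2-5"
    is_13_long = len(tcr.cdr3) == 13
    return int(has_acidic_vd) + int(uses_jb2_5) + int(is_13_long)


# Made-up receptors for illustration; not sequences from the study.
all_three = Cdr3Beta(cdr3="CASSEDEGGQTQY", vd_region="EDE", j_gene="TRBJ2-5")
none = Cdr3Beta(cdr3="CASSGQETQYF", vd_region="G", j_gene="TRBJ2-1")
```

    Under the abstract's finding, receptors with a count of 2 or 3 would be expected to show higher autoreactivity than those with 0 or 1.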

  19. Identification of sequence motifs involved in Dengue virus-host interactions.

    PubMed

    Asnet Mary, J; Paramasivan, R; Shenbagarathai, R

    2016-03-01

    Dengue fever is a rapidly spreading mosquito-borne virus infection, which remains a serious global public health problem. As there is no specific treatment or commercial vaccine available for effective control of the disease, attempts to develop novel control strategies are underway. Viruses utilize the surface receptor proteins of the host to enter cells. Although various proteins have been reported as receptors of Dengue virus (DENV) using the Virus Overlay Protein Binding Assay, the precise interaction between DENV and its host has not been explored. Understanding the structural features of domain III envelope glycoprotein would help in developing efficient antiviral inhibitors. Therefore, an attempt was made to identify the sequence motifs present in domain III envelope glycoprotein of Dengue virus. Computational analysis revealed that the NGR motif is present in the domain III envelope glycoprotein of DENV-1 and DENV-3. Similarly, DENV-1, DENV-2 and DENV-4 were found to contain the Yxxphi motif, which is a tyrosine-based sorting signal responsible for the interaction with a mu subunit of adaptor protein complex. High-throughput virtual screening resulted in five compounds as lead molecules based on glide score, which ranged from -4.664 to -6.52 kcal/mol. This computational prediction provides an additional tool for understanding the virus-host interactions and helps to identify potential targets in the host. Further experimental evidence is warranted to confirm the virus-host interactions and also the inhibitory activity of the reported lead compounds. PMID:25905427
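
    A scan for the two motifs named above can be sketched with regular expressions. The abstract does not describe the tool used, so this is an illustrative stand-in: "phi" is assumed here to mean one of the bulky hydrophobic residues L, I, M, V or F, the function name is hypothetical, and the example peptide is made up rather than a real DENV envelope sequence.

```python
import re

# Lookahead groups report overlapping occurrences as separate hits.
MOTIFS = {
    "NGR": re.compile(r"(?=(NGR))"),
    # Y, any two residues, then an assumed hydrophobic set for "phi".
    "Yxxphi": re.compile(r"(?=(Y..[LIMVF]))"),
}


def scan_motifs(seq: str) -> dict:
    """Map each motif name to its (0-based start, matched residues) hits."""
    seq = seq.upper()
    return {
        name: [(m.start(), m.group(1)) for m in pattern.finditer(seq)]
        for name, pattern in MOTIFS.items()
    }


# Made-up peptide for illustration only.
hits = scan_motifs("MKNGRAYTPLQYGAV")
```

    Short patterns like these match by chance fairly often, so hits from such a scan are candidates for follow-up rather than evidence of function.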

  20. A small conserved motif supports polarity augmentation of Shigella flexneri IcsA.

    PubMed

    Doyle, Matthew Thomas; Grabowicz, Marcin; Morona, Renato

    2015-11-01

    The rod-shaped enteric intracellular pathogen Shigella flexneri and other Shigella species are the causative agents of bacillary dysentery. S. flexneri are able to spread within the epithelial lining of the gut, resulting in lesion formation, cramps and bloody stools. The outer membrane protein IcsA is essential for this spreading process. IcsA is the initiator of an actin-based form of motility whereby it allows the formation of a filamentous actin 'tail' at the bacterial pole. Importantly, IcsA is specifically positioned at the bacterial pole such that this process occurs asymmetrically. The mechanism of IcsA polarity is not completely understood, but it appears to be a multifactorial process involving factors intrinsic to IcsA and other regulating factors. In this study, we further investigated IcsA polarization by its intramolecular N-terminal and central polar-targeting (PT) regions (nPT and cPT regions, respectively). The results obtained support a role in polar localization for the cPT region and contend the role of the nPT region. We identified single IcsA residues that have measurable impacts on IcsA polarity augmentation, resulting in decreased S. flexneri spreading efficiency. Intriguingly, regions and residues involved in PT clustered around a highly conserved motif which may provide a functional scaffold for polarity-augmenting residues. How these results fit with the current model of IcsA polarity determination is discussed. PMID:26315462

  1. A Conserved Ectodomain-Transmembrane Domain Linker Motif Tunes the Allosteric Regulation of Cell Surface Receptors.

    PubMed

    Schmidt, Thomas; Ye, Feng; Situ, Alan J; An, Woojin; Ginsberg, Mark H; Ulmer, Tobias S

    2016-08-19

    In many families of cell surface receptors, a single transmembrane (TM) α-helix separates ecto- and cytosolic domains. A defined coupling of ecto- and TM domains must be essential to allosteric receptor regulation but remains little understood. Here, we characterize the linker structure, dynamics, and resulting ecto-TM domain coupling of integrin αIIb in model constructs and relate it to other integrin α subunits by mutagenesis. Cellular integrin activation assays subsequently validate the findings in intact receptors. Our results indicate a flexible yet carefully tuned ecto-TM coupling that modulates the signaling threshold of integrin receptors. Interestingly, a proline at the N-terminal TM helix border, termed NBP, is critical to linker flexibility in integrins. NBP is further predicted in 21% of human single-pass TM proteins and validated in cytokine receptors by the TM domain structure of the cytokine receptor common subunit β and its P441A-substituted variant. Thus, NBP is a conserved uncoupling motif of the ecto-TM domain transition and the degree of ecto-TM domain coupling represents an important parameter in the allosteric regulation of diverse cell surface receptors. PMID:27365391

  2. Multiple cellular proteins interact with LEDGF/p75 through a conserved unstructured consensus motif.

    PubMed

    Tesina, Petr; Čermáková, Kateřina; Hořejší, Magdalena; Procházková, Kateřina; Fábry, Milan; Sharma, Subhalakshmi; Christ, Frauke; Demeulemeester, Jonas; Debyser, Zeger; De Rijck, Jan; Veverka, Václav; Řezáčová, Pavlína

    2015-01-01

    Lens epithelium-derived growth factor (LEDGF/p75) is an epigenetic reader and attractive therapeutic target involved in HIV integration and the development of mixed lineage leukaemia (MLL1) fusion-driven leukaemia. Besides HIV integrase and the MLL1-menin complex, LEDGF/p75 interacts with various cellular proteins via its integrase binding domain (IBD). Here we present structural characterization of IBD interactions with transcriptional repressor JPO2 and domesticated transposase PogZ, and show that the PogZ interaction is nearly identical to the interaction of LEDGF/p75 with MLL1. The interaction with the IBD is maintained by an intrinsically disordered IBD-binding motif (IBM) common to all known cellular partners of LEDGF/p75. In addition, based on IBM conservation, we identify and validate IWS1 as a novel LEDGF/p75 interaction partner. Our results also reveal how HIV integrase efficiently displaces cellular binding partners from LEDGF/p75. Finally, the similar binding modes of LEDGF/p75 interaction partners represent a new challenge for the development of selective interaction inhibitors. PMID:26245978

  3. Conserved phosphoprotein interaction motif is functionally interchangeable between ataxin-7 and arrestins.

    PubMed

    Mushegian, A R; Vishnivetskiy, S A; Gurevich, V V

    2000-06-13

    Olivopontocerebellar atrophy with retinal degeneration is a hereditary neurodegenerative disorder that belongs to the subtype II of the autosomal dominant cerebellar ataxias and is characterized by early-onset cerebellar and macular degeneration preceded by diagnostically useful tritan colorblindness. The gene mutated in the disease (SCA7) has been mapped to chromosome 3p12-13.5, and positional cloning identified the cause of the disease as CAG repeat expansion in this gene. The SCA7 gene product, ataxin-7, is an 897 amino acid protein with an expandable polyglutamine tract close to its N-terminus. No clues to ataxin-7 function have been obtained from sequence database searches. Here we report that ataxin-7 has a motif of ca. 50 amino acids, related to the phosphate-binding site of arrestins. To test the relevance of this sequence similarity, we introduced the putative ataxin-7 phosphate-binding site into visual arrestin and beta-arrestin. Both chimeric arrestins retain receptor-binding affinity and show characteristic high selectivity for phosphorylated activated forms of rhodopsin and beta-adrenergic receptor, respectively. Although the insertion of a Gly residue (absent in arrestins but present in the putative phosphate-binding site of ataxin-7) disrupts the function of visual arrestin-ataxin-7 chimera, it enhances the function of beta-arrestin-ataxin-7 chimera. Taken together, our data suggest that the arrestin-like site in the ataxin-7 sequence is a functional phosphate-binding site. The presence of the phosphate-binding site in ataxin-7 suggests that this protein may be involved in phosphorylation-dependent binding to its protein partner(s) in the cell. PMID:10841760

  4. Role of two sequence motifs of mesencephalic astrocyte-derived neurotrophic factor in its survival-promoting activity

    PubMed Central

    Mätlik, K; Yu, Li-ying; Eesmaa, A; Hellman, M; Lindholm, P; Peränen, J; Galli, E; Anttila, J; Saarma, M; Permi, P; Airavaara, M; Arumäe, U

    2015-01-01

    Mesencephalic astrocyte-derived neurotrophic factor (MANF) is a prosurvival protein that protects the cells when applied intracellularly in vitro or extracellularly in vivo. Its protective mechanisms are poorly known. Here we studied the role of two short sequence motifs within the carboxy-(C) terminal domain of MANF in its neuroprotective activity: the CKGC sequence (a CXXC motif) that could be involved in redox reactions, and the C-terminal RTDL sequence, an endoplasmic reticulum (ER) retention signal. We mutated these motifs and analyzed the antiapoptotic effect and intracellular localization of these mutants of MANF when overexpressed in cultured sympathetic or sensory neurons. As an in vivo model for studying the effect of these mutants after their extracellular application, we used the rat model of cerebral ischemia. Even though we found no evidence for oxidoreductase activity of MANF, the mutation of CXXC motif completely abolished its protective effect, showing that this motif is crucial for both MANF's intracellular and extracellular activity. The RTDL motif was not needed for the neuroprotective activity of MANF after its extracellular application in the stroke model in vivo. However, in vitro the deletion of RTDL motif inactivated MANF in the sympathetic neurons where the mutant protein localized to Golgi, but not in the sensory neurons where the mutant localized to the ER, showing that intracellular MANF protects these peripheral neurons in vitro only when localized to the ER. PMID:26720341

  5. Evolutionary and taxonomic implications of conserved structural motifs between picornaviruses and insect picorna-like viruses.

    PubMed

    Liljas, L; Tate, J; Lin, T; Christian, P; Johnson, J E

    2002-01-01

    A comparison of the recently determined structure of an insect picorna-like virus, Cricket paralysis virus (CrPV), with that of the mammalian picornaviruses shows that several structural features are highly conserved between these viruses. These conserved features include the topology of the coat proteins, the conformation of most loops, and the general arrangement of the internally located N-terminal arms of the coat proteins. The conformational conservation of the N-termini of the three major coat proteins between CrPV and the picornaviruses suggests a putative ancestral T = 3 virus. Comparisons of the genome structure and amino-acid sequence of the coat proteins of CrPV with a number of other insect picorna-like viruses show that most of them belong to a novel group, recently given the interim name Cricket paralysis-like viruses. Two other insect picorna-like viruses, Infectious flacherie virus (IFV) and Sacbrood virus (SBV), for which the genome sequences have recently been determined, have very different coat protein sequences and a genome organization more like the picornaviruses. However, the position of the small VP4 protein in the structural protein polyprotein as well as the mechanism for its cleavage from VP3 upon assembly strongly suggests an evolutionary link to the "Cricket paralysis-like viruses". We propose that the picornaviruses, Cricket paralysis-like viruses and IFV/SBV group are a natural assemblage. The ancestor for this assemblage had a structure based upon the CrPV/picornavirus paradigm and a genome encoding a single major coat protein; gene duplication and rearrangements have subsequently produced the viruses that we observe today. We also discuss the possible relatives of the proposed assemblage and the likely implications of future structural studies that may be carried out on the putative relatives. PMID:11855636

  6. A search for conserved sequences in coding regions reveals that the let-7 microRNA targets Dicer within its coding sequence

    PubMed Central

    Forman, Joshua J.; Legesse-Miller, Aster; Coller, Hilary A.

    2008-01-01

    Recognition sites for microRNAs (miRNAs) have been reported to be located in the 3′ untranslated regions of transcripts. In a computational screen for highly conserved motifs within coding regions, we found an excess of sequences conserved at the nucleotide level within coding regions in the human genome, the highest scoring of which are enriched for miRNA target sequences. To validate our results, we experimentally demonstrated that the let-7 miRNA directly targets the miRNA-processing enzyme Dicer within its coding sequence, thus establishing a mechanism for a miRNA/Dicer autoregulatory negative feedback loop. We also found computational evidence to suggest that miRNA target sites in coding regions and 3′ UTRs may differ in mechanism. This work demonstrates that miRNAs can directly target transcripts within their coding region in animals, and it suggests that a complete search for the regulatory targets of miRNAs should be expanded to include genes with recognition sites within their coding regions. As more genomes are sequenced, the methodological approach that we used for identifying motifs with high sequence conservation will be increasingly valuable for detecting functional sequence motifs within coding regions. PMID:18812516

  7. Mutational analysis of a conserved motif of Agrobacterium tumefaciens VirD2.

    PubMed

    Vogel, A M; Yoon, J; Das, A

    1995-10-25

    The VirD2 polypeptide from Agrobacterium tumefaciens, in the presence of VirD1, introduces a site- and strand-specific nick at the T-DNA borders. A similar reaction at the origin of transfer (oriT) of plasmids is essential for plasmid transfer by bacterial conjugation. A comparison of protein sequences of VirD2 and its functional homologs in bacterial conjugation and in rolling circle replication revealed that they share a conserved 14 residue segment, HxDxxx(P/u)HuHuuux [residues 126-139 of VirD2; Ilyina, T.V. and Koonin, E.V. (1992) Nucleic Acids Res. 20, 3279-3285]. A mutational approach was used to test the role of these residues in the endonuclease activity of VirD2. The results demonstrated that the two invariant histidine residues (H133 and H135) are essential for activity. Mutations at three sites, histidine 126, aspartic acid 128 and aspartic acid 130, that are conserved in a subfamily of the plasmid mobilization proteins, led to the loss of VirD2 activity. Aspartic acid at position 130 could be substituted with glutamic acid and, to a much lesser extent, with tyrosine. In contrast, another conserved residue, asparagine 139, tolerated many different amino acid substitutions. The non-conserved residues, arginine 129, proline 132 and leucine 134, were also found to be important for function. Isolation of null mutations that map throughout this conserved domain confirms the hypothesis that this region is essential for function. PMID:7479069
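
    The consensus HxDxxx(P/u)HuHuuux quoted above can be turned into a regular expression for scanning other sequences. The sketch below is one way to do that in Python, under stated assumptions: "x" is taken as any residue and "u" as a bulky hydrophobic residue (here I, L, V, M, F, Y or W), the helper name consensus_to_regex is invented for this example, and the test peptide is a hypothetical 14-mer rather than the actual VirD2 sequence.

```python
import re

# Assumption: residues covered by "u" (bulky hydrophobic).
HYDROPHOBIC = "ILVMFYW"


def consensus_to_regex(consensus):
    """Translate a consensus such as 'HxDxxx(P/u)HuHuuux' into a regex string."""
    out = []
    i = 0
    while i < len(consensus):
        ch = consensus[i]
        if ch == "(":
            # Alternatives like (P/u) become a single character class.
            j = consensus.index(")", i)
            options = consensus[i + 1:j].split("/")
            chars = "".join(HYDROPHOBIC if o == "u" else o for o in options)
            out.append(f"[{chars}]")
            i = j + 1
            continue
        if ch == "x":
            out.append(".")                    # any residue
        elif ch == "u":
            out.append(f"[{HYDROPHOBIC}]")     # hydrophobic residue
        else:
            out.append(ch)                     # invariant residue
        i += 1
    return "".join(out)


VIRD2_MOTIF = consensus_to_regex("HxDxxx(P/u)HuHuuux")

# Hypothetical peptides used only to exercise the pattern.
toy_wild_type = "HADRDAPHLHVLIN"
toy_mutant = "HADRDAPALHVLIN"   # invariant histidine at position 8 replaced

print(VIRD2_MOTIF)
print(bool(re.fullmatch(VIRD2_MOTIF, toy_wild_type)))
print(bool(re.fullmatch(VIRD2_MOTIF, toy_mutant)))
```

    A pattern match only says that a sequence is compatible with the consensus; as the mutational data in the abstract show, residues at "x" positions can still matter for function.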

  8. qPMS7: a fast algorithm for finding (ℓ, d)-motifs in DNA and protein sequences.

    PubMed

    Dinh, Hieu; Rajasekaran, Sanguthevar; Davila, Jaime

    2012-01-01

    Detection of rare events happening in a set of DNA/protein sequences could lead to new biological discoveries. One kind of such rare events is the presence of patterns called motifs in DNA/protein sequences. Finding motifs is a challenging problem since the general version of motif search has been proven to be intractable. Motif discovery is an important problem in biology. For example, it is useful in the detection of transcription factor binding sites and transcriptional regulatory elements that are very crucial in understanding gene function, human disease, drug design, etc. Many versions of the motif search problem have been proposed in the literature. One such is the (ℓ, d)-motif search (or Planted Motif Search (PMS)). A generalized version of the PMS problem, namely, Quorum Planted Motif Search (qPMS), is shown to accurately model motifs in real data. However, solving the qPMS problem is an extremely difficult task because a special case of it, the PMS Problem, is already NP-hard, which means that any algorithm solving it can be expected to take exponential time in the worst-case scenario. In this paper, we propose a novel algorithm named qPMS7 that tackles the qPMS problem on real data as well as challenging instances. Experimental results show that our Algorithm qPMS7 is on average 5 times faster than the state-of-the-art algorithm. The executable program of Algorithm qPMS7 is freely available on the web at http://pms.engr.uconn.edu/downloads/qPMS7.zip. Our online motif discovery tools that use Algorithm qPMS7 are freely available at http://pms.engr.uconn.edu or http://motifsearch.com. PMID:22848493
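
    To make the problem statement concrete, the sketch below solves small instances of quorum planted motif search by brute force: it tries every string of length ℓ and keeps those that occur, with at most d mismatches, in at least a fraction q of the input sequences. This is not qPMS7, which the abstract describes only at a high level; the exhaustive enumeration, the function names, and the toy sequences are assumptions chosen for clarity, and the running time grows as 4^ℓ for DNA.

```python
from itertools import product


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def occurs_within(candidate, seq, d):
    """True if some window of seq is within Hamming distance d of candidate."""
    l = len(candidate)
    return any(hamming(candidate, seq[i:i + l]) <= d
               for i in range(len(seq) - l + 1))


def quorum_planted_motifs(seqs, l, d, q=1.0, alphabet="ACGT"):
    """Exhaustively list all (l, d)-motifs present in at least q * len(seqs) sequences."""
    needed = q * len(seqs)
    found = []
    for letters in product(alphabet, repeat=l):
        cand = "".join(letters)
        support = sum(occurs_within(cand, s, d) for s in seqs)
        if support >= needed:
            found.append(cand)
    return found


# Hypothetical toy input: ACGT is planted with at most one mismatch per sequence.
toy_seqs = ["TTACGTAA", "GGACCTGG", "CCAGGTCC"]
print(quorum_planted_motifs(toy_seqs, l=4, d=1))
```

    Setting q = 1.0 recovers the plain PMS problem, in which the motif must appear in every sequence. The exponential candidate space is what makes dedicated algorithms necessary for realistic values of ℓ and d.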

  9. Conserved motifs II to VI of DNA helicase II from Escherichia coli are all required for biological activity.

    PubMed Central

    Zhang, G; Deng, E; Baugh, L R; Hamilton, C M; Maples, V F; Kushner, S R

    1997-01-01

    There are seven conserved motifs (IA, IB, and II to VI) in DNA helicase II of Escherichia coli that have high homology among a large family of proteins involved in DNA metabolism. To address the functional importance of motifs II to VI, we employed site-directed mutagenesis to replace the charged amino acid residues in each motif with alanines. Cells carrying these mutant alleles exhibited higher UV and methyl methanesulfonate sensitivity, increased rates of spontaneous mutagenesis, and elevated levels of homologous recombination, indicating defects in both the excision repair and mismatch repair pathways. In addition, we also changed the highly conserved tyrosine(600) in motif VI to phenylalanine (uvrD309, Y600F). This mutant displayed a moderate increase in UV sensitivity but a decrease in spontaneous mutation rate, suggesting that DNA helicase II may have different functions in the two DNA repair pathways. Furthermore, a mutation in domain IV (uvrD307, R284A) significantly reduced the viability of some E. coli K-12 strains at 30 degrees C but not at 37 degrees C. The implications of these observations are discussed. PMID:9393722

  10. Quadfinder: server for identification and analysis of quadruplex-forming motifs in nucleotide sequences

    PubMed Central

    Scaria, Vinod; Hariharan, Manoj; Arora, Amit; Maiti, Souvik

    2006-01-01

    G-quadruplex secondary structures, which play a structural role in repetitive DNA such as telomeres, may also play a functional role at other genomic locations as targetable regulatory elements which control gene expression. The recent interest in application of quadruplexes in biological systems prompted us to develop a tool for the identification and analysis of quadruplex-forming nucleotide sequences especially in the RNA. Here we present Quadfinder, an online server for prediction and bioinformatics of uni-molecular quadruplex-forming nucleotide sequences. The server is designed to be user-friendly and needs minimal intervention by the user, while providing flexibility of defining the variants of the motif. The server is freely available at URL . PMID:16845097
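
    The abstract does not state the search pattern that Quadfinder uses. A commonly used description of a uni-molecular quadruplex-forming motif is four runs of three or more guanines separated by loops of one to seven nucleotides; the Python sketch below assumes that pattern and exposes the run and loop lengths as parameters, in the spirit of the "variants of the motif" mentioned above. The function names and default values are assumptions for this example, not the server's interface.

```python
import re


def quadruplex_regex(min_g=3, max_g=5, min_loop=1, max_loop=7):
    """Build a regex for four G-runs joined by three loops (assumed motif definition)."""
    g_run = f"G{{{min_g},{max_g}}}"
    loop = f"[ACGTU]{{{min_loop},{max_loop}}}"   # U is accepted so RNA can be scanned
    return re.compile(f"{g_run}(?:{loop}{g_run}){{3}}")


def find_quadruplex_candidates(seq, **params):
    """Return [(1-based start, matched text), ...] for non-overlapping candidates."""
    regex = quadruplex_regex(**params)
    return [(m.start() + 1, m.group(0)) for m in regex.finditer(seq.upper())]


# Four telomere-like GGGTTA repeats flanked by AA, used as a toy input.
toy_seq = "AAGGGTTAGGGTTAGGGTTAGGGAA"
print(find_quadruplex_candidates(toy_seq))
```

    A regex hit marks a sequence as a candidate only; whether a quadruplex actually forms depends on context that a pattern match does not capture.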

  11. Functionally conserved enhancers with divergent sequences in distant vertebrates

    SciTech Connect

    Yang, Song; Oksenberg, Nir; Takayama, Sachiko; Heo, Seok -Jin; Poliakov, Alexander; Ahituv, Nadav; Dubchak, Inna; Boffelli, Dario

    2015-10-30

    To examine the contributions of sequence and function conservation in the evolution of enhancers, we systematically identified enhancers whose sequences are not conserved among distant groups of vertebrate species, but have homologous function and are likely to be derived from a common ancestral sequence. In conclusion, our approach combined comparative genomics and epigenomics to identify potential enhancer sequences in the genomes of three groups of distantly related vertebrate species.

  12. Using Weeder, Pscan, and PscanChIP for the Discovery of Enriched Transcription Factor Binding Site Motifs in Nucleotide Sequences.

    PubMed

    Zambelli, Federico; Pesole, Graziano; Pavesi, Giulio

    2014-01-01

    One of the greatest challenges facing modern molecular biology is understanding the complex mechanisms regulating gene expression. A fundamental step in this process requires the characterization of sequence motifs involved in the regulation of gene expression at transcriptional and post-transcriptional levels. In particular, transcription is modulated by the interaction of transcription factors (TFs) with their corresponding binding sites. Weeder, Pscan, and PscanChIP are software tools freely available for noncommercial users as stand-alone or Web-based applications for the automatic discovery of conserved motifs in a set of DNA sequences likely to be bound by the same TFs. Input for the tools can be promoter sequences from co-expressed or co-regulated genes (for which Weeder and Pscan are suitable), or regions identified through genome wide ChIP-seq or similar experiments (Weeder and PscanChIP). The motifs are either found by a de novo approach (Weeder) or by using descriptors of the binding specificity of TFs (Pscan and PscanChIP). PMID:25199791

  13. Sequence Analysis and Domain Motifs in the Porcine Skin Decorin Glycosaminoglycan Chain

    PubMed Central

    Zhao, Xue; Yang, Bo; Solakylidirim, Kemal; Joo, Eun Ji; Toida, Toshihiko; Higashi, Kyohei; Linhardt, Robert J.; Li, Lingyun

    2013-01-01

    Decorin proteoglycan is comprised of a core protein containing a single O-linked dermatan sulfate/chondroitin sulfate glycosaminoglycan (GAG) chain. Although the sequence of the decorin core protein is determined by the gene encoding its structure, the structure of its GAG chain is determined in the Golgi. The recent application of modern MS to bikunin, a far simpler chondroitin sulfate proteoglycan, suggests that it has a single or small number of defined sequences. On this basis, a similar approach was undertaken to sequence the much larger and more structurally complex dermatan sulfate/chondroitin sulfate GAG chain of porcine skin decorin. This approach resulted in information on the consistency/variability of its linkage region at the reducing end of the GAG chain, its iduronic acid-rich domain, glucuronic acid-rich domain, and non-reducing end. A general motif for the porcine skin decorin GAG chain was established. A single small decorin GAG chain was sequenced using MS/MS analysis. The data obtained in the study suggest that the decorin GAG chain has a small or a limited number of sequences. PMID:23423381

  14. DILIMOT: discovery of linear motifs in proteins.

    PubMed

    Neduva, Victor; Russell, Robert B

    2006-07-01

    Discovery of protein functional motifs is critical in modern biology. Small segments of 3-10 residues play critical roles in protein interactions, post-translational modifications and trafficking. DILIMOT (DIscovery of LInear MOTifs) is a server for the prediction of these short linear motifs within a set of proteins. Given a set of sequences sharing a common functional feature (e.g. interaction partner or localization) the method finds statistically over-represented motifs likely to be responsible for it. The input sequences are first passed through a set of filters to remove regions unlikely to contain instances of linear motifs. Motifs are then found in the remaining sequence and ranked according to a statistic that measures over-representation and conservation across homologues in related species. The results are displayed via a visual interface for easy perusal. The server is available at http://dilimot.embl.de. PMID:16845024

  15. Temporal motifs reveal homophily, gender-specific patterns, and group talk in call sequences.

    PubMed

    Kovanen, Lauri; Kaski, Kimmo; Kertész, János; Saramäki, Jari

    2013-11-01

    Recent studies on electronic communication records have shown that human communication has complex temporal structure. We study how communication patterns that involve multiple individuals are affected by attributes such as sex and age. To this end, we represent the communication records as a colored temporal network where node color is used to represent individuals' attributes, and identify patterns known as temporal motifs. We then construct a null model for the occurrence of temporal motifs that takes into account the interaction frequencies and connectivity between nodes of different colors. This null model allows us to detect significant patterns in call sequences that cannot be observed in a static network that uses interaction frequencies as link weights. We find sex-related differences in communication patterns in a large dataset of mobile phone records and show the existence of temporal homophily, the tendency of similar individuals to participate in communication patterns beyond what would be expected on the basis of their average interaction frequencies. We also show that temporal patterns differ between dense and sparse neighborhoods in the network. Because this result is also independent of interaction frequencies, it can be seen as an extension of Granovetter's hypothesis to temporal networks. PMID:24145424

  16. ZFP57 recognizes multiple and closely spaced sequence motif variants to maintain repressive epigenetic marks in mouse embryonic stem cells

    PubMed Central

    Anvar, Zahra; Cammisa, Marco; Riso, Vincenzo; Baglivo, Ilaria; Kukreja, Harpreet; Sparago, Angela; Girardot, Michael; Lad, Shraddha; De Feis, Italia; Cerrato, Flavia; Angelini, Claudia; Feil, Robert; Pedone, Paolo V.; Grimaldi, Giovanna; Riccio, Andrea

    2016-01-01

    Imprinting Control Regions (ICRs) need to maintain their parental allele-specific DNA methylation during early embryogenesis despite genome-wide demethylation and subsequent de novo methylation. ZFP57 and KAP1 are both required for maintaining the repressive DNA methylation and H3-lysine-9-trimethylation (H3K9me3) at ICRs. In vitro, ZFP57 binds a specific hexanucleotide motif that is enriched at its genomic binding sites. We now demonstrate in mouse embryonic stem cells (ESCs) that SNPs disrupting closely-spaced hexanucleotide motifs are associated with lack of ZFP57 binding and H3K9me3 enrichment. Through a transgenic approach in mouse ESCs, we further demonstrate that an ICR fragment containing three ZFP57 motif sequences recapitulates the original methylated or unmethylated status when integrated into the genome at an ectopic position. Mutation of Zfp57 or the hexanucleotide motifs led to loss of ZFP57 binding and DNA methylation of the transgene. Finally, we identified a sequence variant of the hexanucleotide motif that interacts with ZFP57 both in vivo and in vitro. The presence of multiple and closely located copies of ZFP57 motif variants emerges as a distinct characteristic that is required for the faithful maintenance of repressive epigenetic marks at ICRs and other ZFP57 binding sites. PMID:26481358

  17. Sequence analysis of the L protein of the Ebola 2014 outbreak: Insight into conserved regions and mutations.

    PubMed

    Ayub, Gohar; Waheed, Yasir

    2016-06-01

    The 2014 Ebola outbreak was one of the largest that have occurred; it started in Guinea and spread to Nigeria, Liberia and Sierra Leone. Phylogenetic analysis of the current virus species indicated that this outbreak is the result of a divergent lineage of the Zaire ebolavirus. The L protein of Ebola virus (EBOV) is the catalytic subunit of the RNA‑dependent RNA polymerase complex, which, with VP35, is key for the replication and transcription of viral RNA. Earlier sequence analysis demonstrated that the L protein of all non‑segmented negative‑sense (NNS) RNA viruses consists of six domains containing conserved functional motifs. The aim of the present study was to analyze the presence of these motifs in 2014 EBOV isolates, highlight their function and how they may contribute to the overall pathogenicity of the isolates. For this purpose, 81 2014 EBOV L protein sequences were aligned with 475 other NNS RNA viruses, including Paramyxoviridae and Rhabdoviridae viruses. Phylogenetic analysis of all EBOV outbreak L protein sequences was also performed. Analysis of the amino acid substitutions in the 2014 EBOV outbreak was conducted using sequence analysis. The alignment demonstrated the presence of previously conserved motifs in the 2014 EBOV isolates and novel residues. Notably, all the mutations identified in the 2014 EBOV isolates were tolerant, they were pathogenic with certain examples occurring within previously determined functional conserved motifs, possibly altering viral pathogenicity, replication and virulence. The phylogenetic analysis demonstrated that all sequences with the exception of the 2014 EBOV sequences were clustered together. The 2014 EBOV outbreak has acquired a great number of mutations, which may explain the reasons behind this unprecedented outbreak. Certain residues critical to the function of the polymerase remain conserved and may be targets for the development of antiviral therapeutic agents. PMID:27082438

  18. Exceptional motifs in different Markov chain models for a statistical analysis of DNA sequences.

    PubMed

    Schbath, S; Prum, B; de Turckheim, E

    1995-01-01

    Identifying exceptional motifs is often used for extracting information from long DNA sequences. The two difficulties of the method are the choice of the model that defines the expected frequencies of words and the approximation of the variance of the difference T(W) between the number of occurrences of a word W and its estimation. We consider here different Markov chain models, either with stationary or periodic transition probabilities. We estimate the variance of the difference T(W) by the conditional variance of the number of occurrences of W given the oligonucleotides counts that define the model. Two applications show how to use asymptotically standard normal statistics associated with the counts to describe a given sequence in terms of its outlying words. Sequences of Escherichia coli and of Bacillus subtilis are compared with respect to their exceptional tri- and tetranucleotides. For both bacteria, exceptional 3-words are mainly found in the coding frame. E. coli palindrome counts are analyzed in different models, showing that many overabundant words are one-letter mutations of avoided palindromes. PMID:8521272
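
    The difference T(W) described above can be illustrated for trinucleotides under a first-order Markov model, where a standard plug-in estimate of the expected count of a word abc is N(ab) N(bc) / N(b). The Python sketch below computes the observed count, this estimate, and their difference for every 3-word in a sequence. It is a simplified illustration under that assumed model: the conditional variance used in the paper to standardize T(W) is not reproduced here, and the function names are invented for this example.

```python
from collections import Counter


def count_words(seq, k):
    """Count overlapping occurrences of every k-letter word in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def trinucleotide_excess(seq):
    """Return {word: (observed, expected, observed - expected)} for all 3-words.

    The expected count assumes a first-order Markov chain and is estimated
    from the observed mono- and dinucleotide counts.
    """
    seq = seq.upper()
    n1, n2, n3 = (count_words(seq, k) for k in (1, 2, 3))
    excess = {}
    for word, observed in n3.items():
        expected = n2[word[:2]] * n2[word[1:]] / n1[word[1]]
        excess[word] = (observed, expected, observed - expected)
    return excess


# Toy inputs; a real analysis would use a long genomic sequence.
print(trinucleotide_excess("AAAA"))
print(trinucleotide_excess("ACGACGACG"))
```

    Words with a large positive or negative difference are the candidates for "exceptional" status; deciding which differences are significant requires the variance normalization discussed in the abstract.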

  19. The nature of actinomycin D binding to d(AACCAXYG) sequence motifs

    PubMed Central

    Chen, Fu-Ming; Sha, Feng; Chin, Ko-Hsin; Chou, Shan-Ho

    2004-01-01

    Earlier studies by others had indicated that actinomycin D (ACTD) binds well to d(AACCATAG) and the end sequence TAG-3′ is essential for its strong binding. In an effort to verify these assertions and to uncover other possible strong ACTD binding sequences as well as to elucidate the nature of their binding, systematic studies have been carried out with oligomers of d(AACCAXYG) sequence motifs, where X and Y can be any DNA base. The results indicate that in addition to TAG-3′, oligomers ending with XAG-3′ and XCG-3′ all provide binding constants ≥1 × 10^7 M^-1 and even sequences ending with XTG-3′ and XGG-3′ exhibit binding affinities in the range 1–8 × 10^6 M^-1. The nature of the strong ACTD affinity of the sequences d(A1A2C3C4A5X6Y7G8) was delineated via comparative binding studies of d(AACCAAAG), d(AGCCAAAG) and their base substituted derivatives. Two binding modes are proposed to coexist, with the major component consisting of the 3′-terminus G base folding back to base pair with C4 and the ACTD inserting at A2C3C4 by looping out the C3 while both faces of the chromophore are stacked by A and G bases, respectively. The minor mode is for the G to base pair with C3 and to have the same A/chromophore/G stacking but without a looped out base. These assertions are supported by induced circular dichroic and fluorescence spectral measurements. PMID:14715925

  20. Identification of Internal Transcribed Spacer Sequence Motifs in Truffles: a First Step toward Their DNA Bar Coding

    PubMed Central

    El Karkouri, Khalid; Murat, Claude; Zampieri, Elisa; Bonfante, Paola

    2007-01-01

    This work presents DNA sequence motifs from the internal transcribed spacer (ITS) of the nuclear rRNA repeat unit which are useful for the identification of five European and Asiatic truffles (Tuber magnatum, T. melanosporum, T. indicum, T. aestivum, and T. mesentericum). Truffles are edible mycorrhizal ascomycetes that show similar morphological characteristics but that have distinct organoleptic and economic values. A total of 36 out of 46 ITS1 or ITS2 sequence motifs have allowed an accurate in silico distinction of the five truffles to be made (i.e., by pattern matching and/or BLAST analysis on downloaded GenBank sequences and directly against GenBank databases). The motifs considered the intraspecific genetic variability of each species, including rare haplotypes, and assigned their respective species from either the ascocarps or ectomycorrhizas. The data indicate that short ITS1 or ITS2 motifs (≤50 bp in size) can be considered promising tools for truffle species identification. A dot blot hybridization analysis of T. magnatum and T. melanosporum compared with other close relatives or distant lineages allowed at least one highly specific motif to be identified for each species. These results were confirmed in a blind test which included new field isolates. The current work has provided a reliable new tool for a truffle oligonucleotide bar code and identification in ecological and evolutionary studies. PMID:17601808

  1. Identification of internal transcribed spacer sequence motifs in truffles: a first step toward their DNA bar coding.

    PubMed

    El Karkouri, Khalid; Murat, Claude; Zampieri, Elisa; Bonfante, Paola

    2007-08-01

    This work presents DNA sequence motifs from the internal transcribed spacer (ITS) of the nuclear rRNA repeat unit which are useful for the identification of five European and Asiatic truffles (Tuber magnatum, T. melanosporum, T. indicum, T. aestivum, and T. mesentericum). Truffles are edible mycorrhizal ascomycetes that show similar morphological characteristics but that have distinct organoleptic and economic values. A total of 36 out of 46 ITS1 or ITS2 sequence motifs have allowed an accurate in silico distinction of the five truffles to be made (i.e., by pattern matching and/or BLAST analysis on downloaded GenBank sequences and directly against GenBank databases). The motifs considered the intraspecific genetic variability of each species, including rare haplotypes, and assigned their respective species from either the ascocarps or ectomycorrhizas. The data indicate that short ITS1 or ITS2 motifs (≤50 bp in size) can be considered promising tools for truffle species identification. A dot blot hybridization analysis of T. magnatum and T. melanosporum compared with other close relatives or distant lineages allowed at least one highly specific motif to be identified for each species. These results were confirmed in a blind test which included new field isolates. The current work has provided a reliable new tool for a truffle oligonucleotide bar code and identification in ecological and evolutionary studies. PMID:17601808

  2. miRNA-mediated deadenylation is orchestrated by GW182 through two conserved motifs that interact with CCR4-NOT.

    PubMed

    Fabian, Marc R; Cieplak, Maja K; Frank, Filipp; Morita, Masahiro; Green, Jonathan; Srikumar, Tharan; Nagar, Bhushan; Yamamoto, Tadashi; Raught, Brian; Duchaine, Thomas F; Sonenberg, Nahum

    2011-11-01

    miRNAs recruit the miRNA-induced silencing complex (miRISC), which includes Argonaute and GW182 as core proteins. GW182 proteins effect translational repression and deadenylation of target mRNAs. However, the molecular mechanisms of GW182-mediated repression remain obscure. We show here that human GW182 independently interacts with the PAN2-PAN3 and CCR4-NOT deadenylase complexes. Interaction of GW182 with CCR4-NOT is mediated by two newly discovered phylogenetically conserved motifs. Although either motif is sufficient to bind CCR4-NOT, only one of them can promote processive deadenylation of target mRNAs. Thus, GW182 serves as both a platform that recruits deadenylases and as a deadenylase coactivator that facilitates the removal of the poly(A) tail by CCR4-NOT. PMID:21984185

  3. A conserved motif in transmembrane helix 1 of diphtheria toxin mediates catalytic domain delivery to the cytosol

    PubMed Central

    Ratts, Ryan; Trujillo, Carolina; Bharti, Ajit; vanderSpek, Johanna; Harrison, Robert; Murphy, John R.

    2005-01-01

    A 10-aa motif in transmembrane helix 1 of diphtheria toxin that is conserved in anthrax edema factor, anthrax lethal factor, and botulinum neurotoxin serotypes A, C, and D was identified by BLAST, ClustalW, and MEME computational analysis. Using the diphtheria toxin-related fusion protein toxin DAB389IL-2, we demonstrate that introduction of the L221E mutation into a highly conserved residue within this motif results in a nontoxic catalytic domain translocation deficient phenotype. To further probe the function of this motif in the process by which the catalytic domain is delivered from the lumen of early endosomes to the cytosol, we constructed a gene encoding a portion of diphtheria toxin transmembrane helix 1, T1, which carries the motif and is expressed from a CMV promoter. We then isolated stable transfectants of Hut102/6TG cells that express the T1 peptide, Hut102/6TG-T1. In contrast to the parental cell line, Hut102/6TG-T1 cells are ca. 10^4-fold more resistant to the fusion protein toxin. This resistance is completely reversed by coexpression of small interfering RNA directed against the gene encoding the T1 peptide in Hut102/6TG-T1 cells. We further demonstrate by GST-DT140-271 pull-down experiments in the presence and absence of synthetic T1 peptides the specific binding of coatomer protein complex subunit β to this region of the diphtheria toxin transmembrane domain. PMID:16230620

  4. Protospacer Adjacent Motif (PAM)-Distal Sequences Engage CRISPR Cas9 DNA Target Cleavage

    PubMed Central

    Ethier, Sylvain; Schmeing, T. Martin; Dostie, Josée; Pelletier, Jerry

    2014-01-01

    The clustered regularly interspaced short palindromic repeat (CRISPR)-associated enzyme Cas9 is an RNA-guided nuclease that has been widely adapted for genome editing in eukaryotic cells. However, the in vivo target specificity of Cas9 is poorly understood and most studies rely on in silico predictions to define the potential off-target editing spectrum. Using chromatin immunoprecipitation followed by sequencing (ChIP-seq), we delineate the genome-wide binding panorama of catalytically inactive Cas9 directed by two different single guide (sg) RNAs targeting the Trp53 locus. Cas9:sgRNA complexes are able to load onto multiple sites with short seed regions adjacent to 5′NGG3′ protospacer adjacent motifs (PAM). Yet among 43 ChIP-seq sites harboring seed regions analyzed for mutational status, we find editing only at the intended on-target locus and one off-target site. In vitro analysis of target site recognition revealed that interactions between the 5′ end of the guide and PAM-distal target sequences are necessary to efficiently engage Cas9 nucleolytic activity, providing an explanation for why off-target editing is significantly lower than expected from ChIP-seq data. PMID:25275497
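The idea of short seed regions sitting next to a 5′NGG3′ PAM can be illustrated with a minimal sketch. It assumes a plain DNA string, scans the forward strand only, and takes a hypothetical seed length of 5 nt (the abstract gives no length); the function names are likewise illustrative and do not reproduce the study's ChIP-seq analysis.

```python
import re

def find_seed_pam_sites(genome, guide, seed_len=5):
    """Return 0-based start positions of seed matches directly followed by NGG.

    The seed is taken as the last `seed_len` bases of the guide (the
    PAM-proximal end). Only the forward strand is scanned; the lookahead
    allows overlapping hits to be reported.
    """
    seed = guide[-seed_len:].upper()
    pattern = re.compile(r"(?=" + re.escape(seed) + r"[ACGT]GG)")
    return [m.start() for m in pattern.finditer(genome.upper())]

def pam_distal_mismatches(protospacer, guide, seed_len=5):
    """Count mismatches between guide and target outside the seed region."""
    distal_len = len(guide) - seed_len
    pairs = zip(protospacer[:distal_len].upper(), guide[:distal_len].upper())
    return sum(a != b for a, b in pairs)

if __name__ == "__main__":
    guide = "ACGTACGTACGTACGTACGT"  # made-up 20-nt guide, not a Trp53 sgRNA
    genome = "GGG" + guide + "TGG" + "AAAA" + "TACGT" + "CGG" + "TT"
    print(find_seed_pam_sites(genome, guide))
```

A site that matches only the seed plus PAM would show many PAM-distal mismatches, which is the situation the abstract associates with binding but inefficient cleavage.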

  5. Amplification of human papillomavirus DNA sequences by using conserved primers.

    PubMed Central

    Gregoire, L; Arella, M; Campione-Piccardo, J; Lancaster, W D

    1989-01-01

    The polymerase chain reaction has potential for use in the detection of small amounts of human papillomavirus (HPV) viral nucleic acids present in clinical specimens. However, new HPV types for which no probes exist would remain undetected by using type-specific primers for the polymerase chain reaction before hybridization. Primers corresponding to highly conserved HPV sequences may be useful for detecting low amounts of known HPV DNA as well as new HPV types. Here we analyze a pair of primers derived from conserved sequences within the E1 open reading frame for HPV sequence amplification by using the polymerase chain reaction. The longest perfect homology among HPV sequences is a 12-mer within the first exon of E1M. A region of conserved amino acids coded by the E1 open reading frame allowed the detection of another highly conserved region about 850 base pairs downstream. Two 21-mers derived from these conserved regions were used to amplify sequences from all HPV DNAs used as templates. The amplified DNA was shown to be specific for HPV sequences within the E1 open reading frame. DNA from HPVs whose sequences were not available was amplified by using these two primers. HPV DNA sequences in clinical specimens could also be amplified with the primers. PMID:2556429

  6. Human immunodeficiency virus type 1 and 2 envelope glycoproteins oligomerize through conserved sequences.

    PubMed Central

    Center, R J; Kemp, B E; Poumbourios, P

    1997-01-01

    Hetero-oligomerization between human immunodeficiency virus type 2 (HIV-2) envelope glycoprotein (Env) truncation mutants and epitope-tagged gp160 is dependent on the presence of gp41 transmembrane protein (TM) amino acids 552 to 589, a putative amphipathic alpha-helical sequence. HIV-2 Env truncation mutants containing this sequence were also able to form cross-type hetero-oligomers with HIV-1 Env. HIV-2/HIV-1 hetero-oligomerization was, however, more sensitive to disruption by mutagenesis or increased temperature. The conservation of the Env oligomerization function of the HIV-1 and HIV-2 alpha-helical sequences suggests that retroviral TM alpha-helical motifs may have a universal role in oligomerization. PMID:9188654

  7. Identification of G and P genotype-specific motifs in the predicted VP7 and VP4 amino acid sequences.

    PubMed

    Ma, Yongping

    2015-12-01

    Equine rotavirus (ERV) strain L338 (G13P[18]) has a unique G and P genotype. However, the evolutionary relationship of L338 with other ERVs is still unknown. Here whole genome analysis of the L338 ERV strain was independently performed. Its genotype constellations were determined as G13-P[18]-I6-R9-C9-M6-A6-N9-T12-E14-H11, confirming previous genotype assignments. The L338 strain only shared the P[18] and I6 genotypes with other ERVs. The nucleotide sequences of the other 9 RNA segments were different from those of cognate genes of all other group A rotavirus (RVA) strains including ERVs and formed unique phylogenetic lineages. The L338 evolutionary footprints were tentatively identified in both VP7 and VP4 amino acid sequences: two regions were found in VP7 and twelve in VP4. The conserved regions shared between L338 and other group A rotavirus strains (RVAs) indicated that L338 was more closely related genomically to animal and human RVAs other than ERVs, suggesting that L338 may not be an endogenous equine RV but have emerged as an interspecies reassortant with other RVA strains. Furthermore, genotype-specific motifs of all 27 G and 37 P types were identified in regions 7-1a (aa 91-100) of VP7 and regions 8-1 (aa 146-151) and 8-3 (aa 113-118 and 125-135) of VP4 (VP8*). PMID:26321159

  8. Flow Cytometry-assisted Cloning of Specific Sequence Motifs from Complex 16S ribosomal RNA Gene Libraries.

    SciTech Connect

    Nielsen, J.L.; Schramm, A.; Bernhard, A.E.; van den Engh, G.J.; Stahl, D.A.

    2004-07-21

    A flow cytometry method was developed for rapid screening and recovery of cloned DNA containing common sequence motifs. This approach, termed fluorescence-activated cell sorting-assisted cloning, was used to recover sequences affiliated with a unique lineage within the Bacteroidetes not abundant in a clone library of environmental 16S rRNA genes. Retrieval and sequence analysis of phylogenetically informative genes has become a standard cultivation-independent technique to investigate microbial diversity in nature (7, 18). Genes encoding the 16S rRNA, because of the relative ease of their selective amplification, have been most frequently employed for general diversity surveys (16). Environmental studies have also focused on specific subpopulations affiliated with a phylogenetic group or identified by genes encoding specific metabolic functions (e.g., ammonia oxidation, sulfate respiration, and nitrate reduction) (8,15,20). However, specific populations may be of low abundance (1,23), or the genes encoding specific metabolic functions may be insufficiently conserved to provide priming sites for general PCR amplification. Three general approaches have been used to obtain 16S rRNA sequence information from low-abundance populations: screening hundreds to thousands of clones in a general 16S rRNA gene library (21), flow cytometric sorting of a subpopulation of environmentally derived cells labeled by fluorescent in situ hybridization (FISH) (27), or selective PCR amplification using primers specific for the subpopulation (2,23). While the first approach is simply time-consuming and tedious, the second has been restricted to fairly large and strongly fluorescent cells from aquatic samples (5, 27). The third approach often generates fragments of only a few hundred bases due to the limited number of specific priming sites. Partial sequence information often degrades analysis, obscuring or distorting the phylogenetic placement of the new sequences (11, 20). A more robust characterization of environ

  9. Sequence, structure, and cooperativity in folding of elementary protein structural motifs

    PubMed Central

    Lai, Jason K.; Kubelka, Ginka S.; Kubelka, Jan

    2015-01-01

    Residue-level unfolding of two helix-turn-helix proteins—one naturally occurring and one de novo designed—is reconstructed from multiple sets of site-specific 13C isotopically edited infrared (IR) and circular dichroism (CD) data using Ising-like statistical-mechanical models. Several model variants are parameterized to test the importance of sequence-specific interactions (approximated by Miyazawa–Jernigan statistical potentials), local structural flexibility (derived from the ensemble of NMR structures), interhelical hydrogen bonds, and native contacts separated by intervening disordered regions (through the Wako–Saitô–Muñoz–Eaton scheme, which disallows such configurations). The models are optimized by directly simulating experimental observables: CD ellipticity at 222 nm for model proteins and their fragments and 13C-amide I′ bands for multiple isotopologues of each protein. We find that data can be quantitatively reproduced by the model that allows two interacting segments flanking a disordered loop (double sequence approximation) and incorporates flexibility in the native contact maps, but neither sequence-specific interactions nor hydrogen bonds are required. The near-identical free energy profiles as a function of the global order parameter are consistent with expected similar folding kinetics for nearly identical structures. However, the predicted folding mechanism for the two motifs is different, reflecting the order of local stability. We introduce free energy profiles for “experimental” reaction coordinates—namely, the degree of local folding as sensed by site-specific 13C-edited IR, which highlight folding heterogeneity and contrast its overall, average description with the detailed, local picture. PMID:26216963

  10. Sequence, structure, and cooperativity in folding of elementary protein structural motifs.

    PubMed

    Lai, Jason K; Kubelka, Ginka S; Kubelka, Jan

    2015-08-11

    Residue-level unfolding of two helix-turn-helix proteins--one naturally occurring and one de novo designed--is reconstructed from multiple sets of site-specific (13)C isotopically edited infrared (IR) and circular dichroism (CD) data using Ising-like statistical-mechanical models. Several model variants are parameterized to test the importance of sequence-specific interactions (approximated by Miyazawa-Jernigan statistical potentials), local structural flexibility (derived from the ensemble of NMR structures), interhelical hydrogen bonds, and native contacts separated by intervening disordered regions (through the Wako-Saitô-Muñoz-Eaton scheme, which disallows such configurations). The models are optimized by directly simulating experimental observables: CD ellipticity at 222 nm for model proteins and their fragments and (13)C-amide I' bands for multiple isotopologues of each protein. We find that data can be quantitatively reproduced by the model that allows two interacting segments flanking a disordered loop (double sequence approximation) and incorporates flexibility in the native contact maps, but neither sequence-specific interactions nor hydrogen bonds are required. The near-identical free energy profiles as a function of the global order parameter are consistent with expected similar folding kinetics for nearly identical structures. However, the predicted folding mechanism for the two motifs is different, reflecting the order of local stability. We introduce free energy profiles for "experimental" reaction coordinates--namely, the degree of local folding as sensed by site-specific (13)C-edited IR, which highlight folding heterogeneity and contrast its overall, average description with the detailed, local picture. PMID:26216963

  11. A conserved motif in JNK/p38-specific MAPK phosphatases as a determinant for JNK1 recognition and inactivation

    PubMed Central

    Liu, Xin; Zhang, Chen-Song; Lu, Chang; Lin, Sheng-Cai; Wu, Jia-Wei; Wang, Zhi-Xin

    2016-01-01

    Mitogen-activated protein kinases (MAPKs), important in a large array of signalling pathways, are tightly controlled by a cascade of protein kinases and by MAPK phosphatases (MKPs). MAPK signalling efficiency and specificity is modulated by protein–protein interactions between individual MAPKs and the docking motifs in cognate binding partners. Two types of docking interactions have been identified: D-motif-mediated interaction and FXF-docking interaction. Here we report the crystal structure of JNK1 bound to the catalytic domain of MKP7 at 2.4-Å resolution, providing high-resolution structural insight into the FXF-docking interaction. The 285FNFL288 segment in MKP7 directly binds to a hydrophobic site on JNK1 that is near the MAPK insertion and helix αG. Biochemical studies further reveal that this highly conserved structural motif is present in all members of the MKP family, and the interaction mode is universal and critical for the MKP-MAPK recognition and biological function. PMID:26988444

  12. Comparative Sequence and Structure Analysis Reveals the Conservation and Diversity of Nucleotide Positions and Their Associated Tertiary Interactions in the Riboswitches

    PubMed Central

    Appasamy, Sri D.; Ramlan, Effirul Ikhwan; Firdaus-Raih, Mohd

    2013-01-01

    The tertiary motifs in complex RNA molecules play vital roles to either stabilize the formation of RNA 3D structure or to provide important biological functionality to the molecule. In order to better understand the roles of these tertiary motifs in riboswitches, we examined 11 representative riboswitch PDB structures for potential agreement of both motif occurrences and conservations. A total of 61 unique tertiary interactions were found in the reference structures. In addition to the expected common A-minor motifs and base-triples mainly involved in linking distant regions of the riboswitch structures, three highly conserved variants of A-minor interactions called G-minors were found in the SAM-I and FMN riboswitches, where they appear to be involved in the recognition of the respective ligand’s functional groups. From our structural survey as well as corresponding structure and sequence alignments, the agreement between motif occurrences and conservations is very prominent across the representative riboswitches. Our analysis provides evidence that some of these tertiary interactions are essential components to form the structure, where their sequence positions are conserved despite a high degree of diversity in other parts of the respective riboswitch sequences. This is indicative of a vital role for these tertiary interactions in determining the specific biological function of riboswitches. PMID:24040136

  13. Structure of the Brd4 ET domain bound to a C-terminal motif from γ-retroviral integrases reveals a conserved mechanism of interaction

    PubMed Central

    Crowe, Brandon L.; Larue, Ross C.; Yuan, Chunhua; Hess, Sonja; Kvaratskhelia, Mamuka; Foster, Mark P.

    2016-01-01

    The bromodomain and extraterminal domain (BET) protein family are promising therapeutic targets for a range of diseases linked to transcriptional activation, cancer, viral latency, and viral integration. Tandem bromodomains selectively tether BET proteins to chromatin by engaging cognate acetylated histone marks, and the extraterminal (ET) domain is the focal point for recruiting a range of cellular and viral proteins. BET proteins guide γ-retroviral integration to transcription start sites and enhancers through bimodal interaction with chromatin and the γ-retroviral integrase (IN). We report the NMR-derived solution structure of the Brd4 ET domain bound to a conserved peptide sequence from the C terminus of murine leukemia virus (MLV) IN. The complex reveals a protein–protein interaction governed by the binding-coupled folding of disordered regions in both interacting partners to form a well-structured intermolecular three-stranded β sheet. In addition, we show that a peptide comprising the ET binding motif (EBM) of MLV IN can disrupt the cognate interaction of Brd4 with NSD3, and that substitutions of Brd4 ET residues essential for binding MLV IN also impair interaction of Brd4 with a number of cellular partners involved in transcriptional regulation and chromatin remodeling. This suggests that γ-retroviruses have evolved the EBM to mimic a cognate interaction motif to achieve effective integration in host chromatin. Collectively, our findings identify key structural features of the ET domain of Brd4 that allow for interactions with both cellular and viral proteins. PMID:26858406

  14. AptaTRACE Elucidates RNA Sequence-Structure Motifs from Selection Trends in HT-SELEX Experiments.

    PubMed

    Dao, Phuong; Hoinka, Jan; Takahashi, Mayumi; Zhou, Jiehua; Ho, Michelle; Wang, Yijie; Costa, Fabrizio; Rossi, John J; Backofen, Rolf; Burnett, John; Przytycka, Teresa M

    2016-07-01

    Aptamers, short RNA or DNA molecules that bind distinct targets with high affinity and specificity, can be identified using high-throughput systematic evolution of ligands by exponential enrichment (HT-SELEX), but scalable analytic tools for understanding sequence-function relationships from diverse HT-SELEX data are not available. Here we present AptaTRACE, a computational approach that leverages the experimental design of the HT-SELEX protocol, RNA secondary structure, and the potential presence of many secondary motifs to identify sequence-structure motifs that show a signature of selection. We apply AptaTRACE to identify nine motifs in C-C chemokine receptor type 7 targeted by aptamers in an in vitro cell-SELEX experiment. We experimentally validate two aptamers whose binding required both sequence and structural features. AptaTRACE can identify low-abundance motifs, and we show through simulations that, because of this, it could lower HT-SELEX cost and time by reducing the number of selection cycles required. PMID:27467247

  15. Explorations of linked editosome domains leading to the discovery of motifs defining conserved pockets in editosome OB-folds

    PubMed Central

    Park, Young-Jun; Hol, Wim G. J.

    2012-01-01

    Trypanosomatids form a group of protozoa which contain parasites of humans, animals and plants. Several of these species cause major human diseases, including Trypanosoma brucei which is the causative agent of human African trypanosomiasis, also called sleeping sickness. These organisms have many highly unusual features including a unique U-insertion/deletion RNA editing process in the single mitochondrion. A key multi-protein complex, called the ~20S editosome, or editosome, carries out a cascade of essential RNA-modifying reactions and contains a core of 12 different proteins of which six are the interaction proteins A1 to A6. Each of these interaction proteins comprises a C-terminal OB-fold and the smallest interaction protein A6 has been shown to interact with four other editosome OB-folds. Here we report the results of a “linked OB-fold” approach to obtain a view of how multiple OB-folds might interact in the core of the editosome. Constructs of multiple variants of linked domains in 25 expression and co-expression experiments resulted in 13 soluble multi-OB-fold complexes. In several instances, these complexes were more homogeneous in size than those obtained from corresponding unlinked OB-folds. The crystal structure of A3OB linked to A6 could be elucidated and confirmed the tight interaction between these two OB domains as seen also in our recent complex of A3OB and A6 with nanobodies. In the current crystal structure of A3OB linked to A6, hydrophobic side chains reside in well-defined pockets of neighboring OB-fold domains. When analyzing the available crystal structures of editosome OB-folds, it appears that in five instances “Pocket 1” of A1OB, A3OB and A6 is occupied by a hydrophobic side chain from a neighboring protein. In these three different OB-folds, Pocket 1 is formed by two conserved sequence motifs and an invariant arginine. These pockets might play a key role in the assembly or mechanism of the editosome by interacting with hydrophobic

  16. Finishing and Special Motifs: Lessons Learned from CRISPR Analysis Using Next-Generation Draft Sequences (7th Annual SFAF Meeting, 2012)

    ScienceCinema

    Campbell, Catherine [Noblis]

    2013-03-22

    Catherine Campbell on "Finishing and Special Motifs: Lessons learned from CRISPR analysis using next-generation draft sequences" at the 2012 Sequencing, Finishing, Analysis in the Future Meeting held June 5-7, 2012 in Santa Fe, New Mexico.

  17. Finishing and Special Motifs: Lessons Learned from CRISPR Analysis Using Next-Generation Draft Sequences (7th Annual SFAF Meeting, 2012)

    SciTech Connect

    Campbell, Catherine

    2012-06-01

    Catherine Campbell on "Finishing and Special Motifs: Lessons learned from CRISPR analysis using next-generation draft sequences" at the 2012 Sequencing, Finishing, Analysis in the Future Meeting held June 5-7, 2012 in Santa Fe, New Mexico.

  18. The Molecular Switching Mechanism at the Conserved D(E)RY Motif in Class-A GPCRs.

    PubMed

    Sandoval, Angelica; Eichler, Stefanie; Madathil, Sineej; Reeves, Philip J; Fahmy, Karim; Böckmann, Rainer A

    2016-07-12

    The disruption of ionic and H-bond interactions between the cytosolic ends of transmembrane helices TM3 and TM6 of class-A (rhodopsin-like) G protein-coupled receptors (GPCRs) is a hallmark for their activation by chemical or physical stimuli. In the bovine photoreceptor rhodopsin, this is accompanied by proton uptake at Glu(134) in the class-conserved D(E)RY motif. Studies on TM3 model peptides proposed a crucial role of the lipid bilayer in linking protonation to stabilization of an active state-like conformation. However, the molecular details of this linkage could not be resolved and have been addressed in this study by molecular dynamics (MD) simulations on TM3 model peptides in a bilayer of 1,2-dioleoyl-sn-glycero-3-phosphocholine (DOPC). We show that protonation of the conserved glutamic acid alters the peptide insertion depth in the membrane, its side-chain rotamer preferences, and stabilizes the C-terminal helical structure. These factors contribute to the rise of the side-chain pKa (> 6) and to reduced polarity around the TM3 C terminus as confirmed by fluorescence spectroscopy. Helix stabilization requires the protonated carboxyl group; unexpectedly, this stabilization could not be evoked with an amide in MD simulations. Additionally, time-resolved Fourier transform infrared (FTIR) spectroscopy of TM3 model peptides revealed a different kinetics for lipid ester carbonyl hydration, suggesting that the carboxyl is linked to more extended H-bond clusters than an amide. Remarkably, this was seen as well in DOPC-reconstituted Glu(134)- and Gln(134)-containing bovine opsin mutants and demonstrates that the D(E)RY motif is a hydrated microdomain. The function of the D(E)RY motif as a proton switch is suggested to be based on the reorganization of the H-bond network at the membrane interface. PMID:27410736
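The D(E)RY notation denotes a tripeptide with Asp or Glu in the first position followed by Arg and Tyr, which maps onto a simple regular expression. The sketch below is only a pattern-matching illustration under that reading; the function name and the toy peptide strings are invented for the example and are not rhodopsin or TM3 model peptide sequences.

```python
import re

# D(E)RY: Asp or Glu, then Arg, then Tyr (one-letter amino acid codes).
DERY_PATTERN = re.compile(r"[DE]RY")

def find_dery_motifs(protein_seq):
    """Return (1-based position, matched tripeptide) for each D(E)RY hit."""
    hits = []
    for match in DERY_PATTERN.finditer(protein_seq.upper()):
        hits.append((match.start() + 1, match.group()))
    return hits

if __name__ == "__main__":
    toy_peptide = "MDRYAAERY"  # made-up sequence for demonstration only
    print(find_dery_motifs(toy_peptide))
```

Positions are reported 1-based because residue numbering in the abstract, such as Glu(134), follows that convention.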

  19. Endocytosis and Trafficking of Natriuretic Peptide Receptor-A: Potential Role of Short Sequence Motifs

    PubMed Central

    Pandey, Kailash N.

    2015-01-01

    The targeted endocytosis and redistribution of transmembrane receptors among membrane-bound subcellular organelles are vital for their correct signaling and physiological functions. Membrane receptors committed for internalization and trafficking pathways are sorted into coated vesicles. Cardiac hormones, atrial and brain natriuretic peptides (ANP and BNP) bind to guanylyl cyclase/natriuretic peptide receptor-A (GC-A/NPRA) and elicit the generation of intracellular second messenger cyclic guanosine 3',5'-monophosphate (cGMP), which lowers blood pressure and incidence of heart failure. After ligand binding, the receptor is rapidly internalized, sequestrated, and redistributed into intracellular locations. Thus, NPRA is considered a dynamic cellular macromolecule that traverses different subcellular locations through its lifetime. The utilization of pharmacologic and molecular perturbants has helped in delineating the pathways of endocytosis, trafficking, down-regulation, and degradation of membrane receptors in intact cells. This review describes the investigation of the mechanisms of internalization, trafficking, and redistribution of NPRA compared with other cell surface receptors from the plasma membrane into the cell interior. The roles of different short-signal peptide sequence motifs in the internalization and trafficking of other membrane receptors have been briefly reviewed and their potential significance in the internalization and trafficking of NPRA is discussed. PMID:26151885

  20. Direct contacts between conserved motifs of different subunits provide major contribution to active site organization in human and mycobacterial dUTPases

    PubMed Central

    Takács, Enikő; Nagy, Gergely; Leveles, Ibolya; Harmat, Veronika; Lopata, Anna; Tóth, Judit; Vértessy, Beáta G.

    2010-01-01

    dUTPases are essential for genome integrity. Recent results allowed characterization of the role of conserved residues. Here we analyzed the Asp/Asn mutation within conserved Motif I of human and mycobacterial dUTPases, wherein the Asp residue was previously implicated in Mg2+-coordination. Our results on transient/steady-state kinetics, ligand-binding and a 1.80 Å-resolution structure of the mutant mycobacterial enzyme, in comparison with wild type and C-terminally truncated structures, argue that this residue has a major role in providing intra- and intersubunit contacts, but is not essential for Mg2+ accommodation. We conclude that in addition to the role of conserved motifs in substrate accommodation, direct subunit interactions between protein atoms of active site residues from different conserved motifs are crucial for enzyme function. PMID:20493855

  1. Direct contacts between conserved motifs of different subunits provide major contribution to active site organization in human and mycobacterial dUTPases.

    PubMed

    Takács, Eniko; Nagy, Gergely; Leveles, Ibolya; Harmat, Veronika; Lopata, Anna; Tóth, Judit; Vértessy, Beáta G

    2010-07-16

    dUTP pyrophosphatases (dUTPases) are essential for genome integrity. Recent results allowed characterization of the role of conserved residues. Here we analyzed the Asp/Asn mutation within conserved Motif I of human and mycobacterial dUTPases, wherein the Asp residue was previously implicated in Mg(2+)-coordination. Our results on transient/steady-state kinetics, ligand binding and a 1.80 Å resolution structure of the mutant mycobacterial enzyme, in comparison with wild type and C-terminally truncated structures, argue that this residue has a major role in providing intra- and intersubunit contacts, but is not essential for Mg(2+) accommodation. We conclude that in addition to the role of conserved motifs in substrate accommodation, direct subunit interactions between protein atoms of active site residues from different conserved motifs are crucial for enzyme function. PMID:20493855

  2. Defining RNA motif-aminoglycoside interactions via two-dimensional combinatorial screening and structure-activity relationships through sequencing.

    PubMed

    Velagapudi, Sai Pradeep; Disney, Matthew D

    2013-10-15

    RNA is an extremely important target for the development of chemical probes of function or small molecule therapeutics. Aminoglycosides are the most well studied class of small molecules to target RNA. However, the RNA motifs outside of the bacterial rRNA A-site that are likely to be bound by these compounds in biological systems are largely unknown. If such information were known, it could allow for aminoglycosides to be exploited to target other RNAs and, in addition, could provide invaluable insights into potential bystander targets of these clinically used drugs. We utilized two-dimensional combinatorial screening (2DCS), a library-versus-library screening approach, to select the motifs displayed in a 3×3 nucleotide internal loop library and in a 6-nucleotide hairpin library that bind with high affinity and selectivity to six aminoglycoside derivatives. The selected RNA motifs were then analyzed using structure-activity relationships through sequencing (StARTS), a statistical approach that defines the privileged RNA motif space that binds a small molecule. StARTS allowed for the facile annotation of the selected RNA motif-aminoglycoside interactions in terms of affinity and selectivity. The interactions selected by 2DCS generally have nanomolar affinities, which is higher affinity than the binding of aminoglycosides to a mimic of their therapeutic target, the bacterial rRNA A-site. PMID:23719281

  3. DNA recognition for virus assembly through multiple sequence-independent interactions with a helix-turn-helix motif

    PubMed Central

    Greive, Sandra J.; Fung, Herman K.H.; Chechik, Maria; Jenkins, Huw T.; Weitzel, Stephen E.; Aguiar, Pedro M.; Brentnall, Andrew S.; Glousieau, Matthieu; Gladyshev, Grigory V.; Potts, Jennifer R.; Antson, Alfred A.

    2016-01-01

    The helix-turn-helix (HTH) motif features frequently in protein DNA-binding assemblies. Viral pac site-targeting small terminase proteins possess an unusual architecture in which the HTH motifs are displayed in a ring, distinct from the classical HTH dimer. Here we investigate how such a circular array of HTH motifs enables specific recognition of the viral genome for initiation of DNA packaging during virus assembly. We found, by surface plasmon resonance and analytical ultracentrifugation, that individual HTH motifs of the Bacillus phage SF6 small terminase bind the packaging regions of SF6 and related SPP1 genome weakly, with little local sequence specificity. Nuclear magnetic resonance chemical shift perturbation studies with an arbitrary single-site substrate suggest that the HTH motif contacts DNA similarly to how certain HTH proteins contact DNA non-specifically. Our observations support a model where specificity is generated through conformational selection of an intrinsically bent DNA segment by a ring of HTHs which bind weakly but cooperatively. Such a system would enable viral gene regulation and control of the viral life cycle, with a minimal genome, conferring a major evolutionary advantage for SPP1-like viruses. PMID:26673721

  4. DNA recognition for virus assembly through multiple sequence-independent interactions with a helix-turn-helix motif.

    PubMed

    Greive, Sandra J; Fung, Herman K H; Chechik, Maria; Jenkins, Huw T; Weitzel, Stephen E; Aguiar, Pedro M; Brentnall, Andrew S; Glousieau, Matthieu; Gladyshev, Grigory V; Potts, Jennifer R; Antson, Alfred A

    2016-01-29

    The helix-turn-helix (HTH) motif features frequently in protein DNA-binding assemblies. Viral pac site-targeting small terminase proteins possess an unusual architecture in which the HTH motifs are displayed in a ring, distinct from the classical HTH dimer. Here we investigate how such a circular array of HTH motifs enables specific recognition of the viral genome for initiation of DNA packaging during virus assembly. We found, by surface plasmon resonance and analytical ultracentrifugation, that individual HTH motifs of the Bacillus phage SF6 small terminase bind the packaging regions of SF6 and related SPP1 genome weakly, with little local sequence specificity. Nuclear magnetic resonance chemical shift perturbation studies with an arbitrary single-site substrate suggest that the HTH motif contacts DNA similarly to how certain HTH proteins contact DNA non-specifically. Our observations support a model where specificity is generated through conformational selection of an intrinsically bent DNA segment by a ring of HTHs which bind weakly but cooperatively. Such a system would enable viral gene regulation and control of the viral life cycle, with a minimal genome, conferring a major evolutionary advantage for SPP1-like viruses. PMID:26673721

  5. Identification of a Novel Sequence Motif Recognized by the Ankyrin Repeat Domain of zDHHC17/13 S-Acyltransferases.

    PubMed

    Lemonidis, Kimon; Sanchez-Perez, Maria C; Chamberlain, Luke H

    2015-09-01

    S-Acylation is a major post-translational modification affecting several cellular processes. It is particularly important for neuronal functions. This modification is catalyzed by a family of transmembrane S-acyltransferases that contain a conserved zinc finger DHHC (zDHHC) domain. Typically, eukaryote genomes encode for 7-24 distinct zDHHC enzymes, with two members also harboring an ankyrin repeat (AR) domain at their cytosolic N termini. The AR domain of zDHHC enzymes is predicted to engage in numerous interactions and facilitates both substrate recruitment and S-acylation-independent functions; however, the sequence/structural features recognized by this module remain unknown. The two mammalian AR-containing S-acyltransferases are the Golgi-localized zDHHC17 and zDHHC13, also known as Huntingtin-interacting proteins 14 and 14-like, respectively; they are highly expressed in brain, and their loss in mice leads to neuropathological deficits that are reminiscent of Huntington's disease. Here, we report that zDHHC17 and zDHHC13 recognize, via their AR domain, evolutionary conserved and closely related sequences of a [VIAP][VIT]XXQP consensus in SNAP25, SNAP23, cysteine string protein, Huntingtin, cytoplasmic linker protein 3, and microtubule-associated protein 6. This novel AR-binding sequence motif is found in regions predicted to be unstructured and is present in a number of zDHHC17 substrates and zDHHC17/13-interacting S-acylated proteins. This is the first study to identify a motif recognized by AR-containing zDHHCs. PMID:26198635
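
    The [VIAP][VIT]XXQP consensus reported above maps directly onto a regular expression. The following is a minimal sketch, assuming X stands for any single residue; the function name and the toy sequence are hypothetical and are not taken from the study.

```python
import re

# Consensus from the abstract: [VIAP][VIT]XXQP, with X read as "any residue".
# The lookahead lets overlapping matches be reported as well.
AR_BINDING_MOTIF = re.compile(r"(?=([VIAP][VIT]..QP))")


def find_motif(sequence: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched hexapeptide) for every hit."""
    return [(m.start(), m.group(1))
            for m in AR_BINDING_MOTIF.finditer(sequence.upper())]


# Hypothetical toy sequence, not a real protein.
toy = "MSTAVIGGQPLLKPTEEQPA"
print(find_motif(toy))
```

    A scan like this only flags candidate sites; whether a hit lies in an unstructured region, as the abstract describes, would have to be checked separately.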

  6. Fine Scale Analysis of Crossover and Non-Crossover and Detection of Recombination Sequence Motifs in the Honeybee (Apis mellifera)

    PubMed Central

    Bessoltane, Nadia; Toffano-Nioche, Claire; Solignac, Michel; Mougel, Florence

    2012-01-01

    Background Meiotic exchanges are non-uniformly distributed across the genome of most studied organisms. This uneven distribution suggests that recombination is initiated by specific signals and/or regulations. Some of these signals were recently identified in humans and mice. However, it is unclear whether or not sequence signals are also involved in chromosomal recombination of insects. Methodology We analyzed recombination frequencies in the honeybee, in which genome sequencing provided a large amount of SNPs spread over the entire set of chromosomes. As the genome sequences were obtained from a pool of haploid males, which were the progeny of a single queen, an oocyte method (study of recombination on haploid males that develop from unfertilized eggs and hence are the direct reflection of female gamete haplotypes) was developed to detect recombined pairs of SNP sites. Sequences were further compared between recombinant and non-recombinant fragments to detect recombination-specific motifs. Conclusions Recombination events between adjacent SNP sites were detected at an average distance of 92 bp and revealed the existence of high rates of recombination events. This study also shows the presence of conversion without crossover (i.e. non-crossover) events, which greatly outnumber crossover events. Furthermore, the comparison of sequences that have undergone recombination with sequences that have not led to the discovery of sequence motifs (CGCA, GCCGC, CCGCA), which may correspond to recombination signals. PMID:22567142
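
    Comparing motif frequencies between recombinant and non-recombinant fragments can be illustrated with a short sketch. The abstract does not describe the statistics actually used, so the per-base rate below is an assumption, and the fragments are hypothetical toy sequences; only the three motifs come from the abstract.

```python
MOTIFS = ("CGCA", "GCCGC", "CCGCA")  # motifs named in the abstract


def count_overlapping(seq: str, motif: str) -> int:
    """Count occurrences of motif in seq, allowing overlaps."""
    count, start = 0, seq.find(motif)
    while start != -1:
        count += 1
        start = seq.find(motif, start + 1)
    return count


def motif_rate(fragments: list[str], motif: str) -> float:
    """Occurrences of motif per base across a collection of fragments."""
    total_bases = sum(len(f) for f in fragments)
    hits = sum(count_overlapping(f.upper(), motif) for f in fragments)
    return hits / total_bases if total_bases else 0.0


# Hypothetical toy fragments, not honeybee data.
recombinant = ["ATCCGCATT", "GGCCGCAAT"]
non_recombinant = ["ATATATATT", "GGTTAACGT"]

for motif in MOTIFS:
    print(motif,
          motif_rate(recombinant, motif),
          motif_rate(non_recombinant, motif))
```

    With real data, the two rates would need a significance test and a correction for base composition before any motif could be called recombination-specific.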

  7. A Conserved alpha-helical motif mediates the binding of diverse nuclear proteins to the SRC1 interaction domain of CBP.

    PubMed

    Matsuda, Sachiko; Harries, Janet C; Viskaduraki, Maria; Troke, Philip J F; Kindle, Karin B; Ryan, Colm; Heery, David M

    2004-04-01

    CREB-binding protein (CBP) and p300 contain modular domains that mediate protein-protein interactions with a wide variety of nuclear factors. A C-terminal domain of CBP (referred to as the SID) is responsible for interaction with the alpha-helical AD1 domain of p160 coactivators such as the steroid receptor coactivator (SRC1), and also other transcriptional regulators such as E1A, Ets-2, IRF3, and p53. Here we show that the pointed (PNT) domain of Ets-2 mediates its interaction with the CBP SID, and describe the effects of mutations in the SID on binding of Ets-2, E1A, and SRC1. In vitro binding studies indicate that SRC1, Ets-2 and E1A display mutually exclusive binding to the CBP SID. Consistent with this, we observed negative cross-talk between ERalpha/SRC1, Ets-2, and E1A proteins in reporter assays in transiently transfected cells. Transcriptional inhibition of Ets-2 or GAL4-AD1 activity by E1A was rescued by co-transfection with a CBP expression plasmid, consistent with the hypothesis that the observed inhibition was due to competition for CBP in vivo. Sequence comparisons revealed that SID-binding proteins contain a leucine-rich motif similar to the alpha-helix Aalpha1 of the SRC1 AD1 domain. Deletion mutants of E1A and Ets-2 lacking the conserved motif were unable to bind the CBP SID. Moreover, a peptide corresponding to this sequence competed the binding of full-length SRC1, Ets-2, and E1A proteins to the CBP SID. Thus, a leucine-rich amphipathic alpha-helix mediates mutually exclusive interactions of functionally diverse nuclear proteins with CBP. PMID:14722092

  8. The 2.2 Å resolution crystal structure of Bacillus cereus Nif3-family protein YqfO reveals a conserved dimetal-binding motif and a regulatory domain

    PubMed Central

    Godsey, Michael H.; Minasov, George; Shuvalova, Ludmilla; Brunzelle, Joseph S.; Vorontsov, Ivan I.; Collart, Frank R.; Anderson, Wayne F.

    2007-01-01

    YqfO of Bacillus cereus is a member of the widespread Nif3 family of proteins, which has been highlighted as an important target for structural genomics. The N- and C-terminal domains are conserved across the family and contain a dimetal-binding motif in a putative active site. YqfO contains an insert in the middle of the protein, present in a minority of bacterial family members. The structure of YqfO was determined at a resolution of 2.2 Å and reveals conservation of the putative active site. It also reveals the previously unknown structure of the insert, which despite extremely limited sequence conservation, bears great similarity to PII, CutA, and a number of other trimeric regulatory proteins. Our results suggest that this domain acts as a signal sensor to regulate the still-unknown catalytic activity of the more-conserved domains. PMID:17586767

  9. Creation of Hybrid Nanorods From Sequences of Natural Trimeric Fibrous Proteins Using the Fibritin Trimerization Motif

    NASA Astrophysics Data System (ADS)

    Papanikolopoulou, Katerina; van Raaij, Mark J.; Mitraki, Anna

    Stable, artificial fibrous proteins that can be functionalized open new avenues in fields such as bionanomaterials design and fiber engineering. An important source of inspiration for the creation of such proteins are natural fibrous proteins such as collagen, elastin, insect silks, and fibers from phages and viruses. The fibrous parts of this last class of proteins usually adopt trimeric, β-stranded structural folds and are appended to globular, receptor-binding domains. It has been recently shown that the globular domains are essential for correct folding and trimerization and can be successfully substituted by a very small (27-amino acid) trimerization motif from phage T4 fibritin. The hybrid proteins are correctly folded nanorods that can withstand extreme conditions. When the fibrous part derives from the adenovirus fiber shaft, different tissue-targeting specificities can be engineered into the hybrid proteins, which therefore can be used as gene therapy vectors. The integration of such stable nanorods in devices is also a big challenge in the field of biomechanical design. The fibritin foldon domain is a versatile trimerization motif and can be combined with a variety of fibrous motifs, such as coiled-coil, collagenous, and triple β-stranded motifs, provided the appropriate linkers are used. The combination of different motifs within the same fibrous molecule to create stable rods with multiple functions can even be envisioned. We provide a comprehensive overview of the experimental procedures used for designing, creating, and characterizing hybrid fibrous nanorods using the fibritin trimerization motif.

  10. Sequence and Spatiotemporal Expression Analysis of CLE-Motif Containing Genes from the Reniform Nematode (Rotylenchulus reniformis Linford & Oliveira)

    PubMed Central

    Wubben, Martin J.; Gavilano, Lily; Baum, Thomas J.; Davis, Eric L.

    2015-01-01

    The reniform nematode, Rotylenchulus reniformis, is a sedentary semi-endoparasitic species with a host range that encompasses more than 77 plant families. Nematode effector proteins containing plant-ligand motifs similar to CLAVATA3/ESR (CLE) peptides have been identified in the Heterodera, Globodera, and Meloidogyne genera of sedentary endoparasites. Here, we describe the isolation, sequence analysis, and spatiotemporal expression of three R. reniformis genes encoding putative CLE motifs named Rr-cle-1, Rr-cle-2, and Rr-cle-3. The Rr-cle cDNAs showed >98% identity with each other and the predicted peptides were identical with the exception of a short stretch of residues at the carboxy(C)-terminus of the variable domain (VD). Each RrCLE peptide possessed an amino-terminal signal peptide for secretion and a single C-terminal CLE motif that was most similar to Heterodera CLE motifs. Aligning the Rr-cle cDNAs with their corresponding genomic sequences showed three exons with an intron separating the signal peptide from the VD and a second intron separating the VD from the CLE motif. An alignment of the RrCLE1 peptide with Heterodera glycines and Heterodera schachtii CLE proteins revealed a high level of homology within the VD region associated with regulating in planta trafficking of the processed CLE peptide. Quantitative RT-PCR (qRT-PCR) showed similar expression profiles for each Rr-cle transcript across the R. reniformis life-cycle with the greatest transcript abundance being in sedentary parasitic female nematodes. In situ hybridization showed specific Rr-cle expression within the dorsal esophageal gland cell of sedentary parasitic females. PMID:26170479

  11. Sequence and Spatiotemporal Expression Analysis of CLE-Motif Containing Genes from the Reniform Nematode (Rotylenchulus reniformis Linford & Oliveira).

    PubMed

    Wubben, Martin J; Gavilano, Lily; Baum, Thomas J; Davis, Eric L

    2015-06-01

    The reniform nematode, Rotylenchulus reniformis, is a sedentary semi-endoparasitic species with a host range that encompasses more than 77 plant families. Nematode effector proteins containing plant-ligand motifs similar to CLAVATA3/ESR (CLE) peptides have been identified in the Heterodera, Globodera, and Meloidogyne genera of sedentary endoparasites. Here, we describe the isolation, sequence analysis, and spatiotemporal expression of three R. reniformis genes encoding putative CLE motifs named Rr-cle-1, Rr-cle-2, and Rr-cle-3. The Rr-cle cDNAs showed >98% identity with each other and the predicted peptides were identical with the exception of a short stretch of residues at the carboxy(C)-terminus of the variable domain (VD). Each RrCLE peptide possessed an amino-terminal signal peptide for secretion and a single C-terminal CLE motif that was most similar to Heterodera CLE motifs. Aligning the Rr-cle cDNAs with their corresponding genomic sequences showed three exons with an intron separating the signal peptide from the VD and a second intron separating the VD from the CLE motif. An alignment of the RrCLE1 peptide with Heterodera glycines and Heterodera schachtii CLE proteins revealed a high level of homology within the VD region associated with regulating in planta trafficking of the processed CLE peptide. Quantitative RT-PCR (qRT-PCR) showed similar expression profiles for each Rr-cle transcript across the R. reniformis life-cycle with the greatest transcript abundance being in sedentary parasitic female nematodes. In situ hybridization showed specific Rr-cle expression within the dorsal esophageal gland cell of sedentary parasitic females. PMID:26170479

  12. Sequence conservation of an avian centromeric repeated DNA component.

    PubMed

    Madsen, C S; Brooks, J E; de Kloet, E; de Kloet, S R

    1994-06-01

    The approximately 190-bp centromeric repeat monomers of the spur-winged lapwing (Vanellus spinosus, Charadriidae), the Chilean flamingo (Phoenicopterus chilensis, Phoenicopteridae), the sarus crane (Grus antigone, Gruidae), parrots (Psittacidae), waterfowl (Anatidae), and the merlin (Falco columbarius, Falconidae) contain elements that are interspecifically highly variable, as well as elements (trinucleotides and higher order oligonucleotides) that are highly conserved in sequence and relative location within the repeat. Such conservation suggests that the centromeric repeats of these avian species have evolved from a common ancestral sequence that may date from very early stages of avian radiation. PMID:8034177

  13. Specific Prenylation of Tomato Rab Proteins by Geranylgeranyl Type-II Transferase Requires a Conserved Cysteine-Cysteine Motif.

    PubMed Central

    Yalovsky, S.; Loraine, A. E.; Gruissem, W.

    1996-01-01

    Posttranslational isoprenylation of some small GTP-binding proteins is required for their biological activity. Rab geranylgeranyl transferase (Rab GGTase) uses geranylgeranyl pyrophosphate to modify Rab proteins, its only known substrates. Geranylgeranylation of Rabs is believed to promote their association with target membranes and interaction with other proteins. Plants, like other eukaryotes, contain Rab-like proteins that are associated with intracellular membranes. However, to our knowledge, the geranylgeranylation of Rab proteins has not yet been characterized from any plant source. This report presents an activity assay that allows the characterization of prenylation of Rab-like proteins in vitro, by protein extracts prepared from plants. Tomato Rab1 proteins and mammalian Rab1a were modified by geranylgeranyl pyrophosphate but not by farnesyl pyrophosphate. This modification required a conserved cysteine-cysteine motif. A mutant form lacking the cysteine-cysteine motif could not be modified, but inhibited the geranylgeranylation of its wild-type homolog. The tomato Rab proteins were modified in vitro by protein extract prepared from yeast, but failed to become modified when the protein extract was prepared from a yeast strain containing a mutant allele for the α subunit of yeast Rab GGTase (bet4 ts). These results demonstrate that plant cells, like other eukaryotes, contain Rab GGTase-like activity. PMID:12226265

  14. Conserved sequence pattern in a wide variety of phosphoesterases.

    PubMed Central

    Koonin, E. V.

    1994-01-01

    A unique sequence pattern, designated the GD/GNH signature, was shown to be conserved in a wide variety of phosphoesterases. The enzymes containing this signature cleave phosphoester bonds in such different substrates as (1) phosphoserine and phosphothreonine in polypeptides; (2) bis(5'-nucleosidyl)-tetraphosphates; (3) nucleoside 5' phosphates; (4) 2',3'-cyclic nucleotide phosphates; (5) polynucleotides; (6) 2'-5' phosphodiesters in RNA (intron) lariats; (7) sphingomyelin; and (8) various phosphomonoesters. Two conserved acidic amino acid residues and a conserved histidine residue may be directly involved in phosphoester bond cleavage. PMID:8003970

  15. Conserved Sequence Preferences Contribute to Substrate Recognition by the Proteasome*

    PubMed Central

    Yu, Houqing; Singh Gautam, Amit K.; Wilmington, Shameika R.; Wylie, Dennis; Martinez-Fonts, Kirby; Kago, Grace; Warburton, Marie; Chavali, Sreenivas; Inobe, Tomonao; Finkelstein, Ilya J.; Babu, M. Madan

    2016-01-01

    The proteasome has pronounced preferences for the amino acid sequence of its substrates at the site where it initiates degradation. Here, we report that modulating these sequences can tune the steady-state abundance of proteins over 2 orders of magnitude in cells. This is the same dynamic range as seen for inducing ubiquitination through a classic N-end rule degron. The stability and abundance of His3 constructs dictated by the initiation site affect survival of yeast cells and show that variation in proteasomal initiation can affect fitness. The proteasome's sequence preferences are linked directly to the affinity of the initiation sites to their receptor on the proteasome and are conserved between Saccharomyces cerevisiae, Schizosaccharomyces pombe, and human cells. These findings establish that the sequence composition of unstructured initiation sites influences protein abundance in vivo in an evolutionarily conserved manner and can affect phenotype and fitness. PMID:27226608

  16. The Annotation of RNA Motifs

    PubMed Central

    2002-01-01

    The recent deluge of new RNA structures, including complete atomic-resolution views of both subunits of the ribosome, has on the one hand literally overwhelmed our individual abilities to comprehend the diversity of RNA structure, and on the other hand presented us with new opportunities for comprehensive use of RNA sequences for comparative genetic, evolutionary and phylogenetic studies. Two concepts are key to understanding RNA structure: hierarchical organization of global structure and isostericity of local interactions. Global structure changes extremely slowly, as it relies on conserved long-range tertiary interactions. Tertiary RNA–RNA and quaternary RNA–protein interactions are mediated by RNA motifs, defined as recurrent and ordered arrays of non-Watson–Crick base-pairs. A single RNA motif comprises a family of sequences, all of which can fold into the same three-dimensional structure and can mediate the same interaction(s). The chemistry and geometry of base pairing constrain the evolution of motifs in such a way that random mutations that occur within motifs are accepted or rejected insofar as they can mediate a similar ordered array of interactions. The steps involved in the analysis and annotation of RNA motifs in 3D structures are: (a) decomposition of each motif into non-Watson–Crick base-pairs; (b) geometric classification of each basepair; (c) identification of isosteric substitutions for each basepair by comparison to isostericity matrices; (d) alignment of homologous sequences using the isostericity matrices to identify corresponding positions in the crystal structure; (e) acceptance or rejection of the null hypothesis that the motif is conserved. PMID:18629252

  17. Application of PCR amplicon sequencing using a single primer pair in PCR amplification to assess variations in Helicobacter pylori CagA EPIYA tyrosine phosphorylation motifs

    PubMed Central

    2010-01-01

    Background The presence of various EPIYA tyrosine phosphorylation motifs in the CagA protein of Helicobacter pylori has been suggested to contribute to pathogenesis in adults. In this study, a unique PCR assay and sequencing strategy was developed to establish the number and variation of cagA EPIYA motifs. Findings MDA-DNA derived from gastric biopsy specimens from eleven subjects with gastritis was used with M13- and T7-sequence-tagged primers for amplification of the cagA EPIYA motif region. Automated capillary electrophoresis using a high resolution kit and amplicon sequencing confirmed variations in the cagA EPIYA motif region. In nine cases, sequencing revealed the presence of AB, ABC, or ABCC (Western type) cagA EPIYA motif, respectively. In two cases, double cagA EPIYA motifs were detected (ABC/ABCC or ABC/AB), indicating the presence of two H. pylori strains in the same biopsy. Conclusion Automated capillary electrophoresis and amplicon sequencing using a single, M13- and T7-sequence-tagged primer pair in PCR amplification enabled rapid molecular typing of cagA EPIYA motifs. Moreover, the techniques described allowed for rapid detection of mixed H. pylori strains present in the same biopsy specimen. PMID:20181142

  18. Modeling and analysis of MH1 domain of Smads and their interaction with promoter DNA sequence motif.

    PubMed

    Makkar, Pooja; Metpally, Raghu Prasad R; Sangadala, Sreedhara; Reddy, Boojala Vijay B

    2009-04-01

    The Smads are a group of related intracellular proteins critical for transmitting the signals to the nucleus from the transforming growth factor-beta (TGF-beta) superfamily of proteins at the cell surface. The prototypic members of the Smad family, Mad and Sma, were first described in Drosophila and Caenorhabditis elegans, respectively. Related proteins in Xenopus, humans, mice and rats were subsequently identified, and are now known as Smads. Smad protein family members act downstream in the TGF-beta signaling pathway mediating various biological processes, including cell growth, differentiation, matrix production, apoptosis and development. Smads range from about 400-500 amino acids in length and are grouped into the receptor-regulated Smads (R-Smads), the common Smads (Co-Smads) and the inhibitory Smads (I-Smads). There are eight Smads in mammals: Smad1/5/8 (bone morphogenetic protein regulated) and Smad2/3 (TGF-beta/activin regulated) are termed R-Smads, Smad4 is denoted as Co-Smad and Smad6/7 are inhibitory Smads. A typical Smad consists of a conserved N-terminal Mad Homology 1 (MH1) domain and a C-terminal Mad Homology 2 (MH2) domain connected by a proline rich linker. The MH1 domain plays a key role in DNA recognition and also facilitates the binding of Smad4 to the phosphorylated C-terminus of R-Smads to form an activated complex. The MH2 domain exhibits transcriptional activation properties. In order to understand the structural basis of interaction of various Smads with their target proteins and the promoter DNA, we modeled the MH1 domain of the remaining mammalian Smads based on known crystal structures of the Smad3-MH1 domain bound to the GTCT Smad box DNA sequence (1OZJ). We generated a B-DNA structure using average base-pair parameters of Twist, Tilt, Roll and base Slide angles. We then modeled the interaction pose of the MH1 domain of Smad1/5/8 with their corresponding DNA sequence motif GCCG. These models provide the structural basis towards understanding functional

  19. Space-related pharma-motifs for fast search of protein binding motifs and polypharmacological targets

    PubMed Central

    2012-01-01

    Background Discovering a compound that inhibits multiple proteins (i.e. polypharmacological targets) is a new paradigm for treating complex diseases (e.g. cancers and diabetes). In general, polypharmacological proteins often share similar local binding environments and motifs. With the exponential growth in the number of protein structures, finding similar structural binding motifs (pharma-motifs) is an urgent task for drug discovery (e.g. side effects and new uses for old drugs) and for understanding protein functions. Results We have developed a Space-Related Pharmamotifs (called SRPmotif) method to recognize the binding motifs by searching against a protein structure database. SRPmotif is able to recognize conserved binding environments containing spatially discontinuous pharma-motifs, which are often short conserved peptides with specific physico-chemical properties for protein functions. Among 356 pharma-motifs, 56.5% of interacting residues are highly conserved. Experimental results indicate that 81.1% and 92.7% of the polypharmacological targets of each protein-ligand complex are annotated with the same biological process (BP) and molecular function (MF) terms, respectively, based on Gene Ontology (GO). Our experimental results show that the identified pharma-motifs often consist of key residues in functional (active) sites and play key roles in protein functions. The SRPmotif is available at http://gemdock.life.nctu.edu.tw/SRP/. Conclusions SRPmotif is able to identify similar pharma-interfaces and pharma-motifs sharing similar binding environments for polypharmacological targets by rapidly searching against the protein structure database. Pharma-motifs describe the conservation of binding environments for drug discovery and protein functions. Additionally, these pharma-motifs provide clues for discovering new sequence-based motifs to predict protein functions from protein sequence databases. We believe that SRPmotif is useful for elucidating protein functions and drug discovery.

  20. A conserved intronic U1 snRNP-binding sequence promotes trans-splicing in Drosophila

    PubMed Central

    Gao, Jun-Li; Fan, Yu-Jie; Wang, Xiu-Ye; Zhang, Yu; Pu, Jia; Li, Liang; Shao, Wei; Zhan, Shuai; Hao, Jianjiang

    2015-01-01

    Unlike typical cis-splicing, trans-splicing joins exons from two separate transcripts to produce chimeric mRNA and has been detected in most eukaryotes. Trans-splicing in trypanosomes and nematodes has been characterized as a spliced leader RNA-facilitated reaction; in contrast, its mechanism in higher eukaryotes remains unclear. Here we investigate mod(mdg4), a classic trans-spliced gene in Drosophila, and report that two critical RNA sequences in the middle of the last 5′ intron, TSA and TSB, promote trans-splicing of mod(mdg4). In TSA, a 13-nucleotide (nt) core motif is conserved across Drosophila species and is essential and sufficient for trans-splicing, which binds U1 small nuclear RNP (snRNP) through strong base-pairing with U1 snRNA. In TSB, a conserved secondary structure acts as an enhancer. Deletions of TSA and TSB using the CRISPR/Cas9 system result in developmental defects in flies. Although it is not clear how the 5′ intron finds the 3′ introns, compensatory changes in U1 snRNA rescue trans-splicing of TSA mutants, demonstrating that U1 recruitment is critical to promote trans-splicing in vivo. Furthermore, TSA core-like motifs are found in many other trans-spliced Drosophila genes, including lola. These findings represent a novel mechanism of trans-splicing, in which RNA motifs in the 5′ intron are sufficient to bring separate transcripts into close proximity to promote trans-splicing. PMID:25838544

  1. Automated classification of RNA 3D motifs and the RNA 3D Motif Atlas

    PubMed Central

    Petrov, Anton I.; Zirbel, Craig L.; Leontis, Neocles B.

    2013-01-01

    The analysis of atomic-resolution RNA three-dimensional (3D) structures reveals that many internal and hairpin loops are modular, recurrent, and structured by conserved non-Watson–Crick base pairs. Structurally similar loops define RNA 3D motifs that are conserved in homologous RNA molecules, but can also occur at nonhomologous sites in diverse RNAs, and which often vary in sequence. To further our understanding of RNA motif structure and sequence variability and to provide a useful resource for structure modeling and prediction, we present a new method for automated classification of internal and hairpin loop RNA 3D motifs and a new online database called the RNA 3D Motif Atlas. To classify the motif instances, a representative set of internal and hairpin loops is automatically extracted from a nonredundant list of RNA-containing PDB files. Their structures are compared geometrically, all-against-all, using the FR3D program suite. The loops are clustered into motif groups, taking into account geometric similarity and structural annotations and making allowance for a variable number of bulged bases. The automated procedure that we have implemented identifies all hairpin and internal loop motifs previously described in the literature. All motif instances and motif groups are assigned unique and stable identifiers and are made available in the RNA 3D Motif Atlas (http://rna.bgsu.edu/motifs), which is automatically updated every four weeks. The RNA 3D Motif Atlas provides an interactive user interface for exploring motif diversity and tools for programmatic data access. PMID:23970545
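
    The idea of grouping loops after an all-against-all geometric comparison can be shown with a toy sketch. This is not the FR3D procedure or the Atlas clustering method, which also weigh structural annotations and bulged bases; it is plain single-linkage grouping on a pairwise discrepancy table, and the loop names, discrepancy values, and cutoff are all hypothetical.

```python
def cluster_by_threshold(names, discrepancy, cutoff):
    """Single-linkage grouping: two loops end up in the same group when a
    chain of pairwise discrepancies <= cutoff connects them."""
    parent = {n: n for n in names}

    def find(n):
        # Follow parent links to the group representative (path halving).
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if discrepancy[frozenset((a, b))] <= cutoff:
                parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical loops and pairwise geometric discrepancies.
loops = ["loop_A", "loop_B", "loop_C", "loop_D"]
pairs = {
    frozenset(("loop_A", "loop_B")): 0.2,
    frozenset(("loop_A", "loop_C")): 0.9,
    frozenset(("loop_A", "loop_D")): 1.1,
    frozenset(("loop_B", "loop_C")): 0.8,
    frozenset(("loop_B", "loop_D")): 1.0,
    frozenset(("loop_C", "loop_D")): 0.3,
}
print(cluster_by_threshold(loops, pairs, cutoff=0.5))
```

    Single linkage is chosen here only because it is short to write; it can chain dissimilar loops together, which is one reason a production method would use stricter grouping criteria.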

  2. Gibbs motif sampling: detection of bacterial outer membrane protein repeats.

    PubMed Central

    Neuwald, A. F.; Liu, J. S.; Lawrence, C. E.

    1995-01-01

    The detection and alignment of locally conserved regions (motifs) in multiple sequences can provide insight into protein structure, function, and evolution. A new Gibbs sampling algorithm is described that detects motif-encoding regions in sequences and optimally partitions them into distinct motif models; this is illustrated using a set of immunoglobulin fold proteins. When applied to sequences sharing a single motif, the sampler can be used to classify motif regions into related submodels, as is illustrated using helix-turn-helix DNA-binding proteins. Other statistically based procedures are described for searching a database for sequences matching motifs found by the sampler. When applied to a set of 32 very distantly related bacterial integral outer membrane proteins, the sampler revealed that they share a subtle, repetitive motif. Although BLAST (Altschul SF et al., 1990, J Mol Biol 215:403-410) fails to detect significant pairwise similarity between any of the sequences, the repeats present in these outer membrane proteins, taken as a whole, are highly significant (based on a generally applicable statistical test for motifs described here). Analysis of bacterial porins with known trimeric beta-barrel structure and related proteins reveals a similar repetitive motif corresponding to alternating membrane-spanning beta-strands. These beta-strands occur on the membrane interface (as opposed to the trimeric interface) of the beta-barrel. The broad conservation and structural location of these repeats suggests that they play important functional roles. PMID:8520488
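
    For readers unfamiliar with Gibbs sampling for motif detection, the sketch below shows the basic site-sampler idea on DNA: hold out one sequence, build a profile from the current motif positions in the others, and resample the held-out position in proportion to how well each window fits the profile. It is a simplified illustration, not the algorithm of this paper, which additionally partitions sites into several motif models. It assumes exactly one site per sequence, a fixed motif width, and a uniform background; the function names and toy sequences are hypothetical.

```python
import random

ALPHABET = "ACGT"


def profile_from(sites, width, pseudocount=1.0):
    """Column-wise base frequencies of the aligned sites, with pseudocounts."""
    profile = []
    for col in range(width):
        counts = {a: pseudocount for a in ALPHABET}
        for site in sites:
            counts[site[col]] += 1
        total = sum(counts.values())
        profile.append({a: c / total for a, c in counts.items()})
    return profile


def gibbs_site_sampler(seqs, width, iterations=200, seed=0):
    """Return one sampled motif start per sequence."""
    rng = random.Random(seed)
    starts = [rng.randrange(len(s) - width + 1) for s in seqs]
    for _ in range(iterations):
        for i, seq in enumerate(seqs):
            # Profile built from every sequence except the one being resampled.
            others = [s[p:p + width]
                      for j, (s, p) in enumerate(zip(seqs, starts)) if j != i]
            profile = profile_from(others, width)
            # Weight of each window = its probability under the profile.
            # A uniform background would only add a constant factor.
            weights = []
            for p in range(len(seq) - width + 1):
                w = 1.0
                for col, base in enumerate(seq[p:p + width]):
                    w *= profile[col][base]
                weights.append(w)
            starts[i] = rng.choices(range(len(weights)), weights=weights)[0]
    return starts


# Hypothetical toy sequences with the 8-mer TGACGTCA planted in each.
toy = ["ACGTTGACGTCAAC", "TTTGACGTCAGGGA", "GGTGACGTCATTTT"]
print(gibbs_site_sampler(toy, width=8))
```

    Because the sampler is stochastic, the returned positions are one draw rather than a guaranteed optimum; in practice several runs with different seeds would be compared.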

  3. Amino Acids of Conserved Kinase Motifs of Cytomegalovirus Protein UL97 Are Essential for Autophosphorylation

    PubMed Central

    Michel, Detlef; Kramer, Silke; Höhn, Simone; Schaarschmidt, Peter; Wunderlich, Kirsten; Mertens, Thomas

    1999-01-01

    Thirteen point mutations targeting predicted domains conserved in homologous protein kinases were introduced into the UL97 coding region of the human cytomegalovirus. All mutagenized proteins were expressed in cells infected with recombinant vaccinia viruses (rVV). Several mutations drastically reduced ganciclovir (GCV) phosphorylation. Mutations at amino acids G340, A442, L446, and F523 resulted in a complete loss of pUL97 phosphorylation, which was strictly associated with a loss of GCV phosphorylation. Our results confirm that in rVV-infected cells pUL97 phosphorylation is due to autophosphorylation and show that several amino acids conserved within domains of protein kinases are essential for this pUL97 phosphorylation. GCV phosphorylation is dependent on pUL97 phosphorylation. PMID:10482650

  4. Using the Gibbs Motif Sampler for Phylogenetic Footprinting

    SciTech Connect

    Thompson, William; Conlan, Sean; McCue, Lee Ann; Lawrence, Charles

    2007-07-01

    The Gibbs Motif Sampler (Gibbs) (1) is a software package used to predict conserved elements in biopolymer sequences. While the software can be used to locate conserved motifs in protein sequences, its most common use is the prediction of transcription factor binding sites (TFBSs) in promoters upstream of gene sequences. We will describe approaches that use Gibbs to locate TFBSs in a collection of orthologous nucleotide sequences, i.e. phylogenetic footprinting. To illustrate this technique, we present examples that use Gibbs to detect binding sites for the transcription factor LexA in orthologous sequence data from representative species belonging to two different proteobacterial divisions.

  5. Bayesian Markov models consistently outperform PWMs at predicting motifs in nucleotide sequences.

    PubMed

    Siebert, Matthias; Söding, Johannes

    2016-07-27

    Position weight matrices (PWMs) are the standard model for DNA and RNA regulatory motifs. In PWMs nucleotide probabilities are independent of nucleotides at other positions. Models that account for dependencies need many parameters and are prone to overfitting. We have developed a Bayesian approach for motif discovery using Markov models in which conditional probabilities of order k - 1 act as priors for those of order k. This Bayesian Markov model (BaMM) training automatically adapts model complexity to the amount of available data. We also derive an EM algorithm for de-novo discovery of enriched motifs. For transcription factor binding, BaMMs achieve significantly (P = 1/16) higher cross-validated partial AUC than PWMs in 97% of 446 ChIP-seq ENCODE datasets and improve performance by 36% on average. BaMMs also learn complex multipartite motifs, improving predictions of transcription start sites, polyadenylation sites, bacterial pause sites, and RNA binding sites by 26-101%. BaMMs never performed worse than PWMs. These robust improvements argue in favour of generally replacing PWMs by BaMMs. PMID:27288444

  6. Defining a Conformational Consensus Motif in Cotransin-Sensitive Signal Sequences: A Proteomic and Site-Directed Mutagenesis Study

    PubMed Central

    Klein, Wolfgang; Westendorf, Carolin; Schmidt, Antje; Conill-Cortés, Mercè; Rutz, Claudia; Blohs, Marcus; Beyermann, Michael; Protze, Jonas; Krause, Gerd; Krause, Eberhard; Schülein, Ralf

    2015-01-01

    The cyclodepsipeptide cotransin was described to inhibit the biosynthesis of a small subset of proteins by a signal sequence-discriminatory mechanism at the Sec61 protein-conducting channel. However, it was not clear how selective cotransin is, i.e. how many proteins are sensitive. Moreover, a consensus motif in signal sequences mediating cotransin sensitivity has not yet been described. To address these questions, we performed a proteomic study using cotransin-treated human hepatocellular carcinoma cells and the stable isotope labelling by amino acids in cell culture technique in combination with quantitative mass spectrometry. We used a saturating concentration of cotransin (30 micromolar) to also identify less-sensitive proteins and to discriminate the latter from completely resistant proteins. We found that the biosynthesis of almost all secreted proteins was cotransin-sensitive under these conditions. In contrast, biosynthesis of the majority of the integral membrane proteins was cotransin-resistant. Cotransin sensitivity of signal sequences was neither related to their length nor to their hydrophobicity. Instead, in the case of signal anchor sequences, we identified for the first time a conformational consensus motif mediating cotransin sensitivity. PMID:25806945

  7. Variability of the Conserved V3 Loop Tip Motif in HIV-1 Subtype B Isolates Collected from Brazilian and French Patients

    PubMed Central

    Tomasini-Grotto, Rejane-Maria; Montes, Brigitte; Triglia, Denise; Torres-Braconi, Carla; Aliano-Block, Juliana; de A. Zanotto, Paolo M.; de M. C. Pardini, Maria-Inès; Segondy, Michel

    2010-01-01

    The diversity of the V3 loop tip motif sequences of HIV-1 subtype B was analyzed in patients from Botucatu (Brazil) and Montpellier (France). Overall, 37 tetrameric tip motifs were identified, 28 and 17 of them being recognized in Brazilian and French patients, respectively. The GPGR (P) motif was predominant in French but not in Brazilian patients (53.5% vs 31.0%), whereas the GWGR (W) motif was frequent in Brazilian patients (23.0%) and absent in French patients. Three tip motif groups were considered: P, W, and non-P non-W groups. The distribution of HIV-1 isolates into the three groups was significantly different between isolates from Botucatu and from Montpellier (P < 0.001). A higher proportion of CXCR4-using HIV-1 (X4 variants) was observed in the non-P non-W group as compared with the P group (37.5% vs 19.1%), and no X4 variant was identified in the W group (P < 0.001). The higher proportion of X4 variants in the non-P non-W group was essentially observed among the patients from Montpellier, who have been infected with HIV-1 for a longer period of time than those from Botucatu. Among patients from Montpellier, CD4+ cell counts were lower in patients belonging to the non-P non-W group than in those belonging to the P group (24 cells/μL vs 197 cells/μL; P = 0.005). Taken together, the results suggest that variability of the V3 loop tip motif may be related to HIV-1 coreceptor usage and to disease progression. However, as analyzed by a bioinformatic method, the substitution of the V3 loop tip motif of the subtype B consensus sequence with the different tip motifs identified in the present study was not sufficient to induce a change in HIV-1 coreceptor usage. PMID:24031549

  8. Sequence-Specific Recognition of DNA by Proteins: Binding Motifs Discovered Using a Novel Statistical/Computational Analysis.

    PubMed

    Jakubec, David; Laskowski, Roman A; Vondrasek, Jiri

    2016-01-01

    Decades of intensive experimental studies of the recognition of DNA sequences by proteins have provided us with a view of a diverse and complicated world in which few to no features are shared between individual DNA-binding protein families. The originally conceived direct readout of DNA residue sequences by amino acid side chains offers very limited capacity for sequence recognition, while the effects of the dynamic properties of the interacting partners remain difficult to quantify and almost impossible to generalise. In this work we investigated the energetic characteristics of all DNA residue-amino acid side chain combinations in the conformations found at the interaction interface in a very large set of protein-DNA complexes by the means of empirical potential-based calculations. General specificity-defining criteria were derived and utilised to look beyond the binding motifs considered in previous studies. Linking energetic favourability to the observed geometrical preferences, our approach reveals several additional amino acid motifs which can distinguish between individual DNA bases. Our results remained valid in environments with various dielectric properties. PMID:27384774

  9. Sequence-Specific Recognition of DNA by Proteins: Binding Motifs Discovered Using a Novel Statistical/Computational Analysis

    PubMed Central

    Jakubec, David; Laskowski, Roman A.; Vondrasek, Jiri

    2016-01-01

    Decades of intensive experimental studies of the recognition of DNA sequences by proteins have provided us with a view of a diverse and complicated world in which few to no features are shared between individual DNA-binding protein families. The originally conceived direct readout of DNA residue sequences by amino acid side chains offers very limited capacity for sequence recognition, while the effects of the dynamic properties of the interacting partners remain difficult to quantify and almost impossible to generalise. In this work we investigated the energetic characteristics of all DNA residue—amino acid side chain combinations in the conformations found at the interaction interface in a very large set of protein—DNA complexes by the means of empirical potential-based calculations. General specificity-defining criteria were derived and utilised to look beyond the binding motifs considered in previous studies. Linking energetic favourability to the observed geometrical preferences, our approach reveals several additional amino acid motifs which can distinguish between individual DNA bases. Our results remained valid in environments with various dielectric properties. PMID:27384774

  10. Drosophila EYA regulates the immune response against DNA through an evolutionarily conserved threonine phosphatase motif.

    PubMed

    Liu, Xi; Sano, Teruyuki; Guan, Yongsheng; Nagata, Shigekazu; Hoffmann, Jules A; Fukuyama, Hidehiro

    2012-01-01

    Innate immune responses against DNA are essential to counter both pathogen infections and tissue damage. Mammalian EYAs were recently shown to play a role in regulating the innate immune responses against DNA. Here, we demonstrate that the unique Drosophila eya gene is also involved in the response specific to DNA. Haploinsufficiency of eya in mutants deficient for lysosomal DNase activity (DNaseII) reduces antimicrobial peptide gene expression, a hallmark for immune responses in flies. Like the mammalian orthologues, Drosophila EYA features an N-terminal threonine and C-terminal tyrosine phosphatase domain. Through the generation of a series of mutant EYA fly strains, we show that the threonine phosphatase domain, but not the tyrosine phosphatase domain, is responsible for the innate immune response against DNA. A similar role for the threonine phosphatase domain in mammalian EYA4 had been surmised on the basis of in vitro studies. Furthermore, EYA associates with IKKβ and full-length RELISH, and the induction of the IMD pathway-dependent antimicrobial peptide gene is independent of SO. Our data provide the first in vivo demonstration for the immune function of EYA and point to their conserved immune function in response to endogenous DNA, throughout evolution. PMID:22916150

  11. Local Function Conservation in Sequence and Structure Space

    PubMed Central

    Weinhold, Nils; Sander, Oliver; Domingues, Francisco S.; Lengauer, Thomas; Sommer, Ingolf

    2008-01-01

    We assess the variability of protein function in protein sequence and structure space. Various regions in this space exhibit considerable difference in the local conservation of molecular function. We analyze and capture local function conservation by means of logistic curves. Based on this analysis, we propose a method for predicting molecular function of a query protein with known structure but unknown function. The prediction method is rigorously assessed and compared with a previously published function predictor. Furthermore, we apply the method to 500 functionally unannotated PDB structures and discuss selected examples. The proposed approach provides a simple yet consistent statistical model for the complex relations between protein sequence, structure, and function. The GOdot method is available online (http://godot.bioinf.mpi-inf.mpg.de). PMID:18604264

  12. A Glance at Microsatellite Motifs from 454 Sequencing Reads of Watermelon Genomic DNA

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A single 454 (Life Sciences Sequencing Technology) run of Charleston Gray watermelon (Citrullus lanatus var. lanatus) genomic DNA was performed and sequence data were assembled. A large scale identification of simple sequence repeat (SSR) was performed and SSR sequence data were used for the develo...

  13. Epsilon glutathione transferases possess a unique class-conserved subunit interface motif that directly interacts with glutathione in the active site.

    PubMed

    Wongsantichon, Jantana; Robinson, Robert C; Ketterman, Albert J

    2015-01-01

    Epsilon class glutathione transferases (GSTs) have been shown to contribute significantly to insecticide resistance. We report a new Epsilon class protein crystal structure from Drosophila melanogaster for the glutathione transferase DmGSTE6. The structure reveals a novel Epsilon clasp motif that is conserved across hundreds of millions of years of evolution of the insect Diptera order. This histidine-serine motif lies in the subunit interface and appears to contribute to quaternary stability as well as directly connecting the two glutathiones in the active sites of this dimeric enzyme. PMID:26487708

  14. Epsilon glutathione transferases possess a unique class-conserved subunit interface motif that directly interacts with glutathione in the active site

    PubMed Central

    Wongsantichon, Jantana; Robinson, Robert C.; Ketterman, Albert J.

    2015-01-01

    Epsilon class glutathione transferases (GSTs) have been shown to contribute significantly to insecticide resistance. We report a new Epsilon class protein crystal structure from Drosophila melanogaster for the glutathione transferase DmGSTE6. The structure reveals a novel Epsilon clasp motif that is conserved across hundreds of millions of years of evolution of the insect Diptera order. This histidine-serine motif lies in the subunit interface and appears to contribute to quaternary stability as well as directly connecting the two glutathiones in the active sites of this dimeric enzyme. PMID:26487708

  15. A conserved secondary structural motif in 23S rRNA defines the site of interaction of amicetin, a universal inhibitor of peptide bond formation.

    PubMed Central

    Leviev, I G; Rodriguez-Fonseca, C; Phan, H; Garrett, R A; Heilek, G; Noller, H F; Mankin, A S

    1994-01-01

    The binding site and probable site of action have been determined for the universal antibiotic amicetin which inhibits peptide bond formation. Evidence from in vivo mutants, site-directed mutations and chemical footprinting all implicate a highly conserved motif in the secondary structure of the 23S-like rRNA close to the central circle of domain V. We infer that this motif lies at, or close to, the catalytic site in the peptidyl transfer centre. The binding site of amicetin is the first of a group of functionally related hexose-cytosine inhibitors to be localized on the ribosome. PMID:8157007

  16. Automated conserved non-coding sequence (CNS) discovery reveals differences in gene content and promoter evolution among grasses

    PubMed Central

    Turco, Gina; Schnable, James C.; Pedersen, Brent; Freeling, Michael

    2013-01-01

    Conserved non-coding sequences (CNS) are islands of non-coding sequence that, like protein coding exons, show less divergence in sequence between related species than functionless DNA. Several CNSs have been demonstrated experimentally to function as cis-regulatory regions. However, the specific functions of most CNSs remain unknown. Previous searches for CNS in plants have either anchored on exons and only identified nearby sequences or required years of painstaking manual annotation. Here we present an open source tool that can accurately identify CNSs between any two related species with sequenced genomes, including both those immediately adjacent to exons and distal sequences separated by >12 kb of non-coding sequence. We have used this tool to characterize new motifs, associate CNSs with additional functions, and identify previously undetected genes encoding RNA and protein in the genomes of five grass species. We provide a list of 15,363 orthologous CNSs conserved across all grasses tested. We were also able to identify regulatory sequences present in the common ancestor of grasses that have been lost in one or more extant grass lineages. Lists of orthologous gene pairs and associated CNSs are provided for reference inbred lines of arabidopsis, Japonica rice, foxtail millet, sorghum, brachypodium, and maize. PMID:23874343

  17. Properties of Sequence Conservation in Upstream Regulatory and Protein Coding Sequences among Paralogs in Arabidopsis thaliana

    NASA Astrophysics Data System (ADS)

    Richardson, Dale N.; Wiehe, Thomas

    Whole genome duplication (WGD) has catalyzed the formation of new species, genes with novel functions, altered expression patterns, complexified signaling pathways and has provided organisms a level of genetic robustness. We studied the long-term evolution and interrelationships of 5’ upstream regulatory sequences (URSs), protein coding sequences (CDSs) and expression correlations (EC) of duplicated gene pairs in Arabidopsis. Three distinct methods revealed significant evolutionary conservation between paralogous URSs and were highly correlated with microarray-based expression correlation of the respective gene pairs. Positional information on exact matches between sequences unveiled the contribution of micro-chromosomal rearrangements on expression divergence. A three-way rank analysis of URS similarity, CDS divergence and EC uncovered specific gene functional biases. Transcription factor activity was associated with gene pairs exhibiting conserved URSs and divergent CDSs, whereas a broad array of metabolic enzymes was found to be associated with gene pairs showing diverged URSs but conserved CDSs.

  18. Sturgeon conservation genomics: SNP discovery and validation using RAD sequencing.

    PubMed

    Ogden, R; Gharbi, K; Mugue, N; Martinsohn, J; Senn, H; Davey, J W; Pourkazemi, M; McEwing, R; Eland, C; Vidotto, M; Sergeev, A; Congiu, L

    2013-06-01

    Caviar-producing sturgeons belonging to the genus Acipenser are considered to be one of the most endangered species groups in the world. Continued overfishing in spite of increasing legislation, zero catch quotas and extensive aquaculture production have led to the collapse of wild stocks across Europe and Asia. The evolutionary relationships among Adriatic, Russian, Persian and Siberian sturgeons are complex because of past introgression events and remain poorly understood. Conservation management, traceability and enforcement suffer a lack of appropriate DNA markers for the genetic identification of sturgeon at the species, population and individual level. This study employed RAD sequencing to discover and characterize single nucleotide polymorphism (SNP) DNA markers for use in sturgeon conservation in these four tetraploid species over three biological levels, using a single sequencing lane. Four population meta-samples and eight individual samples from one family were barcoded separately before sequencing. Analysis of 14.4 Gb of paired-end RAD data focused on the identification of SNPs in the paired-end contig, with subsequent in silico and empirical validation of candidate markers. Thousands of putatively informative markers were identified including, for the first time, SNPs that show population-wide differentiation between Russian and Persian sturgeons, representing an important advance in our ability to manage these cryptic species. The results highlight the challenges of genotyping-by-sequencing in polyploid taxa, while establishing the potential genetic resources for developing a new range of caviar traceability and enforcement tools. PMID:23473098

  19. Conservation patterns in different functional sequence categories of divergent Drosophila species

    SciTech Connect

    Papatsenko, Dmitri; Kislyuk, Andrey; Levine, Michael; Dubchak, Inna

    2005-10-01

    We have explored the distributions of fully conserved ungapped blocks in genome-wide pairwise alignments of recently completed species of Drosophila: D. yakuba, D. ananassae, D. pseudoobscura, D. virilis and D. mojavensis. Based on these distributions we have found that nearly every functional sequence category possesses its own distinctive conservation pattern, sometimes independent of the overall sequence conservation level. In the coding and regulatory regions, the ungapped blocks were longer than in introns, UTRs and non-functional sequences. At the same time, the blocks in the coding regions carried the 3N+2 signature characteristic of synonymous substitutions in the 3rd codon positions. Larger block sizes in transcription regulatory regions can be explained by the presence of conserved arrays of binding sites for transcription factors. We also have shown that the longest ungapped blocks, or 'ultraconserved' sequences, are associated with specific gene groups, including those encoding ion channels and components of the cytoskeleton. We discussed how restrained conservation patterns may help in mapping functional sequence categories and improving genome annotation.
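
    As an illustration of what a fully conserved ungapped block is, the sketch below extracts maximal runs of identical, gap-free columns from a pairwise alignment. It is a simplified reading of the idea and not the authors' pipeline; the function name, the "-" gap character, and the toy alignment are assumptions.

```python
def conserved_blocks(a, b, min_len=1):
    """Return (start, length) for each maximal run of identical, ungapped columns."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    blocks = []
    start = None
    for i, (x, y) in enumerate(zip(a, b)):
        match = x == y and x != "-"
        if match and start is None:
            start = i  # a new block opens at this column
        elif not match and start is not None:
            if i - start >= min_len:
                blocks.append((start, i - start))
            start = None  # a mismatch or gap closes the block
    # Close a block that runs to the end of the alignment.
    if start is not None and len(a) - start >= min_len:
        blocks.append((start, len(a) - start))
    return blocks

# Hypothetical toy alignment, not data from the study.
seq_a = "ACGT-ACGTTA"
seq_b = "ACGTTACCTTA"
print(conserved_blocks(seq_a, seq_b))             # [(0, 4), (5, 2), (8, 3)]
print(conserved_blocks(seq_a, seq_b, min_len=3))  # [(0, 4), (8, 3)]
```

    A histogram of the returned lengths, computed separately for each annotated sequence category, would give the kind of block-length distribution the abstract compares.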

  20. Conservation patterns in angiosperm rDNA ITS2 sequences.

    PubMed Central

    Hershkovitz, M A; Zimmer, E A

    1996-01-01

    The two internal transcribed spacers (ITS1 and ITS2) of nuclear ribosomal DNA have become commonly exploited sources of informative variation for interspecific-/intergeneric-level phylogenetic analyses among angiosperms and other eukaryotes. We present an alignment in which one-third to one-half of the ITS2 sequence is alignable above the family level in angiosperms and a phenetic analysis showing that ITS2 contains information sufficient to diagnose lineages at several hierarchical levels. Base compositional analysis shows that angiosperm ITS2 is inherently GC-rich, and that the proportion of T is much more variable than that for other bases. We propose a general model of angiosperm ITS2 secondary structure that shows common pairing relationships for most of the conserved sequence tracts. Variations in our secondary structure predictions for sequences from different taxa indicate that compensatory mutation is not limited to paired positions. PMID:8760866

  1. Conservative Patch Algorithm and Mesh Sequencing for PAB3D

    NASA Technical Reports Server (NTRS)

    Pao, S. P.; Abdol-Hamid, K. S.

    2005-01-01

    A mesh-sequencing algorithm and a conservative patched-grid-interface algorithm (hereafter Patch Algorithm) have been incorporated into the PAB3D code, which is a computer program that solves the Navier-Stokes equations for the simulation of subsonic, transonic, or supersonic flows surrounding an aircraft or other complex aerodynamic shapes. These algorithms are efficient, flexible, and have added tremendously to the capabilities of PAB3D. The mesh-sequencing algorithm makes it possible to perform preliminary computations using only a fraction of the grid cells (provided the original cell count is divisible by an integer) along any grid coordinate axis, independently of the other axes. The patch algorithm addresses another critical need in multi-block grid situations where the cell faces of adjacent grid blocks may not coincide, leading to errors in calculating fluxes of conserved physical quantities across interfaces between the blocks. The patch algorithm, based on the Stokes integral formulation of the applicable conservation laws, effectively matches each of the interfacial cells on one side of the block interface to the corresponding fractional cell area pieces on the other side. This approach is comprehensive and unified such that all interface topology is automatically processed without user intervention. This algorithm is implemented in a preprocessing code that creates a cell-by-cell database that will maintain flux conservation at any level of full or reduced grid density as the user may choose by way of the mesh-sequencing algorithm. These two algorithms have enhanced the numerical accuracy of the code, reduced the time and effort for grid preprocessing, and provided users with the flexibility of performing computations at any desired full or reduced grid resolution to suit their specific computational requirements.
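
    The divisibility condition on mesh sequencing can be sketched along a single grid axis as follows. This is only an illustration of the constraint stated in the abstract, not PAB3D code; the function name and the representation of an axis as a list of node coordinates are assumptions.

```python
def coarsen_axis(nodes, factor):
    """Keep every `factor`-th node along one axis, merging `factor` cells into one."""
    cells = len(nodes) - 1  # n + 1 nodes bound n cells
    if factor < 1 or cells % factor != 0:
        raise ValueError("cell count must be divisible by the coarsening factor")
    return nodes[::factor]

# Hypothetical axis with 8 cells (9 nodes), coarsened independently by 2 and by 4.
x_nodes = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
print(coarsen_axis(x_nodes, 2))  # [0.0, 1.0, 2.0, 3.0, 4.0]
print(coarsen_axis(x_nodes, 4))  # [0.0, 2.0, 4.0]
```

    Each axis of a structured block could be coarsened with its own factor, which matches the abstract's statement that reduction along one axis is independent of the others.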

  2. Conserved Sequences at the Origin of Adenovirus DNA Replication

    PubMed Central

    Stillman, Bruce W.; Topp, William C.; Engler, Jeffrey A.

    1982-01-01

    The origin of adenovirus DNA replication lies within an inverted sequence repetition at either end of the linear, double-stranded viral DNA. Initiation of DNA replication is primed by a deoxynucleoside that is covalently linked to a protein, which remains bound to the newly synthesized DNA. We demonstrate that virion-derived DNA-protein complexes from five human adenovirus serological subgroups (A to E) can act as a template for both the initiation and the elongation of DNA replication in vitro, using nuclear extracts from adenovirus type 2 (Ad2)-infected HeLa cells. The heterologous template DNA-protein complexes were not as active as the homologous Ad2 DNA, most probably due to inefficient initiation by Ad2 replication factors. In an attempt to identify common features which may permit this replication, we have also sequenced the inverted terminal repeated DNA from human adenovirus serotypes Ad4 (group E), Ad9 and Ad10 (group D), and Ad31 (group A), and we have compared these to previously determined sequences from Ad2 and Ad5 (group C), Ad7 (group B), and Ad12 and Ad18 (group A) DNA. In all cases, the sequence around the origin of DNA replication can be divided into two structural domains: a proximal A · T-rich region which is partially conserved among these serotypes, and a distal G · C-rich region which is less well conserved. The G · C-rich region contains sequences similar to sequences present in papovavirus replication origins. The two domains may reflect a dual mechanism for initiation of DNA replication: adenovirus-specific protein priming of replication, and subsequent utilization of this primer by host replication factors for completion of DNA synthesis. PMID:7143575

  3. Conserved sequences in the carboxyl terminus of integrase that are essential for human immunodeficiency virus type 1 replication.

    PubMed

    Cannon, P M; Byles, E D; Kingsman, S M; Kingsman, A J

    1996-01-01

    We have previously identified a residue in the carboxyl terminus of human immunodeficiency virus type 1 integrase (HIV-1 IN), W-235, the requirement for which is only revealed in viral assays for integrase function (P. M. Cannon, W. Wilson, E. Byles, S. M. Kingsman, and A. J. Kingsman, J. Virol. 68:4768-4775, 1994). Our further analysis of this region of retroviral IN has now identified several sequence motifs which are conserved in all the retroviruses we examined, apart from human spumaretrovirus. We have made mutations within these motifs in HIV-1 IN and examined their phenotypes when reintroduced into an infectious proviral clone. The deleterious effects of several of these mutations demonstrate the importance of these regions for IN function in vivo. We observed a further discrepancy, at a motif that is only conserved in the lentiviruses, in the ability of mutants to function in in vitro and in vivo assays. Substitutions both in this region and at W-235 abolish HIV-1 infectivity but do not affect particle production, morphology, reverse transcription, or nuclear import in T-cell lines. Taken together with the in vitro data suggesting that neither of these residues is directly involved in the catalytic reactions of IN, it seems likely that we have identified regions of IN that are essential for interactions with other components of the integration machinery. PMID:8523588

  4. Nucleotide sequence and organization of the human S-protein gene: repeating peptide motifs in the pexin family and a model for their evolution

    SciTech Connect

    Jenne, D.; Stanley, K.K.

    1987-10-20

    The S-protein/vitronectin gene was isolated from a human genomic DNA library, and its sequence of about 5.3 kilobases including the adjacent 5' and 3' flanking regions was established. Alignment of the genomic DNA nucleotide sequence and the cDNA sequence indicated that the gene consisted of eight exons and seven introns. The intron positions in the S-protein gene and their phase type were compared to those in the hemopexin gene which shares amino acid sequence homologies with transin and the S-protein. Three introns have been found at equivalent positions; two other introns are very close to these positions and are interpreted as cases of intron sliding. Introns 3-7 occur at a conserved glycine residue within repeating peptide segments, whereas introns 1 and 2 are at the boundaries of the Somatomedin B domain of S-protein. The analysis of the exon structure in relation to repeating peptide motifs within the S-protein strongly suggests that it contains only seven repeats, one less than the hemopexin molecule. A repeat pattern very similar to that in hemopexin is shown to be present also in two other related proteins, transin and interstitial collagenase. An evolutionary model for the generation of the repeat pattern in the S-protein and the other members of this novel pexin gene family is proposed, and the sequence modifications for some of the repeats during divergent evolution are discussed in relation to known unique functional properties of hemopexin and S-protein.

  5. Fox-2 Splicing Factor Binds to a Conserved Intron Motif to Promote Inclusion of Protein 4.1R Alternative Exon 16

    SciTech Connect

    Ponthier, Julie L.; Schluepen, Christina; Chen, Weiguo; Lersch, Robert A.; Gee, Sherry L.; Hou, Victor C.; Lo, Annie J.; Short, Sarah A.; Chasis, Joel A.; Winkelmann, John C.; Conboy, John G.

    2006-03-01

    Activation of protein 4.1R exon 16 (E16) inclusion during erythropoiesis represents a physiologically important splicing switch that increases 4.1R affinity for spectrin and actin. Previous studies showed that negative regulation of E16 splicing is mediated by the binding of hnRNP A/B proteins to silencer elements in the exon and that downregulation of hnRNP A/B proteins in erythroblasts leads to activation of E16 inclusion. This paper demonstrates that positive regulation of E16 splicing can be mediated by Fox-2 or Fox-1, two closely related splicing factors that possess identical RNA recognition motifs. SELEX experiments with human Fox-1 revealed highly selective binding to the hexamer UGCAUG. Both Fox-1 and Fox-2 were able to bind the conserved UGCAUG elements in the proximal intron downstream of E16, and both could activate E16 splicing in HeLa cell co-transfection assays in a UGCAUG-dependent manner. Conversely, knockdown of Fox-2 expression, achieved with two different siRNA sequences, resulted in decreased E16 splicing. Moreover, immunoblot experiments demonstrate that mouse erythroblasts express Fox-2, but not Fox-1. These findings suggest that Fox-2 is a physiological activator of E16 splicing in differentiating erythroid cells in vivo. Recent experiments show that UGCAUG is present in the proximal intron sequence of many tissue-specific alternative exons, and we propose that the Fox family of splicing enhancers plays an important role in alternative splicing switches during differentiation in metazoan organisms.
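
    Locating UGCAUG elements in an intron sequence is a plain exact-match search, sketched below. The function name and the example sequence are hypothetical; the only detail taken from the abstract is the hexamer itself. DNA input is converted to RNA letters as a convenience.

```python
def find_motif(seq, motif="UGCAUG"):
    """Return 0-based start positions of every (possibly overlapping) motif match."""
    rna = seq.upper().replace("T", "U")  # accept DNA or RNA input
    positions = []
    i = rna.find(motif)
    while i != -1:
        positions.append(i)
        i = rna.find(motif, i + 1)  # step by one so overlapping hits are kept
    return positions

# Hypothetical intron fragment, not a real 4.1R sequence.
print(find_motif("AAUGCAUGCCUGCAUGA"))  # [2, 10]
print(find_motif("tgcatg"))             # [0]
```

    Counting hits within a fixed window downstream of an exon would be one simple way to ask whether the element sits in the proximal intron, though the window size would itself be an assumption.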

  6. In Vivo Enhancer Analysis Chromosome 16 Conserved Noncoding Sequences

    SciTech Connect

    Pennacchio, Len A.; Ahituv, Nadav; Moses, Alan M.; Nobrega, Marcelo; Prabhakar, Shyam; Shoukry, Malak; Minovitsky, Simon; Visel, Axel; Dubchak, Inna; Holt, Amy; Lewis, Keith D.; Plajzer-Frick, Ingrid; Akiyama, Jennifer; De Val, Sarah; Afzal, Veena; Black, Brian L.; Couronne, Olivier; Eisen, Michael B.; Rubin, Edward M.

    2006-02-01

    The identification of enhancers with predicted specificities in vertebrate genomes remains a significant challenge that is hampered by a lack of experimentally validated training sets. In this study, we leveraged extreme evolutionary sequence conservation as a filter to identify putative gene regulatory elements and characterized the in vivo enhancer activity of human-fish conserved and ultraconserved noncoding elements on human chromosome 16 as well as such elements from elsewhere in the genome. We initially tested 165 of these extremely conserved sequences in a transgenic mouse enhancer assay and observed that 48 percent (79/165) functioned reproducibly as tissue-specific enhancers of gene expression at embryonic day 11.5. While driving expression in a broad range of anatomical structures in the embryo, the majority of the 79 enhancers drove expression in various regions of the developing nervous system. Studying a set of DNA elements that specifically drove forebrain expression, we identified DNA signatures specifically enriched in these elements and used these parameters to rank all ~3,400 human-fugu conserved noncoding elements in the human genome. The testing of the top predictions in transgenic mice resulted in a three-fold enrichment for sequences with forebrain enhancer activity. These data dramatically expand the catalogue of in vivo-characterized human gene enhancers and illustrate the future utility of such training sets for a variety of biological applications including decoding the regulatory vocabulary of the human genome.

  7. Conserved Ser/Arg-rich Motif in PPZ Orthologs from Fungi Is Important for Its Role in Cation Tolerance

    PubMed Central

    Minhas, Anupriya; Sharma, Anupam; Kaur, Harsimran; Rawal, Yashpal; Ganesan, Kaliannan; Mondal, Alok K.

    2012-01-01

    PPZ1 orthologs, novel members of a phosphoprotein phosphatase family, are found only in fungi. They regulate diverse physiological processes in fungi, e.g., ion homeostasis, cell size, and cell integrity. Although they are an important determinant of salt tolerance in fungi, their physiological role has remained unexplored in any halotolerant species. In this context we report here the molecular and functional characterization of DhPPZ1 from Debaryomyces hansenii, which is one of the most halotolerant and osmotolerant species of yeast. Our results showed that the DhPPZ1 knock-out strain displayed higher tolerance to toxic cations, and unlike in Saccharomyces cerevisiae, the Na+/H+ antiporter appeared to have an important role in this process. Besides salt tolerance, DhPPZ1 also had a role in cell wall integrity and growth in D. hansenii. We have also identified a short, serine-arginine-rich sequence motif in DhPpz1p that is essential for its role in salt tolerance but not in other physiological processes. Taken together, these results underscore a distinct role of DhPpz1p in D. hansenii and illustrate an example of how organisms utilize the same molecular tool box differently to garner adaptive fitness for their respective ecological niches. PMID:22232558

  8. Assessment of the potential contribution of the highly conserved C-terminal motif (C10) of Borrelia burgdorferi outer surface protein C in transmission and infectivity.

    PubMed

    Earnhart, Christopher G; Rhodes, DeLacy V L; Smith, Alexis A; Yang, Xiuli; Tegels, Brittney; Carlyon, Jason A; Pal, Utpal; Marconi, Richard T

    2014-03-01

    OspC is produced by all species of the Borrelia burgdorferi sensu lato complex and is required for infectivity in mammals. To test the hypothesis that the conserved C-terminal motif (C10) of OspC is required for function in vivo, a mutant B. burgdorferi strain (B31::ospCΔC10) was created in which ospC was replaced with an ospC gene lacking the C10 motif. The ability of the mutant to infect mice was investigated using tick transmission and needle inoculation. Infectivity was assessed by cultivation, qRT-PCR, and measurement of IgG antibody responses. B31::ospCΔC10 retained the ability to infect mice by both needle and tick challenge and was competent to survive in ticks after exposure to the blood meal. To determine whether recombinant OspC protein lacking the C-terminal 10 amino acid residues (rOspCΔC10) can bind plasminogen, the only known mammalian-derived ligand for OspC, binding analyses were performed. Deletion of the C10 motif resulted in a statistically significant decrease in plasminogen binding. Although deletion of the C10 motif influenced plasminogen binding, it can be concluded that the C10 motif is not required for OspC to carry out its critical in vivo functions in tick to mouse transmission. PMID:24376161

  9. The Ku-binding motif is a conserved module for recruitment and stimulation of non-homologous end-joining proteins

    PubMed Central

    Grundy, Gabrielle J.; Rulten, Stuart L.; Arribas-Bosacoma, Raquel; Davidson, Kathryn; Kozik, Zuzanna; Oliver, Antony W.; Pearl, Laurence H.; Caldecott, Keith W.

    2016-01-01

    The Ku-binding motif (KBM) is a short peptide module first identified in APLF that we now show is also present in Werner syndrome protein (WRN) and in Modulator of retrovirus infection homologue (MRI). We also identify a related but functionally distinct motif in XLF, WRN, MRI and PAXX, which we denote the XLF-like motif. We show that WRN possesses two KBMs; one at the N terminus next to the exonuclease domain and one at the C terminus next to an XLF-like motif. We reveal that the WRN C-terminal KBM and XLF-like motif function cooperatively to bind Ku complexes and that the N-terminal KBM mediates Ku-dependent stimulation of WRN exonuclease activity. We also show that WRN accelerates DSB repair by a mechanism requiring both KBMs, demonstrating the importance of WRN interaction with Ku. These data define a conserved family of KBMs that function as molecular tethers to recruit and/or stimulate enzymes during NHEJ. PMID:27063109

  10. Co-conservation of rRNA tetraloop sequences and helix length suggests involvement of the tetraloops in higher-order interactions

    NASA Technical Reports Server (NTRS)

    Hedenstierna, K. O.; Siefert, J. L.; Fox, G. E.; Murgola, E. J.

    2000-01-01

    Terminal loops containing four nucleotides (tetraloops) are common in structural RNAs, and they frequently conform to one of three sequence motifs, GNRA, UNCG, or CUUG. Here we compare available sequences and secondary structures for rRNAs from bacteria, and we show that helices capped by phylogenetically conserved GNRA loops display a strong tendency to be of conserved length. The simplest interpretation of this correlation is that the conserved GNRA loops are involved in higher-order interactions, intramolecular or intermolecular, resulting in a selective pressure for maintaining the lengths of these helices. A small number of conserved UNCG loops were also found to be associated with conserved length helices, consistent with the possibility that this type of tetraloop also takes part in higher-order interactions.
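
    The three tetraloop families named above are written in IUPAC shorthand (N is any nucleotide, R is a purine, A or G). A small Python sketch of how a four-nucleotide loop could be assigned to a family is given below; the function name `classify_tetraloop` is an assumption of this example, and the check looks at loop sequence only, not at helix length or structure.

```python
import re

# IUPAC shorthand expanded by hand: N = [ACGU], R = [AG].
TETRALOOP_FAMILIES = {
    "GNRA": re.compile(r"G[ACGU][AG]A"),
    "UNCG": re.compile(r"U[ACGU]CG"),
    "CUUG": re.compile(r"CUUG"),
}


def classify_tetraloop(loop: str) -> str:
    """Return the family name of a 4-nt loop, or 'other' if none matches."""
    loop = loop.upper().replace("T", "U")
    if len(loop) != 4:
        raise ValueError("a tetraloop has exactly four nucleotides")
    for name, pattern in TETRALOOP_FAMILIES.items():
        if pattern.fullmatch(loop):
            return name
    return "other"


for loop in ("GAAA", "GCGA", "UUCG", "CUUG", "GACA"):
    print(loop, classify_tetraloop(loop))
```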

  11. Rice MEL2, the RNA recognition motif (RRM) protein, binds in vitro to meiosis-expressed genes containing U-rich RNA consensus sequences in the 3'-UTR.

    PubMed

    Miyazaki, Saori; Sato, Yutaka; Asano, Tomoya; Nagamura, Yoshiaki; Nonomura, Ken-Ichi

    2015-10-01

    Post-transcriptional gene regulation by RNA recognition motif (RRM) proteins through binding to cis-elements in the 3'-untranslated region (3'-UTR) is widely used in eukaryotes to complete various biological processes. Rice MEIOSIS ARRESTED AT LEPTOTENE2 (MEL2) is an RRM protein that functions in the properly timed transition to meiosis. The MEL2 RRM preferentially associated with the U-rich RNA consensus, UUAGUU[U/A][U/G][A/U/G]U, in a sequence-dependent manner and in proportion to MEL2 protein amounts in vitro. The consensus sequences were located in the putative looped structures of the RNA ligand. A genome-wide survey revealed a tendency for the MEL2-binding consensus to appear in the 3'-UTR of rice genes. Of 249 genes that conserved the consensus in their 3'-UTR, 13 genes were spatiotemporally co-expressed with MEL2 in meiotic flowers, and included several genes with a presumed function in meiosis, such as Replication protein A and OsMADS3. The proteome analysis revealed that the amounts of small ubiquitin-related modifier-like protein and eukaryotic translation initiation factor3-like protein were dramatically altered in mel2 mutant anthers. Taken together with transcriptome and gene ontology results, we propose that the rice MEL2 is involved in the translational regulation of key meiotic genes on 3'-UTRs to achieve the faithful transition of germ cells to meiosis. PMID:26319516
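
    The bracketed consensus UUAGUU[U/A][U/G][A/U/G]U is compact notation for a small family of 10-nt sequences. The Python sketch below expands it into its concrete variants and also turns it into a regular expression for scanning a sequence. The helper names and the 3'-UTR fragment are hypothetical and serve only to show the notation at work.

```python
import itertools
import re

CONSENSUS = "UUAGUU[U/A][U/G][A/U/G]U"


def expand_consensus(consensus: str) -> list[str]:
    """List every concrete sequence described by the bracketed consensus."""
    # Split into fixed bases and bracketed alternatives such as "[U/A]".
    tokens = re.findall(r"\[[^\]]+\]|[ACGU]", consensus)
    choices = [t[1:-1].split("/") if t.startswith("[") else [t] for t in tokens]
    return ["".join(combo) for combo in itertools.product(*choices)]


def consensus_regex(consensus: str) -> re.Pattern:
    """Dropping the slashes turns "[U/A]" into the regex class "[UA]"."""
    return re.compile(consensus.replace("/", ""))


variants = expand_consensus(CONSENSUS)
print(len(variants))  # 2 * 2 * 3 alternatives

# Hypothetical 3'-UTR fragment, invented for illustration.
utr = "CCAUUAGUUAGGUACC"
match = consensus_regex(CONSENSUS).search(utr)
print(match.start(), match.group())
```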

  12. The BEN domain is a novel sequence-specific DNA-binding domain conserved in neural transcriptional repressors

    PubMed Central

    Dai, Qi; Ren, Aiming; Westholm, Jakub O.; Serganov, Artem A.; Patel, Dinshaw J.; Lai, Eric C.

    2013-01-01

    We recently reported that Drosophila Insensitive (Insv) promotes sensory organ development and has activity as a nuclear corepressor for the Notch transcription factor Suppressor of Hairless [Su(H)]. Insv lacks domains of known biochemical function but contains a single BEN domain (i.e., a “BEN-solo” protein). Our chromatin immunoprecipitation (ChIP) sequencing (ChIP-seq) analysis confirmed binding of Insensitive to Su(H) target genes in the Enhancer of split gene complex [E(spl)-C]; however, de novo motif analysis revealed a novel site strongly enriched in Insv peaks (TCYAATHRGAA). We validate binding of endogenous Insv to genomic regions bearing such sites, whose associated genes are enriched for neural functions and are functionally repressed by Insv. Unexpectedly, we found that the Insv BEN domain binds specifically to this sequence motif and that Insv directly regulates transcription via this motif. We determined the crystal structure of the BEN–DNA target complex, revealing homodimeric binding of the BEN domain and extensive nucleotide contacts via α helices and a C-terminal loop. Point mutations in key DNA-contacting residues severely impair DNA binding in vitro and capacity for transcriptional regulation in vivo. We further demonstrate DNA-binding and repression activities by the mammalian neural BEN-solo protein BEND5. Altogether, we define novel DNA-binding activity in a conserved family of transcriptional repressors, opening a molecular window on this extensive gene family. PMID:23468431
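
    The site TCYAATHRGAA uses IUPAC degenerate codes (Y = C or T, H = A, C or T, R = A or G). As a minimal sketch, assuming a plain text scan rather than the motif-discovery and ChIP-seq tools used in the study, the following Python code converts such a motif to a regular expression and searches both DNA strands. All function names and test sequences are made up for this example.

```python
import re

# Standard IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")
MOTIF = "TCYAATHRGAA"


def revcomp(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def scan_both_strands(seq: str, motif: str = MOTIF) -> list[tuple[int, str]]:
    """Return (forward-strand start, strand) for each motif occurrence."""
    # A lookahead lets overlapping occurrences be reported as well.
    pattern = re.compile("(?=" + "".join(IUPAC[b] for b in motif.upper()) + ")")
    seq = seq.upper()
    hits = [(m.start(), "+") for m in pattern.finditer(seq)]
    for m in pattern.finditer(revcomp(seq)):
        # Convert the minus-strand coordinate back to the forward strand.
        hits.append((len(seq) - m.start() - len(motif), "-"))
    return sorted(hits)


# Invented test sequence containing one forward-strand site.
print(scan_both_strands("GGTCCAATTGGAACC"))
```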

  13. Rapid characterization of CRISPR-Cas9 protospacer adjacent motif sequence elements.

    PubMed

    Karvelis, Tautvydas; Gasiunas, Giedrius; Young, Joshua; Bigelyte, Greta; Silanskas, Arunas; Cigan, Mark; Siksnys, Virginijus

    2015-01-01

    To expand the repertoire of Cas9s available for genome targeting, we present a new in vitro method for the simultaneous examination of guide RNA and protospacer adjacent motif (PAM) requirements. The method relies on the in vitro cleavage of plasmid libraries containing a randomized PAM as a function of Cas9-guide RNA complex concentration. Using this method, we accurately reproduce the canonical PAM preferences for Streptococcus pyogenes, Streptococcus thermophilus CRISPR3 (Sth3), and CRISPR1 (Sth1). Additionally, PAM and sgRNA solutions for a novel Cas9 protein from Brevibacillus laterosporus are provided by the assay and are demonstrated to support functional activity in vitro and in plants. PMID:26585795

  14. A comparative genomics strategy for targeted discovery of single-nucleotide polymorphisms and conserved-noncoding sequences in orphan crops.

    PubMed

    Feltus, F A; Singh, H P; Lohithaswa, H C; Schulze, S R; Silva, T D; Paterson, A H

    2006-04-01

    Completed genome sequences provide templates for the design of genome analysis tools in orphan species lacking sequence information. To demonstrate this principle, we designed 384 PCR primer pairs to conserved exonic regions flanking introns, using Sorghum/Pennisetum expressed sequence tag alignments to the Oryza genome. Conserved-intron scanning primers (CISPs) amplified single-copy loci at 37% to 80% success rates in taxa that sample much of the approximately 50-million years of Poaceae divergence. While the conserved nature of exons fostered cross-taxon amplification, the lesser evolutionary constraints on introns enhanced single-nucleotide polymorphism detection. For example, in eight rice (Oryza sativa) genotypes, polymorphism averaged 12.1 per kb in introns but only 3.6 per kb in exons. Curiously, among 124 CISPs evaluated across Oryza, Sorghum, Pennisetum, Cynodon, Eragrostis, Zea, Triticum, and Hordeum, 23 (18.5%) seemed to be subject to rigid intron size constraints that were independent of per-nucleotide DNA sequence variation. Furthermore, we identified 487 conserved-noncoding sequence motifs in 129 CISP loci. A large CISP set (6,062 primer pairs, amplifying introns from 1,676 genes) designed using an automated pipeline showed generally higher abundance in recombinogenic than in nonrecombinogenic regions of the rice genome, thus providing relatively even distribution along genetic maps. CISPs are an effective means to explore poorly characterized genomes for both DNA polymorphism and noncoding sequence conservation on a genome-wide or candidate gene basis, and also provide anchor points for comparative genomics across a diverse range of species. PMID:16607031

  15. An apoptosis-inhibiting gene from a nuclear polyhedrosis virus encoding a polypeptide with Cys/His sequence motifs.

    PubMed Central

    Birnbaum, M J; Clem, R J; Miller, L K

    1994-01-01

    Two different baculovirus genes are known to be able to block apoptosis triggered upon infection of Spodoptera frugiperda cells with p35 mutants of the insect baculovirus Autographa californica nuclear polyhedrosis virus (AcMNPV): p35 (P35-encoding gene) of AcMNPV (R. J. Clem, M. Fechheimer, and L. K. Miller, Science 254:1388-1390, 1991) and iap (inhibitor of apoptosis gene) of Cydia pomonella granulosis virus (CpGV) (N. E. Crook, R. J. Clem, and L. K. Miller, J. Virol. 67:2168-2174, 1993). Using a genetic complementation assay to identify additional genes which inhibit apoptosis during infection with a p35 mutant, we have isolated a gene from Orgyia pseudotsugata NPV (OpMNPV) that was able to functionally substitute for AcMNPV p35. The nucleotide sequence of this gene, Op-iap, predicted a 30-kDa polypeptide product with approximately 58% amino acid sequence identity to the product of CpGV iap, Cp-IAP. Like Cp-IAP, the predicted product of Op-iap has a carboxy-terminal C3HC4 zinc finger-like motif. In addition, a pair of additional cysteine/histidine motifs were found in the N-terminal regions of both polypeptide sequences. Recombinant p35 mutant viruses carrying either Op-iap or Cp-iap appeared to have a normal phenotype in S. frugiperda cells. Thus, Cp-IAP and Op-IAP appear to be functionally analogous to P35 but are likely to block apoptosis by a different mechanism which may involve direct interaction with DNA. PMID:8139034

  16. An apoptosis-inhibiting gene from a nuclear polyhedrosis virus encoding a polypeptide with Cys/His sequence motifs.

    PubMed

    Birnbaum, M J; Clem, R J; Miller, L K

    1994-04-01

    Two different baculovirus genes are known to be able to block apoptosis triggered upon infection of Spodoptera frugiperda cells with p35 mutants of the insect baculovirus Autographa californica nuclear polyhedrosis virus (AcMNPV): p35 (P35-encoding gene) of AcMNPV (R. J. Clem, M. Fechheimer, and L. K. Miller, Science 254:1388-1390, 1991) and iap (inhibitor of apoptosis gene) of Cydia pomonella granulosis virus (CpGV) (N. E. Crook, R. J. Clem, and L. K. Miller, J. Virol. 67:2168-2174, 1993). Using a genetic complementation assay to identify additional genes which inhibit apoptosis during infection with a p35 mutant, we have isolated a gene from Orgyia pseudotsugata NPV (OpMNPV) that was able to functionally substitute for AcMNPV p35. The nucleotide sequence of this gene, Op-iap, predicted a 30-kDa polypeptide product with approximately 58% amino acid sequence identity to the product of CpGV iap, Cp-IAP. Like Cp-IAP, the predicted product of Op-iap has a carboxy-terminal C3HC4 zinc finger-like motif. In addition, a pair of additional cysteine/histidine motifs were found in the N-terminal regions of both polypeptide sequences. Recombinant p35 mutant viruses carrying either Op-iap or Cp-iap appeared to have a normal phenotype in S. frugiperda cells. Thus, Cp-IAP and Op-IAP appear to be functionally analogous to P35 but are likely to block apoptosis by a different mechanism which may involve direct interaction with DNA. PMID:8139034

  17. Juvenile hormone regulates Aedes aegypti Krüppel homolog 1 through a conserved E box motif.

    PubMed

    Cui, Yingjun; Sui, Yipeng; Xu, Jingjing; Zhu, Fang; Palli, Subba Reddy

    2014-09-01

    Juvenile hormone (JH) plays important roles in the regulation of many physiological processes, including development, reproduction and metabolism, in insects. However, the molecular mechanisms of the JH signaling pathway are not completely understood. To elucidate the molecular mechanisms of JH regulation of the Krüppel homolog 1 gene (Kr-h1) in Aedes aegypti, we employed JH-sensitive Aag-2 cells developed from the embryos of this insect. In Aag-2 cells, the AaKr-h1 gene is induced by nanomolar concentrations of JH III; its expression peaked at 1.5 h after treatment with JH III. RNAi studies showed that JH induction of this gene requires the presence of Ae. aegypti methoprene-tolerant (AaMet). A conserved 13-nucleotide JH response element (JHRE, TGCCTCCACGTGC) containing a canonical E box motif (CACGTG), identified in the promoter of AaKr-h1, is required for JH induction of this gene. Critical nucleotides in the JHRE required for JH action were identified by employing mutagenesis and reporter assays. Reporter assays also showed that the basic helix loop helix (bHLH) domain of AaMet is required for JH induction of AaKr-h1. The 5' rapid amplification of cDNA ends method identified two isoforms of AaKr-h1, AaKr-h1α and AaKr-h1β; the expression of both isoforms is induced by JH III, but AaKr-h1α is the predominant isoform in both Aag-2 cells and Ae. aegypti larvae. PMID:24931431
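
    To make the idea of a mutagenesis scan concrete, the Python sketch below enumerates every single-nucleotide substitution of the 13-nt JHRE and flags those that destroy the canonical E box CACGTG. This is a sequence-level exercise only: it does not reproduce the reporter assays and makes no claim about which nucleotides the authors found to be critical. The helper name is hypothetical.

```python
JHRE = "TGCCTCCACGTGC"   # 13-nt element quoted in the abstract
E_BOX = "CACGTG"         # canonical E box (an instance of CANNTG)


def single_nucleotide_mutants(seq: str):
    """Yield (position, new_base, mutant_sequence) for every substitution."""
    for i, ref in enumerate(seq):
        for alt in "ACGT":
            if alt != ref:
                yield i, alt, seq[:i] + alt + seq[i + 1:]


mutants = list(single_nucleotide_mutants(JHRE))

# Substitutions after which the exact E box string is no longer present.
disrupting = [(i, alt) for i, alt, mutant in mutants if E_BOX not in mutant]

print("E box starts at index", JHRE.find(E_BOX))
print(len(mutants), "mutants,", len(disrupting), "without an intact E box")
```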

  18. In vitro enzymatic activity of human immunodeficiency virus type 1 reverse transcriptase mutants in the highly conserved YMDD amino acid motif correlates with the infectious potential of the proviral genome.

    PubMed Central

    Wakefield, J K; Jablonski, S A; Morrow, C D

    1992-01-01

    Reverse transcriptases contain a highly conserved YXDD amino acid motif believed to be important in enzyme function. The second amino acid is not strictly conserved, with a methionine, valine or alanine occupying the second position in reverse transcriptases from various retroviruses and retroelements. Recently, a 3.5-A (0.35-nm) resolution electron density map of human immunodeficiency virus type 1 (HIV-1) reverse transcriptase positioned the YMDD motif within an antiparallel beta-hairpin structure which forms a portion of its catalytic site. To further explore the role of methionine of the conserved YMDD motif in HIV-1 reverse transcriptase function, we have substituted methionine with a valine, alanine, serine, glycine, or proline, reflecting in some cases sequence motifs of other related reverse transcriptases. Wild-type and mutant enzymes were expressed in Escherichia coli, partially purified by phosphocellulose chromatography, and assayed for the capacity to polymerize TTP by using a homopolymeric template [poly(rA)] with either a DNA [oligo(dT)] or an RNA [oligo(U)] primer. With a poly(rA).oligo(dT) template-primer, reverse transcriptases with the methionine replaced by valine (YVDD), serine (YSDD), or alanine (YADD) were 70 to 100% as active as the wild type, while those with the glycine substitution (YGDD) were approximately 5 to 10% as active. A proline substitution (YPDD) completely inactivated the enzyme. With a poly(rA).oligo(U) template-primer, only the activity of mutants with YVDD was similar to that of the wild type, while mutants with YADD and YSDD were approximately 5 to 10% as active as the wild-type enzyme. The reverse transcriptases with the YGDD and YPDD mutations demonstrated no activity above background. 
    Proviruses containing the reverse transcriptase with the valine mutation (YVDD) produced viruses with infectivities similar to that of the wild type, as determined by measurement of p24 antigen in culture supernatants and visual inspection
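
    The YXDD pattern, with a variable second residue, maps naturally onto a regular expression with one capture group. The following Python sketch locates the motif in a peptide and builds the five substitutions named in the abstract. The peptide fragment and the helper names are invented for illustration, and the code says nothing about enzymatic activity.

```python
import re

# Y, any one of the 20 standard amino acids, then DD.
YXDD = re.compile(r"Y([ACDEFGHIKLMNPQRSTVWY])DD")


def yxdd_sites(protein: str) -> list[tuple[int, str]]:
    """Return (start index, matched tetrapeptide) for each YXDD occurrence."""
    return [(m.start(), m.group(0)) for m in YXDD.finditer(protein.upper())]


def substitute_x(protein: str, site_start: int, new_residue: str) -> str:
    """Replace the variable second residue of the motif at site_start."""
    i = site_start + 1
    return protein[:i] + new_residue + protein[i + 1:]


# Hypothetical peptide fragment carrying a YMDD motif.
fragment = "AAYMDDLL"
start, motif = yxdd_sites(fragment)[0]
variants = [substitute_x(fragment, start, aa)[start:start + 4] for aa in "VASGP"]
print(motif, variants)
```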

  19. Core sequence in the RNA motif recognized by the ErmE methyltransferase revealed by relaxing the fidelity of the enzyme for its target.

    PubMed Central

    Hansen, L H; Vester, B; Douthwaite, S

    1999-01-01

    Under physiological conditions, the ErmE methyltransferase specifically modifies a single adenosine within ribosomal RNA (rRNA), and thereby confers resistance to multiple antibiotics. The adenosine (A2058 in Escherichia coli 23S rRNA) lies within a highly conserved structure, and is methylated efficiently, and with equally high fidelity, in rRNAs from phylogenetically diverse bacteria. However, the fidelity of ErmE is reduced when magnesium is removed, and over twenty new sites of ErmE methylation appear in E. coli 16S and 23S rRNAs. These sites show widely different degrees of reactivity to ErmE. The canonical A2058 site is largely unaffected by magnesium depletion and remains the most reactive site in the rRNA. This suggests that methylation at the new sites results from changes in the RNA substrate rather than the methyltransferase. Chemical probing confirms that the rRNA structure opens upon magnesium depletion, exposing potential new interaction sites to the enzyme. The new ErmE sites show homology with the canonical A2058 site, and have the consensus sequence aNNNcgGAHAg (ErmE methylation occurs exclusively at adenosines (underlined); these are preceded by a guanosine, equivalent to G2057; there is a high preference for the adenosine equivalent to A2060; H is any nucleotide except G; N is any nucleotide; and there are slight preferences for the nucleotides shown in lower case). This consensus is believed to represent the core of the motif that Erm methyltransferases recognize at their canonical A2058 site. The data also reveal constraints on the higher order structure of the motif that affect methyltransferase recognition. PMID:9917069

  20. Hemagglutinin Sequence Conservation Guided Stem Immunogen Design from Influenza A H3 Subtype

    PubMed Central

    Mallajosyula, V. Vamsee Aditya; Citron, Michael; Ferrara, Francesca; Temperton, Nigel J.; Liang, Xiaoping; Flynn, Jessica A.; Varadarajan, Raghavan

    2015-01-01

    Seasonal epidemics caused by influenza A (H1 and H3 subtypes) and B viruses are a major global health threat. The traditional, trivalent influenza vaccines have limited efficacy because of rapid antigenic evolution of the circulating viruses. This antigenic variability mediates viral escape from the host immune responses, necessitating annual vaccine updates. Influenza vaccines elicit a protective antibody response, primarily targeting the viral surface glycoprotein hemagglutinin (HA). However, the predominant humoral response is against the hypervariable head domain of HA, thereby restricting the breadth of protection. In contrast, the conserved, subdominant stem domain of HA is a potential “universal” vaccine candidate. We designed an HA stem-fragment immunogen from the 1968 pandemic H3N2 strain (A/Hong Kong/1/68) guided by a comprehensive H3 HA sequence conservation analysis. The biophysical properties of the designed immunogen were further improved by C-terminal fusion of a trimerization motif, “isoleucine-zipper”, or “foldon”. These immunogens elicited cross-reactive, antiviral antibodies and conferred partial protection against a lethal, homologous HK68 virus challenge in vivo. Furthermore, bacterial expression of these immunogens is economical and facilitates rapid scale-up. PMID:26167164

  1. Genetic diversity of the conserved motifs of six bacterial leaf blight resistance genes in a set of rice landraces

    PubMed Central

    2014-01-01

    Background Bacterial leaf blight (BLB) caused by the vascular pathogen Xanthomonas oryzae pv. oryzae (Xoo) is one of the most serious diseases leading to crop failure in rice growing countries. A total of 37 resistance genes against Xoo have been identified in rice. Of these, ten BLB resistance genes have been mapped on rice chromosomes, while 6 have been cloned, sequenced and characterized. Diversity analysis at the resistance gene level of this disease is scanty, and the landraces from West Bengal and North Eastern states of India have received little attention so far. The objective of this study was to assess the genetic diversity at conserved domains of 6 BLB resistance genes in a set of 22 rice accessions including landraces and check genotypes collected from the states of Assam, Nagaland, Mizoram and West Bengal. Results In this study, 34 pairs of primers were designed from conserved domains of 6 BLB resistance genes: Xa1, xa5, Xa21, Xa21(A1), Xa26 and Xa27. The designed primer pairs were used to generate PCR based polymorphic DNA profiles to detect and elucidate the genetic diversity of the six genes in the 22 diverse rice accessions of known disease phenotype. A total of 140 alleles were identified including 41 rare and 26 null alleles. The average polymorphism information content (PIC) value was 0.56/primer pair. The DNA profiles identified each of the rice landraces unequivocally. The amplified polymorphic DNA bands were used to calculate genetic similarity of the rice landraces in all possible pair combinations. The similarity among the rice accessions ranged from 18% to 89% and the dendrogram produced from the similarity values was divided into 2 major clusters. The conserved domains identified within the sequenced rare alleles include Leucine-Rich Repeat, BED-type zinc finger domain, sugar transferase domain and the domain of the carbohydrate esterase 4 superfamily. Conclusions This study revealed high genetic diversity at conserved domains of six BLB

  2. G-boxes, bigfoot genes, and environmental response: characterization of intragenomic conserved noncoding sequences in Arabidopsis.

    PubMed

    Freeling, Michael; Rapaka, Lakshmi; Lyons, Eric; Pedersen, Brent; Thomas, Brian C

    2007-05-01

    A tetraploidy left Arabidopsis thaliana with 6358 pairs of homoeologs that, when aligned, generated 14,944 intragenomic conserved noncoding sequences (CNSs). Our previous work assembled these phylogenetic footprints into a database. We show that known transcription factor (TF) binding motifs, including the G-box, are overrepresented in these CNSs. A total of 254 genes spanning long lengths of CNS-rich chromosomes (Bigfoot) dominate this database. Therefore, we made subdatabases: one containing Bigfoot genes and the other containing genes with three to five CNSs (Smallfoot). Bigfoot genes are generally TFs that respond to signals, with their modal CNS positioned 3.1 kb 5' from the ATG. Smallfoot genes encode components of signal transduction machinery, the cytoskeleton, or involve transcription. We queried each subdatabase with each possible 7-nucleotide sequence. Among hundreds of hits, most were purified from CNSs, and almost all of those significantly enriched in CNSs had no experimental history. The 7-mers in CNSs are not 5'- to 3'-oriented in Bigfoot genes but are often oriented in Smallfoot genes. CNSs with one G-box tend to have two G-boxes. CNSs were shared with the homoeolog only and with no other gene, suggesting that binding site turnover impedes detection. Bigfoot genes may function in adaptation to environmental change. PMID:17496117
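
    Querying a sequence set with every possible 7-nucleotide word amounts to k-mer counting over an alphabet of four letters. A minimal Python sketch is shown below; the two toy sequences are invented, and no enrichment statistics are computed, so this illustrates only the counting step and not the analysis pipeline used in the study.

```python
from collections import Counter
from itertools import product

K = 7
# Every possible 7-mer over A, C, G, T: 4**7 words in total.
ALL_KMERS = ["".join(p) for p in product("ACGT", repeat=K)]


def count_kmers(sequences: list[str], k: int = K) -> Counter:
    """Count every k-mer (overlapping windows) across a set of sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            # Skip windows containing ambiguous bases such as N.
            if set(kmer) <= set("ACGT"):
                counts[kmer] += 1
    return counts


# Toy "CNS" sequences, each containing the G-box core CACGTG.
toy_cns = ["TTCACGTGGCA", "GCCACGTGTC"]
counts = count_kmers(toy_cns)
print(len(ALL_KMERS), "possible 7-mers;", len(counts), "observed")
```

    Windows are counted on the given strand only, which keeps orientation information available for the kind of 5'-to-3' comparison mentioned in the abstract.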

  3. A Conserved Motif in the Membrane Proximal C-Terminal Tail of Human Muscarinic M1 Acetylcholine Receptors Affects Plasma Membrane Expression

    PubMed Central

    Ehlert, Frederick J.; Shults, Crystal A.

    2010-01-01

    We investigated the functional role of a conserved motif, F(x)6LL, in the membrane proximal C-tail of the human muscarinic M1 (hM1) receptor. By use of site-directed mutagenesis, several different point mutations were introduced into the C-tail sequence 423FRDTFRLLL431. Wild-type and mutant hM1 receptors were transiently expressed in Chinese hamster ovary cells, and the amount of plasma membrane-expressed receptor was determined by use of intact, whole-cell [3H]N-methylscopolamine binding assays. The plasma membrane expression of hM1 receptors possessing either L430A or L431A or both point mutations was significantly reduced compared with the wild type. The hM1 receptor possessing a L430A/L431A double-point mutation was retained in the endoplasmic reticulum (ER), and atropine treatment caused the redistribution of the mutant receptor from the ER to the plasma membrane. Atropine treatment also caused an increase in the maximal response and potency of carbachol-stimulated phosphoinositide hydrolysis elicited by the L430A/L431A mutant. The effect of atropine on the L430A/L431A receptor mutant suggests that L430 and L431 play a role in folding hM1 receptors, which is necessary for exit from the ER. Using site-directed mutagenesis, we also identified amino acid residues at the base of transmembrane-spanning domain 1 (TM1), V46 and L47, that, when mutated, reduce the plasma membrane expression of hM1 receptors in an atropine-reversible manner. Overall, these mutagenesis data show that amino acid residues in the membrane-proximal C-tail and base of TM1 are necessary for hM1 receptors to achieve a transport-competent state. PMID:19841475
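
    The motif F(x)6LL reads as a phenylalanine, any six residues, then two leucines, and the C-tail segment FRDTFRLLL spans residues 423 to 431. The Python sketch below checks that reading and shows that the L430A and L431A substitutions each break the pattern at the sequence level. The `mutate` helper is hypothetical, and a pattern match says nothing about folding or ER export.

```python
import re

MOTIF = re.compile(r"F.{6}LL")   # F, six arbitrary residues, LL
C_TAIL = "FRDTFRLLL"             # residues 423-431 as quoted in the abstract
FIRST_RESIDUE = 423


def mutate(seq: str, mutation: str, first: int = FIRST_RESIDUE) -> str:
    """Apply a point mutation written as, e.g., 'L430A' to the segment."""
    ref, pos, alt = mutation[0], int(mutation[1:-1]), mutation[-1]
    i = pos - first
    if seq[i] != ref:
        raise ValueError(f"expected {ref} at residue {pos}, found {seq[i]}")
    return seq[:i] + alt + seq[i + 1:]


for name in ("L430A", "L431A"):
    mutant = mutate(C_TAIL, name)
    print(name, mutant, bool(MOTIF.search(mutant)))

double = mutate(mutate(C_TAIL, "L430A"), "L431A")
print("L430A/L431A", double, bool(MOTIF.search(double)))
```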

  4. The tryptophan repressor sequence is highly conserved among the Enterobacteriaceae.

    PubMed Central

    Arvidson, D N; Arvidson, C G; Lawson, C L; Miner, J; Adams, C; Youderian, P

    1994-01-01

    Tryptophan biosynthesis in Escherichia coli is regulated by the product of the trpR gene, the tryptophan (Trp) repressor. Trp aporepressor binds the corepressor, L-tryptophan, to form a holorepressor complex, which binds trp operator DNA tightly, and inhibits transcription of the tryptophan biosynthetic operon. The conservation of trp operator sequences among enteric Gram-negative bacteria suggests that trpR genes from other bacterial species can be cloned by complementation in E. coli. To clone trpR homologues, a deletion of the E. coli trpR gene, delta trpR504, was made on a plasmid by site-directed mutagenesis, then crossed onto the E. coli genome. Plasmid clones of the trpR genes of Enterobacter aerogenes and Enterobacter cloacae were isolated by complementation of the delta trpR504 allele, scored as the ability to repress beta-galactosidase synthesis from a prophage-borne trpE-lacZ gene fusion. The predicted amino acid sequences of four enteric TrpR proteins show differences, clustered on the backside of the folded repressor, opposite the DNA-binding helix-turn-helix substructures. These differences are predicted to have little effect on the interactions of the aporepressor with tryptophan, holorepressor with operator DNA, or tandemly bound holorepressor dimers with one another. Although there is some variation observed at the dimer interface, interactions predicted to stabilize the interface are conserved. The phylogenetic relationships revealed by the TrpR amino acid sequence alignment agree with the results of others. PMID:8208606

  5. Cross-reactivity between the rheumatoid arthritis-associated motif EQKRAA and structurally related sequences found in Proteus mirabilis.

    PubMed

    Tiwana, H; Wilson, C; Alvarez, A; Abuknesha, R; Bansal, S; Ebringer, A

    1999-06-01

    Cross-reactivity or molecular mimicry may be one of the underlying mechanisms involved in the etiopathogenesis of rheumatoid arthritis (RA). Antiserum against the RA susceptibility sequence EQKRAA was shown to bind to a similar peptide ESRRAL present in the hemolysin of the gram-negative bacterium Proteus mirabilis, and an anti-ESRRAL serum reacted with EQKRAA. There was no reactivity with either anti-EQKRAA or anti-ESRRAL to a peptide containing the EDERAA sequence which is present in HLA-DRB1*0402, an allele not associated with RA. Furthermore, the EQKRAA and ESRRAL antisera bound to a mouse fibroblast transfectant cell line (Dap.3) expressing HLA-DRB1*0401 but not to DRB1*0402. However, peptide sequences structurally related to the RA susceptibility motif LEIEKDFTTYGEE (P. mirabilis urease), VEIRAEGNRFTY (collagen type II) and DELSPETSPYVKE (collagen type XI) did not bind significantly to cell lines expressing HLA-DRB1*0401 or HLA-DRB1*0402 compared to the control peptide YASGASGASGAS. It is suggested here that molecular mimicry between HLA alleles associated with RA and P. mirabilis may be relevant in the etiopathogenesis of the disease. PMID:10338479

  6. A Conserved Acidic Motif in the N-Terminal Domain of Nitrate Reductase Is Necessary for the Inactivation of the Enzyme in the Dark by Phosphorylation and 14-3-3 Binding

    PubMed Central

    Pigaglio, Emmanuelle; Durand, Nathalie; Meyer, Christian

    1999-01-01

    It has previously been shown that the N-terminal domain of tobacco (Nicotiana tabacum) nitrate reductase (NR) is involved in the inactivation of the enzyme by phosphorylation, which occurs in the dark (L. Nussaume, M. Vincentz, C. Meyer, J.P. Boutin, and M. Caboche [1995] Plant Cell 7: 611–621). The activity of a mutant NR protein lacking this N-terminal domain was no longer regulated by light-dark transitions. In this study smaller deletions were performed in the N-terminal domain of tobacco NR that removed protein motifs conserved among higher plant NRs. The resulting truncated NR-coding sequences were then fused to the cauliflower mosaic virus 35S RNA promoter and introduced in NR-deficient mutants of the closely related species Nicotiana plumbaginifolia. We found that the deletion of a conserved stretch of acidic residues led to an active NR protein that was more thermosensitive than the wild-type enzyme, but it was relatively insensitive to the inactivation by phosphorylation in the dark. Therefore, the removal of this acidic stretch seems to have the same effects on NR activation state as the deletion of the N-terminal domain. A hypothetical explanation for these observations is that a specific factor that impedes inactivation remains bound to the truncated enzyme. A synthetic peptide derived from this acidic protein motif was also found to be a good substrate for casein kinase II. PMID:9880364

  7. Polymorphism, monomorphism, and sequences in conserved microsatellites in primate species.

    PubMed

    Blanquer-Maumont, A; Crouau-Roy, B

    1995-10-01

    Dimeric short tandem repeats are a source of highly polymorphic markers in the mammalian genome. Genetic variation at these hypervariable loci is extensively used for linkage analysis, for the identification of individuals, and may be useful for interpopulation and interspecies studies. In this paper, we analyze the variability and the sequences of a segment including three microsatellites, first described in man, in several species of primates (chimpanzee, orangutan, gibbon, and macaque) using the heterologous primers (man primers). This region is located on the human chromosome 6p, near the tumor necrosis factor genes, in the major histocompatibility complex. The fact that these primers work in all species studied indicates that they are conserved throughout the different lineages of the two superfamilies, the Hominoidea and the Cercopithecidea, represented by the macaques. However, the intervening sequence displays intraspecific and interspecific variability. The sites of base substitutions and the insertion/deletion events are not evenly distributed within this region. The data suggest that it is necessary to have a minimal number of repeats to increase the rate of mutation sufficiently to allow the development of polymorphism. In some species, the microsatellites present single base variations which reduce the number of contiguous repeats, thus apparently slowing the rate of additional slippage events. Species with such variations or a low number of repeats are monomorphic. These microsatellite sequences are informative in the comparison of closely related species and reflect the phylogeny of the Old World monkeys, apes, and man. PMID:7563137
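    The point made above, that a single base variation reduces the number of contiguous repeats, can be illustrated by measuring the longest uninterrupted run of a repeat unit. This is a minimal Python sketch under stated assumptions: the repeat unit "CA" and both toy sequences are invented for illustration and are not data from the study, and longest_repeat_run is a hypothetical helper name.

```python
import re

def longest_repeat_run(seq, unit):
    """Largest number of back-to-back copies of `unit` found in `seq`."""
    unit = unit.upper()
    pattern = re.compile(f"(?:{re.escape(unit)})+")
    runs = (m.group() for m in pattern.finditer(seq.upper()))
    return max((len(run) // len(unit) for run in runs), default=0)

# Hypothetical alleles of equal length: the second carries one A->G change
# in the seventh repeat unit, which splits the tract in two.
perfect = "GGT" + "CA" * 12 + "TTG"
interrupted = "GGT" + "CA" * 6 + "CG" + "CA" * 5 + "TTG"

print(longest_repeat_run(perfect, "CA"))      # 12
print(longest_repeat_run(interrupted, "CA"))  # 6
```

    Both sequences have the same length, but the interrupted one offers a much shorter contiguous tract, which is the quantity the abstract relates to slippage and polymorphism.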

  8. Structural Analysis of a Repetitive Protein Sequence Motif in Strepsirrhine Primate Amelogenin

    PubMed Central

    Bromley, Keith M.; Hacia, Joseph G.; Bromage, Timothy G.; Snead, Malcolm L.; Moradian-Oldak, Janet; Paine, Michael L.

    2011-01-01

    Strepsirrhines are members of a primate suborder that has a distinctive set of features associated with the development of the dentition. Amelogenin (AMEL), the better known of the enamel matrix proteins, forms 90% of the secreted organic matrix during amelogenesis. Although AMEL has been sequenced in numerous mammalian lineages, the only reported strepsirrhine AMEL sequences are those of the ring-tailed lemur and galago, which contain a set of additional proline-rich tandem repeats absent in all other primate species analyzed to date, but present in some non-primate mammals. Here, we first determined that these repeats are present in AMEL from three additional lemur species and thus are likely to be widespread throughout this group. To evaluate the functional relevance of these repeats in strepsirrhines, we engineered a mutated murine amelogenin sequence containing a similar proline-rich sequence to that of Lemur catta. In the monomeric form, the MQP insertions had no influence on the secondary structure or refolding properties, whereas in the assembled form, the insertions increased the hydrodynamic radii. We speculate that increased AMEL nanosphere size may influence enamel formation in strepsirrhine primates. PMID:21437261

  9. Structural analysis of a repetitive protein sequence motif in strepsirrhine primate amelogenin.

    PubMed

    Lacruz, Rodrigo S; Lakshminarayanan, Rajamani; Bromley, Keith M; Hacia, Joseph G; Bromage, Timothy G; Snead, Malcolm L; Moradian-Oldak, Janet; Paine, Michael L

    2011-01-01

    Strepsirrhines are members of a primate suborder that has a distinctive set of features associated with the development of the dentition. Amelogenin (AMEL), the better known of the enamel matrix proteins, forms 90% of the secreted organic matrix during amelogenesis. Although AMEL has been sequenced in numerous mammalian lineages, the only reported strepsirrhine AMEL sequences are those of the ring-tailed lemur and galago, which contain a set of additional proline-rich tandem repeats absent in all other primate species analyzed to date, but present in some non-primate mammals. Here, we first determined that these repeats are present in AMEL from three additional lemur species and thus are likely to be widespread throughout this group. To evaluate the functional relevance of these repeats in strepsirrhines, we engineered a mutated murine amelogenin sequence containing a similar proline-rich sequence to that of Lemur catta. In the monomeric form, the MQP insertions had no influence on the secondary structure or refolding properties, whereas in the assembled form, the insertions increased the hydrodynamic radii. We speculate that increased AMEL nanosphere size may influence enamel formation in strepsirrhine primates. PMID:21437261

  10. Complex architecture of major histocompatibility complex class II promoters: reiterated motifs and conserved protein-protein interactions.

    PubMed Central

    Jabrane-Ferrat, N; Fontes, J D; Boss, J M; Peterlin, B M

    1996-01-01

    The S box (also known as the H, W, or Z box) is the 5'-most element of the conserved upstream sequences in promoters of major histocompatibility complex class II genes. It is important for their B-cell-specific and interferon gamma-inducible expression. In this study, we demonstrate that the S box represents a duplication of the downstream X box. First, RFX, which is composed of the RFX5-p36 heterodimer that binds to the X box, also binds to the S box and its 5'-flanking sequence. Second, NF-Y, which binds to the Y box and increases interactions between RFX and the X box, also increases the binding of RFX to the S box. Third, RFXs bound to S and X boxes interact with each other in a spatially constrained manner. Finally, we confirmed these protein-protein and protein-DNA interactions by expressing a hybrid RFX5-VP16 protein in cells. We conclude that RFX binds to S and X boxes and that complex interactions between RFX and NF-Y direct B-cell-specific and interferon gamma-inducible expression of major histocompatibility complex class II genes. PMID:8756625

  11. The histone chaperone sNASP binds a conserved peptide motif within the globular core of histone H3 through its TPR repeats

    PubMed Central

    Bowman, Andrew; Lercher, Lukas; Singh, Hari R.; Zinne, Daria; Timinszky, Gyula; Carlomagno, Teresa; Ladurner, Andreas G.

    2016-01-01

    Eukaryotic chromatin is a complex yet dynamic structure, which is regulated in part by the assembly and disassembly of nucleosomes. Key to this process is a group of proteins termed histone chaperones that guide the thermodynamic assembly of nucleosomes by interacting with soluble histones. Here we investigate the interaction between the histone chaperone sNASP and its histone H3 substrate. We find that sNASP binds with nanomolar affinity to a conserved heptapeptide motif in the globular domain of H3, close to the C-terminus. Through functional analysis of sNASP homologues we identified point mutations in surface residues within the TPR domain of sNASP that disrupt H3 peptide interaction, but do not completely disrupt binding to full length H3 in cells, suggesting that sNASP interacts with H3 through additional contacts. Furthermore, chemical shift perturbations from 1H-15N HSQC experiments show that H3 peptide binding maps to the helical groove formed by the stacked TPR motifs of sNASP. Our findings reveal a new mode of interaction between a TPR repeat domain and an evolutionarily conserved peptide motif found in canonical H3 and in all histone H3 variants, including CenpA and have implications for the mechanism of histone chaperoning within the cell. PMID:26673727

  12. A conserved arginine-containing motif crucial for the assembly and enzymatic activity of the mixed lineage leukemia protein-1 core complex.

    PubMed

    Patel, Anamika; Vought, Valarie E; Dharmarajan, Venkatasubramanian; Cosgrove, Michael S

    2008-11-21

    The mixed lineage leukemia protein-1 (MLL1) belongs to the SET1 family of histone H3 lysine 4 methyltransferases. Recent studies indicate that the catalytic subunits of SET1 family members are regulated by interaction with a conserved core group of proteins that include the WD repeat protein-5 (WDR5), retinoblastoma-binding protein-5 (RbBP5), and the absent small homeotic-2-like protein (Ash2L). It has been suggested that WDR5 functions to bridge the interactions between the catalytic and regulatory subunits of SET1 family complexes. However, the molecular details of these interactions are unknown. To gain insight into the interactions among these proteins, we have determined the biophysical basis for the interaction between the human WDR5 and MLL1. Our studies reveal that WDR5 preferentially recognizes a previously unidentified and conserved arginine-containing motif, called the "Win" or WDR5 interaction motif, which is located in the N-SET region of MLL1 and other SET1 family members. Surprisingly, our structural and functional studies show that WDR5 recognizes arginine 3765 of the MLL1 Win motif using the same arginine binding pocket on WDR5 that was previously shown to bind histone H3. We demonstrate that WDR5's recognition of arginine 3765 of MLL1 is essential for the assembly and enzymatic activity of the MLL1 core complex in vitro. PMID:18829457

  13. Identification and Characterization of Functionally Critical, Conserved Motifs in the Internal Repeats and N-terminal Domain of Yeast Translation Initiation Factor 4B (yeIF4B)*

    PubMed Central

    Zhou, Fujun; Walker, Sarah E.; Mitchell, Sarah F.; Lorsch, Jon R.; Hinnebusch, Alan G.

    2014-01-01

    eIF4B has been implicated in attachment of the 43 S preinitiation complex (PIC) to mRNAs and scanning to the start codon. We recently determined that the internal seven repeats (of ∼26 amino acids each) of Saccharomyces cerevisiae eIF4B (yeIF4B) compose the region most critically required to enhance mRNA recruitment by 43 S PICs in vitro and stimulate general translation initiation in yeast. Moreover, although the N-terminal domain (NTD) of yeIF4B contributes to these activities, the RNA recognition motif is dispensable. We have now determined that only two of the seven internal repeats are sufficient for wild-type (WT) yeIF4B function in vivo when all other domains are intact. However, three or more repeats are needed in the absence of the NTD or when the functions of eIF4F components are compromised. We corroborated these observations in the reconstituted system by demonstrating that yeIF4B variants with only one or two repeats display substantial activity in promoting mRNA recruitment by the PIC, whereas additional repeats are required at lower levels of eIF4A or when the NTD is missing. These findings indicate functional overlap among the 7-repeats and NTD domains of yeIF4B and eIF4A in mRNA recruitment. Interestingly, only three highly conserved positions in the 26-amino acid repeat are essential for function in vitro and in vivo. Finally, we identified conserved motifs in the NTD and demonstrate functional overlap of two such motifs. These results provide a comprehensive description of the critical sequence elements in yeIF4B that support eIF4F function in mRNA recruitment by the PIC. PMID:24285537

  14. High-throughput sequencing enhanced phage display enables the identification of patient-specific epitope motifs in serum

    PubMed Central

    Christiansen, Anders; Kringelum, Jens V.; Hansen, Christian S.; Bøgh, Katrine L.; Sullivan, Eric; Patel, Jigar; Rigby, Neil M.; Eiwegger, Thomas; Szépfalusi, Zsolt; Masi, Federico de; Nielsen, Morten; Lund, Ole; Dufva, Martin

    2015-01-01

    Phage display is a prominent screening technique with a multitude of applications including therapeutic antibody development and mapping of antigen epitopes. In this study, phages were selected based on their interaction with patient serum and exhaustively characterised by high-throughput sequencing. A bioinformatics approach was developed in order to identify peptide motifs of interest based on clustering and contrasting to control samples. Comparison of patient and control samples confirmed a major issue in phage display, namely the selection of unspecific peptides. The potential of the bioinformatic approach was demonstrated by identifying epitopes of a prominent peanut allergen, Ara h 1, in sera from patients with severe peanut allergy. The identified epitopes were confirmed by high-density peptide micro-arrays. The present study demonstrates that high-throughput sequencing can empower phage display by (i) enabling the analysis of complex biological samples, (ii) circumventing the traditional laborious picking and functional testing of individual phage clones and (iii) reducing the number of selection rounds. PMID:26246327

  15. CpG island erosion, polycomb occupancy and sequence motif enrichment at bivalent promoters in mammalian embryonic stem cells

    PubMed Central

    Mantsoki, Anna; Devailly, Guillaume; Joshi, Anagha

    2015-01-01

    In embryonic stem (ES) cells, developmental regulators have a characteristic bivalent chromatin signature marked by simultaneous presence of both activation (H3K4me3) and repression (H3K27me3) signals and are thought to be in a ‘poised’ state for subsequent activation or silencing during differentiation. We collected eleven pairs (H3K4me3 and H3K27me3) of ChIP sequencing datasets in human ES cells and eight pairs in murine ES cells, and predicted high-confidence (HC) bivalent promoters. Over 85% of H3K27me3 marked promoters were bivalent in human and mouse ES cells. We found that (i) HC bivalent promoters were enriched for developmental factors and were highly likely to be differentially expressed upon transcription factor perturbation; (ii) murine HC bivalent promoters were occupied by both polycomb repressive component classes (PRC1 and PRC2) and grouped into four distinct clusters with different biological functions; (iii) HC bivalent and active promoters were CpG rich while H3K27me3-only promoters lacked CpG islands. Binding enrichment of distinct sets of regulators distinguished bivalent from active promoters. Moreover, a ‘TCCCC’ sequence motif was specifically enriched in bivalent promoters. Finally, this analysis will serve as a resource for future studies to further understand transcriptional regulation during embryonic development. PMID:26582124

  16. On the Concept of Cis-regulatory Information: From Sequence Motifs to Logic Functions

    NASA Astrophysics Data System (ADS)

    Tarpine, Ryan; Istrail, Sorin

    The regulatory genome is about the “system level organization of the core genomic regulatory apparatus, and how this is the locus of causality underlying the twin phenomena of animal development and animal evolution” (E.H. Davidson. The Regulatory Genome: Gene Regulatory Networks in Development and Evolution, Academic Press, 2006). Information processing in the regulatory genome is done through regulatory states, defined as sets of transcription factors (sequence-specific DNA binding proteins which determine gene expression) that are expressed and active at the same time. The core information processing machinery consists of modular DNA sequence elements, called cis-modules, that interact with transcription factors. The cis-modules “read” the information contained in the regulatory state of the cell through transcription factor binding, “process” it, and directly or indirectly communicate with the basal transcription apparatus to determine gene expression. This endowment of each gene with the information-receiving capacity through their cis-regulatory modules is essential for the response to every possible regulatory state to which it might be exposed during all phases of the life cycle and in all cell types. We present here a set of challenges addressed by our CYRENE research project aimed at studying the cis-regulatory code of the regulatory genome. The CYRENE Project is devoted to (1) the construction of a database, the cis-Lexicon, containing comprehensive information across species about experimentally validated cis-regulatory modules; and (2) the software development of a next-generation genome browser, the cis-Browser, specialized for the regulatory genome. The presentation is anchored on three main computational challenges: the Gene Naming Problem, the Consensus Sequence Bottleneck Problem, and the Logic Function Inference Problem.

  17. A highly conserved DNA replication module from Streptococcus thermophilus phages is similar in sequence and topology to a module from Lactococcus lactis phages.

    PubMed

    Desiere, F; Lucchini, S; Bruttin, A; Zwahlen, M C; Brüssow, H

    1997-08-01

    A highly conserved DNA region extending over 5 kb was observed in Streptococcus thermophilus bacteriophages. Comparative sequencing of one temperate and 26 virulent phages demonstrated in the most extreme case an 18% aa difference for a predicted protein, while the majority of the phages showed fewer, if any aa changes. The relative degree of aa conservation was not homogeneous over the DNA segment investigated. Sequence analysis of the conserved segment revealed genes possibly involved in DNA transactions. Three predicted proteins (orf 233, 443, and 382 gene product (gp)) showed nucleoside triphosphate binding motifs. Orf 443 gp showed in addition a DEAH box motif, characteristically found in a subgroup of helicases, and a variant zinc finger motif known from a phage T7 helicase/primase. Tree analysis classified orf 443 gp as a distant member of the helicase superfamily. Orf 382 gp showed similarity to putative plasmid DNA primases. Downstream of orf 382 a noncoding repeat region was identified that showed similarity to a putative minus origin from a cryptic S. thermophilus plasmid. Four predicted proteins showed not only high degrees of aa identity (34 to 63%) with proteins from Lactococcus lactis phages, but their genes showed a similar topological organization. We interpret this as evidence for a horizontal gene transfer event between phages of the two bacterial genera in the distant past. PMID:9268169

  18. Characterization of the fibronectin-attachment protein of Mycobacterium avium reveals a fibronectin-binding motif conserved among mycobacteria.

    PubMed

    Schorey, J S; Holsti, M A; Ratliff, T L; Allen, P M; Brown, E J

    1996-07-01

    Mycobacterium avium is an intracellular pathogen and a major opportunistic infectious agent observed in patients with acquired immune deficiency syndrome (AIDS). Evidence suggests that the initial portal of infection by M. avium is often the gastrointestinal tract. However, the mechanism by which M. avium crosses the epithelial barrier is unclear. A possible mechanism is suggested by the ability of M. avium to bind fibronectin, an extracellular matrix protein that is a virulence factor for several extracellular pathogenic bacteria which bind to mucosal surfaces. To further characterize fibronectin binding by M. avium, we have cloned the M. avium fibronectin-attachment protein (FAP). The M. avium FAP (FAP-A) has an unusually large number of Pro and Ala residues (40% overall) and is 50% identical to FAP of both Mycobacterium leprae and Mycobacterium tuberculosis. Using recombinant FAP-A and FAP-A peptides, we show that two non-continuous regions in FAP-A bind fibronectin. Peptides from these regions and homologous sequences from M. leprae FAP inhibit fibronectin binding by both M. avium and Mycobacterium bovis Bacillus Calmette-Guerin (BCG). These regions have no homology to eukaryotic fibronectin-binding proteins and are only distantly related to fibronectin-binding peptides of Gram-positive bacteria. Nevertheless, these fibronectin-binding regions are highly conserved among the mycobacterial FAPs, suggesting an essential function for this interaction in mycobacteria infection of their metazoan hosts. PMID:8858587

  19. Rare k-mer DNA: Identification of sequence motifs and prediction of CpG island and promoter.

    PubMed

    Mohamed Hashim, Ezzeddin Kamil; Abdullah, Rosni

    2015-12-21

    Empirical analysis of k-mer DNA has proven to be an effective tool for finding unique patterns in DNA sequences, which can lead to the discovery of potential sequence motifs. In an extensive study of empirical k-mer DNA on hundreds of organisms, the researchers found that unique multi-modal k-mer spectra occur only in the genomes of organisms from the tetrapod clade, which includes all mammals. The multi-modality is caused by the formation of the two lowest modes, where k-mers under them are referred to as the rare k-mers. The suppression of the two lowest modes (or the rare k-mers) can be attributed to the CG dinucleotide inclusions in them. Apart from that, the rare k-mers are selectively distributed in certain genomic features of CpG Island (CGI), promoter, 5' UTR, and exon. We correlated the rare k-mers with hundreds of annotated features using several bioinformatic tools, performed further intrinsic rare k-mer analyses within the correlated features, and modeled the elucidated rare k-mer clustering feature into a classifier to predict the correlated CGI and promoter features. Our correlation results show that rare k-mers are highly associated with several annotated features of CGI, promoter, 5' UTR, and open chromatin regions. Our intrinsic results show that rare k-mers have several unique topological, compositional, and clustering properties in CGI and promoter features. Finally, the performances of our RWC (rare-word clustering) method in predicting the CGI and promoter features are ranked among the top three, in eight of the CGI and promoter evaluations, among eight of the benchmarked datasets. PMID:26427337
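    A k-mer spectrum of the kind discussed above can be built by counting every overlapping k-mer and then tallying how many distinct k-mers share each occurrence count. The sketch below is a minimal Python illustration, not the authors' pipeline or their RWC classifier. The function names, the toy sequence, and the choice to skip k-mers containing characters other than A, C, G, T are all assumptions made for the example.

```python
from collections import Counter

def kmer_counts(seq, k):
    """Count every overlapping k-mer made only of A, C, G, T."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts

def kmer_spectrum(counts):
    """Map an occurrence count to the number of distinct k-mers with that count."""
    return Counter(counts.values())

def cg_fraction(kmers):
    """Fraction of the given k-mers that contain a CG dinucleotide."""
    kmers = list(kmers)
    return sum("CG" in kmer for kmer in kmers) / len(kmers) if kmers else 0.0

toy = "ACGACGT"  # invented sequence, far too short to show real modes
counts = kmer_counts(toy, 3)
print(dict(counts))                 # {'ACG': 2, 'CGA': 1, 'GAC': 1, 'CGT': 1}
print(dict(kmer_spectrum(counts)))  # {2: 1, 1: 3}
print(cg_fraction(counts))          # 0.75
```

    On a real genome the spectrum would be computed for larger k and plotted. The low-count end of that plot is where the abstract locates the rare, CG-containing k-mers.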

  20. Sequence and structural analysis of the Asp-box motif and Asp-box beta-propellers; a widespread propeller-type characteristic of the Vps10 domain family and several glycoside hydrolase families

    PubMed Central

    Quistgaard, Esben M; Thirup, Søren S

    2009-01-01

    Background The Asp-box is a short sequence and structure motif that folds as a well-defined β-hairpin. It is present in different folds, but occurs most prominently as repeats in β-propellers. Asp-box β-propellers are known to be characteristically irregular and to occur in many medically important proteins, most of which are glycosidase enzymes, but they are otherwise not well characterized and are only rarely treated as a distinct β-propeller family. We have analyzed the sequence, structure, function and occurrence of the Asp-box and the s-Asp-box, a related shorter variant, and provide a comprehensive classification and computational analysis of the Asp-box β-propeller family. Results We find that all conserved residues of the Asp-box support its structure, whereas the residues in variable positions are generally used for other purposes. The Asp-box clearly has a structural role in β-propellers and is highly unlikely to be involved in ligand binding. Sequence analysis of the Asp-box β-propeller family reveals it to be very widespread, especially in bacteria, and suggests a wide functional range. Disregarding the Asp-boxes, sequence conservation of the propeller blades is very low, but a distinct pattern of residues with specific properties has been identified. Interestingly, Asp-boxes are occasionally found very close to other propeller-associated repeats in extensive mixed-motif stretches, which strongly suggests the existence of a novel class of hybrid β-propellers. Structural analysis reveals that the top and bottom faces of Asp-box β-propellers have striking and consistently different loop properties; the bottom is structurally conserved whereas the top shows great structural variation. Interestingly, only the top face is used for functional purposes in known structures. A structural analysis of the 10-bladed β-propeller fold, which has so far only been observed in the Asp-box family, reveals that the inner strands of the blades are unusually far apart

  1. Crystal Structure of pb9, the Distal Tail Protein of Bacteriophage T5: a Conserved Structural Motif among All Siphophages

    PubMed Central

    Flayhan, Ali; Vellieux, Frédéric M. D.; Lurz, Rudi; Maury, Olivier; Contreras-Martel, Carlos; Girard, Eric; Boulanger, Pascale

    2014-01-01

    The tail of Caudovirales bacteriophages serves as an adsorption device, a host cell wall-perforating machine, and a genome delivery pathway. In Siphoviridae, the assembly of the long and flexible tail is a highly cooperative and regulated process that is initiated from the proteins forming the distal tail tip complex. In Gram-positive-bacterium-infecting siphophages, the distal tail (Dit) protein has been structurally characterized and is proposed to represent a baseplate hub docking structure. It is organized as a hexameric ring that connects the tail tube and the adsorption device. In this study, we report the characterization of pb9, a tail tip protein of Escherichia coli bacteriophage T5. By immunolocalization, we show that pb9 is located in the upper part of the cone of the T5 tail tip, at the end of the tail tube. The crystal structure of pb9 reveals a two-domain protein. Domain A exhibits remarkable structural similarity with the N-terminal domain of known Dit proteins, while domain B adopts an oligosaccharide/oligonucleotide-binding fold (OB-fold) that is not shared by these proteins. We thus propose that pb9 is the Dit protein of T5, making it the first Dit protein described for a Gram-negative-bacterium-infecting siphophage. Multiple sequence alignments suggest that pb9 is a paradigm for a large family of Dit proteins of siphophages infecting mostly Gram-negative hosts. The modular structure of the Dit protein maintains the basic building block that would be conserved among all siphophages, combining it with a more divergent domain that might serve specific host adhesion properties. PMID:24155371

  2. Identification of a Novel Calcium Binding Motif Based on the Detection of Sequence Insertions in the Animal Peroxidase Domain of Bacterial Proteins

    PubMed Central

    Santamaría-Hernando, Saray

    2012-01-01

    Proteins of the animal heme peroxidase (ANP) superfamily differ greatly in size since they have either one or two catalytic domains that match profile PS50292. The orf PP_2561 of Pseudomonas putida KT2440 that we have called PepA encodes a two-domain ANP. The alignment of these domains with those of PepA homologues revealed a variable number of insertions with the consensus G-x-D-G-x-x-[GN]-[TN]-x-D-D. This motif has also been detected in the structure of pseudopilin (pdb 3G20), where it was found to be involved in Ca2+ coordination although a sequence analysis did not reveal the presence of any known calcium binding motifs in this protein. Isothermal titration calorimetry revealed that a peptide containing this consensus motif bound specifically calcium ions with affinities ranging between 33–79 µM depending on the pH. Microcalorimetric titrations of the purified N-terminal ANP-like domain of PepA revealed Ca2+ binding with a KD of 12 µM and stoichiometry of 1.25 calcium ions per protein monomer. This domain exhibited peroxidase activity after its reconstitution with heme. These data led to the definition of a novel calcium binding motif that we have termed PERCAL and which was abundantly present in animal peroxidase-like domains of bacterial proteins. Bacterial heme peroxidases thus possess two different types of calcium binding motifs, namely PERCAL and the related hemolysin type calcium binding motif, with the latter being located outside the catalytic domains and in their C-terminal end. A phylogenetic tree of ANP-like catalytic domains of bacterial proteins with PERCAL motifs, including single domain peroxidases, was divided into two major clusters, representing domains with and without PERCAL motif containing insertions. We have verified that the recently reported classification of bacterial heme peroxidases in two families (cd09819 and cd09821) is unrelated to these insertions. Sequences matching PERCAL were detected in all kingdoms of life. PMID

  3. Identification of a novel calcium binding motif based on the detection of sequence insertions in the animal peroxidase domain of bacterial proteins.

    PubMed

    Santamaría-Hernando, Saray; Krell, Tino; Ramos-González, María-Isabel

    2012-01-01

    Proteins of the animal heme peroxidase (ANP) superfamily differ greatly in size since they have either one or two catalytic domains that match profile PS50292. The orf PP_2561 of Pseudomonas putida KT2440 that we have called PepA encodes a two-domain ANP. The alignment of these domains with those of PepA homologues revealed a variable number of insertions with the consensus G-x-D-G-x-x-[GN]-[TN]-x-D-D. This motif has also been detected in the structure of pseudopilin (pdb 3G20), where it was found to be involved in Ca(2+) coordination although a sequence analysis did not reveal the presence of any known calcium binding motifs in this protein. Isothermal titration calorimetry revealed that a peptide containing this consensus motif bound specifically calcium ions with affinities ranging between 33-79 µM depending on the pH. Microcalorimetric titrations of the purified N-terminal ANP-like domain of PepA revealed Ca(2+) binding with a K(D) of 12 µM and stoichiometry of 1.25 calcium ions per protein monomer. This domain exhibited peroxidase activity after its reconstitution with heme. These data led to the definition of a novel calcium binding motif that we have termed PERCAL and which was abundantly present in animal peroxidase-like domains of bacterial proteins. Bacterial heme peroxidases thus possess two different types of calcium binding motifs, namely PERCAL and the related hemolysin type calcium binding motif, with the latter being located outside the catalytic domains and in their C-terminal end. A phylogenetic tree of ANP-like catalytic domains of bacterial proteins with PERCAL motifs, including single domain peroxidases, was divided into two major clusters, representing domains with and without PERCAL motif containing insertions. We have verified that the recently reported classification of bacterial heme peroxidases in two families (cd09819 and cd09821) is unrelated to these insertions. Sequences matching PERCAL were detected in all kingdoms of life. PMID
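    The consensus G-x-D-G-x-x-[GN]-[TN]-x-D-D quoted above uses a PROSITE-style notation in which x stands for any residue and square brackets list the alternatives allowed at one position. The sketch below shows one way to turn that string into a regular expression in Python and scan a sequence with it. It handles only the elements that appear in this consensus (single letters, x, and bracketed sets), and the helper names and the two test peptides are invented for illustration.

```python
import re

def prosite_to_regex(pattern):
    """Translate a simple dash-separated consensus into a regex string.

    Supports only single residues, 'x' (any residue) and [..] sets.
    """
    parts = []
    for element in pattern.split("-"):
        parts.append("." if element == "x" else element)
    return "".join(parts)

def find_sites(seq, pattern):
    """Return (0-based start, matched text) for each non-overlapping hit."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start(), m.group()) for m in regex.finditer(seq)]

PERCAL = "G-x-D-G-x-x-[GN]-[TN]-x-D-D"

print(prosite_to_regex(PERCAL))              # G.DG..[GN][TN].DD
print(find_sites("MKGADGSLNNVDDQ", PERCAL))  # [(2, 'GADGSLNNVDD')]
print(find_sites("MKGADGSLNAVDDQ", PERCAL))  # []
```

    The second peptide differs only at the [TN] position (A instead of N) and is rejected, which shows how strictly a consensus of this form constrains the fixed positions.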

  4. The role of context in RNA structure: flanking sequences reconfigure CAG motif folding in huntingtin exon 1 transcripts

    PubMed Central

    Busan, Steven; Weeks, Kevin M.

    2016-01-01

    The length of the CAG repeat region in the huntingtin messenger RNA is predictive of Huntington’s disease. Structural studies of CAG repeat-containing RNAs suggest that these sequences form simple hairpin structures; however, in the context of the full-length huntingtin mRNA, CAG repeats may form complex structures that could be targeted for therapeutic intervention. We examined the structures of transcripts spanning the first exon of the huntingtin mRNA with both healthy and disease-prone repeat lengths. In transcripts with 17 to 70 repeats, the CAG sequences base paired extensively with bases in the 5′ UTR and with a conserved region downstream of the CCG repeat region. In huntingtin transcripts with healthy numbers of repeats, the previously observed CAG hairpin was either absent or short. In contrast, in transcripts with disease-associated numbers of repeats, a CAG hairpin was present and extended from a three-helix junction. Our findings demonstrate the profound importance of sequence context in RNA folding and identify specific structural differences between healthy and disease-inducing huntingtin alleles that may be targets for therapeutic intervention. PMID:24199621

  5. De Novo Regulatory Motif Discovery Identifies Significant Motifs in Promoters of Five Classes of Plant Dehydrin Genes

    PubMed Central

    Zolotarov, Yevgen; Strömvik, Martina

    2015-01-01

    Plants accumulate dehydrins in response to osmotic stresses. Dehydrins are divided into five different classes, which are thought to be regulated in different manners. To better understand differences in transcriptional regulation of the five dehydrin classes, de novo motif discovery was performed on 350 dehydrin promoter sequences from a total of 51 plant genomes. Overrepresented motifs were identified in the promoters of five dehydrin classes. The Kn dehydrin promoters contain motifs linked with meristem specific expression, as well as motifs linked with cold/dehydration and abscisic acid response. KS dehydrin promoters contain a motif with a GATA core. SKn and YnSKn dehydrin promoters contain motifs that match elements connected with cold/dehydration, abscisic acid and light response. YnKn dehydrin promoters contain motifs that match abscisic acid and light response elements, but not cold/dehydration response elements. Conserved promoter motifs are present in the dehydrin classes and across different plant lineages, indicating that dehydrin gene regulation is likely also conserved. PMID:26114291

  6. Localization of Daucus carota NMCP1 to the nuclear periphery: the role of the N-terminal region and an NLS-linked sequence motif, RYNLRR, in the tail domain

    PubMed Central

    Kimura, Yuta; Fujino, Kaien; Ogawa, Kana; Masuda, Kiyoshi

    2014-01-01

    Recent ultrastructural studies revealed that a structure similar to the vertebrate nuclear lamina exists in the nuclei of higher plants. However, plant genomes lack genes for lamins and intermediate-type filament proteins, and this suggests that plant-specific nuclear coiled-coil proteins make up the lamina-like structure in plants. NMCP1 is a protein, first identified in Daucus carota cells, that localizes exclusively to the nuclear periphery in interphase cells. It has a tripartite structure comprised of head, rod, and tail domains, and includes putative nuclear localization signal (NLS) motifs. We identified the functional NLS of DcNMCP1 (carrot NMCP1) and determined the protein regions required for localizing to the nuclear periphery using EGFP-fused constructs transiently expressed in Apium graveolens epidermal cells. Transcription was driven under a CaMV35S promoter, and the genes were introduced into the epidermal cells by a DNA-coated microprojectile delivery system. Of the NLS motifs, KRRRK and RRHK in the tail domain were highly functional for nuclear localization. Addition of the N-terminal 141 amino acids from DcNMCP1 shifted the localization of a region including these NLSs from the entire nucleus to the nuclear periphery. Using this same construct, the replacement of amino acids in RRHK or its preceding sequence, YNL, with alanine residues abolished localization to the nuclear periphery, while replacement of KRRRK did not affect localization. The sequence R/Q/HYNLRR/H, including YNL and the first part of the sequence of RRHK, is evolutionarily conserved in a subclass of NMCP1 sequences from many plant species. These results show that NMCP1 localizes to the nuclear periphery by a combined action of a sequence composed of R/Q/HYNLRR/H, NLS, and the N-terminal region including the head and a portion of the rod domain, suggesting that more than one binding site is implicated in localization of NMCP1. PMID:24616728

  7. Formation and Dissociation of the Interstrand i-Motif by the Sequences d(XnC4Ym) Monitored with Electrospray Ionization Mass Spectrometry

    NASA Astrophysics Data System (ADS)

    Cao, Yanwei; Qin, Yujiao; Bruist, Michael; Gao, Shang; Wang, Bing; Wang, Huixin; Guo, Xinhua

    2015-06-01

    Formation and dissociation of the interstrand i-motifs by DNA with the sequence d(XnC4Ym) (X and Y represent thymine, adenine, or guanine, and n, m range from 0 to 2) are studied with electrospray ionization mass spectrometry (ESI-MS), circular dichroism (CD), and UV spectrophotometry. The ion complexes detected in the gas phase and the melting temperatures (Tm) obtained in solution show that a non-C base residue located at the 5' end favors formation of the four-stranded structures, with T > A > G for imparting stability. Comparatively, no rule is found when a non-C base is located at the 3' end. Detection of penta- and hexa-stranded ions indicates the formation of i-motifs with more than four strands. In addition, the i-motifs seen in our mass spectra are accompanied by single-, double-, and triple-stranded ions, and the trimeric ions were always less abundant during the annealing and heat-induced dissociation processes of the DNA strands in solution (pH = 4.5). This provides direct evidence of a strand-by-strand formation and dissociation pathway of the interstrand i-motif, and indicates that formation of the triple strands is the rate-limiting step. In contrast, the trimeric ions are abundant when the tetramolecular ions are subjected to collision-induced dissociation (CID) in the gas phase, suggesting different dissociation behaviors of the interstrand i-motif in the gas phase and in solution. Furthermore, hysteretic UV absorption melting and cooling curves reveal an irreversible dissociation and association kinetic process of the interstrand i-motif in solution.

  8. Tumor-associated mutations in a conserved structural motif alter physical and biochemical properties of human RAD51 recombinase

    PubMed Central

    Chen, Jianhong; Morrical, Milagros D.; Donigan, Katherine A.; Weidhaas, Joanne B.; Sweasy, Joann B.; Averill, April M.; Tomczak, Jennifer A.; Morrical, Scott W.

    2015-01-01

    Human RAD51 protein catalyzes DNA pairing and strand exchange reactions that are central to homologous recombination and homology-directed DNA repair. Successful recombination/repair requires the formation of a presynaptic filament of RAD51 on ssDNA. Mutations in BRCA2 and other proteins that control RAD51 activity are associated with human cancer. Here we describe a set of mutations associated with human breast tumors that occur in a common structural motif of RAD51. Tumor-associated D149N, R150Q and G151D mutations map to a Schellman loop motif located on the surface of the RecA homology domain of RAD51. All three variants are proficient in DNA strand exchange, but G151D is slightly more sensitive to salt than wild-type (WT). Both G151D and R150Q exhibit markedly lower catalytic efficiency for adenosine triphosphate hydrolysis compared to WT. All three mutations alter the physical properties of RAD51 nucleoprotein filaments, with G151D showing the most dramatic changes. G151D forms mixed nucleoprotein filaments with WT RAD51 that have intermediate properties compared to unmixed filaments. These findings raise the possibility that mutations in RAD51 itself may contribute to genome instability in tumor cells, either directly through changes in recombinase properties, or indirectly through changes in interactions with regulatory proteins. PMID:25539919

  9. Molecular cloning and sequence analysis of expansins--a highly conserved, multigene family of proteins that mediate cell wall extension in plants.

    PubMed Central

    Shcherban, T Y; Shi, J; Durachko, D M; Guiltinan, M J; McQueen-Mason, S J; Shieh, M; Cosgrove, D J

    1995-01-01

    Expansins are unusual proteins discovered by virtue of their ability to mediate cell wall extension in plants. We identified cDNA clones for two cucumber expansins on the basis of peptide sequences of proteins purified from cucumber hypocotyls. The expansin cDNAs encode related proteins with signal peptides predicted to direct protein secretion to the cell wall. Northern blot analysis showed moderate transcript abundance in the growing region of the hypocotyl and no detectable transcripts in the nongrowing region. Rice and Arabidopsis expansin cDNAs were identified from collections of anonymous cDNAs (expressed sequence tags). Sequence comparisons indicate at least four distinct expansin cDNAs in rice and at least six in Arabidopsis. Expansins are highly conserved in size and sequence (60-87% amino acid sequence identity and 75-95% similarity between any pairwise comparison), and phylogenetic trees indicate that this multigene family formed before the evolutionary divergence of monocotyledons and dicotyledons. Sequence and motif analyses show no similarities to known functional domains that might account for expansin action on wall extension. A series of highly conserved tryptophans may function in expansin binding to cellulose or other glycans. The high conservation of this multigene family indicates that the mechanism by which expansins promote wall extension tolerates little variation in protein structure. PMID:7568110

  10. Ovodefensins, an Oviduct-Specific Antimicrobial Gene Family, Have Evolved in Birds and Reptiles to Protect the Egg by Both Sequence and Intra-Six-Cysteine Sequence Motif Spacing.

    PubMed

    Whenham, Natasha; Lu, Tian Chee; Maidin, Maisarah B M; Wilson, Peter W; Bain, Maureen M; Stevenson, M Lynn; Stevens, Mark P; Bedford, Michael R; Dunn, Ian C

    2015-06-01

    Ovodefensins are a novel beta defensin-related family of antimicrobial peptides containing conserved glycine and six cysteine residues. Originally thought to be restricted to the albumen-producing region of the avian oviduct, expression was found in chicken, turkey, duck, and zebra finch in large quantities in many parts of the oviduct, but this varied between species and between gene forms in the same species. Using new search strategies, the ovodefensin family now has 35 members, including reptiles, but no representatives outside birds and reptiles have been found. Analysis of their evolution shows that ovodefensins divide into six groups based on the intra-cysteine amino acid spacing, representing a unique mechanism alongside traditional evolution of sequence. The groups have been used to base a nomenclature for the family. Antimicrobial activity for three ovodefensins from chicken and duck was confirmed against Escherichia coli and a pathogenic E. coli strain as well as a Gram-positive organism, Staphylococcus aureus, for the first time. However, activity varied greatly between peptides, with Gallus gallus OvoDA1 being the most potent, suggesting a link with the different structures. Expression of Gallus gallus OvoDA1 (gallin) in the oviduct was increased by estrogen and progesterone and in the reproductive state. Overall, the results support the hypothesis that ovodefensins evolved to protect the egg, but they are not necessarily restricted to the egg white. Therefore, divergent motif structure and sequence present an interesting area of research for antimicrobial peptide design and understanding protection of the cleidoic egg. PMID:25972010

  11. Loop Sequence Context Influences the Formation and Stability of the i-Motif for DNA Oligomers of Sequence (CCCXXX)4, where X = A and/or T, under Slightly Acidic Conditions.

    PubMed

    McKim, Mikeal; Buxton, Alexander; Johnson, Courtney; Metz, Amanda; Sheardy, Richard D

    2016-08-11

    The structure and stability of DNA are highly dependent upon the sequence context of the bases (A, G, C, and T) and the environment under which the DNA is prepared (e.g., buffer, temperature, pH, ionic strength). Understanding the factors that influence structure and stability of the i-motif conformation can lead to the design of DNA sequences with highly tunable properties. We have been investigating the influence of pH and temperature on the conformations and stabilities for all permutations of the DNA sequence (CCCXXX)4, where X = A and/or T, using spectroscopic approaches. All oligomers undergo transitions from single-stranded structures at pH 7.0 to i-motif conformations at pH 5.0 as evidenced by circular dichroism (CD) studies. These folded structures possess stacked C:CH(+) base pairs joined by loops of 5'-XXX-3'. Although the pH at the midpoint of the transition (pHmp) varies slightly with loop sequence, the linkage between pH and log K for the proton induced transition is highly loop sequence dependent. All oligomers also undergo the thermally induced i-motif to single-strand transition at pH 5.0 as the temperature is increased from 25 to 95 °C. The temperature at the midpoint of this transition (Tm) is also highly dependent on loop sequence context effects. For seven of eight possible permutations, the pH-induced and thermally induced transitions appear to be highly cooperative and two state. Analysis of the CD optical melting profiles via a van't Hoff approach reveals sequence-dependent thermodynamic parameters for the unfolding as well. Together, these data reveal that the i-motif conformation exhibits exquisite sensitivity to loop sequence context with respect to formation and stability. PMID:27438583
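
    The "eight possible permutations" mentioned in this record follow from the three loop positions each being A or T (2 x 2 x 2 = 8). The short sketch below enumerates them, assuming that the same XXX loop is repeated in all four CCCXXX units, which is the reading that gives eight oligomers. The function name is hypothetical.

```python
from itertools import product


def cccxxx_oligomers(repeats=4, loop_bases="AT", loop_length=3):
    """List every (CCCXXX)n oligomer with X drawn from loop_bases.

    Assumes one loop sequence per oligomer, repeated in every unit.
    """
    oligomers = []
    for loop in product(loop_bases, repeat=loop_length):
        unit = "CCC" + "".join(loop)      # one C-tract plus its loop
        oligomers.append(unit * repeats)  # e.g. (CCCAAT)4
    return oligomers


if __name__ == "__main__":
    for oligo in cccxxx_oligomers():
        print(oligo, len(oligo))
```

    Each oligomer is 24 nucleotides long; the listing order simply follows itertools.product and carries no experimental meaning.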

  12. Dynamic behavior of an intrinsically unstructured linker domain is conserved in the face of negligible amino acid sequence conservation.

    PubMed

    Daughdrill, Gary W; Narayanaswami, Pranesh; Gilmore, Sara H; Belczyk, Agniezka; Brown, Celeste J

    2007-09-01

    Proteins or regions of proteins that do not form compact globular structures are classified as intrinsically unstructured proteins (IUPs). IUPs are common in nature and have essential molecular functions, but even a limited understanding of the evolution of their dynamic behavior is lacking. The primary objective of this work was to test the evolutionary conservation of dynamic behavior for a particular class of IUPs that form intrinsically unstructured linker domains (IULD) that tether flanking folded domains. This objective was accomplished by measuring the backbone flexibility of several IULD homologues using nuclear magnetic resonance (NMR) spectroscopy. The backbone flexibility of five IULDs, representing three kingdoms, was measured and analyzed. Two IULDs from animals, one IULD from fungi, and two IULDs from plants showed similar levels of backbone flexibility that were consistent with the absence of a compact globular structure. In contrast, the amino acid sequences of the IULDs from these three taxa showed no significant similarity. To investigate how the dynamic behavior of the IULDs could be conserved in the absence of detectable sequence conservation, evolutionary rate studies were performed on a set of nine mammalian IULDs. The results of this analysis showed that many sites in the IULD are evolving neutrally, suggesting that dynamic behavior can be maintained in the absence of natural selection. This work represents the first experimental test of the evolutionary conservation of dynamic behavior and demonstrates that amino acid sequence conservation is not required for the conservation of dynamic behavior and presumably molecular function. PMID:17721672

  13. Mutation of the Conserved Calcium-Binding Motif in Neisseria gonorrhoeae PilC1 Impacts Adhesion but Not Piliation

    PubMed Central

    Cheng, Yuan; Johnson, Michael D. L.; Burillo-Kirch, Christine; Mocny, Jeffrey C.; Anderson, James E.; Garrett, Christopher K.; Redinbo, Matthew R.

    2013-01-01

    Neisseria gonorrhoeae PilC1 is a member of the PilC family of type IV pilus-associated adhesins found in Neisseria species and other type IV pilus-producing genera. Previously, a calcium-binding domain was described in the C-terminal domains of PilY1 of Pseudomonas aeruginosa and in PilC1 and PilC2 of Kingella kingae. Genetic analysis of N. gonorrhoeae revealed a similar calcium-binding motif in PilC1. To evaluate the potential significance of this calcium-binding region in N. gonorrhoeae, we produced recombinant full-length PilC1 and a PilC1 C-terminal domain fragment. We show that, while alterations of the calcium-binding motif disrupted the ability of PilC1 to bind calcium, they did not grossly affect the secondary structure of the protein. Furthermore, we demonstrate that both full-length wild-type PilC1 and full-length calcium-binding-deficient PilC1 inhibited gonococcal adherence to cultured human cervical epithelial cells, unlike the truncated PilC1 C-terminal domain. Similar to PilC1 in K. kingae, but in contrast to the calcium-binding mutant of P. aeruginosa PilY1, an equivalent mutation in N. gonorrhoeae PilC1 produced normal amounts of pili. However, the N. gonorrhoeae PilC1 calcium-binding mutant still had partial defects in gonococcal adhesion to ME180 cells and genetic transformation, which are both essential virulence factors in this human pathogen. Thus, we conclude that calcium binding to PilC1 plays a critical role in pilus function in N. gonorrhoeae. PMID:24002068

  14. A short conserved motif in ALYREF directs cap- and EJC-dependent assembly of export complexes on spliced mRNAs

    PubMed Central

    Gromadzka, Agnieszka M.; Steckelberg, Anna-Lena; Singh, Kusum K.; Hofmann, Kay; Gehring, Niels H.

    2016-01-01

    The export of messenger RNAs (mRNAs) is the final of several nuclear posttranscriptional steps of gene expression. The formation of export-competent mRNPs involves the recruitment of export factors that are assumed to facilitate transport of the mature mRNAs. Using in vitro splicing assays, we show that a core set of export factors, including ALYREF, UAP56 and DDX39, readily associate with the spliced RNAs in an EJC (exon junction complex)- and cap-dependent manner. In order to elucidate how ALYREF and other export adaptors mediate mRNA export, we conducted a computational analysis and discovered four short, conserved, linear motifs present in RNA-binding proteins. We show that mutation in one of the new motifs (WxHD) in an unstructured region of ALYREF reduced RNA binding and abolished the interaction with eIF4A3 and CBP80. Additionally, the mutation impaired proper localization to nuclear speckles and export of a spliced reporter mRNA. Our results reveal important details of the orchestrated recruitment of export factors during the formation of export competent mRNPs. PMID:26773052

  15. Sequence Conservation, Radial Distance and Packing Density in Spherical Viral Capsids

    PubMed Central

    Lee, Chi-Wen; Huang, Tsun-Tsao; Shih, Chung-Shiuan; Hwang, Jenn-Kang

    2015-01-01

    The conservation level of a residue is a useful measure of the importance of that residue in protein structure and function. Much information about sequence conservation comes from aligning homologous sequences. Profiles showing the variation of the conservation level along the sequence are usually interpreted in evolutionary terms and dictated by site similarities of a proper set of homologous sequences. Here, we report that, for viral icosahedral capsids, the sequence conservation profile can be determined by variations in the distances between residues and the centroid of the capsid – with a direct inverse proportionality between the conservation level and the centroid distance – as well as by the spatial variations in local packing density. Examining both the centroid and the packing density models against a dataset of 51 crystal structures of nonhomologous icosahedral capsids, we found that many global patterns and minor features derived from the viral structures are consistent with those present in the sequence conservation profiles. The quantitative link between the level of conservation and structural features like centroid-distance or packing density allows us to look at residue conservation from a structural viewpoint as well as from an evolutionary viewpoint. PMID:26132081
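
    The centroid-distance idea in this record (conservation inversely proportional to a residue's distance from the capsid centroid) can be pictured with a few lines of code. This is a minimal sketch under stated assumptions, not the authors' procedure: the coordinates are toy values, the function names are hypothetical, and scaling the profile so that its maximum is 1 is an arbitrary choice made here for readability.

```python
import math


def centroid(points):
    """Arithmetic mean of a list of (x, y, z) coordinates."""
    n = len(points)
    return tuple(sum(p[axis] for p in points) / n for axis in range(3))


def inverse_distance_profile(residue_coords, center):
    """Return 1/distance to the centre for each residue, scaled to a maximum of 1.

    Raises ValueError if a residue sits exactly on the centre.
    """
    distances = [math.dist(p, center) for p in residue_coords]
    if any(d == 0 for d in distances):
        raise ValueError("residue coincides with the centroid")
    inverse = [1.0 / d for d in distances]
    top = max(inverse)
    return [value / top for value in inverse]


if __name__ == "__main__":
    # Toy "capsid": four points placed symmetrically around the origin.
    capsid = [(1, 0, 0), (-1, 0, 0), (0, 2, 0), (0, -2, 0)]
    center = centroid(capsid)
    print(center)
    print(inverse_distance_profile([(1, 0, 0), (0, 2, 0)], center))
```

    In a real analysis the centroid would presumably be taken over the fully assembled capsid rather than a single subunit, and the resulting profile would be compared with a conservation profile derived from aligned homologous sequences.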

  16. A conserved predicted pseudoknot in the NS2A-encoding sequence of West Nile and Japanese encephalitis flaviviruses suggests NS1' may derive from ribosomal frameshifting

    PubMed Central

    Firth, Andrew E; Atkins, John F

    2009-01-01

    Japanese encephalitis, West Nile, Usutu and Murray Valley encephalitis viruses form a tight subgroup within the larger Flavivirus genus. These viruses utilize a single-polyprotein expression strategy, resulting in ~10 mature proteins. Plotting the conservation at synonymous sites along the polyprotein coding sequence reveals strong conservation peaks at the very 5' end of the coding sequence, and also at the 5' end of the sequence encoding the NS2A protein. Such peaks are generally indicative of functionally important non-coding sequence elements. The second peak corresponds to a predicted stable pseudoknot structure whose biological importance is supported by compensatory mutations that preserve the structure. The pseudoknot is preceded by a conserved slippery heptanucleotide (Y CCU UUU), thus forming a classical stimulatory motif for -1 ribosomal frameshifting. We hypothesize, therefore, that the functional importance of the pseudoknot is to stimulate a portion of ribosomes to shift -1 nt into a short (45 codon), conserved, overlapping open reading frame, termed foo. Since cleavage at the NS1-NS2A boundary is known to require synthesis of NS2A in cis, the resulting transframe fusion protein is predicted to be NS1-NS2AN-term-FOO. We hypothesize that this may explain the origin of the previously identified NS1 'extension' protein in JEV-group flaviviruses, known as NS1'. PMID:19196463
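
    The slippery heptanucleotide given in this record, Y CCU UUU (Y = C or U, spaces marking the zero-frame codons), lends itself to a simple sequence scan. The sketch below is an illustration only: the function name and the example sequences are made up, and the frame check assumes the caller knows where the zero-frame ORF starts.

```python
import re

# Y CCU UUU, with Y = pyrimidine (C or U); a lookahead keeps the scan position-by-position.
SLIPPERY_SITE = re.compile(r"(?=([CU]CCUUUU))")


def find_slippery_sites(sequence, orf_start=0):
    """Return (position, in_frame) for each Y CCU UUU heptanucleotide.

    position is the 0-based index of Y. in_frame is True when the CCU
    that follows Y begins a codon of the ORF starting at orf_start.
    DNA input is accepted and read as RNA (T -> U).
    """
    rna = sequence.upper().replace("T", "U")
    sites = []
    for match in SLIPPERY_SITE.finditer(rna):
        start = match.start()
        in_frame = (start + 1 - orf_start) % 3 == 0
        sites.append((start, in_frame))
    return sites


if __name__ == "__main__":
    toy_rna = "AUGGCUCCUUUUGCAGGC"  # made-up sequence, for illustration only
    print(find_slippery_sites(toy_rna))
```

    A matching heptanucleotide is only a candidate site; the abstract ties the proposed frameshift to the downstream pseudoknot and to conservation at synonymous sites, neither of which this scan examines.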

  17. Secondary structure model of the Mason-Pfizer monkey virus 5' leader sequence: identification of a structural motif common to a variety of retroviruses.

    PubMed Central

    Harrison, G P; Hunter, E; Lever, A M

    1995-01-01

    A stable secondary structure model is presented for the region 3' of the primer-binding site to 130 bases into the gag sequence of the prototype type D retrovirus Mason-Pfizer monkey virus. Using biochemical probing of RNA from this region in association with free energy minimization, we have identified a stem-loop structure in the region, which from other studies has been shown to be important for genomic RNA encapsidation. The structure involves a highly stable stem of five G-C pairs terminating in a heptaloop. Comparison of the Mason-Pfizer monkey virus structure with one predicted for squirrel monkey retrovirus demonstrates an identical stem and a common ACC motif in the loop. Free energy studies of the secondary structure of the 5' regions of eight other retroviruses predict stem loops which have similar GAYC motifs. We believe this may represent a common structural and sequence motif which among other functions may be involved in genomic RNA packaging in these viruses. PMID:7884866

  18. Expression and characterization of EF-hand I loop mutants of aequorin replaced with other loop sequences of Ca2+-binding proteins: an approach to studying the EF-hand motif of proteins.

    PubMed

    Inouye, Satoshi; Sahara-Miura, Yuiko

    2016-07-01

    The binding properties of Ca(2+) to EF-hand I of aequorin (AQ) were characterized by replacing the loop sequence of EF-hand I (AQ[I]) with other known loop sequences of Ca(2+)-binding proteins, including photoproteins (aequorin, clytin-I, clytin-II and mitrocomin), Renilla luciferin-binding protein (RLBP) and calmodulin (CaM). For evaluation of the binding affinity of Ca(2+) to AQ[I] mutants, the half-decay time of the maximum intensity in the luminescence reaction triggered by Ca(2+) was used as an indicator and 22 kinds of AQ[I] mutants were expressed in Escherichia coli cells. AQ[I] mutants replaced with the EF-hand I and EF-hand III from photoproteins showed sufficient luminescence activity, but it was not shown by other EF-hands from RLBP and CaM. An AQ[I] mutant with a lysine or arginine residue at the second position of the non-conserved amino acid residue showed a slow-decay pattern of luminescence, indicating that the Ca(2+)-binding affinity to aequorin was reduced by a positive charge at the second position of the loop sequence. The specific loop sequence of the EF-hand I motif in aequorin caused the specific Ca(2+)-triggered luminescence pattern. PMID:26896488

  19. Nuclear Magnetic Resonance Solution Structures of Lacticin Q and Aureocin A53 Reveal a Structural Motif Conserved among Leaderless Bacteriocins with Broad-Spectrum Activity.

    PubMed

    Acedo, Jeella Z; van Belkum, Marco J; Lohans, Christopher T; Towle, Kaitlyn M; Miskolzie, Mark; Vederas, John C

    2016-02-01

    Lacticin Q (LnqQ) and aureocin A53 (AucA) are leaderless bacteriocins from Lactococcus lactis QU5 and Staphylococcus aureus A53, respectively. These bacteriocins are characterized by the absence of an N-terminal leader sequence and are active against a broad range of Gram-positive bacteria. LnqQ and AucA consist of 53 and 51 amino acids, respectively, and have 47% identical sequences. In this study, their three-dimensional structures were elucidated using solution nuclear magnetic resonance and were shown to consist of four α-helices that assume a very similar compact, globular overall fold (root-mean-square deviation of 1.7 Å) with a highly cationic surface and a hydrophobic core. The structures of LnqQ and AucA resemble the shorter two-component leaderless bacteriocins, enterocins 7A and 7B, despite having low levels of sequence identity. Homology modeling revealed that the observed structural motif may be shared among leaderless bacteriocins with broad-spectrum activity against Gram-positive organisms. The elucidated structures of LnqQ and AucA also exhibit some resemblance to circular bacteriocins. Despite their similar overall fold, inhibition studies showed that LnqQ and AucA have different antimicrobial potency against the Gram-positive strains tested, suggesting that sequence disparities play a crucial role in their mechanisms of action. PMID:26771761

  20. Functional analysis reveals the possible role of the C-terminal sequences and PI motif in the function of lily (Lilium longiflorum) PISTILLATA (PI) orthologues

    PubMed Central

    Chen, Ming-Kun; Hsieh, Wen-Ping; Yang, Chang-Hsien

    2012-01-01

    Two lily (Lilium longiflorum) PISTILLATA (PI) genes, Lily MADS Box Gene 8 and 9 (LMADS8/9), were characterized. LMADS9 lacked 29 C-terminal amino acids including the PI motif that was present in LMADS8. Both LMADS8/9 mRNAs were prevalent in the first and second whorl tepals during all stages of development and were expressed in the stamen only in young flower buds. LMADS8/9 could both form homodimers, but the ability of LMADS8 homodimers to bind to CArG1 was relatively stronger than that of LMADS9 homodimers. 35S:LMADS8 completely, and 35S:LMADS9 only partially, rescued the second whorl petal formation and partially converted the first whorl sepal into a petal-like structure in Arabidopsis pi-1 mutants. Ectopic expression of LMADS8-C (with deletion of the 29 amino acids of the C-terminal sequence) or LMADS8-PI (with only the PI motif deleted) only partially rescued petal formation in pi mutants, which was similar to what was observed in 35S:LMADS9/pi plants. In contrast, 35S:LMADS9+L8C (with the addition of the 29 amino acids of the LMADS8 C-terminal sequence) or 35S:LMADS9+L8PI (with the addition of the LMADS8 PI motif) demonstrated an increased ability to rescue petal formation in pi mutants, which was similar to what was observed in 35S:LMADS8/pi plants. Furthermore, ectopic expression of LMADS8-M (with the MADS domain truncated) generated more severe dominant negative phenotypes than those seen in 35S:LMADS9-M flowers. These results revealed that the 29 amino acids including the PI motif in the C-terminal region of the lily PI orthologue are valuable for its function in regulating perianth organ formation. PMID:22068145

  1. Identification of the First Prokaryotic Collagen Sequence Motif That Mediates Binding to Human Collagen Receptors, Integrins α2β1 and α11β1

    PubMed Central

    Caswell, Clayton C.; Barczyk, Malgorzata; Keene, Douglas R.; Lukomska, Ewa; Gullberg, Donald E.; Lukomski, Slawomir

    2008-01-01

    Many pathogenic bacteria interact with human integrins to enter host cells and to augment host colonization. Group A Streptococcus (GAS) employs molecular mimicry by direct interactions between the cell surface streptococcal collagen-like protein-1 (Scl1) and the human collagen receptor, integrin α2β1. The collagen-like (CL) region of the Scl1 protein mediates integrin binding, although the integrin-binding motif was not defined. Here, we used molecular cloning and site-directed mutagenesis to identify the GLPGER sequence as the α2β1 and the α11β1 binding motif. Electron microscopy experiments mapped binding sites of the recombinant α2-integrin-inserted domain to the GLPGER motif of the recombinant Scl (rScl) protein. rScl proteins and a synthetic peptide harboring the GLPGER motif mediated the attachment of C2C12-α2+ myoblasts expressing the α2β1 integrin as the sole collagen receptor. The C2C12-α11+ myoblasts expressing the α11β1 integrin also attached to GLPGER-harboring rScl proteins. Furthermore, the C2C12-α11+ cells attached to rScl1 more efficiently than C2C12-α2+ cells, suggesting that the α11β1 integrin may have a higher binding affinity for the GLPGER sequence. Human endothelial cells and dermal fibroblasts adhered to rScl proteins, indicating that multiple cell types may recognize and bind the Scl proteins via their collagen receptors. This work is a stepping stone toward defining the utilization of collagen receptors by microbial collagen-like proteins that are expressed by pathogenic bacteria. PMID:18990704

  2. Sequence and spatiotemporal expression analysis of CLE-motif containing genes from the reniform nematode (Rotylenchulus reniformis Linford & Oliveira)

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The reniform nematode, Rotylenchulus reniformis, is a sedentary semi-endoparasitic species with a host range that encompasses more than 77 plant families. Nematode effector proteins containing plant-ligand motifs similar to CLAVATA3/ESR (CLE) peptides have been identified in the Heterodera, Globode...

  3. High sequence conservation among cucumber mosaic virus isolates from lily.

    PubMed

    Chen, Y K; Derks, A F; Langeveld, S; Goldbach, R; Prins, M

    2001-08-01

    For classification of Cucumber mosaic virus (CMV) isolates from ornamental crops of different geographical areas, these were characterized by comparing the nucleotide sequences of RNAs 4 and the encoded coat proteins. Within the ornamental-infecting CMV viruses both subgroups were represented. CMV isolates of Alstroemeria and crocus were classified as subgroup II isolates, whereas 8 other isolates, from lily, gladiolus, amaranthus, larkspur, and lisianthus, were identified as subgroup I members. In general, nucleotide sequence comparisons correlated well with geographic distribution, with one notable exception: the analyzed nucleotide sequences of 5 lily isolates showed remarkably high homology despite different origins. PMID:11676424

  4. Structural and sequence similarities of hydra xeroderma pigmentosum A protein to human homolog suggest early evolution and conservation.

    PubMed

    Barve, Apurva; Ghaskadbi, Saroj; Ghaskadbi, Surendra

    2013-01-01

    Xeroderma pigmentosum group A (XPA) is a protein that binds to damaged DNA, verifies presence of a lesion, and recruits other proteins of the nucleotide excision repair (NER) pathway to the site. Though its homologs from yeast, Drosophila, humans, and so forth are well studied, XPA has not so far been reported from protozoa and lower animal phyla. Hydra is a fresh-water cnidarian with a remarkable capacity for regeneration and apparent lack of organismal ageing. Cnidarians are among the first metazoa with a defined body axis, tissue grade organisation, and nervous system. We report here for the first time presence of XPA gene in hydra. Putative protein sequence of hydra XPA contains nuclear localization signal and bears the zinc-finger motif. It contains two conserved Pfam domains and various characterized features of XPA proteins like regions for binding to excision repair cross-complementing protein-1 (ERCC1) and replication protein A 70 kDa subunit (RPA70) proteins. Hydra XPA shows a high degree of similarity with vertebrate homologs and clusters with deuterostomes in phylogenetic analysis. Homology modelling corroborates the very close similarity between hydra and human XPA. The protein thus most likely functions in hydra in the same manner as in other animals, indicating that it arose early in evolution and has been conserved across animal phyla. PMID:24083246

  5. Structural and Sequence Similarities of Hydra Xeroderma Pigmentosum A Protein to Human Homolog Suggest Early Evolution and Conservation

    PubMed Central

    Ghaskadbi, Saroj

    2013-01-01

    Xeroderma pigmentosum group A (XPA) is a protein that binds to damaged DNA, verifies presence of a lesion, and recruits other proteins of the nucleotide excision repair (NER) pathway to the site. Though its homologs from yeast, Drosophila, humans, and so forth are well studied, XPA has not so far been reported from protozoa and lower animal phyla. Hydra is a fresh-water cnidarian with a remarkable capacity for regeneration and apparent lack of organismal ageing. Cnidarians are among the first metazoa with a defined body axis, tissue grade organisation, and nervous system. We report here for the first time presence of XPA gene in hydra. Putative protein sequence of hydra XPA contains nuclear localization signal and bears the zinc-finger motif. It contains two conserved Pfam domains and various characterized features of XPA proteins like regions for binding to excision repair cross-complementing protein-1 (ERCC1) and replication protein A 70 kDa subunit (RPA70) proteins. Hydra XPA shows a high degree of similarity with vertebrate homologs and clusters with deuterostomes in phylogenetic analysis. Homology modelling corroborates the very close similarity between hydra and human XPA. The protein thus most likely functions in hydra in the same manner as in other animals, indicating that it arose early in evolution and has been conserved across animal phyla. PMID:24083246

  6. Mutations in a Highly Conserved Motif of nsp1β Protein Attenuate the Innate Immune Suppression Function of Porcine Reproductive and Respiratory Syndrome Virus

    PubMed Central

    Li, Yanhua; Shyu, Duan-Liang; Shang, Pengcheng; Bai, Jianfa; Ouyang, Kang; Dhakal, Santosh; Hiremath, Jagadish; Binjawadagi, Basavaraj

    2016-01-01

    ABSTRACT Porcine reproductive and respiratory syndrome virus (PRRSV) nonstructural protein 1β (nsp1β) is a multifunctional viral protein, which is involved in suppressing the host innate immune response and activating a unique −2/−1 programmed ribosomal frameshifting (PRF) signal for the expression of frameshifting products. In this study, site-directed mutagenesis analysis showed that the R128A or R129A mutation introduced into a highly conserved motif (123GKYLQRRLQ131) reduced the ability of nsp1β to suppress interferon beta (IFN-β) activation and also impaired nsp1β's function as a PRF transactivator. Three recombinant viruses, vR128A, vR129A, and vRR129AA, carrying single or double mutations in the GKYLQRRLQ motif were characterized. In comparison to the wild-type (WT) virus, vR128A and vR129A showed slightly reduced growth abilities, while the vRR129AA mutant had a significantly reduced growth ability in infected cells. Consistent with the attenuated growth phenotype in vitro, pigs infected with nsp1β mutants had lower levels of viremia than did WT virus-infected pigs. Compared to the WT virus in infected cells, all three mutated viruses stimulated high levels of IFN-α expression and exhibited a reduced ability to suppress the mRNA expression of selected interferon-stimulated genes (ISGs). In pigs infected with nsp1β mutants, IFN-α production was increased in the lungs at early time points postinfection, which was correlated with increased innate NK cell function. Furthermore, the augmented innate response was consistent with the increased production of IFN-γ in pigs infected with mutated viruses. These data demonstrate that residues R128 and R129 are critical for nsp1β function and that modifying these key residues in the GKYLQRRLQ motif attenuates virus growth ability and improves the innate and adaptive immune responses in infected animals. IMPORTANCE PRRSV infection induces poor antiviral innate IFN and cytokine responses, which results in

  7. MIDDAS-M: Motif-Independent De Novo Detection of Secondary Metabolite Gene Clusters through the Integration of Genome Sequencing and Transcriptome Data

    PubMed Central

    Umemura, Myco; Koike, Hideaki; Nagano, Nozomi; Ishii, Tomoko; Kawano, Jin; Yamane, Noriko; Kozone, Ikuko; Horimoto, Katsuhisa; Shin-ya, Kazuo; Asai, Kiyoshi; Yu, Jiujiang; Bennett, Joan W.; Machida, Masayuki

    2013-01-01

    Many bioactive natural products are produced as “secondary metabolites” by plants, bacteria, and fungi. During the middle of the 20th century, several secondary metabolites from fungi revolutionized the pharmaceutical industry, for example, penicillin, lovastatin, and cyclosporine. They are generally biosynthesized by enzymes encoded by clusters of coordinately regulated genes, and several motif-based methods have been developed to detect secondary metabolite biosynthetic (SMB) gene clusters using the sequence information of typical SMB core genes such as polyketide synthases (PKS) and non-ribosomal peptide synthetases (NRPS). However, no detection method exists for SMB gene clusters that are functional and do not include core SMB genes at present. To advance the exploration of SMB gene clusters, especially those without known core genes, we developed MIDDAS-M, a motif-independent de novo detection algorithm for SMB gene clusters. We integrated virtual gene cluster generation in an annotated genome sequence with highly sensitive scoring of the cooperative transcriptional regulation of cluster member genes. MIDDAS-M accurately predicted 38 SMB gene clusters that have been experimentally confirmed and/or predicted by other motif-based methods in 3 fungal strains. MIDDAS-M further identified a new SMB gene cluster for ustiloxin B, which was experimentally validated. Sequence analysis of the cluster genes indicated a novel mechanism for peptide biosynthesis independent of NRPS. Because it is fully computational and independent of empirical knowledge about SMB core genes, MIDDAS-M allows a large-scale, comprehensive analysis of SMB gene clusters, including those with novel biosynthetic mechanisms that do not contain any functionally characterized genes. PMID:24391870

  8. PscanChIP: finding over-represented transcription factor-binding site motifs and their correlations in sequences from ChIP-Seq experiments

    PubMed Central

    Zambelli, Federico; Pesole, Graziano; Pavesi, Giulio

    2013-01-01

    Chromatin immunoprecipitation followed by sequencing with next-generation technologies (ChIP-Seq) has become the de facto standard for building genome-wide maps of regions bound by a given transcription factor (TF). The regions identified, however, have to be further analyzed to determine the actual DNA-binding sites for the TF, as well as sites for other TFs belonging to the same TF complex or in general co-operating or interacting with it in transcription regulation. PscanChIP is a web server that, starting from a collection of genomic regions derived from a ChIP-Seq experiment, scans them using motif descriptors like JASPAR or TRANSFAC position-specific frequency matrices, or descriptors uploaded by users, and it evaluates both motif enrichment and positional bias within the regions according to different measures and criteria. PscanChIP can successfully identify not only the actual binding sites for the TF investigated by a ChIP-Seq experiment but also secondary motifs corresponding to other TFs that tend to bind the same regions, and, if present, precise positional correlations among their respective sites. The web interface is free for use, and there is no login requirement. It is available at http://www.beaconlab.it/pscan_chip_dev. PMID:23748563
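    The matrix-based scanning step described above can be illustrated with a minimal sketch. It is not PscanChIP's implementation: the count matrix below is invented for illustration (it is not a JASPAR or TRANSFAC profile), the background is assumed uniform, and only motif scoring is shown, not the enrichment or positional-bias statistics.

```python
import math

# Hypothetical position-specific frequency matrix, one dict per motif position.
# The counts are invented for illustration only.
COUNTS = [
    {"A": 8, "C": 1, "G": 0, "T": 1},
    {"A": 0, "C": 9, "G": 1, "T": 0},
    {"A": 1, "C": 0, "G": 9, "T": 0},
    {"A": 0, "C": 1, "G": 0, "T": 9},
]
BACKGROUND = 0.25   # assumed uniform base composition
PSEUDOCOUNT = 0.5   # avoids log(0) for unseen bases


def log_odds_matrix(counts):
    """Convert raw counts to log2 odds scores against the background."""
    matrix = []
    for column in counts:
        total = sum(column.values()) + 4 * PSEUDOCOUNT
        matrix.append({
            base: math.log2(((column[base] + PSEUDOCOUNT) / total) / BACKGROUND)
            for base in "ACGT"
        })
    return matrix


def scan(sequence, matrix):
    """Return (offset, score) for every window of an uppercase ACGT sequence."""
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        hits.append((start, sum(matrix[i][base] for i, base in enumerate(window))))
    return hits


def best_hit(sequence, matrix):
    """Return the highest-scoring (offset, score) pair."""
    return max(scan(sequence, matrix), key=lambda hit: hit[1])
```

    Calling best_hit on each ChIP-Seq region gives one score and one offset per region; comparing those scores against a background set, and looking at where the offsets fall relative to the region centre, would be the next steps in an enrichment analysis of this kind.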

  9. A unique transactivation sequence motif is found in the carboxyl-terminal domain of the single-strand-binding protein FBP.

    PubMed Central

    Duncan, R; Collins, I; Tomonaga, T; Zhang, T; Levens, D

    1996-01-01

    The far-upstream element-binding protein (FBP) is one of several recently described factors which bind to a single strand of DNA in the 5' region of the c-myc gene. Although cotransfection of FBP increases expression from a far-upstream element-bearing c-myc promoter reporter, the mechanism of this stimulation is heretofore unknown. Can a single-strand-binding protein function as a classical transactivator, or are these proteins restricted to stabilizing or altering the conformation of DNA in an architectural role? Using chimeric GAL4-FBP fusion proteins we have shown that the carboxyl-terminal region (residues 448 to 644) is a potent transcriptional activation domain. This region contains three copies of a unique amino acid sequence motif containing tyrosine diads. Analysis of deletion mutants demonstrated that a single tyrosine motif alone (residues 609 to 644) was capable of activating transcription. The activation property of the C-terminal domain is repressed by the N-terminal 107 amino acids of FBP. These results show that FBP contains a transactivation domain which can function alone, suggesting that FBP contributes directly to c-myc transcription while bound to a single-strand site. Furthermore, activation is mediated by a new motif which can be negatively regulated by a repression domain of FBP. PMID:8628294

  10. The valine and lysine residues in the conserved FxVTxK motif are important for the function of phylogenetically distant plant cellulose synthases.

    PubMed

    Slabaugh, Erin; Scavuzzo-Duggan, Tess; Chaves, Arielle; Wilson, Liza; Wilson, Carmen; Davis, Jonathan K; Cosgrove, Daniel J; Anderson, Charles T; Roberts, Alison W; Haigler, Candace H

    2016-05-01

    Cellulose synthases (CESAs) synthesize the β-1,4-glucan chains that coalesce to form cellulose microfibrils in plant cell walls. In addition to a large cytosolic (catalytic) domain, CESAs have eight predicted transmembrane helices (TMHs). However, analogous to the structure of BcsA, a bacterial CESA, predicted TMH5 in CESA may instead be an interfacial helix. This would place the conserved FxVTxK motif in the plant cell cytosol where it could function as a substrate-gating loop as occurs in BcsA. To define the functional importance of the CESA region containing FxVTxK, we tested five parallel mutations in Arabidopsis thaliana CESA1 and Physcomitrella patens CESA5 in complementation assays of the relevant cesa mutants. In both organisms, the substitution of the valine or lysine residues in FxVTxK severely affected CESA function. In Arabidopsis roots, both changes were correlated with lower cellulose anisotropy, as revealed by Pontamine Fast Scarlet. Analysis of hypocotyl inner cell wall layers by atomic force microscopy showed that two altered versions of Atcesa1 could rescue cell wall phenotypes observed in the mutant background line. Overall, the data show that the FxVTxK motif is functionally important in two phylogenetically distant plant CESAs. The results show that Physcomitrella provides an efficient model for assessing the effects of engineered CESA mutations affecting primary cell wall synthesis and that diverse testing systems can lead to nuanced insights into CESA structure-function relationships. Although CESA membrane topology needs to be experimentally determined, the results support the possibility that the FxVTxK region functions similarly in CESA and BcsA. PMID:26646446

  11. The Evolutionarily Conserved Tre2/Bub2/Cdc16 (TBC), Lysin Motif (LysM), Domain Catalytic (TLDc) Domain Is Neuroprotective against Oxidative Stress*

    PubMed Central

    Finelli, Mattéa J.; Sanchez-Pulido, Luis; Liu, Kevin X; Davies, Kay E.; Oliver, Peter L.

    2016-01-01

    Oxidative stress is a pathological feature of many neurological disorders; therefore, utilizing proteins that are protective against such cellular insults is a potentially valuable therapeutic approach. Oxidation resistance 1 (OXR1) has been shown previously to be critical for oxidative stress resistance in neuronal cells; deletion of this gene causes neurodegeneration in mice, yet conversely, overexpression of OXR1 is protective in cellular and mouse models of amyotrophic lateral sclerosis. However, the molecular mechanisms involved are unclear. OXR1 contains the Tre2/Bub2/Cdc16 (TBC), lysin motif (LysM), domain catalytic (TLDc) domain, a motif present in a family of proteins including TBC1 domain family member 24 (TBC1D24), a protein mutated in a range of disorders characterized by seizures, hearing loss, and neurodegeneration. The TLDc domain is highly conserved across species, although the structure-function relationship is unknown. To understand the role of this domain in the stress response, we carried out systematic analysis of all mammalian TLDc domain-containing proteins, investigating their expression and neuroprotective properties in parallel. In addition, we performed a detailed structural and functional study of this domain in which we identified key residues required for its activity. Finally, we present a new mouse insertional mutant of Oxr1, confirming that specific disruption of the TLDc domain in vivo is sufficient to cause neurodegeneration. Our data demonstrate that the integrity of the TLDc domain is essential for conferring neuroprotection, an important step in understanding the functional significance of all TLDc domain-containing proteins in the cellular stress response and disease. PMID:26668325

  12. Sequence and peptide-binding motif for a variant of HLA-A*0214 (A*02142) in an HIV-1-resistant individual from the Nairobi Sex Worker cohort.

    PubMed

    Luscher, M A; MacDonald, K S; Bwayo, J J; Plummer, F A; Barber, B H

    2001-02-01

    As part of the ongoing study of natural HIV-1 resistance in the women of the Nairobi Sex Workers' study, we have examined a resistance-associated HLA class I allele at the molecular level. Typing by polymerase chain reaction using sequence-specific primers determined that this molecule is closely related to HLA-A*0214, one of a family of HLA-A2 supertype alleles which correlate with HIV-1 resistance in this population. Direct nucleotide sequencing shows that this molecule differs from A*0214, having a silent nucleotide substitution. We therefore propose to designate it HLA-A*02142. We have determined the peptide-binding motif of HLA-A*0214/02142 by peptide elution and bulk Edman degradative sequencing. The resulting motif, X-[Q,V]-X-X-X-K-X-X-[V,L], includes lysine as an anchor at position 6. The data complement available information on the peptide-binding characteristics of this molecule, and will be of use in identifying antigenic peptides from HIV-1 and other pathogens. PMID:11261925
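    As an illustration of how a motif written in this notation can be used to screen protein sequences for candidate peptides, the sketch below translates it into a regular expression. The function names are hypothetical, and a motif match is only a first filter, not a binding prediction.

```python
import re

MOTIF = "X-[Q,V]-X-X-X-K-X-X-[V,L]"  # notation as given in the abstract


def motif_to_regex(motif):
    """Translate the dash-separated motif notation into a compiled regex."""
    parts = []
    for position in motif.split("-"):
        if position == "X":
            parts.append(".")  # any residue
        elif position.startswith("["):
            # "[Q,V]" becomes "[QV]": a set of allowed residues
            parts.append("[" + position.strip("[]").replace(",", "") + "]")
        else:
            parts.append(re.escape(position))  # a single fixed residue
    return re.compile("".join(parts))


def find_candidates(protein, pattern):
    """Return (start, peptide) for every matching window, overlaps included."""
    # A lookahead lets overlapping 9-mers be reported, which finditer
    # on the plain pattern would skip.
    lookahead = re.compile("(?=(" + pattern.pattern + "))")
    return [(m.start(), m.group(1)) for m in lookahead.finditer(protein)]
```

    For example, find_candidates("AAQAAAKAAVGG", motif_to_regex(MOTIF)) reports the single 9-mer starting at offset 1 of that invented test sequence.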

  13. Sequence motif upstream of the Hendra virus fusion protein cleavage site is not sufficient to promote efficient proteolytic processing

    SciTech Connect

    Craft, Willie Warren; Dutch, Rebecca Ellis (E-mail: rdutc2@uky.edu)

    2005-10-10

    The Hendra virus fusion (HeV F) protein is synthesized as a precursor, F0, and proteolytically cleaved into the mature F1 and F2 heterodimer, following an HDLVDGVK109 motif. This cleavage event is required for fusogenic activity. To determine the amino acid requirements for processing of the HeV F protein, we constructed multiple mutants. Individual and simultaneous alanine substitutions of the eight residues immediately upstream of the cleavage site did not eliminate processing. A chimeric SV5 F protein in which the furin site was substituted for the VDGVK109 motif of the HeV F protein was not processed but was expressed on the cell surface. Another chimeric SV5 F protein containing the HDLVDGVK109 motif of the HeV F protein underwent partial cleavage. These data indicate that the upstream region can play a role in protease recognition, but is neither absolutely required nor sufficient for efficient processing of the HeV F protein.

  14. Conservation of the function counts: homologous neurons express sequence-related neuropeptides that originate from different genes.

    PubMed

    Neupert, Susanne; Huetteroth, Wolf; Schachtner, Joachim; Predel, Reinhard

    2009-11-01

    By means of single-cell matrix assisted laser desorption/ionization time-of-flight mass spectrometry, we analysed neuropeptide expression in all FXPRLamide/pheromone biosynthesis activating neuropeptide synthesizing neurons of the adult tobacco hawk moth, Manduca sexta. Mass spectra clearly suggest a completely identical processing of the pheromone biosynthesis activating neuropeptide-precursor in the mandibular, maxillary and labial neuromeres of the subesophageal ganglion. Only in the pban-neurons of the labial neuromere, products of two neuropeptide genes, namely the pban-gene and the capa-gene, were detected. Both of these genes expressed, amongst others, sequence-related neuropeptides (extended WFGPRLamides). We speculate that the expression of the two neuropeptide genes is a plesiomorph character typical of moths. A detailed examination of the neuroanatomy and the peptidome of the (two) pban-neurons in the labial neuromere of moths with homologous neurons of different insects indicates a strong conservation of the function of this neuroendocrine system. In other insects, however, the labial neurons either express products of the fxprl-gene or products of the capa-gene. The processing of the respective genes is reduced to extended WFGPRLamides in each case and yields a unique peptidome in the labial cells. Thus, sequence-related messenger molecules are always produced in these cells and it seems that the respective neurons recruited different neuropeptide genes for this motif. PMID:19712058

  15. GeneMarkS: a self-training method for prediction of gene starts in microbial genomes. Implications for finding sequence motifs in regulatory regions.

    PubMed

    Besemer, J; Lomsadze, A; Borodovsky, M

    2001-06-15

    Improving the accuracy of prediction of gene starts is one of a few remaining open problems in computer prediction of prokaryotic genes. Its difficulty is caused by the absence of relatively strong sequence patterns identifying true translation initiation sites. In the current paper we show that the accuracy of gene start prediction can be improved by combining models of protein-coding and non-coding regions and models of regulatory sites near gene start within an iterative Hidden Markov model based algorithm. The new gene prediction method, called GeneMarkS, utilizes a non-supervised training procedure and can be used for a newly sequenced prokaryotic genome with no prior knowledge of any protein or rRNA genes. The GeneMarkS implementation uses an improved version of the gene finding program GeneMark.hmm, heuristic Markov models of coding and non-coding regions and the Gibbs sampling multiple alignment program. GeneMarkS predicted precisely 83.2% of the translation starts of GenBank annotated Bacillus subtilis genes and 94.4% of translation starts in an experimentally validated set of Escherichia coli genes. We have also observed that GeneMarkS detects prokaryotic genes, in terms of identifying open reading frames containing real genes, with an accuracy matching the level of the best currently used gene detection methods. Accurate translation start prediction, in addition to the refinement of protein sequence N-terminal data, provides the benefit of precise positioning of the sequence region situated upstream to a gene start. Therefore, sequence motifs related to transcription and translation regulatory sites can be revealed and analyzed with higher precision. These motifs were shown to possess a significant variability, the functional and evolutionary connections of which are discussed. PMID:11410670

  16. Analysis of BAC-end sequences in common bean (Phaseolus vulgaris L.) towards the development and characterization of long motifs SSRs.

    PubMed

    Müller, Bárbara Salomão de Faria; Sakamoto, Tetsu; de Menezes, Ivandilson Pessoa Pinto; Prado, Guilherme Souza; Martins, Wellington Santos; Brondani, Claudio; de Barros, Everaldo Gonçalves; Vianello, Rosana Pereira

    2014-11-01

    The increasing volume of genomic data on the Phaseolus vulgaris species has contributed to its importance as a model genetic species and positively affected the investigation of other legumes of scientific and economic value. To expand and gain a more in-depth knowledge of the common bean genome, the ends of a number of bacterial artificial chromosomes (BACs) were sequenced and annotated, and the presence of repetitive sequences was determined. In total, 52,270 BESs (BAC-end sequences), equivalent to 32 Mbp (~6%) of the genome, were processed. In total, 3,789 BES-SSRs were identified, with a distribution of one SSR (simple sequence repeat) per 8.36 kbp, and 2,000 were suitable for the development of SSRs, of which 194 were evaluated in low-resolution screening. Of 40 BES-SSRs based on long-motif SSRs (≥ trinucleotides) analyzed in high-resolution genotyping, 34 showed equally good amplification for the Andean and the Mesoamerican gene pools, exhibiting an average gene diversity (HE) of 0.490 and 5.59 alleles/locus, of which six, classified as Class I, showed an HE ≥ 0.7. The PCoA and structure analysis discriminated the gene pools (K = 2, FST = 0.733). Of the 52,270 BESs, 2% corresponded to transcription factors and 3% to transposable elements. Putative functions were identified for 24,321 BESs, and functional categories (gene ontology) were assigned to 19,363. This study identified highly polymorphic BES-SSRs containing tri- to hexanucleotide motifs that bring together relevant genetic characteristics useful for breeding programs. Additionally, the BESs were incorporated into the international genome-sequencing project for the common bean. PMID:25164100
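    Two of the quantities in this abstract can be sketched in a few lines: locating tri- to hexanucleotide tandem repeats in a sequence, and computing gene diversity (expected heterozygosity) from allele calls. Both functions are illustrative assumptions rather than the authors' pipeline. The repeat threshold of four copies is arbitrary, and the diversity formula is the uncorrected form, 1 minus the sum of squared allele frequencies; the abstract does not say which estimator was used.

```python
import re
from collections import Counter


def is_primitive(unit):
    """True when the unit is not itself a repeat of a shorter unit (e.g. not 'AAA')."""
    return not any(
        len(unit) % k == 0 and unit == unit[:k] * (len(unit) // k)
        for k in range(1, len(unit))
    )


def find_ssrs(sequence, min_unit=3, max_unit=6, min_repeats=4):
    """Return (start, unit, repeat_count) for tandem repeats in an uppercase ACGT string."""
    # The lazy quantifier makes the shortest repeating unit win.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(sequence):
        unit = match.group(1)
        if is_primitive(unit):
            hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits


def gene_diversity(alleles):
    """Uncorrected expected heterozygosity: 1 - sum of squared allele frequencies."""
    counts = Counter(alleles)
    total = sum(counts.values())
    return 1.0 - sum((n / total) ** 2 for n in counts.values())
```

    With two equally frequent alleles, gene_diversity returns 0.5; a locus with many alleles at similar frequencies approaches 1, which is why loci with high values are the most informative for genotyping.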

  17. CONSERVED SEQUENCE IN THE AGGRECAN INTERGLOBULAR DOMAIN MODULATES CLEAVAGE BY ADAMTS-4 AND ADAMTS-5

    PubMed Central

    Miwa, Hazuki E; Gerken, Thomas A; Huynh, Tru D; Duesler, Lori R; Cotter, Meghan; Hering, Thomas M.

    2008-01-01

    Background Cleavage of aggrecan by ADAMTS proteinases at specific sites within highly conserved regions may be important to normal physiological enzyme functions, as well as pathological degradation. Methods To examine ADAMTS selectivity, we assayed ADAMTS-4 and -5 cleavage of recombinant bovine aggrecan mutated at amino acids N-terminal or C-terminal to the interglobular domain cleavage site. Results Mutations of conserved amino acids from P18 to P12 to increase hydrophilicity resulted in ADAMTS-4 cleavage inhibition. Mutation of Thr, but not Asn within the conserved N-glycosylation motif Asn-Ile-Thr from P6 to P4 enhanced cleavage. Mutation of conserved Thr residues from P22 to P17 to increase hydrophobicity enhanced ADAMTS-4 cleavage. A P4′ Ser377Gln mutant inhibited cleavage by ADAMTS-4 and -5, while a neutral Ser377Ala mutant and species mimicking mutants Ser377Thr, Ser377Asn, and Arg375Leu were cleaved normally by ADAMTS-4. The Ser377Thr mutant, however, was resistant to cleavage by ADAMTS-5. Conclusion We have identified multiple conserved amino acids within regions N- and C-terminal to the site of scission that may influence enzyme-substrate recognition, and may interact with exosites on ADAMTS-4 and ADAMTS-5. General Significance Inhibition of the binding of ADAMTS-4 and ADAMTS-5 exosites to aggrecan should be explored as a therapeutic intervention for osteoarthritis. PMID:19101611

  18. Bioinformatics Approaches for Predicting Disordered Protein Motifs.

    PubMed

    Bhowmick, Pallab; Guharoy, Mainak; Tompa, Peter

    2015-01-01

    Short, linear motifs (SLiMs) in proteins are functional microdomains consisting of contiguous residue segments along the protein sequence, typically not more than 10 consecutive amino acids in length with less than 5 defined positions. Many positions are 'degenerate' thus offering flexibility in terms of the amino acid types allowed at those positions. Their short length and degenerate nature confers evolutionary plasticity meaning that SLiMs often evolve convergently. Further, SLiMs have a propensity to occur within intrinsically unstructured protein segments and this confers versatile functionality to unstructured regions of the proteome. SLiMs mediate multiple types of protein interactions based on domain-peptide recognition and guide functions including posttranslational modifications, subcellular localization of proteins, and ligand binding. SLiMs thus behave as modular interaction units that confer versatility to protein function and SLiM-mediated interactions are increasingly being recognized as therapeutic targets. In this chapter we start with a brief description about the properties of SLiMs and their interactions and then move on to discuss algorithms and tools including several web-based methods that enable the discovery of novel SLiMs (de novo motif discovery) as well as the prediction of novel occurrences of known SLiMs. Both individual amino acid sequences as well as sets of protein sequences can be scanned using these methods to obtain statistically overrepresented sequence patterns. Lists of putatively functional SLiMs are then assembled based on parameters such as evolutionary sequence conservation, disorder scores, structural data, gene ontology terms and other contextual information that helps to assess the functional credibility or significance of these motifs. These bioinformatics methods should certainly guide experiments aimed at motif discovery. PMID:26387106
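    The idea of scanning a set of protein sequences for recurring short patterns can be shown in its crudest form by counting exact k-mers shared across the set. This is only a sketch under simplifying assumptions: the function name and thresholds are hypothetical, degenerate positions are not handled, and no statistical background model, conservation score, or disorder score is applied, all of which the tools discussed in the chapter rely on.

```python
from collections import Counter


def shared_kmers(sequences, k=4, min_fraction=0.75):
    """Return k-mers present in at least min_fraction of the sequences."""
    support = Counter()
    for seq in sequences:
        # Count each k-mer once per sequence so long proteins do not dominate.
        support.update({seq[i:i + k] for i in range(len(seq) - k + 1)})
    threshold = min_fraction * len(sequences)
    return sorted(
        (kmer for kmer, n in support.items() if n >= threshold),
        key=lambda kmer: (-support[kmer], kmer),
    )
```

    Candidates returned this way would still need the contextual filtering the abstract lists before any of them could be called a putative functional motif.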

  19. One exon of the human LSF gene includes conserved regions involved in novel DNA-binding and dimerization motifs.

    PubMed Central

    Shirra, M K; Zhu, Q; Huang, H C; Pallas, D; Hansen, U

    1994-01-01

    The transcription factor LSF, identified as a HeLa protein that binds the simian virus 40 late promoter, recognizes direct repeats with a center-to-center spacing of 10 bp. The characterization of two human cDNAs, representing alternatively spliced mRNAs, provides insight into the unusual DNA-binding and oligomerization properties of LSF. The sequence of the full-length LSF is identical to that of the transcription factors alpha CP2 and LBP-1c and has similarity to the Drosophila transcription factor Elf-1/NTF-1. Using an epitope-counting method, we show that LSF binds DNA as a homodimer. LSF-ID, which is identical to LBP-1d, contains an in-frame internal deletion of 51 amino acids resulting from alternative mRNA splicing. Unlike LSF, LSF-ID did not bind LSF DNA-binding sites. Furthermore, LSF-ID did not affect the binding of LSF to DNA, suggesting that the two proteins do not interact. Of three short regions with a high degree of homology between LSF and Elf-1/NTF-1, LSF-ID lacks two, which are predicted to form beta-strands. Double amino acid substitutions in each of these regions eliminated specific DNA-binding activity, similarly to the LSF-ID deletion. The dimerization potential of these mutants was measured both by the ability to inhibit the binding of LSF to DNA and by direct protein-protein interaction studies. Mutations in one homology region, but not the other, functionally eliminated dimerization. PMID:8035790

  20. Reversibly bound chloride in the atrial natriuretic peptide receptor hormone-binding domain: Possible allosteric regulation and a conserved structural motif for the chloride-binding site

    PubMed Central

    Ogawa, Haruo; Qiu, Yue; Philo, John S; Arakawa, Tsutomu; Ogata, Craig M; Misono, Kunio S

    2010-01-01

    The binding of atrial natriuretic peptide (ANP) to its receptor requires chloride, and it is chloride concentration dependent. The extracellular domain (ECD) of the ANP receptor (ANPR) contains a chloride near the ANP-binding site, suggesting a possible regulatory role. The bound chloride, however, is completely buried in the polypeptide fold, and its functional role has remained unclear. Here, we have confirmed that chloride is necessary for ANP binding to the recombinant ECD or the full-length ANPR expressed in CHO cells. ECD without chloride (ECD(−)) did not bind ANP. Its binding activity was fully restored by bromide or chloride addition. A new X-ray structure of the bromide-bound ECD is essentially identical to that of the chloride-bound ECD. Furthermore, bromide atoms are localized at the same positions as chloride atoms both in the apo and in the ANP-bound structures, indicating exchangeable and reversible halide binding. Far-UV CD and thermal unfolding data show that ECD(−) largely retains the native structure. Sedimentation equilibrium in the absence of chloride shows that ECD(−) forms a strongly associated dimer, possibly preventing the structural rearrangement of the two monomers that is necessary for ANP binding. The primary and tertiary structures of the chloride-binding site in ANPR are highly conserved among receptor-guanylate cyclases and metabotropic glutamate receptors. The chloride-dependent ANP binding, reversible chloride binding, and the highly conserved chloride-binding site motif suggest a regulatory role for the receptor bound chloride. Chloride-dependent regulation of ANPR may operate in the kidney, modulating ANP-induced natriuresis. PMID:20066666

  1. Reversibly Bound Chloride in the Atrial Natriuretic Peptide Receptor Hormone Binding Domain: Possible Allosteric Regulation and a Conserved Structural Motif for the Chloride-binding Site

    SciTech Connect

    Ogawa, H.; Qiu, Y; Philo, J; Arakawa, T; Ogata, C; Misono, K

    2010-01-01

    The binding of atrial natriuretic peptide (ANP) to its receptor requires chloride, and it is chloride concentration dependent. The extracellular domain (ECD) of the ANP receptor (ANPR) contains a chloride near the ANP-binding site, suggesting a possible regulatory role. The bound chloride, however, is completely buried in the polypeptide fold, and its functional role has remained unclear. Here, we have confirmed that chloride is necessary for ANP binding to the recombinant ECD or the full-length ANPR expressed in CHO cells. ECD without chloride (ECD(-)) did not bind ANP. Its binding activity was fully restored by bromide or chloride addition. A new X-ray structure of the bromide-bound ECD is essentially identical to that of the chloride-bound ECD. Furthermore, bromide atoms are localized at the same positions as chloride atoms both in the apo and in the ANP-bound structures, indicating exchangeable and reversible halide binding. Far-UV CD and thermal unfolding data show that ECD(-) largely retains the native structure. Sedimentation equilibrium in the absence of chloride shows that ECD(-) forms a strongly associated dimer, possibly preventing the structural rearrangement of the two monomers that is necessary for ANP binding. The primary and tertiary structures of the chloride-binding site in ANPR are highly conserved among receptor-guanylate cyclases and metabotropic glutamate receptors. The chloride-dependent ANP binding, reversible chloride binding, and the highly conserved chloride-binding site motif suggest a regulatory role for the receptor bound chloride. Chloride-dependent regulation of ANPR may operate in the kidney, modulating ANP-induced natriuresis.

  2. Incorporating substrate sequence motifs and spatial amino acid composition to identify kinase-specific phosphorylation sites on protein three-dimensional structures

    PubMed Central

    2013-01-01

    Background Protein phosphorylation catalyzed by kinases plays crucial regulatory roles in cellular processes. Given the high-throughput mass spectrometry-based experiments, the desire to annotate the catalytic kinases for in vivo phosphorylation sites has motivated the development of a variety of computational methods for large-scale prediction of kinase-specific phosphorylation sites. However, most of the proposed methods rely solely on the local amino acid sequences surrounding the phosphorylation sites. An increasing number of three-dimensional structures makes it possible to physically investigate the structural environment of phosphorylation sites. Results In this work, all of the experimental phosphorylation sites are mapped to the protein entries of the Protein Data Bank by sequence identity. This resulted in a total of 4508 phosphorylation sites with protein three-dimensional (3D) structures. To identify phosphorylation sites on protein 3D structures, this work combines support vector machines (SVMs) with the information of linear motifs and spatial amino acid composition, which is determined for each kinase group by calculating the relative frequencies of the 20 amino acid types within a specific radial distance from the central phosphorylated amino acid residue. In the cross-validation evaluation, most of the kinase-specific models trained with structural information outperform the models considering only the sequence information. Furthermore, an independent testing set, which is not included in the training set, has demonstrated that the proposed method could provide performance comparable to other popular tools. Conclusion The proposed method is shown to be capable of predicting kinase-specific phosphorylation sites on 3D structures and has been implemented as a web server which is freely accessible at http://csb.cse.yzu.edu.tw/PhosK3D/. Due to the difficulty of identifying the kinase-specific phosphorylation

  3. Distinct Functional Constraints Partition Sequence Conservation in a cis-Regulatory Element

    PubMed Central

    Ruvinsky, Ilya

    2011-01-01

    Different functional constraints contribute to different evolutionary rates across genomes. To understand why some sequences evolve faster than others in a single cis-regulatory locus, we investigated function and evolutionary dynamics of the promoter of the Caenorhabditis elegans unc-47 gene. We found that this promoter consists of two distinct domains. The proximal promoter is conserved and is largely sufficient to direct appropriate spatial expression. The distal promoter displays little if any conservation between several closely related nematodes. Despite this divergence, sequences from all species confer robustness of expression, arguing that this function does not require substantial sequence conservation. We showed that even unrelated sequences have the ability to promote robust expression. A prominent feature shared by all of these robustness-promoting sequences is an AT-enriched nucleotide composition consistent with nucleosome depletion. Because general sequence composition can be maintained despite sequence turnover, our results explain how different functional constraints can lead to vastly disparate rates of sequence divergence within a promoter. PMID:21655084

  4. Accelerated Evolution of Conserved Noncoding Sequences in the Human Genome

    SciTech Connect

    Prabhakar, Shyam; Noonan, James P.; Paabo, Svante; Rubin, Edward M.

    2006-07-06

    Genomic comparisons between human and distant, non-primate mammals are commonly used to identify cis-regulatory elements based on constrained sequence evolution. However, these methods fail to detect "cryptic" functional elements, which are too weakly conserved among mammals to distinguish from nonfunctional DNA. To address this problem, we explored the potential of deep intra-primate sequence comparisons. We sequenced the orthologs of 558 kb of human genomic sequence, covering multiple loci involved in cholesterol homeostasis, in 6 nonhuman primates. Our analysis identified 6 noncoding DNA elements displaying significant conservation among primates, but undetectable in more distant comparisons. In vitro and in vivo tests revealed that at least three of these 6 elements have regulatory function. Notably, the mouse orthologs of these three functional human sequences had regulatory activity despite their lack of significant sequence conservation, indicating that they are cryptic ancestral cis-regulatory elements. These regulatory elements could still be detected in a smaller set of three primate species including human, rhesus and marmoset. Since the human and rhesus genome sequences are already available, and the marmoset genome is actively being sequenced, the primate-specific conservation analysis described here can be applied in the near future on a whole-genome scale, to complement the annotation provided by more distant species comparisons.

  5. Detection of Weakly Conserved Ancestral Mammalian Regulatory Sequences by Primate Comparisons

    SciTech Connect

    Wang, Qian-fei; Prabhakar, Shyam; Chanan, Sumita; Cheng, Jan-Fang; Rubin, Edward M.; Boffelli, Dario

    2006-06-01

    Genomic comparisons between human and distant, non-primate mammals are commonly used to identify cis-regulatory elements based on constrained sequence evolution. However, these methods fail to detect cryptic functional elements, which are too weakly conserved among mammals to distinguish from nonfunctional DNA. To address this problem, we explored the potential of deep intra-primate sequence comparisons. We sequenced the orthologs of 558 kb of human genomic sequence, covering multiple loci involved in cholesterol homeostasis, in 6 nonhuman primates. Our analysis identified 6 noncoding DNA elements displaying significant conservation among primates, but undetectable in more distant comparisons. In vitro and in vivo tests revealed that at least three of these 6 elements have regulatory function. Notably, the mouse orthologs of these three functional human sequences had regulatory activity despite their lack of significant sequence conservation, indicating that they are cryptic ancestral cis-regulatory elements. These regulatory elements could still be detected in a smaller set of three primate species including human, rhesus and marmoset. Since the human and rhesus genome sequences are already available, and the marmoset genome is actively being sequenced, the primate-specific conservation analysis described here can be applied in the near future on a whole-genome scale, to complement the annotation provided by more distant species comparisons.

  6. Comparative genomic analysis of upstream miRNA regulatory motifs in Caenorhabditis.

    PubMed

    Jovelin, Richard; Krizus, Aldis; Taghizada, Bakhtiyar; Gray, Jeremy C; Phillips, Patrick C; Claycomb, Julie M; Cutter, Asher D

    2016-07-01

    MicroRNAs (miRNAs) comprise a class of short noncoding RNA molecules that play diverse developmental and physiological roles by controlling mRNA abundance and protein output of the vast majority of transcripts. Despite the importance of miRNAs in regulating gene function, we still lack a complete understanding of how miRNAs themselves are transcriptionally regulated. To fill this gap, we predicted regulatory sequences by searching for abundant short motifs located upstream of miRNAs in eight species of Caenorhabditis nematodes. We identified three conserved motifs across the Caenorhabditis phylogeny that show clear signatures of purifying selection from comparative genomics, patterns of nucleotide changes in motifs of orthologous miRNAs, and correlation between motif incidence and miRNA expression. We then validated our predictions with transgenic green fluorescent protein reporters and site-directed mutagenesis for a subset of motifs located in an enhancer region upstream of let-7. We demonstrate that a CT-dinucleotide motif is sufficient for proper expression of GFP in the seam cells of adult C. elegans, and that two other motifs play incremental roles in combination with the CT-rich motif. Thus, functional tests of sequence motifs identified through analysis of molecular evolutionary signatures provide a powerful path for efficiently characterizing the transcriptional regulation of miRNA genes. PMID:27140965

  7. Relation between mRNA expression and sequence information in Desulfovibrio vulgaris: Combinatorial contributions of upstream regulatory motifs and coding sequence features to variations in mRNA abundance

    SciTech Connect

    Wu, Gang; Nie, Lei; Zhang, Weiwen

    2006-05-26

    The context-dependent expression of genes is at the core of biological activities, and significant attention has been given to identifying the various factors that contribute to gene expression at genomic scale. However, so far this type of analysis has focused either on the relation between mRNA expression and non-coding sequence features such as upstream regulatory motifs, or on the correlation between mRNA abundance and non-random features in coding sequences (e.g. codon usage and amino acid usage). In this study, multiple regression analyses of the mRNA abundance and all sequence information in Desulfovibrio vulgaris were performed, with the goal of investigating how much coding and non-coding sequence features contribute to the variations in mRNA expression, and in what manner they act together...

  8. Conservation of the human telomere sequence (TTAGGG)n among vertebrates.

    PubMed Central

    Meyne, J; Ratliff, R L; Moyzis, R K

    1989-01-01

    To determine the evolutionary origin of the human telomere sequence (TTAGGG)n, biotinylated oligodeoxynucleotides of this sequence were hybridized to metaphase spreads from 91 different species, including representative orders of bony fish, reptiles, amphibians, birds, and mammals. Under stringent hybridization conditions, fluorescent signals were detected at the telomeres of all chromosomes, in all 91 species. The conservation of the (TTAGGG)n sequence and its telomeric location, in species thought to share a common ancestor over 400 million years ago, strongly suggest that this sequence is the functional vertebrate telomere. Images PMID:2780561

  9. Evolution of conserved non-coding sequences within the vertebrate Hox clusters through the two-round whole genome duplications revealed by phylogenetic footprinting analysis.

    PubMed

    Matsunami, Masatoshi; Sumiyama, Kenta; Saitou, Naruya

    2010-12-01

    As a result of two-round whole genome duplications, four or more paralogous Hox clusters exist in vertebrate genomes. The paralogous genes in the Hox clusters show similar expression patterns, implying shared regulatory mechanisms for expression of these genes. Previous studies partly revealed the expression mechanisms of Hox genes. However, cis-regulatory elements that control these paralogous gene expression are still poorly understood. Toward solving this problem, the authors searched conserved non-coding sequences (CNSs), which are candidates of cis-regulatory elements. When comparing orthologous Hox clusters of 19 vertebrate species, 208 intergenic conserved regions were found. The authors then searched for CNSs that were conserved not only between orthologous clusters but also among the four paralogous Hox clusters. The authors found three regions that are conserved among all the four clusters and eight regions that are conserved between intergenic regions of two paralogous Hox clusters. In total, 28 CNSs were identified in the paralogous Hox clusters, and nine of them were newly found in this study. One of these novel regions bears a RARE motif. These CNSs are candidates for gene expression regulatory regions among paralogous Hox clusters. The authors also compared vertebrate CNSs with amphioxus CNSs within the Hox cluster, and found that two CNSs in the HoxA and HoxB clusters retain homology with amphioxus CNSs through the two-round whole genome duplications. PMID:20981416

  10. Motifs, modules and games in bacteria

    SciTech Connect

    Wolf, Denise M.; Arkin, Adam P.

    2003-04-01

    Global explorations of regulatory network dynamics, organization and evolution have become tractable thanks to high-throughput sequencing and molecular measurement of bacterial physiology. From these, a nascent conceptual framework is developing that views the principles of regulation in terms of motifs, modules and games. Motifs are small, repeated, and conserved biological units ranging from molecular domains to small reaction networks. They are arranged into functional modules, genetically dissectible cellular functions such as the cell cycle, or different stress responses. The dynamical functioning of modules defines the organism's strategy to survive in a game, pitting cell against cell, and cell against environment. Placing pathway structure and dynamics into an evolutionary context begins to allow discrimination between those physical and molecular features that particularize a species to its surroundings, and those that provide core physiological function. This approach promises to generate a higher level understanding of cellular design, pathway evolution and cellular bioengineering.

  11. A comprehensive analysis of the La-motif protein superfamily

    PubMed Central

    Bousquet-Antonelli, Cécile; Deragon, Jean-Marc

    2009-01-01

    The extremely well-conserved La motif (LAM), in synergy with the immediately following RNA recognition motif (RRM), allows direct binding of the (genuine) La autoantigen to RNA polymerase III primary transcripts. This motif is not only found on La homologs, but also on La-related proteins (LARPs) of unrelated function. LARPs are widely found amongst eukaryotes and, although poorly characterized, appear to be RNA-binding proteins fulfilling crucial cellular functions. We searched the fully sequenced genomes of 83 eukaryotic species scattered along the tree of life for the presence of LAM-containing proteins. We observed that these proteins are absent from archaea and present in all eukaryotes (except protists from the Plasmodium genus), strongly suggesting that the LAM is an ancestral motif that emerged early after the archaea-eukarya radiation. A complete evolutionary and structural analysis of these proteins resulted in their classification into five families: the genuine La homologs and four LARP families. Unexpectedly, in each family a conserved domain representing either a classical RRM or an RRM-like motif immediately follows the LAM of most proteins. An evolutionary analysis of the LAM-RRM/RRM-L regions shows that these motifs co-evolved and should be used as a single entity to define the functional region of interaction of LARPs with their substrates. We also found two extremely well conserved motifs, named LSA and DM15, shared by LARP6 and LARP1 family members, respectively. We suggest that members of the same family are functional homologs and/or share a common molecular mode of action on different RNA baits. PMID:19299548

  12. Efficient exact motif discovery

    PubMed Central

    Marschall, Tobias; Rahmann, Sven

    2009-01-01

    Motivation: The motif discovery problem consists of finding over-represented patterns in a collection of biosequences. It is one of the classical sequence analysis problems, but still has not been satisfactorily solved in an exact and efficient manner. This is partly due to the large number of possibilities of defining the motif search space and the notion of over-representation. Even for well-defined formalizations, the problem is frequently solved in an ad hoc manner with heuristics that do not guarantee to find the best motif. Results: We show how to solve the motif discovery problem (almost) exactly on a practically relevant space of IUPAC generalized string patterns, using the p-value with respect to an i.i.d. model or a Markov model as the measure of over-representation. In particular, (i) we use a highly accurate compound Poisson approximation for the null distribution of the number of motif occurrences. We show how to compute the exact clump size distribution using a recently introduced device called probabilistic arithmetic automaton (PAA). (ii) We define two p-value scores for over-representation, the first one based on the total number of motif occurrences, the second one based on the number of sequences in a collection with at least one occurrence. (iii) We describe an algorithm to discover the optimal pattern with respect to either of the scores. The method exploits monotonicity properties of the compound Poisson approximation and is by orders of magnitude faster than exhaustive enumeration of IUPAC strings (11.8 h compared with an extrapolated runtime of 4.8 years). (iv) We justify the use of the proposed scores for motif discovery by showing our method to outperform other motif discovery algorithms (e.g. MEME, Weeder) on benchmark datasets. We also propose new motifs on Mycobacterium tuberculosis. Availability and Implementation: The method has been implemented in Java. It can be obtained from http://ls11-www
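
    The two over-representation scores named in this abstract are built on two raw counts: the total number of occurrences of an IUPAC pattern, and the number of sequences containing at least one occurrence. The following is a minimal Python sketch of those counts only. It is not the authors' Java implementation, it does not compute the compound Poisson p-values, and the function names are hypothetical.

```python
import re

# Standard IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]",
    "N": "[ACGT]",
}


def iupac_to_regex(pattern):
    """Compile an IUPAC string pattern into a regex (hypothetical helper).

    The pattern is wrapped in a lookahead so that overlapping
    occurrences are all found.
    """
    body = "".join(IUPAC[ch] for ch in pattern.upper())
    return re.compile(f"(?=({body}))")


def occurrence_counts(pattern, sequences):
    """Return (total occurrences, number of sequences with >= 1 occurrence)."""
    rx = iupac_to_regex(pattern)
    per_sequence = [len(rx.findall(seq.upper())) for seq in sequences]
    total = sum(per_sequence)
    sequences_hit = sum(1 for count in per_sequence if count > 0)
    return total, sequences_hit
```

    For example, `occurrence_counts("GRCGYC", ["AAGACGTCAA", "GGCGCCGGCGCC", "TTTT"])` counts one match in the first sequence and two in the second. Only the forward strand is scanned; a real analysis would usually also consider the reverse complement.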

  13. Crystal Structures of Two Novel Dye-Decolorizing Peroxidases Reveal a Beta-Bar Fold With a Conserved Heme-Binding Motif

    SciTech Connect

    Zubieta, C.; Krishna, S.S.; Kapoor, M.; Kozbial, P.; McMullan, D.; Axelrod, H.L.; Miller, M.D.; Abdubek, P.; Ambing, E.; Astakhova, T.; Carlton, D.; Chiu, H.J.; Clayton, T.; Deller, M.C.; Duan, L.; Elsliger, M.A.; Feuerhelm, J.; Grzechnik, S.K.; Hale, J.; Hampton, E.; Han, G.W.; /JCSG /SLAC, SSRL /Burnham Inst. Med. Res. /UC, San Diego /Scripps Res. Inst. /Novartis Res. Found.

    2007-10-31

    BtDyP from Bacteroides thetaiotaomicron (strain VPI-5482) and TyrA from Shewanella oneidensis are dye-decolorizing peroxidases (DyPs), members of a new family of heme-dependent peroxidases recently identified in fungi and bacteria. Here, we report the crystal structures of BtDyP and TyrA at 1.6 and 2.7 Angstroms, respectively. BtDyP assembles into a hexamer, while TyrA assembles into a dimer; the dimerization interface is conserved between the two proteins. Each monomer exhibits a two-domain, α+β ferredoxin-like fold. A site for heme binding was identified computationally, and modeling of a heme into the proposed active site allowed for identification of residues likely to be functionally important. Structural and sequence comparisons with other DyPs demonstrate a conservation of putative heme-binding residues, including an absolutely conserved histidine. Isothermal titration calorimetry experiments confirm heme binding, but with a stoichiometry of 0.3:1 (heme:protein).

  14. Distinct XPPX sequence motifs induce ribosome stalling, which is rescued by the translation elongation factor EF-P

    PubMed Central

    Peil, Lauri; Starosta, Agata L.; Lassak, Jürgen; Atkinson, Gemma C.; Virumäe, Kai; Spitzer, Michaela; Tenson, Tanel; Jung, Kirsten; Remme, Jaanus; Wilson, Daniel N.

    2013-01-01

    Ribosomes are the protein synthesizing factories of the cell, polymerizing polypeptide chains from their constituent amino acids. However, distinct combinations of amino acids, such as polyproline stretches, cannot be efficiently polymerized by ribosomes, leading to translational stalling. The stalled ribosomes are rescued by the translational elongation factor P (EF-P), which by stimulating peptide-bond formation allows translation to resume. Using metabolic stable isotope labeling and mass spectrometry, we demonstrate in vivo that EF-P is important for expression of not only polyproline-containing proteins, but also for specific subsets of proteins containing diprolyl motifs (XPP/PPX). Together with a systematic in vitro and in vivo analysis, we provide a distinct hierarchy of stalling triplets, ranging from strong stallers, such as PPP, DPP, and PPN to weak stallers, such as CPP, PPR, and PPH, all of which are substrates for EF-P. These findings provide mechanistic insight into how the characteristics of the specific amino acid substrates influence the fundamentals of peptide bond formation. PMID:24003132
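
    As an illustration of the XPP/PPX motif definition above, the following Python sketch scans a protein sequence for diprolyl triplets and labels the ones the abstract names as examples of strong or weak stallers. The abstract lists these triplets only as examples ("such as"), so every other XPP/PPX triplet is reported as unclassified; the function name and labels are assumptions, not part of the original study.

```python
# Example triplets named in the abstract; the full hierarchy is not given there.
STRONG_STALLERS = {"PPP", "DPP", "PPN"}
WEAK_STALLERS = {"CPP", "PPR", "PPH"}


def find_diprolyl_triplets(protein):
    """Return (position, triplet, label) for every XPP or PPX triplet.

    Positions are 0-based indices of the first residue of the triplet.
    """
    seq = protein.upper()
    hits = []
    for i in range(len(seq) - 2):
        triplet = seq[i:i + 3]
        # XPP ends with two prolines, PPX starts with two prolines.
        if triplet.endswith("PP") or triplet.startswith("PP"):
            if triplet in STRONG_STALLERS:
                label = "strong"
            elif triplet in WEAK_STALLERS:
                label = "weak"
            else:
                label = "unclassified"
            hits.append((i, triplet, label))
    return hits
```

    Overlapping triplets are reported separately, so a stretch such as DPPN yields both DPP and PPN.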

  15. PTS-Mediated Regulation of the Transcription Activator MtlR from Different Species: Surprising Differences despite Strong Sequence Conservation.

    PubMed

    Joyet, Philippe; Derkaoui, Meriem; Bouraoui, Houda; Deutscher, Josef

    2015-01-01

    The hexitol D-mannitol is transported by many bacteria via a phosphoenolpyruvate (PEP):carbohydrate phosphotransferase system (PTS). In most Firmicutes, the transcription activator MtlR controls the expression of the genes encoding the D-mannitol-specific PTS components and D-mannitol-1-P dehydrogenase. MtlR contains an N-terminal helix-turn-helix motif followed by an Mga-like domain, two PTS regulation domains (PRDs), an EIIB(Gat)- and an EIIA(Mtl)-like domain. The four regulatory domains are the target of phosphorylation by PTS components. Despite strong sequence conservation, the mechanisms controlling the activity of MtlR from Lactobacillus casei, Bacillus subtilis and Geobacillus stearothermophilus are quite different. Owing to the presence of a tyrosine in place of the second conserved histidine (His) in PRD2, L. casei MtlR is not phosphorylated by Enzyme I (EI) and HPr. When the corresponding His in PRD2 of MtlR from B. subtilis and G. stearothermophilus was replaced with alanine, the transcription regulator was no longer phosphorylated and remained inactive. Surprisingly, L. casei MtlR functions without phosphorylation in PRD2 because in a ptsI (EI) mutant MtlR is constitutively active. EI inactivation prevents not only phosphorylation of HPr, but also of the PTS(Mtl) components, which inactivate MtlR by phosphorylating its EIIB(Gat)- or EIIA(Mtl)-like domain. This explains the constitutive phenotype of the ptsI mutant. The absence of EIIB(Mtl)-mediated phosphorylation leads to induction of the L. casei mtl operon. This mechanism resembles mtlARFD induction in G. stearothermophilus, but differs from EIIA(Mtl)-mediated induction in B. subtilis. In contrast to B. subtilis MtlR, L. casei MtlR activation does not require sequestration to the membrane via the unphosphorylated EIIB(Mtl) domain. PMID:26159071

  16. Characterizing and controlling intrinsic biases of lambda exonuclease in nascent strand sequencing reveals phasing between nucleosomes and G-quadruplex motifs around a subset of human replication origins

    PubMed Central

    Foulk, Michael S.; Urban, John M.; Casella, Cinzia; Gerbi, Susan A.

    2015-01-01

    Nascent strand sequencing (NS-seq) is used to discover DNA replication origins genome-wide, allowing identification of features for their specification. NS-seq depends on the ability of lambda exonuclease (λ-exo) to efficiently digest parental DNA while leaving RNA-primer protected nascent strands intact. We used genomics and biochemical approaches to determine if λ-exo digests all parental DNA sequences equally. We report that λ-exo does not efficiently digest G-quadruplex (G4) structures in a plasmid. Moreover, λ-exo digestion of nonreplicating genomic DNA (LexoG0) enriches GC-rich DNA and G4 motifs genome-wide. We used LexoG0 data to control for nascent strand–independent λ-exo biases in NS-seq and validated this approach at the rDNA locus. The λ-exo–controlled NS-seq peaks are not GC-rich, and only 35.5% overlap with 6.8% of all G4s, suggesting that G4s are not general determinants for origin specification but may play a role for a subset. Interestingly, we observed a periodic spacing of G4 motifs and nucleosomes around the peak summits, suggesting that G4s may position nucleosomes at this subset of origins. Finally, we demonstrate that use of Na+ instead of K+ in the λ-exo digestion buffer reduced the effect of G4s on λ-exo digestion and discuss ways to increase both the sensitivity and specificity of NS-seq. PMID:25695952
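
    The abstract does not state how G4 motifs were defined. As a rough illustration only, the sketch below assumes the widely used canonical pattern of four runs of three or more guanines separated by loops of 1 to 7 nucleotides; the pattern, the loop limits and the function name are assumptions and may differ from the definition used in the study.

```python
import re

# Assumed canonical G4 pattern: G3+ N1-7 G3+ N1-7 G3+ N1-7 G3+.
G4_PATTERN = re.compile(r"G{3,}(?:[ACGT]{1,7}G{3,}){3}")


def find_g4_motifs(dna):
    """Return (start, end, sequence) for non-overlapping putative G4 motifs.

    Only the given strand is scanned; motifs on the opposite strand
    would appear here as C-rich runs and are not reported.
    """
    seq = dna.upper()
    return [(m.start(), m.end(), m.group()) for m in G4_PATTERN.finditer(seq)]
```

    Coordinates are 0-based and half-open, which makes them straightforward to intersect with peak intervals in BED-like formats.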

  17. RNAPattMatch: a web server for RNA sequence/structure motif detection based on pattern matching with flexible gaps

    PubMed Central

    Drory Retwitzer, Matan; Polishchuk, Maya; Churkin, Elena; Kifer, Ilona; Yakhini, Zohar; Barash, Danny

    2015-01-01

    Searching for RNA sequence-structure patterns is becoming an essential tool for RNA practitioners. Novel discoveries of regulatory non-coding RNAs in targeted organisms and the motivation to find them across a wide range of organisms have prompted the use of computational RNA pattern matching as an enhancement to sequence similarity. State-of-the-art programs differ by the flexibility of patterns allowed as queries and by their simplicity of use. In particular, no existing method is available as a user-friendly web server. A general program that searches for RNA sequence-structure patterns is RNA Structator. However, it is not available as a web server and does not provide the option to allow flexible gap pattern representation with an upper bound of the gap length being specified at any position in the sequence. Here, we introduce RNAPattMatch, a web-based application that is user friendly and makes sequence/structure RNA queries accessible to practitioners of various backgrounds and levels of proficiency. It also extends RNA Structator and allows a more flexible variable gaps representation, in addition to analysis of results using energy minimization methods. RNAPattMatch service is available at http://www.cs.bgu.ac.il/rnapattmatch. A standalone version of the search tool is also available to download at the site. PMID:25940619

  18. Limb body wall complex, amniotic band sequence, or new syndrome caused by mutation in IQ Motif containing K (IQCK)?

    PubMed

    Kruszka, Paul; Uwineza, Annette; Mutesa, Leon; Martinez, Ariel F; Abe, Yu; Zackai, Elaine H; Ganetzky, Rebecca; Chung, Brian; Stevenson, Roger E; Adelstein, Robert S; Ma, Xuefei; Mullikin, James C; Hong, Sung-Kook; Muenke, Maximilian

    2015-09-01

    Limb body wall complex (LBWC) and amniotic band sequence (ABS) are multiple congenital anomaly conditions with craniofacial, limb, and ventral wall defects. LBWC and ABS are considered separate entities by some, and a continuum of severity of the same condition by others. The etiology of LBWC/ABS remains unknown and multiple hypotheses have been proposed. One individual with features of LBWC and his unaffected parents were whole exome sequenced and Sanger sequenced as confirmation of the mutation. Functional studies were conducted using morpholino knockdown studies followed by human mRNA rescue experiments. Using whole exome sequencing, a de novo heterozygous mutation was found in the gene IQCK: c.667C>G; p.Q223E and confirmed by Sanger sequencing in an individual with LBWC. Morpholino knockdown of iqck mRNA in the zebrafish showed ventral defects including failure of ventral fin to develop and cardiac edema. Human wild-type IQCK mRNA rescued the zebrafish phenotype, whereas human p.Q223E IQCK mRNA did not, but worsened the phenotype of the morpholino knockdown zebrafish. This study supports a genetic etiology for LBWC/ABS, or potentially a new syndrome. PMID:26436108

  19. Limb body wall complex, amniotic band sequence, or new syndrome caused by mutation in IQ Motif containing K (IQCK)?

    PubMed Central

    Kruszka, Paul; Uwineza, Annette; Mutesa, Leon; Martinez, Ariel F; Abe, Yu; Zackai, Elaine H; Ganetzky, Rebecca; Chung, Brian; Stevenson, Roger E; Adelstein, Robert S; Ma, Xuefei; Mullikin, James C; Hong, Sung-Kook; Muenke, Maximilian

    2015-01-01

    Limb body wall complex (LBWC) and amniotic band sequence (ABS) are multiple congenital anomaly conditions with craniofacial, limb, and ventral wall defects. LBWC and ABS are considered separate entities by some, and a continuum of severity of the same condition by others. The etiology of LBWC/ABS remains unknown and multiple hypotheses have been proposed. One individual with features of LBWC and his unaffected parents were whole exome sequenced and Sanger sequenced as confirmation of the mutation. Functional studies were conducted using morpholino knockdown studies followed by human mRNA rescue experiments. Using whole exome sequencing, a de novo heterozygous mutation was found in the gene IQCK: c.667C>G; p.Q223E and confirmed by Sanger sequencing in an individual with LBWC. Morpholino knockdown of iqck mRNA in the zebrafish showed ventral defects including failure of ventral fin to develop and cardiac edema. Human wild-type IQCK mRNA rescued the zebrafish phenotype, whereas human p.Q223E IQCK mRNA did not, but worsened the phenotype of the morpholino knockdown zebrafish. This study supports a genetic etiology for LBWC/ABS, or potentially a new syndrome. PMID:26436108

  20. A survey of motif finding Web tools for detecting binding site motifs in ChIP-Seq data.

    PubMed

    Tran, Ngoc Tam L; Huang, Chun-Hsi

    2014-01-01

    ChIP-Seq (chromatin immunoprecipitation sequencing) has provided an advantage for motif finding because ChIP-Seq experiments narrow the search down to binding site locations. Recent motif finding tools facilitate motif detection by providing user-friendly Web interfaces. In this work, we reviewed nine motif finding Web tools that are capable of detecting binding site motifs in ChIP-Seq data. We showed that each motif finding Web tool has its own advantages for detecting motifs that other tools may not discover. We recommended that users use multiple motif finding Web tools that implement different algorithms for obtaining significant motifs, overlapping similar motifs, and non-overlapping motifs. Finally, we provided our suggestions for future development of motif finding Web tools that better assist researchers in finding motifs in ChIP-Seq data. PMID:24555784

  1. Identification of antimicrobial peptides from teleosts and anurans in expressed sequence tag databases using conserved signal sequences.

    PubMed

    Tessera, Valentina; Guida, Filomena; Juretić, Davor; Tossi, Alessandro

    2012-03-01

    The problem of multidrug resistance requires the efficient and accurate identification of new classes of antimicrobial agents. Endogenous antimicrobial peptides produced by most organisms are a promising source of such molecules. We have exploited the high conservation of signal sequences in teleost and anuran antimicrobial peptides to search cDNA (expressed sequence tag) databases for likely candidates. Subject sequences were then analysed for the presence of potential antimicrobial peptides based on physicochemical properties (amphipathic helical structure, cationicity) and use of the D-descriptor model to predict the therapeutic index (relation between the minimum inhibitory concentration and the concentration giving 50% haemolysis). This analysis also suggested mutations to probe the role of the primary structure in determining potency and selectivity. Selected sequences were chemically synthesized and the antimicrobial activity of the peptides was confirmed. In particular, a short (21-residue) sequence, likely of sticklefish origin, showed potent activity and it was possible to tune the spectrum of action and/or selectivity by combining three directed mutations. Membrane permeabilization studies on both bacterial and host cells indicate that the mode of action was prevalently membranolytic. This method opens up the possibility for more effective searching of the vast and continuously growing expressed sequence tag databases for novel antimicrobial peptides, which are likely abundant, and the efficient identification of the most promising candidates among them. PMID:22188679

  2. A functional Small Ubiquitin-like Modifier (SUMO) interacting motif (SIM) in the gibberellin hormone receptor GID1 is conserved in cereal crops and disrupting this motif does not abolish hormone dependency of the DELLA-GID1 interaction

    PubMed Central

    Nelis, Stuart; Conti, Lucio; Zhang, Cunjin; Sadanandom, Ari

    2015-01-01

    Plants survive adversity by modulating their growth in response to changing environmental signals. The phytohormone gibberellic acid (GA) plays a central role in regulating these adaptive responses by stimulating the degradation of growth-repressing DELLA proteins, which accumulate during stress. The current model for GA signaling describes how this hormone binds to its receptor GID1, thereby promoting association of GID1 with DELLA, which then undergoes ubiquitin-mediated proteasomal degradation. Recent data revealed that conjugation of DELLAs to the Small Ubiquitin-like Modifier (SUMO) protein enables plants to modulate DELLA abundance during environmental stress. This is achieved by SUMOylated DELLAs sequestering GID1 via its SUMO interacting motif (SIM), allowing non-SUMOylated DELLAs to accumulate, leading to growth restraint under stress and potential yield loss. We demonstrate that GID1 proteins across the major cereal crops contain a functional SIM able to bind SUMO1. Site-directed mutagenesis and yeast two-hybrid experiments reveal that it is possible to disrupt the SIM-SUMO interaction motif without affecting the GA-dependent DELLA–GID1 interaction, thereby uncoupling SUMO-mediated inhibition from DELLA degradation. Arabidopsis plants overexpressing a SIM mutant allele of GID1 perform better at relieving DELLA restraint than wild-type GID1. This evidence suggests that manipulating the SIM motif in the GA receptor may provide a possible route to developing stress-tolerant crop plants. PMID:25761145

  3. Molecular characterization of a bovine Y-specific DNA sequence conserved in taurine and zebu breeds.

    PubMed

    Alves, Beatriz C A; Mayer, Mário G; Taber, Anna Paula; Egito, Andréa A; Fagundes, Valéria; McElreavey, Ken; Moreira-Filho, Carlos A

    2006-06-01

    The identification of new bovine male-specific DNA sequences is of great interest because the bovine Y chromosome remains poorly characterized in terms of physical and genetic maps. Since taurine and zebu Y chromosomes are structurally different, the identification of Y-specific sequences present in both sub-species is particularly important: these sequences are of evolutionary significance and can be broadly used for embryo sexing. In this work, we initially used the random amplified polymorphic DNA (RAPD) technique to search for male-specific sequences present as monomorphic markers in genomic DNA from zebu and taurine bulls. A male-specific RAPD band was found to be present and highly conserved in both sub-species, as demonstrated by Southern blotting, fluorescent in situ hybridization (FISH) and DNA sequencing. In a previous work, a pair of primers derived from this marker was successfully used in taurine and zebu embryo sexing. PMID:17286047

  4. A structural-alphabet-based strategy for finding structural motifs across protein families.

    PubMed

    Wu, Chih Yuan; Chen, Yao Chi; Lim, Carmay

    2010-08-01

    Proteins with insignificant sequence and overall structure similarity may still share locally conserved contiguous structural segments; i.e. structural/3D motifs. Most methods for finding 3D motifs require a known motif to search for other similar structures or functionally/structurally crucial residues. Here, without requiring a query motif or essential residues, a fully automated method for discovering 3D motifs of various sizes across protein families with different folds based on a 16-letter structural alphabet is presented. It was applied to structurally non-redundant proteins bound to DNA, RNA, obligate/non-obligate proteins as well as free DNA-binding proteins (DBPs) and proteins with known structures but unknown function. Its usefulness was illustrated by analyzing the 3D motifs found in DBPs. A non-specific motif was found with a 'corner' architecture that confers a stable scaffold and enables diverse interactions, making it suitable for binding not only DNA but also RNA and proteins. Furthermore, DNA-specific motifs present 'only' in DBPs were discovered. The motifs found can provide useful guidelines in detecting binding sites and computational protein redesign. PMID:20525797

  5. The Moraxella catarrhalis immunoglobulin D-binding protein MID has conserved sequences and is regulated by a mechanism corresponding to phase variation.

    PubMed

    Möllenkvist, Andrea; Nordström, Therése; Halldén, Christer; Christensen, Jens Jørgen; Forsgren, Arne; Riesbeck, Kristian

    2003-04-01

    The prevalence of the Moraxella catarrhalis immunoglobulin D (IgD)-binding outer membrane protein MID and its gene was determined in 91 clinical isolates and in 7 culture collection strains. Eighty-four percent of the clinical Moraxella strains expressed MID-dependent IgD binding. The mid gene was detected in all strains as revealed by homology of the signal peptide sequence and a conserved area in the 3' end of the gene. When MID proteins from five different strains were compared, an identity of 65.3 to 85.0% and a similarity of 71.2 to 89.1% were detected. Gene analyses showed several amino acid repeat motifs in the open reading frames, and MID could be called a putative autotransport protein. Interestingly, homopolymeric [polyguanine [poly(G)

  6. Studying RNA Homology and Conservation with Infernal: From Single Sequences to RNA Families.

    PubMed

    Barquist, Lars; Burge, Sarah W; Gardner, Paul P

    2016-01-01

    Emerging high-throughput technologies have led to a deluge of putative non-coding RNA (ncRNA) sequences identified in a wide variety of organisms. Systematic characterization of these transcripts will be a tremendous challenge. Homology detection is critical to making maximal use of functional information gathered about ncRNAs: identifying homologous sequence allows us to transfer information gathered in one organism to another quickly and with a high degree of confidence. ncRNA presents a challenge for homology detection, as the primary sequence is often poorly conserved and de novo secondary structure prediction and search remain difficult. This unit introduces methods developed by the Rfam database for identifying "families" of homologous ncRNAs starting from single "seed" sequences, using manually curated sequence alignments to build powerful statistical models of sequence and structure conservation known as covariance models (CMs), implemented in the Infernal software package. We provide a step-by-step iterative protocol for identifying ncRNA homologs and then constructing an alignment and corresponding CM. We also work through an example for the bacterial small RNA MicA, discovering a previously unreported family of divergent MicA homologs in genus Xenorhabdus in the process. © 2016 by John Wiley & Sons, Inc. PMID:27322404

  7. Radiation Desiccation Response Motif-Like Sequences Are Involved in Transcriptional Activation of the Deinococcal ssb Gene by Ionizing Radiation but Not by Desiccation▿

    PubMed Central

    Ujaoney, Aman Kumar; Potnis, Akhilesh A.; Kane, Pratiksha; Mukhopadhyaya, Rita; Apte, Shree Kumar

    2010-01-01

    Single-stranded-DNA binding protein (SSB) levels during poststress recovery of Deinococcus radiodurans were significantly enhanced by 60Co gamma rays or mitomycin C treatment but not by exposure to UV rays, hydrogen peroxide (H2O2), or desiccation. Addition of rifampin prior to postirradiation recovery blocked such induction. In silico analysis of the ssb promoter region revealed a 17-bp palindromic radiation/desiccation response motif (RDRM1) at bp −114 to −98 and a somewhat similar sequence (RDRM2) at bp −213 to −197, upstream of the ssb open reading frame. Involvement of these cis elements in radiation-responsive ssb gene expression was assessed by constructing transcriptional fusions of edited versions of the ssb promoter region with a nonspecific acid phosphatase encoding reporter gene, phoN. Recombinant D. radiodurans strains carrying such constructs clearly revealed (i) transcriptional induction of the ssb promoter upon irradiation and mitomycin C treatment but not upon UV or H2O2 treatment and (ii) involvement of both RDRM-like sequences in such activation of SSB expression, in an additive manner. PMID:20802034

  8. A Conserved Interaction between a C-Terminal Motif in Norovirus VPg and the HEAT-1 Domain of eIF4G Is Essential for Translation Initiation

    PubMed Central

    Leen, Eoin N.; Sorgeloos, Frédéric; Correia, Samantha; Chaudhry, Yasmin; Cannac, Fabien; Pastore, Chiara; Xu, Yingqi; Graham, Stephen C.; Matthews, Stephen J.; Goodfellow, Ian G.; Curry, Stephen

    2016-01-01

    Translation initiation is a critical early step in the replication cycle of the positive-sense, single-stranded RNA genome of noroviruses, a major cause of gastroenteritis in humans. Norovirus RNA, which has neither a 5′ m7G cap nor an internal ribosome entry site (IRES), adopts an unusual mechanism to initiate protein synthesis that relies on interactions between the VPg protein covalently attached to the 5′-end of the viral RNA and eukaryotic initiation factors (eIFs) in the host cell. For murine norovirus (MNV) we previously showed that VPg binds to the middle fragment of eIF4G (4GM; residues 652–1132). Here we have used pull-down assays, fluorescence anisotropy, and isothermal titration calorimetry (ITC) to demonstrate that a stretch of ~20 amino acids at the C terminus of MNV VPg mediates direct and specific binding to the HEAT-1 domain within the 4GM fragment of eIF4G. Our analysis further reveals that the MNV C terminus binds to eIF4G HEAT-1 via a motif that is conserved in all known noroviruses. Fine mutagenic mapping suggests that the MNV VPg C terminus may interact with eIF4G in a helical conformation. NMR spectroscopy was used to define the VPg binding site on eIF4G HEAT-1, which was confirmed by mutagenesis and binding assays. We have found that this site is non-overlapping with the binding site for eIF4A on eIF4G HEAT-1 by demonstrating that norovirus VPg can form ternary VPg-eIF4G-eIF4A complexes. The functional significance of the VPg-eIF4G interaction was shown by the ability of fusion proteins containing the C-terminal peptide of MNV VPg to inhibit in vitro translation of norovirus RNA but not cap- or IRES-dependent translation. These observations define important structural details of a functional interaction between norovirus VPg and eIF4G and reveal a binding interface that might be exploited as a target for antiviral therapy. PMID:26734730

  9. Sequence analysis of mouse vomeronasal receptor gene clusters reveals common promoter motifs and a history of recent expansion

    PubMed Central

    Lane, Robert P.; Cutforth, Tyler; Axel, Richard; Hood, Leroy; Trask, Barbara J.

    2002-01-01

    We have analyzed the organization and sequence of 73 V1R genes encoding putative pheromone receptors to identify regulatory features and characterize the evolutionary history of the V1R family. The 73 V1Rs arose from seven ancestral genes around the time of mouse–rat speciation through large local duplications, and this expansion may contribute to speciation events. Orthologous V1R genes appear to have been lost during primate evolution. Exceptional noncoding homology is observed across four V1R subfamilies at one cluster and thus may be important for locus-specific transcriptional regulation. PMID:11752409

  10. Cooperative Hybridization of γPNA Miniprobes to a Repeating Sequence Motif and Application to Telomere Analysis

    PubMed Central

    Sureshkumar, Gopalsamy; Ly, Danith H.; Opresko, Patricia L.; Armitage, Bruce A.

    2014-01-01

    GammaPNA oligomers having one or two repeats of the sequence AATCCC were designed to hybridize to DNA having one or more repeats of the complementary TTAGGG sequence found in the human telomere. UV melting curves and surface plasmon resonance experiments demonstrate high affinity and cooperativity for hybridization of these miniprobes to DNA having multiple complementary repeats. Fluorescence spectroscopy of Cy3-labeled miniprobes demonstrates increases in fluorescence intensity when multiple short probes assemble on a DNA target compared with fewer longer probes. The fluorescent γPNA miniprobes were then used to stain telomeres in metaphase chromosomes derived from U2OS cells possessing heterogeneous long telomeres and Jurkat cells harboring homogeneous short telomeres. The miniprobes yielded comparable fluorescence intensity to a commercially available PNA 18mer probe in U2OS cells, but significantly brighter fluorescence was observed for telomeres in Jurkat cells. These results suggest that γPNA miniprobes can be effective telomere-staining reagents with applications toward analysis of critically short telomeres, which have been implicated in a range of human diseases. PMID:25115693
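
    The base-pairing relationship between the AATCCC probe unit and the TTAGGG telomere repeat can be checked with a short sketch. It only shows that AATCCC is the position-by-position Watson-Crick complement of TTAGGG and counts consecutive repeats in a toy target; strand orientation, the function names and the toy sequence are assumptions, since the abstract does not specify them.

```python
# Illustrative only: names and the toy target are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq: str) -> str:
    """Position-by-position Watson-Crick complement (no reversal)."""
    return seq.upper().translate(COMPLEMENT)


def longest_tandem_run(target: str, unit: str = "TTAGGG") -> int:
    """Length, in repeat units, of the longest run of consecutive copies of unit."""
    target = target.upper()
    best = 0
    for start in range(len(target)):
        run = 0
        pos = start
        # Extend the run while another full copy of the unit follows.
        while target.startswith(unit, pos):
            run += 1
            pos += len(unit)
        best = max(best, run)
    return best


if __name__ == "__main__":
    print(complement("TTAGGG"))  # the probe unit named in the abstract
    print(longest_tandem_run("ACGTTAGGGTTAGGGTTAGGGAC"))
```

    A target with more consecutive repeats offers more adjacent binding sites for the short probes, which is the setting in which the abstract reports cooperative hybridization.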

  11. Identification of amino acid sequence motifs in desmocollin, a desmosomal glycoprotein, that are required for plakoglobin binding and plaque formation.

    PubMed

    Troyanovsky, S M; Troyanovsky, R B; Eshkind, L G; Leube, R E; Franke, W W

    1994-11-01

    By transfecting epithelial cells with gene constructs encoding chimeric proteins of the transmembrane part of the gap junction protein connexin 32 in combination with various segments of the cytoplasmic part of the desmosomal cadherin desmocollin 1a, we have determined that a relatively short sequence element is necessary for the formation of desmosome-like plaques and for the specific anchorage of bundles of intermediate-sized filaments (IFs). Deletion of as little as the carboxyl-terminal 37 aa resulted in a lack of IF anchorage and binding of the plaque protein plakoglobin, as shown by immunolocalization and immunoprecipitation experiments. In addition, we show that the sequence requirements for the recruitment of desmoplakin, another desmosomal plaque protein, differ and that a short (10 aa) segment of the desmocollin 1a tail, located close to the plasma membrane, is also required for the binding of plakoglobin, as well as of desmoplakin, and also for IF anchorage. The importance of the carboxyl-terminal domain, homologous in diverse types of cadherins, is emphasized, as it must harbor, in a mutually exclusive pattern, the information for assembly of the IF-anchoring desmosomal plaque in desmocollins and for formation of the alpha-/beta-catenin- and vinculin-containing, actin filament-anchoring plaque in E- and N-cadherin. PMID:7971964

  12. Newly identified motifs in Candida albicans Cdr1 protein nucleotide binding domains are pleiotropic drug resistance subfamily-specific and functionally asymmetric.

    PubMed

    Rawal, Manpreet Kaur; Banerjee, Atanu; Shah, Abdul Haseeb; Khan, Mohammad Firoz; Sen, Sobhan; Saxena, Ajay Kumar; Monk, Brian C; Cannon, Richard D; Bhatnagar, Rakesh; Mondal, Alok Kumar; Prasad, Rajendra

    2016-01-01

    An analysis of Candida albicans ABC transporters identified conserved related α-helical sequence motifs immediately C-terminal of each Walker A sequence. Despite the occurrence of these motifs in ABC subfamilies of other yeasts and higher eukaryotes, their roles in protein function remained unexplored. In this study we have examined the functional significance of these motifs in the C. albicans PDR transporter Cdr1p. The motifs present in NBD1 and NBD2 were subjected to alanine scanning mutagenesis, deletion, or replacement of an entire motif. Systematic replacement of individual motif residues with alanine did not affect the function of Cdr1p but deletion of the M1-motif in NBD1 (M1-Del) resulted in Cdr1p being trapped within the endoplasmic reticulum. In contrast, deletion of the M2-motif in NBD2 (M2-Del) yielded a non-functional protein with normal plasma membrane localization. Replacement of the motif in M1-Del with six alanines (M1-Ala) significantly improved localization of the protein and partially restored function. Conversely, replacement of the motif in M2-Del with six alanines (M2-Ala) did not reverse the phenotype and susceptibility to antifungal substrates of Cdr1p was unchanged. Together, the M1 and M2 motifs contribute to the functional asymmetry of NBDs and are important for maturation of Cdr1p and ATP catalysis, respectively. PMID:27251950

  13. Newly identified motifs in Candida albicans Cdr1 protein nucleotide binding domains are pleiotropic drug resistance subfamily-specific and functionally asymmetric

    PubMed Central

    Rawal, Manpreet Kaur; Banerjee, Atanu; Shah, Abdul Haseeb; Khan, Mohammad Firoz; Sen, Sobhan; Saxena, Ajay Kumar; Monk, Brian C.; Cannon, Richard D.; Bhatnagar, Rakesh; Mondal, Alok Kumar; Prasad, Rajendra

    2016-01-01

    An analysis of Candida albicans ABC transporters identified conserved related α-helical sequence motifs immediately C-terminal of each Walker A sequence. Despite the occurrence of these motifs in ABC subfamilies of other yeasts and higher eukaryotes, their roles in protein function remained unexplored. In this study we have examined the functional significance of these motifs in the C. albicans PDR transporter Cdr1p. The motifs present in NBD1 and NBD2 were subjected to alanine scanning mutagenesis, deletion, or replacement of an entire motif. Systematic replacement of individual motif residues with alanine did not affect the function of Cdr1p but deletion of the M1-motif in NBD1 (M1-Del) resulted in Cdr1p being trapped within the endoplasmic reticulum. In contrast, deletion of the M2-motif in NBD2 (M2-Del) yielded a non-functional protein with normal plasma membrane localization. Replacement of the motif in M1-Del with six alanines (M1-Ala) significantly improved localization of the protein and partially restored function. Conversely, replacement of the motif in M2-Del with six alanines (M2-Ala) did not reverse the phenotype and susceptibility to antifungal substrates of Cdr1p was unchanged. Together, the M1 and M2 motifs contribute to the functional asymmetry of NBDs and are important for maturation of Cdr1p and ATP catalysis, respectively. PMID:27251950

  14. Sequence-related human proteins cluster by degree of evolutionary conservation

    NASA Astrophysics Data System (ADS)

    Mrowka, Ralf; Patzak, Andreas; Herzel, Hanspeter; Holste, Dirk

    2004-11-01

    Gene duplication followed by adaptive evolution is thought to be a central mechanism for the emergence of novel genes. To illuminate the contribution of duplicated protein-coding sequences to the complexity of the human genome, we study the connectivity of pairwise sequence-related human proteins and construct a network (N) of linked protein sequences with shared similarities. We find that (i) the connectivity distribution P(k) for k sequence-related proteins decays as a power law P(k) ∼ k^(−γ) with γ ≈ 1.2, (ii) the top rank of N consists of a single large cluster of proteins (≈70%), while bottom ranks consist of multiple isolated clusters, and (iii) structural characteristics of N show both a high degree of clustering and an intermediate connectivity (“small-world” features). We gain further insight into structural properties of N by studying the relationship between the connectivity distribution and the phylogenetic conservation of proteins in bacteria, plants, invertebrates, and vertebrates. We find that (iv) the proportion of sequence-related proteins increases with increasing extent of evolutionary conservation. Our results support that small-world network properties constitute a footprint of an evolutionary mechanism and extend the traditional interpretation of protein families.
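
    A power-law decay of the connectivity distribution, with P(k) proportional to k raised to the power −γ, appears as a straight line of slope −γ on log-log axes. The sketch below estimates γ by ordinary least squares on log-transformed values. The abstract does not say how the authors fitted γ, so this is one simple (and fairly crude) estimator rather than their method; the function name and the synthetic data are assumptions.

```python
import math


def fit_power_law_exponent(ks, pks):
    """Estimate gamma in P(k) ~ k**(-gamma) by least squares on log-log values.

    ks  -- positive connectivities
    pks -- positive frequencies P(k), same length as ks
    """
    xs = [math.log(k) for k in ks]
    ys = [math.log(p) for p in pks]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return -slope  # slope of the log-log line is -gamma


if __name__ == "__main__":
    # Synthetic, noise-free data generated with gamma = 1.2 (not data from the study).
    ks = list(range(1, 50))
    pks = [k ** -1.2 for k in ks]
    print(round(fit_power_law_exponent(ks, pks), 3))
```

    On real, noisy degree counts a maximum-likelihood fit is usually preferred over log-log regression, because sparse counts in the tail bias the regression slope.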

  15. The Large Mitochondrial Genome of Symbiodinium minutum Reveals Conserved Noncoding Sequences between Dinoflagellates and Apicomplexans

    PubMed Central

    Shoguchi, Eiichi; Shinzato, Chuya; Hisata, Kanako; Satoh, Nori; Mungpakdee, Sutada

    2015-01-01

    Even though mitochondrial genomes, which characterize eukaryotic cells, were first discovered more than 50 years ago, mitochondrial genomics remains an important topic in molecular biology and genome sciences. The Phylum Alveolata comprises three major groups (ciliates, apicomplexans, and dinoflagellates), the mitochondrial genomes of which have diverged widely. Even though the gene content of dinoflagellate mitochondrial genomes is reportedly comparable to that of apicomplexans, the highly fragmented and rearranged genome structures of dinoflagellates have frustrated whole genomic analysis. Consequently, noncoding sequences and gene arrangements of dinoflagellate mitochondrial genomes have not been well characterized. Here we report that the continuous assembled genome (∼326 kb) of the dinoflagellate, Symbiodinium minutum, is AT-rich (∼64.3%) and that it contains three protein-coding genes. Based upon in silico analysis, the remaining 99% of the genome comprises transcriptomic noncoding sequences. RNA edited sites and unique, possible start and stop codons clarify conserved regions among dinoflagellates. Our massive transcriptome analysis shows that almost all regions of the genome are transcribed, including 27 possible fragmented ribosomal RNA genes and 12 uncharacterized small RNAs that are similar to mitochondrial RNA genes of the malarial parasite, Plasmodium falciparum. Gene map comparisons show that gene order is only slightly conserved between S. minutum and P. falciparum. However, small RNAs and intergenic sequences share sequence similarities with P. falciparum, suggesting that the function of noncoding sequences has been preserved despite development of very different genome structures. PMID:26199191

  16. The Large Mitochondrial Genome of Symbiodinium minutum Reveals Conserved Noncoding Sequences between Dinoflagellates and Apicomplexans.

    PubMed

    Shoguchi, Eiichi; Shinzato, Chuya; Hisata, Kanako; Satoh, Nori; Mungpakdee, Sutada

    2015-08-01

    Even though mitochondrial genomes, which characterize eukaryotic cells, were first discovered more than 50 years ago, mitochondrial genomics remains an important topic in molecular biology and genome sciences. The Phylum Alveolata comprises three major groups (ciliates, apicomplexans, and dinoflagellates), the mitochondrial genomes of which have diverged widely. Even though the gene content of dinoflagellate mitochondrial genomes is reportedly comparable to that of apicomplexans, the highly fragmented and rearranged genome structures of dinoflagellates have frustrated whole genomic analysis. Consequently, noncoding sequences and gene arrangements of dinoflagellate mitochondrial genomes have not been well characterized. Here we report that the continuous assembled genome (∼326 kb) of the dinoflagellate, Symbiodinium minutum, is AT-rich (∼64.3%) and that it contains three protein-coding genes. Based upon in silico analysis, the remaining 99% of the genome comprises transcriptomic noncoding sequences. RNA edited sites and unique, possible start and stop codons clarify conserved regions among dinoflagellates. Our massive transcriptome analysis shows that almost all regions of the genome are transcribed, including 27 possible fragmented ribosomal RNA genes and 12 uncharacterized small RNAs that are similar to mitochondrial RNA genes of the malarial parasite, Plasmodium falciparum. Gene map comparisons show that gene order is only slightly conserved between S. minutum and P. falciparum. However, small RNAs and intergenic sequences share sequence similarities with P. falciparum, suggesting that the function of noncoding sequences has been preserved despite development of very different genome structures. PMID:26199191

  17. Auditory sequence processing reveals evolutionarily conserved regions of frontal cortex in macaques and humans.

    PubMed

    Wilson, Benjamin; Kikuchi, Yukiko; Sun, Li; Hunter, David; Dick, Frederic; Smith, Kenny; Thiele, Alexander; Griffiths, Timothy D; Marslen-Wilson, William D; Petkov, Christopher I

    2015-01-01

    An evolutionary account of human language as a neurobiological system must distinguish between human-unique neurocognitive processes supporting language and evolutionarily conserved, domain-general processes that can be traced back to our primate ancestors. Neuroimaging studies across species may determine whether candidate neural processes are supported by homologous, functionally conserved brain areas or by different neurobiological substrates. Here we use functional magnetic resonance imaging in Rhesus macaques and humans to examine the brain regions involved in processing the ordering relationships between auditory nonsense words in rule-based sequences. We find that key regions in the human ventral frontal and opercular cortex have functional counterparts in the monkey brain. These regions are also known to be associated with initial stages of human syntactic processing. This study raises the possibility that certain ventral frontal neural systems, which play a significant role in language function in modern humans, originally evolved to support domain-general abilities involved in sequence processing. PMID:26573340

  18. Auditory sequence processing reveals evolutionarily conserved regions of frontal cortex in macaques and humans

    PubMed Central

    Wilson, Benjamin; Kikuchi, Yukiko; Sun, Li; Hunter, David; Dick, Frederic; Smith, Kenny; Thiele, Alexander; Griffiths, Timothy D.; Marslen-Wilson, William D.; Petkov, Christopher I.

    2015-01-01

    An evolutionary account of human language as a neurobiological system must distinguish between human-unique neurocognitive processes supporting language and evolutionarily conserved, domain-general processes that can be traced back to our primate ancestors. Neuroimaging studies across species may determine whether candidate neural processes are supported by homologous, functionally conserved brain areas or by different neurobiological substrates. Here we use functional magnetic resonance imaging in Rhesus macaques and humans to examine the brain regions involved in processing the ordering relationships between auditory nonsense words in rule-based sequences. We find that key regions in the human ventral frontal and opercular cortex have functional counterparts in the monkey brain. These regions are also known to be associated with initial stages of human syntactic processing. This study raises the possibility that certain ventral frontal neural systems, which play a significant role in language function in modern humans, originally evolved to support domain-general abilities involved in sequence processing. PMID:26573340

  19. Envelope formation is blocked by mutation of a sequence related to the HKD phospholipid metabolism motif in the vaccinia virus F13L protein.

    PubMed

    Roper, R L; Moss, B

    1999-02-01

    The outer envelope of the extracellular form of vaccinia virus is derived from Golgi membranes that have been modified by the insertion of specific viral proteins, of which the major component is the 37-kDa, palmitylated, nonglycosylated product of the F13L gene. The F13L protein contains a variant of the HKD (His-Lys-Asp) motif, which is conserved in numerous enzymes of phospholipid metabolism. Vaccinia virus mutants with a conservative substitution of either the K (K314R) or the D (D319E) residue of the F13L protein formed only tiny plaques similar to those produced by an F13L deletion mutant, were unable to produce extracellular enveloped virions, and failed to mediate low-pH-induced fusion of infected cells. Membrane-wrapped forms of intracellular virus were rarely detected in electron microscopic images of cells infected with either of the mutants. Western blotting and pulse-chase experiments demonstrated that the D319E protein was less stable than either the K314R or wild-type F13L protein. Most striking, however, was the failure of either of the two mutated proteins to concentrate in the Golgi compartment. Palmitylation, oleation, and partitioning of the F13L protein in Triton X-114 detergent were unaffected by the K314R substitution. These results indicated that the F13L protein must retain the K314 and D319 for it to localize in the Golgi compartment and function in membrane envelopment of vaccinia virus. PMID:9882312
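
    Motifs of this kind are often located with a regular-expression scan. The sketch below assumes the commonly cited core form of the HKD motif, H-x-K-x(4)-D; the abstract does not spell out the pattern, and F13L carries a variant of it, so the pattern, the function name and the toy sequence are all assumptions. The spacing does agree with the residue numbers given above: K314 and D319 are five positions apart, which matches K followed by four residues and then D.

```python
import re

# Assumed core pattern (H, any residue, K, any four residues, D).
# Wrapped in a lookahead so overlapping matches are also reported.
HKD_CORE = re.compile(r"(?=(H.K.{4}D))")


def find_hkd_like(protein: str):
    """Return 1-based start positions of HKD-like core matches in a protein sequence."""
    seq = protein.upper()
    return [match.start() + 1 for match in HKD_CORE.finditer(seq)]


if __name__ == "__main__":
    toy = "MAAHGKTLLSDPP"  # made-up sequence, not F13L
    print(find_hkd_like(toy))
```

    Detecting a variant such as the one in F13L would need a relaxed pattern at the first position, which is not attempted here.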

  20. Evolutionary Analysis and Classification of OATs, OCTs, OCTNs, and Other SLC22 Transporters: Structure-Function Implications and Analysis of Sequence Motifs

    PubMed Central

    Date, Rishabh C.; Bush, Kevin T.; Springer, Stevan A.; Saier, Milton H.; Wu, Wei; Nigam, Sanjay K.

    2015-01-01

    The SLC22 family includes organic anion transporters (OATs), organic cation transporters (OCTs) and organic carnitine and zwitterion transporters (OCTNs). These are often referred to as drug transporters even though they interact with many endogenous metabolites and signaling molecules (Nigam, S.K., Nature Reviews Drug Discovery, 14:29–44, 2015). Phylogenetic analysis of SLC22 supports the view that these transporters may have evolved over 450 million years ago. Many OAT members were found to appear after a major expansion of the SLC22 family in mammals, suggesting a physiological and/or toxicological role during the mammalian radiation. Putative SLC22 orthologs exist in worms, sea urchins, flies, and Ciona. At least six groups of SLC22 exist. OATs and OCTs form two major clades of SLC22, within which (apart from Oat and Oct subclades), there are also clear Oat-like, Octn, and Oct-related subclades, as well as a distantly related group we term “Oat-related” (which may have different functions). Based on available data, it is arguable whether SLC22A18, which is related to bacterial drug-proton antiporters, should be assigned to SLC22. Disease-causing mutations, single nucleotide polymorphisms (SNPs) and other functionally analyzed mutations in OAT1, OAT3, URAT1, OCT1, OCT2, OCTN1, and OCTN2 map to the first extracellular domain, the large central intracellular domain, and transmembrane domains 9 and 10. These regions are highly conserved within subclades, but not between subclades, and may be necessary for SLC22 transporter function and functional diversification. Our results not only link function to evolutionarily conserved motifs but indicate the need for a revised sub-classification of SLC22. PMID:26536134

  1. Genome-wide identification of conserved regulatory function in diverged sequences

    PubMed Central

    Taher, Leila; McGaughey, David M.; Maragh, Samantha; Aneas, Ivy; Bessling, Seneca L.; Miller, Webb; Nobrega, Marcelo A.; McCallion, Andrew S.; Ovcharenko, Ivan

    2011-01-01

    Plasticity of gene regulatory encryption can permit DNA sequence divergence without loss of function. Functional information is preserved through conservation of the composition of transcription factor binding sites (TFBS) in a regulatory element. We have developed a method that can accurately identify pairs of functional noncoding orthologs at evolutionarily diverged loci by searching for conserved TFBS arrangements. With an estimated 5% false-positive rate (FPR) in approximately 3000 human and zebrafish syntenic loci, we detected approximately 300 pairs of diverged elements that are likely to share common ancestry and have similar regulatory activity. By analyzing a pool of experimentally validated human enhancers, we demonstrated that 7/8 (88%) of their predicted functional orthologs retained in vivo regulatory control. Moreover, in 5/7 (71%) of assayed enhancer pairs, we observed concordant expression patterns. We argue that TFBS composition is often necessary to retain and sufficient to predict regulatory function in the absence of overt sequence conservation, revealing an entire class of functionally conserved, evolutionarily diverged regulatory elements that we term “covert.” PMID:21628450

  2. A conserved motif N-terminal to the DNA-binding domains of myogenic bHLH transcription factors mediates cooperative DNA binding with pbx-Meis1/Prep1.

    PubMed

    Knoepfler, P S; Bergstrom, D A; Uetsuki, T; Dac-Korytko, I; Sun, Y H; Wright, W E; Tapscott, S J; Kamps, M P

    1999-09-15

    The t(1;19) chromosomal translocation of pediatric pre-B cell leukemia produces chimeric oncoprotein E2a-Pbx1, which contains the N-terminal transactivation domain of the basic helix-loop-helix (bHLH) transcription factor, E2a, joined to the majority of the homeodomain protein, Pbx1. There are three Pbx family members, which bind DNA as heterodimers with both broadly expressed Meis/Prep1 homeodomain proteins and specifically expressed Hox homeodomain proteins. These Pbx heterodimers can augment the function of transcriptional activators bound to adjacent elements. In heterodimers, a conserved tryptophan motif in Hox proteins binds a pocket on the surface of the Pbx homeodomain, while Meis/Prep1 proteins bind an N-terminal Pbx domain, raising the possibility that the tryptophan-interaction pocket of the Pbx component of a Pbx-Meis/Prep1 complex is still available to bind tryptophan motifs of other transcription factors bound to flanking elements. Here, we report that Pbx-Meis1/Prep1 binds DNA cooperatively with heterodimers of E2a and MyoD, myogenin, Mrf-4 or Myf-5. As with Hox proteins, a highly conserved tryptophan motif N-terminal to the DNA-binding domains of each myogenic bHLH family protein is required for cooperative DNA binding with Pbx-Meis1/Prep1. In vivo, MyoD requires this tryptophan motif to evoke chromatin remodeling in the Myogenin promoter and to activate Myogenin transcription. Pbx-Meis/Prep1 complexes, therefore, have the potential to cooperate with the myogenic bHLH proteins in regulating gene transcription. PMID:10471746

  3. Conservation of uORF repressiveness and sequence features in mouse, human and zebrafish

    PubMed Central

    Chew, Guo-Liang; Pauli, Andrea; Schier, Alexander F.

    2016-01-01

    Upstream open reading frames (uORFs) are ubiquitous repressive genetic elements in vertebrate mRNAs. While much is known about the regulation of individual genes by their uORFs, the range of uORF-mediated translational repression in vertebrate genomes is largely unexplored. Moreover, it is unclear whether the repressive effects of uORFs are conserved across species. To address these questions, we analyse transcript sequences and ribosome profiling data from human, mouse and zebrafish. We find that uORFs are depleted near coding sequences (CDSes) and have initiation contexts that diminish their translation. Linear modelling reveals that sequence features at both uORFs and CDSes modulate the translation of CDSes. Moreover, the ratio of translation over 5′ leaders and CDSes is conserved between human and mouse, and correlates with the number of uORFs. These observations suggest that the prevalence of vertebrate uORFs may be explained by their conserved role in repressing CDS translation. PMID:27216465

  4. Conservation of uORF repressiveness and sequence features in mouse, human and zebrafish.

    PubMed

    Chew, Guo-Liang; Pauli, Andrea; Schier, Alexander F

    2016-01-01

    Upstream open reading frames (uORFs) are ubiquitous repressive genetic elements in vertebrate mRNAs. While much is known about the regulation of individual genes by their uORFs, the range of uORF-mediated translational repression in vertebrate genomes is largely unexplored. Moreover, it is unclear whether the repressive effects of uORFs are conserved across species. To address these questions, we analyse transcript sequences and ribosome profiling data from human, mouse and zebrafish. We find that uORFs are depleted near coding sequences (CDSes) and have initiation contexts that diminish their translation. Linear modelling reveals that sequence features at both uORFs and CDSes modulate the translation of CDSes. Moreover, the ratio of translation over 5' leaders and CDSes is conserved between human and mouse, and correlates with the number of uORFs. These observations suggest that the prevalence of vertebrate uORFs may be explained by their conserved role in repressing CDS translation. PMID:27216465

  5. Motif types, motif locations and base composition patterns around the RNA polyadenylation site in microorganisms, plants and animals

    PubMed Central

    2014-01-01

    Background The polyadenylation of RNA is critical for gene functioning, but the conserved sequence motifs (often called signal or signature motifs), motif locations and abundances, and base composition patterns around mRNA polyadenylation [poly(A)] sites are still uncharacterized in most species. The evolutionary tendency for poly(A) site selection is still largely unknown. Results We analyzed the poly(A) site regions of 31 species or phyla. Different groups of species showed different poly(A) signal motifs: UUACUU at the poly(A) site in the parasite Trypanosoma cruzi; UGUAAC (approximately 13 bases upstream of the site) in the alga Chlamydomonas reinhardtii; UGUUUG (or UGUUUGUU) at mainly the fourth base downstream of the poly(A) site in the parasite Blastocystis hominis; and AAUAAA at approximately 16 bases and approximately 19 bases upstream of the poly(A) site in animals and plants, respectively. Polyadenylation signal motifs are usually several hundred times more abundant around poly(A) sites than in whole genomes. These predominant motifs usually had very specific locations, whether upstream of, at, or downstream of poly(A) sites, depending on the species or phylum. The poly(A) site was usually an adenosine (A) in all analyzed species except for B. hominis, and there was weak A predominance in C. reinhardtii. Fungi, animals, plants, and the protist Phytophthora infestans shared a general base abundance pattern (or base composition pattern) of “U-rich—A-rich—U-rich—Poly(A) site—U-rich regions”, or U-A-U-A-U for short, with some variation for each kingdom or subkingdom. Conclusion This study identified the poly(A) signal motifs, motif locations, and base composition patterns around mRNA poly(A) sites in protists, fungi, plants, and animals and provided insight into poly(A) site evolution. PMID:25052519
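
    The position-specific counts described in this abstract (for example, AAUAAA peaking roughly 16 bases upstream of the poly(A) site in animals) can be illustrated with a small sketch. This is not the authors' pipeline: the function name `motif_offsets`, the input convention (each region is the RNA sequence immediately upstream of the site, so its last base sits just before the site), and the toy sequences are all assumptions made for illustration.

```python
from collections import Counter


def motif_offsets(regions, motif):
    """Count motif occurrences by distance upstream of the poly(A) site.

    Each region is assumed to end immediately before the poly(A) site,
    so a motif starting at index i lies len(region) - i bases upstream.
    Overlapping occurrences are counted.
    """
    counts = Counter()
    for region in regions:
        start = region.find(motif)
        while start != -1:
            counts[len(region) - start] += 1
            start = region.find(motif, start + 1)
    return counts


# Toy upstream regions (made up for illustration, not real data).
regions = [
    "GGAAUAAACCCC",
    "AAUAAAGGGGGGGGGG",
    "CCCCCCAAUAAAGGGGGGGGGG",
]
offsets = motif_offsets(regions, "AAUAAA")
```

    A histogram of `offsets` over many real sites would show whether a motif has a preferred location; comparing its frequency there with its genome-wide frequency would give the kind of enrichment ratio the abstract mentions.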

  6. Sequence conservation and functional constraint on intergenic spacers in reduced genomes of the obligate symbiont Buchnera.

    PubMed

    Degnan, Patrick H; Ochman, Howard; Moran, Nancy A

    2011-09-01

    Analyses of genome reduction in obligate bacterial symbionts typically focus on the removal and retention of protein-coding regions, which are subject to ongoing inactivation and deletion. However, these same forces operate on intergenic spacers (IGSs) and affect their contents, maintenance, and rates of evolution. IGSs comprise both non-coding, non-functional regions, including decaying pseudogenes at varying stages of recognizability, as well as functional elements, such as genes for sRNAs and regulatory control elements. The genomes of Buchnera and other small genome symbionts display biased nucleotide compositions and high rates of sequence evolution and contain few recognizable regulatory elements. However, IGS lengths are highly correlated across divergent Buchnera genomes, suggesting the presence of functional elements. To identify functional regions within the IGSs, we sequenced two Buchnera genomes (from aphid species Uroleucon ambrosiae and Acyrthosiphon kondoi) and applied a phylogenetic footprinting approach to alignments of orthologous IGSs from a total of eight Buchnera genomes corresponding to six aphid species. Inclusion of these new genomes allowed comparative analyses at intermediate levels of divergence, enabling the detection of both conserved elements and previously unrecognized pseudogenes. Analyses of these genomes revealed that 232 of 336 IGS alignments over 50 nucleotides in length displayed substantial sequence conservation. Conserved alignment blocks within these IGSs encompassed 88 Shine-Dalgarno sequences, 55 transcriptional terminators, 5 Sigma-32 binding sites, and 12 novel small RNAs. Although pseudogene formation, and thus IGS formation, are ongoing processes in these genomes, a large proportion of intergenic spacers contain functional sequences. PMID:21912528

  7. Variation in conserved non-coding sequences on chromosome 5q and susceptibility to asthma and atopy

    SciTech Connect

    Donfack, Joseph; Schneider, Daniel H.; Tan, Zheng; Kurz, Thorsten; Dubchak, Inna; Frazer, Kelly A.; Ober, Carole

    2005-09-10

    Background: Evolutionarily conserved sequences likely have biological function. Methods: To determine whether variation in conserved sequences in non-coding DNA contributes to risk for human disease, we studied six conserved non-coding elements in the Th2 cytokine cluster on human chromosome 5q31 in a large Hutterite pedigree and in samples of outbred European American and African American asthma cases and controls. Results: Among six conserved non-coding elements (>100 bp, >70 percent identity; human-mouse comparison), we identified one single nucleotide polymorphism (SNP) in each of two conserved elements and six SNPs in the flanking regions of three conserved elements. We genotyped our samples for four of these SNPs and an additional three SNPs each in the IL13 and IL4 genes. While there was only modest evidence for association with single SNPs in the Hutterite and European American samples (P<0.05), there were highly significant associations in European Americans between asthma and haplotypes comprised of SNPs in the IL4 gene (P<0.001), including a SNP in a conserved non-coding element. Furthermore, variation in the IL13 gene was strongly associated with total IgE (P = 0.00022) and allergic sensitization to mold allergens (P = 0.00076) in the Hutterites, and more modestly associated with sensitization to molds in the European Americans and African Americans (P<0.01). Conclusion: These results indicate that there is overall little variation in the conserved non-coding elements on 5q31, but variation in IL4 and IL13, including possibly one SNP in a conserved element, influence asthma and atopic phenotypes in diverse populations.

  8. A survey of DNA motif finding algorithms

    PubMed Central

    Das, Modan K; Dai, Ho-Kwok

    2007-01-01

    Background Unraveling the mechanisms that regulate gene expression is a major challenge in biology. An important task in this challenge is to identify regulatory elements, especially the binding sites in deoxyribonucleic acid (DNA) for transcription factors. These binding sites are short DNA segments that are called motifs. Recent advances in genome sequence availability and in high-throughput gene expression analysis technologies have allowed for the development of computational methods for motif finding. As a result, a large number of motif finding algorithms have been implemented and applied to various motif models over the past decade. This survey reviews the latest developments in DNA motif finding algorithms. Results Earlier algorithms use promoter sequences of coregulated genes from single genome and search for statistically overrepresented motifs. Recent algorithms are designed to use phylogenetic footprinting or orthologous sequences and also an integrated approach where promoter sequences of coregulated genes and phylogenetic footprinting are used. All the algorithms studied have been reported to correctly detect the motifs that have been previously detected by laboratory experimental approaches, and some algorithms were able to find novel motifs. However, most of these motif finding algorithms have been shown to work successfully in yeast and other lower organisms, but perform significantly worse in higher organisms. Conclusion Despite considerable efforts to date, DNA motif finding remains a complex challenge for biologists and computer scientists. Researchers have taken many different approaches in developing motif discovery tools and the progress made in this area of research is very encouraging. Performance comparison of different motif finding tools and identification of the best tools have proven to be a difficult task because tools are designed based on algorithms and motif models that are diverse and complex and our incomplete understanding of

  9. Characterization of G protein coupling mediated by the conserved D134(3.49) of DRY motif, M241(6.34), and F251(6.44) residues on human CXCR1

    PubMed Central

    Han, Xinbing; Feng, Yan; Chen, Xinhua; Gerard, Craig; Boisvert, William A.

    2015-01-01

    CXCR1, a receptor for interleukin-8 (IL-8), plays an important role in defending against pathogen invasion during neutrophil-mediated innate immune response. Human CXCR1 is a G protein-coupled receptor (GPCR) with its characteristic seven transmembrane domains (TMs). Functional and structural analyses of several GPCRs have revealed that conserved residues on TM3 (including the highly conserved Asp-Arg-Tyr (DRY) motif) and TM6 near intracellular loops contain domains critical for G protein coupling as well as GPCR activation. The objective of this study was to elucidate the role of critical amino acid residues on TM3 near intracellular loop 2 (i2) and TM6 near intracellular loop 3 (i3), including S132(3.47) (Baldwin location), D134(3.49), M241(6.34), and F251(6.44), in G protein coupling and CXCR1 activation. The results demonstrate that mutations of D134(3.49) at DRY motif of CXCR1 (D134N and D134V) completely abolished the ligand binding and functional response of the receptor. Additionally, point mutations at positions 241 and 251 between TM6 and i3 loop generated mutant receptors with modest constitutive activity via Gα15 signaling activation. Our results show that D134(3.49) on the highly conserved DRY motif has a distinct role for CXCR1 compared to its homologues (CXCR2 and KSHV-GPCR) in G protein coupling and receptor activation. In addition, M241(6.34) and F251(6.44) along with our previously identified V247(6.40) on TM6 are spatially located in a “hot spot” likely essential for CXCR1 activation. Identification of these amino acid residues may be useful for elucidating mechanism of CXCR1 activation and designing specific antagonists for the treatment of CXCR1-mediated diseases. PMID:25834784

  10. Quantification of tertiary structural conservation despite primary sequence drift in the globin fold.

    PubMed

    Aronson, H E; Royer, W E; Hendrickson, W A

    1994-10-01

    The globin family of protein structures was the first for which it was recognized that tertiary structure can be highly conserved even when primary sequences have diverged to a virtually undetectable level of similarity. This principle of structural inertia in molecular evolution is now evident for many other protein families. We have performed a systematic comparison of the sequences and structures of 6 representative hemoglobin subunits as diverse in origin as plants, clams, and humans. Our analysis is based on a 97-residue helical core in common to all 6 structures. Amino acid sequence identities range from 12.4% to 42.3% in pairwise comparisons, and, despite these variations, the maximal RMS deviation in alpha-carbon positions is 3.02 A. Overall, sequence similarity and structural deviation are significantly anticorrelated, with a correlation coefficient of -0.71, but for a set of structures having under 20% pairwise identity, this anticorrelation falls to -0.38, which emphasizes the weak connection between a specific sequence and the tertiary fold. There is substantial variability in structure outside the helical core, and functional characteristics of these globins also differ appreciably. Nevertheless, despite variations in detail that the sequence dissimilarities and functional differences imply, the core structures of these globins remain remarkably preserved. PMID:7849587
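
    The three quantities this abstract relates (pairwise sequence identity over a common core, RMS deviation in alpha-carbon positions, and the correlation between the two) can be sketched as follows. This is an illustrative sketch only, not the authors' procedure: it assumes the sequences are already aligned to equal length and the coordinate sets are already optimally superposed, and the function names are hypothetical.

```python
import math


def percent_identity(seq_a, seq_b):
    """Percentage of identical residues between two pre-aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def rmsd(coords_a, coords_b):
    """RMS deviation between paired 3D points (assumed already superposed)."""
    squared = [
        sum((p - q) ** 2 for p, q in zip(u, v))
        for u, v in zip(coords_a, coords_b)
    ]
    return math.sqrt(sum(squared) / len(squared))


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)
```

    Applying `pearson` to the per-pair identities and RMS deviations of a set of structures yields a single coefficient of the kind reported above; a negative value means that more similar sequences tend to have smaller structural deviation.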

  11. Highly conserved D-loop-like nuclear mitochondrial sequences (Numts) in tiger (Panthera tigris).

    PubMed

    Zhang, Wenping; Zhang, Zhihe; Shen, Fujun; Hou, Rong; Lv, Xiaoping; Yue, Bisong

    2006-08-01

    Using oligonucleotide primers designed to match hypervariable segments I (HVS-1) of Panthera tigris mitochondrial DNA (mtDNA), we amplified two different PCR products (500 bp and 287 bp) in the tiger (Panthera tigris), but got only one PCR product (287 bp) in the leopard (Panthera pardus). Sequence analyses indicated that the sequence of 287 bp was a D-loop-like nuclear mitochondrial sequence (Numts), indicating a nuclear transfer that occurred approximately 4.8-17 million years ago in the tiger and 4.6-16 million years ago in the leopard. Although the mtDNA D-loop sequence has a rapid rate of evolution, the 287-bp Numts are highly conserved; they are nearly identical in tiger subspecies and only 1.742% different between tiger and leopard. Thus, such sequences represent molecular 'fossils' that can shed light on evolution of the mitochondrial genome and may be the most appropriate outgroup for phylogenetic analysis. This is also proved by comparing the phylogenetic trees reconstructed using the D-loop sequence of snow leopard and the 287-bp Numts as outgroup. PMID:17072079

  12. The Crystal Structure of the Extracellular 11-heme Cytochrome UndA Reveals a Conserved 10-heme Motif and Defined Binding Site for Soluble Iron Chelates.

    SciTech Connect

    Edwards, Marcus; Hall, Andrea; Shi, Liang; Fredrickson, Jim K.; Zachara, John M.; Butt, Julea N.; Richardson, David; Clarke, Thomas A.

    2012-07-03

    Members of the genus Shewanella translocate deca- or undeca-heme cytochromes to the external cell surface thus enabling respiration using extracellular minerals and polynuclear Fe(III) chelates. The high resolution structure of the first undeca-heme outer membrane cytochrome, UndA, reveals a crossed heme chain with four potential electron ingress/egress sites arranged within four domains. Sequence and structural alignment of UndA and the deca-heme MtrF reveals the extra heme of UndA is inserted between MtrF hemes 6 and 7. The remaining UndA hemes can be superposed over the heme chain of the decaheme MtrF, suggesting that a ten heme core is conserved between outer membrane cytochromes. The UndA structure is the first outer membrane cytochrome to be crystallographically resolved in complex with substrates, an Fe(III)-nitrilotriacetate dimer or an Fe(III)-citrate trimer. The structural resolution of these UndA-Fe(III)-chelate complexes provides a rationale for previous kinetic measurements on UndA and other outer membrane cytochromes.

  13. Coagulase and Efb of Staphylococcus aureus Have a Common Fibrinogen Binding Motif

    PubMed Central

    Ko, Ya-Ping; Kang, Mingsong; Ganesh, Vannakambadi K.; Ravirajan, Dharmanand; Li, Bin

    2016-01-01

    ABSTRACT Coagulase (Coa) and Efb, secreted Staphylococcus aureus proteins, are important virulence factors in staphylococcal infections. Coa interacts with fibrinogen (Fg) and induces the formation of fibrin(ogen) clots through activation of prothrombin. Efb attracts Fg to the bacterial surface and forms a shield to protect the bacteria from phagocytic clearance. This communication describes the use of an array of synthetic peptides to identify variants of a linear Fg binding motif present in Coa and Efb which are responsible for the Fg binding activities of these proteins. This motif represents the first Fg binding motif identified for any microbial protein. We initially located the Fg binding sites to Coa’s C-terminal disordered segment containing tandem repeats by using recombinant fragments of Coa in enzyme-linked immunosorbent assay-type binding experiments. Sequence analyses revealed that this Coa region contained shorter segments with sequences similar to the Fg binding segments in Efb. An alanine scanning approach allowed us to identify the residues in Coa and Efb that are critical for Fg binding and to define the Fg binding motifs in the two proteins. In these motifs, the residues required for Fg binding are largely conserved, and they therefore constitute variants of a common Fg binding motif which binds to Fg with high affinity. Defining a specific motif also allowed us to identify a functional Fg binding register for the Coa repeats that is different from the repeat unit previously proposed. PMID:26733070

  14. A Short Sequence Motif in the 5′ Leader of the HIV-1 Genome Modulates Extended RNA Dimer Formation and Virus Replication*

    PubMed Central

    van Bel, Nikki; Das, Atze T.; Cornelissen, Marion; Abbink, Truus E. M.; Berkhout, Ben

    2014-01-01

    The 5′ leader of the HIV-1 RNA genome encodes signals that control various steps in the replication cycle, including the dimerization initiation signal (DIS) that triggers RNA dimerization. The DIS folds a hairpin structure with a palindromic sequence in the loop that allows RNA dimerization via intermolecular kissing loop (KL) base pairing. The KL dimer can be stabilized by including the DIS stem nucleotides in the intermolecular base pairing, forming an extended dimer (ED). The role of the ED RNA dimer in HIV-1 replication has hardly been addressed because of technical challenges. We analyzed a set of leader mutants with a stabilized DIS hairpin for in vitro RNA dimerization and virus replication in T cells. In agreement with previous observations, DIS hairpin stability modulated KL and ED dimerization. An unexpected previous finding was that mutation of three nucleotides immediately upstream of the DIS hairpin significantly reduced in vitro ED formation. In this study, we tested such mutants in vivo for the importance of the ED in HIV-1 biology. Mutants with a stabilized DIS hairpin replicated less efficiently than WT HIV-1. This defect was most severe when the upstream sequence motif was altered. Virus evolution experiments with the defective mutants yielded fast replicating HIV-1 variants with second site mutations that (partially) restored the WT hairpin stability. Characterization of the mutant and revertant RNA molecules and the corresponding viruses confirmed the correlation between in vitro ED RNA dimer formation and efficient virus replication, thus indicating that the ED structure is important for HIV-1 replication. PMID:25368321

  15. Function of a unique sequence motif in the long terminal repeat of feline leukemia virus isolated from an unusual set of naturally occurring tumors.

    PubMed

    Athas, G B; Lobelle-Rich, P; Levy, L S

    1995-06-01

    Feline leukemia virus (FeLV) proviruses have been characterized from naturally occurring non-B-cell, non-T-cell tumors occurring in the spleens of infected cats. These proviruses exhibit a unique sequence motif in the long terminal repeat (LTR), namely, a 21-bp tandem triplication beginning 25 bp downstream of the enhancer. The repeated finding of the triplication-containing LTR in non-B-cell, non-T-cell lymphomas of the spleen suggests that the unique LTR is an essential participant in the development of tumors of this particular phenotype. The nucleotide sequence of the triplication-containing LTR most closely resembles that of FeLV subgroup C. Studies performed to measure the ability of the triplication-containing LTR to modulate gene expression indicate that the 21-bp triplication provides transcriptional enhancer function to the LTR that contains it and that it substitutes at least in part for the duplication of the enhancer. The 21-bp triplication confers a bona fide enhancer function upon LTR-directed reporter gene expression; however, the possibility of a spacer function was not eliminated. The studies demonstrate further that the triplication-containing LTR acts preferentially in a cell-type-specific manner, i.e., it is 12-fold more active in K-562 cells than is an LTR lacking the triplication. A recombinant, infectious FeLV bearing the 21-bp triplication in U3 was constructed. Cells infected with the recombinant were shown to accumulate higher levels of viral RNA transcripts and virus particles in culture supernatants than did cells infected with the parental type. The triplication-containing LTR is implicated in the induction of tumors of a particular phenotype, perhaps through transcriptional regulation of the virus and/or adjacent cellular genes, in the appropriate target cell. PMID:7745680

  16. Reptiles and Mammals Have Differentially Retained Long Conserved Noncoding Sequences from the Amniote Ancestor

    PubMed Central

    Janes, D.E.; Chapus, C.; Gondo, Y.; Clayton, D.F.; Sinha, S.; Blatti, C.A.; Organ, C.L.; Fujita, M.K.; Balakrishnan, C.N.; Edwards, S.V.

    2011-01-01

    Many noncoding regions of genomes appear to be essential to genome function. Conservation of large numbers of noncoding sequences has been reported repeatedly among mammals but not thus far among birds and reptiles. By searching genomes of chicken (Gallus gallus), zebra finch (Taeniopygia guttata), and green anole (Anolis carolinensis), we quantified the conservation among birds and reptiles and across amniotes of long, conserved noncoding sequences (LCNS), which we define as sequences ≥500 bp in length and exhibiting ≥95% similarity between species. We found 4,294 LCNS shared between chicken and zebra finch and 574 LCNS shared by the two birds and Anolis. The percent of genomes comprised by LCNS in the two birds (0.0024%) is notably higher than the percent in mammals (<0.0003% to <0.001%), differences that we show may be explained in part by differences in genome-wide substitution rates. We reconstruct a large number of LCNS for the amniote ancestor (ca. 8,630) and hypothesize differential loss and substantial turnover of these sites in descendent lineages. By contrast, we estimated a small role for recruitment of LCNS via acquisition of novel functions over time. Across amniotes, LCNS are significantly enriched with transcription factor binding sites for many developmental genes, and 2.9% of LCNS shared between the two birds show evidence of expression in brain expressed sequence tag databases. These results show that the rate of retention of LCNS from the amniote ancestor differs between mammals and Reptilia (including birds) and that this may reflect differing roles and constraints in gene regulation. PMID:21183607

  17. Binding of Actinomycin D to Single-Stranded DNA of Sequence Motifs d(TGTCTnG) and d(TGTnGTCT)

    PubMed Central

    Chen, Fu-Ming; Sha, Feng; Chin, Ko-Hsin; Chou, Shan-Ho

    2003-01-01

    Our recent binding studies with oligomers derived from base replacements on d(CGTCGTCG) had led to the finding that actinomycin D (ACTD) binds strongly to d(TGTCATTG) of apparent single-stranded conformation without GpC sequence. A fold-back binding model was speculated in which the planar phenoxazone inserts at the GTC site with a loop-out T base whereas the G base at the 3′-terminus folds back to form a basepair with the internal C and stacks on the opposite face of the chromophore. To provide a more concrete support for such a model, ACTD equilibrium binding studies were carried out and the results are reported herein on oligomers of sequence motifs d(TGTCTnG) and d(TGTnGTC). These oligomers are not expected to form dimeric duplexes and contain no canonical GpC sequences. It was found that ACTD binds strongly to d(TGTCTTTTG), d(TGTTTTGTC), and d(TGTTTTTGTC), all exhibiting 1:1 drug/strand binding stoichiometry. The fold-back binding model with displaced T base is further supported by the finding that appending TC and TCA at the 3′-terminus of d(TGTCTTTTG) results in oligomers that exhibit enhanced ACTD affinities, consequence of the added basepairing to facilitate the hairpin formation of d(TGTCTTTTGTC) and d(TGTCTTTTGTCA) in stabilizing the GTC/GTC binding site for juxtaposing the two G bases for easy stacking on both faces of the phenoxazone chromophore. Further support comes from the observation of considerable reduction in ACTD affinity when GTC is replaced by GTTC in an oligomer, in line with the reasoning that displacing two T bases to form a bulge for ACTD binding is more difficult than displacing a single base. Based on the elucidated binding principle of phenoxazone ring requiring its opposite faces to be stacked by the 3′-sides of two G bases for tight ACTD binding, several oligonucleotide sequences have been designed and found to bind well. PMID:12524296

  18. MotifMiner: A Table Driven Greedy Algorithm for DNA Motif Mining

    NASA Astrophysics Data System (ADS)

    Seeja, K. R.; Alam, M. A.; Jain, S. K.

    DNA motif discovery is a much explored problem in functional genomics. This paper describes a table driven greedy algorithm for discovering regulatory motifs in the promoter sequences of co-expressed genes. The proposed algorithm searches both DNA strands for the common patterns or motifs. The inputs to the algorithm are a set of promoter sequences, the motif length, and the minimum Information Content. The algorithm generates subsequences of the given length from the shortest input promoter sequence. It stores these subsequences and their reverse complements in a table. Then it searches the remaining sequences for good matches of these subsequences. The Information Content score is used to measure the goodness of the motifs. The algorithm has been tested with synthetic data and real data, and the results are promising. The algorithm could discover meaningful motifs from the muscle specific regulatory sequences.
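
    The search outlined in this abstract can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the MotifMiner implementation: the abstract does not define a "good match" or the exact Information Content formula, so the sketch assumes Hamming distance for matching, a uniform background of 0.25 per base, and scanning both strands of each remaining sequence in place of a stored reverse-complement table. All function names are hypothetical.

```python
import math
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def information_content(sites, background=0.25):
    """Sum over columns of f * log2(f / background) for observed bases."""
    n = len(sites)
    total = 0.0
    for column in zip(*sites):
        for count in Counter(column).values():
            freq = count / n
            total += freq * math.log2(freq / background)
    return total


def hamming(a, b):
    """Number of mismatching positions between equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def best_match(seed, seq):
    """Window of seq (either strand) closest to seed by Hamming distance."""
    k = len(seed)
    windows = [
        strand[i:i + k]
        for strand in (seq, reverse_complement(seq))
        for i in range(len(strand) - k + 1)
    ]
    return min(windows, key=lambda w: hamming(seed, w))


def greedy_motif(seqs, k, min_ic):
    """Seed from the shortest sequence; keep the highest-scoring site set."""
    shortest = min(seqs, key=len)
    others = [s for s in seqs if s is not shortest]
    best = None
    for i in range(len(shortest) - k + 1):
        seed = shortest[i:i + k]
        sites = [seed] + [best_match(seed, s) for s in others]
        score = information_content(sites)
        if score >= min_ic and (best is None or score > best[0]):
            best = (score, sites)
    return best  # None if no candidate reaches min_ic


# Toy promoters (made up); the third carries the motif on the reverse strand.
toy = ["TTACGTGAA", "GGACGTGCCT", "CCTCACGTAAA"]
result = greedy_motif(toy, k=5, min_ic=8.0)
```

    Seeding only from the shortest sequence keeps the candidate table small, at the cost of missing any motif that is absent or badly degraded in that one sequence.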

  19. Polyclonal antibody against conserved sequences of mce1A protein blocks MTB infection in macrophages.

    PubMed

    Sivagnanam, Sasikala; Namasivayam, Nalini; Chellam, Rajamanickam

    2012-03-01

    The pathogenesis of Mycobacterium tuberculosis is largely due to its ability to enter and survive within human macrophages. It is suggested that a specific protein, namely mammalian cell entry protein, is involved in the pathogenesis, and the specific gene for this protein, mce1A, has been identified in several pathogenic organisms such as Rickettsia, Shigella, Escherichia coli, Helicobacter, Streptomyces, Klebsiella, Vibrio, Neisseria, Rhodococcus, Nocardioides, Saccharopolyspora erythraea, and Pseudomonas. Analysis of mce1 operons in the above-mentioned organisms through bioinformatics tools has revealed the presence of unique sequences (conserved regions), suggesting that these sequences may be involved in the process of infection. In the present study, the mce1A full-length (1,365 bp) region from Mycobacterium bovis and its conserved regions (303 bp) were cloned into an expression vector, and the purified expressed proteins of molecular weight ~47 and ~11 kDa, respectively, were injected into rabbits to raise polyclonal antibodies. The purified polyclonal antibodies were checked for their ability to inhibit Mycobacterium infection in cultured human macrophages. In the macrophage invasion assay, when antibody was added at high concentration, a decrease in viable counts was observed in all cell cultures within the first 5 days after infection, whereas at low antibody concentration the intracellular bacterial CFU obtained from the infected MTB increased by the 3rd day. The macrophage invasion assay indicated that the purified antibodies against the mce1A conserved region can inhibit Mycobacterium infection. PMID:22159737

  20. Cytochrome Oxidase I (COI) sequence conservation and variation patterns in the yellowfin and longtail tunas.

    PubMed

    Kunal, Swaraj Priyaranjan; Kumar, Girish

    2013-01-01

    Tunas are a commercially important fishery worldwide. There are at least 13 species of tuna belonging to three genera, of which the genus Thunnus has the most, with eight species. On the basis of their availability, they can be characterised as oceanic, such as Thunnus albacares (yellowfin tuna), or coastal, such as Thunnus tonggol (longtail tuna). Although these two are different species, morphological differentiation can only be seen in mature individuals; hence misidentification may result in erroneous data sets, which ultimately affect conservation strategies. The mitochondrial DNA cytochrome c oxidase subunit 1 (COI) gene is one of the most popular markers for population genetic and phylogeographic studies across the animal kingdom. The present study aims to examine the sequence conservation and variation in mitochondrial Cytochrome Oxidase I (COI) between these two species of tuna. COI sequence analysis of yellowfin and longtail tuna revealed the close relationship between them within the genus Thunnus. The present study is the first direct comparison of mitochondrial COI sequences of these two tuna species. PMID:23649742

  1. Lack of evidence of conserved lentiviral sequences in pigs with post weaning multisystemic wasting syndrome.

    PubMed Central

    Bratanich, A; Lairmore, M; Heneine, W; Konoby, C; Harding, J; West, K; Vasquez, G; Allan, G; Ellis, J

    1999-01-01

    In order to investigate the role of retroviruses in the recently described porcine postweaning multisystemic wasting syndrome (PMWS), serum and leukocytes were screened for reverse transcriptase (RT) activity, and tissues were examined for the presence of conserved lentiviral sequences using degenerate primers in a polymerase chain reaction (PCR). Serum and stimulated leukocytes from the blood and lymph nodes of pigs with PMWS, as well as from control pigs, had RT activity that was detected by the sensitive Amp-RT assay. A 257-bp fragment was amplified from DNA from the blood and bone marrow of pigs with PMWS. This fragment was identical in size to conserved lentiviral sequences that were amplified from plasmids containing DNA from several lentiviruses. Cloning and sequencing of the fragment from affected pigs, however, did not reveal homology with the recognized lentiviruses. Together, the results of these analyses suggest that the RT activity present in tissues from control and affected pigs is the result of endogenous retrovirus expression, and that a lentivirus is not a primary pathogen in PMWS. PMID:10480463

  2. Inferring the evolutionary history of primate microRNA binding sites: overcoming motif counting biases.

    PubMed

    Simkin, Alfred T; Bailey, Jeffrey A; Gao, Fen-Biao; Jensen, Jeffrey D

    2014-07-01

    The first microRNAs (miRNAs) were identified as essential, conserved regulators of gene expression, targeting the same genes across nearly all bilaterians. However, there are also prominent examples of conserved miRNAs whose functions appear to have shifted dramatically, sometimes over very brief periods of evolutionary time. To determine whether the functions of conserved miRNAs are stable or dynamic over evolutionary time scales, we have here defined the neutral turnover rates of short sequence motifs in predicted primate 3'-UTRs. We find that commonly used approaches to quantify motif turnover rates, which use a presence/absence scoring in extant lineages to infer ancestral states, are inherently biased to infer the accumulation of new motifs, leading to the false inference of continually increasing regulatory complexity over time. Using a maximum likelihood approach to reconstruct individual ancestral nucleotides, we observe that binding sites of conserved miRNAs in fact have roughly equal numbers of gain and loss events relative to ancestral states and turnover extremely slowly relative to nearly identical permutations of the same motif. Contrary to case studies showing examples of functional turnover, our systematic study of miRNA binding sites suggests that in primates, the regulatory roles of conserved miRNAs are strongly conserved. Our revised methodology may be used to quantify the mechanism by which regulatory networks evolve. PMID:24723422

  3. Comparative sequence analysis suggests a conserved gating mechanism for TRP channels

    PubMed Central

    Palovcak, Eugene; Delemotte, Lucie; Klein, Michael L.

    2015-01-01

    The transient receptor potential (TRP) channel superfamily plays a central role in transducing diverse sensory stimuli in eukaryotes. Although dissimilar in sequence and domain organization, all known TRP channels act as polymodal cellular sensors and form tetrameric assemblies similar to those of their distant relatives, the voltage-gated potassium (Kv) channels. Here, we investigated the related questions of whether the allosteric mechanism underlying polymodal gating is common to all TRP channels, and how this mechanism differs from that underpinning Kv channel voltage sensitivity. To provide insight into these questions, we performed comparative sequence analysis on large, comprehensive ensembles of TRP and Kv channel sequences, contextualizing the patterns of conservation and correlation observed in the TRP channel sequences in light of the well-studied Kv channels. We report sequence features that are specific to TRP channels and, based on insight from recent TRPV1 structures, we suggest a model of TRP channel gating that differs substantially from the one mediating voltage sensitivity in Kv channels. The common mechanism underlying polymodal gating involves the displacement of a defect in the H-bond network of S6 that changes the orientation of the pore-lining residues at the hydrophobic gate. PMID:26078053

  4. Conserved Noncoding Sequences Highlight Shared Components of Regulatory Networks in Dicotyledonous Plants

    PubMed Central

    Baxter, Laura; Jironkin, Aleksey; Hickman, Richard; Moore, Jay; Barrington, Christopher; Krusche, Peter; Dyer, Nigel P.; Buchanan-Wollaston, Vicky; Tiskin, Alexander; Beynon, Jim; Denby, Katherine; Ott, Sascha

    2012-01-01

    Conserved noncoding sequences (CNSs) in DNA are reliable pointers to regulatory elements controlling gene expression. Using a comparative genomics approach with four dicotyledonous plant species (Arabidopsis thaliana, papaya [Carica papaya], poplar [Populus trichocarpa], and grape [Vitis vinifera]), we detected hundreds of CNSs upstream of Arabidopsis genes. Distinct positioning, length, and enrichment for transcription factor binding sites suggest these CNSs play a functional role in transcriptional regulation. The enrichment of transcription factors within the set of genes associated with CNS is consistent with the hypothesis that together they form part of a conserved transcriptional network whose function is to regulate other transcription factors and control development. We identified a set of promoters where regulatory mechanisms are likely to be shared between the model organism Arabidopsis and other dicots, providing areas of focus for further research. PMID:23110901

  5. Packaging of Mason-Pfizer monkey virus (MPMV) genomic RNA depends upon conserved long-range interactions (LRIs) between U5 and gag sequences.

    PubMed

    Kalloush, Rawan M; Vivet-Boudou, Valérie; Ali, Lizna M; Mustafa, Farah; Marquet, Roland; Rizvi, Tahir A

    2016-06-01

    MPMV has great potential for development as a vector for gene therapy. In this respect, precisely defining the sequences and structural motifs that are important for dimerization and packaging of its genomic RNA (gRNA) are of utmost importance. A distinguishing feature of the MPMV gRNA packaging signal is two phylogenetically conserved long-range interactions (LRIs) between U5 and gag complementary sequences, LRI-I and LRI-II. To test their biological significance in the MPMV life cycle, we introduced mutations into these structural motifs and tested their effects on MPMV gRNA packaging and propagation. Furthermore, we probed the structure of key mutants using SHAPE (selective 2'hydroxyl acylation analyzed by primer extension). Disrupting base-pairing of the LRIs affected gRNA packaging and propagation, demonstrating their significance to the MPMV life cycle. A double mutant restoring a heterologous LRI-I was fully functional, whereas a similar LRI-II mutant failed to restore gRNA packaging and propagation. These results demonstrate that while LRI-I acts at the structural level, maintaining base-pairing is not sufficient for LRI-II function. In addition, in vitro RNA dimerization assays indicated that the loss of RNA packaging in LRI mutants could not be attributed to the defects in dimerization. Our findings suggest that U5-gag LRIs play an important architectural role in maintaining the structure of the 5' region of the MPMV gRNA, expanding the crucial role of LRIs to the nonlentiviral group of retroviruses. PMID:27095024

  6. A Collection of Conserved Noncoding Sequences to Study Gene Regulation in Flowering Plants

    PubMed Central

    2016-01-01

    Transcription factors (TFs) regulate gene expression by binding cis-regulatory elements, of which the identification remains an ongoing challenge owing to the prevalence of large numbers of nonfunctional TF binding sites. Powerful comparative genomics methods, such as phylogenetic footprinting, can be used for the detection of conserved noncoding sequences (CNSs), which are functionally constrained and can greatly help in reducing the number of false-positive elements. In this study, we applied a phylogenetic footprinting approach for the identification of CNSs in 10 dicot plants, yielding 1,032,291 CNSs associated with 243,187 genes. To annotate CNSs with TF binding sites, we made use of binding site information for 642 TFs originating from 35 TF families in Arabidopsis (Arabidopsis thaliana). In three species, the identified CNSs were evaluated using TF chromatin immunoprecipitation sequencing data, resulting in significant overlap for the majority of data sets. To identify ultraconserved CNSs, we included genomes of additional plant families and identified 715 binding sites for 501 genes conserved in dicots, monocots, mosses, and green algae. Additionally, we found that genes that are part of conserved mini-regulons have a higher coherence in their expression profile than other divergent gene pairs. All identified CNSs were integrated in the PLAZA 3.0 Dicots comparative genomics platform (http://bioinformatics.psb.ugent.be/plaza/versions/plaza_v3_dicots/) together with new functionalities facilitating the exploration of conserved cis-regulatory elements and their associated genes. The availability of this data set in a user-friendly platform enables the exploration of functional noncoding DNA to study gene regulation in a variety of plant species, including crops. PMID:27261064

  7. A Collection of Conserved Noncoding Sequences to Study Gene Regulation in Flowering Plants.

    PubMed

    Van de Velde, Jan; Van Bel, Michiel; Vaneechoutte, Dries; Vandepoele, Klaas

    2016-08-01

    Transcription factors (TFs) regulate gene expression by binding cis-regulatory elements, of which the identification remains an ongoing challenge owing to the prevalence of large numbers of nonfunctional TF binding sites. Powerful comparative genomics methods, such as phylogenetic footprinting, can be used for the detection of conserved noncoding sequences (CNSs), which are functionally constrained and can greatly help in reducing the number of false-positive elements. In this study, we applied a phylogenetic footprinting approach for the identification of CNSs in 10 dicot plants, yielding 1,032,291 CNSs associated with 243,187 genes. To annotate CNSs with TF binding sites, we made use of binding site information for 642 TFs originating from 35 TF families in Arabidopsis (Arabidopsis thaliana). In three species, the identified CNSs were evaluated using TF chromatin immunoprecipitation sequencing data, resulting in significant overlap for the majority of data sets. To identify ultraconserved CNSs, we included genomes of additional plant families and identified 715 binding sites for 501 genes conserved in dicots, monocots, mosses, and green algae. Additionally, we found that genes that are part of conserved mini-regulons have a higher coherence in their expression profile than other divergent gene pairs. All identified CNSs were integrated in the PLAZA 3.0 Dicots comparative genomics platform (http://bioinformatics.psb.ugent.be/plaza/versions/plaza_v3_dicots/) together with new functionalities facilitating the exploration of conserved cis-regulatory elements and their associated genes. The availability of this data set in a user-friendly platform enables the exploration of functional noncoding DNA to study gene regulation in a variety of plant species, including crops. PMID:27261064

  8. Substitution of a conserved cysteine-996 in a cysteine-rich motif of the laminin α2-chain in congenital muscular dystrophy with partial deficiency of the protein

    SciTech Connect

    Nissinen, M.; Xu Zhang; Tryggvason, K.

    1996-06-01

    Congenital muscular dystrophies (CMDs) are autosomal recessive muscle disorders of early onset. Approximately half of CMD patients present laminin α2-chain (merosin) deficiency in muscle biopsies, and the disease locus has been mapped to the region of the LAMA2 gene (6q22-23) in several families. Recently, two nonsense mutations in the laminin α2-chain gene were identified in CMD patients exhibiting complete deficiency of the laminin α2-chain in muscle biopsies. However, a subset of CMD patients with linkage to LAMA2 show only partial absence of the laminin α2-chain around muscle fibers, by immunocytochemical analysis. In the present study we have identified a homozygous missense mutation in the α2-chain gene of a consanguineous Turkish family with partial laminin α2-chain deficiency. The T→C transition at position 3035 in the cDNA sequence results in a Cys996→Arg substitution. The mutation that affects one of the conserved cysteine-rich repeats in the short arm of the laminin α2-chain should result in normal synthesis of the chain and in formation and secretion of a heterotrimeric laminin molecule. Muscular dysfunction is possibly caused either by abnormal disulfide cross-links and folding of the laminin repeat, leading to the disturbance of an as yet unknown binding function of the laminin α2-chain and to shorter half-life of the muscle-specific laminin-2 and laminin-4 isoforms, or by increased proteolytic sensitivity, leading to truncation of the short arm. 42 refs., 7 figs.

  9. DNA motifs determining the accuracy of repeat duplication during CRISPR adaptation in Haloarcula hispanica.

    PubMed

    Wang, Rui; Li, Ming; Gong, Luyao; Hu, Songnian; Xiang, Hua

    2016-05-19

    Clustered Regularly Interspaced Short Palindromic Repeats (CRISPRs) acquire new spacers to generate adaptive immunity in prokaryotes. During spacer integration, the leader-preceded repeat is always accurately duplicated, leading to speculations of a repeat-length ruler. Here in Haloarcula hispanica, we demonstrate that the accurate duplication of its 30-bp repeat requires two conserved mid-repeat motifs, AACCC and GTGGG. The AACCC motif was essential and needed to be ∼10 bp downstream from the leader-repeat junction site, where duplication consistently started. Interestingly, repeat duplication terminated sequence-independently and usually with a specific distance from the GTGGG motif, which seemingly served as an anchor site for a molecular ruler. Accordingly, altering the spacing between the two motifs led to an aberrant duplication size (29, 31, 32 or 33 bp). We propose the adaptation complex may recognize these mid-repeat elements to enable measuring the repeat DNA for spacer integration. PMID:27085805
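    The motif-spacing idea in this abstract can be illustrated with a minimal sketch: locate the two mid-repeat motifs (AACCC and GTGGG) in a repeat and measure the gap between them. The 30-nt sequence below is a made-up placeholder, not the actual Haloarcula hispanica repeat, and the function names are hypothetical.

```python
def motif_positions(repeat, motifs=("AACCC", "GTGGG")):
    """Return the 0-based start index of each motif in the repeat (-1 if absent)."""
    return {m: repeat.find(m) for m in motifs}


def motif_spacing(repeat):
    """Number of bases between the end of AACCC and the start of GTGGG.

    Returns None when either motif is missing from the repeat.
    """
    pos = motif_positions(repeat)
    if -1 in pos.values():
        return None
    return pos["GTGGG"] - (pos["AACCC"] + len("AACCC"))


# Hypothetical 30-nt repeat used only for illustration (assumption, not real data).
example_repeat = "GTTTCAGACGAACCCTTGTGGGATTGAAGC"

if __name__ == "__main__":
    print(motif_positions(example_repeat))
    print(motif_spacing(example_repeat))
```

    Altering the number of bases between the two motifs in such a sequence changes the value returned by `motif_spacing`, which is the quantity the authors varied experimentally.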

  10. DNA motifs determining the accuracy of repeat duplication during CRISPR adaptation in Haloarcula hispanica

    PubMed Central

    Wang, Rui; Li, Ming; Gong, Luyao; Hu, Songnian; Xiang, Hua

    2016-01-01

    Clustered Regularly Interspaced Short Palindromic Repeats (CRISPRs) acquire new spacers to generate adaptive immunity in prokaryotes. During spacer integration, the leader-preceded repeat is always accurately duplicated, leading to speculations of a repeat-length ruler. Here in Haloarcula hispanica, we demonstrate that the accurate duplication of its 30-bp repeat requires two conserved mid-repeat motifs, AACCC and GTGGG. The AACCC motif was essential and needed to be ∼10 bp downstream from the leader-repeat junction site, where duplication consistently started. Interestingly, repeat duplication terminated sequence-independently and usually with a specific distance from the GTGGG motif, which seemingly served as an anchor site for a molecular ruler. Accordingly, altering the spacing between the two motifs led to an aberrant duplication size (29, 31, 32 or 33 bp). We propose the adaptation complex may recognize these mid-repeat elements to enable measuring the repeat DNA for spacer integration. PMID:27085805

  11. MEME Suite: tools for motif discovery and searching

    PubMed Central

    Bailey, Timothy L.; Boden, Mikael; Buske, Fabian A.; Frith, Martin; Grant, Charles E.; Clementi, Luca; Ren, Jingyuan; Li, Wilfred W.; Noble, William S.

    2009-01-01

    The MEME Suite web server provides a unified portal for online discovery and analysis of sequence motifs representing features such as DNA binding sites and protein interaction domains. The popular MEME motif discovery algorithm is now complemented by the GLAM2 algorithm which allows discovery of motifs containing gaps. Three sequence scanning algorithms—MAST, FIMO and GLAM2SCAN—allow scanning numerous DNA and protein sequence databases for motifs discovered by MEME and GLAM2. Transcription factor motifs (including those discovered using MEME) can be compared with motifs in many popular motif databases using the motif database scanning algorithm Tomtom. Transcription factor motifs can be further analyzed for putative function by association with Gene Ontology (GO) terms using the motif-GO term association tool GOMO. MEME output now contains sequence LOGOS for each discovered motif, as well as buttons to allow motifs to be conveniently submitted to the sequence and motif database scanning algorithms (MAST, FIMO and Tomtom), or to GOMO, for further analysis. GLAM2 output similarly contains buttons for further analysis using GLAM2SCAN and for rerunning GLAM2 with different parameters. All of the motif-based tools are now implemented as web services via Opal. Source code, binaries and a web server are freely available for noncommercial use at http://meme.nbcr.net. PMID:19458158

  12. New insights into SRY regulation through identification of 5' conserved sequences

    PubMed Central

    Ross, Diana GF; Bowles, Josephine; Koopman, Peter; Lehnert, Sigrid

    2008-01-01

    Background SRY is the pivotal gene initiating male sex determination in most mammals, but how its expression is regulated is still not understood. In this study we derived novel SRY 5' flanking genomic sequence data from bovine and caprine genomic BAC clones. Results We identified four intervals of high homology upstream of SRY by comparison of human, bovine, pig, goat and mouse genomic sequences. These conserved regions contain putative binding sites for a large number of known transcription factor families, including several that have been implicated previously in sex determination and early gonadal development. Conclusion Our results reveal potentially important SRY regulatory elements, mutations in which might underlie cases of idiopathic human XY sex reversal. PMID:18851760

  13. Vertebrate paralogous conserved noncoding sequences may be related to gene expressions in brain.

    PubMed

    Matsunami, Masatoshi; Saitou, Naruya

    2013-01-01

    Vertebrate genomes include gene regulatory elements in protein-noncoding regions. A part of gene regulatory elements are expected to be conserved according to their functional importance, so that evolutionarily conserved noncoding sequences (CNSs) might be good candidates for those elements. In addition, paralogous CNSs, which are highly conserved among both orthologous loci and paralogous loci, have the possibility of controlling overlapping expression patterns of their adjacent paralogous protein-coding genes. The two-round whole-genome duplications (2R WGDs), which most probably occurred in the vertebrate common ancestors, generated large numbers of paralogous protein-coding genes and their regulatory elements. These events could contribute to the emergence of vertebrate features. However, the evolutionary history and influences of the 2R WGDs are still unclear, especially in noncoding regions. To address this issue, we identified paralogous CNSs. Region-focused Basic Local Alignment Search Tool (BLAST) search of each synteny block revealed 7,924 orthologous CNSs and 309 paralogous CNSs conserved among eight high-quality vertebrate genomes. Paralogous CNSs we found contained 115 previously reported ones and newly detected 194 ones. Through comparisons with VISTA Enhancer Browser and available ChIP-seq data, one-third (103) of paralogous CNSs detected in this study showed gene regulatory activity in the brain at several developmental stages. Their genomic locations are highly enriched near the transcription factor-coding regions, which are expressed in brain and neural systems. These results suggest that paralogous CNSs are conserved mainly because of maintaining gene expression in the vertebrate brain. PMID:23267051

  14. The Chinese hamster Alu-equivalent sequence: a conserved highly repetitious, interspersed deoxyribonucleic acid sequence in mammals has a structure suggestive of a transposable element.

    PubMed Central

    Haynes, S R; Toomey, T P; Leinwand, L; Jelinek, W R

    1981-01-01

    A consensus sequence has been determined for a major interspersed deoxyribonucleic acid repeat in the genome of Chinese hamster ovary cells (CHO cells). This sequence is extensively homologous to (i) the human Alu sequence (P. L. Deininger et al., J. Mol. Biol., in press), (ii) the mouse B1 interspersed repetitious sequence (Krayev et al., Nucleic Acids Res. 8:1201-1215, 1980), (iii) an interspersed repetitious sequence from African green monkey deoxyribonucleic acid (Dhruva et al., Proc. Natl. Acad. Sci. U.S.A. 77:4514-4518, 1980), and (iv) the CHO and mouse 4.5S ribonucleic acid (this report; F. Harada and N. Kato, Nucleic Acids Res. 8:1273-1285, 1980). Because the CHO consensus sequence shows significant homology to the human Alu sequence, it is termed the CHO Alu-equivalent sequence. A conserved structure surrounding CHO Alu-equivalent family members can be recognized. It is similar to that surrounding the human Alu and the mouse B1 sequences, and is represented as follows: direct repeat-CHO-Alu-A-rich sequence-direct repeat. A composite interspersed repetitious sequence has been identified. Its structure is represented as follows: direct repeat-residue 47 to 107 of CHO-Alu-non-Alu repetitious sequence-A-rich sequence-direct repeat. Because the Alu flanking sequences resemble those that flank known transposable elements, we think it likely that the Alu sequence dispersed throughout the mammalian genome by transposition. PMID:9279371

  15. Sequence of Radiotherapy and Chemotherapy in Breast Cancer After Breast-Conserving Surgery

    SciTech Connect

    Jobsen, Jan J.; Palen, Job van der; Brinkhuis, Marieel; Ong, Francisca; Struikmans, Henk

    2012-04-01

    Purpose: The optimal sequence of radiotherapy and chemotherapy in breast-conserving therapy is unknown. Methods and Materials: From 1983 through 2007, a total of 641 patients with 653 instances of breast-conserving therapy (BCT), received both chemotherapy and radiotherapy and are the basis of this analysis. Patients were divided into three groups. Groups A and B comprised patients treated before 2005, Group A radiotherapy first and Group B chemotherapy first. Group C consisted of patients treated from 2005 onward, when we had a fixed sequence of radiotherapy first, followed by chemotherapy. Results: Local control did not show any differences among the three groups. For distant metastasis, no difference was shown between Groups A and B. Group C, when compared with Group A, showed, on univariate and multivariate analyses, a significantly better distant metastasis-free survival. The same was noted for disease-free survival. With respect to disease-specific survival, no differences were shown on multivariate analysis among the three groups. Conclusion: Radiotherapy, as an integral part of the primary treatment of BCT, should be administered first, followed by adjuvant chemotherapy.

  16. Conserved Noncoding Sequences Regulate lhx5 Expression in the Zebrafish Forebrain

    PubMed Central

    Sun, Liu; Chen, Fengjiao; Peng, Gang

    2015-01-01

    The LIM homeobox family protein Lhx5 plays important roles in forebrain development in the vertebrates. The lhx5 gene exhibits complex temporal and spatial expression patterns during early development but its transcriptional regulation mechanisms are not well understood. Here, we have used transgenesis in zebrafish in order to define regulatory elements that drive lhx5 expression in the forebrain. Through comparative genomic analysis we identified 10 non-coding sequences conserved in five teleost species. We next examined the enhancer activities of these conserved non-coding sequences with Tol2 transposon mediated transgenesis. We found a proximately located enhancer gave rise to robust reporter EGFP expression in the forebrain regions. In addition, we identified an enhancer located at approximately 50 kb upstream of lhx5 coding region that is responsible for reporter gene expression in the hypothalamus. We also identify an enhancer located approximately 40 kb upstream of the lhx5 coding region that is required for expression in the prethalamus (ventral thalamus). Together our results suggest discrete enhancer elements control lhx5 expression in different regions of the forebrain. PMID:26147098

  17. Genomes of sequence type 121 Listeria monocytogenes strains harbor highly conserved plasmids and prophages

    PubMed Central

    Schmitz-Esser, Stephan; Müller, Anneliese; Stessl, Beatrix; Wagner, Martin

    2015-01-01

    The food-borne pathogen Listeria (L.) monocytogenes is often found in food production environments. Thus, controlling the occurrence of L. monocytogenes in food production is a great challenge for food safety. Among a great diversity of L. monocytogenes strains from food production, particularly strains belonging to sequence type (ST)121 are prevalent. The molecular reasons for the abundance of ST121 strains are however currently unknown. We therefore determined the genome sequences of three L. monocytogenes ST121 strains: 6179 and 4423, which persisted for up to 8 years in food production plants in Ireland and Austria, and of the strain 3253 and compared them with available L. monocytogenes ST121 genomes. Our results show that the ST121 genomes are highly similar to each other and show a tremendously high degree of conservation among some of their prophages and particularly among their plasmids. This remarkably high level of conservation among prophages and plasmids suggests that strong selective pressure is acting on them. We thus hypothesize that plasmids and prophages are providing important adaptations for survival in food production environments. In addition, the ST121 genomes share common adaptations which might be related to their persistence in food production environments such as the presence of Tn6188, a transposon responsible for increased tolerance against quaternary ammonium compounds, a yet undescribed insertion harboring recombination hotspot (RHS) repeat proteins, which are most likely involved in competition against other bacteria, and presence of homologs of the L. innocua genes lin0464 and lin0465. PMID:25972859

  18. Conservation.

    ERIC Educational Resources Information Center

    National Audubon Society, New York, NY.

    This set of teaching aids consists of seven Audubon Nature Bulletins, providing the teacher and student with informational reading on various topics in conservation. The bulletins have these titles: Plants as Makers of Soil, Water Pollution Control, The Ground Water Table, Conservation--To Keep This Earth Habitable, Our Threatened Air Supply,…

  19. Evolutionary conservation of sequence and secondary structures in CRISPR repeats

    SciTech Connect

    Kunin, Victor; Sorek, Rotem; Hugenholtz, Philip

    2006-09-01

    Clustered Regularly Interspaced Palindromic Repeats (CRISPRs) are a novel class of direct repeats, separated by unique spacer sequences of similar length, that are present in ~40% of bacterial and all archaeal genomes analyzed to date. More than 40 gene families, called CRISPR-associated sequences (CAS), appear in conjunction with these repeats and are thought to be involved in the propagation and functioning of CRISPRs. It has been proposed that the CRISPR/CAS system samples, maintains a record of, and inactivates invasive DNA that the cell has encountered, and therefore constitutes a prokaryotic analog of an immune system. Here we analyze CRISPR repeats identified in 195 microbial genomes and show that they can be organized into multiple clusters based on sequence similarity. All individual repeats in any given cluster were inferred to form characteristic RNA secondary structure, ranging from non-existent to pronounced. Stable secondary structures included G:U base pairs and exhibited multiple compensatory base changes in the stem region, indicating evolutionary conservation and functional importance. We also show that the repeat-based classification corresponds to, and expands upon, a previously reported CAS gene-based classification including specific relationships between CRISPR and CAS subtypes.

  20. Amino acid binding by the class I aminoacyl-tRNA synthetases: role for a conserved proline in the signature sequence.

    PubMed Central

    Burbaum, J. J.; Schimmel, P.

    1992-01-01

    Although partial or complete three-dimensional structures are known for three Class I aminoacyl-tRNA synthetases, the amino acid-binding sites in these proteins remain poorly characterized. To explore the methionine binding site of Escherichia coli methionyl-tRNA synthetase, we chose to study a specific, randomly generated methionine auxotroph that contains a mutant methionyl-tRNA synthetase whose defect is manifested in an elevated Km for methionine (Barker, D.G., Ebel, J.-P., Jakes, R.C., & Bruton, C.J., 1982, Eur. J. Biochem. 127, 449-457), and employed the polymerase chain reaction to sequence this mutant synthetase directly. We identified a Pro 14 to Ser replacement (P14S), which accounts for a greater than 300-fold elevation in Km for methionine and has little effect on either the Km for ATP or the kcat of the amino acid activation reaction. This mutation destabilizes the protein in vivo, which may partly account for the observed auxotrophy. The altered proline is found in the "signature sequence" of the Class I synthetases and is conserved. This sequence motif is 1 of 2 found in the 10 Class I aminoacyl-tRNA synthetases and, in the known structures, it is in the nucleotide-binding fold as part of a loop between the end of a beta-strand and the start of an alpha-helix. The phenotype of the mutant and the stability and affinity for methionine of the wild-type and mutant enzymes are influenced by the amino acid that is 25 residues beyond the C-terminus of the signature sequence. (ABSTRACT TRUNCATED AT 250 WORDS) PMID:1304356

  1. Signature motifs of GDP polyribonucleotidyltransferase, a non-segmented negative strand RNA viral mRNA capping enzyme, domain in the L protein are required for covalent enzyme–pRNA intermediate formation

    PubMed Central

    Neubauer, Julie; Ogino, Minako; Green, Todd J.; Ogino, Tomoaki

    2016-01-01

    The unconventional mRNA capping enzyme (GDP polyribonucleotidyltransferase, PRNTase; block V) domain in RNA polymerase L proteins of non-segmented negative strand (NNS) RNA viruses (e.g. rabies, measles, Ebola) contains five collinear sequence elements, Rx(3)Wx(3–8)ΦxGxζx(P/A) (motif A; Φ, hydrophobic; ζ, hydrophilic), (Y/W)ΦGSxT (motif B), W (motif C), HR (motif D) and ζxxΦx(F/Y)QxxΦ (motif E). We performed site-directed mutagenesis of the L protein of vesicular stomatitis virus (VSV, a prototypic NNS RNA virus) to examine participation of these motifs in mRNA capping. Similar to the catalytic residues in motif D, G1100 in motif A, T1157 in motif B, W1188 in motif C, and F1269 and Q1270 in motif E were found to be essential or important for the PRNTase activity in the step of the covalent L-pRNA intermediate formation, but not for the GTPase activity that generates GDP (pRNA acceptor). Cap defective mutations in these residues induced termination of mRNA synthesis at position +40 followed by aberrant stop–start transcription, and abolished virus gene expression in host cells. These results suggest that the conserved motifs constitute the active site of the PRNTase domain and the L-pRNA intermediate formation followed by the cap formation is essential for successful synthesis of full-length mRNAs. PMID:26602696
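    The motif notation in this abstract maps naturally onto regular expressions. The following is a minimal sketch for motifs A and B only; the residue sets chosen for Φ (hydrophobic) and ζ (hydrophilic) are assumptions made for illustration, since the abstract does not list them, and the test peptides are invented rather than taken from any L protein.

```python
import re

# Assumed residue classes (not specified in the abstract).
HYDROPHOBIC = "AVLIMFWYC"   # stands in for Φ
HYDROPHILIC = "STNQDEKRH"   # stands in for ζ

# Motif A: Rx(3)Wx(3-8)ΦxGxζx(P/A)
MOTIF_A = re.compile(
    rf"R.{{3}}W.{{3,8}}[{HYDROPHOBIC}].G.[{HYDROPHILIC}].[PA]"
)

# Motif B: (Y/W)ΦGSxT
MOTIF_B = re.compile(rf"[YW][{HYDROPHOBIC}]GS.T")


def find_motif(pattern, protein):
    """Return (start, matched_text) for the first match, or None."""
    m = pattern.search(protein)
    return (m.start(), m.group()) if m else None


if __name__ == "__main__":
    # Invented peptides used only to exercise the patterns.
    print(find_motif(MOTIF_A, "MKRAAAWAAAALAGASAPQ"))
    print(find_motif(MOTIF_B, "QQYLGSKTQQ"))
```

    A different choice of residue classes would change which sequences match, so the sets above should be replaced with whatever definition the analysis at hand requires.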

  2. Signature motifs of GDP polyribonucleotidyltransferase, a non-segmented negative strand RNA viral mRNA capping enzyme, domain in the L protein are required for covalent enzyme-pRNA intermediate formation.

    PubMed

    Neubauer, Julie; Ogino, Minako; Green, Todd J; Ogino, Tomoaki

    2016-01-01

    The unconventional mRNA capping enzyme (GDP polyribonucleotidyltransferase, PRNTase; block V) domain in RNA polymerase L proteins of non-segmented negative strand (NNS) RNA viruses (e.g. rabies, measles, Ebola) contains five collinear sequence elements, Rx(3)Wx(3-8)ΦxGxζx(P/A) (motif A; Φ, hydrophobic; ζ, hydrophilic), (Y/W)ΦGSxT (motif B), W (motif C), HR (motif D) and ζxxΦx(F/Y)QxxΦ (motif E). We performed site-directed mutagenesis of the L protein of vesicular stomatitis virus (VSV, a prototypic NNS RNA virus) to examine participation of these motifs in mRNA capping. Similar to the catalytic residues in motif D, G1100 in motif A, T1157 in motif B, W1188 in motif C, and F1269 and Q1270 in motif E were found to be essential or important for the PRNTase activity in the step of the covalent L-pRNA intermediate formation, but not for the GTPase activity that generates GDP (pRNA acceptor). Cap defective mutations in these residues induced termination of mRNA synthesis at position +40 followed by aberrant stop-start transcription, and abolished virus gene expression in host cells. These results suggest that the conserved motifs constitute the active site of the PRNTase domain and the L-pRNA intermediate formation followed by the cap formation is essential for successful synthesis of full-length mRNAs. PMID:26602696
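    The collinear sequence elements quoted in the abstract above are written in a pattern shorthand that translates fairly directly into regular expressions. The sketch below is illustrative only: the residue sets chosen for Φ (hydrophobic) and ζ (hydrophilic) are assumptions, since the abstract does not list them, and the function name `find_motif` is hypothetical. Motifs C (W) and D (HR) are omitted because such short patterns match almost anywhere without positional context.

```python
import re

# Assumed residue classes; the abstract does not define these sets.
HYDROPHOBIC = "AVILMFWY"   # stands in for Φ
HYDROPHILIC = "DEKRHNQST"  # stands in for ζ

# Regex translations of the shorthand used in the abstract.
MOTIFS = {
    # Rx(3)Wx(3-8)ΦxGxζx(P/A)
    "A": rf"R.{{3}}W.{{3,8}}[{HYDROPHOBIC}].G.[{HYDROPHILIC}].[PA]",
    # (Y/W)ΦGSxT
    "B": rf"[YW][{HYDROPHOBIC}]GS.T",
    # ζxxΦx(F/Y)QxxΦ
    "E": rf"[{HYDROPHILIC}]..[{HYDROPHOBIC}].[FY]Q..[{HYDROPHOBIC}]",
}


def find_motif(name, sequence):
    """Return (start, matched_text) for each non-overlapping match of a motif."""
    pattern = re.compile(MOTIFS[name])
    return [(m.start(), m.group()) for m in pattern.finditer(sequence)]


# Synthetic toy sequences, not taken from any viral L protein.
print(find_motif("B", "MKYLGSATQ"))
print(find_motif("A", "MKRAAAWAAAALAGASAP"))
```

    A regex match only shows that a stretch of sequence is compatible with the pattern; the mutagenesis described in the abstract is what establishes that the residues matter.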

  3. The Calmodulin-Binding, Short Linear Motif, NSCaTE Is Conserved in L-Type Channel Ancestors of Vertebrate Cav1.2 and Cav1.3 Channels

    PubMed Central

    Taiakina, Valentina; Boone, Adrienne N.; Fux, Julia; Senatore, Adriano; Weber-Adrian, Danielle

    2013-01-01

    NSCaTE is a short linear motif of (xWxxx(I or L)xxxx), composed of residues with a high helix-forming propensity within a mostly disordered N-terminus that is conserved in L-type calcium channels from protostome invertebrates to humans. NSCaTE is an optional, lower affinity and calcium-sensitive binding site for calmodulin (CaM) which competes for CaM binding with a more ancient, C-terminal IQ domain on L-type channels. CaM bound to N- and C-terminal tails serves as a dual detector of changing intracellular Ca2+ concentrations, promoting calcium-dependent inactivation of L-type calcium channels. NSCaTE is absent in some arthropod species, and is also lacking in vertebrate L-type isoforms, Cav1.1 and Cav1.4 channels. The pervasiveness of a methionine just downstream from NSCaTE suggests that L-type channels could generate alternative N-termini lacking NSCaTE through the choice of translational start sites. A long N-terminus with an NSCaTE motif in the L-type calcium channel homolog LCav1 from the pond snail Lymnaea stagnalis has a faster calcium-dependent inactivation than a shortened N-terminus lacking NSCaTE. NSCaTE effects are present in low concentrations of internal buffer (0.5 mM EGTA), but disappear in high buffer conditions (10 mM EGTA). Snail and mammalian NSCaTE have an alpha-helical propensity upon binding Ca2+-CaM and can saturate both CaM N-terminal and C-terminal domains in the absence of a competing IQ motif. NSCaTE evolved in ancestors of the first animals with internal organs for promoting a more rapid, calcium-sensitive inactivation of L-type channels. PMID:23626724

  4. The calmodulin-binding, short linear motif, NSCaTE is conserved in L-type channel ancestors of vertebrate Cav1.2 and Cav1.3 channels.

    PubMed

    Taiakina, Valentina; Boone, Adrienne N; Fux, Julia; Senatore, Adriano; Weber-Adrian, Danielle; Guillemette, J Guy; Spafford, J David

    2013-01-01

    NSCaTE is a short linear motif of (xWxxx(I or L)xxxx), composed of residues with a high helix-forming propensity within a mostly disordered N-terminus that is conserved in L-type calcium channels from protostome invertebrates to humans. NSCaTE is an optional, lower affinity and calcium-sensitive binding site for calmodulin (CaM) which competes for CaM binding with a more ancient, C-terminal IQ domain on L-type channels. CaM bound to N- and C-terminal tails serves as a dual detector of changing intracellular Ca(2+) concentrations, promoting calcium-dependent inactivation of L-type calcium channels. NSCaTE is absent in some arthropod species, and is also lacking in vertebrate L-type isoforms, Cav1.1 and Cav1.4 channels. The pervasiveness of a methionine just downstream from NSCaTE suggests that L-type channels could generate alternative N-termini lacking NSCaTE through the choice of translational start sites. A long N-terminus with an NSCaTE motif in the L-type calcium channel homolog LCav1 from the pond snail Lymnaea stagnalis has a faster calcium-dependent inactivation than a shortened N-terminus lacking NSCaTE. NSCaTE effects are present in low concentrations of internal buffer (0.5 mM EGTA), but disappear in high buffer conditions (10 mM EGTA). Snail and mammalian NSCaTE have an alpha-helical propensity upon binding Ca(2+)-CaM and can saturate both CaM N-terminal and C-terminal domains in the absence of a competing IQ motif. NSCaTE evolved in ancestors of the first animals with internal organs for promoting a more rapid, calcium-sensitive inactivation of L-type channels. PMID:23626724

  5. Analysis of interactions between ribosomal proteins and RNA structural motifs

    PubMed Central

    2010-01-01

    Background One important goal of structural bioinformatics is to recognize and predict the interactions between protein binding sites and RNA. Recently, a comprehensive analysis of ribosomal proteins and their interactions with rRNA has been done. Interesting results emerged from the comparison of r-proteins within the small subunit in T. thermophilus and E. coli, supporting the idea of a core made by both RNA and proteins, conserved by evolution. Recent work showed also that ribosomal RNA is modularly composed. Motifs are generally single-stranded sequences of consecutive nucleotides (ssRNA) with characteristic folding. The role of these motifs in protein-RNA interactions has been so far only sparsely investigated. Results This work explores the role of RNA structural motifs in the interaction of proteins with ribosomal RNA (rRNA). We analyze composition, local geometries and conformation of interface regions involving motifs such as tetraloops, kink turns and single extruded nucleotides. We construct an interaction map of protein binding sites that allows us to identify the common types of shared 3-D physicochemical binding patterns for tetraloops. Furthermore, we investigate the protein binding pockets that accommodate single extruded nucleotides either involved in kink-turns or in arbitrary RNA strands. This analysis reveals a new structural motif, called tripod. It corresponds to small pockets consisting of three aminoacids arranged at the vertices of an almost equilateral triangle. We developed a search procedure for the recognition of tripods, based on an empirical tripod fingerprint. Conclusion A comparative analysis with the overall RNA surface and interfaces shows that contact surfaces involving RNA motifs have distinctive features that may be useful for the recognition and prediction of interactions. PMID:20122215

  6. Regulation of SHOOT MERISTEMLESS genes via an upstream-conserved noncoding sequence coordinates leaf development

    PubMed Central

    Uchida, Naoyuki; Townsley, Brad; Chung, Kook-Hyun; Sinha, Neelima

    2007-01-01

    The indeterminate shoot apical meristem of plants is characterized by the expression of the Class 1 KNOTTED1-LIKE HOMEOBOX (KNOX1) genes. KNOX1 genes have been implicated in the acquisition and/or maintenance of meristematic fate. One of the earliest indicators of a switch in fate from indeterminate meristem to determinate leaf primordium is the down-regulation of KNOX1 genes orthologous to SHOOT MERISTEMLESS (STM) in Arabidopsis (hereafter called STM genes) in the initiating primordia. In simple leafed plants, this down-regulation persists during leaf formation. In compound leafed plants, however, KNOX1 gene expression is reestablished later in the developing primordia, creating an indeterminate environment for leaflet formation. Despite this knowledge, most aspects of how STM gene expression is regulated remain largely unknown. Here, we identify two evolutionarily conserved noncoding sequences within the 5′ upstream region of STM genes in both simple and compound leafed species across monocots and dicots. We show that one of these elements is involved in the regulation of the persistent repression and/or the reestablishment of STM expression in the developing leaves but is not involved in the initial down-regulation in the initiating primordia. We also show evidence that this regulation is developmentally significant for leaf formation in the pathway involving ASYMMETRIC LEAVES1/2 (AS1/2) gene expression; these genes are known to function in leaf development. Together, these findings reveal a regulatory point of leaf development mediated through a conserved, noncoding sequence in STM genes. PMID:17898165

  7. Conservation of plasmid DNA sequences in coronatine-producing pathovars of Pseudomonas syringae

    SciTech Connect

    Bender, C.L.; Young, S.A.; Mitchell, R.E.

    1991-04-01

    In Pseudomonas syringae pv. tomato PT23.2, plasmid pPT23A (101 kb) is involved in synthesis of the phytotoxin coronatine. The physical characterization of mutations that abolished coronatine production indicated that at least 30 kb of pPT23A DNA are required for toxin synthesis. In the present study, ³²P-labeled DNA fragments from the 30-kb region of pPT23A hybridized to plasmid DNAs from several coronatine-producing pathovars of P. syringae under conditions of high stringency. These experiments indicated that this region of pPT23A was strongly conserved in large plasmids (90 to 105 kb) that reside in P. syringae pv. atropurpurea, glycinea, and morsprunorum. The functional significance of the observed homology was demonstrated in marker-exchange experiments in which Tn5-inactivated sequences from the 30-kb region of pPT23A were used to mutate coronatine synthesis genes in the three heterologous pathovars. Physical characterization of the Tn5 insertions generated by marker exchange indicated that genes controlling coronatine synthesis in P. syringae pv. atropurpurea 1304, glycinea 4180, and morsprunorum 567 and 3714 were located on the large indigenous plasmids where homology was originally detected. Therefore, coronatine biosynthesis genes are strongly conserved in the plasmid DNAs of four producing pathovars, despite their disparate origins (California, Japan, New Zealand, Great Britain, and Italy).

  8. Stochastic motif extraction using hidden Markov model

    SciTech Connect

    Fujiwara, Yukiko; Asogawa, Minoru; Konagaya, Akihiko

    1994-12-31

    In this paper, we study the application of an HMM (hidden Markov model) to the problem of representing protein sequences by a stochastic motif. A stochastic protein motif represents the small segments of protein sequences that have a certain function or structure. The stochastic motif, represented by an HMM, has conditional probabilities to deal with the stochastic nature of the motif. This HMM directly reflects the characteristics of the motif, such as a protein periodical structure or grouping. In order to obtain the optimal HMM, we developed the "iterative duplication method" for HMM topology learning. It starts from a small fully-connected network and iterates the network generation and parameter optimization until it achieves sufficient discrimination accuracy. Using this method, we obtained an HMM for a leucine zipper motif. Compared to the accuracy of a symbolic pattern representation with accuracy of 14.8 percent, an HMM achieved 79.3 percent in prediction. Additionally, the method can obtain an HMM for various types of zinc finger motifs, and it might separate the mixed data. We demonstrated that this approach is applicable to the validation of the protein databases; a constructed HMM has indicated that one protein sequence annotated as "leucine-zipper like sequence" in the database is quite different from other leucine-zipper sequences in terms of likelihood, and we found this discrimination is plausible.
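    The abstract above compares sequences "in terms of likelihood" under a motif HMM. As a minimal sketch of what such a likelihood is, the standard forward algorithm below sums over all hidden-state paths for one sequence. It does not reproduce the authors' iterative duplication method or their trained model; the two-state toy parameters and the function name `forward_likelihood` are assumptions made for illustration.

```python
def forward_likelihood(sequence, start, transitions, emissions):
    """Probability of `sequence` under an HMM, summed over all state paths.

    start[s]          -- probability of beginning in state s
    transitions[p][s] -- probability of moving from state p to state s
    emissions[s]      -- dict mapping symbol -> emission probability in state s
    """
    if not sequence:
        return 1.0
    states = range(len(start))
    # Initialise with the first symbol.
    alpha = [start[s] * emissions[s].get(sequence[0], 0.0) for s in states]
    # Propagate one symbol at a time.
    for symbol in sequence[1:]:
        alpha = [
            sum(alpha[p] * transitions[p][s] for p in states)
            * emissions[s].get(symbol, 0.0)
            for s in states
        ]
    return sum(alpha)


# Toy two-state model (hypothetical values, not a fitted leucine zipper HMM).
start = [0.5, 0.5]
transitions = [[0.5, 0.5], [0.5, 0.5]]
emissions = [{"L": 0.8, "A": 0.2}, {"L": 0.2, "A": 0.8}]

print(forward_likelihood("LA", start, transitions, emissions))
```

    For realistic sequence lengths the products underflow, so practical implementations work with log probabilities or rescale `alpha` at every step.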

  9. Mammalian mitochondrial D-loop region structural analysis: identification of new conserved sequences and their functional and evolutionary implications.

    PubMed

    Sbisà, E; Tanzariello, F; Reyes, A; Pesole, G; Saccone, C

    1997-12-31

    This paper reports the first comprehensive analysis of Displacement loop (D-loop) region sequences from ten different mammalian orders. It represents a systematic evolutionary study at the molecular level on regulatory homologous regions in organisms belonging to a well defined class, mammalia, which radiated about 150 million years ago (Mya). We have aligned and analyzed 26 complete D-loop region sequences available in the literature and the fat dormouse sequence, recently determined in our laboratory. The novelty of our alignment consists of the extensive manual revision of the preliminary output obtained by computer program to optimize sequence similarity, particularly for the two peripheral domains displaying heterogeneity in length and the presence of repeated sequences. The multialignment is available at the WWW site: http://www.ba.cnr.it/dloop.html. Our comparative study has allowed us to identify new conserved sequence blocks present in all the species under consideration and events of insertion/deletion which have important implications in both functional and evolutionary aspects. In particular we have detected two blocks, about 60 bp long, extended termination associated sequences (ETAS1 and ETAS2) conserved in all the organisms considered. Evaluation against experimental work suggests a possible functional role of ETAS1 and ETAS2 in the regulation of replication and transcription and targeted experimental approaches. The analyses on conserved sequence blocks (CSBs) clearly indicate that CSB1 is the only very essential element, common to all mammalian mt genomes, while CSB2 and CSB3 could be involved in different though related functions, probably species specific, and thus more linked to nuclear mitochondrial coevolutionary processes. Our hypothesis on the different functional implications of the conserved elements, CSBs and TASs, reported so far as main regulatory signals, would explain the different conservation of these elements in evolution. 

  10. Intronic motif pairs cooperate across exons to promote pre-mRNA splicing

    PubMed Central

    2010-01-01

    Background A very early step in splice site recognition is exon definition, a process that is as yet poorly understood. Communication between the two ends of an exon is thought to be required for this step. We report genome-wide evidence for exons being defined through the combinatorial activity of motifs located in flanking intronic regions. Results Strongly co-occurring motifs were found to specifically reside in four intronic regions surrounding a large number of human exons. These paired motifs occur around constitutive and alternative exons but not pseudo exons. Most co-occurring motifs are limited to intronic regions within 100 nucleotides of the exon. They are preferentially associated with weaker exons. Their pairing is conserved in evolution and they exhibit a lower frequency of single nucleotide polymorphism when paired. Paired motifs display specificity with respect to distance from the exon borders and in constitutive versus alternative splicing. Many resemble binding sites for heterogeneous nuclear ribonucleoproteins. Specific pairs are associated with tissue-specific genes, the higher expression of which coincides with that of the pertinent RNA binding proteins. Tested pairs acted synergistically to enhance exon inclusion, and this enhancement was found to be exon-specific. Conclusions The exon-flanking sequence pairs identified here by genomic analysis promote exon inclusion and may play a role in the exon definition step in pre-mRNA splicing. We propose a model in which multiple concerted interactions are required between exonic sequences and flanking intronic sequences to effect exon definition. PMID:20704715

  11. Structural Basis for WDR5 Interaction (Win) Motif Recognition in Human SET1 Family Histone Methyltransferases*

    PubMed Central

    Dharmarajan, Venkatasubramanian; Lee, Jeong-Heon; Patel, Anamika; Skalnik, David G.; Cosgrove, Michael S.

    2012-01-01

    Translocations and amplifications of the mixed lineage leukemia-1 (MLL1) gene are associated with aggressive myeloid and lymphocytic leukemias in humans. MLL1 is a member of the SET1 family of histone H3 lysine 4 (H3K4) methyltransferases, which are required for transcription of genes involved in hematopoiesis and development. MLL1 associates with a subcomplex containing WDR5, RbBP5, Ash2L, and DPY-30 (WRAD), which together form the MLL1 core complex that is required for sequential mono- and dimethylation of H3K4. We previously demonstrated that WDR5 binds the conserved WDR5 interaction (Win) motif of MLL1 in vitro, an interaction that is required for the H3K4 dimethylation activity of the MLL1 core complex. In this investigation, we demonstrate that arginine 3765 of the MLL1 Win motif is required to co-immunoprecipitate WRAD from mammalian cells, suggesting that the WDR5-Win motif interaction is important for the assembly of the MLL1 core complex in vivo. We also demonstrate that peptides that mimic SET1 family Win motif sequences inhibit H3K4 dimethylation by the MLL1 core complex with varying degrees of efficiency. To understand the structural basis for these differences, we determined structures of WDR5 bound to six different naturally occurring Win motif sequences at resolutions ranging from 1.9 to 1.2 Å. Our results reveal that binding energy differences result from interactions between non-conserved residues C-terminal to the Win motif and to a lesser extent from subtle variation of residues within the Win motif. These results highlight a new class of methylation inhibitors that may be useful for the treatment of MLL1-related malignancies. PMID:22665483

  12. Lineage-Specific Conserved Noncoding Sequences of Plant Genomes: Their Possible Role in Nucleosome Positioning

    PubMed Central

    Hettiarachchi, Nilmini; Kryukov, Kirill; Sumiyama, Kenta; Saitou, Naruya

    2014-01-01

    Many studies on conserved noncoding sequences (CNSs) have found that CNSs are enriched significantly in regulatory sequence elements. We conducted whole-genome analysis on plant CNSs to identify lineage-specific CNSs in eudicots, monocots, angiosperms, and vascular plants based on the premise that lineage-specific CNSs define lineage-specific characters and functions in groups of organisms. We identified 27 eudicot, 204 monocot, 6,536 grass, 19 angiosperm, and 2 vascular plant lineage-specific CNSs (lengths range from 16 to 1,517 bp) that presumably originated in their respective common ancestors. A stronger constraint on the CNSs located in the untranslated regions was observed. The CNSs were often flanked by genes involved in transcription regulation. A drop of A+T content near the border of CNSs was observed and CNS regions showed a higher nucleosome occupancy probability. These CNSs are candidate regulatory elements, which are expected to define lineage-specific features of various plant groups. PMID:25364802

  13. Temporal motifs in time-dependent networks

    NASA Astrophysics Data System (ADS)

    Kovanen, Lauri; Karsai, Márton; Kaski, Kimmo; Kertész, János; Saramäki, Jari

    2011-11-01

    Temporal networks are commonly used to represent systems where connections between elements are active only for restricted periods of time, such as telecommunication, neural signal processing, biochemical reaction and human social interaction networks. We introduce the framework of temporal motifs to study the mesoscale topological-temporal structure of temporal networks in which the events of nodes do not overlap in time. Temporal motifs are classes of similar event sequences, where the similarity refers not only to topology but also to the temporal order of the events. We provide a mapping from event sequences to coloured directed graphs that enables an efficient algorithm for identifying temporal motifs. We discuss some aspects of temporal motifs, including causality and null models, and present basic statistics of temporal motifs in a large mobile call network.

  14. Discovery of Novel ncRNA Sequences in Multiple Genome Alignments on the Basis of Conserved and Stable Secondary Structures.

    PubMed

    Fu, Yinghan; Xu, Zhenjiang Zech; Lu, Zhi J; Zhao, Shan; Mathews, David H

    2015-01-01

    Recently, non-coding RNAs (ncRNAs) have been discovered with novel functions, and it has been appreciated that there is pervasive transcription of genomes. Moreover, many novel ncRNAs are not conserved on the primary sequence level. Therefore, de novo computational ncRNA detection that is accurate and efficient is desirable. The purpose of this study is to develop a ncRNA detection method based on conservation of structure in more than two genomes. A new method called Multifind, using Multilign, was developed. Multilign predicts the common secondary structure for multiple input sequences. Multifind then uses measures of structure conservation to estimate the probability that the input sequences are a conserved ncRNA using a classification support vector machine. Multilign is based on Dynalign, which folds and aligns two sequences simultaneously using a scoring scheme that does not include sequence identity; its structure prediction quality is therefore not affected by input sequence diversity. Additionally, ensemble defect was introduced to Multifind as an additional discriminating feature that quantifies the compactness of the folding space for a sequence. Benchmarks showed Multifind performs better than RNAz and LocARNATE+RNAz, a method that uses RNAz on structure alignments generated by LocARNATE, on testing sequences extracted from the Rfam database. For de novo ncRNA discovery in three genomes, Multifind and LocARNATE+RNAz had an advantage over RNAz in low similarity regions of genome alignments. Additionally, Multifind and LocARNATE+RNAz found different subsets of known ncRNA sequences, suggesting the two approaches are complementary. PMID:26075601

  15. Whole genome sequencing of Ethiopian highlanders reveals conserved hypoxia tolerance genes

    PubMed Central

    2014-01-01

    Background Although it has long been proposed that genetic factors contribute to adaptation to high altitude, such factors remain largely unverified. Recent advances in high-throughput sequencing have made it feasible to analyze genome-wide patterns of genetic variation in human populations. Since traditionally such studies surveyed only a small fraction of the genome, interpretation of the results was limited. Results We report here the results of the first whole genome resequencing-based analysis identifying genes that likely modulate high altitude adaptation in native Ethiopians residing at 3,500 m above sea level on Bale Plateau or Chennek field in Ethiopia. Using cross-population tests of selection, we identify regions with a significant loss of diversity, indicative of a selective sweep. We focus on a 208 kbp gene-rich region on chromosome 19, which is significant in both of the Ethiopian subpopulations sampled. This region contains eight protein-coding genes and spans 135 SNPs. To elucidate its potential role in hypoxia tolerance, we experimentally tested whether individual genes from the region affect hypoxia tolerance in Drosophila. Three genes significantly impact survival rates in low oxygen: cic, an ortholog of human CIC, Hsl, an ortholog of human LIPE, and Paf-AHα, an ortholog of human PAFAH1B3. Conclusions Our study reveals evolutionarily conserved genes that modulate hypoxia tolerance. In addition, we show that many of our results would likely be unattainable using data from exome sequencing or microarray studies. This highlights the importance of whole genome sequencing for investigating adaptation by natural selection. PMID:24555826

  16. Discovery and profiling of novel and conserved microRNAs during flower development in Carya cathayensis via deep sequencing.

    PubMed

    Wang, Zheng Jia; Huang, Jian Qin; Huang, You Jun; Li, Zheng; Zheng, Bing Song

    2012-08-01

    Hickory (Carya cathayensis Sarg.) is an economically important woody plant in China, but its long juvenile phase delays yield. MicroRNAs (miRNAs) are critical regulators of genes and important for normal plant development and physiology, including flower development. We used Solexa technology to sequence two small RNA libraries from two floral differentiation stages in hickory to identify miRNAs related to flower development. We identified 39 conserved miRNA sequences from 114 loci belonging to 23 families as well as two novel and ten potential novel miRNAs belonging to nine families. Moreover, 35 conserved miRNA*s and two novel miRNA*s were detected. Twenty miRNA sequences from 49 loci belonging to 11 families were differentially expressed; all were up-regulated at the later stage of flower development in hickory. Quantitative real-time PCR of 12 conserved miRNA sequences, five novel miRNA families, and two novel miRNA*s validated that all were expressed during hickory flower development, and the expression patterns were similar to those detected with Solexa sequencing. Finally, a total of 146 targets of the novel and conserved miRNAs were predicted. This study identified a diverse set of miRNAs that were closely related to hickory flower development and that could help in plant floral induction. PMID:22481137

  17. Detecting DNA regulatory motifs by incorporating positional trends in information content

    SciTech Connect

    Kechris, Katherina J.; van Zwet, Erik; Bickel, Peter J.; Eisen, Michael B.

    2004-05-04

    On the basis of the observation that conserved positions in transcription factor binding sites are often clustered together, we propose a simple extension to the model-based motif discovery methods. We assign position-specific prior distributions to the frequency parameters of the model, penalizing deviations from a specified conservation profile. Examples with both simulated and real data show that this extension helps discover motifs as the data become noisier or when there is a competing false motif.
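    The "conservation profile" mentioned in the abstract above is usually expressed as information content per motif position. The sketch below computes that quantity from a set of aligned binding sites, so that clusters of high-information positions can be seen. It illustrates only the profile, not the authors' position-specific priors; the function name and the toy sites are assumptions, and no pseudocounts or small-sample correction are applied.

```python
import math
from collections import Counter


def information_content(sites, alphabet="ACGT"):
    """Per-column information content (in bits) of equal-length aligned sites."""
    max_bits = math.log2(len(alphabet))  # 2 bits for DNA
    profile = []
    for column in zip(*sites):
        total = len(column)
        counts = Counter(column)
        # Shannon entropy of the observed base frequencies in this column.
        entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
        profile.append(max_bits - entropy)
    return profile


# Toy aligned sites (hypothetical): column 2 is variable, the rest are fixed.
sites = ["ACGT", "AAGT", "ATGT", "AGGT"]
print(information_content(sites))
```

    A fully conserved column scores 2 bits and a uniformly mixed column scores 0, so a run of conserved positions shows up as a block of values near 2.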

  18. Identification of Novel N-Glycosylation Sites at Noncanonical Protein Consensus Motifs.

    PubMed

    Lowenthal, Mark S; Davis, Kiersta S; Formolo, Trina; Kilpatrick, Lisa E; Phinney, Karen W

    2016-07-01

    N-glycosylation of proteins is well known to occur at asparagine residues that fall within the canonical consensus sequence N-X-S/T but has also been identified at a small number of asparagine residues within N-X-C motifs, including the N491 residue of human serotransferrin. Here we report novel glycosylation sites within noncanonical consensus motifs, in the conformation N-X-C, based on mass spectrometry analysis of partially deglycosylated glycopeptide targets. Alpha-1-acid glycoprotein (A1AG) and serotransferrin (Tf) were observed for the first time to be N-glycosylated on asparagine residues within a total of six unique noncanonical motifs. N-glycosylation was initially predicted in silico based on the evolutionary conservation of the N-X-C motif among related mammalian species and demonstrated experimentally in A1AG from porcine, canine, and feline sources and in human serotransferrin. High-resolution liquid chromatography-tandem mass spectrometry was employed to collect fragmentation data of predicted GlcNAcylated peptides and to assign modification sites within N-X-C motifs. A combination of targeted analytical techniques that includes complementary mass spectrometry platforms, enzymatic digestions, and partial-deglycosylation procedures was developed to confirm the novel observations. Additionally, we found that A1AG in porcine and canine sources is highly N-glycosylated at a noncanonical motif (N-Q-C) based on semiquantitative multiple reaction monitoring analysis, the first report of an N-X-C motif exhibiting substantial N-glycosylation. Although reports of N-X-C motif N-glycosylation are relatively uncommon in the literature, this work adds to a growing list of glycoproteins reported with glycosylation at various forms of noncanonical motifs. PMID:27246700
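    As a rough illustration of the in silico step described in the abstract above, the sketch below scans a protein sequence for canonical N-X-S/T and noncanonical N-X-C sequons. It assumes the common convention that X is any residue except proline, which the abstract does not state, and the function name `find_sequons` is hypothetical. Lookahead is used so that overlapping sequons are all reported.

```python
import re

# Assumption: X is any residue except proline (a common convention,
# not stated in the abstract).
CANONICAL = re.compile(r"N(?=[^P][ST])")   # N-X-S/T
NONCANONICAL = re.compile(r"N(?=[^P]C)")   # N-X-C


def find_sequons(sequence):
    """Return 0-based positions of the asparagine in each candidate sequon."""
    return {
        "N-X-S/T": [m.start() for m in CANONICAL.finditer(sequence)],
        "N-X-C": [m.start() for m in NONCANONICAL.finditer(sequence)],
    }


# Toy peptide (hypothetical): N-G-S, N-Q-C, and an excluded N-P-T.
print(find_sequons("ANGSNQCNPT"))
```

    A sequence match marks a candidate site only; occupancy still has to be confirmed experimentally, as the abstract does by mass spectrometry.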

  19. Mining Conditional Phosphorylation Motifs.

    PubMed

    Liu, Xiaoqing; Wu, Jun; Gong, Haipeng; Deng, Shengchun; He, Zengyou

    2014-01-01

    Phosphorylation motifs represent position-specific amino acid patterns around the phosphorylation sites in the set of phosphopeptides. Several algorithms have been proposed to uncover phosphorylation motifs, whereas the problem of efficiently discovering a set of significant motifs with sufficiently high coverage and non-redundancy still remains unsolved. Here we present a novel notion called conditional phosphorylation motifs. Through this new concept, the motifs whose over-expressiveness mainly benefits from its constituting parts can be filtered out effectively. To discover conditional phosphorylation motifs, we propose an algorithm called C-Motif for a non-redundant identification of significant phosphorylation motifs. C-Motif is implemented under the Apriori framework, and it tests the statistical significance together with the frequency of candidate motifs in a single stage. Experiments demonstrate that C-Motif outperforms some current algorithms such as MMFPh and Motif-All in terms of coverage and non-redundancy of the results and efficiency of the execution. The source code of C-Motif is available at: https://sourceforge.net/projects/cmotif/. PMID:26356863

  20. RSAT peak-motifs: motif analysis in full-size ChIP-seq datasets.

    PubMed

    Thomas-Chollier, Morgane; Herrmann, Carl; Defrance, Matthieu; Sand, Olivier; Thieffry, Denis; van Helden, Jacques

    2012-02-01

    ChIP-seq is increasingly used to characterize transcription factor binding and chromatin marks at a genomic scale. Various tools are now available to extract binding motifs from peak data sets. However, most approaches are only available as command-line programs, or via a website but with size restrictions. We present peak-motifs, a computational pipeline that discovers motifs in peak sequences, compares them with databases, exports putative binding sites for visualization in the UCSC genome browser and generates an extensive report suited for both naive and expert users. It relies on time- and memory-efficient algorithms enabling the treatment of several thousand peaks within minutes. Regarding time efficiency, peak-motifs outperforms all comparable tools by several orders of magnitude. We demonstrate its accuracy by analyzing data sets ranging from 4000 to 128,000 peaks for 12 embryonic stem cell-specific transcription factors. In all cases, the program finds the expected motifs and returns additional motifs potentially bound by cofactors. We further apply peak-motifs to discover tissue-specific motifs in peak collections for the p300 transcriptional co-activator. To our knowledge, peak-motifs is the only tool that performs a complete motif analysis and offers a user-friendly web interface without any restriction on sequence size or number of peaks. PMID:22156162

  1. Phylogenetic Inference From Conserved Sites Alignments

    SciTech Connect

    Grundy, W.N.; Naylor, G.J.P.

    1999-08-15

    Molecular sequences provide a rich source of data for inferring the phylogenetic relationships among species. However, recent work indicates that even an accurate multiple alignment of a large sequence set may yield an incorrect phylogeny and that the quality of the phylogenetic tree improves when the input consists only of the highly conserved, motif regions of the alignment. This work introduces two methods of producing multiple alignments that include only the conserved regions of the initial alignment. The first method retains conserved motifs, whereas the second retains individual conserved sites in the initial alignment. Using parsimony analysis on a mitochondrial data set containing 19 species among which the phylogenetic relationships are widely accepted, both conserved alignment methods produce better phylogenetic trees than the complete alignment. Unlike any of the 19 inference methods used before to analyze this data, both methods produce trees that are completely consistent with the known phylogeny. The motif-based method employs far fewer alignment sites for comparable error rates. For a larger data set containing mitochondrial sequences from 39 species, the site-based method produces a phylogenetic tree that is largely consistent with known phylogenetic relationships and suggests several novel placements.
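
    The site-based filtering described in this record can be illustrated with a minimal sketch. It is not the authors' method: the conservation criterion (fraction of sequences sharing the most common non-gap residue), the `threshold` parameter, and the function names `conserved_columns` and `filter_alignment` are all assumptions chosen for illustration.

```python
from collections import Counter


def conserved_columns(alignment, threshold=0.8):
    """Return indices of alignment columns judged conserved.

    Assumption: a column is 'conserved' when its most common non-gap
    residue occurs in at least `threshold` of all sequences.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have equal length")
    keep = []
    for col in range(length):
        residues = [row[col] for row in alignment]
        counts = Counter(r for r in residues if r != "-")
        if not counts:
            continue  # column contains only gaps
        top_count = counts.most_common(1)[0][1]
        if top_count / len(residues) >= threshold:
            keep.append(col)
    return keep


def filter_alignment(alignment, threshold=0.8):
    """Build a reduced alignment that contains only the conserved columns."""
    keep = conserved_columns(alignment, threshold)
    return ["".join(row[i] for i in keep) for row in alignment]


if __name__ == "__main__":
    toy = ["ACGT", "ACTT", "AGGT"]  # hypothetical toy alignment
    print(conserved_columns(toy, threshold=1.0))
    print(filter_alignment(toy, threshold=1.0))
```

    The reduced alignment would then be passed to whatever tree-building method is in use; the motif-based variant in the abstract would instead keep contiguous conserved blocks rather than individual columns.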

  2. Designing synthetic RNAs to determine the relevance of structural motifs in picornavirus IRES elements

    NASA Astrophysics Data System (ADS)

    Fernandez-Chamorro, Javier; Lozano, Gloria; Garcia-Martin, Juan Antonio; Ramajo, Jorge; Dotu, Ivan; Clote, Peter; Martinez-Salas, Encarnacion

    2016-04-01

    The function of Internal Ribosome Entry Site (IRES) elements is intimately linked to their RNA structure. Viral IRES elements are organized in modular domains consisting of one or more stem-loops that harbor conserved RNA motifs critical for internal initiation of translation. A conserved motif is the pyrimidine-tract located upstream of the functional initiation codon in type I and II picornavirus IRES. By computationally designing synthetic RNAs to fold into a structure that sequesters the polypyrimidine tract in a hairpin, we establish a correlation between predicted inaccessibility of the pyrimidine tract and IRES activity, as determined in both in vitro and in vivo systems. Our data supports the hypothesis that structural sequestration of the pyrimidine-tract within a stable hairpin inactivates IRES activity, since the stronger the stability of the hairpin the higher the inhibition of protein synthesis. Destabilization of the stem-loop immediately upstream of the pyrimidine-tract also decreases IRES activity. Our work introduces a hybrid computational/experimental method to determine the importance of structural motifs for biological function. Specifically, we show the feasibility of using the software RNAiFold to design synthetic RNAs with particular sequence and structural motifs that permit subsequent experimental determination of the importance of such motifs for biological function.

  3. Designing synthetic RNAs to determine the relevance of structural motifs in picornavirus IRES elements

    PubMed Central

    Fernandez-Chamorro, Javier; Lozano, Gloria; Garcia-Martin, Juan Antonio; Ramajo, Jorge; Dotu, Ivan; Clote, Peter; Martinez-Salas, Encarnacion

    2016-01-01

    The function of Internal Ribosome Entry Site (IRES) elements is intimately linked to their RNA structure. Viral IRES elements are organized in modular domains consisting of one or more stem-loops that harbor conserved RNA motifs critical for internal initiation of translation. A conserved motif is the pyrimidine-tract located upstream of the functional initiation codon in type I and II picornavirus IRES. By computationally designing synthetic RNAs to fold into a structure that sequesters the polypyrimidine tract in a hairpin, we establish a correlation between predicted inaccessibility of the pyrimidine tract and IRES activity, as determined in both in vitro and in vivo systems. Our data supports the hypothesis that structural sequestration of the pyrimidine-tract within a stable hairpin inactivates IRES activity, since the stronger the stability of the hairpin the higher the inhibition of protein synthesis. Destabilization of the stem-loop immediately upstream of the pyrimidine-tract also decreases IRES activity. Our work introduces a hybrid computational/experimental method to determine the importance of structural motifs for biological function. Specifically, we show the feasibility of using the software RNAiFold to design synthetic RNAs with particular sequence and structural motifs that permit subsequent experimental determination of the importance of such motifs for biological function. PMID:27053355

  4. THE GRK4 SUBFAMILY OF G PROTEIN-COUPLED RECEPTOR KINASES: ALTERNATIVE SPLICING, GENE ORGANIZATION, AND SEQUENCE CONSERVATION

    EPA Science Inventory

    The GRK4 subfamily of G protein-coupled receptor kinases. Alternative splicing, gene organization, and sequence conservation.

    Premont RT, Macrae AD, Aparicio SA, Kendall HE, Welch JE, Lefkowitz RJ.

    Department of Medicine, Howard Hughes Medical Institute, Duke Univer...

  5. Nucleotide sequence of a cluster of early and late genes in a conserved segment of the vaccinia virus genome.

    PubMed Central

    Plucienniczak, A; Schroeder, E; Zettlmeissl, G; Streeck, R E

    1985-01-01

    The nucleotide sequence of a 7.6 kb vaccinia DNA segment from a genomic region conserved among different orthopox viruses has been determined. This segment contains a tight cluster of 12 partly overlapping open reading frames most of which can be correlated with previously identified early and late proteins and mRNAs. Regulatory signals used by vaccinia virus have been studied. Presumptive promoter regions are rich in A, T and carry the consensus sequences TATA and AATAA spaced at 20-24 base pairs. Tandem repeats of a CTATTC consensus sequence are proposed to be involved in the termination of early transcription. PMID:2987815
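
    A spaced pair of consensus sequences like the one reported here can be searched for with a regular expression. The sketch below is illustrative only. It assumes that "spaced at 20-24 base pairs" means 20 to 24 bases lie between the end of TATA and the start of AATAA, and that TATA comes first; the abstract does not state either point. The function name `find_promoter_like` is hypothetical.

```python
import re

# Lookahead lets overlapping candidates be reported.
# Assumption: 20-24 arbitrary bases separate TATA from AATAA.
PROMOTER_LIKE = re.compile(r"(?=(TATA[ACGT]{20,24}AATAA))")


def find_promoter_like(seq):
    """Return (0-based start, matched text) for each candidate region."""
    return [(m.start(), m.group(1)) for m in PROMOTER_LIKE.finditer(seq.upper())]


if __name__ == "__main__":
    toy = "GG" + "TATA" + "C" * 20 + "AATAA" + "GG"  # hypothetical sequence
    for start, text in find_promoter_like(toy):
        print(start, text)
```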

  6. Genotyping by sequencing resolves shallow population structure to inform conservation of Chinook salmon (Oncorhynchus tshawytscha)

    PubMed Central

    Larson, Wesley A; Seeb, Lisa W; Everett, Meredith V; Waples, Ryan K; Templin, William D; Seeb, James E

    2014-01-01

    Recent advances in population genomics have made it possible to detect previously unidentified structure, obtain more accurate estimates of demographic parameters, and explore adaptive divergence, potentially revolutionizing the way genetic data are used to manage wild populations. Here, we identified 10 944 single-nucleotide polymorphisms using restriction-site-associated DNA (RAD) sequencing to explore population structure, demography, and adaptive divergence in five populations of Chinook salmon (Oncorhynchus tshawytscha) from western Alaska. Patterns of population structure were similar to those of past studies, but our ability to assign individuals back to their region of origin was greatly improved (>90% accuracy for all populations). We also calculated effective size with and without removing physically linked loci identified from a linkage map, a novel method for nonmodel organisms. Estimates of effective size were generally above 1000 and were biased downward when physically linked loci were not removed. Outlier tests based on genetic differentiation identified 733 loci and three genomic regions under putative selection. These markers and genomic regions are excellent candidates for future research and can be used to create high-resolution panels for genetic monitoring and population assignment. This work demonstrates the utility of genomic data to inform conservation in highly exploited species with shallow population structure. PMID:24665338

  7. Genotyping by sequencing resolves shallow population structure to inform conservation of Chinook salmon (Oncorhynchus tshawytscha).

    PubMed

    Larson, Wesley A; Seeb, Lisa W; Everett, Meredith V; Waples, Ryan K; Templin, William D; Seeb, James E

    2014-03-01

    Recent advances in population genomics have made it possible to detect previously unidentified structure, obtain more accurate estimates of demographic parameters, and explore adaptive divergence, potentially revolutionizing the way genetic data are used to manage wild populations. Here, we identified 10 944 single-nucleotide polymorphisms using restriction-site-associated DNA (RAD) sequencing to explore population structure, demography, and adaptive divergence in five populations of Chinook salmon (Oncorhynchus tshawytscha) from western Alaska. Patterns of population structure were similar to those of past studies, but our ability to assign individuals back to their region of origin was greatly improved (>90% accuracy for all populations). We also calculated effective size with and without removing physically linked loci identified from a linkage map, a novel method for nonmodel organisms. Estimates of effective size were generally above 1000 and were biased downward when physically linked loci were not removed. Outlier tests based on genetic differentiation identified 733 loci and three genomic regions under putative selection. These markers and genomic regions are excellent candidates for future research and can be used to create high-resolution panels for genetic monitoring and population assignment. This work demonstrates the utility of genomic data to inform conservation in highly exploited species with shallow population structure. PMID:24665338

  8. Conserved Non-Coding Sequences are Associated with Rates of mRNA Decay in Arabidopsis

    PubMed Central

    Spangler, Jacob B.; Feltus, Frank Alex

    2013-01-01

    Steady-state mRNA levels are tightly regulated through a combination of transcriptional and post-transcriptional control mechanisms. The discovery of cis-acting DNA elements that encode these control mechanisms is of high importance. We have investigated the influence of conserved non-coding sequences (CNSs), DNA patterns retained after an ancient whole genome duplication event, on the breadth of gene expression and the rates of mRNA decay in Arabidopsis thaliana. The absence of CNSs near α duplicate genes was associated with a decrease in breadth of gene expression and slower mRNA decay rates while the presence of CNSs near α duplicates was associated with an increase in breadth of gene expression and faster mRNA decay rates. The observed difference in mRNA decay rate was fastest in genes with CNSs in both non-transcribed and transcribed regions, albeit through an unknown mechanism. This study supports the notion that some Arabidopsis CNSs regulate the steady-state mRNA levels through post-transcriptional control mechanisms and that CNSs also play a role in controlling the breadth of gene expression. PMID:23675377

  9. QColors: an algorithm for conservative viral quasispecies reconstruction from short and non-contiguous next generation sequencing reads.

    PubMed

    Huang, Austin; Kantor, Rami; DeLong, Allison; Schreier, Leeann; Istrail, Sorin

    Next generation sequencing technologies have recently been applied to characterize mutational spectra of the heterogeneous population of viral genotypes (known as a quasispecies) within HIV-infected patients. Such information is clinically relevant because minority genetic subpopulations of HIV within patients enable viral escape from selection pressures such as the immune response and antiretroviral therapy. However, methods for quasispecies sequence reconstruction from next generation sequencing reads are not yet widely used and remain an emerging area of research. Furthermore, the majority of research methodology in HIV has focused on 454 sequencing, while many next-generation sequencing platforms used in practice are limited to shorter read lengths relative to 454 sequencing. Little work has been done in determining how best to address the read length limitations of other platforms. The approach described here incorporates graph representations of both read differences and read overlap to conservatively determine the regions of the sequence with sufficient variability to separate quasispecies sequences. Within these tractable regions of quasispecies inference, we use constraint programming to solve for an optimal quasispecies subsequence determination via vertex coloring of the conflict graph, a representation which also lends itself to data with non-contiguous reads such as paired-end sequencing. We demonstrate the utility of the method by applying it to simulations based on actual intra-patient clonal HIV-1 sequencing data. PMID:23202421

  10. Experimental Support for the Evolution of Symmetric Protein Architecture from a Simple Peptide Motif

    SciTech Connect

    J Lee; M Blaber

    2011-12-31

    The majority of protein architectures exhibit elements of structural symmetry, and 'gene duplication and fusion' is the evolutionary mechanism generally hypothesized to be responsible for their emergence from simple peptide motifs. Despite the central importance of the gene duplication and fusion hypothesis, experimental support for a plausible evolutionary pathway for a specific protein architecture has yet to be effectively demonstrated. To address this question, a unique 'top-down symmetric deconstruction' strategy was utilized to successfully identify a simple peptide motif capable of recapitulating, via gene duplication and fusion processes, a symmetric protein architecture (the threefold symmetric β-trefoil fold). The folding properties of intermediary forms in this deconstruction agree precisely with a previously proposed 'conserved architecture' model for symmetric protein evolution. Furthermore, a route through foldable sequence-space between the simple peptide motif and extant protein fold is demonstrated. These results provide compelling experimental support for a plausible evolutionary pathway of symmetric protein architecture via gene duplication and fusion processes.

  11. Beta-turn propensities as paradigms for the analysis of structural motifs to engineer protein stability.

    PubMed Central

    Ohage, E. C.; Graml, W.; Walter, M. M.; Steinbacher, S.; Steipe, B.

    1997-01-01

    The thermodynamic stability of a protein provides an experimental metric for the relationship of protein sequence and native structure. We have investigated an approach based on an analysis of the structural database for stability engineering of an immunoglobulin variable domain. The most frequently occurring residues in specific positions of beta-turn motifs were predicted to increase the folding stability of mutants that were constructed by site-directed mutagenesis. Even in positions in which different residues are conserved in immunoglobulin sequences, the predictions were confirmed. Frequently, mutants with increased beta-turn propensities display increased folding cooperativities, suggesting pronounced effects on the unfolded state independent of the expected effect on conformational entropy. We conclude that structural motifs with predominantly local interactions can serve as templates with which patterns of sequence preferences can be extracted from the database of protein structures. Such preferences can predict the stability effects of mutations for protein engineering and design. PMID:9007995

  12. Members of the Meloidogyne avirulence protein family contain multiple plant ligand-like motifs.

    PubMed

    Rutter, William B; Hewezi, Tarek; Maier, Tom R; Mitchum, Melissa G; Davis, Eric L; Hussey, Richard S; Baum, Thomas J

    2014-08-01

    Sedentary plant-parasitic nematodes engage in complex interactions with their host plants by secreting effector proteins. Some effectors of both root-knot nematodes (Meloidogyne spp.) and cyst nematodes (Heterodera and Globodera spp.) mimic plant ligand proteins. Most prominently, cyst nematodes secrete effectors that mimic plant CLAVATA3/ESR-related (CLE) ligand proteins. However, only cyst nematodes have been shown to secrete such effectors and to utilize CLE ligand mimicry in their interactions with host plants. Here, we document the presence of ligand-like motifs in bona fide root-knot nematode effectors that are most similar to CLE peptides from plants and cyst nematodes. We have identified multiple tandem CLE-like motifs conserved within the previously identified Meloidogyne avirulence protein (MAP) family that are secreted from root-knot nematodes and have been shown to function in planta. By searching all 12 MAP family members from multiple Meloidogyne spp., we identified 43 repetitive CLE-like motifs composing 14 unique variants. At least one CLE-like motif was conserved in each MAP family member. Furthermore, we documented the presence of other conserved sequences that resemble the variable domains described in Heterodera and Globodera CLE effectors. These findings document that root-knot nematodes appear to use CLE ligand mimicry and point toward a common host node targeted by two evolutionarily diverse groups of nematodes. As a consequence, it is likely that CLE signaling pathways are important in other phytonematode pathosystems as well. PMID:25014776

  13. Sampling Motif-Constrained Ensembles of Networks

    NASA Astrophysics Data System (ADS)

    Fischer, Rico; Leitão, Jorge C.; Peixoto, Tiago P.; Altmann, Eduardo G.

    2015-10-01

    The statistical significance of network properties is conditioned on null models which satisfy specified properties but that are otherwise random. Exponential random graph models are a principled theoretical framework to generate such constrained ensembles, but which often fail in practice, either due to model inconsistency or due to the impossibility to sample networks from them. These problems affect the important case of networks with prescribed clustering coefficient or number of small connected subgraphs (motifs). In this Letter we use the Wang-Landau method to obtain a multicanonical sampling that overcomes both these problems. We sample, in polynomial time, networks with arbitrary degree sequences from ensembles with imposed motifs counts. Applying this method to social networks, we investigate the relation between transitivity and homophily, and we quantify the correlation between different types of motifs, finding that single motifs can explain up to 60% of the variation of motif profiles.

  14. Conserved Patterns of Microbial Immune Escape: Pathogenic Microbes of Diverse Origin Target the Human Terminal Complement Inhibitor Vitronectin via a Single Common Motif

    PubMed Central

    Kraiczy, Peter; Hammerschmidt, Sven; Skerka, Christine; Zipfel, Peter F.; Riesbeck, Kristian

    2016-01-01

    Pathogenicity of many microbes relies on their capacity to resist innate immunity, and to survive and persist in an immunocompetent human host microbes have developed highly efficient and sophisticated complement evasion strategies. Here we show that different human pathogens including Gram-negative and Gram-positive bacteria, as well as the fungal pathogen Candida albicans, acquire the human terminal complement regulator vitronectin to their surface. By using truncated vitronectin fragments we found that all analyzed microbial pathogens (n = 13) bound human vitronectin via the same C-terminal heparin-binding domain (amino acids 352–374). This specific interaction leaves the terminal complement complex (TCC) regulatory region of vitronectin accessible, allowing inhibition of C5b-7 membrane insertion and C9 polymerization. Vitronectin complexed with the various microbes and corresponding proteins was thus functionally active and inhibited complement-mediated C5b-9 deposition. Taken together, diverse microbial pathogens expressing different structurally unrelated vitronectin-binding molecules interact with host vitronectin via the same conserved region to allow versatile control of the host innate immune response. PMID:26808444

  15. An Annotated Catalog of Inverted Repeats of Caenorhabditis elegans Chromosomes III and X, with Observations Concerning Odd/Even Biases and Conserved Motifs

    PubMed Central

    LeBlanc, Mark D.; Aspeslagh, Glen; Buggia, Nathan P.; Dyer, Betsey D.

    2000-01-01

    We have taken a computational approach to the problem of discovering and deciphering the grammar and syntax of gene regulation in eukaryotes. A logical first step is to produce an annotated catalog of all regulatory sites in a given genome. Likely candidates for such sites are direct and indirect repeats, including three subcategories of indirect repeats: inverted (palindromic), everted, and mirror-image repeats. To that end we have produced a searchable database of inverted repeats of chromosomes III and X of Caenorhabditis elegans, the first completely sequenced multicellular eukaryote. Initial results from the use of this catalog are observations concerning odd/even biases in perfect IRs. The potential usefulness of the catalog as a discovery tool for promoters was shown for some of the genes involved with G-protein functions and for heat shock protein 104 (hsp104). PMID:10984456
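
    As a rough illustration of what cataloguing inverted repeats involves, the sketch below scans a DNA string for perfect inverted repeats, meaning a left arm followed (after an optional spacer) by its reverse complement. This is an assumed, simplified definition and not the authors' pipeline; the parameters `arm` and `spacer` and the function names are hypothetical. With a spacer of 0 the total length is even, and a one-base spacer gives an odd total length, which is one plausible reading of the odd/even distinction mentioned in the abstract.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string (other symbols pass through)."""
    return seq.translate(COMPLEMENT)[::-1]


def find_inverted_repeats(seq, arm=4, spacer=0):
    """Return (0-based start, full repeat text) for each perfect inverted repeat.

    A hit is a window of length 2*arm + spacer whose right arm equals the
    reverse complement of its left arm.
    """
    seq = seq.upper()
    span = 2 * arm + spacer
    hits = []
    for i in range(len(seq) - span + 1):
        left = seq[i:i + arm]
        right = seq[i + arm + spacer:i + span]
        if right == reverse_complement(left):
            hits.append((i, seq[i:i + span]))
    return hits


if __name__ == "__main__":
    toy = "CCGAATTCGG"  # hypothetical sequence containing GAATTC
    print(find_inverted_repeats(toy, arm=3))
    print(find_inverted_repeats("GAACTTC", arm=3, spacer=1))
```

    A chromosome-scale catalog would need a faster approach than this quadratic-in-practice window scan, plus annotation of each hit against gene coordinates.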

  16. A survey of motif finding Web tools for detecting binding site motifs in ChIP-Seq data

    PubMed Central

    2014-01-01

    Abstract ChIP-Seq (chromatin immunoprecipitation sequencing) has provided an advantage for finding motifs, as ChIP-Seq experiments narrow down motif finding to binding site locations. Recent motif finding tools facilitate motif detection by providing user-friendly Web interfaces. In this work, we reviewed nine motif finding Web tools that are capable of detecting binding site motifs in ChIP-Seq data. We showed that each motif finding Web tool has its own advantages for detecting motifs that other tools may not discover. We recommended that users use multiple motif finding Web tools that implement different algorithms to obtain significant motifs, overlapping similar motifs, and non-overlapping motifs. Finally, we provided our suggestions for the future development of motif finding Web tools that better assist researchers in finding motifs in ChIP-Seq data. Reviewers This article was reviewed by Prof. Sandor Pongor, Dr. Yuriy Gusev, and Dr. Shyam Prabhakar (nominated by Prof. Limsoon Wong). PMID:24555784

  17. Novel sequences encoding venom C-type lectins are conserved in phylogenetically and geographically distinct Echis and Bitis viper species.

    PubMed

    Harrison, R A; Oliver, J; Hasson, S S; Bharati, K; Theakston, R D G

    2003-10-01

    Envenoming by Echis saw scaled vipers and Bitis arietans puff adders is the leading cause of death and morbidity in Africa due to snake bite. Despite their medical importance, the composition and constituent functionality of venoms from these vipers remains poorly understood. Here, we report the cloning of cDNA sequences encoding seven clusters or isoforms of the haemostasis-disruptive C-type lectin (CTL) proteins from the venom glands of Echis ocellatus, E. pyramidum leakeyi, E. carinatus sochureki and B. arietans. All these CTL sequences encoded the cysteine scaffold that defines the carbohydrate-recognition domain of mammalian CTLs. All but one of the Echis and Bitis CTL sequences showed greater sequence similarity to the beta than alpha CTL subunits in venoms of related Asian and American vipers. Four of the new CTL clusters showed marked inter-cluster sequence conservation across all four viper species which were significantly different from that of previously published viper CTLs. The other three Echis and Bitis CTL clusters showed varying degrees of sequence similarity to published viper venom CTLs. Because viper venom CTLs exhibit a high degree of sequence similarity and yet exert profoundly different effects on the mammalian haemostatic system, no attempt was made to assign functionality to the new Echis and Bitis CTLs on the basis of sequence alone. The extraordinary level of inter-specific and inter-generic sequence conservation exhibited by the Echis and Bitis CTLs leads us to speculate that antibodies to representative molecules should neutralise the biological function of this important group of venom toxins in vipers that are distributed throughout Africa, the Middle East and the Indian subcontinent. PMID:14557069

  18. Protein engineering of selected residues from conserved sequence regions of a novel Anoxybacillus α-amylase.

    PubMed

    Ranjani, Velayudhan; Janeček, Stefan; Chai, Kian Piaw; Shahir, Shafinaz; Abdul Rahman, Raja Noor Zaliha Raja; Chan, Kok-Gan; Goh, Kian Mau

    2014-01-01

    The α-amylases from Anoxybacillus species (ASKA and ADTA), Bacillus aquimaris (BaqA) and Geobacillus thermoleovorans (GTA, Pizzo and GtamyII) were proposed as a novel group of the α-amylase family GH13. An ASKA yielding a high percentage of maltose upon its reaction on starch was chosen as a model to study the residues responsible for the biochemical properties. Four residues from conserved sequence regions (CSRs) were thus selected, and the mutants F113V (CSR-I), Y187F and L189I (CSR-II) and A161D (CSR-V) were characterised. Few changes in the optimum reaction temperature and pH were observed for all mutants. Whereas the Y187F (t1/2 43 h) and L189I (t1/2 36 h) mutants had a lower thermostability at 65°C than the native ASKA (t1/2 48 h), the mutants F113V and A161D exhibited an improved t1/2 of 51 h and 53 h, respectively. Among the mutants, only the A161D had a specific activity, k(cat) and k(cat)/K(m) higher (1.23-, 1.17- and 2.88-times, respectively) than the values determined for the ASKA. The replacement of the Ala-161 in the CSR-V with an aspartic acid also caused a significant reduction in the ratio of maltose formed. This finding suggests the Ala-161 may contribute to the high maltose production of the ASKA. PMID:25069018

  19. Two evolutionarily conserved sequence elements for Peg3/Usp29 transcription

    PubMed Central

    Kim, Jeong Do; Yu, Sungryul; Choo, Jung Ha; Kim, Joomyeong

    2008-01-01

    Background Two evolutionarily Conserved Sequence Elements, CSE1 and CSE2 (YY1 binding sites), are found within the 3.8-kb CpG island surrounding the bidirectional promoter of two imprinted genes, Peg3 (Paternally expressed gene 3) and Usp29 (Ubiquitin-specific protease 29). This CpG island is a likely ICR (Imprinting Control Region) that controls transcription of the 500-kb genomic region of the Peg3 imprinted domain. Results The current study investigated the functional roles of CSE1 and CSE2 in the transcriptional control of the two genes, Peg3 and Usp29, using cell line-based promoter assays. The mutation of 6 YY1 binding sites (CSE2) reduced the transcriptional activity of the bidirectional promoter in the Peg3 direction in an orientation-dependent manner, suggesting an activator role for CSE2 (YY1 binding sites). However, the activity in the Usp29 direction was not detectable regardless of the presence/absence of YY1 binding sites. In contrast, mutation of CSE1 increased the transcriptional activity of the promoter in both the Peg3 and Usp29 directions, suggesting a potential repressor role for CSE1. The observed repression by CSE1 was also orientation-dependent. Serial mutational analyses further narrowed down two separate 6-bp-long regions within the 42-bp-long CSE1 which are individually responsible for the repression of Peg3 and Usp29. Conclusion CSE2 (YY1 binding sites) functions as an activator for Peg3 transcription, while CSE1 acts as a repressor for the transcription of both Peg3 and Usp29. PMID:19068137

  20. Use of a Drosophila Genome-Wide Conserved Sequence Database to Identify Functionally Related cis-Regulatory Enhancers

    PubMed Central

    Brody, Thomas; Yavatkar, Amarendra S; Kuzin, Alexander; Kundu, Mukta; Tyson, Leonard J; Ross, Jermaine; Lin, Tzu-Yang; Lee, Chi-Hon; Awasaki, Takeshi; Lee, Tzumin; Odenwald, Ward F

    2012-01-01

    Background: Phylogenetic footprinting has revealed that cis-regulatory enhancers consist of conserved DNA sequence clusters (CSCs). Currently, there is no systematic approach for enhancer discovery and analysis that takes full-advantage of the sequence information within enhancer CSCs. Results: We have generated a Drosophila genome-wide database of conserved DNA consisting of >100,000 CSCs derived from EvoPrints spanning over 90% of the genome. cis-Decoder database search and alignment algorithms enable the discovery of functionally related enhancers. The program first identifies conserved repeat elements within an input enhancer and then searches the database for CSCs that score highly against the input CSC. Scoring is based on shared repeats as well as uniquely shared matches, and includes measures of the balance of shared elements, a diagnostic that has proven to be useful in predicting cis-regulatory function. To demonstrate the utility of these tools, a temporally-restricted CNS neuroblast enhancer was used to identify other functionally related enhancers and analyze their structural organization. Conclusions: cis-Decoder reveals that co-regulating enhancers consist of combinations of overlapping shared sequence elements, providing insights into the mode of integration of multiple regulating transcription factors. The database and accompanying algorithms should prove useful in the discovery and analysis of enhancers involved in any developmental process. Developmental Dynamics 241:169–189, 2012. © 2011 Wiley Periodicals, Inc. Key findings A genome-wide catalog of Drosophila conserved DNA sequence clusters. cis-Decoder discovers functionally related enhancers. Functionally related enhancers share balanced sequence element copy numbers. Many enhancers function during multiple phases of development. PMID:22174086

  1. A novel in vitro replication system for Dengue virus. Initiation of RNA synthesis at the 3'-end of exogenous viral RNA templates requires 5'- and 3'-terminal complementary sequence motifs of the viral RNA.

    PubMed

    You, S; Padmanabhan, R

    1999-11-19

    Positive strand viral replicases are membrane-bound complexes of viral and host proteins. The mechanism of viral replication and the role of host proteins are not well understood. To understand this mechanism, a viral replicase assay that utilizes extracts from dengue virus-infected mosquito (C6/36) cells and exogenous viral RNA templates is reported in this study. The 5'- and 3'-terminal regions (TR) of the template RNAs contain the conserved elements including the complementary (cyclization) motifs and stem-loop structures. RNA synthesis in vitro requires both 5'- and 3'-TR present in the same template molecule or when the 5'-TR RNA was added in trans to the 3'-untranslated region (UTR) RNA. However, the 3'-UTR RNA alone is not active. RNA synthesis occurs by elongation of the 3'-end of the template RNA to yield predominantly a double-stranded hairpin-like RNA product, twice the size of the template RNA. These results suggest that an interaction between 5'- and 3'-TR of the viral RNA that modulates the 3'-UTR RNA structure is required for RNA synthesis by the viral replicase. The complementary cyclization motifs of the viral genome also seem to play an important role in this interaction. PMID:10559263

  2. DNA Motif Databases and Their Uses.

    PubMed

    Stormo, Gary D

    2015-01-01

    Transcription factors (TFs) recognize and bind to specific DNA sequences. The specificity of a TF is usually represented as a position weight matrix (PWM). Several databases of DNA motifs exist and are used in biological research to address important biological questions. This overview describes PWMs and some of the most commonly used motif databases, as well as a few of their common applications. PMID:26334922
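
    To make the position weight matrix idea concrete, here is a minimal sketch that builds a log-odds PWM from a handful of aligned binding sites and scans a sequence with it. The pseudocount of 1, the uniform background of 0.25, the toy sites, and the function names are assumptions for illustration; motif databases use their own conventions.

```python
import math

BASES = "ACGT"


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log2-odds PWM (list of per-position dicts) from equal-length sites."""
    width = len(sites[0])
    total = len(sites) + pseudocount * len(BASES)
    pwm = []
    for pos in range(width):
        column = [site[pos] for site in sites]
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in BASES
        })
    return pwm


def score(pwm, window):
    """Sum the per-position log-odds for a window of the same width as the PWM."""
    return sum(column[base] for column, base in zip(pwm, window))


def best_hit(pwm, seq):
    """Return (score, 0-based position) of the highest-scoring window.

    Assumes `seq` contains only uppercase A, C, G, T.
    """
    width = len(pwm)
    return max((score(pwm, seq[i:i + width]), i) for i in range(len(seq) - width + 1))


if __name__ == "__main__":
    sites = ["TATA", "TATA", "TATT", "TACA"]  # hypothetical binding sites
    pwm = build_pwm(sites)
    print(best_hit(pwm, "GGCTATAGG"))
```

    Positive scores indicate a window that looks more like the motif than like background; in practice a score threshold is chosen per matrix.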

  3. The Drosophila juvenile hormone receptor candidates methoprene-tolerant (MET) and germ cell-expressed (GCE) utilize a conserved LIXXL motif to bind the FTZ-F1 nuclear receptor.

    PubMed

    Bernardo, Travis J; Dubrovsky, Edward B

    2012-03-01

    Juvenile hormone (JH) has been implicated in many developmental processes in holometabolous insects, but its mechanism of signaling remains controversial. We previously found that in Drosophila Schneider 2 cells, the nuclear receptor FTZ-F1 is required for activation of the E75A gene by JH. Here, we utilized insect two-hybrid assays to show that FTZ-F1 interacts with two JH receptor candidates, the bHLH-PAS paralogs MET and GCE, in a JH-dependent manner. These interactions are severely reduced when helix 12 of the FTZ-F1 activation function 2 (AF2) is removed, implicating AF2 as an interacting site. Through homology modeling, we found that MET and GCE possess a C-terminal α-helix featuring a conserved motif LIXXL that represents a novel nuclear receptor (NR) box. Docking simulations supported by two-hybrid experiments revealed that FTZ-F1·MET and FTZ-F1·GCE heterodimer formation involves a typical NR box-AF2 interaction but does not require the canonical charge clamp residues of FTZ-F1 and relies primarily on hydrophobic contacts, including a unique interaction with helix 4. Moreover, we identified paralog-specific features, including a secondary interaction site found only in MET. Our findings suggest that a novel NR box enables MET and GCE to interact JH-dependently with the AF2 of FTZ-F1. PMID:22249180

  4. Dominant sequences of human major histocompatibility complex conserved extended haplotypes from HLA-DQA2 to DAXX.

    PubMed

    Larsen, Charles E; Alford, Dennis R; Trautwein, Michael R; Jalloh, Yanoh K; Tarnacki, Jennifer L; Kunnenkeri, Sushruta K; Fici, Dolores A; Yunis, Edmond J; Awdeh, Zuheir L; Alper, Chester A

    2014-10-01

    We resequenced and phased 27 kb of DNA within 580 kb of the MHC class II region in 158 population chromosomes, most of which were conserved extended haplotypes (CEHs) of European descent or contained their centromeric fragments. We determined the single nucleotide polymorphism and deletion-insertion polymorphism alleles of the dominant sequences from HLA-DQA2 to DAXX for these CEHs. Nine of 13 CEHs remained sufficiently intact to possess a dominant sequence extending at least to DAXX, 230 kb centromeric to HLA-DPB1. We identified the regions centromeric to HLA-DQB1 within which single instances of eight "common" European MHC haplotypes previously sequenced by the MHC Haplotype Project (MHP) were representative of those dominant CEH sequences. Only two MHP haplotypes had a dominant CEH sequence throughout the centromeric and extended class II region and one MHP haplotype did not represent a known European CEH anywhere in the region. We identified the centromeric recombination transition points of other MHP sequences from CEH representation to non-representation. Several CEH pairs or groups shared sequence identity in small blocks but had significantly different (although still conserved for each separate CEH) sequences in surrounding regions. These patterns partly explain strong calculated linkage disequilibrium over only short (tens to hundreds of kilobases) distances in the context of a finite number of observed megabase-length CEHs comprising half a population's haplotypes. Our results provide a clearer picture of European CEH class II allelic structure and population haplotype architecture, improved regional CEH markers, and raise questions concerning regional recombination hotspots. PMID:25299700

  5. Dominant Sequences of Human Major Histocompatibility Complex Conserved Extended Haplotypes from HLA-DQA2 to DAXX

    PubMed Central

    Larsen, Charles E.; Alford, Dennis R.; Trautwein, Michael R.; Jalloh, Yanoh K.; Tarnacki, Jennifer L.; Kunnenkeri, Sushruta K.; Fici, Dolores A.; Yunis, Edmond J.; Awdeh, Zuheir L.; Alper, Chester A.

    2014-01-01

    We resequenced and phased 27 kb of DNA within 580 kb of the MHC class II region in 158 population chromosomes, most of which were conserved extended haplotypes (CEHs) of European descent or contained their centromeric fragments. We determined the single nucleotide polymorphism and deletion-insertion polymorphism alleles of the dominant sequences from HLA-DQA2 to DAXX for these CEHs. Nine of 13 CEHs remained sufficiently intact to possess a dominant sequence extending at least to DAXX, 230 kb centromeric to HLA-DPB1. We identified the regions centromeric to HLA-DQB1 within which single instances of eight “common” European MHC haplotypes previously sequenced by the MHC Haplotype Project (MHP) were representative of those dominant CEH sequences. Only two MHP haplotypes had a dominant CEH sequence throughout the centromeric and extended class II region and one MHP haplotype did not represent a known European CEH anywhere in the region. We identified the centromeric recombination transition points of other MHP sequences from CEH representation to non-representation. Several CEH pairs or groups shared sequence identity in small blocks but had significantly different (although still conserved for each separate CEH) sequences in surrounding regions. These patterns partly explain strong calculated linkage disequilibrium over only short (tens to hundreds of kilobases) distances in the context of a finite number of observed megabase-length CEHs comprising half a population's haplotypes. Our results provide a clearer picture of European CEH class II allelic structure and population haplotype architecture, improved regional CEH markers, and raise questions concerning regional recombination hotspots. PMID:25299700

  6. An update on cell surface proteins containing extensin-motifs.

    PubMed

    Borassi, Cecilia; Sede, Ana R; Mecchia, Martin A; Salgado Salter, Juan D; Marzol, Eliana; Muschietti, Jorge P; Estevez, Jose M

    2016-01-01

    In recent years it has become clear that there are several molecular links that interconnect the plant cell surface continuum, which is highly important in many biological processes such as plant growth, development, and interaction with the environment. The plant cell surface continuum can be defined as the space that contains and interlinks the cell wall, plasma membrane and cytoskeleton compartments. In this review, we provide an updated view of cell surface proteins that include modular domains with an extensin (EXT)-motif followed by a cytoplasmic kinase-like domain, known as PERKs (for proline-rich extensin-like receptor kinases); with an EXT-motif and an actin binding domain, known as formins; and with extracellular hybrid-EXTs. We focus our attention on the EXT-motifs with the short sequence Ser-Pro(3-5), which is found in several different protein contexts within the same extracellular space, highlighting a putative conserved structural and functional role. A closer understanding of the dynamic regulation of plant cell surface continuum and its relationship with the downstream signalling cascade is a crucial forthcoming challenge. PMID:26475923
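
    The short EXT-motif Ser-Pro(3-5) mentioned in this record can be written as a regular expression. The sketch below is a minimal illustration in Python; the pattern `SP{3,5}`, the function name, and the test sequence are assumptions for the example, and the one-letter sequence does not model proline hydroxylation.

```python
import re

# One serine followed by three to five prolines, in one-letter code.
EXT_MOTIF = re.compile(r"SP{3,5}")


def find_ext_motifs(protein):
    """Return (start, matched_text) for each non-overlapping Ser-Pro(3-5) run."""
    return [(m.start(), m.group())
            for m in EXT_MOTIF.finditer(protein.upper())]


# Hypothetical sequence: two qualifying runs and one SPP run that is too short.
hits = find_ext_motifs("MASPPPPKTSPPPAASPPG")
```

    Because the quantifier is greedy and capped at five, a serine followed by more than five prolines is reported as a five-proline match; whether such runs should count is a choice the pattern leaves open.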

  7. Motif analysis unveils the possible co-regulation of chloroplast genes and nuclear genes encoding chloroplast proteins.

    PubMed

    Wang, Ying; Ding, Jun; Daniell, Henry; Hu, Haiyan; Li, Xiaoman

    2012-09-01

    Chloroplasts play critical roles in land plant cells. Despite their importance and the availability of at least 200 sequenced chloroplast genomes, the number of known DNA regulatory sequences in chloroplast genomes is limited. In this paper, we designed computational methods to systematically study putative DNA regulatory sequences in intergenic regions near chloroplast genes in seven plant species and in promoter sequences of nuclear genes in Arabidopsis and rice. We found that -35/-10 elements alone cannot explain the transcriptional regulation of chloroplast genes. We also concluded that motifs shared by the intergenic sequences of most chloroplast genes are unlikely to exist, indicating that these genes are regulated differently. Finally and surprisingly, we found that five conserved motifs, each of which occurs in no more than six chloroplast intergenic sequences, are significantly shared by promoters of nuclear genes encoding chloroplast proteins. By integrating information from gene function annotation, protein subcellular localization analyses, protein-protein interaction data, and gene expression data, we further showed support for the functionality of these conserved motifs. Our study implies the existence of unknown nuclear-encoded transcription factors that regulate both chloroplast genes and nuclear genes encoding chloroplast proteins, which sheds light on the transcriptional regulation of chloroplast genes. PMID:22733202

  8. High-throughput genomic sequencing of cassava bacterial blight strains identifies conserved effectors to target for durable resistance.

    PubMed

    Bart, Rebecca; Cohn, Megan; Kassen, Andrew; McCallum, Emily J; Shybut, Mikel; Petriello, Annalise; Krasileva, Ksenia; Dahlbeck, Douglas; Medina, Cesar; Alicai, Titus; Kumar, Lava; Moreira, Leandro M; Rodrigues Neto, Júlio; Verdier, Valerie; Santana, María Angélica; Kositcharoenkul, Nuttima; Vanderschuren, Hervé; Gruissem, Wilhelm; Bernal, Adriana; Staskawicz, Brian J

    2012-07-10

    Cassava bacterial blight (CBB), incited by Xanthomonas axonopodis pv. manihotis (Xam), is the most important bacterial disease of cassava, a staple food source for millions of people in developing countries. Here we present a widely applicable strategy for elucidating the virulence components of a pathogen population. We report Illumina-based draft genomes for 65 Xam strains and deduce the phylogenetic relatedness of Xam across the areas where cassava is grown. Using an extensive database of effector proteins from animal and plant pathogens, we identify the effector repertoire for each sequenced strain and use a comparative sequence analysis to deduce the least polymorphic of the conserved effectors. These highly conserved effectors have been maintained over 11 countries, three continents, and 70 y of evolution and as such represent ideal targets for developing resistance strategies. PMID:22699502

  9. Local Renyi entropic profiles of DNA sequences

    PubMed Central

    Vinga, Susana; Almeida, Jonas S

    2007-01-01

    Background In a recent report the authors presented a new measure of continuous entropy for DNA sequences, which allows the estimation of their randomness level. The definition therein explored was based on the Rényi entropy of probability density estimation (pdf) using the Parzen's window method and applied to Chaos Game Representation/Universal Sequence Maps (CGR/USM). Subsequent work proposed a fractal pdf kernel as a more exact solution for the iterated map representation. This report extends the concepts of continuous entropy by defining DNA sequence entropic profiles using the new pdf estimations to refine the density estimation of motifs. Results The new methodology enables two results. On the one hand it shows that the entropic profiles are directly related with the statistical significance of motifs, allowing the study of under and over-representation of segments. On the other hand, by spanning the parameters of the kernel function it is possible to extract important information about the scale of each conserved DNA region. The computational applications, developed in Matlab m-code, the corresponding binary executables and additional material and examples are made publicly available at . Conclusion The ability to detect local conservation from a scale-independent representation of symbolic sequences is particularly relevant for biological applications where conserved motifs occur in multiple, overlapping scales, with significant future applications in the recognition of foreign genomic material and inference of motif structures. PMID:17939871

  10. The Thiamin Pyrophosphate-Motif

    NASA Technical Reports Server (NTRS)

    Dominiak, Paulina M.; Ciszak, Ewa M.

    2003-01-01

    Using databases the authors have identified a common thiamin pyrophosphate (TPP)-motif in the family of functionally diverse TPP-dependent enzymes. This common motif consists of multimeric organization of subunits, two catalytic centers, common amino acid sequence, and specific contacts to provide a flip-flop, or alternate site, mechanism of action. Each catalytic center [PP:PYR] is formed at the interface of the PP-domain binding the magnesium ion, pyrophosphate and aminopyrimidine ring of TPP, and the PYR-domain binding the aminopyrimidine ring of that cofactor. A pair of these catalytic centers constitutes the catalytic core [PP:PYR]* within these enzymes. Analysis of the structural elements of this catalytic core reveals novel definition of the common amino acid sequences, which are GX@&(G)@XXGQ, and GDGX25-30 within the PP-domain, and the E&(G)@XXG@ within the PYR-domain, where Q, corresponds to a hydrophobic amino acid. This TPP-motif provides a novel tool for annotation of TPP-dependent enzymes useful in advancing functional proteomics.

  11. Identification of conserved genomic regions and variation therein amongst Cetartiodactyla species using next generation sequencing

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Background Next Generation Sequencing has created an opportunity to genetically characterize an individual both inexpensively and comprehensively. In earlier work produced in our collaboration [1], it was demonstrated that, for animals without a reference genome, their Next Generation Sequence data ...

  12. ConBind: motif-aware cross-species alignment for the identification of functional transcription factor binding sites.

    PubMed

    Lelieveld, Stefan H; Schütte, Judith; Dijkstra, Maurits J J; Bawono, Punto; Kinston, Sarah J; Göttgens, Berthold; Heringa, Jaap; Bonzanni, Nicola

    2016-05-01

    Eukaryotic gene expression is regulated by transcription factors (TFs) binding to promoters as well as distal enhancers. TFs recognize short, but specific binding sites (TFBSs) that are located within the promoter and enhancer regions. Functionally relevant TFBSs are often highly conserved during evolution leaving a strong phylogenetic signal. While multiple sequence alignment (MSA) is a potent tool to detect the phylogenetic signal, the current MSA implementations are optimized to align the maximum number of identical nucleotides. This approach might result in the omission of conserved motifs that contain interchangeable nucleotides such as the ETS motif (IUPAC code: GGAW). Here, we introduce ConBind, a novel method to enhance alignment of short motifs, even if their mutual sequence similarity is only partial. ConBind improves the identification of conserved TFBSs by improving the alignment accuracy of TFBS families within orthologous DNA sequences. Functional validation of the Gfi1b +13 enhancer reveals that ConBind identifies additional functionally important ETS binding sites that were missed by all other tested alignment tools. In addition to the analysis of known regulatory regions, our web tool is useful for the analysis of TFBSs on so far unknown DNA regions identified through ChIP-sequencing. PMID:26721389
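
    The IUPAC notation used for the ETS motif (GGAW, where W stands for A or T) can be expanded into a regular expression to locate candidate sites. The Python sketch below illustrates only that expansion; it is not the ConBind alignment method, and the function names and the test sequence are assumptions for the example.

```python
import re

# Standard IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]",
    "N": "[ACGT]",
}


def iupac_to_regex(motif):
    """Translate an IUPAC motif such as 'GGAW' into a regex string."""
    return "".join(IUPAC[code] for code in motif.upper())


def find_sites(motif, sequence):
    """Return (start, matched_text) for every forward-strand match.

    A lookahead is used so that overlapping matches are also reported.
    """
    pattern = re.compile("(?=(" + iupac_to_regex(motif) + "))")
    return [(m.start(), m.group(1))
            for m in pattern.finditer(sequence.upper())]


# Hypothetical sequence containing one GGAA and one GGAT site.
sites = find_sites("GGAW", "ACGGAAGTGGATCC")
```

    Exact matching of this kind treats GGAA and GGAT as equally valid instances of the motif, which is the interchangeability that identity-based alignment scoring can miss.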

  13. ConBind: motif-aware cross-species alignment for the identification of functional transcription factor binding sites

    PubMed Central

    Lelieveld, Stefan H.; Schütte, Judith; Dijkstra, Maurits J.J.; Bawono, Punto; Kinston, Sarah J.; Göttgens, Berthold; Heringa, Jaap; Bonzanni, Nicola

    2016-01-01

    Eukaryotic gene expression is regulated by transcription factors (TFs) binding to promoters as well as distal enhancers. TFs recognize short, but specific binding sites (TFBSs) that are located within the promoter and enhancer regions. Functionally relevant TFBSs are often highly conserved during evolution leaving a strong phylogenetic signal. While multiple sequence alignment (MSA) is a potent tool to detect the phylogenetic signal, the current MSA implementations are optimized to align the maximum number of identical nucleotides. This approach might result in the omission of conserved motifs that contain interchangeable nucleotides such as the ETS motif (IUPAC code: GGAW). Here, we introduce ConBind, a novel method to enhance alignment of short motifs, even if their mutual sequence similarity is only partial. ConBind improves the identification of conserved TFBSs by improving the alignment accuracy of TFBS families within orthologous DNA sequences. Functional validation of the Gfi1b +13 enhancer reveals that ConBind identifies additional functionally important ETS binding sites that were missed by all other tested alignment tools. In addition to the analysis of known regulatory regions, our web tool is useful for the analysis of TFBSs on so far unknown DNA regions identified through ChIP-sequencing. PMID:26721389

  14. Mutational analysis of the adeno-associated virus type 2 Rep68 protein helicase motifs.

    PubMed

    Walker, S L; Wonderling, R S; Owens, R A

    1997-09-01

    The adeno-associated virus type 2 (AAV) Rep78 and Rep68 proteins are required for viral replication. These proteins are encoded by unspliced and spliced transcripts, respectively, from the p5 promoter of AAV and therefore have overlapping amino acid sequences. The Rep78 and Rep68 proteins share a variety of activities including endonuclease, helicase, and ATPase activities and the ability to bind AAV hairpin DNA. The part of the amino acid sequence which is identical in Rep78 and Rep68 contains consensus helicase motifs that are conserved among the parvovirus replication proteins. In the present study, we mutated highly conserved amino acids within these helicase motifs. The mutant proteins were synthesized as maltose binding protein-Rep68 fusions in Escherichia coli cells and affinity purified on amylose resin. The fusion proteins were assayed in vitro, and their activities were directly compared to those of the fusion protein MBP-Rep68 delta, which contains most of the amino acid sequences common to Rep78 and Rep68 and was demonstrated previously to have all of the in vitro activities of wild-type Rep78 and Rep68. Our analysis showed that almost all mutations in the putative helicase motifs severely reduced or abolished helicase activity in vitro. Most mutants also had ATPase activity less than one-eighth of the wild-type levels and lacked endonuclease activity. PMID:9261429

  15. Discovering short linear protein motif based on selective training of profile hidden Markov models.

    PubMed

    Song, Tao; Gu, Hong

    2015-07-21

    Short linear motifs (SLiMs) in proteins are relatively conserved sequence patterns within disordered regions of proteins, typically 3-10 amino acids in length. They play an important role in mediating protein-protein interactions. Computational discovery of SLiMs has attracted increasing attention, with most methods based on regular expressions and profiles. In this paper, a de novo motif discovery method was proposed based on profile hidden Markov models (HMMs), which can not only provide the emission probabilities of amino acids in the defined positions of SLiMs, but also model the undefined positions. We adopted ordered region masking and relative local conservation (RLC) masking to improve the signal-to-noise ratio of the query sequences, while applying evolutionary weighting so that evolutionarily important sequences receive more weight in the selective training of profile HMMs. The experimental results show that our method and the profile-based method returned different subsets within a SLiMs dataset, and that the performance of the two approaches is equivalent on a more realistic discovery dataset. Profile HMM-based motif discovery methods complement the existing methods and provide another way for SLiMs analysis. PMID:25791288

  16. Conserved sequence-specific lincRNA-steroid receptor interactions drive transcriptional repression and direct cell fate

    PubMed Central

    Hudson, William H.; Pickard, Mark R.; de Vera, Ian Mitchelle S.; Kuiper, Emily G.; Mourtada-Maarabouni, Mirna; Conn, Graeme L.; Kojetin, Douglas J.; Williams, Gwyn T.; Ortlund, Eric A.

    2014-01-01

    The majority of the eukaryotic genome is transcribed, generating a significant number of long intergenic non-coding RNAs (lincRNAs). While lincRNAs represent the most poorly understood product of transcription, recent work has shown lincRNAs fulfill important cellular functions. In addition to low sequence conservation, poor understanding of structural mechanisms driving lincRNA biology hinders systematic prediction of their function. Here, we report the molecular requirements for the recognition of steroid receptors (SRs) by the lincRNA Gas5, which regulates steroid-mediated transcriptional regulation, growth arrest, and apoptosis. We identify the functional Gas5-SR interface and generate point mutations that ablate the SR-Gas5 lincRNA interaction, altering Gas5-driven apoptosis in cancer cell lines. Further, we find that the Gas5 SR-recognition sequence is conserved among haplorhines, with its evolutionary origin as a splice acceptor site. This study demonstrates that lincRNAs can recognize protein targets in a conserved, sequence-specific manner in order to affect critical cell functions. PMID:25377354

  17. Conserved sequence-specific lincRNA-steroid receptor interactions drive transcriptional repression and direct cell fate

    SciTech Connect

    Hudson, William H.; Pickard, Mark R.; de Vera, Ian Mitchelle S.; Kuiper, Emily G.; Mourtada-Maarabouni, Mirna; Conn, Graeme L.; Kojetin, Douglas J.; Williams, Gwyn T.; Ortlund, Eric A.

    2014-12-23

    The majority of the eukaryotic genome is transcribed, generating a significant number of long intergenic noncoding RNAs (lincRNAs). Although lincRNAs represent the most poorly understood product of transcription, recent work has shown lincRNAs fulfill important cellular functions. In addition to low sequence conservation, poor understanding of structural mechanisms driving lincRNA biology hinders systematic prediction of their function. Here we report the molecular requirements for the recognition of steroid receptors (SRs) by the lincRNA growth arrest-specific 5 (Gas5), which regulates steroid-mediated transcriptional regulation, growth arrest and apoptosis. We identify the functional Gas5-SR interface and generate point mutations that ablate the SR-Gas5 lincRNA interaction, altering Gas5-driven apoptosis in cancer cell lines. Further, we find that the Gas5 SR-recognition sequence is conserved among haplorhines, with its evolutionary origin as a splice acceptor site. This study demonstrates that lincRNAs can recognize protein targets in a conserved, sequence-specific manner in order to affect critical cell functions.

  18. Control regions for chromosome replication are conserved with respect to sequence and location among Escherichia coli strains

    PubMed Central

    Frimodt-Møller, Jakob; Charbon, Godefroid; Krogfelt, Karen A.; Løbner-Olesen, Anders

    2015-01-01

    In Escherichia coli, chromosome replication is initiated from oriC by the DnaA initiator protein associated with ATP. Three non-coding regions contribute to the activity of DnaA. The datA locus is instrumental in conversion of DnaA-ATP to DnaA-ADP (datA-dependent DnaA-ATP hydrolysis) whereas DnaA rejuvenation sequences 1 and 2 (DARS1 and DARS2) reactivate DnaA-ADP to DnaA-ATP. The structural organization of oriC, datA, DARS1, and DARS2 was found to be conserved among 59 fully sequenced E. coli genomes, with differences primarily in the non-functional spacer regions between key protein binding sites. The relative distances from oriC to datA, DARS1, and DARS2, respectively, were also conserved despite large variations in genome size, suggesting that the gene dosage of either region is important for bacterial growth. Yet all three regions could be deleted alone or in combination without loss of viability. Competition experiments during balanced growth in rich medium and during mouse colonization indicated roles of datA, DARS1, and DARS2 for bacterial fitness although the relative contribution of each region differed between growth conditions. We suggest that this fitness advantage has contributed to conservation of both sequence and chromosomal location for datA, DARS1, and DARS2. PMID:26441936

  19. The RXL motif of the African cassava mosaic virus Rep protein is necessary for rereplication of yeast DNA and viral infection in plants

    SciTech Connect

    Hipp, Katharina; Rau, Peter; Schäfer, Benjamin; Gronenborn, Bruno; Jeske, Holger

    2014-08-15

    Geminiviruses, single-stranded DNA plant viruses, encode a replication-initiator protein (Rep) that is indispensable for virus replication. A potential cyclin interaction motif (RXL) in the sequence of African cassava mosaic virus Rep may be an alternative link to cell cycle controls to the known interaction with plant homologs of retinoblastoma protein (pRBR). Mutation of this motif abrogated rereplication in fission yeast induced by expression of wildtype Rep suggesting that Rep interacts via its RXL motif with one or several yeast proteins. The RXL motif is essential for viral infection of Nicotiana benthamiana plants, since mutation of this motif in infectious clones prevented any symptomatic infection. The cell-cycle link (Clink) protein of a nanovirus (faba bean necrotic yellows virus) was investigated that activates the cell cycle by binding via its LXCXE motif to pRBR. Expression of wildtype Clink and a Clink mutant deficient in pRBR-binding did not trigger rereplication in fission yeast. - Highlights: • A potential cyclin interaction motif is conserved in geminivirus Rep proteins. • In ACMV Rep, this motif (RXL) is essential for rereplication of fission yeast DNA. • Mutating RXL abrogated viral infection completely in Nicotiana benthamiana. • Expression of a nanovirus Clink protein in yeast did not induce rereplication. • Plant viruses may have evolved multiple routes to exploit host DNA synthesis.

  20. Genomic Locations of Conserved Noncoding Sequences and Their Proximal Protein-Coding Genes in Mammalian Expression Dynamics.

    PubMed

    Babarinde, Isaac Adeyemi; Saitou, Naruya

    2016-07-01

    Experimental studies have found the involvement of certain conserved noncoding sequences (CNSs) in the regulation of the proximal protein-coding genes in mammals. However, reported cases of long range enhancer activities and inter-chromosomal regulation suggest that proximity of CNSs to protein-coding genes might not be important for regulation. To test the importance of the CNS genomic location, we extracted the CNSs conserved between chicken and four mammalian species (human, mouse, dog, and cattle). These CNSs were confirmed to be under purifying selection. The intergenic CNSs are often found in clusters in gene deserts, where protein-coding genes are in paucity. The distribution pattern, ChIP-Seq, and RNA-Seq data suggested that the CNSs are more likely to be regulatory elements and not corresponding to long intergenic noncoding RNAs. Physical distances between CNS and their nearest protein coding genes were well conserved between human and mouse genomes, and CNS-flanking genes were often found in evolutionarily conserved genomic neighborhoods. ChIP-Seq signal and gene expression patterns also suggested that CNSs regulate nearby genes. Interestingly, genes with more CNSs have more evolutionarily conserved expression than those with fewer CNSs. These computationally obtained results suggest that the genomic locations of CNSs are important for their regulatory functions. In fact, various kinds of evolutionary constraints may be acting to maintain the genomic locations of CNSs and protein-coding genes in mammals to ensure proper regulation. PMID:27017584

  1. PCR-based study of conserved and variable DNA sequences of Tritrichomonas foetus isolates from Saskatchewan, Canada.

    PubMed Central

    Riley, D E; Wagner, B; Polley, L; Krieger, J N

    1995-01-01

    The protozoan parasite Tritrichomonas foetus causes infertility and spontaneous abortion in cattle. In Saskatchewan, Canada, the culture prevalence of trichomonads was 65 of 1,048 bulls (6%) tested within a 1-year period ending in April 1994. Saskatchewan was previously thought to be free of the parasite. To confirm the culture results, possible T. foetus DNA presence was determined by the PCR. All of the 16 culture-positive isolates tested were PCR positive by a single-band test, but one PCR product was weak. DNA fingerprinting by both T17 PCR and randomly amplified polymorphic DNA PCR revealed genetic variation or polymorphism among the T. foetus isolates. T17 PCR also revealed conserved loci that distinguished these T. foetus isolates from Trichomonas vaginalis, from a variety of other protozoa, and from prokaryotes. TCO-1 PCR, a PCR test designed to sample DNA sequence homologous to the 5' flank of a highly conserved cell division control gene, detected genetic polymorphism at low stringency and a conserved, single locus at higher stringency. These findings suggested that T. foetus isolates exhibit both conserved genetic loci and polymorphic loci detectable by independent PCR methods. Both conserved and polymorphic genetic loci may prove useful for improved clinical diagnosis of T. foetus. The polymorphic loci detected by PCR suggested either a long history of infection or multiple lines of T. foetus infection in Saskatchewan. Polymorphic loci detected by PCR may provide data for epidemiologic studies of T. foetus. PMID:7615746

  2. A novel secondary structure based on fused five-membered rings motif

    PubMed Central

    Dhar, Jesmita; Kishore, Raghuvansh; Chakrabarti, Pinak

    2016-01-01

    An analysis of protein structures indicates the existence of a novel, fused five-membered rings motif, comprising two residues (i and i + 1), stabilized by interresidue N(i+1)–H∙∙∙N(i) and intraresidue N(i+1)–H∙∙∙O=C(i+1) hydrogen bonds. Fused-rings geometry is the common thread running through many commonly occurring motifs, such as β-turn, β-bulge, Asx-turn, Ser/Thr-turn, and Schellman motif, and points to its structural robustness. A location close to the beginning of a β-strand is rather common for the motif. Devoid of a side chain, Gly seems to be a key player in this motif, occurring at i, for which the backbone torsion angles cluster at ~(−90°, −10°) and (70°, 20°). The fused-rings structures, distant from each other in sequence, can hydrogen bond with each other, and the two segments, aligned to each other in a parallel fashion, give rise to a novel secondary structure, topi, which is quite common in proteins and distinct from the two major secondary structures, α-helix and β-sheet. The majority of the peptide segments making topi are identified as aggregation-prone and the residues tend to be conserved among homologous proteins. PMID:27511362

  3. Characterization of DNA sequences that mediate nuclear protein binding to the regulatory region of the Pisum sativum (pea) chlorophyl a/b binding protein gene AB80: identification of a repeated heptamer motif.

    PubMed

    Argüello, G; García-Hernández, E; Sánchez, M; Gariglio, P; Herrera-Estrella, L; Simpson, J

    1992-05-01

    Two protein factors binding to the regulatory region of the pea chlorophyll a/b binding protein gene AB80 have been identified. One of these factors is found only in green tissue and not in etiolated or root tissue. The second factor (denominated ABF-2) binds to a DNA sequence element that contains a direct heptamer repeat TCTCAAA. It was found that the presence of both repeats is essential for binding. ABF-2 is present in green tissue, etiolated tissue, and roots, and factors analogous to ABF-2 are present in several plant species. Computer analysis showed that the TCTCAAA motif is present in the regulatory region of several plant genes. PMID:1303797

  4. Fast approximate motif statistics.

    PubMed

    Nicodème, P

    2001-01-01

    We present a fast approximate method for computing the statistics of the number of non-self-overlapping matches of motifs in a random text under the nonuniform Bernoulli model. This method is well suited to protein motifs, for which the probability of self-overlap is small. For 96% of the PROSITE motifs, the expected numbers of occurrences in a random database of 7 million amino acids computed by the approximate method differ by less than 1% from those computed by the exact method. Processing the whole of PROSITE takes about 30 seconds with the approximate method. We apply this new method to a comparison of the C. elegans and S. cerevisiae proteomes. PMID:11535175
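
    To make the quantity being estimated concrete, the sketch below computes the expected number of match positions of a fixed-length motif in a random text whose letters are drawn independently with nonuniform frequencies. It is not the approximate method of the article: it counts every start position, overlapping or not, by linearity of expectation, and it assumes a motif given as one set of allowed residues per position (no variable-length gaps). The function name and the input representation are assumptions made for illustration.

```python
from math import prod


def expected_occurrences(motif, freqs, n):
    """Expected number of start positions at which `motif` matches.

    motif: list of sets, one set of allowed residues per motif position
    freqs: dict mapping each residue to its probability (nonuniform Bernoulli model)
    n:     length of the random text
    """
    m = len(motif)
    if n < m:
        return 0.0
    # Positions are independent, so the match probability at one start
    # position is the product of the per-position acceptance probabilities.
    p_match = prod(sum(freqs[a] for a in allowed) for allowed in motif)
    # Linearity of expectation over the n - m + 1 possible start positions.
    return (n - m + 1) * p_match
```

    When self-overlap is improbable, as the abstract notes for most protein motifs, this simple expectation stays close to the expected number of non-self-overlapping matches.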

  5. Retroposition and evolution of the DNA-binding motifs of YY1, YY2 and REX1.

    PubMed

    Kim, Jeong Do; Faulk, Christopher; Kim, Joomyeong

    2007-01-01

    YY1 is a DNA-binding transcription factor found in both vertebrates and invertebrates. Database searches identified 62 YY1-related sequences in all the available genome sequences, ranging from flying insects to humans. These sequences are characterized by high levels of sequence conservation, ranging from 66% to 100% similarity, in the zinc finger DNA-binding domain of the predicted proteins. Phylogenetic analyses uncovered duplication events of YY1 in several different lineages, including flies, fish and mammals. Retroposition is responsible for generating one duplicate in flies, PHOL from PHO, and two duplicates in placental mammals, YY2 and Reduced Expression 1 (REX1) from YY1. DNA-binding motif studies have demonstrated that YY2 still binds to the same consensus sequence as YY1 but with much lower affinity. In contrast, REX1 binds to DNA motifs divergent from that of YY1, but the binding motifs of REX1 and YY1 share some similarity at their core regions (5'-CCAT-3'). This suggests that the two duplicates, YY2 and REX1, although generated through similar retroposition events, have undergone different selection schemes to adapt to new roles in placental mammals. Overall, the conservation of YY2 and REX1 in all placental mammals predicts that each duplicate has co-evolved with some unique features of eutherian mammals. PMID:17478514

  6. Overlapping ETS and CRE Motifs ((G/C)CGGAAGTGACGTCA) preferentially bound by GABPα and CREB proteins.

    PubMed

    Chatterjee, Raghunath; Zhao, Jianfei; He, Ximiao; Shlyakhtenko, Andrey; Mann, Ishminder; Waterfall, Joshua J; Meltzer, Paul; Sathyanarayana, B K; FitzGerald, Peter C; Vinson, Charles

    2012-10-01

    Previously, we identified 8-bp-long DNA sequences (8-mers) that localize in human proximal promoters and grouped them into known transcription factor binding sites (TFBS). We now examine split 8-mers consisting of two 4-mers separated by 1 to 30 bp (X(4)-N(1-30)-X(4)) to identify pairs of TFBS that localize in proximal promoters at a precise distance. These include two overlapping TFBS: the ETS⇔ETS motif ((C/G)CCGGAAGCGGAA) and the ETS⇔CRE motif ((C/G)CGGAAGTGACGTCAC). The nucleotides in bold are part of both TFBS. Molecular modeling shows that the ETS⇔CRE motif can be bound simultaneously by both the ETS and the B-ZIP domains without protein-protein clashes. The electrophoretic mobility shift assay (EMSA) shows that the ETS protein GABPα and the B-ZIP protein CREB preferentially bind to the ETS⇔CRE motif only when the two TFBS overlap precisely. In contrast, the ETS domain of ETV5 and CREB interfere with each other when binding the ETS⇔CRE motif. The 11-mer (CGGAAGTGACG), the conserved part of the ETS⇔CRE motif, occurs 226 times in the human genome, and 83% of these occurrences are in known regulatory regions. In vivo GABPα and CREB ChIP-seq peaks identified the ETS⇔CRE as the most enriched motif occurring in promoters of genes involved in mRNA processing, cellular catabolic processes, and stress response, suggesting that a specific class of genes is regulated by this composite motif. PMID:23050235
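
    The split 8-mer pattern X(4)-N(1-30)-X(4) can be enumerated directly from a sequence. The sketch below is one possible way to tabulate such pairs and is not the authors' pipeline: the function name and its parameters are hypothetical, it scans a single strand only, and it does not filter ambiguous bases such as N.

```python
from collections import Counter


def split_8mers(seq, min_gap=1, max_gap=30, k=4):
    """Count (left 4-mer, gap length, right 4-mer) triples on one strand."""
    seq = seq.upper()
    counts = Counter()
    # The last usable start position must leave room for k + min_gap + k bases.
    for i in range(len(seq) - 2 * k - min_gap + 1):
        left = seq[i:i + k]
        for gap in range(min_gap, max_gap + 1):
            j = i + k + gap              # start of the right-hand 4-mer
            if j + k > len(seq):
                break                    # larger gaps run off the sequence end
            counts[(left, gap, seq[j:j + k])] += 1
    return counts
```

    Comparing how often each triple occurs at each gap length, for example in promoter sequences versus background, is one way to look for TFBS pairs that favor a precise spacing.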

  7. Common sequence motifs coding for higher-plant and prokaryotic O-acetylserine (thiol)-lyases: bacterial origin of a chloroplast transit peptide?

    PubMed

    Rolland, N; Job, D; Douce, R

    1993-08-01

    A comparison of the amino acid sequences of O-acetylserine (thiol)-lyase (EC 4.2.99.8) from Escherichia coli and of the isoforms of this enzyme found in the cytosolic and chloroplastic compartments of spinach (Spinacia oleracea) leaf cells allows the essential lysine residue involved in binding the pyridoxal 5'-phosphate cofactor to be identified. The results of a further sequence comparison of cDNAs coding for these proteins are discussed in the framework of the endosymbiotic theory of chloroplast evolution. The results are compatible with a mechanism in which the chloroplast enzyme originated from the cytosolic enzyme and both plant genes originated from a common prokaryotic ancestor. The comparison also suggests that the 5'-non-coding sequence of the bacterial gene was transferred to the plant cell nucleus and that it has been used to create the N-terminal portions of both plant enzymes, and possibly the transit peptide of the chloroplast enzyme. PMID:7916619

  8. Cloning, Expression, and Sequencing of a Cell Surface Antigen Containing a Leucine-Rich Repeat Motif from Bacteroides forsythus ATCC 43037

    PubMed Central

    Sharma, Ashu; Sojar, Hakimuddin T.; Glurich, Ingrid; Honma, Kiyonobu; Kuramitsu, Howard K.; Genco, Robert J.

    1998-01-01

    Bacteroides forsythus is a recently recognized human periodontopathogen associated with advanced, as well as recurrent, periodontitis. However, very little is known about the mechanism of pathogenesis of this organism. The present study was undertaken to identify the surface molecules of this bacterium that may play roles in its adherence to oral tissues or in triggering a host immune response(s). The gene (bspA) encoding a cell surface-associated protein of B. forsythus with an apparent molecular mass of 98 kDa was isolated by immunoscreening of a B. forsythus gene library constructed in a lambda ZAP II vector. The encoded 98-kDa protein (BspA) contains 14 complete repeats of 23 amino acid residues that show partial homology to leucine-rich repeat motifs. A recombinant protein containing the repeat region was expressed in Escherichia coli, purified, and utilized for antibody production, as well as in vitro binding studies. The purified recombinant protein bound strongly to fibronectin and fibrinogen in a dose-dependent manner and further inhibited the binding of B. forsythus cells to these extracellular matrix (ECM) components. In addition, adult patients with B. forsythus-associated periodontitis expressed specific antibodies against the BspA protein. We report here the cloning and expression of an immunogenic cell surface-associated protein (BspA) of B. forsythus. We speculate that it mediates the binding of bacteria to ECM components and clotting factors (fibronectin and fibrinogen, respectively), which may be important in the colonization of the oral cavity by this bacterium, and that it is also a target for the host immune response. PMID:9826345