Sample records for identify conserved motifs

  1. [Conserved motifs in voltage sensing proteins].

    PubMed

    Wang, Chang-He; Xie, Zhen-Li; Lv, Jian-Wei; Yu, Zhi-Dan; Shao, Shu-Li

    2012-08-25

    This paper aimed to study the conserved motifs of voltage sensing proteins (VSPs) and to establish a voltage sensing model. All VSPs were collected from the UniProt database using a comprehensive keyword search followed by manual curation, and the results indicated that there are only two types of known VSPs: voltage gated ion channels and voltage dependent phosphatases. All the VSPs have a common domain of four helical transmembrane segments (TMS, S1-S4), which constitute the voltage sensing module of the VSPs. The S1 segment was shown to be responsible for membrane targeting and insertion of these proteins, while the S2-S4 segments, which can sense membrane potential, were responsible for protein properties. Conserved motifs/residues of each TMS and their functional significance were identified using profile-to-profile sequence alignments. Conserved motifs in these four segments are strikingly similar for all VSPs; in particular, the conserved motif [RK]-X(2)-R-X(2)-R-X(2)-[RK] was present in all the S4 segments, with positively charged arginine (R) alternating with two hydrophobic or uncharged residues. Movement of these arginines across the membrane electric field is the core mechanism by which the VSPs detect changes in membrane potential. The negatively charged aspartate (D) in the S3 segment is universally conserved in all the VSPs, suggesting that the aspartate residue may be involved in the voltage sensing properties of VSPs as well as in electrostatic interactions with the positively charged residues in the S4 segment, which may enhance the thermodynamic stability of the S4 segments in the plasma membrane.
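
    The S4 pattern quoted in this abstract, [RK]-X(2)-R-X(2)-R-X(2)-[RK], is written in PROSITE-style syntax and can be scanned for with an ordinary regular expression. The following is a minimal sketch of such a scan, not the profile-to-profile alignment method the authors used. The function name and the toy sequence are hypothetical and chosen only for illustration.

```python
import re

# PROSITE-style pattern from the abstract: [RK]-X(2)-R-X(2)-R-X(2)-[RK]
# X(2) means "any two residues", which becomes .{2} in a regular expression.
# Wrapping the pattern in a lookahead lets overlapping hits be reported.
S4_MOTIF = re.compile(r"(?=([RK].{2}R.{2}R.{2}[RK]))")


def find_s4_motifs(sequence):
    """Return (start, matched_text) pairs (0-based) for every motif hit."""
    return [(m.start(), m.group(1)) for m in S4_MOTIF.finditer(sequence.upper())]


# Hypothetical toy sequence, not a real voltage sensing protein.
toy = "MLLAARVIRLVRVFRIAALL"
print(find_s4_motifs(toy))
```

    A regular expression only reports exact pattern matches; it says nothing about whether a hit lies in a transmembrane segment, so hits would still need to be checked against segment annotations.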

  2. De Novo Regulatory Motif Discovery Identifies Significant Motifs in Promoters of Five Classes of Plant Dehydrin Genes.

    PubMed

    Zolotarov, Yevgen; Strömvik, Martina

    2015-01-01

    Plants accumulate dehydrins in response to osmotic stresses. Dehydrins are divided into five different classes, which are thought to be regulated in different manners. To better understand differences in transcriptional regulation of the five dehydrin classes, de novo motif discovery was performed on 350 dehydrin promoter sequences from a total of 51 plant genomes. Overrepresented motifs were identified in the promoters of five dehydrin classes. The Kn dehydrin promoters contain motifs linked with meristem specific expression, as well as motifs linked with cold/dehydration and abscisic acid response. KS dehydrin promoters contain a motif with a GATA core. SKn and YnSKn dehydrin promoters contain motifs that match elements connected with cold/dehydration, abscisic acid and light response. YnKn dehydrin promoters contain motifs that match abscisic acid and light response elements, but not cold/dehydration response elements. Conserved promoter motifs are present in the dehydrin classes and across different plant lineages, indicating that dehydrin gene regulation is likely also conserved.

  3. Dipeptide frequency/bias analysis identifies conserved sites of nonrandomness shared by cysteine-rich motifs.

    PubMed

    Campion, S R; Ameen, A S; Lai, L; King, J M; Munzenmaier, T N

    2001-08-15

    This report describes the application of a simple computational tool, AAPAIR.TAB, for the systematic analysis of the cysteine-rich EGF, Sushi, and Laminin motif/sequence families at the two-amino acid level. Automated dipeptide frequency/bias analysis detects preferences in the distribution of amino acids in established protein families, by determining which "ordered dipeptides" occur most frequently in comprehensive motif-specific sequence data sets. Graphic display of the dipeptide frequency/bias data revealed family-specific preferences for certain dipeptides, but more importantly detected a shared preference for employment of the ordered dipeptides Gly-Tyr (GY) and Gly-Phe (GF) in all three protein families. The dipeptide Asn-Gly (NG) also exhibited high-frequency and bias in the EGF and Sushi motif families, whereas Asn-Thr (NT) was distinguished in the Laminin family. Evaluation of the distribution of dipeptides identified by frequency/bias analysis subsequently revealed the highly restricted localization of the G(F/Y) and N(G/T) sequence elements at two separate sites of extreme conservation in the consensus sequence of all three sequence families. The similar employment of the high-frequency/bias dipeptides in three distinct protein sequence families was further correlated with the concurrence of these shared molecular determinants at similar positions within the distinctive scaffolds of three structurally divergent, but similarly employed, motif modules.
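
    The idea of counting "ordered dipeptides" and asking which ones occur more often than expected can be illustrated in a few lines. This is a sketch under stated assumptions, not a reimplementation of AAPAIR.TAB, whose exact bias statistic is not given here. The bias measure used below (observed count divided by the count expected if residues were independent) and the function name are assumptions.

```python
from collections import Counter


def dipeptide_bias(sequences):
    """Return {dipeptide: (count, observed/expected ratio)}.

    Assumption: the expected count treats neighbouring residues as
    independent, i.e. f(a) * f(b) * total number of dipeptides.
    """
    residues = Counter()
    pairs = Counter()
    for seq in sequences:
        seq = seq.upper()
        residues.update(seq)
        # Ordered, overlapping two-residue windows: "CGY" -> "CG", "GY".
        pairs.update(seq[i:i + 2] for i in range(len(seq) - 1))
    n_res = sum(residues.values())
    n_pairs = sum(pairs.values())
    result = {}
    for dp, count in pairs.items():
        expected = (residues[dp[0]] / n_res) * (residues[dp[1]] / n_res) * n_pairs
        result[dp] = (count, count / expected)
    return result


# Hypothetical toy data set, far too small for real statistics.
bias = dipeptide_bias(["CGYC", "AGYA"])
print(bias["GY"])
```

    A ratio well above 1 marks a dipeptide that is used more often than its residue composition alone would predict, which is the kind of preference the abstract reports for GY and GF.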

  4. An effective approach for annotation of protein families with low sequence similarity and conserved motifs: identifying GDSL hydrolases across the plant kingdom.

    PubMed

    Vujaklija, Ivan; Bielen, Ana; Paradžik, Tina; Biđin, Siniša; Goldstein, Pavle; Vujaklija, Dušica

    2016-02-18

    The massive accumulation of protein sequences arising from the rapid development of high-throughput sequencing, coupled with automatic annotation, results in high levels of incorrect annotations. In this study, we describe an approach to decrease annotation errors of protein families characterized by low overall sequence similarity. The GDSL lipolytic family comprises proteins with multifunctional properties and high potential for pharmaceutical and industrial applications. The number of proteins assigned to this family has increased rapidly over the last few years. In particular, the natural abundance of GDSL enzymes reported recently in plants indicates that they could be a good source of novel GDSL enzymes. We noticed that a significant proportion of annotated sequences lack specific GDSL motif(s) or catalytic residue(s). Here, we applied motif-based sequence analyses to identify enzymes possessing conserved GDSL motifs in selected proteomes across the plant kingdom. Motif-based HMM scanning (Viterbi decoding-VD and posterior decoding-PD) and the here described PD/VD protocol were successfully applied on 12 selected plant proteomes to identify sequences with GDSL motifs. A significant number of identified GDSL sequences were novel. Moreover, our scanning approach successfully detected protein sequences lacking at least one of the essential motifs (171/820) annotated by Pfam profile search (PfamA) as GDSL. Based on these analyses we provide a curated list of GDSL enzymes from the selected plants. CLANS clustering and phylogenetic analysis helped us to gain a better insight into the evolutionary relationship of all identified GDSL sequences. Three novel GDSL subfamilies as well as unreported variations in GDSL motifs were discovered in this study. In addition, analyses of selected proteomes showed a remarkable expansion of GDSL enzymes in the lycophyte, Selaginella moellendorffii. 
    Finally, we provide a general motif-HMM scanner which is easily accessible through

  5. Seed storage protein gene promoters contain conserved DNA motifs in Brassicaceae, Fabaceae and Poaceae

    PubMed Central

    Fauteux, François; Strömvik, Martina V

    2009-01-01

    Background Accurate computational identification of cis-regulatory motifs is difficult, particularly in eukaryotic promoters, which typically contain multiple short and degenerate DNA sequences bound by several interacting factors. Enrichment in combinations of rare motifs in the promoter sequence of functionally or evolutionarily related genes among several species is an indicator of conserved transcriptional regulatory mechanisms. This provides a basis for the computational identification of cis-regulatory motifs. Results We have used a discriminative seeding DNA motif discovery algorithm for an in-depth analysis of 54 seed storage protein (SSP) gene promoters from three plant families, namely Brassicaceae (mustards), Fabaceae (legumes) and Poaceae (grasses) using backgrounds based on complete sets of promoters from a representative species in each family, namely Arabidopsis (Arabidopsis thaliana (L.) Heynh.), soybean (Glycine max (L.) Merr.) and rice (Oryza sativa L.) respectively. We have identified three conserved motifs (two RY-like and one ACGT-like) in Brassicaceae and Fabaceae SSP gene promoters that are similar to experimentally characterized seed-specific cis-regulatory elements. Fabaceae SSP gene promoter sequences are also enriched in a novel, seed-specific E2Fb-like motif. Conserved motifs identified in Poaceae SSP gene promoters include a GCN4-like motif, two prolamin-box-like motifs and an Skn-1-like motif. Evidence of the presence of a variant of the TATA-box is found in the SSP gene promoters from the three plant families. Motifs discovered in SSP gene promoters were used to score whole-genome sets of promoters from Arabidopsis, soybean and rice. The highest-scoring promoters are associated with genes coding for different subunits or precursors of seed storage proteins. 
    Conclusion Seed storage protein gene promoter motifs are conserved in diverse species, and different plant families are characterized by a distinct combination of conserved motifs.

  6. Simple Shared Motifs (SSM) in conserved region of promoters: a new approach to identify co-regulation patterns.

    PubMed

    Gruel, Jérémy; LeBorgne, Michel; LeMeur, Nolwenn; Théret, Nathalie

    2011-09-12

    Regulation of gene expression plays a pivotal role in cellular functions. However, understanding the dynamics of transcription remains a challenging task. A host of computational approaches have been developed to identify regulatory motifs, mainly based on the recognition of DNA sequences for transcription factor binding sites. Recent integration of additional data from genomic analyses or phylogenetic footprinting has significantly improved these methods. Here, we propose a different approach based on the compilation of Simple Shared Motifs (SSM), groups of sequences defined by their length and similarity and present in conserved sequences of gene promoters. We developed an original algorithm to search and count SSM in pairs of genes. An exceptional number of SSM is considered as a common regulatory pattern. The SSM approach is applied to a sample set of genes and validated using functional gene-set enrichment analyses. We demonstrate that the SSM approach selects genes that are over-represented in specific biological categories (Ontology and Pathways) and are enriched in co-expressed genes. Finally we show that genes co-expressed in the same tissue or involved in the same biological pathway have increased SSM values. Using unbiased clustering of genes, Simple Shared Motifs analysis constitutes an original contribution to provide a clearer definition of expression networks.

  7. Simple Shared Motifs (SSM) in conserved region of promoters: a new approach to identify co-regulation patterns

    PubMed Central

    2011-01-01

    Background Regulation of gene expression plays a pivotal role in cellular functions. However, understanding the dynamics of transcription remains a challenging task. A host of computational approaches have been developed to identify regulatory motifs, mainly based on the recognition of DNA sequences for transcription factor binding sites. Recent integration of additional data from genomic analyses or phylogenetic footprinting has significantly improved these methods. Results Here, we propose a different approach based on the compilation of Simple Shared Motifs (SSM), groups of sequences defined by their length and similarity and present in conserved sequences of gene promoters. We developed an original algorithm to search and count SSM in pairs of genes. An exceptional number of SSM is considered as a common regulatory pattern. The SSM approach is applied to a sample set of genes and validated using functional gene-set enrichment analyses. We demonstrate that the SSM approach selects genes that are over-represented in specific biological categories (Ontology and Pathways) and are enriched in co-expressed genes. Finally we show that genes co-expressed in the same tissue or involved in the same biological pathway have increased SSM values. Conclusions Using unbiased clustering of genes, Simple Shared Motifs analysis constitutes an original contribution to provide a clearer definition of expression networks. PMID:21910886

  8. BayesMotif: de novo protein sorting motif discovery from impure datasets.

    PubMed

    Hu, Jianjun; Zhang, Fan

    2010-01-18

    Protein sorting is the process by which newly synthesized proteins are transported to their target locations within or outside of the cell. This process is precisely regulated by protein sorting signals in different forms. A major category of sorting signals is amino acid sub-sequences usually located at the N-termini or C-termini of protein sequences. Genome-wide experimental identification of protein sorting signals is extremely time-consuming and costly. Effective computational algorithms for de novo discovery of protein sorting signals are needed to improve the understanding of protein sorting mechanisms. We formulated the protein sorting motif discovery problem as a classification problem and proposed a Bayesian classifier based algorithm (BayesMotif) for de novo identification of a common type of protein sorting motif in which a highly conserved anchor is present along with less conserved motif regions. A false positive removal procedure was developed to iteratively remove sequences that are unlikely to contain true motifs so that the algorithm can identify motifs from impure input sequences. Experiments on both implanted motif datasets and real-world datasets showed that the enhanced BayesMotif algorithm can identify anchored sorting motifs from pure or impure protein sequence datasets. They also showed that the false positive removal procedure can help to identify true motifs even when only 20% of the input sequences contain true motif instances. We proposed BayesMotif, a novel Bayesian classification based algorithm for de novo discovery of a special category of anchored protein sorting motifs from impure datasets. Compared to conventional motif discovery algorithms such as MEME, our algorithm can find less-conserved motifs with short highly conserved anchors.
    Our algorithm also has the advantage of easy incorporation of additional meta-sequence features such as hydrophobicity or charge of the motifs which may help to overcome the limitations of

  9. CoSMoS: Conserved Sequence Motif Search in the proteome

    PubMed Central

    Liu, Xiao I; Korde, Neeraj; Jakob, Ursula; Leichert, Lars I

    2006-01-01

    Background With the ever-increasing number of gene sequences in the public databases, generating and analyzing multiple sequence alignments becomes increasingly time consuming. Nevertheless it is a task performed on a regular basis by researchers in many labs. Results We have now created a database called CoSMoS to find the occurrences and at the same time evaluate the significance of sequence motifs and amino acids encoded in the whole genome of the model organism Escherichia coli K12. We provide a precomputed set of multiple sequence alignments for each individual E. coli protein with all of its homologues in the RefSeq database. The alignments themselves, information about the occurrence of sequence motifs together with information on the conservation of each of the more than 1.3 million amino acids encoded in the E. coli genome can be accessed via the web interface of CoSMoS. Conclusion CoSMoS is a valuable tool to identify highly conserved sequence motifs, to find regions suitable for mutational studies in functional analyses and to predict important structural features in E. coli proteins. PMID:16433915
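
    Per-position conservation in a multiple sequence alignment, as mentioned in this abstract, can be illustrated with a simple column score. The sketch below is an assumption-laden illustration and not the scoring used by CoSMoS: it reports, for each column, the fraction of sequences that carry the most common residue. The function name and the toy alignment are hypothetical.

```python
from collections import Counter


def column_conservation(alignment):
    """Fraction of sequences sharing the most common residue in each column.

    Gaps ('-') count toward the denominator but never as the consensus.
    """
    if len({len(row) for row in alignment}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    scores = []
    for column in zip(*alignment):
        counts = Counter(res for res in column if res != "-")
        top = counts.most_common(1)[0][1] if counts else 0
        scores.append(top / len(column))
    return scores


# Hypothetical three-sequence toy alignment.
scores = column_conservation(["MKCP", "MRCP", "M-CA"])
print(scores)
```

    Runs of adjacent columns with scores near 1.0 are candidates for conserved sequence motifs.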

  10. Comparative qualitative phosphoproteomics analysis identifies shared phosphorylation motifs and associated biological processes in evolutionary divergent plants.

    PubMed

    Al-Momani, Shireen; Qi, Da; Ren, Zhe; Jones, Andrew R

    2018-06-15

    Phosphorylation is one of the most prevalent post-translational modifications and plays a key role in regulating cellular processes. We carried out a bioinformatics analysis of pre-existing phosphoproteomics data to profile two model species representing the largest subclasses in flowering plants, the dicot Arabidopsis thaliana and the monocot Oryza sativa, to understand the extent to which phosphorylation signaling and function are conserved across evolutionarily divergent plants. We identified 6537 phosphopeptides from 3189 phosphoproteins in Arabidopsis and 2307 phosphopeptides from 1613 phosphoproteins in rice. We identified phosphorylation motifs, finding nineteen pS motifs and two pT motifs shared in rice and Arabidopsis. The majority of shared motif-containing proteins were mapped to the same biological processes with similar patterns of fold enrichment, indicating high functional conservation. We also identified shared patterns of crosstalk between phosphoserines with enrichment for motifs pSXpS, pSXXpS and pSXXXpS, where X is any amino acid. Lastly, our results identified several pairs of motifs that are significantly enriched to co-occur in Arabidopsis proteins, indicating cross-talk between different sites, but this was not observed in rice. Our results demonstrate that there are evolutionarily conserved mechanisms of phosphorylation-mediated signaling in plants, via analysis of high-throughput phosphorylation proteomics data from key monocot and dicot species: rice and Arabidopsis thaliana. The results also suggest that there is increased crosstalk between phosphorylation sites in A. thaliana compared with rice. The results are important for our general understanding of cell signaling in plants, and the ability to use A. thaliana as a general model for plant biology.

  11. DoOPSearch: a web-based tool for finding and analysing common conserved motifs in the promoter regions of different chordate and plant genes

    PubMed Central

    Sebestyén, Endre; Nagy, Tibor; Suhai, Sándor; Barta, Endre

    2009-01-01

    Background The comparative genomic analysis of a large number of orthologous promoter regions of the chordate and plant genes from the DoOP databases shows thousands of conserved motifs. Most of these motifs differ from any known transcription factor binding site (TFBS). To identify common conserved motifs, we need a specific tool to be able to search amongst them. Since conserved motifs from the DoOP databases are linked to genes, the result of such a search can give a list of genes that are potentially regulated by the same transcription factor(s). Results We have developed a new tool called DoOPSearch for the analysis of the conserved motifs in the promoter regions of chordate or plant genes. We used the orthologous promoters of the DoOP database to extract thousands of conserved motifs from different taxonomic groups. The advantage of this approach is that different sets of conserved motifs might be found depending on how broad the taxonomic coverage of the underlying orthologous promoter sequence collection is (consider e.g. primates vs. mammals or Brassicaceae vs. Viridiplantae). The DoOPSearch tool allows the users to search these motif collections or the promoter regions of DoOP with user supplied query sequences or any of the conserved motifs from the DoOP database. To find overrepresented gene ontologies, the gene lists obtained can be analysed further using a modified version of the GeneMerge program. Conclusion We present here a comparative genomics based promoter analysis tool. Our system is based on a unique collection of conserved promoter motifs characteristic of different taxonomic groups. We offer both a command line and a web-based tool for searching in these motif collections using user specified queries. These can be either short promoter sequences or consensus sequences of known transcription factor binding sites. The GeneMerge analysis of the search results allows the user to identify statistically overrepresented Gene Ontology terms that

  12. D-MATRIX: A web tool for constructing weight matrix of conserved DNA motifs

    PubMed Central

    Sen, Naresh; Mishra, Manoj; Khan, Feroz; Meena, Abha; Sharma, Ashok

    2009-01-01

    Despite considerable efforts to date, DNA motif prediction in whole genomes remains a challenge for researchers. Currently, genome-wide motif prediction tools require either a direct pattern sequence (for a single motif) or a weight matrix (for multiple motifs). Although there are known motif pattern databases and tools for genome-level prediction, there is no tool for weight matrix construction. Considering this, we developed the D-MATRIX tool, which predicts different types of weight matrix based on a user-defined aligned motif sequence set and motif width. For retrieval of known motif sequences, users can access commonly used databases such as TFD, RegulonDB, DBTBS and Transfac. The D-MATRIX program uses a simple statistical approach for weight matrix construction, and the matrix can be converted into different file formats according to user requirements. It provides the possibility to identify conserved motifs in co-regulated genes or a whole genome. As an example, we successfully constructed the weight matrix of the LexA transcription factor binding site with the help of known sos-box cis-regulatory elements in the Deinococcus radiodurans genome. The algorithm is implemented in C-Sharp and wrapped in ASP.Net to maintain a user friendly web interface. The D-MATRIX tool is accessible through the CIMAP domain network. Availability http://203.190.147.116/dmatrix/ PMID:19759861
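
    Building a weight matrix from an aligned set of motif instances, the task this abstract describes, usually starts from per-position base counts. The sketch below shows one common construction (counts, pseudocount-smoothed frequencies, and log2-odds against a uniform background). It is not the D-MATRIX code, which is written in C-Sharp and whose exact statistics are not given here. The function name, the pseudocount of 1.0, the 0.25 background, and the toy sites are all assumptions.

```python
import math


def weight_matrices(sites, pseudocount=1.0, background=0.25):
    """Build count, frequency and log2-odds matrices from equal-length DNA sites.

    Each matrix is a list with one {base: value} dict per motif position.
    Characters other than A, C, G, T raise KeyError in this sketch.
    """
    width = len(sites[0])
    if any(len(s) != width for s in sites):
        raise ValueError("all sites must share the motif width")
    bases = "ACGT"
    counts = [{b: 0 for b in bases} for _ in range(width)]
    for site in sites:
        for pos, base in enumerate(site.upper()):
            counts[pos][base] += 1
    # The pseudocount keeps unseen bases from producing log(0).
    total = len(sites) + 4 * pseudocount
    freqs = [{b: (col[b] + pseudocount) / total for b in bases} for col in counts]
    log_odds = [{b: math.log2(col[b] / background) for b in bases} for col in freqs]
    return counts, freqs, log_odds


# Hypothetical aligned sites, not real LexA binding sites.
counts, freqs, log_odds = weight_matrices(["CTGT", "CTGA", "CTGT", "ATGT"])
print(counts[0], log_odds[0]["C"])
```

    Scoring a candidate site then amounts to summing the log-odds values of its bases, one per position.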

  13. D-MATRIX: a web tool for constructing weight matrix of conserved DNA motifs.

    PubMed

    Sen, Naresh; Mishra, Manoj; Khan, Feroz; Meena, Abha; Sharma, Ashok

    2009-07-27

    Despite considerable efforts to date, DNA motif prediction in whole genomes remains a challenge for researchers. Currently, genome-wide motif prediction tools require either a direct pattern sequence (for a single motif) or a weight matrix (for multiple motifs). Although there are known motif pattern databases and tools for genome-level prediction, there is no tool for weight matrix construction. Considering this, we developed the D-MATRIX tool, which predicts different types of weight matrix based on a user-defined aligned motif sequence set and motif width. For retrieval of known motif sequences, users can access commonly used databases such as TFD, RegulonDB, DBTBS and Transfac. The D-MATRIX program uses a simple statistical approach for weight matrix construction, and the matrix can be converted into different file formats according to user requirements. It provides the possibility to identify conserved motifs in co-regulated genes or a whole genome. As an example, we successfully constructed the weight matrix of the LexA transcription factor binding site with the help of known sos-box cis-regulatory elements in the Deinococcus radiodurans genome. The algorithm is implemented in C-Sharp and wrapped in ASP.Net to maintain a user friendly web interface. The D-MATRIX tool is accessible through the CIMAP domain network. http://203.190.147.116/dmatrix/

  14. A Conserved GPG-Motif in the HIV-1 Nef Core Is Required for Principal Nef-Activities

    PubMed Central

    Martínez-Bonet, Marta; Palladino, Claudia; Briz, Veronica; Rudolph, Jochen M.; Fackler, Oliver T.; Relloso, Miguel; Muñoz-Fernandez, Maria Angeles; Madrid, Ricardo

    2015-01-01

    To find out new determinants required for Nef activity we performed a functional alanine scanning analysis along a discrete but highly conserved region at the core of HIV-1 Nef. We identified the GPG-motif, located at the 121–137 region of HIV-1 NL4.3 Nef, as a novel protein signature strictly required for the p56Lck dependent Nef-induced CD4-downregulation in T-cells. Since the Nef-GPG motif was dispensable for CD4-downregulation in HeLa-CD4 cells, Nef/AP-1 interaction and Nef-dependent effects on Tf-R trafficking, the observed effects on CD4 downregulation cannot be attributed to structure constraints or to alterations on general protein trafficking. Besides, we found that the GPG-motif was also required for Nef-dependent inhibition of ring actin re-organization upon TCR triggering and MHCI downregulation, suggesting that the GPG-motif could actively cooperate with the Nef PxxP motif for these HIV-1 Nef-related effects. Finally, we observed that the Nef-GPG motif was required for optimal infectivity of those viruses produced in T-cells. According to these findings, we propose the conserved GPG-motif in HIV-1 Nef as functional region required for HIV-1 infectivity and therefore with a potential interest for the interference of Nef activity during HIV-1 infection. PMID:26700863

  15. Space-related pharma-motifs for fast search of protein binding motifs and polypharmacological targets

    PubMed Central

    2012-01-01

    Background Discovering a compound that inhibits multiple proteins (i.e. polypharmacological targets) is a new paradigm for complex diseases (e.g. cancers and diabetes). In general, polypharmacological proteins often share similar local binding environments and motifs. With the exponential growth in the number of protein structures, finding similar structural binding motifs (pharma-motifs) is an urgent task for drug discovery (e.g. side effects and new uses for old drugs) and for understanding protein functions. Results We have developed a Space-Related Pharmamotifs (called SRPmotif) method to recognize binding motifs by searching against a protein structure database. SRPmotif is able to recognize conserved binding environments containing spatially discontinuous pharma-motifs, which are often short conserved peptides with specific physico-chemical properties for protein functions. Among 356 pharma-motifs, 56.5% of interacting residues are highly conserved. Experimental results indicate that 81.1% and 92.7% of the polypharmacological targets of each protein-ligand complex are annotated with the same biological process (BP) and molecular function (MF) terms, respectively, based on Gene Ontology (GO). Our experimental results show that the identified pharma-motifs often consist of key residues in functional (active) sites and play key roles in protein functions. The SRPmotif is available at http://gemdock.life.nctu.edu.tw/SRP/. Conclusions SRPmotif is able to identify similar pharma-interfaces and pharma-motifs sharing similar binding environments for polypharmacological targets by rapidly searching against the protein structure database. Pharma-motifs describe the conservation of binding environments for drug discovery and protein functions. Additionally, these pharma-motifs provide clues for discovering new sequence-based motifs to predict protein functions from protein sequence databases.
    We believe that SRPmotif is useful for elucidating protein functions and drug discovery.

  16. Space-related pharma-motifs for fast search of protein binding motifs and polypharmacological targets.

    PubMed

    Chiu, Yi-Yuan; Lin, Chun-Yu; Lin, Chih-Ta; Hsu, Kai-Cheng; Chang, Li-Zen; Yang, Jinn-Moon

    2012-01-01

    Discovering a compound that inhibits multiple proteins (i.e. polypharmacological targets) is a new paradigm for complex diseases (e.g. cancers and diabetes). In general, polypharmacological proteins often share similar local binding environments and motifs. With the exponential growth in the number of protein structures, finding similar structural binding motifs (pharma-motifs) is an urgent task for drug discovery (e.g. side effects and new uses for old drugs) and for understanding protein functions. We have developed a Space-Related Pharmamotifs (called SRPmotif) method to recognize binding motifs by searching against a protein structure database. SRPmotif is able to recognize conserved binding environments containing spatially discontinuous pharma-motifs, which are often short conserved peptides with specific physico-chemical properties for protein functions. Among 356 pharma-motifs, 56.5% of interacting residues are highly conserved. Experimental results indicate that 81.1% and 92.7% of the polypharmacological targets of each protein-ligand complex are annotated with the same biological process (BP) and molecular function (MF) terms, respectively, based on Gene Ontology (GO). Our experimental results show that the identified pharma-motifs often consist of key residues in functional (active) sites and play key roles in protein functions. The SRPmotif is available at http://gemdock.life.nctu.edu.tw/SRP/. SRPmotif is able to identify similar pharma-interfaces and pharma-motifs sharing similar binding environments for polypharmacological targets by rapidly searching against the protein structure database. Pharma-motifs describe the conservation of binding environments for drug discovery and protein functions. Additionally, these pharma-motifs provide clues for discovering new sequence-based motifs to predict protein functions from protein sequence databases. We believe that SRPmotif is useful for elucidating protein functions and drug discovery.

  17. Using SCOPE to identify potential regulatory motifs in coregulated genes.

    PubMed

    Martyanov, Viktor; Gross, Robert H

    2011-05-31

    SCOPE is an ensemble motif finder that uses three component algorithms in parallel to identify potential regulatory motifs by over-representation and motif position preference. Each component algorithm is optimized to find a different kind of motif. By taking the best of these three approaches, SCOPE performs better than any single algorithm, even in the presence of noisy data. In this article, we utilize a web version of SCOPE to examine genes that are involved in telomere maintenance. SCOPE has been incorporated into at least two other motif finding programs and has been used in other studies. The three algorithms that make up SCOPE are BEAM, which finds non-degenerate motifs (ACCGGT), PRISM, which finds degenerate motifs (ASCGWT), and SPACER, which finds longer bipartite motifs (ACCnnnnnnnnGGT). These three algorithms have been optimized to find their corresponding type of motif. Together, they allow SCOPE to perform extremely well. Once a gene set has been analyzed and candidate motifs identified, SCOPE can look for other genes that contain the motif which, when added to the original set, will improve the motif score. This can occur through over-representation or motif position preference. Working with partial gene sets that have biologically verified transcription factor binding sites, SCOPE was able to identify most of the rest of the genes also regulated by the given transcription factor. Output from SCOPE shows candidate motifs, their significance, and other information both as a table and as a graphical motif map. FAQs and video tutorials are available at the SCOPE web site, which also includes a "Sample Search" button that allows the user to perform a trial run. SCOPE has a very friendly user interface that enables novice users to access the algorithm's full power without having to become an expert in the bioinformatics of motif finding. As input, SCOPE can take a list of genes or FASTA sequences.
    These can be entered in browser text fields, or read from
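
    The three motif styles named in this abstract (ACCGGT, ASCGWT, ACCnnnnnnnnGGT) are written with IUPAC nucleotide codes, where S means C or G, W means A or T, and N means any base. The sketch below only shows how such a motif string can be expanded and matched against a sequence. It does not reproduce the BEAM, PRISM, or SPACER search algorithms, and it scans the forward strand only. The function names and test sequences are hypothetical.

```python
import re

# Standard IUPAC nucleotide codes expanded to regular-expression classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def motif_to_regex(motif):
    """Compile an IUPAC motif into a regex; the lookahead allows overlaps."""
    body = "".join(IUPAC[code] for code in motif.upper())
    return re.compile("(?=(" + body + "))")


def find_sites(motif, sequence):
    """Return 0-based start positions of the motif on the forward strand."""
    return [m.start() for m in motif_to_regex(motif).finditer(sequence.upper())]


# Hypothetical sequences for illustration.
print(find_sites("ASCGWT", "TTACCGATGG"))
print(find_sites("ACCnnnnnnnnGGT", "ACCAAAATTTTGGT"))
```

    Counting how many promoters in a gene set contain at least one such site, compared with a background set, is the simplest form of the over-representation idea the abstract mentions.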

  18. The conservation pattern of short linear motifs is highly correlated with the function of interacting protein domains.

    PubMed

    Ren, Siyuan; Yang, Guang; He, Youyu; Wang, Yiguo; Li, Yixue; Chen, Zhengjun

    2008-10-01

    Many well-represented domains recognize primary sequences usually less than 10 amino acids in length, called Short Linear Motifs (SLiMs). Accurate prediction of SLiMs has been difficult because they are short (often < 10 amino acids) and highly degenerate. In this study, we combined scoring matrixes derived from peptide library and conservation analysis to identify protein classes enriched of functional SLiMs recognized by SH2, SH3, PDZ and S/T kinase domains. Our combined approach revealed that SLiMs are highly conserved in proteins from functional classes that are known to interact with a specific domain, but that they are not conserved in most other protein groups. We found that SLiMs recognized by SH2 domains were highly conserved in receptor kinases/phosphatases, adaptor molecules, and tyrosine kinases/phosphatases, that SLiMs recognized by SH3 domains were highly conserved in cytoskeletal and cytoskeletal-associated proteins, that SLiMs recognized by PDZ domains were highly conserved in membrane proteins such as channels and receptors, and that SLiMs recognized by S/T kinase domains were highly conserved in adaptor molecules, S/T kinases/phosphatases, and proteins involved in transcription or cell cycle control. We studied Tyr-SLiMs recognized by SH2 domains in more detail, and found that SH2-recognized Tyr-SLiMs on the cytoplasmic side of membrane proteins are more highly conserved than those on the extra-cellular side. Also, we found that SH2-recognized Tyr-SLiMs that are associated with SH3 motifs and a tyrosine kinase phosphorylation motif are more highly conserved. The interactome of protein domains is reflected by the evolutionary conservation of SLiMs recognized by these domains. Combining scoring matrixes derived from peptide libraries and conservation analysis, we would be able to find those protein groups that are more likely to interact with specific domains.

  19. Automated classification of RNA 3D motifs and the RNA 3D Motif Atlas

    PubMed Central

    Petrov, Anton I.; Zirbel, Craig L.; Leontis, Neocles B.

    2013-01-01

    The analysis of atomic-resolution RNA three-dimensional (3D) structures reveals that many internal and hairpin loops are modular, recurrent, and structured by conserved non-Watson–Crick base pairs. Structurally similar loops define RNA 3D motifs that are conserved in homologous RNA molecules, but can also occur at nonhomologous sites in diverse RNAs, and which often vary in sequence. To further our understanding of RNA motif structure and sequence variability and to provide a useful resource for structure modeling and prediction, we present a new method for automated classification of internal and hairpin loop RNA 3D motifs and a new online database called the RNA 3D Motif Atlas. To classify the motif instances, a representative set of internal and hairpin loops is automatically extracted from a nonredundant list of RNA-containing PDB files. Their structures are compared geometrically, all-against-all, using the FR3D program suite. The loops are clustered into motif groups, taking into account geometric similarity and structural annotations and making allowance for a variable number of bulged bases. The automated procedure that we have implemented identifies all hairpin and internal loop motifs previously described in the literature. All motif instances and motif groups are assigned unique and stable identifiers and are made available in the RNA 3D Motif Atlas (http://rna.bgsu.edu/motifs), which is automatically updated every four weeks. The RNA 3D Motif Atlas provides an interactive user interface for exploring motif diversity and tools for programmatic data access. PMID:23970545

  20. The conservation pattern of short linear motifs is highly correlated with the function of interacting protein domains

    PubMed Central

    Ren, Siyuan; Yang, Guang; He, Youyu; Wang, Yiguo; Li, Yixue; Chen, Zhengjun

    2008-01-01

    Background: Many well-represented domains recognize primary sequences usually less than 10 amino acids in length, called Short Linear Motifs (SLiMs). Accurate prediction of SLiMs has been difficult because they are short (often < 10 amino acids) and highly degenerate. In this study, we combined scoring matrixes derived from peptide library and conservation analysis to identify protein classes enriched in functional SLiMs recognized by SH2, SH3, PDZ and S/T kinase domains. Results: Our combined approach revealed that SLiMs are highly conserved in proteins from functional classes that are known to interact with a specific domain, but that they are not conserved in most other protein groups. We found that SLiMs recognized by SH2 domains were highly conserved in receptor kinases/phosphatases, adaptor molecules, and tyrosine kinases/phosphatases, that SLiMs recognized by SH3 domains were highly conserved in cytoskeletal and cytoskeletal-associated proteins, that SLiMs recognized by PDZ domains were highly conserved in membrane proteins such as channels and receptors, and that SLiMs recognized by S/T kinase domains were highly conserved in adaptor molecules, S/T kinases/phosphatases, and proteins involved in transcription or cell cycle control. We studied Tyr-SLiMs recognized by SH2 domains in more detail, and found that SH2-recognized Tyr-SLiMs on the cytoplasmic side of membrane proteins are more highly conserved than those on the extra-cellular side. Also, we found that SH2-recognized Tyr-SLiMs that are associated with SH3 motifs and a tyrosine kinase phosphorylation motif are more highly conserved. Conclusion: The interactome of protein domains is reflected by the evolutionary conservation of SLiMs recognized by these domains. Combining scoring matrixes derived from peptide libraries and conservation analysis, we would be able to find those protein groups that are more likely to interact with specific domains. PMID:18828911

  1. Evolutionary dynamics of a conserved sequence motif in the ribosomal genes of the ciliate Paramecium.

    PubMed

    Catania, Francesco; Lynch, Michael

    2010-05-04

    In protozoa, the identification of preserved motifs by comparative genomics is often impeded by difficulties in generating reliable alignments for non-coding sequences. Moreover, the evolutionary dynamics of regulatory elements in 3' untranslated regions (both in protozoa and metazoa) remains a virtually unexplored issue. By screening Paramecium tetraurelia's 3' untranslated regions for 8-mers that were previously found to be preserved in mammalian 3' UTRs, we detect and characterize a motif that is distinctly conserved in the ribosomal genes of this ciliate. The motif appears to be conserved across Paramecium aurelia species but is absent from the ribosomal genes of four additional non-Paramecium species surveyed, including another ciliate, Tetrahymena thermophila. Motif-free ribosomal genes retain fewer paralogs in the genome and appear to be lost more rapidly relative to motif-containing genes. Features associated with the discovered preserved motif are consistent with this 8-mer playing a role in post-transcriptional regulation. Our observations 1) shed light on the evolution of a putative regulatory motif across large phylogenetic distances; 2) are expected to facilitate the understanding of the modulation of ribosomal gene expression in Paramecium; and 3) reveal a largely unexplored--and presumably not restricted to Paramecium--association between the presence/absence of a DNA motif and the evolutionary fate of its host genes.
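    The 8-mer screen described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name, the toy sequences, and the candidate 8-mer are hypothetical placeholders, and the sketch only counts how many 3' UTR sequences contain each candidate at least once.

```python
from collections import Counter

def count_kmer_hits(utr_sequences, candidate_kmers, k=8):
    """For each candidate k-mer, count the sequences that contain it at least once."""
    candidates = {kmer.upper() for kmer in candidate_kmers}
    hits = Counter()
    for seq in utr_sequences:
        seq = seq.upper()
        # All distinct k-mers present in this sequence (empty if the sequence is shorter than k).
        present = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        for kmer in present & candidates:
            hits[kmer] += 1
    return hits

# Hypothetical toy data: neither the sequences nor the 8-mer come from the paper.
utrs = ["AAATGTAAATACCC", "GGGTGTAAATATTT", "CCCCCCCCCCCC"]
hits = count_kmer_hits(utrs, ["TGTAAATA"])
```

    Counting each sequence once, rather than every occurrence, keeps a single repeat-rich UTR from dominating the tally; a real screen would also need a background model to judge whether a count is unexpectedly high.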

  2. Function-based classification of carbohydrate-active enzymes by recognition of short, conserved peptide motifs.

    PubMed

    Busk, Peter Kamp; Lange, Lene

    2013-06-01

    Functional prediction of carbohydrate-active enzymes is difficult due to low sequence identity. However, similar enzymes often share a few short motifs, e.g., around the active site, even when the overall sequences are very different. To exploit this notion for functional prediction of carbohydrate-active enzymes, we developed a simple algorithm, peptide pattern recognition (PPR), that can divide proteins into groups of sequences that share a set of short conserved sequences. When this method was used on 118 glycoside hydrolase 5 proteins with 9% average pairwise identity and representing four characterized enzymatic functions, 97% of the proteins were sorted into groups correlating with their enzymatic activity. Furthermore, we analyzed 8,138 glycoside hydrolase 13 proteins including 204 experimentally characterized enzymes with 28 different functions. There was a 91% correlation between group and enzyme activity. These results indicate that the function of carbohydrate-active enzymes can be predicted with high precision by finding short, conserved motifs in their sequences. The glycoside hydrolase 61 family is important for fungal biomass conversion, but only a few proteins of this family have been functionally characterized. Interestingly, PPR divided 743 glycoside hydrolase 61 proteins into 16 subfamilies useful for targeted investigation of the function of these proteins and pinpointed three conserved motifs with putative importance for enzyme activity. Furthermore, the conserved sequences were useful for cloning of new, subfamily-specific glycoside hydrolase 61 proteins from 14 fungi. In conclusion, identification of conserved sequence motifs is a new approach to sequence analysis that can predict carbohydrate-active enzyme functions with high precision.

  3. INCENP Centromere and Spindle Targeting: Identification of Essential Conserved Motifs and Involvement of Heterochromatin Protein HP1

    PubMed Central

    Ainsztein, Alexandra M.; Kandels-Lewis, Stefanie E.; Mackay, Alastair M.; Earnshaw, William C.

    1998-01-01

    The inner centromere protein (INCENP) has a modular organization, with domains required for chromosomal and cytoskeletal functions concentrated near the amino and carboxyl termini, respectively. In this study we have identified an autonomous centromere- and midbody-targeting module in the amino-terminal 68 amino acids of INCENP. Within this module, we have identified two evolutionarily conserved amino acid sequence motifs: a 13–amino acid motif that is required for targeting to centromeres and transfer to the spindle, and an 11–amino acid motif that is required for transfer to the spindle by molecules that have targeted previously to the centromere. To begin to understand the mechanisms of INCENP function in mitosis, we have performed a yeast two-hybrid screen for interacting proteins. These and subsequent in vitro binding experiments identify a physical interaction between INCENP and heterochromatin protein HP1Hsα. Surprisingly, this interaction does not appear to be involved in targeting INCENP to the centromeric heterochromatin, but may instead have a role in its transfer from the chromosomes to the anaphase spindle. PMID:9864353

  4. Evolutionary dynamics of a conserved sequence motif in the ribosomal genes of the ciliate Paramecium

    PubMed Central

    Catania, Francesco; Lynch, Michael

    2010-01-01

    Background: In protozoa, the identification of preserved motifs by comparative genomics is often impeded by difficulties in generating reliable alignments for non-coding sequences. Moreover, the evolutionary dynamics of regulatory elements in 3' untranslated regions (both in protozoa and metazoa) remains a virtually unexplored issue. Results: By screening Paramecium tetraurelia's 3' untranslated regions for 8-mers that were previously found to be preserved in mammalian 3' UTRs, we detect and characterize a motif that is distinctly conserved in the ribosomal genes of this ciliate. The motif appears to be conserved across Paramecium aurelia species but is absent from the ribosomal genes of four additional non-Paramecium species surveyed, including another ciliate, Tetrahymena thermophila. Motif-free ribosomal genes retain fewer paralogs in the genome and appear to be lost more rapidly relative to motif-containing genes. Features associated with the discovered preserved motif are consistent with this 8-mer playing a role in post-transcriptional regulation. Conclusions: Our observations 1) shed light on the evolution of a putative regulatory motif across large phylogenetic distances; 2) are expected to facilitate the understanding of the modulation of ribosomal gene expression in Paramecium; and 3) reveal a largely unexplored--and presumably not restricted to Paramecium--association between the presence/absence of a DNA motif and the evolutionary fate of its host genes. PMID:20441586

  5. A Conserved Metal Binding Motif in the Bacillus subtilis Competence Protein ComFA Enhances Transformation.

    PubMed

    Chilton, Scott S; Falbel, Tanya G; Hromada, Susan; Burton, Briana M

    2017-08-01

    Genetic competence is a process in which cells are able to take up DNA from their environment, resulting in horizontal gene transfer, a major mechanism for generating diversity in bacteria. Many bacteria carry homologs of the central DNA uptake machinery that has been well characterized in Bacillus subtilis. It has been postulated that the B. subtilis competence helicase ComFA belongs to the DEAD box family of helicases/translocases. Here, we made a series of mutants to analyze conserved amino acid motifs in several regions of B. subtilis ComFA. First, we confirmed that ComFA activity requires amino acid residues conserved among the DEAD box helicases, and second, we show that a zinc finger-like motif consisting of four cysteines is required for efficient transformation. Each cysteine in the motif is important, and mutation of at least two of the cysteines dramatically reduces transformation efficiency. Further, combining multiple cysteine mutations with the helicase mutations shows an additive phenotype. Our results suggest that the helicase and metal binding functions are two distinct activities important for ComFA function during transformation. IMPORTANCE: ComFA is a highly conserved protein that has a role in DNA uptake during natural competence, a mechanism for horizontal gene transfer observed in many bacteria. Investigation of the details of the DNA uptake mechanism is important for understanding the ways in which bacteria gain new traits from their environment, such as drug resistance. To dissect the role of ComFA in the DNA uptake machinery, we introduced point mutations into several motifs in the protein sequence. We demonstrate that several amino acid motifs conserved among ComFA proteins are important for efficient transformation. This report is the first to demonstrate the functional requirement of an amino-terminal cysteine motif in ComFA. Copyright © 2017 American Society for Microbiology.

  6. Discovery of candidate KEN-box motifs using cell cycle keyword enrichment combined with native disorder prediction and motif conservation.

    PubMed

    Michael, Sushama; Travé, Gilles; Ramu, Chenna; Chica, Claudia; Gibson, Toby J

    2008-02-15

    KEN-box-mediated target selection is one of the mechanisms used in the proteasomal destruction of mitotic cell cycle proteins via the APC/C complex. While annotating the Eukaryotic Linear Motif resource (ELM, http://elm.eu.org/), we found that KEN motifs were significantly enriched in human protein entries with cell cycle keywords in the UniProt/Swiss-Prot database, implying that KEN-boxes might be more common than reported. Matches to short linear motifs in protein database searches are not, per se, significant. KEN-box enrichment with cell cycle Gene Ontology terms suggests that collectively these motifs are functional but does not prove that any given instance is so. Candidates were surveyed for native disorder prediction using GlobPlot and IUPred and for motif conservation in homologues. Among >25 strong new candidates, the most notable are human HIPK2, CHFR, CDC27, Dab2, Upf2, kinesin Eg5, DNA Topoisomerase 1 and yeast Cdc5 and Swi5. A similar number of weaker candidates were present. These proteins have yet to be tested for APC/C targeted destruction, providing potential new avenues of research.

  7. Rewiring yeast sugar transporter preference through modifying a conserved protein motif.

    PubMed

    Young, Eric M; Tong, Alice; Bui, Hang; Spofford, Caitlin; Alper, Hal S

    2014-01-07

    Utilization of exogenous sugars found in lignocellulosic biomass hydrolysates, such as xylose, must be improved before yeast can serve as an efficient biofuel and biochemical production platform. In particular, the first step in this process, the molecular transport of xylose into the cell, can serve as a significant flux bottleneck and is highly inhibited by other sugars. Here we demonstrate that sugar transport preference and kinetics can be rewired through the programming of a sequence motif of the general form G-G/F-XXX-G found in the first transmembrane span. By evaluating 46 different heterologously expressed transporters, we find that this motif is conserved among functional transporters and highly enriched in transporters that confer growth on xylose. Through saturation mutagenesis and subsequent rational mutagenesis, four transporter mutants unable to confer growth on glucose but able to sustain growth on xylose were engineered. Specifically, Candida intermedia gxs1 Phe(38)Ile(39)Met(40), Scheffersomyces stipitis rgt2 Phe(38) and Met(40), and Saccharomyces cerevisiae hxt7 Ile(39)Met(40)Met(340) all exhibit this phenotype. In these cases, primary hexose transporters were rewired into xylose transporters. These xylose transporters nevertheless remained inhibited by glucose. Furthermore, in the course of identifying this motif, novel wild-type transporters with superior monosaccharide growth profiles were discovered, namely S. stipitis RGT2 and Debaryomyces hansenii 2D01474. These findings build toward the engineering of efficient pentose utilization in yeast and provide a blueprint for reprogramming transporter properties.
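    The motif form G-G/F-XXX-G given in this abstract can be expressed as a regular expression. The sketch below is an illustration only: it assumes "G/F" means glycine or phenylalanine and "XXX" means any three residues, and the function name and toy sequence are hypothetical rather than taken from any transporter in the study.

```python
import re

# Assumed translation of "G-G/F-XXX-G": G, then G or F, then any three residues, then G.
# The zero-width lookahead lets overlapping occurrences be reported as separate hits.
MOTIF = re.compile(r"(?=(G[GF].{3}G))")

def find_transporter_motif(protein_seq):
    """Return (1-based start position, matched residues) for every motif occurrence."""
    return [(m.start() + 1, m.group(1)) for m in MOTIF.finditer(protein_seq.upper())]

# Hypothetical toy sequence, not a real transporter.
toy_seq = "MAGFIMAGLL"
hits = find_transporter_motif(toy_seq)
```

    Reporting 1-based positions matches the residue numbering convention used in the abstract (for example Phe(38)), although the toy sequence here carries no such numbering.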

  8. Identifying novel sequence variants of RNA 3D motifs

    PubMed Central

    Zirbel, Craig L.; Roll, James; Sweeney, Blake A.; Petrov, Anton I.; Pirrung, Meg; Leontis, Neocles B.

    2015-01-01

    Predicting RNA 3D structure from sequence is a major challenge in biophysics. An important sub-goal is accurately identifying recurrent 3D motifs from RNA internal and hairpin loop sequences extracted from secondary structure (2D) diagrams. We have developed and validated new probabilistic models for 3D motif sequences based on hybrid Stochastic Context-Free Grammars and Markov Random Fields (SCFG/MRF). The SCFG/MRF models are constructed using atomic-resolution RNA 3D structures. To parameterize each model, we use all instances of each motif found in the RNA 3D Motif Atlas and annotations of pairwise nucleotide interactions generated by the FR3D software. Isostericity relations between non-Watson–Crick basepairs are used in scoring sequence variants. SCFG techniques model nested pairs and insertions, while MRF ideas handle crossing interactions and base triples. We use test sets of randomly-generated sequences to set acceptance and rejection thresholds for each motif group and thus control the false positive rate. Validation was carried out by comparing results for four motif groups to RMDetect. The software developed for sequence scoring (JAR3D) is structured to automatically incorporate new motifs as they accumulate in the RNA 3D Motif Atlas when new structures are solved and is available free for download. PMID:26130723

  9. Identifying DNA-binding proteins using structural motifs and the electrostatic potential

    PubMed Central

    Shanahan, Hugh P.; Garcia, Mario A.; Jones, Susan; Thornton, Janet M.

    2004-01-01

    Robust methods to detect DNA-binding proteins from structures of unknown function are important for structural biology. This paper describes a method for identifying such proteins that (i) have a solvent accessible structural motif necessary for DNA-binding and (ii) have a positive electrostatic potential in the binding region. We focus on three structural motifs: helix–turn-helix (HTH), helix–hairpin–helix (HhH) and helix–loop–helix (HLH). We find that the combination of these variables detects 78% of proteins with an HTH motif, which is a substantial improvement over previous work based purely on structural templates and is comparable to more complex methods of identifying DNA-binding proteins. Similar true positive fractions are achieved for the HhH and HLH motifs. We see evidence of wide evolutionary diversity for DNA-binding proteins with an HTH motif, and much smaller diversity for those with an HhH or HLH motif. PMID:15356290

  10. Statistical Methods for Identifying Sequence Motifs Affecting Point Mutations

    PubMed Central

    Zhu, Yicheng; Neeman, Teresa; Yap, Von Bing; Huttley, Gavin A.

    2017-01-01

    Mutation processes differ between types of point mutation, genomic locations, cells, and biological species. For some point mutations, specific neighboring bases are known to be mechanistically influential. Beyond these cases, numerous questions remain unresolved, including: what are the sequence motifs that affect point mutations? How large are the motifs? Are they strand symmetric? And, do they vary between samples? We present new log-linear models that allow explicit examination of these questions, along with sequence logo style visualization to enable identifying specific motifs. We demonstrate the performance of these methods by analyzing mutation processes in human germline and malignant melanoma. We recapitulate the known CpG effect, and identify novel motifs, including a highly significant motif associated with A→G mutations. We show that major effects of neighbors on germline mutation lie within ±2 of the mutating base. Models are also presented for contrasting the entire mutation spectra (the distribution of the different point mutations). We show the spectra vary significantly between autosomes and X-chromosome, with a difference in T→C transition dominating. Analyses of malignant melanoma confirmed reported characteristic features of this cancer, including statistically significant strand asymmetry, and markedly different neighboring influences. The methods we present are made freely available as a Python library https://bitbucket.org/pycogent3/mutationmotif. PMID:27974498

  11. Hairpin structures with conserved sequence motifs determine the 3' ends of non-polyadenylated invertebrate iridovirus transcripts.

    PubMed

    İnce, İkbal Agah; Pijlman, Gorben P; Vlak, Just M; van Oers, Monique M

    2017-11-01

    Previously, we observed that the transcripts of Invertebrate iridescent virus 6 (IIV6) are not polyadenylated, in line with the absence of canonical poly(A) motifs (AATAAA) downstream of the open reading frames (ORFs) in the genome. Here, we determined the 3' ends of the transcripts of fifty-four IIV6 virion protein genes in infected Drosophila Schneider 2 (S2) cells. By using ligation-based amplification of cDNA ends (LACE) it was shown that the IIV6 mRNAs often ended with a CAUUA motif. In silico analysis showed that the 3'-untranslated regions of IIV6 genes have the ability to form hairpin structures (22-56 nt in length) and that for about half of all IIV6 genes these 3' sequences contained complementary TAATG and CATTA motifs. We also show that a hairpin in the 3' flanking region with conserved sequence motifs is a conserved feature in invertebrate-infecting iridoviruses (genus Iridovirus and Chloriridovirus). Copyright © 2017 Elsevier Inc. All rights reserved.
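    TAATG and CATTA are reverse complements of each other, which is why the two motifs can pair within a hairpin stem. The sketch below checks this and locates candidate motif pairs in a DNA sequence. It is an assumption-laden illustration: the function names, the minimum loop length, the upstream/downstream order, and the toy sequence are hypothetical, and no folding energy is computed.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return dna.upper().translate(COMPLEMENT)[::-1]

def find_all(seq, motif):
    """Return every 0-based start of motif in seq, including overlapping ones."""
    return [i for i in range(len(seq) - len(motif) + 1) if seq.startswith(motif, i)]

def find_complementary_pairs(dna, motif="TAATG", min_loop=3):
    """Pair each motif with downstream reverse-complement copies.

    min_loop is an assumed minimum spacer between the two arms, not a value from the abstract.
    """
    dna = dna.upper()
    partner = reverse_complement(motif)
    pairs = []
    for i in find_all(dna, motif):
        for j in find_all(dna, partner):
            if j >= i + len(motif) + min_loop:
                pairs.append((i, j))
    return pairs

# Hypothetical toy sequence: TAATG, a 4-nt spacer, then CATTA.
pairs = find_complementary_pairs("GGTAATGCCCCCATTAGG")
```

    A sequence-only check such as this merely flags candidates; predicting whether a hairpin actually forms would need a dedicated RNA folding tool.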

  12. Intrinsically disordered proteins drive enamel formation via an evolutionarily conserved self-assembly motif.

    PubMed

    Wald, Tomas; Spoutil, Frantisek; Osickova, Adriana; Prochazkova, Michaela; Benada, Oldrich; Kasparek, Petr; Bumba, Ladislav; Klein, Ophir D; Sedlacek, Radislav; Sebo, Peter; Prochazka, Jan; Osicka, Radim

    2017-02-28

    The formation of mineralized tissues is governed by extracellular matrix proteins that assemble into a 3D organic matrix directing the deposition of hydroxyapatite. Although the formation of bones and dentin depends on the self-assembly of type I collagen via the Gly-X-Y motif, the molecular mechanism by which enamel matrix proteins (EMPs) assemble into the organic matrix remains poorly understood. Here we identified a Y/F-x-x-Y/L/F-x-Y/F motif, evolutionarily conserved from the first tetrapods to man, that is crucial for higher order structure self-assembly of the key intrinsically disordered EMPs, ameloblastin and amelogenin. Using targeted mutations in mice and high-resolution imaging, we show that impairment of ameloblastin self-assembly causes disorganization of the enamel organic matrix and yields enamel with disordered hydroxyapatite crystallites. These findings define a paradigm for the molecular mechanism by which the EMPs self-assemble into supramolecular structures and demonstrate that this process is crucial for organization of the organic matrix and formation of properly structured enamel.
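    Motif notation such as Y/F-x-x-Y/L/F-x-Y/F can be converted mechanically into a regular expression for sequence scanning. The sketch below assumes that "/" separates alternative residues, "x" stands for any residue, and "-" separates positions; the function name and the toy peptide are hypothetical and are not taken from ameloblastin or amelogenin.

```python
import re

def motif_to_regex(motif):
    """Convert dash-separated motif notation into a regular expression string."""
    parts = []
    for token in motif.split("-"):
        if token.lower() == "x" * len(token):
            # One or more wildcard positions, e.g. "x" or "XXX".
            parts.append("." * len(token))
        elif "/" in token:
            # Alternative residues at one position, e.g. "Y/L/F" -> "[YLF]".
            parts.append("[" + "".join(token.split("/")) + "]")
        else:
            # A single fixed residue.
            parts.append(token)
    return "".join(parts)

pattern = motif_to_regex("Y/F-x-x-Y/L/F-x-Y/F")
# Hypothetical toy peptide; positions are reported 1-based.
matches = [m.start() + 1 for m in re.finditer(pattern, "AAYPSLTYAA")]
```

    Keeping the conversion separate from the scan makes it easy to inspect the generated pattern before applying it to real sequences.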

  13. Characteristic motifs for families of allergenic proteins

    PubMed Central

    Ivanciuc, Ovidiu; Garcia, Tzintzuni; Torres, Miguel; Schein, Catherine H.; Braun, Werner

    2008-01-01

    The identification of potential allergenic proteins is usually done by scanning a database of allergenic proteins and locating known allergens with a high sequence similarity. However, there is no universally accepted cut-off value for sequence similarity to indicate potential IgE cross-reactivity. Further, overall sequence similarity may be less important than discrete areas of similarity in proteins with homologous structure. To identify such areas, we first classified all allergens and their subdomains in the Structural Database of Allergenic Proteins (SDAP, http://fermi.utmb.edu/SDAP/) to their closest protein families as defined in Pfam, and identified conserved physicochemical property motifs characteristic of each group of sequences. Allergens populate only a small subset of all known Pfam families, as all allergenic proteins in SDAP could be grouped to only 130 (of 9318 total) Pfams, and 31 families contain more than four allergens. Conserved physicochemical property motifs for the aligned sequences of the most populated Pfam families were identified with the PCPMer program suite and catalogued in the webserver Motif-Mate (http://born.utmb.edu/motifmate/summary.php). We also determined specific motifs for allergenic members of a family that could distinguish them from non-allergenic ones. These allergen specific motifs should be most useful in database searches for potential allergens. We found that sequence motifs unique to the allergens in three families (seed storage proteins, Bet v 1, and tropomyosin) overlap with known IgE epitopes, thus providing evidence that our motif based approach can be used to assess the potential allergenicity of novel proteins. PMID:18951633

  14. Characterization of Conserved Tandem Donor Sites and Intronic Motifs Required for Alternative Splicing in Corticosteroid Receptor Genes

    PubMed Central

    Qian, Xiaoxiao; Matthews, Laura; Lightman, Stafford; Ray, David; Norman, Michael

    2015-01-01

    Alternative splicing events from tandem donor sites result in mRNA variants coding for additional amino acids in the DNA binding domain of both the glucocorticoid (GR) and mineralocorticoid (MR) receptors. We now show that expression of both splice variants is extensively conserved in mammalian species, providing strong evidence for their functional significance. An exception to the conservation of the MR tandem splice site (an A at position +5 of the MR+12 donor site in the mouse) was predicted to decrease U1 small nuclear RNA binding. In accord with this prediction, we were unable to detect the MR+12 variant in this species. The one exception to the conservation of the GR tandem splice site, an A at position +3 of the platypus GRγ donor site that was predicted to enhance binding of U1 snRNA, was unexpectedly associated with decreased expression of the variant from the endogenous gene as well as a minigene. An intronic pyrimidine motif present in both GR and MR genes was found to be critical for usage of the downstream donor site, and overexpression of TIA1/TIAL1 RNA binding proteins, which are known to bind such motifs, led to a marked increase in the proportion of GRγ and MR+12. These results provide striking evidence for conservation of a complex splicing mechanism that involves processes other than stochastic spliceosome binding and identify a mechanism that would allow regulation of variant expression. PMID:19819975

  15. CompariMotif: quick and easy comparisons of sequence motifs.

    PubMed

    Edwards, Richard J; Davey, Norman E; Shields, Denis C

    2008-05-15

    CompariMotif is a novel tool for making motif-motif comparisons, identifying and describing similarities between regular expression motifs. CompariMotif can identify a number of different relationships between motifs, including exact matches, variants of degenerate motifs and complex overlapping motifs. Motif relationships are scored using shared information content, allowing the best matches to be easily identified in large comparisons. Many input and search options are available, enabling a list of motifs to be compared to itself (to identify recurring motifs) or to datasets of known motifs. CompariMotif can be run online at http://bioware.ucd.ie/ and is freely available for academic use as a set of open source Python modules under a GNU General Public License from http://bioinformatics.ucd.ie/shields/software/comparimotif/

  16. Methods for Identifying Ligands that Target Nucleic Acid Molecules and Nucleic Acid Structural Motifs

    NASA Technical Reports Server (NTRS)

    Childs-Disney, Jessica L. (Inventor); Disney, Matthew D. (Inventor)

    2017-01-01

    Disclosed are methods for identifying a nucleic acid (e.g., RNA, DNA, etc.) motif which interacts with a ligand. The method includes providing a plurality of ligands immobilized on a support, wherein each particular ligand is immobilized at a discrete location on the support; contacting the plurality of immobilized ligands with a nucleic acid motif library under conditions effective for one or more members of the nucleic acid motif library to bind with the immobilized ligands; and identifying members of the nucleic acid motif library that are bound to a particular immobilized ligand. Also disclosed are methods for selecting, from a plurality of candidate ligands, one or more ligands that have increased likelihood of binding to a nucleic acid molecule comprising a particular nucleic acid motif, as well as methods for identifying a nucleic acid which interacts with a ligand.

  17. Targeting of Arabidopsis KNL2 to Centromeres Depends on the Conserved CENPC-k Motif in Its C Terminus.

    PubMed

    Sandmann, Michael; Talbert, Paul; Demidov, Dmitri; Kuhlmann, Markus; Rutten, Twan; Conrad, Udo; Lermontova, Inna

    2017-01-01

    KINETOCHORE NULL2 (KNL2) is involved in recognition of centromeres and in centromeric localization of the centromere-specific histone cenH3. Our study revealed a cenH3 nucleosome binding CENPC-k motif at the C terminus of Arabidopsis thaliana KNL2, which is conserved among a wide spectrum of eukaryotes. Centromeric localization of KNL2 is abolished by deletion of the CENPC-k motif and by mutating single conserved amino acids, but can be restored by insertion of the corresponding motif of Arabidopsis CENP-C. We showed by electrophoretic mobility shift assay that the C terminus of KNL2 binds DNA sequence-independently and interacts with the centromeric transcripts in vitro. Chromatin immunoprecipitation with anti-KNL2 antibodies indicated that in vivo KNL2 is preferentially associated with the centromeric repeat pAL1. Complete deletion of the CENPC-k motif did not influence its ability to interact with DNA in vitro. Therefore, we suggest that KNL2 recognizes centromeric nucleosomes, similar to CENP-C, via the CENPC-k motif and binds adjoining DNA. © 2017 American Society of Plant Biologists. All rights reserved.

  18. Conserved binding of GCAC motifs by MEC-8, couch potato, and the RBPMS protein family

    PubMed Central

    Soufari, Heddy

    2017-01-01

    Precise regulation of mRNA processing, translation, localization, and stability relies on specific interactions with RNA-binding proteins whose biological function and target preference are dictated by their preferred RNA motifs. The RBPMS family of RNA-binding proteins is defined by a conserved RNA recognition motif (RRM) domain found in metazoan RBPMS/Hermes and RBPMS2, Drosophila couch potato, and MEC-8 from Caenorhabditis elegans. In order to determine the parameters of RNA sequence recognition by the RBPMS family, we have first used the N-terminal domain from MEC-8 in binding assays and have demonstrated a preference for two GCAC motifs optimally separated by >6 nucleotides (nt). We have also determined the crystal structure of the dimeric N-terminal RRM domain from MEC-8 in the unbound form, and in complex with an oligonucleotide harboring two copies of the optimal GCAC motif. The atomic details reveal the molecular network that provides specificity to all four bases in the motif, including multiple hydrogen bonds to the initial guanine. Further studies with human RBPMS, as well as Drosophila couch potato, confirm a general preference for this double GCAC motif by other members of the protein family and the presence of this motif in known targets. PMID:28003515
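    The reported preference for two GCAC motifs separated by more than 6 nucleotides suggests a simple scan for candidate sites. The sketch below is hypothetical: it treats the "optimal" spacing as a hard cutoff purely for illustration, and the function name, default threshold, and toy sequence are assumptions rather than anything from the study.

```python
def gcac_pairs(rna, motif="GCAC", min_gap=7):
    """Return (start1, start2, gap) for motif pairs separated by at least min_gap nt.

    Starts are 0-based; gap counts the nucleotides between the two motif copies.
    min_gap=7 encodes ">6 nt" as a strict filter, which is an illustrative assumption.
    """
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    starts = [i for i in range(len(rna) - len(motif) + 1) if rna.startswith(motif, i)]
    pairs = []
    for idx, first in enumerate(starts):
        for second in starts[idx + 1:]:
            gap = second - (first + len(motif))
            if gap >= min_gap:
                pairs.append((first, second, gap))
    return pairs

# Hypothetical toy RNA: two GCAC copies with a 7-nt spacer.
sites = gcac_pairs("GCACUUUUUUUGCAC")
```

    A binding preference is graded rather than all-or-nothing, so a real analysis would more likely score spacings than discard pairs below a threshold.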

  19. Interaction of MYC with host cell factor-1 is mediated by the evolutionarily conserved Myc box IV motif.

    PubMed

    Thomas, L R; Foshage, A M; Weissmiller, A M; Popay, T M; Grieb, B C; Qualls, S J; Ng, V; Carboneau, B; Lorey, S; Eischen, C M; Tansey, W P

    2016-07-07

    The MYC family of oncogenes encodes a set of three related transcription factors that are overexpressed in many human tumors and contribute to the cancer-related deaths of more than 70,000 Americans every year. MYC proteins drive tumorigenesis by interacting with co-factors that enable them to regulate the expression of thousands of genes linked to cell growth, proliferation, metabolism and genome stability. One effective way to identify critical co-factors required for MYC function has been to focus on sequence motifs within MYC that are conserved throughout evolution, on the assumption that their conservation is driven by protein-protein interactions that are vital for MYC activity. In addition to their DNA-binding domains, MYC proteins carry five regions of high sequence conservation known as Myc boxes (Mb). To date, four of the Mb motifs (MbI, MbII, MbIIIa and MbIIIb) have had a molecular function assigned to them, but the precise role of the remaining Mb, MbIV, and the reason for its preservation in vertebrate Myc proteins, is unknown. Here, we show that MbIV is required for the association of MYC with the abundant transcriptional coregulator host cell factor-1 (HCF-1). We show that the invariant core of MbIV resembles the tetrapeptide HCF-binding motif (HBM) found in many HCF-interaction partners, and demonstrate that MYC interacts with HCF-1 in a manner indistinguishable from the prototypical HBM-containing protein VP16. Finally, we show that rationalized point mutations in MYC that disrupt interaction with HCF-1 attenuate the ability of MYC to drive tumorigenesis in mice. Together, these data expose a molecular function for MbIV and indicate that HCF-1 is an important co-factor for MYC.

  20. An evolutionarily conserved motif in the TAB1 C-terminal region is necessary for interaction with and activation of TAK1 MAPKKK.

    PubMed

    Ono, K; Ohtomo, T; Sato, S; Sugamata, Y; Suzuki, M; Hisamoto, N; Ninomiya-Tsuji, J; Tsuchiya, M; Matsumoto, K

    2001-06-29

    TAK1, a member of the MAPKKK family, is involved in the intracellular signaling pathways mediated by transforming growth factor beta, interleukin 1, and Wnt. TAK1 kinase activity is specifically activated by the TAK1-binding protein TAB1. The C-terminal 68-amino acid sequence of TAB1 (TAB1-C68) is sufficient for TAK1 interaction and activation. Analysis of various truncated versions of TAB1-C68 defined a C-terminal 30-amino acid sequence (TAB1-C30) necessary for TAK1 binding and activation. NMR studies revealed that the TAB1-C30 region has a unique alpha-helical structure. We identified a conserved sequence motif, PYVDXA/TXF, in the C-terminal domain of mammalian TAB1, Xenopus TAB1, and its Caenorhabditis elegans homolog TAP-1, suggesting that this motif constitutes a specific TAK1 docking site. Alanine substitution mutagenesis showed that TAB1 Phe-484, located in the conserved motif, is crucial for TAK1 binding and activation. The C. elegans homolog of TAB1, TAP-1, was able to interact with and activate the C. elegans homolog of TAK1, MOM-4. However, the site in TAP-1 corresponding to Phe-484 of TAB1 is an alanine residue (Ala-364), and changing this residue to Phe abrogates the ability of TAP-1 to interact with and activate MOM-4. These results suggest that the Phe or Ala residue within the conserved motif of the TAB1-related proteins is important for interaction with and activation of specific TAK1 MAPKKK family members in vivo.

  1. Identifying the preferred RNA motifs and chemotypes that interact by probing millions of combinations.

    PubMed

    Tran, Tuan; Disney, Matthew D

    2012-01-01

    RNA is an important therapeutic target but information about RNA-ligand interactions is limited. Here, we report a screening method that probes over 3,000,000 combinations of RNA motif-small molecule interactions to identify the privileged RNA structures and chemical spaces that interact. Specifically, a small molecule library biased for binding RNA was probed for binding to over 70,000 unique RNA motifs in a high throughput solution-based screen. The RNA motifs that specifically bind each small molecule were identified by microarray-based selection. In this library-versus-library or multidimensional combinatorial screening approach, hairpin loops (among a variety of RNA motifs) were the preferred RNA motif space that binds small molecules. Furthermore, it was shown that indole, 2-phenyl indole, 2-phenyl benzimidazole and pyridinium chemotypes allow for specific recognition of RNA motifs. As targeting RNA with small molecules is an extremely challenging area, these studies provide new information on RNA-ligand interactions that has many potential uses.

  2. Identifying the Preferred RNA Motifs and Chemotypes that Interact by Probing Millions of Combinations

    PubMed Central

    Tran, Tuan; Disney, Matthew D.

    2012-01-01

    RNA is an important therapeutic target but information about RNA-ligand interactions is limited. Here we report a screening method that probes over 3,000,000 combinations of RNA motif-small molecule interactions to identify the privileged RNA structures and chemical spaces that interact. Specifically, a small molecule library biased for binding RNA was probed for binding to over 70,000 unique RNA motifs in a high throughput solution-based screen. The RNA motifs that specifically bind each small molecule were identified by microarray-based selection. In this library-versus-library or multidimensional combinatorial screening approach, hairpin loops (amongst a variety of RNA motifs) were the preferred RNA motif space that binds small molecules. Furthermore, it was shown that indole, 2-phenyl indole, 2-phenyl benzimidazole, and pyridinium chemotypes allow for specific recognition of RNA motifs. Since targeting RNA with small molecules is an extremely challenging area, these studies provide new information on RNA-ligand interactions that has many potential uses. PMID:23047683

  3. Genomic positional conservation identifies topological anchor point RNAs linked to developmental loci.

    PubMed

    Amaral, Paulo P; Leonardi, Tommaso; Han, Namshik; Viré, Emmanuelle; Gascoigne, Dennis K; Arias-Carrasco, Raúl; Büscher, Magdalena; Pandolfini, Luca; Zhang, Anda; Pluchino, Stefano; Maracaja-Coutinho, Vinicius; Nakaya, Helder I; Hemberg, Martin; Shiekhattar, Ramin; Enright, Anton J; Kouzarides, Tony

    2018-03-15

    The mammalian genome is transcribed into large numbers of long noncoding RNAs (lncRNAs), but the definition of functional lncRNA groups has proven difficult, partly due to their low sequence conservation and lack of identified shared properties. Here we consider promoter conservation and positional conservation as indicators of functional commonality. We identify 665 conserved lncRNA promoters in mouse and human that are preserved in genomic position relative to orthologous coding genes. These positionally conserved lncRNA genes are primarily associated with developmental transcription factor loci with which they are coexpressed in a tissue-specific manner. Over half of positionally conserved RNAs in this set are linked to chromatin organization structures, overlapping binding sites for the CTCF chromatin organiser and located at chromatin loop anchor points and borders of topologically associating domains (TADs). We define these RNAs as topological anchor point RNAs (tapRNAs). Characterization of these noncoding RNAs and their associated coding genes shows that they are functionally connected: they regulate each other's expression and influence the metastatic phenotype of cancer cells in vitro in a similar fashion. Furthermore, we find that tapRNAs contain conserved sequence domains that are enriched in motifs for zinc finger domain-containing RNA-binding proteins and transcription factors, whose binding sites are found mutated in cancers. This work leverages positional conservation to identify lncRNAs with potential importance in genome organization, development and disease. The evidence that many developmental transcription factors are physically and functionally connected to lncRNAs represents an exciting stepping-stone to further our understanding of genome regulation.

  4. Conserved DNA motifs in the type II-A CRISPR leader region.

    PubMed

    Van Orden, Mason J; Klein, Peter; Babu, Kesavan; Najar, Fares Z; Rajan, Rakhi

    2017-01-01

    The Clustered Regularly Interspaced Short Palindromic Repeats associated (CRISPR-Cas) systems consist of RNA-protein complexes that provide bacteria and archaea with sequence-specific immunity against bacteriophages, plasmids, and other mobile genetic elements. Bacteria and archaea become immune to phage or plasmid infections by inserting short pieces of the intruder DNA (spacer) site-specifically into the leader-repeat junction in a process called adaptation. Previous studies have shown that parts of the leader region, especially the 3' end of the leader, are indispensable for adaptation. However, a comprehensive analysis of leader ends remains absent. Here, we have analyzed the leader, repeat, and Cas proteins from 167 type II-A CRISPR loci. Our results indicate two distinct conserved DNA motifs at the 3' leader end: ATTTGAG (noted previously in the CRISPR1 locus of Streptococcus thermophilus DGCC7710) and a newly defined CTRCGAG, associated with the CRISPR3 locus of S. thermophilus DGCC7710. A third group with a very short CG DNA conservation at the 3' leader end is observed mostly in lactobacilli. Analysis of the repeats and Cas proteins revealed clustering of these CRISPR components that mirrors the leader motif clustering, in agreement with the coevolution of CRISPR-Cas components. Based on our analysis of the type II-A CRISPR loci, we implicate leader end sequences that could confer site-specificity for the adaptation-machinery in the different subsets of type II-A CRISPR loci.
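
    The two leader-end motifs named above can be checked against a leader sequence with a short script. This is only a minimal sketch, not the authors' pipeline: it assumes the motif occupies exactly the last seven nucleotides of the leader, and the function names (iupac_to_regex, leader_end_motif) are hypothetical. The R in CTRCGAG is read as the standard IUPAC code for A or G.

```python
import re

# Standard IUPAC nucleotide ambiguity codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def iupac_to_regex(motif):
    """Compile an IUPAC motif string into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in motif.upper()))


def leader_end_motif(leader, motifs=("ATTTGAG", "CTRCGAG")):
    """Return the first motif matching the 3' end of the leader, else None.

    Assumption: the motif sits at the very end of the leader sequence.
    """
    leader = leader.upper()
    for motif in motifs:
        tail = leader[-len(motif):]
        if iupac_to_regex(motif).fullmatch(tail):
            return motif
    return None


if __name__ == "__main__":
    # Toy leader sequences, invented for illustration only.
    for leader in ("GGATTTGAG", "AAACTACGAG", "GGGCTTCGAG"):
        print(leader, leader_end_motif(leader))
```

    A leader that matches neither motif returns None, which in this sketch would also cover the third group with only a short CG conservation.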

  5. Conserved DNA motifs in the type II-A CRISPR leader region

    PubMed Central

    Babu, Kesavan; Najar, Fares Z.

    2017-01-01

    The Clustered Regularly Interspaced Short Palindromic Repeats associated (CRISPR-Cas) systems consist of RNA-protein complexes that provide bacteria and archaea with sequence-specific immunity against bacteriophages, plasmids, and other mobile genetic elements. Bacteria and archaea become immune to phage or plasmid infections by inserting short pieces of the intruder DNA (spacer) site-specifically into the leader-repeat junction in a process called adaptation. Previous studies have shown that parts of the leader region, especially the 3′ end of the leader, are indispensable for adaptation. However, a comprehensive analysis of leader ends remains absent. Here, we have analyzed the leader, repeat, and Cas proteins from 167 type II-A CRISPR loci. Our results indicate two distinct conserved DNA motifs at the 3′ leader end: ATTTGAG (noted previously in the CRISPR1 locus of Streptococcus thermophilus DGCC7710) and a newly defined CTRCGAG, associated with the CRISPR3 locus of S. thermophilus DGCC7710. A third group with a very short CG DNA conservation at the 3′ leader end is observed mostly in lactobacilli. Analysis of the repeats and Cas proteins revealed clustering of these CRISPR components that mirrors the leader motif clustering, in agreement with the coevolution of CRISPR-Cas components. Based on our analysis of the type II-A CRISPR loci, we implicate leader end sequences that could confer site-specificity for the adaptation-machinery in the different subsets of type II-A CRISPR loci. PMID:28392985

  6. Motif finding in DNA sequences based on skipping nonconserved positions in background Markov chains.

    PubMed

    Zhao, Xiaoyan; Sze, Sing-Hoi

    2011-05-01

    One strategy to identify transcription factor binding sites is through motif finding in upstream DNA sequences of potentially co-regulated genes. Despite extensive efforts, none of the existing algorithms perform very well. We consider a string representation that allows arbitrary ignored positions within the nonconserved portion of single motifs, and use O(2^l) Markov chains to model the background distributions of motifs of length l while skipping these positions within each Markov chain. By focusing initially on positions that have fixed nucleotides to define core occurrences, we develop an algorithm to identify motifs of moderate lengths. We compare the performance of our algorithm to other motif finding algorithms on a few benchmark data sets, and show that significant improvement in accuracy can be obtained when the sites are sufficiently conserved within a given sample, while comparable performance is obtained when the site conservation rate is low. A software program (PosMotif) and detailed results are available online at http://faculty.cse.tamu.edu/shsze/posmotif.

  7. Multiple Dileucine-like Motifs Direct VGLUT1 Trafficking

    PubMed Central

    Foss, Sarah M.; Li, Haiyan; Santos, Magda S.; Edwards, Robert H.

    2013-01-01

    The vesicular glutamate transporters (VGLUTs) package glutamate into synaptic vesicles, and the two principal isoforms VGLUT1 and VGLUT2 have been suggested to influence the properties of release. To understand how a VGLUT isoform might influence transmitter release, we have studied their trafficking and previously identified a dileucine-like endocytic motif in the C terminus of VGLUT1. Disruption of this motif impairs the activity-dependent recycling of VGLUT1, but does not eliminate its endocytosis. We now report the identification of two additional dileucine-like motifs in the N terminus of VGLUT1 that are not well conserved in the other isoforms. In the absence of all three motifs, rat VGLUT1 shows limited accumulation at synaptic sites and no longer responds to stimulation. In addition, shRNA-mediated knockdown of clathrin adaptor proteins AP-1 and AP-2 shows that the C-terminal motif acts largely via AP-2, whereas the N-terminal motifs use AP-1. Without the C-terminal motif, knockdown of AP-1 reduces the proportion of VGLUT1 that responds to stimulation. VGLUT1 thus contains multiple sorting signals that engage distinct trafficking mechanisms. In contrast to VGLUT1, the trafficking of VGLUT2 depends almost entirely on the conserved C-terminal dileucine-like motif: without this motif, a substantial fraction of VGLUT2 redistributes to the plasma membrane and the transporter's synaptic localization is disrupted. Consistent with these differences in trafficking signals, wild-type VGLUT1 and VGLUT2 differ in their response to stimulation. PMID:23804088

  8. Multiple dileucine-like motifs direct VGLUT1 trafficking.

    PubMed

    Foss, Sarah M; Li, Haiyan; Santos, Magda S; Edwards, Robert H; Voglmaier, Susan M

    2013-06-26

    The vesicular glutamate transporters (VGLUTs) package glutamate into synaptic vesicles, and the two principal isoforms VGLUT1 and VGLUT2 have been suggested to influence the properties of release. To understand how a VGLUT isoform might influence transmitter release, we have studied their trafficking and previously identified a dileucine-like endocytic motif in the C terminus of VGLUT1. Disruption of this motif impairs the activity-dependent recycling of VGLUT1, but does not eliminate its endocytosis. We now report the identification of two additional dileucine-like motifs in the N terminus of VGLUT1 that are not well conserved in the other isoforms. In the absence of all three motifs, rat VGLUT1 shows limited accumulation at synaptic sites and no longer responds to stimulation. In addition, shRNA-mediated knockdown of clathrin adaptor proteins AP-1 and AP-2 shows that the C-terminal motif acts largely via AP-2, whereas the N-terminal motifs use AP-1. Without the C-terminal motif, knockdown of AP-1 reduces the proportion of VGLUT1 that responds to stimulation. VGLUT1 thus contains multiple sorting signals that engage distinct trafficking mechanisms. In contrast to VGLUT1, the trafficking of VGLUT2 depends almost entirely on the conserved C-terminal dileucine-like motif: without this motif, a substantial fraction of VGLUT2 redistributes to the plasma membrane and the transporter's synaptic localization is disrupted. Consistent with these differences in trafficking signals, wild-type VGLUT1 and VGLUT2 differ in their response to stimulation.

  9. SLiMSearch 2.0: biological context for short linear motifs in proteins

    PubMed Central

    Davey, Norman E.; Haslam, Niall J.; Shields, Denis C.

    2011-01-01

    Short, linear motifs (SLiMs) play a critical role in many biological processes. The SLiMSearch 2.0 (Short, Linear Motif Search) web server allows researchers to identify occurrences of a user-defined SLiM in a proteome, using conservation and protein disorder context statistics to rank occurrences. User-friendly output and visualizations of motif context allow the user to quickly gain insight into the validity of a putatively functional motif occurrence. For each motif occurrence, overlapping UniProt features and annotated SLiMs are displayed. Visualization also includes annotated multiple sequence alignments surrounding each occurrence, showing conservation and protein disorder statistics in addition to known and predicted SLiMs, protein domains and known post-translational modifications. In addition, enrichment of Gene Ontology terms and protein interaction partners are provided as indicators of possible motif function. All web server results are available for download. Users can search motifs against the human proteome or a subset thereof defined by Uniprot accession numbers or GO term. The SLiMSearch server is available at: http://bioware.ucd.ie/slimsearch2.html. PMID:21622654

  10. RNA 3D Structural Motifs: Definition, Identification, Annotation, and Database Searching

    NASA Astrophysics Data System (ADS)

    Nasalean, Lorena; Stombaugh, Jesse; Zirbel, Craig L.; Leontis, Neocles B.

    Structured RNA molecules resemble proteins in the hierarchical organization of their global structures, folding and broad range of functions. Structured RNAs are composed of recurrent modular motifs that play specific functional roles. Some motifs direct the folding of the RNA or stabilize the folded structure through tertiary interactions. Others bind ligands or proteins or catalyze chemical reactions. Therefore, it is desirable, starting from the RNA sequence, to be able to predict the locations of recurrent motifs in RNA molecules. Conversely, the potential occurrence of one or more known 3D RNA motifs may indicate that a genomic sequence codes for a structured RNA molecule. To identify known RNA structural motifs in new RNA sequences, precise structure-based definitions are needed that specify the core nucleotides of each motif and their conserved interactions. By comparing instances of each recurrent motif and applying base pair isostericity relations, one can identify neutral mutations that preserve its structure and function in the contexts in which it occurs.

  11. Viral Protein Inhibits RISC Activity by Argonaute Binding through Conserved WG/GW Motifs

    PubMed Central

    García-Chapa, Meritxell; López-Moya, Juan José; Burgyán, József

    2010-01-01

    RNA silencing is an evolutionarily conserved sequence-specific gene-inactivation system that also functions as an antiviral mechanism in higher plants and insects. To overcome antiviral RNA silencing, viruses express silencing-suppressor proteins. These viral proteins can target one or more key points in the silencing machinery. Here we show that in Sweet potato mild mottle virus (SPMMV, type member of the Ipomovirus genus, family Potyviridae), the role of silencing suppressor is played by the P1 protein (the largest serine protease among all known potyvirids) despite the presence in its genome of an HC-Pro protein, which, in potyviruses, acts as the suppressor. Using in vivo studies we have demonstrated that SPMMV P1 inhibits si/miRNA-programmed RISC activity. Inhibition of RISC activity occurs by binding P1 to mature high molecular weight RISC, as we have shown by immunoprecipitation. Our results revealed that P1 targets Argonaute1 (AGO1), the catalytic unit of RISC, and that suppressor/binding activities are localized at the N-terminal half of P1. In this region three WG/GW motifs were found resembling the AGO-binding linear peptide motif conserved in metazoans and plants. Site-directed mutagenesis proved that these three motifs are absolutely required for both binding and suppression of AGO1 function. In contrast to other viral silencing suppressors analyzed so far P1 inhibits both existing and de novo formed AGO1 containing RISC complexes. Thus P1 represents a novel RNA silencing suppressor mechanism. The discovery of the molecular bases of P1 mediated silencing suppression may help to get better insight into the function and assembly of the poorly explored multiprotein containing RISC. PMID:20657820

  12. Ser/Thr Motifs in Transmembrane Proteins: Conservation Patterns and Effects on Local Protein Structure and Dynamics

    PubMed Central

    del Val, Coral; White, Stephen H.

    2014-01-01

    We combined systematic bioinformatics analyses and molecular dynamics simulations to assess the conservation patterns of Ser and Thr motifs in membrane proteins, and the effect of such motifs on the structure and dynamics of α-helical transmembrane (TM) segments. We find that Ser/Thr motifs are often present in β-barrel TM proteins. At least one Ser/Thr motif is present in almost half of the sequences of α-helical proteins analyzed here. The extensive bioinformatics analyses and inspection of protein structures led to the identification of molecular transporters with noticeable numbers of Ser/Thr motifs within the TM region. Given the energetic penalty for burying multiple Ser/Thr groups in the membrane hydrophobic core, the observation of transporters with multiple membrane-embedded Ser/Thr is intriguing and raises the question of how the presence of multiple Ser/Thr affects protein local structure and dynamics. Molecular dynamics simulations of four different Ser-containing model TM peptides indicate that backbone hydrogen bonding of membrane-buried Ser/Thr hydroxyl groups can significantly change the local structure and dynamics of the helix. Ser groups located close to the membrane interface can hydrogen bond to solvent water instead of protein backbone, leading to an enhanced local solvation of the peptide. PMID:22836667

  13. BlockLogo: visualization of peptide and sequence motif conservation

    PubMed Central

    Olsen, Lars Rønn; Kudahl, Ulrich Johan; Simon, Christian; Sun, Jing; Schönbach, Christian; Reinherz, Ellis L.; Zhang, Guang Lan; Brusic, Vladimir

    2013-01-01

    BlockLogo is a web-server application for visualization of protein and nucleotide fragments, continuous protein sequence motifs, and discontinuous sequence motifs using calculation of block entropy from multiple sequence alignments. The user input consists of a multiple sequence alignment, selection of motif positions, type of sequence, and output format definition. The output has BlockLogo along with the sequence logo, and a table of motif frequencies. We deployed BlockLogo as an online application and have demonstrated its utility through examples that show visualization of T-cell epitopes and B-cell epitopes (both continuous and discontinuous). Our additional example shows a visualization and analysis of structural motifs that determine specificity of peptide binding to HLA-DR molecules. The BlockLogo server also employs selected experimentally validated prediction algorithms to enable on-the-fly prediction of MHC binding affinity to 15 common HLA class I and class II alleles as well as visual analysis of discontinuous epitopes from multiple sequence alignments. It enables the visualization and analysis of structural and functional motifs that are usually described as regular expressions. It provides a compact view of discontinuous motifs composed of distant positions within biological sequences. BlockLogo is available at: http://research4.dfci.harvard.edu/cvc/blocklogo/ and http://methilab.bu.edu/blocklogo/ PMID:24001880
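
    The block entropy idea can be illustrated in a few lines. The abstract does not give the exact formula BlockLogo uses, so the following is a sketch under the assumption that block entropy is the Shannon entropy of the frequencies of whole blocks, where a block is the string of residues found at the selected (possibly discontinuous) alignment positions. The function name block_entropy and the toy alignment are hypothetical.

```python
from collections import Counter
from math import log2


def block_entropy(alignment, positions):
    """Shannon entropy (bits) over blocks taken at the given 0-based positions.

    Assumption: each aligned sequence contributes one block, formed by
    concatenating its residues at the selected positions.
    Returns the entropy and a Counter of block frequencies.
    """
    blocks = ["".join(seq[i] for i in positions) for seq in alignment]
    counts = Counter(blocks)
    total = len(blocks)
    entropy = -sum((n / total) * log2(n / total) for n in counts.values())
    return entropy, counts


if __name__ == "__main__":
    # Toy alignment, invented for illustration only.
    alignment = ["ACDE", "ACDF", "GCDE", "GCDF"]
    for positions in ([1, 2], [0, 3]):
        entropy, counts = block_entropy(alignment, positions)
        print(positions, round(entropy, 3), dict(counts))
```

    Treating the selected positions as one unit, rather than summing per-column entropies, is what lets a discontinuous motif be summarized as a single table of block frequencies.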

  14. Plant and yeast cornichon possess a conserved acidic motif required for correct targeting of plasma membrane cargos.

    PubMed

    Rosas-Santiago, Paul; Lagunas-Gomez, Daniel; Yáñez-Domínguez, Carolina; Vera-Estrella, Rosario; Zimmermannová, Olga; Sychrová, Hana; Pantoja, Omar

    2017-10-01

    The export of membrane proteins along the secretory pathway is initiated at the endoplasmic reticulum after proteins are folded and packaged inside this organelle by their recruitment into COPII-coated vesicles. It is proposed that cargo receptors are required for the correct transport of proteins to their target membranes; however, little is known about ER export signals for cargo receptors. Erv14/Cornichon proteins belong to a well-conserved protein family in eukaryotes and have been proposed to function as cargo receptors for many transmembrane proteins. Amino acid sequence alignment showed the presence of a conserved acidic motif in the C terminus of homologues from plants and yeast. Here, we demonstrate that mutation of the C-terminal acidic motif of ScErv14 or OsCNIH1 did not alter the localization of these cargo receptors; however, it affected the proper targeting of the plasma membrane transporters Nha1p, Pdr12p and Qdr2p. Our results suggest that mistargeting of these plasma membrane proteins is a consequence of a weaker interaction between the cargo receptor and cargo proteins caused by the mutation of the C-terminal acidic motif. Copyright © 2017 Elsevier B.V. All rights reserved.

  15. Identity and functions of CxxC-derived motifs.

    PubMed

    Fomenko, Dmitri E; Gladyshev, Vadim N

    2003-09-30

    Two cysteines separated by two other residues (the CxxC motif) are employed by many redox proteins for formation, isomerization, and reduction of disulfide bonds and for other redox functions. The place of the C-terminal cysteine in this motif may be occupied by serine (the CxxS motif), modifying the functional repertoire of redox proteins. Here we found that the CxxC motif may also give rise to a motif, in which the C-terminal cysteine is replaced with threonine (the CxxT motif). Moreover, in contrast to a view that the N-terminal cysteine in the CxxC motif always serves as a nucleophilic attacking group, this residue could also be replaced with threonine (the TxxC motif), serine (the SxxC motif), or other residues. In each of these CxxC-derived motifs, the presence of a downstream alpha-helix was strongly favored. A search for conserved CxxC-derived motif/helix patterns in four complete genomes representing bacteria, archaea, and eukaryotes identified known redox proteins and suggested possible redox functions for several additional proteins. Catalytic sites in peroxiredoxins were major representatives of the TxxC motif, whereas those in glutathione peroxidases represented the CxxT motif. Structural assessments indicated that threonines in these enzymes could stabilize catalytic thiolates, suggesting revisions to previously proposed catalytic triads. Each of the CxxC-derived motifs was also observed in natural selenium-containing proteins, in which selenocysteine was present in place of a catalytic cysteine.
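
    The five sequence patterns named in this abstract translate directly into a simple scan. The sketch below is illustrative only and is not the authors' search procedure: it looks at primary sequence alone and ignores the downstream alpha-helix requirement described above. The function name find_cxxc_derived and the test sequence are hypothetical.

```python
import re

# Motif names as written in the abstract; "x" stands for any residue.
PATTERNS = {
    "CxxC": "C..C",
    "CxxS": "C..S",
    "CxxT": "C..T",
    "TxxC": "T..C",
    "SxxC": "S..C",
}


def find_cxxc_derived(seq):
    """Return sorted (start, motif_name, matched_text) tuples for a protein sequence.

    A lookahead is used so that overlapping occurrences are all reported.
    Start positions are 0-based.
    """
    hits = []
    for name, pattern in PATTERNS.items():
        for match in re.finditer(f"(?=({pattern}))", seq.upper()):
            hits.append((match.start(), name, match.group(1)))
    return sorted(hits)


if __name__ == "__main__":
    # Toy sequence, invented for illustration only.
    for hit in find_cxxc_derived("MACGPCKATPPCV"):
        print(hit)
```

    Because these four-residue patterns are very short, a scan like this will return many hits by chance, which is presumably why the study combined the motif with a structural context.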

  16. Members of the Meloidogyne avirulence protein family contain multiple plant ligand-like motifs.

    PubMed

    Rutter, William B; Hewezi, Tarek; Maier, Tom R; Mitchum, Melissa G; Davis, Eric L; Hussey, Richard S; Baum, Thomas J

    2014-08-01

    Sedentary plant-parasitic nematodes engage in complex interactions with their host plants by secreting effector proteins. Some effectors of both root-knot nematodes (Meloidogyne spp.) and cyst nematodes (Heterodera and Globodera spp.) mimic plant ligand proteins. Most prominently, cyst nematodes secrete effectors that mimic plant CLAVATA3/ESR-related (CLE) ligand proteins. However, only cyst nematodes have been shown to secrete such effectors and to utilize CLE ligand mimicry in their interactions with host plants. Here, we document the presence of ligand-like motifs in bona fide root-knot nematode effectors that are most similar to CLE peptides from plants and cyst nematodes. We have identified multiple tandem CLE-like motifs conserved within the previously identified Meloidogyne avirulence protein (MAP) family that are secreted from root-knot nematodes and have been shown to function in planta. By searching all 12 MAP family members from multiple Meloidogyne spp., we identified 43 repetitive CLE-like motifs composing 14 unique variants. At least one CLE-like motif was conserved in each MAP family member. Furthermore, we documented the presence of other conserved sequences that resemble the variable domains described in Heterodera and Globodera CLE effectors. These findings document that root-knot nematodes appear to use CLE ligand mimicry and point toward a common host node targeted by two evolutionarily diverse groups of nematodes. As a consequence, it is likely that CLE signaling pathways are important in other phytonematode pathosystems as well.

  17. A single amino-acid change in a highly conserved motif of gp41 elicits HIV-1 neutralization and protects against CD4 depletion.

    PubMed

    Petitdemange, Caroline; Achour, Abla; Dispinseri, Stefania; Malet, Isabelle; Sennepin, Alexis; Ho Tsong Fang, Raphaël; Crouzet, Joël; Marcelin, Anne-Geneviève; Calvez, Vincent; Scarlatti, Gabriella; Debré, Patrice; Vieillard, Vincent

    2013-09-01

    The induction of neutralizing antibodies against conserved regions of the human immunodeficiency virus type 1 (HIV-1) envelope protein is a major goal of vaccine strategies. We previously identified 3S, a critical conserved motif of gp41 that induces the NKp44L ligand of an activating NK receptor. In vivo, anti-3S antibodies protect against the natural killer (NK) cell-mediated CD4 depletion that occurs without efficient viral neutralization. Specific substitutions within the 3S peptide motif were prepared by directed mutagenesis. Virus production was monitored by measuring the p24 production. Neutralization assays were performed with immune-purified antibodies from immunized mice and a cohort of HIV-infected patients. Expression of NKp44L on CD4+ T cells and degranulation assay on activating NK cells were both performed by flow cytometry. Here, we show that specific substitutions in the 3S motif reduce viral infection without affecting gp41 production, while decreasing both its capacity to induce NKp44L expression on CD4+ T cells and its sensitivity to autologous NK cells. Generation of antibodies in mice against the W614 specific position in the 3S motif elicited a capacity to neutralize cross-clade viruses, notable in its magnitude, breadth, and durability. Antibodies against this 3S variant were also detected in sera from some HIV-1-infected patients, demonstrating both neutralization activity and protection against CD4 depletion. These findings suggest that a specific substitution in a 3S-based immunogen might allow the generation of specific antibodies, providing a foundation for a rational vaccine that combines a capacity to neutralize HIV-1 and to protect CD4+ T cells.

  18. Detecting DNA regulatory motifs by incorporating positional trends in information content

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kechris, Katherina J.; van Zwet, Erik; Bickel, Peter J.

    2004-05-04

    On the basis of the observation that conserved positions in transcription factor binding sites are often clustered together, we propose a simple extension to the model-based motif discovery methods. We assign position-specific prior distributions to the frequency parameters of the model, penalizing deviations from a specified conservation profile. Examples with both simulated and real data show that this extension helps discover motifs as the data become noisier or when there is a competing false motif.
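
    The conservation profile mentioned above is usually expressed as per-position information content. The sketch below computes only that profile from a set of aligned sites; it does not implement the position-specific priors or the motif discovery model the authors propose. It assumes a uniform background over A, C, G and T and no pseudocounts, and the function name information_content is hypothetical.

```python
from collections import Counter
from math import log2


def information_content(sites, alphabet="ACGT"):
    """Per-position information content (bits) of equal-length aligned sites.

    Assumptions: uniform background, no pseudocounts, and every column
    contains at least one character from the alphabet.
    A fully conserved DNA position scores 2 bits; a uniform one scores 0.
    """
    profile = []
    for column in zip(*sites):
        counts = Counter(column)
        total = sum(counts[base] for base in alphabet)
        ic = log2(len(alphabet))
        for base in alphabet:
            p = counts[base] / total
            if p > 0:
                ic += p * log2(p)
        profile.append(ic)
    return profile


if __name__ == "__main__":
    # Toy binding sites, invented for illustration only.
    sites = ["ACGT", "ACGA", "ACCT", "ACTC"]
    print([round(value, 2) for value in information_content(sites)])
```

    In a profile like this, runs of adjacent high-scoring positions correspond to the clustering of conserved positions that the abstract takes as its starting observation.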

  19. Transcriptional regulation of human eosinophil RNases by an evolutionary-conserved sequence motif in primate genome

    PubMed Central

    Wang, Hsiu-Yu; Chang, Hao-Teng; Pai, Tun-Wen; Wu, Chung-I; Lee, Yuan-Hung; Chang, Yen-Hsin; Tai, Hsiu-Ling; Tang, Chuan-Yi; Chou, Wei-Yao; Chang, Margaret Dah-Tsyr

    2007-01-01

    Background Human eosinophil-derived neurotoxin (edn) and eosinophil cationic protein (ecp) are members of a subfamily of primate ribonuclease (rnase) genes. Although they arose from a gene duplication event, distinct edn and ecp expression profiles in various tissues have been reported. Results In this study, we obtained the upstream promoter sequences of several representative primate eosinophil rnases. Bioinformatic analysis revealed the presence of a shared 34-nucleotide (nt) sequence stretch located at -81 to -48 in all edn promoters and the macaque ecp promoter. Such a unique sequence motif constituted a region essential for transactivation of human edn in hepatocellular carcinoma cells. Gel electrophoretic mobility shift assay, transient transfection and scanning mutagenesis experiments allowed us to identify binding sites for two transcription factors, Myc-associated zinc finger protein (MAZ) and SV-40 protein-1 (Sp1), within the 34-nt segment. Subsequent in vitro and in vivo binding assays demonstrated a direct molecular interaction between this 34-nt region and MAZ and Sp1. Interestingly, overexpression of MAZ and Sp1 respectively repressed and enhanced edn promoter activity. The regulatory transactivation motif was mapped to the evolutionarily conserved -74/-65 region of the edn promoter, which was guanidine-rich and critical for recognition by both transcription factors. Conclusion Our results provide the first direct evidence that MAZ and Sp1 play important roles in the transcriptional activation of the human edn promoter through specific binding to a 34-nt segment present in representative primate eosinophil rnase promoters. PMID:17927842

  20. Conserved Tryptophan Motifs in the Large Tegument Protein pUL36 Are Required for Efficient Secondary Envelopment of Herpes Simplex Virus Capsids

    PubMed Central

    Ivanova, Lyudmila; Buch, Anna; Döhner, Katinka; Pohlmann, Anja; Binz, Anne; Prank, Ute; Sandbaumhüter, Malte

    2016-01-01

    ABSTRACT Herpes simplex virus (HSV) replicates in the skin and mucous membranes, and initiates lytic or latent infections in sensory neurons. Assembly of progeny virions depends on the essential large tegument protein pUL36 of 3,164 amino acid residues that links the capsids to the tegument proteins pUL37 and VP16. Of the 32 tryptophans of HSV-1-pUL36, the tryptophan-acidic motifs 1766WD1767 and 1862WE1863 are conserved in all HSV-1 and HSV-2 isolates. Here, we characterized the role of these motifs in the HSV life cycle since the rare tryptophans often have unique roles in protein function due to their large hydrophobic surface. The infectivity of the mutants HSV-1(17+)Lox-pUL36-WD/AA-WE/AA and HSV-1(17+)Lox-CheVP26-pUL36-WD/AA-WE/AA, in which the capsid has been tagged with the fluorescent protein Cherry, was significantly reduced. Quantitative electron microscopy shows that there were a larger number of cytosolic capsids and fewer enveloped virions compared to their respective parental strains, indicating a severe impairment in secondary capsid envelopment. The capsids of the mutant viruses accumulated in the perinuclear region around the microtubule-organizing center and were not dispersed to the cell periphery but still acquired the inner tegument proteins pUL36 and pUL37. Furthermore, cytoplasmic capsids colocalized with tegument protein VP16 and, to some extent, with tegument protein VP22 but not with the envelope glycoprotein gD. These results indicate that the unique conserved tryptophan-acidic motifs in the central region of pUL36 are required for efficient targeting of progeny capsids to the membranes of secondary capsid envelopment and for efficient virion assembly. IMPORTANCE Herpesvirus infections give rise to severe animal and human diseases, especially in young, immunocompromised, and elderly individuals. The structural hallmark of herpesvirus virions is the tegument, which contains evolutionarily conserved proteins that are essential for several

  1. Discovering Motifs in Biological Sequences Using the Micron Automata Processor.

    PubMed

    Roy, Indranil; Aluru, Srinivas

    2016-01-01

    Finding approximately conserved sequences, called motifs, across multiple DNA or protein sequences is an important problem in computational biology. In this paper, we consider the (l, d) motif search problem of identifying one or more motifs of length l present in at least q of the n given sequences, with each occurrence differing from the motif in at most d substitutions. The problem is known to be NP-complete, and the largest solved instance reported to date is (26,11). We propose a novel algorithm for the (l, d) motif search problem using streaming execution over a large set of non-deterministic finite automata (NFA). This solution is designed to take advantage of the Micron Automata Processor, a new technology close to deployment that can execute multiple NFA in parallel. We demonstrate the capability for solving much larger instances of the (l, d) motif search problem using the resources available within a single automata processor board, by estimating run-times for problem instances (39,18) and (40,17). The paper serves as a useful guide to solving problems using this new accelerator technology.
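
    The (l, d) problem statement in this abstract can be made concrete with a small brute-force sketch. This is not the NFA-based algorithm the paper proposes; it simply enumerates every candidate l-mer, which is only feasible for very small l. The function names and the DNA alphabet default are illustrative assumptions.

```python
from itertools import product


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def occurs_within(seq, motif, d):
    """True if some window of seq differs from motif in at most d substitutions."""
    l = len(motif)
    return any(hamming(seq[i:i + l], motif) <= d
               for i in range(len(seq) - l + 1))


def ld_motif_search(seqs, l, d, q=None, alphabet="ACGT"):
    """Brute-force (l, d) search: return every l-mer over `alphabet` that
    occurs, with at most d substitutions, in at least q of the sequences.
    q defaults to all sequences. Cost grows as len(alphabet) ** l."""
    if q is None:
        q = len(seqs)
    hits = []
    for cand in product(alphabet, repeat=l):
        motif = "".join(cand)
        support = sum(occurs_within(s, motif, d) for s in seqs)
        if support >= q:
            hits.append(motif)
    return hits
```

    For example, with the toy input ["TTACGTAA", "GGACCTCC", "CCAGGTTT"], l = 4 and d = 1, the candidate ACGT is reported because each sequence contains a 4-mer within one substitution of it, while d = 0 yields no motif shared by all three.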

  2. Ni2+-binding RNA motifs with an asymmetric purine-rich internal loop and a G-A base pair.

    PubMed Central

    Hofmann, H P; Limmer, S; Hornung, V; Sprinzl, M

    1997-01-01

    RNA molecules with high affinity for immobilized Ni2+ were isolated from an RNA pool with 50 randomized positions by in vitro selection-amplification. The selected RNAs preferentially bind Ni2+ and Co2+ over other cations from first series transition metals. Conserved structure motifs, comprising about 15 nt, were identified that are likely to represent the Ni2+ binding sites. Two conserved motifs contain an asymmetric purine-rich internal loop and probably a mismatch G-A base pair. The structure of one of these motifs was studied with proton NMR spectroscopy and formation of the G-A pair at the junction of helix and internal loop was demonstrated. Using Ni2+ as a paramagnetic probe, a divalent metal ion binding site near this G-A base pair was identified. Ni2+ ions bound to this motif exert a specific stabilization effect. We propose that small asymmetric purine-rich loops that contain a G-A interaction may represent a divalent metal ion binding site in RNA. PMID:9409620

  3. Motif types, motif locations and base composition patterns around the RNA polyadenylation site in microorganisms, plants and animals

    PubMed Central

    2014-01-01

    Background The polyadenylation of RNA is critical for gene functioning, but the conserved sequence motifs (often called signal or signature motifs), motif locations and abundances, and base composition patterns around mRNA polyadenylation [poly(A)] sites are still uncharacterized in most species. The evolutionary tendency for poly(A) site selection is still largely unknown. Results We analyzed the poly(A) site regions of 31 species or phyla. Different groups of species showed different poly(A) signal motifs: UUACUU at the poly(A) site in the parasite Trypanosoma cruzi; UGUAAC (approximately 13 bases upstream of the site) in the alga Chlamydomonas reinhardtii; UGUUUG (or UGUUUGUU) at mainly the fourth base downstream of the poly(A) site in the parasite Blastocystis hominis; and AAUAAA at approximately 16 bases and approximately 19 bases upstream of the poly(A) site in animals and plants, respectively. Polyadenylation signal motifs are usually several hundred times more abundant around poly(A) sites than in whole genomes. These predominant motifs usually had very specific locations, whether upstream of, at, or downstream of poly(A) sites, depending on the species or phylum. The poly(A) site was usually an adenosine (A) in all analyzed species except for B. hominis, and there was weak A predominance in C. reinhardtii. Fungi, animals, plants, and the protist Phytophthora infestans shared a general base abundance pattern (or base composition pattern) of “U-rich—A-rich—U-rich—Poly(A) site—U-rich regions”, or U-A-U-A-U for short, with some variation for each kingdom or subkingdom. Conclusion This study identified the poly(A) signal motifs, motif locations, and base composition patterns around mRNA poly(A) sites in protists, fungi, plants, and animals and provided insight into poly(A) site evolution. PMID:25052519

  4. Multiple activities of the plant pathogen type III effector proteins WtsE and AvrE require WxxxE motifs.

    PubMed

    Ham, Jong Hyun; Majerczak, Doris R; Nomura, Kinya; Mecey, Christy; Uribe, Francisco; He, Sheng-Yang; Mackey, David; Coplin, David L

    2009-06-01

    The broadly conserved AvrE-family of type III effectors from gram-negative plant-pathogenic bacteria includes important virulence factors, yet little is known about the mechanisms by which these effectors function inside plant cells to promote disease. We have identified two conserved motifs in AvrE-family effectors: a WxxxE motif and a putative C-terminal endoplasmic reticulum membrane retention/retrieval signal (ERMRS). The WxxxE and ERMRS motifs are both required for the virulence activities of WtsE and AvrE, which are major virulence factors of the corn pathogen Pantoea stewartii subsp. stewartii and the tomato or Arabidopsis pathogen Pseudomonas syringae pv. tomato, respectively. The WxxxE and the predicted ERMRS motifs are also required for other biological activities of WtsE, including elicitation of the hypersensitive response in nonhost plants and suppression of defense responses in Arabidopsis. A family of type III effectors from mammalian bacterial pathogens requires WxxxE and subcellular targeting motifs for virulence functions that involve their ability to mimic activated G-proteins. The conservation of related motifs and their necessity for the function of type III effectors from plant pathogens indicates that disturbing host pathways by mimicking activated host G-proteins may be a virulence mechanism employed by plant pathogens as well.

  5. WebMOTIFS: automated discovery, filtering and scoring of DNA sequence motifs using multiple programs and Bayesian approaches

    PubMed Central

    Romer, Katherine A.; Kayombya, Guy-Richard; Fraenkel, Ernest

    2007-01-01

    WebMOTIFS provides a web interface that facilitates the discovery and analysis of DNA-sequence motifs. Several studies have shown that the accuracy of motif discovery can be significantly improved by using multiple de novo motif discovery programs and using randomized control calculations to identify the most significant motifs or by using Bayesian approaches. WebMOTIFS makes it easy to apply these strategies. Using a single submission form, users can run several motif discovery programs and score, cluster and visualize the results. In addition, the Bayesian motif discovery program THEME can be used to determine the class of transcription factors that is most likely to regulate a set of sequences. Input can be provided as a list of gene or probe identifiers. Used with the default settings, WebMOTIFS accurately identifies biologically relevant motifs from diverse data in several species. WebMOTIFS is freely available at http://fraenkel.mit.edu/webmotifs. PMID:17584794

  6. Identifying the scale-dependent motifs in atmospheric surface layer by ordinal pattern analysis

    NASA Astrophysics Data System (ADS)

    Li, Qinglei; Fu, Zuntao

    2018-07-01

    Ramp-like structures in various atmospheric surface layer time series have been long studied, but the presence of motifs with the finer scale embedded within larger scale ramp-like structures has largely been overlooked in the reported literature. Here a novel, objective and well-adapted methodology, the ordinal pattern analysis, is adopted to study the finer-scaled motifs in atmospheric boundary-layer (ABL) time series. The studies show that the motifs represented by different ordinal patterns take clustering properties and 6 dominated motifs out of the whole 24 motifs account for about 45% of the time series under particular scales, which indicates the higher contribution of motifs with the finer scale to the series. Further studies indicate that motif statistics are similar for both stable conditions and unstable conditions at larger scales, but large discrepancies are found at smaller scales, and the frequencies of motifs "1234" and/or "4321" are a bit higher under stable conditions than unstable conditions. Under stable conditions, there are great changes for the occurrence frequencies of motifs "1234" and "4321", where the occurrence frequencies of motif "1234" decrease from nearly 24% to 4.5% with the scale factor increasing, and the occurrence frequencies of motif "4321" change nonlinearly with the scale increasing. These great differences of dominated motifs change with scale can be taken as an indicator to quantify the flow structure changes under different stability conditions, and motif entropy can be defined just by only 6 dominated motifs to quantify this time-scale independent property of the motifs. All these results suggest that the defined scale of motifs with the finer scale should be carefully taken into consideration in the interpretation of turbulence coherent structures.
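
    The ordinal-pattern idea described in this abstract can be sketched as follows: each window of four samples is replaced by the rank order of its values, giving 4! = 24 possible motifs such as "1234" (monotonic rise) and "4321" (monotonic fall). The rank encoding, the tie-breaking rule, and the reading of "scale" as a sampling lag are assumptions made for illustration, not details taken from the paper.

```python
from collections import Counter


def ordinal_pattern(window):
    """Encode a window as the ranks of its values (smallest value -> 1).
    Ties are broken by position, earlier sample first (an assumption)."""
    order = sorted(range(len(window)), key=lambda i: window[i])
    ranks = [0] * len(window)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return "".join(str(r) for r in ranks)


def motif_frequencies(series, m=4, scale=1):
    """Relative frequency of each ordinal pattern of order m.
    `scale` is treated here as the lag between the samples in a window."""
    span = (m - 1) * scale
    counts = Counter(
        ordinal_pattern(series[i:i + span + 1:scale])
        for i in range(len(series) - span)
    )
    total = sum(counts.values())
    return {p: c / total for p, c in counts.items()} if total else {}
```

    Comparing the output of motif_frequencies at several values of scale is one way to look at how the share of "1234" and "4321" changes with scale, in the spirit of the analysis the abstract describes.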

  7. Discriminative motif discovery via simulated evolution and random under-sampling.

    PubMed

    Song, Tao; Gu, Hong

    2014-01-01

    Conserved motifs in biological sequences are closely related to their structure and functions. Recently, discriminative motif discovery methods have attracted more and more attention. However, little attention has been devoted to the data imbalance problem, which is one of the main reasons affecting the performance of the discriminative models. In this article, a simulated evolution method is applied to solve the multi-class imbalance problem at the stage of data preprocessing, and at the stage of Hidden Markov Models (HMMs) training, a random under-sampling method is introduced for the imbalance between the positive and negative datasets. It is shown that, in the task of discovering targeting motifs of nine subcellular compartments, the motifs found by our method are more conserved than the methods without considering data imbalance problem and recover the most known targeting motifs from Minimotif Miner and InterPro. Meanwhile, we use the found motifs to predict protein subcellular localization and achieve higher prediction precision and recall for the minority classes.

  8. A Gibbs sampler for motif detection in phylogenetically close sequences

    NASA Astrophysics Data System (ADS)

    Siddharthan, Rahul; van Nimwegen, Erik; Siggia, Eric

    2004-03-01

    Genes are regulated by transcription factors that bind to DNA upstream of genes and recognize short conserved "motifs" in a random intergenic "background". Motif-finders such as the Gibbs sampler compare the probability of these short sequences being represented by "weight matrices" to the probability of their arising from the background "null model", and explore this space (analogous to a free-energy landscape). But closely related species may show conservation not because of functional sites but simply because they have not had sufficient time to diverge, so conventional methods will fail. We introduce a new Gibbs sampler algorithm that accounts for common ancestry when searching for motifs, while requiring minimal "prior" assumptions on the number and types of motifs, assessing the significance of detected motifs by "tracking" clusters that stay together. We apply this scheme to motif detection in sporulation-cycle genes in the yeast S. cerevisiae, using recent sequences of other closely-related Saccharomyces species.

  9. Overlapping ETS and CRE Motifs (G/CCGGAAGTGACGTCA) Preferentially Bound by GABPα and CREB Proteins

    PubMed Central

    Chatterjee, Raghunath; Zhao, Jianfei; He, Ximiao; Shlyakhtenko, Andrey; Mann, Ishminder; Waterfall, Joshua J.; Meltzer, Paul; Sathyanarayana, B. K.; FitzGerald, Peter C.; Vinson, Charles

    2012-01-01

    Previously, we identified 8-bps long DNA sequences (8-mers) that localize in human proximal promoters and grouped them into known transcription factor binding sites (TFBS). We now examine split 8-mers consisting of two 4-mers separated by 1-bp to 30-bps (X4-N1-30-X4) to identify pairs of TFBS that localize in proximal promoters at a precise distance. These include two overlapping TFBS: the ETS⇔ETS motif (C/GCCGGAAGCGGAA) and the ETS⇔CRE motif (C/GCGGAAGTGACGTCAC). The nucleotides in bold are part of both TFBS. Molecular modeling shows that the ETS⇔CRE motif can be bound simultaneously by both the ETS and the B-ZIP domains without protein-protein clashes. The electrophoretic mobility shift assay (EMSA) shows that the ETS protein GABPα and the B-ZIP protein CREB preferentially bind to the ETS⇔CRE motif only when the two TFBS overlap precisely. In contrast, the ETS domain of ETV5 and CREB interfere with each other for binding the ETS⇔CRE. The 11-mer (CGGAAGTGACG), the conserved part of the ETS⇔CRE motif, occurs 226 times in the human genome and 83% are in known regulatory regions. In vivo GABPα and CREB ChIP-seq peaks identified the ETS⇔CRE as the most enriched motif occurring in promoters of genes involved in mRNA processing, cellular catabolic processes, and stress response, suggesting that a specific class of genes is regulated by this composite motif. PMID:23050235
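
    The split 8-mer notation X4-N1-30-X4 in this abstract (two 4-mers separated by 1 to 30 unspecified bases) can be illustrated with a short counting sketch. It scans a single strand only and does not reproduce the authors' genome-wide pipeline; the function name and parameters are hypothetical.

```python
from collections import Counter


def split_kmer_gaps(seq, left, right, min_gap=1, max_gap=30):
    """Count occurrences of `left` followed by `right` for each gap length
    in [min_gap, max_gap]. Returns {gap: count}; forward strand only."""
    counts = Counter()
    k = len(left)
    start = seq.find(left)
    while start != -1:
        for gap in range(min_gap, max_gap + 1):
            r = start + k + gap
            if seq[r:r + len(right)] == right:
                counts[gap] += 1
        # Advance by one base so overlapping occurrences of `left` are found.
        start = seq.find(left, start + 1)
    return dict(counts)
```

    The 11-mer CGGAAGTGACG quoted in the abstract reads as CGGA, a 3-bp gap, then GACG, so a sequence containing it contributes one count at gap 3 for that pair of 4-mers.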

  10. A conserved C-terminal RXG motif in the NgBR subunit of cis-prenyltransferase is critical for prenyltransferase activity.

    PubMed

    Grabińska, Kariona A; Edani, Ban H; Park, Eon Joo; Kraehling, Jan R; Sessa, William C

    2017-10-20

    cis-Prenyltransferases (cis-PTs) constitute a large family of enzymes conserved during evolution and present in all domains of life. In eukaryotes and archaea, cis-PT is the first enzyme committed to the synthesis of dolichyl phosphate, an obligate lipid carrier in protein glycosylation reactions. The homodimeric bacterial enzyme, undecaprenyl diphosphate synthase, generates 11 isoprene units and has been structurally and mechanistically characterized in great detail. Recently, we discovered that unlike undecaprenyl diphosphate synthase, mammalian cis-PT is a heteromer consisting of NgBR (Nus1) and hCIT (dehydrodolichol diphosphate synthase) subunits, and this composition has been confirmed in plants and fungal cis-PTs. Here, we establish the first purification system for heteromeric cis-PT and show that both NgBR and hCIT subunits function in catalysis and substrate binding. Finally, we identified a critical RXG sequence in the C-terminal tail of NgBR that is conserved and essential for enzyme activity across phyla. In summary, our findings show that eukaryotic cis-PT is composed of the NgBR and hCIT subunits. The strong conservation of the RXG motif among NgBR orthologs indicates that this subunit is critical for the synthesis of polyprenol diphosphates and cellular function. © 2017 by The American Society for Biochemistry and Molecular Biology, Inc.

  11. Automatic annotation of protein motif function with Gene Ontology terms.

    PubMed

    Lu, Xinghua; Zhai, Chengxiang; Gopalakrishnan, Vanathi; Buchanan, Bruce G

    2004-09-02

    Conserved protein sequence motifs are short stretches of amino acid sequence patterns that potentially encode the function of proteins. Several sequence pattern searching algorithms and programs exist for identifying candidate protein motifs at the whole genome level. However, a much needed and important task is to determine the functions of the newly identified protein motifs. The Gene Ontology (GO) project is an endeavor to annotate the function of genes or protein sequences with terms from a dynamic, controlled vocabulary and these annotations serve well as a knowledge base. This paper presents methods to mine the GO knowledge base and use the association between the GO terms assigned to a sequence and the motifs matched by the same sequence as evidence for predicting the functions of novel protein motifs automatically. The task of assigning GO terms to protein motifs is viewed as both a binary classification and information retrieval problem, where PROSITE motifs are used as samples for model training and functional prediction. The mutual information of a motif and a GO term association is found to be a very useful feature. We take advantage of the known motifs to train a logistic regression classifier, which allows us to combine mutual information with other frequency-based features and obtain a probability of correct association. The trained logistic regression model has intuitively meaningful and logically plausible parameter values, and performs very well empirically according to our evaluation criteria. In this research, different methods for automatic annotation of protein motifs have been investigated. Empirical results demonstrated that the methods have a great potential for detecting and augmenting information about the functions of newly discovered candidate protein motifs.
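
    The mutual-information feature mentioned in this abstract can be illustrated with the standard definition for two binary variables: whether a sequence matches the motif, and whether it carries the GO term. The sketch below computes it from a 2x2 table of sequence counts. The argument names are hypothetical, and the paper's exact estimator and any smoothing it applies are not given in the abstract.

```python
from math import log2


def mutual_information(n11, n10, n01, n00):
    """Mutual information (in bits) between motif presence and GO-term
    assignment, from sequence counts:
      n11: motif and term,  n10: motif only,
      n01: term only,       n00: neither."""
    n = n11 + n10 + n01 + n00
    if n == 0:
        return 0.0
    joint = {(1, 1): n11, (1, 0): n10, (0, 1): n01, (0, 0): n00}
    p_motif = {1: (n11 + n10) / n, 0: (n01 + n00) / n}
    p_term = {1: (n11 + n01) / n, 0: (n10 + n00) / n}
    mi = 0.0
    for (m, t), count in joint.items():
        if count == 0:
            continue  # a zero cell contributes nothing to the sum
        p = count / n
        mi += p * log2(p / (p_motif[m] * p_term[t]))
    return mi
```

    A motif that always co-occurs with a term in a balanced sample gives 1 bit, and a motif that is statistically independent of the term gives 0 bits, which is why the value is a natural input feature for a classifier such as the logistic regression the abstract mentions.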

  12. Classification and assessment tools for structural motif discovery algorithms.

    PubMed

    Badr, Ghada; Al-Turaiki, Isra; Mathkour, Hassan

    2013-01-01

    Motif discovery is the problem of finding recurring patterns in biological data. Patterns can be sequential, mainly when discovered in DNA sequences. They can also be structural (e.g. when discovering RNA motifs). Finding common structural patterns helps to gain a better understanding of the mechanism of action (e.g. post-transcriptional regulation). Unlike DNA motifs, which are sequentially conserved, RNA motifs exhibit conservation in structure, which may be common even if the sequences are different. Over the past few years, hundreds of algorithms have been developed to solve the sequential motif discovery problem, while less work has been done for the structural case. In this paper, we survey, classify, and compare different algorithms that solve the structural motif discovery problem, where the underlying sequences may be different. We highlight their strengths and weaknesses. We start by proposing a benchmark dataset and a measurement tool that can be used to evaluate different motif discovery approaches. Then, we proceed by proposing our experimental setup. Finally, results are obtained using the proposed benchmark to compare available tools. To the best of our knowledge, this is the first attempt to compare tools solely designed for structural motif discovery. Results show that the accuracy of discovered motifs is relatively low. The results also suggest a complementary behavior among tools where some tools perform well on simple structures, while other tools are better for complex structures. We have classified and evaluated the performance of available structural motif discovery tools. In addition, we have proposed a benchmark dataset with tools that can be used to evaluate newly developed tools.

  13. MIR@NT@N: a framework integrating transcription factors, microRNAs and their targets to identify sub-network motifs in a meta-regulation network model

    PubMed Central

    2011-01-01

    Background To understand biological processes and diseases, it is crucial to unravel the concerted interplay of transcription factors (TFs), microRNAs (miRNAs) and their targets within regulatory networks and fundamental sub-networks. An integrative computational resource generating a comprehensive view of these regulatory molecular interactions at a genome-wide scale would be of great interest to biologists, but is not available to date. Results To identify and analyze molecular interaction networks, we developed MIR@NT@N, an integrative approach based on a meta-regulation network model and a large-scale database. MIR@NT@N uses a graph-based approach to predict novel molecular actors across multiple regulatory processes (i.e. TFs acting on protein-coding or miRNA genes, or miRNAs acting on messenger RNAs). Exploiting these predictions, the user can generate networks and further analyze them to identify sub-networks, including motifs such as feedback and feedforward loops (FBL and FFL). In addition, networks can be built from lists of molecular actors with an a priori role in a given biological process to predict novel and unanticipated interactions. Analyses can be contextualized and filtered by integrating additional information such as microarray expression data. All results, including generated graphs, can be visualized, saved and exported into various formats. MIR@NT@N performances have been evaluated using published data and then applied to the regulatory program underlying epithelium to mesenchyme transition (EMT), an evolutionary-conserved process which is implicated in embryonic development and disease. Conclusions MIR@NT@N is an effective computational approach to identify novel molecular regulations and to predict gene regulatory networks and sub-networks including conserved motifs within a given biological context. 
    Taking advantage of the M@IA environment, MIR@NT@N is a user-friendly web resource freely available at http://mironton.uni.lu which will be

  14. [Prediction of Promoter Motifs in Virophages].

    PubMed

    Gong, Chaowen; Zhou, Xuewen; Pan, Yingjie; Wang, Yongjie

    2015-07-01

    Virophages have crucial roles in ecosystems and are the transport vectors of genetic materials. To shed light on regulation and control mechanisms in virophage-host systems as well as evolution between virophages and their hosts, the promoter motifs of virophages were predicted on the upstream regions of start codons using an analytical tool for prediction of promoter motifs: Multiple EM for Motif Elicitation. Seventeen potential promoter motifs were identified based on the E-value, location, number and length of promoters in genomes. Sputnik and zamilon motif 2 with AT-rich regions were distributed widely on genomes, suggesting that these motifs may be associated with regulation of the expression of various genes. Motifs containing the TCTA box were predicted to be late promoter motif in mavirus; motifs containing the ATCT box were the potential late promoter motif in the Ace Lake mavirus. AT-rich regions were identified on motif 2 in the Organic Lake virophage, motif 3 in Yellowstone Lake virophage (YSLV)1 and 2, motif 1 in YSLV3, and motif 1 and 2 in YSLV4, respectively. AT-rich regions were distributed widely on the genomes of virophages. All of these motifs may be promoter motifs of virophages. Our results provide insights into further exploration of temporal expression of genes in virophages as well as associations between virophages and giant viruses.

  15. Gibbs motif sampling: detection of bacterial outer membrane protein repeats.

    PubMed Central

    Neuwald, A. F.; Liu, J. S.; Lawrence, C. E.

    1995-01-01

    The detection and alignment of locally conserved regions (motifs) in multiple sequences can provide insight into protein structure, function, and evolution. A new Gibbs sampling algorithm is described that detects motif-encoding regions in sequences and optimally partitions them into distinct motif models; this is illustrated using a set of immunoglobulin fold proteins. When applied to sequences sharing a single motif, the sampler can be used to classify motif regions into related submodels, as is illustrated using helix-turn-helix DNA-binding proteins. Other statistically based procedures are described for searching a database for sequences matching motifs found by the sampler. When applied to a set of 32 very distantly related bacterial integral outer membrane proteins, the sampler revealed that they share a subtle, repetitive motif. Although BLAST (Altschul SF et al., 1990, J Mol Biol 215:403-410) fails to detect significant pairwise similarity between any of the sequences, the repeats present in these outer membrane proteins, taken as a whole, are highly significant (based on a generally applicable statistical test for motifs described here). Analysis of bacterial porins with known trimeric beta-barrel structure and related proteins reveals a similar repetitive motif corresponding to alternating membrane-spanning beta-strands. These beta-strands occur on the membrane interface (as opposed to the trimeric interface) of the beta-barrel. The broad conservation and structural location of these repeats suggests that they play important functional roles. PMID:8520488

  16. DLocalMotif: a discriminative approach for discovering local motifs in protein sequences.

    PubMed

    Mehdi, Ahmed M; Sehgal, Muhammad Shoaib B; Kobe, Bostjan; Bailey, Timothy L; Bodén, Mikael

    2013-01-01

    Local motifs are patterns of DNA or protein sequences that occur within a sequence interval relative to a biologically defined anchor or landmark. Current protein motif discovery methods do not adequately consider such constraints to identify biologically significant motifs that are only weakly over-represented but spatially confined. Using negatives, i.e. sequences known to not contain a local motif, can further increase the specificity of their discovery. This article introduces the method DLocalMotif that makes use of positional information and negative data for local motif discovery in protein sequences. DLocalMotif combines three scoring functions, measuring degrees of motif over-representation, entropy and spatial confinement, specifically designed to discriminatively exploit the availability of negative data. The method is shown to outperform current methods that use only a subset of these motif characteristics. We apply the method to several biological datasets. The analysis of peroxisomal targeting signals uncovers several novel motifs that occur immediately upstream of the dominant peroxisomal targeting signal-1 signal. The analysis of proline-tyrosine nuclear localization signals uncovers multiple novel motifs that overlap with C2H2 zinc finger domains. We also evaluate the method on classical nuclear localization signals and endoplasmic reticulum retention signals and find that DLocalMotif successfully recovers biologically relevant sequence properties. http://bioinf.scmb.uq.edu.au/dlocalmotif/

  17. The ARTT motif and a unified structural understanding of substrate recognition in ADP ribosylating bacterial toxins and eukaryotic ADP ribosyltransferases

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Han, S.; Tainer, J.A.

    2001-08-01

    ADP-ribosylation is a widely occurring and biologically critical covalent chemical modification process in pathogenic mechanisms, intracellular signaling systems, DNA repair, and cell division. The reaction is catalyzed by ADP-ribosyltransferases, which transfer the ADP-ribose moiety of NAD to a target protein with nicotinamide release. A family of bacterial toxins and eukaryotic enzymes has been termed the mono-ADP-ribosyltransferases, in distinction to the poly-ADP-ribosyltransferases, which catalyze the addition of multiple ADP-ribose groups to the carboxyl terminus of eukaryotic nucleoproteins. Despite the limited primary sequence homology among the different ADP-ribosyltransferases, a central cleft bearing NAD-binding pocket formed by the two perpendicular β-sheet core has been remarkably conserved between bacterial toxins and eukaryotic mono- and poly-ADP-ribosyltransferases. The majority of bacterial toxins and eukaryotic mono-ADP-ribosyltransferases are characterized by conserved His and catalytic Glu residues. In contrast, Diphtheria toxin, Pseudomonas exotoxin A, and eukaryotic poly-ADP-ribosyltransferases are characterized by conserved Arg and catalytic Glu residues. The NAD-binding core of a binary toxin and a C3-like toxin family identified an ARTT motif (ADP-ribosylating turn-turn motif) that is implicated in substrate specificity and recognition by structural and mutagenic studies. Here we apply structure-based sequence alignment and comparative structural analyses of all known structures of ADP-ribosyltransferases to suggest that this ARTT motif is functionally important in many ADP-ribosylating enzymes that bear a NAD binding cleft as characterized by conserved Arg and catalytic Glu residues. Overall, structure-based sequence analysis reveals common core structures and conserved active sites of ADP-ribosyltransferases to support similar NAD binding mechanisms but differing mechanisms of target protein binding via sequence variations within

  18. A generic motif discovery algorithm for sequential data.

    PubMed

    Jensen, Kyle L; Styczynski, Mark P; Rigoutsos, Isidore; Stephanopoulos, Gregory N

    2006-01-01

    Motif discovery in sequential data is a problem of great interest and with many applications. However, previous methods have been unable to combine exhaustive search with complex motif representations and are each typically only applicable to a certain class of problems. Here we present a generic motif discovery algorithm (Gemoda) for sequential data. Gemoda can be applied to any dataset with a sequential character, including both categorical and real-valued data. As we show, Gemoda deterministically discovers motifs that are maximal in composition and length. The algorithm also allows any choice of similarity metric for finding motifs. Finally, Gemoda's output motifs are representation-agnostic: they can be represented using regular expressions, position weight matrices or any number of other models for any type of sequential data. We demonstrate a number of applications of the algorithm, including the discovery of motifs in amino acid sequences, a new solution to the (l,d)-motif problem in DNA sequences and the discovery of conserved protein substructures. Gemoda is freely available at http://web.mit.edu/bamel/gemoda
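
    The (l,d)-motif problem mentioned in this abstract asks for a pattern of length l that occurs in every input sequence with at most d mismatches. The sketch below is a brute-force illustration of that problem statement only; it is not Gemoda and makes no claim about how Gemoda works. The function names and the toy sequences are hypothetical, and the exhaustive search over all 4^l candidates is practical only for small l.

```python
from itertools import product


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def occurs_within(candidate, seq, d):
    """True if some window of seq is within d mismatches of candidate."""
    l = len(candidate)
    return any(
        hamming(candidate, seq[i:i + l]) <= d
        for i in range(len(seq) - l + 1)
    )


def ld_motifs(sequences, l, d):
    """Brute force: every DNA l-mer found in all sequences with <= d mismatches."""
    found = []
    for letters in product("ACGT", repeat=l):
        candidate = "".join(letters)
        if all(occurs_within(candidate, s, d) for s in sequences):
            found.append(candidate)
    return found


# Toy input (hypothetical): ACGT occurs exactly once and with one mismatch twice.
toy = ["TTACGTAA", "GGACCTGG", "CCAGGTCC"]
motifs = ld_motifs(toy, l=4, d=1)
```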

  19. Motif discovery and motif finding from genome-mapped DNase footprint data.

    PubMed

    Kulakovskiy, Ivan V; Favorov, Alexander V; Makeev, Vsevolod J

    2009-09-15

    Footprint data is an important source of information on transcription factor recognition motifs. However, a footprinting fragment can contain no sequences similar to known protein recognition sites. Inspection of genome fragments nearby can help to identify missing site positions. Genome fragments containing footprints were supplied to a pipeline that constructed a position weight matrix (PWM) for different motif lengths and selected the optimal PWM. Fragments were aligned with the SeSiMCMC sampler and a new heuristic algorithm, Bigfoot. Footprints with missing hits were found for approximately 50% of factors. Adding only 2 bp on both sides of a footprinting fragment recovered most hits. We automatically constructed motifs for 41 Drosophila factors. New motifs can recognize footprints with a greater sensitivity at the same false positive rate than existing models. Also we discuss possible overfitting of constructed motifs. Software and the collection of regulatory motifs are freely available at http://line.imb.ac.ru/DMMPMM.
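
    As a rough illustration of the position weight matrix (PWM) step described above, the sketch below builds a log-odds PWM from already aligned sites and scans a sequence for its best-scoring window. It is not the SeSiMCMC or Bigfoot procedure; the function names, the pseudocount of 1, the uniform background of 0.25, and the example sites are all assumptions made for illustration. Input is assumed to be uppercase A/C/G/T only.

```python
import math
from collections import Counter


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Log-odds PWM from equal-length aligned sites: one dict per column."""
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + 4 * pseudocount
    pwm = []
    for i in range(length):
        counts = Counter(s[i] for s in sites)
        column = {}
        for base in "ACGT":
            freq = (counts[base] + pseudocount) / total
            column[base] = math.log2(freq / background)
        pwm.append(column)
    return pwm


def score(pwm, window):
    """Sum of per-column log-odds for a window as wide as the PWM."""
    return sum(col[base] for col, base in zip(pwm, window))


def best_hit(pwm, sequence):
    """(score, start) of the best window; sequence must be at least PWM width."""
    width = len(pwm)
    hits = [
        (score(pwm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    ]
    return max(hits)


# Hypothetical aligned sites and a fragment to scan.
sites = ["TGACT", "TGACA", "TGACT", "TCACT"]
pwm = build_pwm(sites)
top_score, top_start = best_hit(pwm, "AAATGACTCCC")
```

    Under these assumptions, the observation in the abstract about missing hits corresponds to scanning a fragment extended by a few base pairs on each side instead of the footprint alone.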

  20. A +1 ribosomal frameshifting motif prevalent among plant amalgaviruses.

    PubMed

    Nibert, Max L; Pyle, Jesse D; Firth, Andrew E

    2016-11-01

    Sequence accessions attributable to novel plant amalgaviruses have been found in the Transcriptome Shotgun Assembly database. Sixteen accessions, derived from 12 different plant species, appear to encompass the complete protein-coding regions of the proposed amalgaviruses, which would substantially expand the size of genus Amalgavirus from 4 current species. Other findings include evidence for UUU_CGN as a +1 ribosomal frameshifting motif prevalent among plant amalgaviruses; for a variant version of this motif found thus far in only two amalgaviruses from solanaceous plants; for a region of α-helical coiled coil propensity conserved in a central region of the ORF1 translation product of plant amalgaviruses; and for conserved sequences in a C-terminal region of the ORF2 translation product (RNA-dependent RNA polymerase) of plant amalgaviruses, seemingly beyond the region of conserved polymerase motifs. These results additionally illustrate the value of mining the TSA database and others for novel viral sequences for comparative analyses. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.
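
    A minimal sketch of how candidate UUU_CGN sites could be located in a coding sequence, assuming the underscore marks the codon boundary in the original reading frame, so that a UUU codon is immediately followed by a CGN codon. The function name and the example sequences are hypothetical, and a match indicates only sequence-level candidacy, not evidence of frameshifting.

```python
def find_uuu_cgn(orf, frame=0):
    """Return 0-based starts of in-frame UUU codons followed by a CGN codon.

    Accepts RNA or DNA input; T is converted to U before scanning.
    """
    rna = orf.upper().replace("T", "U")
    hits = []
    # Step codon by codon; stop while a full six-nucleotide window remains.
    for i in range(frame, len(rna) - 5, 3):
        if rna[i:i + 3] == "UUU" and rna[i + 3:i + 5] == "CG":
            hits.append(i)
    return hits


# Hypothetical examples: the second sequence matches only when read in frame 1.
in_frame = find_uuu_cgn("AUGUUUCGAGGG")
off_frame = find_uuu_cgn("AUUUCGA")
shifted = find_uuu_cgn("AUUUCGA", frame=1)
```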

  1. TOPDOM: database of conservatively located domains and motifs in proteins.

    PubMed

    Varga, Julia; Dobson, László; Tusnády, Gábor E

    2016-09-01

    The TOPDOM database, originally created as a collection of domains and motifs located consistently on the same side of the membranes in α-helical transmembrane proteins, has been updated and extended by taking into consideration consistently localized domains and motifs in globular proteins, too. By taking advantage of the recently developed CCTOP algorithm to determine the type of a protein and predict topology in case of transmembrane proteins, and by applying a thorough search for domains and motifs as well as utilizing the most up-to-date version of all source databases, we managed to reach a 6-fold increase in the size of the whole database and a 2-fold increase in the number of transmembrane proteins. The TOPDOM database is available at http://topdom.enzim.hu. The webpage utilizes the common Apache, PHP5 and MySQL software to provide the user interface for accessing and searching the database. The database itself is generated on a high performance computer. Contact: tusnady.gabor@ttk.mta.hu. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  2. Evidence for the Concerted Evolution between Short Linear Protein Motifs and Their Flanking Regions

    PubMed Central

    Chica, Claudia; Diella, Francesca; Gibson, Toby J.

    2009-01-01

    Background: Linear motifs are short modules of protein sequences that play a crucial role in mediating and regulating many protein–protein interactions. The function of linear motifs strongly depends on the context, e.g. functional instances mainly occur inside flexible regions that are accessible for interaction. Sometimes linear motifs appear as isolated islands of conservation in multiple sequence alignments. However, they also occur in larger blocks of sequence conservation, suggesting an active role for the neighbouring amino acids. Results: The evolution of regions flanking 116 functional linear motif instances was studied. The conservation of the amino acid sequence and order/disorder tendency of those regions was related to presence/absence of the instance. For the majority of the analysed instances, the pairs of sequences conserving the linear motif were also observed to maintain a similar local structural tendency and/or to have higher local sequence conservation when compared to pairs of sequences where one is missing the linear motif. Furthermore, those instances have a higher chance to co–evolve with the neighbouring residues in comparison to the distant ones. Those findings are supported by examples where the regulation of the linear motif–mediated interaction has been shown to depend on the modifications (e.g. phosphorylation) at neighbouring positions or is thought to benefit from the binding versatility of disordered regions. Conclusion: The results suggest that flanking regions are relevant for linear motif–mediated interactions, both at the structural and sequence level. More interestingly, they indicate that the prediction of linear motif instances can be enriched with contextual information by performing a sequence analysis similar to the one presented here. This can facilitate the understanding of the role of these predicted instances in determining the protein function inside the broader context of the cellular network where they arise.

  3. Microfluidic affinity and ChIP-seq analyses converge on a conserved FOXP2-binding motif in chimp and human, which enables the detection of evolutionarily novel targets.

    PubMed

    Nelson, Christopher S; Fuller, Chris K; Fordyce, Polly M; Greninger, Alexander L; Li, Hao; DeRisi, Joseph L

    2013-07-01

    The transcription factor forkhead box P2 (FOXP2) is believed to be important in the evolution of human speech. A mutation in its DNA-binding domain causes severe speech impairment. Humans have acquired two coding changes relative to the conserved mammalian sequence. Despite intense interest in FOXP2, it has remained an open question whether the human protein's DNA-binding specificity and chromatin localization are conserved. Previous in vitro and ChIP-chip studies have provided conflicting consensus sequences for the FOXP2-binding site. Using MITOMI 2.0 microfluidic affinity assays, we describe the binding site of FOXP2 and its affinity profile in base-specific detail for all substitutions of the strongest binding site. We find that human and chimp FOXP2 have similar binding sites that are distinct from previously suggested consensus binding sites. Additionally, through analysis of FOXP2 ChIP-seq data from cultured neurons, we find strong overrepresentation of a motif that matches our in vitro results and identifies a set of genes with FOXP2 binding sites. The FOXP2-binding sites tend to be conserved, yet we identified 38 instances of evolutionarily novel sites in humans. Combined, these data present a comprehensive portrait of FOXP2's binding properties and imply that although its sequence specificity has been conserved, some of its genomic binding sites are newly evolved.

  4. Microfluidic affinity and ChIP-seq analyses converge on a conserved FOXP2-binding motif in chimp and human, which enables the detection of evolutionarily novel targets

    PubMed Central

    Nelson, Christopher S.; Fuller, Chris K.; Fordyce, Polly M.; Greninger, Alexander L.; Li, Hao; DeRisi, Joseph L.

    2013-01-01

    The transcription factor forkhead box P2 (FOXP2) is believed to be important in the evolution of human speech. A mutation in its DNA-binding domain causes severe speech impairment. Humans have acquired two coding changes relative to the conserved mammalian sequence. Despite intense interest in FOXP2, it has remained an open question whether the human protein’s DNA-binding specificity and chromatin localization are conserved. Previous in vitro and ChIP-chip studies have provided conflicting consensus sequences for the FOXP2-binding site. Using MITOMI 2.0 microfluidic affinity assays, we describe the binding site of FOXP2 and its affinity profile in base-specific detail for all substitutions of the strongest binding site. We find that human and chimp FOXP2 have similar binding sites that are distinct from previously suggested consensus binding sites. Additionally, through analysis of FOXP2 ChIP-seq data from cultured neurons, we find strong overrepresentation of a motif that matches our in vitro results and identifies a set of genes with FOXP2 binding sites. The FOXP2-binding sites tend to be conserved, yet we identified 38 instances of evolutionarily novel sites in humans. Combined, these data present a comprehensive portrait of FOXP2’s binding properties and imply that although its sequence specificity has been conserved, some of its genomic binding sites are newly evolved. PMID:23625967

  5. The Runt domain of AML1 (RUNX1) binds a sequence-conserved RNA motif that mimics a DNA element.

    PubMed

    Fukunaga, Junichi; Nomura, Yusuke; Tanaka, Yoichiro; Amano, Ryo; Tanaka, Taku; Nakamura, Yoshikazu; Kawai, Gota; Sakamoto, Taiichi; Kozu, Tomoko

    2013-07-01

    AML1 (RUNX1) is a key transcription factor for hematopoiesis that binds to the Runt-binding double-stranded DNA element (RDE) of target genes through its N-terminal Runt domain. Aberrations in the AML1 gene are frequently found in human leukemia. To better understand AML1 and its potential utility for diagnosis and therapy, we obtained RNA aptamers that bind specifically to the AML1 Runt domain. Enzymatic probing and NMR analyses revealed that Apt1-S, which is a truncated variant of one of the aptamers, has a CACG tetraloop and two stem regions separated by an internal loop. All the isolated aptamers were found to contain the conserved sequence motif 5'-NNCCAC-3' and 5'-GCGMGN'N'-3' (M:A or C; N and N' form Watson-Crick base pairs). The motif contains one AC mismatch and one base bulged out. Mutational analysis of Apt1-S showed that three guanines of the motif are important for Runt binding as are the three guanines of RDE, which are directly recognized by three arginine residues of the Runt domain. Mutational analyses of the Runt domain revealed that the amino acid residues used for Apt1-S binding were similar to those used for RDE binding. Furthermore, the aptamer competed with RDE for binding to the Runt domain in vitro. These results demonstrated that the Runt domain of the AML1 protein binds to the motif of the aptamer that mimics DNA. Our findings should provide new insights into RNA function and utility in both basic and applied sciences.

  6. Identification and Characterization of Functionally Critical, Conserved Motifs in the Internal Repeats and N-terminal Domain of Yeast Translation Initiation Factor 4B (yeIF4B)*

    PubMed Central

    Zhou, Fujun; Walker, Sarah E.; Mitchell, Sarah F.; Lorsch, Jon R.; Hinnebusch, Alan G.

    2014-01-01

    eIF4B has been implicated in attachment of the 43 S preinitiation complex (PIC) to mRNAs and scanning to the start codon. We recently determined that the internal seven repeats (of ∼26 amino acids each) of Saccharomyces cerevisiae eIF4B (yeIF4B) compose the region most critically required to enhance mRNA recruitment by 43 S PICs in vitro and stimulate general translation initiation in yeast. Moreover, although the N-terminal domain (NTD) of yeIF4B contributes to these activities, the RNA recognition motif is dispensable. We have now determined that only two of the seven internal repeats are sufficient for wild-type (WT) yeIF4B function in vivo when all other domains are intact. However, three or more repeats are needed in the absence of the NTD or when the functions of eIF4F components are compromised. We corroborated these observations in the reconstituted system by demonstrating that yeIF4B variants with only one or two repeats display substantial activity in promoting mRNA recruitment by the PIC, whereas additional repeats are required at lower levels of eIF4A or when the NTD is missing. These findings indicate functional overlap among the 7-repeats and NTD domains of yeIF4B and eIF4A in mRNA recruitment. Interestingly, only three highly conserved positions in the 26-amino acid repeat are essential for function in vitro and in vivo. Finally, we identified conserved motifs in the NTD and demonstrate functional overlap of two such motifs. These results provide a comprehensive description of the critical sequence elements in yeIF4B that support eIF4F function in mRNA recruitment by the PIC. PMID:24285537

  7. Comparative Analysis of P450 Signature Motifs EXXR and CXG in the Large and Diverse Kingdom of Fungi: Identification of Evolutionarily Conserved Amino Acid Patterns Characteristic of P450 Family

    PubMed Central

    Syed, Khajamohiddin; Mashele, Samson Sitheni

    2014-01-01

    Cytochrome P450 monooxygenases (P450s) are heme-thiolate proteins distributed across the biological kingdoms. P450s are catalytically versatile and play key roles in organisms' primary and secondary metabolism. Identification of P450s across the biological kingdoms depends largely on the identification of two P450 signature motifs, EXXR and CXG, in the protein sequence. Once a putative protein has been identified as a P450, it is assigned to a family and subfamily based on the criteria that P450s within a family share more than 40% homology and members of subfamilies share more than 55% homology. However, to date, no evidence has been presented that can distinguish members of a P450 family. Here, for the first time, we report the identification of EXXR- and CXG-motif-based amino acid patterns that are characteristic of the P450 family. Analysis of P450 signature motifs in the under-explored fungal P450s from four different phyla, ascomycota, basidiomycota, zygomycota and chytridiomycota, indicated that the EXXR motif is highly variable and the CXG motif is somewhat variable. The amino acids threonine and leucine are preferred as second and third amino acids in the EXXR motif and proline and glycine are preferred as second and third amino acids in the CXG motif in fungal P450s. Analysis of 67 P450 families from biological kingdoms such as plants, animals, bacteria and fungi showed conservation of a set of amino acid patterns characteristic of a particular P450 family in EXXR and CXG motifs. This suggests that during the divergence of P450 families from a common ancestor these amino acid patterns evolved and were retained in each P450 family as a signature of that family. The role of amino acid patterns characteristic of a P450 family in the structural and/or functional aspects of members of the P450 family is a topic for future research. PMID:24743800
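
    The EXXR and CXG signature motifs named above can be written as simple regular expressions, and the family-specific patterns the abstract describes amount to tallying which residues fill the X positions. The sketch below assumes X means any single amino acid letter; the function names and the short example sequences are hypothetical and are not real P450 sequences. Because both patterns are short, chance matches are expected in almost any protein, so a hit here is only a candidate and says nothing about position in the fold.

```python
import re
from collections import Counter

# Lookahead wrappers so that overlapping occurrences are all reported.
EXXR = re.compile(r"(?=(E[A-Z]{2}R))")
CXG = re.compile(r"(?=(C[A-Z]G))")


def find_motifs(seq):
    """Return (start, matched text) pairs for each signature motif."""
    seq = seq.upper()
    return {
        "EXXR": [(m.start(), m.group(1)) for m in EXXR.finditer(seq)],
        "CXG": [(m.start(), m.group(1)) for m in CXG.finditer(seq)],
    }


def tally_variants(sequences, name):
    """Count how often each concrete variant of a motif occurs."""
    counts = Counter()
    for seq in sequences:
        for _, hit in find_motifs(seq)[name]:
            counts[hit] += 1
    return counts


# Hypothetical toy sequences using the preferred residues from the abstract.
hits = find_motifs("MAETLRLFCPGA")
variants = tally_variants(["AETLRA", "QETLRCPG", "EAARCAG"], "EXXR")
```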

  8. A novel approach to identifying regulatory motifs in distantly related genomes

    PubMed Central

    Van Hellemont, Ruth; Monsieurs, Pieter; Thijs, Gert; De Moor, Bart; Van de Peer, Yves; Marchal, Kathleen

    2005-01-01

    Although proven successful in the identification of regulatory motifs, phylogenetic footprinting methods still show some shortcomings. To assess these difficulties, most apparent when applying phylogenetic footprinting to distantly related organisms, we developed a two-step procedure that combines the advantages of sequence alignment and motif detection approaches. The results on well-studied benchmark datasets indicate that the presented method outperforms other methods when the sequences become either too long or too heterogeneous in size. PMID:16420672

  9. The Runt domain of AML1 (RUNX1) binds a sequence-conserved RNA motif that mimics a DNA element

    PubMed Central

    Fukunaga, Junichi; Nomura, Yusuke; Tanaka, Yoichiro; Amano, Ryo; Tanaka, Taku; Nakamura, Yoshikazu; Kawai, Gota; Sakamoto, Taiichi; Kozu, Tomoko

    2013-01-01

    AML1 (RUNX1) is a key transcription factor for hematopoiesis that binds to the Runt-binding double-stranded DNA element (RDE) of target genes through its N-terminal Runt domain. Aberrations in the AML1 gene are frequently found in human leukemia. To better understand AML1 and its potential utility for diagnosis and therapy, we obtained RNA aptamers that bind specifically to the AML1 Runt domain. Enzymatic probing and NMR analyses revealed that Apt1-S, which is a truncated variant of one of the aptamers, has a CACG tetraloop and two stem regions separated by an internal loop. All the isolated aptamers were found to contain the conserved sequence motif 5′-NNCCAC-3′ and 5′-GCGMGN′N′-3′ (M:A or C; N and N′ form Watson–Crick base pairs). The motif contains one AC mismatch and one base bulged out. Mutational analysis of Apt1-S showed that three guanines of the motif are important for Runt binding as are the three guanines of RDE, which are directly recognized by three arginine residues of the Runt domain. Mutational analyses of the Runt domain revealed that the amino acid residues used for Apt1-S binding were similar to those used for RDE binding. Furthermore, the aptamer competed with RDE for binding to the Runt domain in vitro. These results demonstrated that the Runt domain of the AML1 protein binds to the motif of the aptamer that mimics DNA. Our findings should provide new insights into RNA function and utility in both basic and applied sciences. PMID:23709277

  10. Signature Motifs Identify an Acinetobacter Cif Virulence Factor with Epoxide Hydrolase Activity*

    PubMed Central

    Bahl, Christopher D.; Hvorecny, Kelli L.; Bridges, Andrew A.; Ballok, Alicia E.; Bomberger, Jennifer M.; Cady, Kyle C.; O'Toole, George A.; Madden, Dean R.

    2014-01-01

    Endocytic recycling of the cystic fibrosis transmembrane conductance regulator (CFTR) is blocked by the CFTR inhibitory factor (Cif). Originally discovered in Pseudomonas aeruginosa, Cif is a secreted epoxide hydrolase that is transcriptionally regulated by CifR, an epoxide-sensitive repressor. In this report, we investigate a homologous protein found in strains of the emerging nosocomial pathogens Acinetobacter nosocomialis and Acinetobacter baumannii (“aCif”). Like Cif, aCif is an epoxide hydrolase that carries an N-terminal secretion signal and can be purified from culture supernatants. When applied directly to polarized airway epithelial cells, mature aCif triggers a reduction in CFTR abundance at the apical membrane. Biochemical and crystallographic studies reveal a dimeric assembly with a stereochemically conserved active site, confirming our motif-based identification of candidate Cif-like pathogenic EH sequences. Furthermore, cif expression is transcriptionally repressed by a CifR homolog (“aCifR”) and is induced in the presence of epoxides. Overall, this Acinetobacter protein recapitulates the essential attributes of the Pseudomonas Cif system and thus may facilitate airway colonization in nosocomial lung infections. PMID:24474692

  11. Gene Isolation Using Degenerate Primers Targeting Protein Motif: A Laboratory Exercise

    ERIC Educational Resources Information Center

    Yeo, Brandon Pei Hui; Foong, Lian Chee; Tam, Sheh May; Lee, Vivian; Hwang, Siaw San

    2018-01-01

    Structures and functions of protein motifs are widely included in many biology-based course syllabi. However, little emphasis is placed to link this knowledge to applications in biotechnology to enhance the learning experience. Here, the conserved motifs of nucleotide binding site-leucine rich repeats (NBS-LRR) proteins, successfully used for the…

  12. Motif enrichment tool.

    PubMed

    Blatti, Charles; Sinha, Saurabh

    2014-07-01

    The Motif Enrichment Tool (MET) provides an online interface that enables users to find major transcriptional regulators of their gene sets of interest. MET searches the appropriate regulatory region around each gene and identifies which transcription factor DNA-binding specificities (motifs) are statistically overrepresented. Motif enrichment analysis is currently available for many metazoan species including human, mouse, fruit fly, planaria and flowering plants. MET also leverages high-throughput experimental data such as ChIP-seq and DNase-seq from ENCODE and ModENCODE to identify the regulatory targets of a transcription factor with greater precision. The results from MET are produced in real time and are linked to a genome browser for easy follow-up analysis. Use of the web tool is free and open to all, and there is no login requirement. ADDRESS: http://veda.cs.uiuc.edu/MET/. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  13. Presence of a consensus DNA motif at nearby DNA sequence of the mutation susceptible CG nucleotides.

    PubMed

    Chowdhury, Kaushik; Kumar, Suresh; Sharma, Tanu; Sharma, Ankit; Bhagat, Meenakshi; Kamai, Asangla; Ford, Bridget M; Asthana, Shailendra; Mandal, Chandi C

    2018-01-10

    Complexity in tissues affected by cancer arises from somatic mutations and epigenetic modifications in the genome. The mutation susceptible hotspots present within the genome indicate a non-random nature and/or a position specific selection of mutation. An association exists between the occurrence of mutations and epigenetic DNA methylation. This study is primarily aimed at determining mutation status, and identifying a signature for predicting mutation prone zones of tumor suppressor (TS) genes. Nearby sequences from the top five positions having a higher mutation frequency in each gene of 42 TS genes were selected from the COSMIC database and were considered as mutation prone zones. The conserved motifs present in the mutation prone DNA fragments were identified. Molecular docking studies were done to determine putative interactions between the identified conserved motifs and enzyme methyltransferase DNMT1. Collective analysis of 42 TS genes found GC as the most commonly replaced and AT as the most commonly formed residues after mutation. Analysis of the top 5 mutated positions of each gene (210 DNA segments for 42 TS genes) identified that CG nucleotides of the amino acid codons (e.g., Arginine) are most susceptible to mutation, and found a consensus DNA "T/AGC/GAGGA/TG" sequence present in these mutation prone DNA segments. Similar to TS genes, analysis of 54 oncogenes not only found CG nucleotides of the amino acid Arg as the most susceptible to mutation, but also identified the presence of similar consensus DNA motifs in the mutation prone DNA fragments (270 DNA segments for 54 oncogenes) of oncogenes. Docking studies depicted that, upon binding of DNMT1 methylates to this consensus DNA motif (C residues of CpG islands), mutation was likely to occur. Thus, this study proposes that DNMT1 mediated methylation in chromosomal DNA may decrease if a foreign DNA segment containing this consensus sequence along with CG nucleotides is exogenously introduced to dividing

  14. Conservation of Transcription Start Sites within Genes across a Bacterial Genus

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shao, Wenjun; Price, Morgan N.; Deutschbauer, Adam M.

    Transcription start sites (TSSs) lying inside annotated genes, on the same or opposite strand, have been observed in diverse bacteria, but the function of these unexpected transcripts is unclear. Here, we use the metal-reducing bacterium Shewanella oneidensis MR-1 and its relatives to study the evolutionary conservation of unexpected TSSs. Using high-resolution tiling microarrays and 5'-end RNA sequencing, we identified 2,531 TSSs in S. oneidensis MR-1, of which 18% were located inside coding sequences (CDSs). Comparative transcriptome analysis with seven additional Shewanella species revealed that the majority (76%) of the TSSs within the upstream regions of annotated genes (gTSSs) were conserved. Thirty percent of the TSSs that were inside genes and on the sense strand (iTSSs) were also conserved. Sequence analysis around these iTSSs showed conserved promoter motifs, suggesting that many iTSS are under purifying selection. Furthermore, conserved iTSSs are enriched for regulatory motifs, suggesting that they are regulated, and they tend to eliminate polar effects, which confirms that they are functional. In contrast, the transcription of antisense TSSs located inside CDSs (aTSSs) was significantly less likely to be conserved (22%). However, aTSSs whose transcription was conserved often have conserved promoter motifs and drive the expression of nearby genes. Overall, our findings demonstrate that some internal TSSs are conserved and drive protein expression despite their unusual locations, but the majority are not conserved and may reflect noisy initiation of transcription rather than a biological function.

  15. Web server to identify similarity of amino acid motifs to compounds (SAAMCO).

    PubMed

    Casey, Fergal P; Davey, Norman E; Baran, Ivan; Varekova, Radka Svobodova; Shields, Denis C

    2008-07-01

    Protein-protein interactions are fundamental in mediating biological processes including metabolism, cell growth, and signaling. To be able to selectively inhibit or induce protein activity or complex formation is a key feature in controlling disease. For those situations in which protein-protein interactions derive substantial affinity from short linear peptide sequences, or motifs, we can develop search algorithms for peptidomimetic compounds that resemble the short peptide's structure but are not compromised by poor pharmacological properties. SAAMCO is a Web service (http://bioware.ucd.ie/~saamco) that facilitates the screening of motifs with known structures against bioactive compound databases. It is built on an algorithm that defines compound similarity based on the presence of appropriate amino acid side chain fragments and a favorable Root Mean Squared Deviation (RMSD) between compound and motif structure. The methodology is efficient as the available compound databases are preprocessed and fast regular expression searches filter potential matches before time-intensive 3D superposition is performed. The required input information is minimal, and the compound databases have been selected to maximize the availability of information on biological activity. "Hits" are accompanied by a visualization window and links to source database entries. Motif matching can be defined on partial or full similarity, which will increase or reduce, respectively, the number of potential mimetic compounds. The Web server provides the functionality for rapid screening of known or putative interaction motifs against prepared compound libraries using a novel search algorithm. The tabulated results can be analyzed by linking to appropriate databases and by visualization.

  16. A short conserved motif in ALYREF directs cap- and EJC-dependent assembly of export complexes on spliced mRNAs

    PubMed Central

    Gromadzka, Agnieszka M.; Steckelberg, Anna-Lena; Singh, Kusum K.; Hofmann, Kay; Gehring, Niels H.

    2016-01-01

    The export of messenger RNAs (mRNAs) is the final of several nuclear posttranscriptional steps of gene expression. The formation of export-competent mRNPs involves the recruitment of export factors that are assumed to facilitate transport of the mature mRNAs. Using in vitro splicing assays, we show that a core set of export factors, including ALYREF, UAP56 and DDX39, readily associate with the spliced RNAs in an EJC (exon junction complex)- and cap-dependent manner. In order to elucidate how ALYREF and other export adaptors mediate mRNA export, we conducted a computational analysis and discovered four short, conserved, linear motifs present in RNA-binding proteins. We show that mutation in one of the new motifs (WxHD) in an unstructured region of ALYREF reduced RNA binding and abolished the interaction with eIF4A3 and CBP80. Additionally, the mutation impaired proper localization to nuclear speckles and export of a spliced reporter mRNA. Our results reveal important details of the orchestrated recruitment of export factors during the formation of export competent mRNPs. PMID:26773052

  17. Identification of the sequence motif of glycoside hydrolase 13 family members

    PubMed Central

    Kumar, Vikash

    2011-01-01

    A bioinformatics analysis of sequences of enzymes of the glycoside hydrolase (GH) 13 family members such as α-amylase, cyclodextrin glycosyltransferase (CGTase), branching enzyme and cyclomaltodextrinase has been carried out, using hidden Markov model (HMM) profiles, in order to find the sequence motifs that govern the reaction specificities of these enzymes. This analysis suggests the existence of such sequence motifs, with residues of these motifs constituting the −1 to +3 catalytic subsites of the enzyme. Hence, by introducing mutations in the residues of these four subsites, one can change the reaction specificities of the enzymes. In general, it has been observed that the α-amylase sequence motifs have lower sequence conservation than the rest of the motifs of the GH13 family members. PMID:21544166

  18. NAC transcription factor genes: genome-wide identification, phylogenetic, motif and cis-regulatory element analysis in pigeonpea (Cajanus cajan (L.) Millsp.).

    PubMed

    Satheesh, Viswanathan; Jagannadham, P Tej Kumar; Chidambaranathan, Parameswaran; Jain, P K; Srinivasan, R

    2014-12-01

    The NAC (NAM, ATAF and CUC) proteins are plant-specific transcription factors implicated in development and stress responses. In the present study 88 pigeonpea NAC genes were identified from the recently published draft genome of pigeonpea by using homology based and de novo prediction programmes. These sequences were further subjected to phylogenetic, motif and promoter analyses. In motif analysis, highly conserved motifs were identified in the NAC domain and also in the C-terminal region of the NAC proteins. A phylogenetic reconstruction using pigeonpea, Arabidopsis and soybean NAC genes revealed 33 putative stress-responsive pigeonpea NAC genes. Several stress-responsive cis-elements were identified through in silico analysis of the promoters of these putative stress-responsive genes. This analysis is the first report of NAC gene family in pigeonpea and will be useful for the identification and selection of candidate genes associated with stress tolerance.

  19. A conserved WW domain-like motif regulates invariant chain-dependent cell-surface transport of the NKG2D ligand ULBP2.

    PubMed

    Uhlenbrock, Franziska; van Andel, Esther; Andresen, Lars; Skov, Søren

    2015-08-01

    Malignant cells expressing NKG2D ligands on their cell surface can be directly sensed and killed by NKG2D-bearing lymphocytes. To ensure this immune recognition, accumulating evidence suggests that NKG2D ligands are trafficked via alternative pathways to the cell surface. We have previously shown that the NKG2D ligand ULBP2 traffics over an invariant chain (Ii)-dependent pathway to the cell surface. This study set out to elucidate how Ii regulates ULBP2 cell-surface transport. We discovered conserved tryptophan (Trp) residues in the primary protein sequence of ULBP1-6 but not in the related MICA/B. Substitution of Trp to alanine resulted in cell-surface inhibition of ULBP2 in different cancer cell lines. Moreover, the mutated ULBP2 constructs were retained and not degraded inside the cell, indicating a crucial role of this conserved Trp-motif in trafficking. Finally, overexpression of Ii increased surface expression of wt ULBP2 while Trp-mutants could not be expressed, suggesting that this Trp-motif is required for an Ii-dependent cell-surface transport of ULBP2. Aberrant soluble ULBP2 is immunosuppressive. Thus, targeting a distinct protein module on the ULBP2 sequence could counteract this abnormal expression of ULBP2. Copyright © 2015 Elsevier Ltd. All rights reserved.

  20. The Calcineurin Signaling Network Evolves Via Conserved Kinase–Phosphatase Modules That Transcend Substrate Identity

    PubMed Central

    Bodenmiller, Bernd; Wanka, Stefanie; Landry, Christian R.; Aebersold, Ruedi; Cyert, Martha S.

    2014-01-01

    Summary To define the first functional network for calcineurin, the conserved Ca2+/calmodulin-regulated phosphatase, we systematically identified its substrates in S. cerevisiae using phosphoproteomics and bioinformatics, followed by co-purification and dephosphorylation assays. This study establishes new calcineurin functions and reveals mechanisms that shape calcineurin network evolution. Analyses of closely related yeasts show that many proteins were recently recruited to the network by acquiring a calcineurin-recognition motif. Calcineurin substrates in yeast and mammals are distinct due to network rewiring but surprisingly are phosphorylated by similar kinases. We postulate that co-recognition of conserved substrate features, including phosphorylation and docking motifs, preserves calcineurin-kinase opposition during evolution. One example we document is a composite docking site that confers substrate recognition by both calcineurin and MAPK. We propose that conserved kinase-phosphatase pairs define the architecture of signaling networks and allow other connections between kinases and phosphatases to develop and establish common regulatory motifs in signaling networks. PMID:24930733

  1. A novel Arg H52/Tyr H33 conservative motif in antibodies: A correlation between sequence of antibodies and antigen binding.

    PubMed

    Petrov, Artem; Arzhanik, Vladimir; Makarov, Gennady; Koliasnikov, Oleg

    2016-08-01

    Antibodies are a family of proteins responsible for antigen recognition. Computational modeling of the interaction between an antigen and an antibody is very important when a crystallographic structure is unavailable. In this research, we have discovered a correlation between the amino acid sequence of an antibody and its specific binding characteristics, using the example of a novel conservative binding motif that consists of four residues: Arg H52, Tyr H33, Thr H59, and Glu H61. These residues are specifically oriented in the binding site and interact with each other in a specific manner. The residues of the binding motif interact strictly with negatively charged groups of antigens and form a binding complex. The mechanism of interaction and the characteristics of the complex were also determined. The results of this research can be used to increase the accuracy of computational antibody-antigen interaction modeling and for post-modeling quality control of the modeled structures.

  2. Discriminative motif optimization based on perceptron training

    PubMed Central

    Patel, Ronak Y.; Stormo, Gary D.

    2014-01-01

    Motivation: Generating accurate transcription factor (TF) binding site motifs from data generated by next-generation sequencing, especially ChIP-seq, is challenging. The challenge arises because a typical experiment reports a large number of sequences bound by a TF, and each sequence is relatively long. Most traditional motif finders are slow in handling such an enormous amount of data. To overcome this limitation, tools have been developed that trade accuracy for speed by using heuristic discrete search strategies or limited optimization of identified seed motifs. However, such strategies may not fully use the information in input sequences to generate motifs. Such motifs often form good seeds and can be further improved with appropriate scoring functions and rapid optimization. Results: We report a tool named discriminative motif optimizer (DiMO). DiMO takes a seed motif along with a positive and a negative database and improves the motif based on a discriminative strategy. We use area under receiver-operating characteristic curve (AUC) as a measure of discriminating power of motifs and a strategy based on perceptron training that maximizes AUC rapidly in a discriminative manner. Using DiMO, on a large test set of 87 TFs from human, Drosophila and yeast, we show that it is possible to significantly improve motifs identified by nine motif finders. The motifs are generated/optimized using training sets and evaluated on test sets. The AUC is improved for almost 90% of the TFs on test sets and the magnitude of increase is up to 39%. Availability and implementation: DiMO is available at http://stormo.wustl.edu/DiMO Contact: rpatel@genetics.wustl.edu, ronakypatel@gmail.com PMID:24369152
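
    As a rough illustration of the evaluation measure named in this abstract, the sketch below scores sequences with a position weight matrix and computes AUC as the probability that a positive sequence outscores a negative one. It is not DiMO's code and does not implement its perceptron training; the function names, the toy matrix and the toy sequences are all assumptions made for the example.

```python
from itertools import product


def best_window_score(seq, pwm):
    """Return the highest PWM score over all windows of len(pwm) in seq."""
    width = len(pwm)
    return max(
        sum(pwm[i][seq[start + i]] for i in range(width))
        for start in range(len(seq) - width + 1)
    )


def auc(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos_scores, neg_scores)
    )
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical 3-column weight matrix that favours the word "TGA".
toy_pwm = [
    {"A": -1.0, "C": -1.0, "G": -1.0, "T": 1.0},
    {"A": -1.0, "C": -1.0, "G": 1.0, "T": -1.0},
    {"A": 1.0, "C": -1.0, "G": -1.0, "T": -1.0},
]

# Hypothetical bound (positive) and unbound (negative) sequences.
positives = ["CCTGACC", "ATGAGGC"]
negatives = ["CCCCCCC", "ACACACA"]

pos_scores = [best_window_score(s, toy_pwm) for s in positives]
neg_scores = [best_window_score(s, toy_pwm) for s in negatives]
motif_auc = auc(pos_scores, neg_scores)
```

    A discriminative optimizer would adjust the matrix entries so that this AUC rises on a training set and would then report it on held-out sequences, as the abstract describes.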

  3. Identification of 15 candidate structured noncoding RNA motifs in fungi by comparative genomics.

    PubMed

    Li, Sanshu; Breaker, Ronald R

    2017-10-13

    With the development of rapid and inexpensive DNA sequencing, the genome sequences of more than 100 fungal species have been made available. This dataset provides an excellent resource for comparative genomics analyses, which can be used to discover genetic elements, including noncoding RNAs (ncRNAs). Bioinformatics tools similar to those used to uncover novel ncRNAs in bacteria, likewise, should be useful for searching fungal genomic sequences, and the relative ease of genetic experiments with some model fungal species could facilitate experimental validation studies. We have adapted a bioinformatics pipeline for discovering bacterial ncRNAs to systematically analyze many fungal genomes. This comparative genomics pipeline integrates information on conserved RNA sequence and structural features with alternative splicing information to reveal fungal RNA motifs that are candidate regulatory domains, or that might have other possible functions. A total of 15 prominent classes of structured ncRNA candidates were identified, including variant HDV self-cleaving ribozyme representatives, atypical snoRNA candidates, and possible structured antisense RNA motifs. Candidate regulatory motifs were also found associated with genes for ribosomal proteins, S-adenosylmethionine decarboxylase (SDC), amidase, and HexA protein involved in Woronin body formation. We experimentally confirm that the variant HDV ribozymes undergo rapid self-cleavage, and we demonstrate that the SDC RNA motif reduces the expression of SAM decarboxylase by translational repression. Furthermore, we provide evidence that several other motifs discovered in this study are likely to be functional ncRNA elements. Systematic screening of fungal genomes using a computational discovery pipeline has revealed the existence of a variety of novel structured ncRNAs. Genome contexts and similarities to known ncRNA motifs provide strong evidence for the biological and biochemical functions of some newly found ncRNA motifs.

  4. Structural motif screening reveals a novel, conserved carbohydrate-binding surface in the pathogenesis-related protein PR-5d.

    PubMed

    Doxey, Andrew C; Cheng, Zhenyu; Moffatt, Barbara A; McConkey, Brendan J

    2010-08-03

    Aromatic amino acids play a critical role in protein-glycan interactions. Clusters of surface aromatic residues and their features may therefore be useful in distinguishing glycan-binding sites as well as predicting novel glycan-binding proteins. In this work, a structural bioinformatics approach was used to screen the Protein Data Bank (PDB) for coplanar aromatic motifs similar to those found in known glycan-binding proteins. The proteins identified in the screen were significantly associated with carbohydrate-related functions according to gene ontology (GO) enrichment analysis, and predicted motifs were found frequently within novel folds and glycan-binding sites not included in the training set. In addition to numerous binding sites predicted in structural genomics proteins of unknown function, one novel prediction was a surface motif (W34/W36/W192) in the tobacco pathogenesis-related protein, PR-5d. Phylogenetic analysis revealed that the surface motif is exclusive to a subfamily of PR-5 proteins from the Solanaceae family of plants, and is completely absent in more distant homologs. To confirm PR-5d's insoluble-polysaccharide binding activity, a cellulose-pulldown assay of tobacco proteins was performed and PR-5d was identified in the cellulose-binding fraction by mass spectrometry. Based on the combined results, we propose that the putative binding site in PR-5d may be an evolutionary adaptation of Solanaceae plants including potato, tomato, and tobacco, towards defense against cellulose-containing pathogens such as species of the deadly oomycete genus, Phytophthora. More generally, the results demonstrate that coplanar aromatic clusters on protein surfaces are a structural signature of glycan-binding proteins, and can be used to computationally predict novel glycan-binding proteins from 3D structure.

  5. New type III effectors from Xanthomonas campestris pv. vesicatoria trigger plant reactions dependent on a conserved N-myristoylation motif.

    PubMed

    Thieme, Frank; Szczesny, Robert; Urban, Alexander; Kirchner, Oliver; Hause, Gerd; Bonas, Ulla

    2007-10-01

    Pathogenicity of the gram-negative plant pathogen Xanthomonas campestris pv. vesicatoria depends on a type III secretion system, which translocates bacterial effector proteins into the plant cell. In this study, we identified two novel type III effectors, XopE1 and XopE2 (Xanthomonas outer proteins), using the AvrBs3 effector domain as reporter. XopE1 and XopE2 belong to the HopX family and possess a conserved putative N-myristoylation motif that is also present in the effector XopJ from X. campestris pv. vesicatoria 85-10. XopJ is a member of the YopJ/AvrRxv family of acetyltransferases. Confocal laser scanning microscopy and immunocytochemistry revealed that green fluorescent protein fusions of XopE1, XopE2, and XopJ localized to the plant cell plasma membrane. Targeting to the membrane is probably due to N-myristoylation, because a point mutation in the putative myristoylated glycine residue G2 in XopE1, XopE2, and XopJ resulted in cytoplasmic localization of the mutant proteins. Results of hydroxylamine treatments of XopE2 protein extracts suggest that the proteins are additionally anchored in the host cell plasma membrane by palmitoylation. The membrane localization of the effectors strongly influences the phenotypes they trigger in the plant. Agrobacterium-mediated expression of xopE1 and xopJ in Nicotiana benthamiana led to cell-death reactions that, for xopJ, were dependent on the N-myristoylation motif. In the case of xopE1(G2A), cell death was more pronounced with the mutant than with the wild-type protein. In addition, XopE2 has an avirulence activity in Solanum pseudocapsicum.

  6. Role of NH₂-terminal hydrophobic motif in the subcellular localization of ATP-binding cassette protein subfamily D: Common features in eukaryotic organisms

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lee, Asaka; Asahina, Kota; Okamoto, Takumi

    Highlights: • ABCD proteins are classified by the presence or absence of an NH₂-terminal hydrophobic segment. • ABCD proteins with the segment are targeted to peroxisomes. • ABCD proteins without the segment are targeted to the endoplasmic reticulum. • The role of the segment in organelle targeting is conserved in eukaryotic organisms. Abstract: In mammals, four ATP-binding cassette (ABC) proteins belonging to subfamily D have been identified. ABCD1–3 possess the NH₂-terminal hydrophobic region and are targeted to peroxisomes, while ABCD4, which lacks the region, is targeted to the endoplasmic reticulum (ER). Based on hydropathy plot analysis, we found that several eukaryotes have ABCD protein homologs lacking the NH₂-terminal hydrophobic segment (H0 motif). To investigate whether the role of the NH₂-terminal H0 motif in subcellular localization is conserved across species, we expressed ABCD proteins from several species (metazoan, plant and fungi) in fusion with GFP in CHO cells and examined their subcellular localization. ABCD proteins possessing the NH₂-terminal H0 motif were localized to peroxisomes, while ABCD proteins lacking this region lost this capacity. In addition, the deletion of the NH₂-terminal H0 motif of ABCD proteins resulted in their localization to the ER. These results suggest that the role of the NH₂-terminal H0 motif in organelle targeting is widely conserved in living organisms.

  7. Motif discovery with data mining in 3D protein structure databases: discovery, validation and prediction of the U-shape zinc binding ("Huf-Zinc") motif.

    PubMed

    Maurer-Stroh, Sebastian; Gao, He; Han, Hao; Baeten, Lies; Schymkowitz, Joost; Rousseau, Frederic; Zhang, Louxin; Eisenhaber, Frank

    2013-02-01

    Data mining in protein databases, which are derived from more fundamental protein 3D structure and sequence databases, has considerable untapped potential for the discovery of sequence motif--structural motif--function relationships, as the finding of the U-shape (Huf-Zinc) motif, originally a small student's project, exemplifies. The metal ion zinc is critically involved in universal biological processes, ranging from protein-DNA complexes and transcription regulation to enzymatic catalysis and metabolic pathways. Proteins have evolved a series of motifs to specifically recognize and bind zinc ions. Many of these, so-called zinc fingers, are structurally independent globular domains with discontinuous binding motifs made up of residues mostly far apart in sequence. Through a systematic approach starting from the BRIX structure fragment database, we discovered that there exists another predictable subset of zinc-binding motifs that not only have a conserved continuous sequence pattern but also share a characteristic local conformation, despite being included in totally different overall folds. While this does not allow general prediction of all Zn binding motifs, an HMM-based web server, Huf-Zinc, is available for prediction of these novel, as well as conventional, zinc finger motifs in protein sequences. The Huf-Zinc webserver can be freely accessed through this URL (http://mendel.bii.a-star.edu.sg/METHODS/hufzinc/).

  8. Prediction of virus-host protein-protein interactions mediated by short linear motifs.

    PubMed

    Becerra, Andrés; Bucheli, Victor A; Moreno, Pedro A

    2017-03-09

    Short linear motifs in host organisms' proteins can be mimicked by viruses to create protein-protein interactions that disable or control metabolic pathways. Given that viral instances of host motif regular expressions can be found by chance, it is necessary to develop methods for filtering functional linear motifs. We conduct a systematic comparison of linear motif filtering methods to develop a computational approach for predicting motif-mediated protein-protein interactions between human and the human immunodeficiency virus 1 (HIV-1). We implemented three filtering methods to obtain linear motif sets: 1) conserved in viral proteins (C), 2) located in disordered regions (D) and 3) rare or scarce in a set of randomized viral sequences (R). The sets C, D and R are united and intersected. The resulting sets are compared by the number of protein-protein interactions correctly inferred with them, with experimental validation. The comparison is done with HIV-1 sequences and interactions from the National Institute of Allergy and Infectious Diseases (NIAID). The number of correctly inferred interactions allows us to rank the interactions by the sets used to deduce them: D∪R and C. The ordering of the sets is descending on the probability of capturing functional interactions. With respect to HIV-1, the sets C∪R, D∪R, C∪D∪R infer all known interactions between HIV-1 and human proteins mediated by linear motifs. We found that the majority of conserved linear motifs in the virus are located in disordered regions. We have developed a method for predicting protein-protein interactions mediated by linear motifs between HIV-1 and human proteins. The method uses only protein sequences as inputs. We can extend the software developed to any other eukaryotic virus and host in order to find and rank candidate interactions. In future work we will use it to explore possible viral attack mechanisms based on linear motif mimicry.
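
    The randomization filter (R) and the set operations described in this abstract can be sketched as follows. This is an illustration under assumptions, not the authors' software: the function names, the 5% threshold, the number of shuffles and the motif identifiers are all hypothetical.

```python
import random
import re


def shuffled(seq, rng):
    """Return a copy of seq with its residues in random order."""
    residues = list(seq)
    rng.shuffle(residues)
    return "".join(residues)


def chance_fraction(pattern, seq, n_shuffles=200, seed=0):
    """Fraction of shuffled copies of seq that match the motif at least once."""
    rng = random.Random(seed)
    regex = re.compile(pattern)
    hits = sum(1 for _ in range(n_shuffles) if regex.search(shuffled(seq, rng)))
    return hits / n_shuffles


def passes_r_filter(pattern, seq, max_fraction=0.05, n_shuffles=200, seed=0):
    """Keep a motif only if it occurs in seq and is rare in shuffled copies."""
    if re.search(pattern, seq) is None:
        return False
    return chance_fraction(pattern, seq, n_shuffles, seed) <= max_fraction


# Hypothetical motif identifiers that survived each of the three filters.
conserved = {"m1", "m2"}     # set C
disordered = {"m2", "m3"}    # set D
rare = {"m3"}                # set R

union_d_r = disordered | rare              # D∪R
intersection_c_d = conserved & disordered  # C∩D
```

    Shuffling preserves amino acid composition, so a motif that still appears in most shuffled copies is likely to arise by chance and carries little evidence of function.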

  9. A short conserved motif in ALYREF directs cap- and EJC-dependent assembly of export complexes on spliced mRNAs.

    PubMed

    Gromadzka, Agnieszka M; Steckelberg, Anna-Lena; Singh, Kusum K; Hofmann, Kay; Gehring, Niels H

    2016-03-18

    The export of messenger RNAs (mRNAs) is the final of several nuclear posttranscriptional steps of gene expression. The formation of export-competent mRNPs involves the recruitment of export factors that are assumed to facilitate transport of the mature mRNAs. Using in vitro splicing assays, we show that a core set of export factors, including ALYREF, UAP56 and DDX39, readily associate with the spliced RNAs in an EJC (exon junction complex)- and cap-dependent manner. In order to elucidate how ALYREF and other export adaptors mediate mRNA export, we conducted a computational analysis and discovered four short, conserved, linear motifs present in RNA-binding proteins. We show that mutation in one of the new motifs (WxHD) in an unstructured region of ALYREF reduced RNA binding and abolished the interaction with eIF4A3 and CBP80. Additionally, the mutation impaired proper localization to nuclear speckles and export of a spliced reporter mRNA. Our results reveal important details of the orchestrated recruitment of export factors during the formation of export competent mRNPs. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  10. PDSM, a motif for phosphorylation-dependent SUMO modification

    PubMed Central

    Hietakangas, Ville; Anckar, Julius; Blomster, Henri A.; Fujimoto, Mitsuaki; Palvimo, Jorma J.; Nakai, Akira; Sistonen, Lea

    2006-01-01

    SUMO (small ubiquitin-like modifier) modification regulates many cellular processes, including transcription. Although sumoylation often occurs on specific lysines within the consensus tetrapeptide ΨKxE, other modifications, such as phosphorylation, may regulate the sumoylation of a substrate. We have discovered PDSM (phosphorylation-dependent sumoylation motif), composed of a SUMO consensus site and an adjacent proline-directed phosphorylation site (ΨKxExxSP). The highly conserved motif regulates phosphorylation-dependent sumoylation of multiple substrates, such as heat-shock factors (HSFs), GATA-1, and myocyte enhancer factor 2. In fact, the majority of the PDSM-containing proteins are transcriptional regulators. Within the HSF family, PDSM is conserved between two functionally distinct members, HSF1 and HSF4b, whose transactivation capacities are repressed through the phosphorylation-dependent sumoylation. As the first recurrent sumoylation determinant beyond the consensus tetrapeptide, the PDSM provides a valuable tool in predicting new SUMO substrates. PMID:16371476
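
    The two patterns given in this abstract, ΨKxE and ΨKxExxSP, translate directly into regular expressions. The sketch below assumes that Ψ stands for one of the large hydrophobic residues V, I, L, M or F, which is an assumption of this example and not a definition taken from the abstract; the sequences are invented.

```python
import re

# Assumption: residues accepted at the Ψ position.
PSI = "VILMF"

# SUMO consensus tetrapeptide ΨKxE.
SUMO_CONSENSUS = re.compile(rf"[{PSI}]K.E")

# PDSM: ΨKxE followed by xxSP. The lookahead reports overlapping matches too.
PDSM = re.compile(rf"(?=([{PSI}]K.E..SP))")


def find_pdsm(seq):
    """Return (1-based position of the target lysine, matched 8-mer) pairs."""
    return [(m.start() + 2, m.group(1)) for m in PDSM.finditer(seq)]


# Invented sequences: the first carries a PDSM, the second only ΨKxE.
with_pdsm = "GGIKTEPLSPGG"
consensus_only = "GGIKTEPLAPGG"

pdsm_hits = find_pdsm(with_pdsm)
```

    A scan of this kind only flags candidate sites; whether a site is phosphorylated and sumoylated in a given protein still needs experimental support.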

  11. Discovering Sequence Motifs with Arbitrary Insertions and Deletions

    PubMed Central

    Frith, Martin C.; Saunders, Neil F. W.; Kobe, Bostjan; Bailey, Timothy L.

    2008-01-01

    Biology is encoded in molecular sequences: deciphering this encoding remains a grand scientific challenge. Functional regions of DNA, RNA, and protein sequences often exhibit characteristic but subtle motifs; thus, computational discovery of motifs in sequences is a fundamental and much-studied problem. However, most current algorithms do not allow for insertions or deletions (indels) within motifs, and the few that do have other limitations. We present a method, GLAM2 (Gapped Local Alignment of Motifs), for discovering motifs allowing indels in a fully general manner, and a companion method GLAM2SCAN for searching sequence databases using such motifs. glam2 is a generalization of the gapless Gibbs sampling algorithm. It re-discovers variable-width protein motifs from the PROSITE database significantly more accurately than the alternative methods PRATT and SAM-T2K. Furthermore, it usefully refines protein motifs from the ELM database: in some cases, the refined motifs make orders of magnitude fewer overpredictions than the original ELM regular expressions. GLAM2 performs respectably on the BAliBASE multiple alignment benchmark, and may be superior to leading multiple alignment methods for “motif-like” alignments with N- and C-terminal extensions. Finally, we demonstrate the use of GLAM2 to discover protein kinase substrate motifs and a gapped DNA motif for the LIM-only transcriptional regulatory complex: using GLAM2SCAN, we identify promising targets for the latter. GLAM2 is especially promising for short protein motifs, and it should improve our ability to identify the protein cleavage sites, interaction sites, post-translational modification attachment sites, etc., that underlie much of biology. It may be equally useful for arbitrarily gapped motifs in DNA and RNA, although fewer examples of such motifs are known at present. GLAM2 is public domain software, available for download at http://bioinformatics.org.au/glam2. PMID:18437229

  12. A systems wide mass spectrometric based linear motif screen to identify dominant in-vivo interacting proteins for the ubiquitin ligase MDM2.

    PubMed

    Nicholson, Judith; Scherl, Alex; Way, Luke; Blackburn, Elizabeth A; Walkinshaw, Malcolm D; Ball, Kathryn L; Hupp, Ted R

    2014-06-01

    Linear motifs mediate protein-protein interactions (PPI) that allow expansion of a target protein interactome at a systems level. This study uses a proteomics approach and linear motif sub-stratifications to expand on PPIs of MDM2. MDM2 is a multi-functional protein with over one hundred known binding partners not stratified by hierarchy or function. A new linear motif based on a MDM2 interaction consensus is used to select novel MDM2 interactors based on Nutlin-3 responsiveness in a cell-based proteomics screen. MDM2 binds a subset of peptide motifs corresponding to real proteins with a range of allosteric responses to MDM2 ligands. We validate cyclophilin B as a novel protein with a consensus MDM2 binding motif that is stabilised by Nutlin-3 in vivo, thus identifying one of the few known interactors of MDM2 that is stabilised by Nutlin-3. These data invoke two modes of peptide binding at the MDM2 N-terminus that rely on a consensus core motif to control the equilibrium between MDM2 binding proteins. This approach stratifies MDM2 interacting proteins based on the linear motif feature and provides a new biomarker assay to define clinically relevant Nutlin-3 responsive MDM2 interactors. Copyright © 2014 Elsevier Inc. All rights reserved.

  13. Putative bovine topological association domains and CTCF binding motifs can reduce the search space for causative regulatory variants of complex traits.

    PubMed

    Wang, Min; Hancock, Timothy P; Chamberlain, Amanda J; Vander Jagt, Christy J; Pryce, Jennie E; Cocks, Benjamin G; Goddard, Mike E; Hayes, Benjamin J

    2018-05-24

    Topological association domains (TADs) are chromosomal domains characterised by frequent internal DNA-DNA interactions. The transcription factor CTCF binds to conserved DNA sequence patterns called CTCF binding motifs to either prohibit or facilitate chromosomal interactions. TADs and CTCF binding motifs control gene expression, but they are not yet well defined in the bovine genome. In this paper, we sought to improve the annotation of bovine TADs and CTCF binding motifs, and assess whether the new annotation can reduce the search space for cis-regulatory variants. We used genomic synteny to map TADs and CTCF binding motifs from humans, mice, dogs and macaques to the bovine genome. We found that our mapped TADs exhibited the same hallmark properties of those sourced from experimental data, such as housekeeping genes, transfer RNA genes, CTCF binding motifs, short interspersed elements, H3K4me3 and H3K27ac. We showed that runs of genes with the same pattern of allele-specific expression (ASE) (either favouring paternal or maternal allele) were often located in the same TAD or between the same conserved CTCF binding motifs. Analyses of variance showed that when averaged across all bovine tissues tested, TADs explained 14% of ASE variation (standard deviation, SD: 0.056), while CTCF explained 27% (SD: 0.078). Furthermore, we showed that the quantitative trait loci (QTLs) associated with gene expression variation (eQTLs) or ASE variation (aseQTLs), which were identified from mRNA transcripts from 141 lactating cows' white blood and milk cells, were highly enriched at putative bovine CTCF binding motifs. The linearly-furthermost, and most-significant aseQTL and eQTL for each genic target were located within the same TAD as the gene more often than expected (Chi-Squared test P-value < 0.001). Our results suggest that genomic synteny can be used to functionally annotate conserved transcriptional components, and provides a tool to reduce the search space for causative

  14. Conservation of the glycoprotein B homologs of the Kaposi’s sarcoma-associated herpesvirus (KSHV/HHV8) and Old World primate rhadinoviruses of chimpanzees and macaques

    PubMed Central

    Bruce, A. Gregory; Horst, Jeremy A.; Rose, Timothy M.

    2016-01-01

    The envelope-associated glycoprotein B (gB) is highly conserved within the Herpesviridae and plays a critical role in viral entry. We analyzed the evolutionary conservation of sequence and structural motifs within the Kaposi’s sarcoma-associated herpesvirus (KSHV) gB and homologs of Old World primate rhadinoviruses belonging to the distinct RV1 and RV2 rhadinovirus lineages. In addition to gB homologs of rhadinoviruses infecting the pig-tailed and rhesus macaques, we cloned and sequenced gB homologs of RV1 and RV2 rhadinoviruses infecting chimpanzees. A structural model of the KSHV gB was determined, and functional motifs and sequence variants were mapped to the model structure. Conserved domains and motifs were identified, including an “RGD” motif that plays a critical role in KSHV binding and entry through the cellular integrin αVβ3. The RGD motif was only detected in RV1 rhadinoviruses suggesting an important difference in cell tropism between the two rhadinovirus lineages. PMID:27070755

  15. SCOPE: a web server for practical de novo motif discovery.

    PubMed

    Carlson, Jonathan M; Chakravarty, Arijit; DeZiel, Charles E; Gross, Robert H

    2007-07-01

    SCOPE is a novel parameter-free method for the de novo identification of potential regulatory motifs in sets of coordinately regulated genes. The SCOPE algorithm combines the output of three component algorithms, each designed to identify a particular class of motifs. Using an ensemble learning approach, SCOPE identifies the best candidate motifs from its component algorithms. In tests on experimentally determined datasets, SCOPE identified motifs with a significantly higher level of accuracy than a number of other web-based motif finders run with their default parameters. Because SCOPE has no adjustable parameters, the web server has an intuitive interface, requiring only a set of gene names or FASTA sequences and a choice of species. The most significant motifs found by SCOPE are displayed graphically on the main results page with a table containing summary statistics for each motif. Detailed motif information, including the sequence logo, PWM, consensus sequence and specific matching sites can be viewed through a single click on a motif. SCOPE's efficient, parameter-free search strategy has enabled the development of a web server that is readily accessible to the practising biologist while providing results that compare favorably with those of other motif finders. The SCOPE web server is at .

  16. Mutations in a Highly Conserved Motif of nsp1β Protein Attenuate the Innate Immune Suppression Function of Porcine Reproductive and Respiratory Syndrome Virus

    PubMed Central

    Li, Yanhua; Shyu, Duan-Liang; Shang, Pengcheng; Bai, Jianfa; Ouyang, Kang; Dhakal, Santosh; Hiremath, Jagadish; Binjawadagi, Basavaraj

    2016-01-01

    weak adaptive immunity. One of the strategies in next-generation vaccine construction is to manipulate viral proteins/genetic elements involved in antagonizing the host immune response. PRRSV nsp1β was identified to be a strong innate immune antagonist. In this study, two basic amino acids, R128 and R129, in a highly conserved GKYLQRRLQ motif were determined to be critical for nsp1β function. Mutations introduced into these two residues attenuated virus growth and improved the innate and adaptive immune responses of infected animals. Technologies developed in this study could be broadly applied to current commercial PRRSV modified live-virus (MLV) vaccines and other candidate vaccines. PMID:26792733

  17. Computational study of the fibril organization of polyglutamine repeats reveals a common motif identified in beta-helices.

    PubMed

    Zanuy, David; Gunasekaran, Kannan; Lesk, Arthur M; Nussinov, Ruth

    2006-04-21

    The formation of fibril aggregates by long polyglutamine sequences is assumed to play a major role in neurodegenerative diseases such as Huntington's disease. Here, we model peptides rich in glutamine, through a series of molecular dynamics simulations. Starting from a rigid nanotube-like conformation, we have obtained a new conformational template that shares structural features of a tubular helix and of a beta-helix conformational organization. Our new model can be described as a super-helical arrangement of flat beta-sheet segments linked by planar turns or bends. Interestingly, our comprehensive analysis of the Protein Data Bank reveals that this is a common motif in beta-helices (termed beta-bend), although it has not been identified so far. The motif is based on the alternation of beta-sheet and helical conformation as the protein sequence is followed from the N to the C terminus (beta-alpha(R)-beta-polyPro-beta). We further identify this motif in the ssNMR structure of the protofibril of the amyloidogenic peptide Abeta(1-40). The recurrence of the beta-bend suggests a general mode of connecting long parallel beta-sheet segments that would allow the growth of partially ordered fibril structures. The design allows the peptide backbone to change direction with a minimal loss of main chain hydrogen bonds. The identification of a coherent organization beyond that of the beta-sheet segments in different folds rich in parallel beta-sheets suggests a higher degree of ordered structure in protein fibrils, in agreement with their low solubility and dense molecular packing.

  18. Signature motif-guided identification of receptors for peptide hormones essential for root meristem growth.

    PubMed

    Song, Wen; Liu, Li; Wang, Jizong; Wu, Zhen; Zhang, Heqiao; Tang, Jiao; Lin, Guangzhong; Wang, Yichuan; Wen, Xing; Li, Wenyang; Han, Zhifu; Guo, Hongwei; Chai, Jijie

    2016-06-01

    Peptide-mediated cell-to-cell signaling has crucial roles in coordination and definition of cellular functions in plants. Peptide-receptor matching is important for understanding the mechanisms underlying peptide-mediated signaling. Here we report the structure-guided identification of root meristem growth factor (RGF) receptors important for plant development. An assay based on a signature ligand recognition motif (Arg-x-Arg) conserved in a subfamily of leucine-rich repeat receptor kinases (LRR-RKs) identified the functionally uncharacterized LRR-RK At4g26540 as a receptor of RGF1 (RGFR1). We further solved the crystal structure of RGF1 in complex with the LRR domain of RGFR1 at a resolution of 2.6 Å, which reveals that the Arg-x-Gly-Gly (RxGG) motif is responsible for specific recognition of the sulfate group of RGF1 by RGFR1. Based on the RxGG motif, we identified four additional RGFRs. Participation of the five RGFRs in RGF-induced signaling is supported by biochemical and genetic data. We also offer evidence showing that SERKs function as co-receptors for RGFs. Taken together, our study identifies RGF receptors and co-receptors that can link RGF signals with their downstream components and provides a proof of principle for structure-based matching of LRR-RKs with their peptide ligands.

  19. A conserved human DJ1-subfamily motif (DJSM) is critical for anti-oxidative and deglycase activities of Plasmodium falciparum DJ1.

    PubMed

    Nair, Divya N; Prasad, Rajesh; Singhal, Neha; Bhattacharjee, Manish; Sudhakar, Renu; Singh, Pushpa; Thanumalayan, Subramonian; Kiran, Uday; Sharma, Yogendra; Sijwali, Puran Singh

    2018-06-01

    Plasmodium falciparum DJ1 (PfDJ1) belongs to the DJ-1/ThiJ/PfpI superfamily whose members are present in all the kingdoms of life and exhibit diverse cellular functions and biochemical activities. The common feature of the superfamily is the class I glutamine amidotransferase domain with a conserved redox-active cysteine residue, which mediates various activities of the superfamily members, including anti-oxidative activity in PfDJ1 and human DJ1 (hDJ1). As the superfamily members represent diverse functional classes, to investigate if there is any sequence feature unique to hDJ1-like proteins, sequences of the representative proteins of different functional classes were compared and analysed. A novel motif unique to PfDJ1 and several other hDJ1-like proteins, with the consensus sequence of TSXGPX5FXLX5L, was identified that we designated as the hDJ1-subfamily motif (DJSM). Several mutations that have been associated with Parkinson's disease are also present in DJSM, suggesting its functional importance in hDJ1-like proteins. Mutations of the conserved residues of DJSM of PfDJ1 did not significantly affect overall secondary structure, but caused both a significant loss (S151A and P154A) and gain (L168A) of anti-oxidative activity. We also report that PfDJ1 has deglycase activity, which was significantly decreased in its mutants of the catalytic cysteine (C106A) and DJSM (S151A and P154A). Episomal expression of the catalytic cysteine (C106A) or DJSM (P154A) mutant decreased growth rates of parasites as compared to that of wild type parasites or parasites expressing wild type PfDJ1. S151 appears to properly position the nucleophilic elbow containing C106 and P154 forms a hydrogen bond with C106, which could be a reason for the loss of activities of PfDJ1 upon their mutations. Taken together, DJSM delineates PfDJ1 and other hDJ1-subfamily proteins from the remaining superfamily, and is critical for anti-oxidative and deglycase activities of PfDJ1.
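
    A consensus written in the compact form TSXGPX5FXLX5L can be turned into a regular expression for scanning sequences. The sketch below is illustrative only and is not the authors' method: it assumes that X stands for any residue and that a number after a symbol is a repeat count (so X5 means five arbitrary residues). The function name and the toy sequence are hypothetical.

```python
import re

def consensus_to_regex(consensus: str) -> str:
    """Translate a compact consensus such as 'TSXGPX5FXLX5L' into a regex.

    Assumed reading: 'X' is any residue, and a number after a symbol is a
    repeat count, so 'X5' means five arbitrary residues.
    """
    parts = []
    for symbol, count in re.findall(r"([A-Z])(\d*)", consensus):
        unit = "." if symbol == "X" else symbol
        parts.append(unit + (f"{{{count}}}" if count else ""))
    return "".join(parts)

pattern = consensus_to_regex("TSXGPX5FXLX5L")

# Made-up test sequence (NOT a real PfDJ1 sequence) with one embedded match.
toy_sequence = "MKK" + "TSAGPAAAAAFALAAAAAL" + "GG"
hits = [(m.start(), m.group()) for m in re.finditer(pattern, toy_sequence)]
print(pattern, hits)
```

    A plain regex treats every non-X position as strictly required; a position-specific scoring matrix would be the usual choice when partial matches should also be ranked.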

  20. The heptanucleotide motif GAGACGC is a key component of a cis-acting promoter element that is critical for SnSAG1 expression in Sarcocystis neurona.

    PubMed

    Gaji, Rajshekhar Y; Howe, Daniel K

    2009-07-01

    The apicomplexan parasite Sarcocystis neurona undergoes a complex process of intracellular development, during which many genes are temporally regulated. The described study was undertaken to begin identifying the basic promoter elements that control gene expression in S. neurona. Sequence analysis of the 5'-flanking region of five S. neurona genes revealed a conserved heptanucleotide motif GAGACGC that is similar to the WGAGACG motif described upstream of multiple genes in Toxoplasma gondii. The promoter region for the major surface antigen gene SnSAG1, which contains three heptanucleotide motifs within 135 bases of the transcription start site, was dissected by functional analysis using a dual luciferase reporter assay. These analyses revealed that a minimal promoter fragment containing all three motifs was sufficient to drive reporter molecule expression, with the presence and orientation of the 5'-most heptanucleotide motif being absolutely critical for promoter function. Further studies should help to identify additional sequence elements important for promoter function and for controlling gene expression during intracellular development by this apicomplexan pathogen.
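
    Because the abstract reports that both the presence and the orientation of the motif matter, a scan for GAGACGC (or the degenerate WGAGACG, where W is the IUPAC code for A or T) would normally report the strand of each hit. The following is a minimal sketch under that assumption; the helper names and the toy sequence are hypothetical and unrelated to the actual SnSAG1 promoter.

```python
import re

# Only the IUPAC codes needed for these two motifs are listed here.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "W": "[AT]"}

def reverse_complement(seq: str) -> str:
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_motif(seq: str, motif: str):
    """Return sorted (forward-strand start, strand) pairs for every occurrence."""
    # The lookahead lets overlapping occurrences be reported as well.
    regex = re.compile("(?=" + "".join(IUPAC[b] for b in motif) + ")")
    hits = [(m.start(), "+") for m in regex.finditer(seq)]
    rc = reverse_complement(seq)
    # Map reverse-strand coordinates back to forward-strand start positions.
    hits += [(len(seq) - m.start() - len(motif), "-") for m in regex.finditer(rc)]
    return sorted(hits)

toy_promoter = "TTGAGACGCAAGCGTCTCAT"  # made-up sequence for illustration
print(find_motif(toy_promoter, "GAGACGC"))
print(find_motif(toy_promoter, "WGAGACG"))
```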

  1. A structural-alphabet-based strategy for finding structural motifs across protein families

    PubMed Central

    Wu, Chih Yuan; Chen, Yao Chi; Lim, Carmay

    2010-01-01

    Proteins with insignificant sequence and overall structure similarity may still share locally conserved contiguous structural segments; i.e. structural/3D motifs. Most methods for finding 3D motifs require a known motif to search for other similar structures or functionally/structurally crucial residues. Here, without requiring a query motif or essential residues, a fully automated method for discovering 3D motifs of various sizes across protein families with different folds based on a 16-letter structural alphabet is presented. It was applied to structurally non-redundant proteins bound to DNA, RNA, obligate/non-obligate proteins as well as free DNA-binding proteins (DBPs) and proteins with known structures but unknown function. Its usefulness was illustrated by analyzing the 3D motifs found in DBPs. A non-specific motif was found with a ‘corner’ architecture that confers a stable scaffold and enables diverse interactions, making it suitable for binding not only DNA but also RNA and proteins. Furthermore, DNA-specific motifs present ‘only’ in DBPs were discovered. The motifs found can provide useful guidelines in detecting binding sites and computational protein redesign. PMID:20525797

  2. Conservation of batik: Conseptual framework of design and process development

    NASA Astrophysics Data System (ADS)

    Syamwil, Rodia

    2018-03-01

    Developing the Conservation Batik concept has become critical because of the decline of traditional batik, an intangible cultural heritage of humanity. Printed batik, polluting processes, and new design trends are consequences of the batik industry's transformation into a creative industry. Conservation Batik was proposed to answer the threats to traditional batik in terms of technique, process, and motif. However, creativity is also critical for consumer satisfaction. Research and development was conducted, starting with initial research to formulate the concept and an exploration of ideas for the designs of conservation motifs. The development steps followed a cyclical process to arrive at motifs with high preference ratings in terms of aesthetics, productivity, and efficiency. Data were collected through bibliography, documentation, observation, and interviews, and analyzed with qualitative methods. The concept of Conservation Batik was adopted from the principles of the Universitas Negeri Semarang (UNNES) vision, as well as from theoretical analyses and expert judgment. Conservation Batik is assessed on three aspects: design, process, and consumer preferences. Conservation means the effort of safeguarding, promoting, maintaining, and preserving. The Conservation Batik concept can be interpreted as batik with: (1) traditional values and authenticity; (2) philosophical meaning; (3) an eco-friendly process with minimum waste; (4) conservation as a source of design ideas; and (5) the revival of classic motifs.

  3. DynaMIT: the dynamic motif integration toolkit

    PubMed Central

    Dassi, Erik; Quattrone, Alessandro

    2016-01-01

    De-novo motif search is a frequently applied bioinformatics procedure to identify and prioritize recurrent elements in sequence sets for biological investigation, such as the ones derived from high-throughput differential expression experiments. Several algorithms have been developed to perform motif search, employing widely different approaches and often giving divergent results. In order to maximize the power of these investigations and ultimately be able to draft solid biological hypotheses, there is a need to apply multiple tools to the same sequences and merge the obtained results. However, motif reporting formats and statistical evaluation methods currently make such an integration task difficult to perform and mostly restricted to specific scenarios. We thus introduce here the Dynamic Motif Integration Toolkit (DynaMIT), an extremely flexible platform that allows users to identify motifs employing multiple algorithms, integrate them by means of a user-selected strategy and visualize results in several ways; furthermore, the platform is user-extendible in all its aspects. DynaMIT is freely available at http://cibioltg.bitbucket.org. PMID:26253738

  4. MotifNet: a web-server for network motif analysis.

    PubMed

    Smoly, Ilan Y; Lerman, Eugene; Ziv-Ukelson, Michal; Yeger-Lotem, Esti

    2017-06-15

    Network motifs are small topological patterns that recur in a network significantly more often than expected by chance. Their identification emerged as a powerful approach for uncovering the design principles underlying complex networks. However, available tools for network motif analysis typically require download and execution of computationally intensive software on a local computer. We present MotifNet, the first open-access web-server for network motif analysis. MotifNet allows researchers to analyze integrated networks, where nodes and edges may be labeled, and to search for motifs of up to eight nodes. The output motifs are presented graphically and the user can interactively filter them by their significance, number of instances, node and edge labels, and node identities, and view their instances. MotifNet also allows the user to distinguish between motifs that are centered on specific nodes and motifs that recur in distinct parts of the network. MotifNet is freely available at http://netbio.bgu.ac.il/motifnet. The website was implemented using ReactJs and supports all major browsers. The server interface was implemented in Python with data stored on a MySQL database. Contact: estiyl@bgu.ac.il or michaluz@cs.bgu.ac.il. Supplementary data are available at Bioinformatics online.
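
    To make the idea of a network motif concrete, the sketch below counts one classic three-node pattern, the feed-forward loop (edges a->b, b->c and a->c), in a small directed graph. It is an illustration only and does not reproduce MotifNet's algorithm or interface; the function name and the toy edge list are hypothetical. Judging whether a count is "more often than expected by chance" would additionally require comparison against randomized networks, which is not shown.

```python
from itertools import permutations

def count_feed_forward_loops(edges):
    """Count ordered node triples (a, b, c) with edges a->b, b->c and a->c."""
    edge_set = set(edges)
    nodes = {node for edge in edge_set for node in edge}
    # Brute force over all ordered triples: fine for toy graphs, O(n^3) in general.
    return sum(
        1
        for a, b, c in permutations(nodes, 3)
        if (a, b) in edge_set and (b, c) in edge_set and (a, c) in edge_set
    )

toy_edges = [("X", "Y"), ("Y", "Z"), ("X", "Z"), ("Z", "W")]
print(count_feed_forward_loops(toy_edges))
```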

  5. Statistical tests to compare motif count exceptionalities

    PubMed Central

    Robin, Stéphane; Schbath, Sophie; Vandewalle, Vincent

    2007-01-01

    Background: Finding over- or under-represented motifs in biological sequences is now a common task in genomics. Thanks to p-value calculation for motif counts, exceptional motifs are identified and represent candidate functional motifs. The present work addresses the related question of comparing the exceptionality of one motif in two different sequences. Just comparing the motif count p-values in each sequence is indeed not sufficient to decide if this motif is significantly more exceptional in one sequence compared to the other one. A statistical test is required. Results: We develop and analyze two statistical tests, an exact binomial one and an asymptotic likelihood ratio test, to decide whether the exceptionality of a given motif is equivalent or significantly different in two sequences of interest. For that purpose, motif occurrences are modeled by Poisson processes, with a special care for overlapping motifs. Both tests can take the sequence compositions into account. As an illustration, we compare the octamer exceptionalities in the Escherichia coli K-12 backbone versus variable strain-specific loops. Conclusion: The exact binomial test is particularly adapted for small counts. For large counts, we advise to use the likelihood ratio test which is asymptotic but strongly correlated with the exact binomial test and very simple to use. PMID:17346349
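
    The reasoning behind an exact binomial comparison can be sketched as follows. If the motif counts in the two sequences are independent Poisson variables, then, conditional on their total, the first count is binomial; under the null hypothesis of equal exceptionality its success probability is the first sequence's share of the expected counts. The code below is a simplified illustration of that idea, not the authors' implementation: it assumes the expected counts are supplied by the caller and ignores the paper's correction for overlapping motifs. The function name is hypothetical.

```python
from math import comb

def exact_binomial_pvalue(n1, n2, expected1, expected2):
    """Two-sided exact binomial p-value for comparing two motif counts.

    n1, n2               observed motif counts in sequences 1 and 2
    expected1, expected2 expected counts under each sequence's background
                         model (assumed to be computed elsewhere)
    """
    n = n1 + n2
    p = expected1 / (expected1 + expected2)
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    observed = pmf[n1]
    # Sum the probabilities of all outcomes at least as unlikely as the observed one.
    return min(1.0, sum(q for q in pmf if q <= observed * (1 + 1e-9)))

# Toy numbers: 10 versus 0 occurrences with equal expected counts.
print(exact_binomial_pvalue(10, 0, 5.0, 5.0))
```

    Enumerating the full probability mass function is only practical for small totals, which matches the abstract's remark that the exact test suits small counts.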

  6. An experimental test of a fundamental food web motif.

    PubMed

    Rip, Jason M K; McCann, Kevin S; Lynn, Denis H; Fawcett, Sonia

    2010-06-07

    Large-scale changes to the world's ecosystem are resulting in the deterioration of biostructure, the complex web of species interactions that make up ecological communities. A difficult, yet crucial task is to identify food web structures, or food web motifs, that are the building blocks of this baroque network of interactions. Once identified, these food web motifs can then be examined through experiments and theory to provide mechanistic explanations for how structure governs ecosystem stability. Here, we synthesize recent ecological research to show that generalist consumers coupling resources with different interaction strengths is one such motif. This motif amazingly occurs across an enormous range of spatial scales, and so acts to distribute coupled weak and strong interactions throughout food webs. We then perform an experiment that illustrates the importance of this motif to ecological stability. We find that weak interactions coupled to strong interactions by generalist consumers dampen strong interaction strengths and increase community stability. This study takes a critical step by isolating a common food web motif and, through clear experimental manipulation, identifies the fundamental stabilizing consequences of this structure for ecological communities.

  7. [Personal motif in art].

    PubMed

    Gerevich, József

    2015-01-01

    One of the basic questions of art psychology is whether a personal motif is to be found behind works of art and, if so, how openly or indirectly it appears in the work itself. Analysis of examples and documents from the fine arts and literature allows us to conclude that the personal motif, which the viewer can identify through symbols, at times easily and at others with more difficulty, gives an emotional plus to the artistic product. The personal motif may be found in traumatic experiences, in communication with the model or with other emotionally important persons (mourning, disappointment, revenge, hatred, rivalry, revolt etc.), in self-searching, or self-analysis. The emotions are expressed in artistic activity either directly or indirectly. The intention nourished by the artist's identity (Kunstwollen) may stand in the way of spontaneous self-expression, channelling it into hidden paths. Under the influence of certain circumstances, the artist may arouse in the viewer, consciously or unconsciously, an illusionary, misleading image of himself. An examination of the personal motif is one of the important research areas of art therapy.

  8. A polymorphism in a conserved posttranscriptional regulatory motif alters bone morphogenetic protein 2 (BMP2) RNA:protein interactions.

    PubMed

    Fritz, David T; Jiang, Shan; Xu, Junwang; Rogers, Melissa B

    2006-07-01

    The bone morphogenetic protein (BMP)2 gene has been genetically linked to osteoporosis and osteoarthritis. We have shown that the 3'-untranslated regions (UTR) of BMP2 genes from mammals to fishes are extraordinarily conserved. This indicates that the BMP2 3'-UTR is under stringent selective pressure. We present evidence that the conserved region is a strong posttranscriptional regulator of BMP2 expression. Polymorphisms in cis-regulatory elements have been proven to influence susceptibility to a growing number of diseases. A common single nucleotide polymorphism (SNP) disrupts a putative posttranscriptional regulatory motif, an AU-rich element, within the BMP2 3'-UTR. The affinity of specific proteins for the rs15705 SNP sequence differs from their affinity for the normal human sequence. More importantly, the in vitro decay rate of RNAs with the SNP is higher than that of RNAs with the normal sequence. Such changes in mRNA:protein interactions may influence the posttranscriptional mechanisms that control BMP2 gene expression. The consequent alterations in BMP2 protein levels may influence the development or physiology of bone or other BMP2-influenced tissues.

  9. Unitary circular code motifs in genomes of eukaryotes.

    PubMed

    El Soufi, Karim; Michel, Christian J

    A set X of 20 trinucleotides was identified in genes of bacteria, eukaryotes, plasmids and viruses, which has on average the highest occurrence in reading frame compared to its two shifted frames (Michel, 2015; Arquès and Michel, 1996). This set X has an interesting mathematical property as X is a circular code (Arquès and Michel, 1996). Thus, the motifs from this circular code X, called X motifs, have the property to always retrieve, synchronize and maintain the reading frame in genes. The origin of this circular code X in genes is an open problem since its discovery in 1996. Here, we first show that the unitary circular codes (UCC), i.e. sets of one word, make it possible to generate unitary circular code motifs (UCC motifs), i.e. a concatenation of the same motif (simple repeats) leading to low complexity DNA. Three classes of UCC motifs are studied here: repeated dinucleotides (D+ motifs), repeated trinucleotides (T+ motifs) and repeated tetranucleotides (T+ motifs). Thus, the D+, T+ and T+ motifs make it possible to retrieve, synchronize and maintain a frame modulo 2, modulo 3 and modulo 4, respectively, and their shifted frames (1 modulo 2; 1 and 2 modulo 3; 1, 2 and 3 modulo 4 according to the C2, C3 and C4 properties, respectively) in the DNA sequences. The statistical distribution of the D+, T+ and T+ motifs is analyzed in the genomes of eukaryotes. A UCC motif and its complementary UCC motif have the same distribution in the eukaryotic genomes. Furthermore, a UCC motif and its complementary UCC motif have increasing occurrences contrary to their number of hydrogen bonds, very significant with the T+ motifs. The longest D+, T+ and T+ motifs in the studied eukaryotic genomes are also given. Surprisingly, a scarcity of repeated trinucleotides (T+ motifs) in the large eukaryotic genomes is observed compared to the D+ and T+ motifs. This result has been investigated and may be explained by two outcomes. Repeated trinucleotides (T+ motifs) are identified
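
    Since a UCC motif is described here as a concatenation of the same word, locating the longest such run for a given word reduces to a tandem-repeat search. The sketch below is one simple way to do that and is not taken from the paper; the function name and the toy sequences are hypothetical. A lookahead is used so that candidate runs starting inside an earlier, shorter match are not missed.

```python
import re

def longest_unit_repeat(seq: str, unit: str):
    """Return (start, copies) of the longest run of back-to-back copies of `unit`.

    Returns (-1, 0) when `unit` does not occur in `seq`.
    """
    # The lookahead makes the scan try every start position, so overlapping
    # candidate runs are all considered.
    pattern = re.compile(f"(?=((?:{re.escape(unit)})+))")
    best = (-1, 0)
    for m in pattern.finditer(seq):
        copies = len(m.group(1)) // len(unit)
        if copies > best[1]:
            best = (m.start(), copies)
    return best

toy_dna = "GGATATATATCCACACGT"  # made-up sequence for illustration
print(longest_unit_repeat(toy_dna, "AT"))
print(longest_unit_repeat(toy_dna, "AC"))
```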

  10. Designing synthetic RNAs to determine the relevance of structural motifs in picornavirus IRES elements

    NASA Astrophysics Data System (ADS)

    Fernandez-Chamorro, Javier; Lozano, Gloria; Garcia-Martin, Juan Antonio; Ramajo, Jorge; Dotu, Ivan; Clote, Peter; Martinez-Salas, Encarnacion

    2016-04-01

    The function of Internal Ribosome Entry Site (IRES) elements is intimately linked to their RNA structure. Viral IRES elements are organized in modular domains consisting of one or more stem-loops that harbor conserved RNA motifs critical for internal initiation of translation. A conserved motif is the pyrimidine-tract located upstream of the functional initiation codon in type I and II picornavirus IRES. By computationally designing synthetic RNAs to fold into a structure that sequesters the polypyrimidine tract in a hairpin, we establish a correlation between predicted inaccessibility of the pyrimidine tract and IRES activity, as determined in both in vitro and in vivo systems. Our data support the hypothesis that structural sequestration of the pyrimidine-tract within a stable hairpin inactivates IRES activity, since the more stable the hairpin, the greater the inhibition of protein synthesis. Destabilization of the stem-loop immediately upstream of the pyrimidine-tract also decreases IRES activity. Our work introduces a hybrid computational/experimental method to determine the importance of structural motifs for biological function. Specifically, we show the feasibility of using the software RNAiFold to design synthetic RNAs with particular sequence and structural motifs that permit subsequent experimental determination of the importance of such motifs for biological function.

  11. Roles of conserved proline and glycosyltransferase motifs of EmbC in biosynthesis of lipoarabinomannan.

    PubMed

    Berg, Stefan; Starbuck, James; Torrelles, Jordi B; Vissa, Varalakshmi D; Crick, Dean C; Chatterjee, Delphi; Brennan, Patrick J

    2005-02-18

    D-Arabinans, composed of D-arabinofuranose (D-Araf), dominate the structure of mycobacterial cell walls in two settings, as part of lipoarabinomannan (LAM) and arabinogalactan, each with markedly different structures and functions. Little is known of the complexity of their biosynthesis. beta-D-Arabinofuranosyl-1-monophosphoryldecaprenol is the only known sugar donor. EmbA, EmbB, and EmbC, products of the paralogous genes embA, embB, and embC, the sites of resistance to the anti-tuberculosis drug ethambutol (EMB), are the only known implicated enzymes. EmbA and -B apparently contribute to the synthesis of arabinogalactan, whereas EmbC is reserved for the synthesis of LAM. The Emb proteins show no overall similarity to any known proteins beyond Mycobacterium and related genera. However, functional motifs, equivalent to a proline-rich motif of several bacterial polysaccharide co-polymerases and a superfamily of glycosyltransferases, were found. Site-directed mutagenesis in glycosyltransferase superfamily C resulted in complete ablation of LAM synthesis. Point mutations in three amino acids of the proline motif of EmbC resulted in marked reduction of LAM-arabinan synthesis and accumulation of an unknown intermediate and of the known precursor lipomannan. Yet the pattern of the differently linked D-Araf units observed in wild type LAM-arabinan was largely retained in the proline motif mutants. The results allow for the presentation of a unique model of arabinan synthesis.

  12. Conservation of RNA chaperone activity of the human La-related proteins 4, 6 and 7.

    PubMed

    Hussain, Rawaa H; Zawawi, Mariam; Bayfield, Mark A

    2013-10-01

    The La module is a conserved tandem arrangement of a La motif and RNA recognition motif whose function has been best characterized in genuine La proteins. The best-characterized substrates of La proteins are pre-tRNAs, and previous work using tRNA mediated suppression in Schizosaccharomyces pombe has demonstrated that yeast and human La enhance the maturation of these using two distinguishable activities: UUU-3'OH-dependent trailer binding/protection and a UUU-3'OH independent activity related to RNA chaperone function. The La module has also been identified in several conserved families of La-related proteins (LARPs) that engage other RNAs, but their mode of RNA binding and function(s) are not well understood. We demonstrate that the La modules of the human LARPs 4, 6 and 7 are also active in tRNA-mediated suppression, even in the absence of stable UUU-3'OH trailer protection. Rather, the capacity of these to enhance pre-tRNA maturation is associated with RNA chaperone function, which we demonstrate to be a conserved activity for each hLARP in vitro. Our work reveals insight into the mechanisms by which La module containing proteins discriminate RNA targets and demonstrates that RNA chaperone activity is a conserved function across representative members of the La motif-containing superfamily.

  13. Quantitatively and Kinetically Identifying Binding Motifs of Amelogenin Proteins to Mineral Crystals Through Biochemical and Spectroscopic Assays

    PubMed Central

    Zhu, Li; Hwang, Peter; Witkowska, H. Ewa; Liu, Haichuan; Li, Wu

    2014-01-01

    Tooth enamel is the hardest tissue in vertebrate animals. Consisting of millions of carbonated hydroxyapatite crystals, this highly mineralized tissue develops from a protein matrix in which amelogenin is the predominant component. The enamel matrix proteins are eventually and completely degraded and removed by proteinases to form mineral-enriched tooth enamel. Identification of the apatite-binding motifs in amelogenin is critical for understanding the amelogenin–crystal interactions and amelogenin–proteinases interactions during tooth enamel biomineralization. A stepwise strategy is introduced to kinetically and quantitatively identify the crystal-binding motifs in amelogenin, including a peptide screening assay, a competitive adsorption assay, and a kinetic-binding assay using amelogenin and gene-engineered amelogenin mutants. A modified enzyme-linked immunosorbent assay on crystal surfaces is also applied to compare binding amounts of amelogenin and its mutants on different planes of apatite crystals. We describe the detailed protocols for these assays and provide the considerations for these experiments in this chapter. PMID:24188774

  14. Motif-based analysis of large nucleotide data sets using MEME-ChIP

    PubMed Central

    Ma, Wenxiu; Noble, William S; Bailey, Timothy L

    2014-01-01

    MEME-ChIP is a web-based tool for analyzing motifs in large DNA or RNA data sets. It can analyze peak regions identified by ChIP-seq, cross-linking sites identified by CLIP-seq and related assays, as well as sets of genomic regions selected using other criteria. MEME-ChIP performs de novo motif discovery, motif enrichment analysis, motif location analysis and motif clustering, providing a comprehensive picture of the DNA or RNA motifs that are enriched in the input sequences. MEME-ChIP performs two complementary types of de novo motif discovery: weight matrix–based discovery for high accuracy; and word-based discovery for high sensitivity. Motif enrichment analysis using DNA or RNA motifs from human, mouse, worm, fly and other model organisms provides even greater sensitivity. MEME-ChIP’s interactive HTML output groups and aligns significant motifs to ease interpretation. This protocol takes less than 3 h, and it provides motif discovery approaches that are distinct and complementary to other online methods. PMID:24853928

  15. Motifs, modules and games in bacteria.

    PubMed

    Wolf, Denise M; Arkin, Adam P

    2003-04-01

    Global explorations of regulatory network dynamics, organization and evolution have become tractable thanks to high-throughput sequencing and molecular measurement of bacterial physiology. From these, a nascent conceptual framework is developing that views the principles of regulation in terms of motifs, modules and games. Motifs are small, repeated, and conserved biological units ranging from molecular domains to small reaction networks. They are arranged into functional modules, genetically dissectible cellular functions such as the cell cycle, or different stress responses. The dynamical functioning of modules defines the organism's strategy to survive in a game, pitting cell against cell, and cell against environment. Placing pathway structure and dynamics into an evolutionary context begins to allow discrimination between those physical and molecular features that particularize a species to its surroundings, and those that provide core physiological function. This approach promises to generate a higher level understanding of cellular design, pathway evolution and cellular bioengineering.

  16. Using peptide array to identify binding motifs and interaction networks for modular domains.

    PubMed

    Li, Shawn S-C; Wu, Chenggang

    2009-01-01

    Specific protein-protein interactions underlie all essential biological processes and form the basis of cellular signal transduction. The recognition of a short, linear peptide sequence in one protein by a modular domain in another represents a common theme of macromolecular recognition in cells, and the importance of this mode of protein-protein interaction is highlighted by the large number of peptide-binding domains encoded by the human genome. This phenomenon also provides a unique opportunity to identify protein-protein binding events using peptide arrays and complementary biochemical assays. Accordingly, high-density peptide array has emerged as a useful tool by which to map domain-mediated protein-protein interaction networks at the proteome level. Using the Src-homology 2 (SH2) and 3 (SH3) domains as examples, we describe the application of oriented peptide array libraries in uncovering specific motifs recognized by an SH2 domain and the use of high-density peptide arrays in identifying interaction networks mediated by the SH3 domain. Methods reviewed here could also be applied to other modular domains, including catalytic domains, that recognize linear peptide sequences.

  17. Leucine-rich Repeats of Bacterial Surface Proteins Serve as Common Pattern Recognition Motifs of Human Scavenger Receptor gp340*

    PubMed Central

    Loimaranta, Vuokko; Hytönen, Jukka; Pulliainen, Arto T.; Sharma, Ashu; Tenovuo, Jorma; Strömberg, Nicklas; Finne, Jukka

    2009-01-01

    Scavenger receptors are innate immune molecules recognizing and inducing the clearance of non-host as well as modified host molecules. To recognize a wide pattern of invading microbes, many scavenger receptors bind to common pathogen-associated molecular patterns, such as lipopolysaccharides and lipoteichoic acids. Similarly, the gp340/DMBT1 protein, a member of the human scavenger receptor cysteine-rich protein family, displays a wide ligand repertoire. The peptide motif VEVLXXXXW derived from its scavenger receptor cysteine-rich domains is involved in some of these interactions, but most of the recognition mechanisms are unknown. In this study, we used mass spectrometry sequencing, gene inactivation, and recombinant proteins to identify Streptococcus pyogenes protein Spy0843 as a recognition receptor of gp340. Antibodies against Spy0843 are shown to protect against S. pyogenes infection, but no function or host receptor have been identified for the protein. Spy0843 belongs to the leucine-rich repeat (Lrr) family of eukaryotic and prokaryotic proteins. Experiments with truncated forms of the recombinant proteins confirmed that the Lrr region is needed in the binding of Spy0843 to gp340. The same motif of two other Lrr proteins, LrrG from the Gram-positive S. agalactiae and BspA from the Gram-negative Tannerella forsythia, also mediated binding to gp340. Moreover, inhibition of Spy0843 binding occurred with peptides containing the VEVLXXXXW motif, but also peptides devoid of the XXXXW motif inhibited binding of Lrr proteins. These results thus suggest that the conserved Lrr motif in bacterial proteins serves as a novel pattern recognition motif for unique core peptides of human scavenger receptor gp340. PMID:19465482

  18. The Thiamin Pyrophosphate-Motif

    NASA Technical Reports Server (NTRS)

    Dominiak, P.; Ciszak, E.

    2003-01-01

    Using databases the authors have identified a common thiamin pyrophosphate (TPP)-motif in the family of functionally diverse TPP-dependent enzymes. This common motif consists of multimeric organization of subunits and two catalytic centers. Each catalytic center (PP:PYR) is formed at the interface of the PP-domain binding the magnesium ion, pyrophosphate and aminopyrimidine ring of TPP, and the PYR-domain binding the aminopyrimidine ring of that cofactor. A pair of these catalytic centers constitutes the catalytic core (PP:PYR)(sub 2) within these enzymes. Analysis of the structural elements of this catalytic core reveals a novel definition of the common amino acid sequences, which are GXPhiX(sub 4)(G)PhiXXGQ and GDGX(sub 25-30)NN in the PP-domain, and the EX(sub 4)(G)PhiXXGPhi in the PYR-domain, where Phi corresponds to a hydrophobic amino acid. This TPP-motif provides a novel tool for annotation of TPP-dependent enzymes useful in advancing functional proteomics.

  19. The Thiamin Pyrophosphate-Motif

    NASA Technical Reports Server (NTRS)

    Dominiak, Paulina M.; Ciszak, Ewa M.

    2003-01-01

    Using databases the authors have identified a common thiamin pyrophosphate (TPP)-motif in the family of functionally diverse TPP-dependent enzymes. This common motif consists of multimeric organization of subunits, two catalytic centers, common amino acid sequence, and specific contacts to provide a flip-flop, or alternate site, mechanism of action. Each catalytic center [PP:PYR] is formed at the interface of the PP-domain binding the magnesium ion, pyrophosphate and aminopyrimidine ring of TPP, and the PYR-domain binding the aminopyrimidine ring of that cofactor. A pair of these catalytic centers constitutes the catalytic core [PP:PYR](sub 2) within these enzymes. Analysis of the structural elements of this catalytic core reveals novel definition of the common amino acid sequences, which are GXPhiX(sub 4)(G)PhiXXGQ and GDGX(sub 25-30) within the PP-domain, and the EX(sub 4)(G)PhiXXGPhi within the PYR-domain, where Phi corresponds to a hydrophobic amino acid. This TPP-motif provides a novel tool for annotation of TPP-dependent enzymes useful in advancing functional proteomics.

  20. Conservation of tubulin-binding sequences in TRPV1 throughout evolution.

    PubMed

    Sardar, Puspendu; Kumar, Abhishek; Bhandari, Anita; Goswami, Chandan

    2012-01-01

    Transient Receptor Potential Vanilloid subtype 1 (TRPV1), commonly known as the capsaicin receptor, can detect multiple stimuli, including noxious compounds, low pH, temperature, and electromagnetic waves at different ranges. In addition, this receptor is involved in multiple physiological and sensory processes. Therefore, the functions of TRPV1 directly influence adaptation and further evolution. The availability of various eukaryotic genomic sequences in the public domain facilitates the study of the molecular evolution of the TRPV1 protein and of the conservation of certain functionally important domains, motifs and interacting regions. Using statistical and bioinformatics tools, our analysis reveals that TRPV1 evolved approximately 420 million years ago (MYA). Our analysis reveals that specific regions, domains and motifs of TRPV1 have gone through different selection pressures and thus have different levels of conservation. We found that, among all of these, the TRP box is the most conserved and thus has functional significance. Our results also indicate that the tubulin binding sequences (TBS) have evolutionary significance, as these sequence stretches are more conserved than many other essential regions of TRPV1. The overall distribution of positively charged residues within the TBS motifs is conserved throughout evolution. In silico analysis reveals that TBS-1 and TBS-2 of TRPV1 can form helical structures and may play an important role in TRPV1 function. Our analysis identifies the regions of TRPV1 that are important for the structure-function relationship. This analysis indicates that tubulin binding sequence-1 (TBS-1) near the TRP box forms a potential helix and that the tubulin interactions with TRPV1 via TBS-1 have evolutionary significance. This interaction may be required for proper channel function and regulation and may also have significance in the context of Taxol®-induced neuropathy.

  1. Functional and Structural Analysis of the Conserved EFhd2 Protein

    PubMed Central

    Acosta, Yancy Ferrer; Rodríguez Cruz, Eva N.; Vaquer, Ana del C.; Vega, Irving E.

    2013-01-01

    EFhd2 is a novel protein conserved from C. elegans to H. sapiens. This novel protein was originally identified in cells of the immune and central nervous systems. However, it is most abundant in the central nervous system, where it has been found associated with pathological forms of the microtubule-associated protein tau. The physiological or pathological roles of EFhd2 are poorly understood. In this study, a functional and structural analysis was carried out to characterize the molecular requirements for EFhd2’s calcium binding activity. The results showed that mutations of a conserved aspartate on either EF-hand motif disrupted the calcium binding activity, indicating that these motifs work as a pair to form a functional calcium binding domain. Furthermore, characterization of an identified single-nucleotide polymorphism (SNP) that introduced a missense mutation indicates the importance of a conserved phenylalanine for EFhd2 calcium binding activity. Structural analysis revealed that EFhd2 is predominantly composed of alpha helix and random coil structures and that this novel protein is thermostable. EFhd2’s thermostability depends on its N-terminus. In the absence of the N-terminus, calcium binding restored EFhd2’s thermal stability. Overall, these studies contribute to our understanding of EFhd2 functional and structural properties, and introduce it into the family of canonical EF-hand domain containing proteins. PMID:22973849

  2. Using a distribution and conservation status weighted hotspot approach to identify areas in need of conservation action to benefit Idaho bird species

    USGS Publications Warehouse

    Haines, Aaron M.; Leu, Matthias; Svancara, Leona K.; Wilson, Gina; Scott, J. Michael

    2010-01-01

    Identification of biodiversity hotspots (hereafter, hotspots) has become a common strategy to delineate important areas for wildlife conservation. However, the use of hotspots has not often incorporated important habitat types, ecosystem services, anthropogenic activity, or consistency in identifying important conservation areas. The purpose of this study was to identify hotspots to improve avian conservation efforts for Species of Greatest Conservation Need (SGCN) in the state of Idaho, United States. We evaluated multiple approaches to define hotspots and used a unique approach based on weighting species by their distribution size and conservation status to identify hotspot areas. All hotspot approaches identified bodies of water (Bear Lake, Grays Lake, and American Falls Reservoir) as important hotspots for Idaho avian SGCN, but we found that the weighted approach produced more congruent hotspot areas when compared to other hotspot approaches. To incorporate anthropogenic activity into hotspot analysis, we grouped species based on their sensitivity to specific human threats (i.e., urban development, agriculture, fire suppression, grazing, roads, and logging) and identified ecological sections within Idaho that may require specific conservation actions to address these human threats using the weighted approach. The Snake River Basalts and Overthrust Mountains ecological sections were important areas for potential implementation of conservation actions to conserve biodiversity. Our approach to identifying hotspots may be useful as part of a larger conservation strategy to aid land managers or local governments in applying conservation actions on the ground.

  3. Conservation of RNA chaperone activity of the human La-related proteins 4, 6 and 7

    PubMed Central

    Hussain, Rawaa H.; Zawawi, Mariam; Bayfield, Mark A.

    2013-01-01

    The La module is a conserved tandem arrangement of a La motif and RNA recognition motif whose function has been best characterized in genuine La proteins. The best-characterized substrates of La proteins are pre-tRNAs, and previous work using tRNA mediated suppression in Schizosaccharomyces pombe has demonstrated that yeast and human La enhance the maturation of these using two distinguishable activities: UUU-3′OH-dependent trailer binding/protection and a UUU-3′OH independent activity related to RNA chaperone function. The La module has also been identified in several conserved families of La-related proteins (LARPs) that engage other RNAs, but their mode of RNA binding and function(s) are not well understood. We demonstrate that the La modules of the human LARPs 4, 6 and 7 are also active in tRNA-mediated suppression, even in the absence of stable UUU-3′OH trailer protection. Rather, the capacity of these to enhance pre-tRNA maturation is associated with RNA chaperone function, which we demonstrate to be a conserved activity for each hLARP in vitro. Our work reveals insight into the mechanisms by which La module containing proteins discriminate RNA targets and demonstrates that RNA chaperone activity is a conserved function across representative members of the La motif-containing superfamily. PMID:23887937

  4. MotifMark: Finding regulatory motifs in DNA sequences.

    PubMed

    Hassanzadeh, Hamid Reza; Kolhe, Pushkar; Isbell, Charles L; Wang, May D

    2017-07-01

    The interaction between proteins and DNA is a key driving force in a significant number of biological processes such as transcriptional regulation, repair, recombination, splicing, and DNA modification. The identification of DNA-binding sites and the specificity of target proteins in binding to these regions are two important steps in understanding the mechanisms of these biological activities. A number of high-throughput technologies have recently emerged that try to quantify the affinity between proteins and DNA motifs. Despite their success, these technologies have their own limitations and fall short in precise characterization of motifs, and as a result, require further downstream analysis to extract useful and interpretable information from a haystack of noisy and inaccurate data. Here we propose MotifMark, a new algorithm based on graph theory and machine learning, that can find binding sites on candidate probes and rank their specificity in regard to the underlying transcription factor. We developed a pipeline to analyze experimental data derived from compact universal protein binding microarrays and benchmarked it against two of the most accurate motif search methods. Our results indicate that MotifMark can be a viable alternative technique for prediction of motif from protein binding microarrays and possibly other related high-throughput techniques.

  5. AMP-acetyl CoA synthetase from Leishmania donovani: identification and functional analysis of 'PX4GK' motif.

    PubMed

    Soumya, Neelagiri; Kumar, I Sravan; Shivaprasad, S; Gorakh, Landage Nitin; Dinesh, Neeradi; Swamy, Kayala Kambagiri; Singh, Sushma

    2015-04-01

    An adenosine monophosphate forming acetyl CoA synthetase (AceCS), which is the key enzyme involved in the conversion of acetate to acetyl CoA, has been identified from Leishmania donovani for the first time. Sequence analysis of L. donovani AceCS (LdAceCS) revealed the presence of a 'PX4GK' motif which is highly conserved across organisms, from those with higher sequence identity (96%) to those with lower sequence identity (38%). A ∼77 kDa heterologous protein with a C-terminal 6X His-tag was expressed in Escherichia coli. Expression of LdAceCS in promastigotes was confirmed by western blot and RT-PCR analysis. Immunolocalization studies revealed that it is a cytosolic protein. We also report the kinetic characterization of recombinant LdAceCS with acetate, adenosine 5'-triphosphate, coenzyme A and propionate as substrates. Site directed mutagenesis of residues in the conserved PX4GK motif of LdAceCS was performed to gain insight into its potential role in substrate binding and catalysis and in maintaining the structural integrity of the protein. P646A, G651A and K652R exhibited more than 90% loss in activity, signifying the indispensable role of these residues in the enzyme activity. Substitution of other residues in this motif resulted in altered substrate specificity and catalysis. However, none of them had any role in modulation of the secondary structure of the protein except the G651A mutant. Copyright © 2015 Elsevier B.V. All rights reserved.
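
    The residue numbering in this abstract is consistent with reading 'PX4GK' as P, four arbitrary residues, then G and K: P646 plus five positions gives G651, followed by K652. The sketch below, offered under that reading, locates the motif with 1-based positions and applies point substitutions written in the same style as the abstract (for example P646A). The helper names and the 12-residue toy peptide are hypothetical; the real LdAceCS sequence is not reproduced here.

```python
import re

# 'PX4GK' read as: P, any four residues, then G and K.
# The lookahead lets overlapping occurrences be reported as well.
PX4GK = re.compile(r"(?=(P.{4}GK))")


def find_px4gk(sequence):
    """Return 1-based (P, G, K) residue positions for every PX4GK occurrence."""
    return [(m.start() + 1, m.start() + 6, m.start() + 7)
            for m in PX4GK.finditer(sequence)]


def substitute(sequence, mutation):
    """Apply a point substitution such as 'P4A' (wild type, 1-based position, new residue)."""
    wild, pos, new = mutation[0], int(mutation[1:-1]), mutation[-1]
    if sequence[pos - 1] != wild:
        raise ValueError(f"expected {wild} at position {pos}, found {sequence[pos - 1]}")
    return sequence[:pos - 1] + new + sequence[pos:]


toy = "MSAPLTVEGKDD"  # hypothetical peptide, not LdAceCS
print(find_px4gk(toy))                      # positions of P, G and K
print(find_px4gk(substitute(toy, "P4A")))   # motif is lost after the substitution
```

    Checking the wild-type residue before substituting guards against off-by-one numbering errors, which are easy to make when moving between 0-based string indices and 1-based residue numbers.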

  6. Crystal structure of yeast allantoicase reveals a repeated jelly roll motif.

    PubMed

    Leulliot, Nicolas; Quevillon-Cheruel, Sophie; Sorel, Isabelle; Graille, Marc; Meyer, Philippe; Liger, Dominique; Blondeau, Karine; Janin, Joël; van Tilbeurgh, Herman

    2004-05-28

    Allantoicase (EC 3.5.3.4) catalyzes the conversion of allantoate into ureidoglycolate and urea, one of the final steps in the degradation of purines to urea. The mechanism of most enzymes involved in this pathway, which has been known for a long time, is unknown. In this paper we describe the three-dimensional crystal structure of the yeast allantoicase determined at a resolution of 2.6 Å by single anomalous diffraction. This constitutes the first structure for an enzyme of this pathway. The structure reveals a repeated jelly roll beta-sheet motif, also present in proteins of unrelated biochemical function. Allantoicase has a hexameric arrangement in the crystal (dimer of trimers). Analysis of the protein sequence against the structural data reveals the presence of two totally conserved surface patches, one on each jelly roll motif. The hexameric packing concentrates these patches into conserved pockets that probably constitute the active site.

  7. Motifs, modules and games in bacteria

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wolf, Denise M.; Arkin, Adam P.

    2003-04-01

    Global explorations of regulatory network dynamics, organization and evolution have become tractable thanks to high-throughput sequencing and molecular measurement of bacterial physiology. From these, a nascent conceptual framework is developing that views the principles of regulation in terms of motifs, modules and games. Motifs are small, repeated, and conserved biological units ranging from molecular domains to small reaction networks. They are arranged into functional modules, genetically dissectible cellular functions such as the cell cycle, or different stress responses. The dynamical functioning of modules defines the organism's strategy to survive in a game, pitting cell against cell, and cell against environment. Placing pathway structure and dynamics into an evolutionary context begins to allow discrimination between those physical and molecular features that particularize a species to its surroundings, and those that provide core physiological function. This approach promises to generate a higher level understanding of cellular design, pathway evolution and cellular bioengineering.

  8. Evolution subverting essentiality: Dispensability of the cell attachment Arg-Gly-Asp motif in multiply passaged foot-and-mouth disease virus

    PubMed Central

    Martínez, Miguel A.; Verdaguer, Nuria; Mateu, Mauricio G.; Domingo, Esteban

    1997-01-01

    Aphthoviruses use a conserved Arg-Gly-Asp triplet for attachment to host cells and this motif is believed to be essential for virus viability. Here we report that this triplet—which is also a widespread motif involved in cell-to-cell adhesion—can become dispensable upon short-term evolution of the virus harboring it. Foot-and-mouth disease virus (FMDV), which was multiply passaged in cell culture, showed an altered repertoire of antigenic variants resistant to a neutralizing monoclonal antibody. The altered repertoire includes variants with substitutions at the Arg-Gly-Asp motif. Mutants lacking this sequence replicated normally in cell culture and were indistinguishable from the parental virus. Studies with individual FMDV clones indicate that amino acid replacements on the capsid surface located around the loop harboring the Arg-Gly-Asp triplet may mediate in the dispensability of this motif. The results show that FMDV quasispecies evolving in a constant biological environment have the capability of rendering totally dispensable a receptor recognition motif previously invariant, and to ensure an alternative pathway for normal viral replication. Thus, variability of highly conserved motifs, even those that viruses have adapted from functional cellular motifs, can contribute to phenotypic flexibility of RNA viruses in nature. PMID:9192645

  9. Automatic Network Fingerprinting through Single-Node Motifs

    PubMed Central

    Echtermeyer, Christoph; da Fontoura Costa, Luciano; Rodrigues, Francisco A.; Kaiser, Marcus

    2011-01-01

    Complex networks have been characterised by their specific connectivity patterns (network motifs), but their building blocks can also be identified and described by node-motifs—a combination of local network features. One technique to identify single node-motifs has been presented by Costa et al. (L. D. F. Costa, F. A. Rodrigues, C. C. Hilgetag, and M. Kaiser, Europhys. Lett., 87, 1, 2009). Here, we first suggest improvements to the method including how its parameters can be determined automatically. Such automatic routines make high-throughput studies of many networks feasible. Second, the new routines are validated in different network-series. Third, we provide an example of how the method can be used to analyse network time-series. In conclusion, we provide a robust method for systematically discovering and classifying characteristic nodes of a network. In contrast to classical motif analysis, our approach can identify individual components (here: nodes) that are specific to a network. Such special nodes, as hubs before, might be found to play critical roles in real-world networks. PMID:21297963

  10. SA-Mot: a web server for the identification of motifs of interest extracted from protein loops

    PubMed Central

    Regad, Leslie; Saladin, Adrien; Maupetit, Julien; Geneix, Colette; Camproux, Anne-Claude

    2011-01-01

    The detection of functional motifs is an important step for the determination of protein functions. We present here a new web server SA-Mot (Structural Alphabet Motif) for the extraction and location of structural motifs of interest from protein loops. Contrary to other methods, SA-Mot does not focus only on functional motifs, but it extracts recurrent and conserved structural motifs involved in structural redundancy of loops. SA-Mot uses the structural word notion to extract all structural motifs from uni-dimensional sequences corresponding to loop structures. Then, SA-Mot provides a description of these structural motifs using statistics computed in the loop data set and in SCOP superfamily, sequence and structural parameters. SA-Mot results correspond to an interactive table listing all structural motifs extracted from a target structure and their associated descriptors. Using this information, the users can easily locate loop regions that are important for the protein folding and function. The SA-Mot web server is available at http://sa-mot.mti.univ-paris-diderot.fr. PMID:21665924

  11. SA-Mot: a web server for the identification of motifs of interest extracted from protein loops.

    PubMed

    Regad, Leslie; Saladin, Adrien; Maupetit, Julien; Geneix, Colette; Camproux, Anne-Claude

    2011-07-01

    The detection of functional motifs is an important step for the determination of protein functions. We present here a new web server SA-Mot (Structural Alphabet Motif) for the extraction and location of structural motifs of interest from protein loops. Contrary to other methods, SA-Mot does not focus only on functional motifs, but it extracts recurrent and conserved structural motifs involved in structural redundancy of loops. SA-Mot uses the structural word notion to extract all structural motifs from uni-dimensional sequences corresponding to loop structures. Then, SA-Mot provides a description of these structural motifs using statistics computed in the loop data set and in SCOP superfamily, sequence and structural parameters. SA-Mot results correspond to an interactive table listing all structural motifs extracted from a target structure and their associated descriptors. Using this information, the users can easily locate loop regions that are important for the protein folding and function. The SA-Mot web server is available at http://sa-mot.mti.univ-paris-diderot.fr.

  12. Analysis of septins across kingdoms reveals orthology and new motifs.

    PubMed

    Pan, Fangfang; Malmberg, Russell L; Momany, Michelle

    2007-07-01

    Septins are cytoskeletal GTPase proteins first discovered in the fungus Saccharomyces cerevisiae where they organize the septum and link nuclear division with cell division. More recently septins have been found in animals where they are important in processes ranging from actin and microtubule organization to embryonic patterning and where defects in septins have been implicated in human disease. Previous studies suggested that many animal septins fell into independent evolutionary groups, confounding cross-kingdom comparison. In the current work, we identified 162 septins from fungi, microsporidia and animals and analyzed their phylogenetic relationships. There was support for five groups of septins with orthology between kingdoms. Group 1 (which includes S. cerevisiae Cdc10p and human Sept9) and Group 2 (which includes S. cerevisiae Cdc3p and human Sept7) contain sequences from fungi and animals. Group 3 (which includes S. cerevisiae Cdc11p) and Group 4 (which includes S. cerevisiae Cdc12p) contain sequences from fungi and microsporidia. Group 5 (which includes Aspergillus nidulans AspE) contains sequences from filamentous fungi. We suggest a modified nomenclature based on these phylogenetic relationships. Comparative sequence alignments revealed septin derivatives of already known G1, G3 and G4 GTPase motifs, four new motifs from two to twelve amino acids long and six conserved single amino acid positions. One of these new motifs is septin-specific and several are group specific. Our studies provide an evolutionary history for this important family of proteins and a framework and consistent nomenclature for comparison of septin orthologs across kingdoms.

  13. Analysis of evolutionary conservation patterns and their influence on identifying protein functional sites.

    PubMed

    Fang, Chun; Noguchi, Tamotsu; Yamana, Hayato

    2014-10-01

    Evolutionary conservation information included in the position-specific scoring matrix (PSSM) has been widely adopted by sequence-based methods for identifying protein functional sites, because all functional sites, whether in ordered or disordered proteins, are found to be conserved to some extent. However, different functional sites have different conservation patterns: some are linear contextual, some are mingled with highly variable residues, and others seem to be conserved independently. Every value in a PSSM is calculated independently of the others, without carrying the contextual information of residues in the sequence. Therefore, adopting the direct output of the PSSM for prediction fails to consider the relationship between conservation patterns of residues and the distribution of conservation scores in PSSMs. In order to demonstrate the importance of combining PSSMs with the specific conservation patterns of functional sites for prediction, three different PSSM-based methods for identifying three kinds of functional sites have been analyzed. Results suggest that different PSSM-based methods differ in their capability to identify different patterns of functional sites, and that better combining PSSMs with the specific conservation patterns of residues would largely facilitate the prediction.
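
    To make the point about column independence concrete, the following is a minimal sketch of how a PSSM can be built from an ungapped alignment. It is not the procedure used in the study: the uniform background, the pseudocount of 1, and the three-sequence toy alignment are all simplifying assumptions chosen for illustration.

```python
import math
from collections import Counter

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(alignment, pseudocount=1.0):
    """Log-odds score per column against a uniform background (a simplifying assumption).

    Expects an ungapped alignment of equal-length sequences.
    """
    background = 1.0 / len(ALPHABET)
    total = len(alignment) + pseudocount * len(ALPHABET)
    pssm = []
    for column in zip(*alignment):
        # Each column is processed on its own; neighbouring columns are never consulted.
        counts = Counter(column)
        pssm.append({
            aa: math.log2(((counts.get(aa, 0) + pseudocount) / total) / background)
            for aa in ALPHABET
        })
    return pssm


def score(pssm, peptide):
    """Sum the per-position scores of a peptide of the same length as the PSSM."""
    return sum(column[aa] for column, aa in zip(pssm, peptide))


# Hypothetical toy alignment, for illustration only.
alignment = ["PLTVEGK", "PASIDGK", "PMTLEGK"]
pssm = build_pssm(alignment)
print(round(pssm[0]["P"], 2), round(pssm[0]["A"], 2))
print(score(pssm, "PLTVEGK") > score(pssm, "AAAAAAA"))
```

    Because the total score is a plain sum over columns, any dependence between positions (for instance, a conserved residue that matters only next to a variable one) is invisible to it, which is the limitation the abstract describes.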

  14. Isosteric And Non-Isosteric Base Pairs In RNA Motifs: Molecular Dynamics And Bioinformatics Study Of The Sarcin-Ricin Internal Loop

    PubMed Central

    Havrila, Marek; Réblová, Kamila; Zirbel, Craig L.; Leontis, Neocles B.; Šponer, Jiří

    2013-01-01

    The Sarcin-Ricin RNA motif (SR motif) is one of the most prominent recurrent RNA building blocks that occurs in many different RNA contexts and folds autonomously, i.e., in a context-independent manner. In this study, we combined bioinformatics analysis with explicit-solvent molecular dynamics (MD) simulations to better understand the relation between the RNA sequence and the evolutionary patterns of SR motif. SHAPE probing experiment was also performed to confirm fidelity of MD simulations. We identified 57 instances of the SR motif in a non-redundant subset of the RNA X-ray structure database and analyzed their basepairing, base-phosphate, and backbone-backbone interactions. We extracted sequences aligned to these instances from large ribosomal RNA alignments to determine frequency of occurrence for different sequence variants. We then used a simple scoring scheme based on isostericity to suggest 10 sequence variants with highly variable expected degree of compatibility with the SR motif 3D structure. We carried out MD simulations of SR motifs with these base substitutions. Non isosteric base substitutions led to unstable structures, but so did isosteric substitutions which were unable to make key base-phosphate interactions. MD technique explains why some potentially isosteric SR motifs are not realized during evolution. We also found that inability to form stable cWW geometry is an important factor in case of the first base pair of the flexible region of the SR motif. Comparison of structural, bioinformatics, SHAPE probing and MD simulation data reveals that explicit solvent MD simulations neatly reflect viability of different sequence variants of the SR motif. Thus, MD simulations can efficiently complement bioinformatics tools in studies of conservation patterns of RNA motifs and provide atomistic insight into the role of their different signature interactions. PMID:24144333

  15. DNA motifs associated with aberrant CpG island methylation.

    PubMed

    Feltus, F Alex; Lee, Eva K; Costello, Joseph F; Plass, Christoph; Vertino, Paula M

    2006-05-01

    Epigenetic silencing involving the aberrant methylation of promoter region CpG islands is widely recognized as a tumor suppressor silencing mechanism in cancer. However, the molecular pathways underlying aberrant DNA methylation remain elusive. Recently we showed that, on a genome-wide level, CpG island loci differ in their intrinsic susceptibility to aberrant methylation and that this susceptibility can be predicted based on underlying sequence context. These data suggest that there are sequence/structural features that contribute to the protection from or susceptibility to aberrant methylation. Here we use motif elicitation coupled with classification techniques to identify DNA sequence motifs that selectively define methylation-prone or methylation-resistant CpG islands. Motifs common to 28 methylation-prone or 47 methylation-resistant CpG island-containing genomic fragments were determined using the MEME and MAST algorithms. The five most discriminatory motifs derived from methylation-prone sequences were found to be associated with CpG islands in general and were nonrandomly distributed throughout the genome. In contrast, the eight most discriminatory motifs derived from the methylation-resistant CpG islands were randomly distributed throughout the genome. Interestingly, this latter group tended to associate with Alu and other repetitive sequences. Used together, the frequency of occurrence of these motifs successfully discriminated methylation-prone and methylation-resistant CpG island groups with an accuracy of 87% after 10-fold cross-validation. The motifs identified here are candidate methylation-targeting or methylation-protection DNA sequences.

  16. Production of mouse monoclonal antibody against Streptococcus dysgalactiae GapC protein and mapping its conserved B-cell epitope.

    PubMed

    Zhang, Limeng; Zhang, Hua; Fan, Ziyao; Zhou, Xue; Yu, Liquan; Sun, Hunan; Wu, Zhijun; Yu, Yongzhong; Song, Baifen; Ma, Jinzhu; Tong, Chunyu; Zhu, Zhanbo; Cui, Yudong

    2015-02-01

    Streptococcus dysgalactiae (S. dysgalactiae) GapC protein is a protective antigen that induces partial immunity against S. dysgalactiae infection in animals. To identify the conserved B-cell epitope of S. dysgalactiae GapC, a mouse monoclonal antibody 1E11 (mAb1E11) against GapC was generated and used to screen a phage-displayed 12-mer random peptide library (Ph.D.-12). Eleven positive clones recognized by mAb1E11 were identified, most of which matched the consensus motif TGFFAKK. Sequence of the motif exactly matched amino acids 97-103 of the S. dysgalactiae GapC. In addition, the epitope (97)TGFFAKK(103) showed high homology among different streptococcus species. Site-directed mutagenic analysis further confirmed that residues G98, F99, F100 and K103 formed the core of (97)TGFFAKK(103), and this core motif was the minimal determinant of the B-cell epitope recognized by the mAb1E11. Collectively, the identification of conserved B-cell epitope within S. dysgalactiae GapC highlights the possibility of developing the epitope-based vaccine. Copyright © 2014 Elsevier Ltd. All rights reserved.

  17. De novo discovery of structural motifs in RNA 3D structures through clustering.

    PubMed

    Ge, Ping; Islam, Shahidul; Zhong, Cuncong; Zhang, Shaojie

    2018-05-18

    As functional components in three-dimensional (3D) conformation of an RNA, the RNA structural motifs provide an easy way to associate the molecular architectures with their biological mechanisms. In the past years, many computational tools have been developed to search motif instances by using the existing knowledge of well-studied families. Recently, with the rapidly increasing number of resolved RNA 3D structures, there is an urgent need to discover novel motifs with the newly presented information. In this work, we classify all the loops in non-redundant RNA 3D structures to detect plausible RNA structural motif families by using a clustering pipeline. Compared with other clustering approaches, our method has two benefits: first, the underlying alignment algorithm is tolerant to the variations in 3D structures. Second, sophisticated downstream analysis has been performed to ensure the clusters are valid and easily applied to further research. The final clustering results contain many interesting new variants of known motif families, such as GNAA tetraloop, kink-turn, sarcin-ricin and T-loop. We have also discovered potential novel functional motifs conserved in ribosomal RNA, sgRNA, SRP RNA, riboswitch and ribozyme.

  18. Interleukin-11 binds specific EF-hand proteins via their conserved structural motifs.

    PubMed

    Kazakov, Alexei S; Sokolov, Andrei S; Vologzhannikova, Alisa A; Permyakova, Maria E; Khorn, Polina A; Ismailov, Ramis G; Denessiouk, Konstantin A; Denesyuk, Alexander I; Rastrygina, Victoria A; Baksheeva, Viktoriia E; Zernii, Evgeni Yu; Zinchenko, Dmitry V; Glazatov, Vladimir V; Uversky, Vladimir N; Mirzabekov, Tajib A; Permyakov, Eugene A; Permyakov, Sergei E

    2017-01-01

    Interleukin-11 (IL-11) is a hematopoietic cytokine engaged in numerous biological processes and validated as a target for treatment of various cancers. IL-11 contains intrinsically disordered regions that might recognize multiple targets. Recently we found that aside from IL-11RA and gp130 receptors, IL-11 interacts with calcium sensor protein S100P. Strict calcium dependence of this interaction suggests a possibility of IL-11 interaction with other calcium sensor proteins. Here we probed specificity of IL-11 to calcium-binding proteins of various types: calcium sensors of the EF-hand family (calmodulin, S100B and neuronal calcium sensors: recoverin, NCS-1, GCAP-1, GCAP-2), calcium buffers of the EF-hand family (S100G, oncomodulin), and a non-EF-hand calcium buffer (α-lactalbumin). A specific subset of the calcium sensor proteins (calmodulin, S100B, NCS-1, GCAP-1/2) exhibits metal-dependent binding of IL-11 with dissociation constants of 1-19 μM. These proteins share several amino acid residues belonging to conservative structural motifs of the EF-hand proteins, 'black' and 'gray' clusters. Replacements of the respective S100P residues by alanine drastically decrease its affinity to IL-11, suggesting their involvement into the association process. Secondary structure and accessibility of the hinge region of the EF-hand proteins studied are predicted to control specificity and selectivity of their binding to IL-11. The IL-11 interaction with the EF-hand proteins is expected to occur under numerous pathological conditions, accompanied by disintegration of plasma membrane and efflux of cellular components into the extracellular milieu.

  19. SARNAclust: Semi-automatic detection of RNA protein binding motifs from immunoprecipitation data

    PubMed Central

    Dotu, Ivan; Adamson, Scott I.; Coleman, Benjamin; Fournier, Cyril; Ricart-Altimiras, Emma; Eyras, Eduardo

    2018-01-01

    RNA-protein binding is critical to gene regulation, controlling fundamental processes including splicing, translation, localization and stability, and aberrant RNA-protein interactions are known to play a role in a wide variety of diseases. However, molecular understanding of RNA-protein interactions remains limited; in particular, identification of RNA motifs that bind proteins has long been challenging, especially when such motifs depend on both sequence and structure. Moreover, although RNA binding proteins (RBPs) often contain more than one binding domain, algorithms capable of identifying more than one binding motif simultaneously have not been developed. In this paper we present a novel pipeline to determine binding peaks in crosslinking immunoprecipitation (CLIP) data, to discover multiple possible RNA sequence/structure motifs among them, and to experimentally validate such motifs. At the core is a new semi-automatic algorithm SARNAclust, the first unsupervised method to identify and deconvolve multiple sequence/structure motifs simultaneously. SARNAclust computes similarity between sequence/structure objects using a graph kernel, providing the ability to isolate the impact of specific features through the bulge graph formalism. Application of SARNAclust to synthetic data shows its capability of clustering 5 motifs at once with a V-measure value of over 0.95, while GraphClust achieves only a V-measure of 0.083 and RNAcontext cannot detect any of the motifs. When applied to existing eCLIP sets, SARNAclust finds known motifs for SLBP and HNRNPC and novel motifs for several other RBPs such as AGGF1, AKAP8L and ILF3. We demonstrate an experimental validation protocol, a targeted Bind-n-Seq-like high-throughput sequencing approach that relies on RNA inverse folding for oligo pool design, that can validate the components within the SLBP motif. Finally, we use this protocol to experimentally interrogate the SARNAclust motif predictions for protein ILF3. Our

  20. Loss of a highly conserved sterile alpha motif domain gene (WEEP) results in pendulous branch growth in peach trees.

    PubMed

    Hollender, Courtney A; Pascal, Thierry; Tabb, Amy; Hadiarto, Toto; Srinivasan, Chinnathambi; Wang, Wanpeng; Liu, Zhongchi; Scorza, Ralph; Dardick, Chris

    2018-05-15

    Plant shoots typically grow upward in opposition to the pull of gravity. However, exceptions exist throughout the plant kingdom. Most conspicuous are trees with weeping or pendulous branches. While such trees have long been cultivated and appreciated for their ornamental value, the molecular basis behind the weeping habit is not known. Here, we characterized a weeping tree phenotype in Prunus persica (peach) and identified the underlying genetic mutation using a genomic sequencing approach. Weeping peach tree shoots exhibited a downward elliptical growth pattern and did not exhibit an upward bending in response to 90° reorientation. The causative allele was found to be an uncharacterized gene, Ppa013325, having a 1.8-Kb deletion spanning the 5' end. This gene, dubbed WEEP, was predominantly expressed in phloem tissues and encodes a highly conserved 129-amino acid protein containing a sterile alpha motif (SAM) domain. Silencing WEEP in the related tree species Prunus domestica (plum) resulted in more outward, downward, and wandering shoot orientations compared to standard trees, supporting a role for WEEP in directing lateral shoot growth in trees. This previously unknown regulator of branch orientation, which may also be a regulator of gravity perception or response, provides insights into our understanding of how tree branches grow in opposition to gravity and could serve as a critical target for manipulating tree architecture for improved tree shape in agricultural and horticultural applications. Copyright © 2018 the Author(s). Published by PNAS.

  1. Chaotic Motifs in Gene Regulatory Networks

    PubMed Central

    Zhang, Zhaoyang; Ye, Weiming; Qian, Yu; Zheng, Zhigang; Huang, Xuhui; Hu, Gang

    2012-01-01

    Chaos should occur often in gene regulatory networks (GRNs) which have been widely described by nonlinear coupled ordinary differential equations, if their dimensions are no less than 3. It is therefore puzzling that chaos has never been reported in GRNs in nature and is also extremely rare in models of GRNs. On the other hand, the topic of motifs has attracted great attention in studying biological networks, and network motifs are suggested to be elementary building blocks that carry out some key functions in the network. In this paper, chaotic motifs (subnetworks with chaos) in GRNs are systematically investigated. The conclusion is that: (i) chaos can only appear through competitions between different oscillatory modes with rivaling intensities. Conditions required for chaotic GRNs are found to be very strict, which make chaotic GRNs extremely rare. (ii) Chaotic motifs are explored as the simplest few-node structures capable of producing chaos, and serve as the intrinsic source of chaos of random few-node GRNs. Several optimal motifs causing chaos with atypically high probability are figured out. (iii) Moreover, we discovered that a number of special oscillators can never produce chaos. These structures bring some advantages on rhythmic functions and may help us understand the robustness of diverse biological rhythms. (iv) The methods of dominant phase-advanced driving (DPAD) and DPAD time fraction are proposed to quantitatively identify chaotic motifs and to explain the origin of chaotic behaviors in GRNs. PMID:22792171

  2. Understanding the role of histidine in the GHSxG acyltransferase active site motif: Evidence for histidine stabilization of the malonyl-enzyme intermediate

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Poust, Sean; Yoon, Isu; Adams, Paul D.

    Acyltransferases determine which extender units are incorporated into polyketide and fatty acid products. Thus, the ping-pong acyltransferase mechanism utilizes a serine in a conserved GHSxG motif. However, the role of the conserved histidine in this motif is poorly understood. We observed that a histidine to alanine mutation (H640A) in the GHSxG motif of the malonyl-CoA specific yersiniabactin acyltransferase results in an approximately seven-fold higher hydrolysis rate over the wildtype enzyme, while retaining transacylation activity. We propose two possibilities for the reduction in hydrolysis rate: either H640 structurally stabilizes the protein by hydrogen bonding with a conserved asparagine in the ferredoxin-like subdomain of the protein, or a water-mediated hydrogen bond between H640 and the malonyl moiety stabilizes the malonyl-O-AT ester intermediate.

  3. Understanding the role of histidine in the GHSxG acyltransferase active site motif: Evidence for histidine stabilization of the malonyl-enzyme intermediate

    DOE PAGES

    Poust, Sean; Yoon, Isu; Adams, Paul D.; ...

    2014-10-06

    Acyltransferases determine which extender units are incorporated into polyketide and fatty acid products. Thus, the ping-pong acyltransferase mechanism utilizes a serine in a conserved GHSxG motif. However, the role of the conserved histidine in this motif is poorly understood. We observed that a histidine to alanine mutation (H640A) in the GHSxG motif of the malonyl-CoA specific yersiniabactin acyltransferase results in an approximately seven-fold higher hydrolysis rate over the wildtype enzyme, while retaining transacylation activity. We propose two possibilities for the reduction in hydrolysis rate: either H640 structurally stabilizes the protein by hydrogen bonding with a conserved asparagine in the ferredoxin-like subdomain of the protein, or a water-mediated hydrogen bond between H640 and the malonyl moiety stabilizes the malonyl-O-AT ester intermediate.

  4. A Bioinformatics Approach for Detecting Repetitive Nested Motifs using Pattern Matching.

    PubMed

    Romero, José R; Carballido, Jessica A; Garbus, Ingrid; Echenique, Viviana C; Ponzoni, Ignacio

    2016-01-01

    The identification of nested motifs in genomic sequences is a complex computational problem. The detection of these patterns is important to allow the discovery of transposable element (TE) insertions, incomplete reverse transcripts, deletions, and/or mutations. In this study, a de novo strategy for detecting patterns that represent nested motifs was designed based on exhaustive searches for pairs of motifs and combinatorial pattern analysis. These patterns can be grouped into three categories: motifs within other motifs, motifs flanked by other motifs, and motifs of large size. The methodology used in this study, applied to genomic sequences from the plant species Aegilops tauschii and Oryza sativa, revealed that it is possible to identify putative nested TEs by detecting these three types of patterns. The results were validated through BLAST alignments, which revealed the efficacy and usefulness of the new method, which is called Mamushka.

  5. Triadic motifs in the dependence networks of virtual societies.

    PubMed

    Xie, Wen-Jie; Li, Ming-Xia; Jiang, Zhi-Qiang; Zhou, Wei-Xing

    2014-06-10

    In friendship networks, individuals have different numbers of friends, and the closeness or intimacy between an individual and her friends is heterogeneous. Using a statistical filtering method to identify relationships about who depends on whom, we construct dependence networks (which are directed) from weighted friendship networks of avatars in more than two hundred virtual societies of a massively multiplayer online role-playing game (MMORPG). We investigate the evolution of triadic motifs in dependence networks. Several metrics show that the virtual societies evolved through a transient stage in the first two to three weeks and reached a relatively stable stage. We find that the unidirectional loop motif (M9) is underrepresented and does not appear, open motifs are also underrepresented, while other close motifs are overrepresented. We also find that, for most motifs, the overall level difference of the three avatars in the same motif is significantly lower than average, whereas the sum of ranks is only slightly larger than average. Our findings show that avatars' social status plays an important role in the formation of triadic motifs.
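
    The unidirectional loop motif mentioned in this abstract can be illustrated with a small counting routine. This is a minimal sketch, not the authors' method: the function name, the edge-list input format, and the strict definition used here (exactly the three edges a->b->c->a among the three nodes, with no reciprocal edges) are assumptions made for illustration.

```python
from itertools import combinations


def count_unidirectional_loops(edges):
    """Count node triples whose only mutual edges form a directed 3-cycle.

    `edges` is an iterable of (source, target) pairs of a directed graph.
    A triple counts only if exactly three edges exist among its nodes and
    they form a->b->c->a or the reverse orientation (assumed definition).
    """
    edge_set = set(edges)
    nodes = sorted({node for edge in edge_set for node in edge})
    count = 0
    for a, b, c in combinations(nodes, 3):
        triple = (a, b, c)
        # Collect every directed edge that runs between two of the three nodes.
        present = {
            (u, v)
            for u in triple
            for v in triple
            if u != v and (u, v) in edge_set
        }
        clockwise = {(a, b), (b, c), (c, a)}
        counter_clockwise = {(a, c), (c, b), (b, a)}
        if present == clockwise or present == counter_clockwise:
            count += 1
    return count


# Hypothetical toy dependence network: x -> y -> z -> x plus one extra edge.
toy_edges = [("x", "y"), ("y", "z"), ("z", "x"), ("x", "w")]
loops = count_unidirectional_loops(toy_edges)
```

    The exhaustive loop over triples is cubic in the number of nodes, which is fine for toy inputs; testing whether a motif is under- or overrepresented would additionally require comparison against randomized networks, which this sketch does not attempt.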

  6. Triadic motifs in the dependence networks of virtual societies

    NASA Astrophysics Data System (ADS)

    Xie, Wen-Jie; Li, Ming-Xia; Jiang, Zhi-Qiang; Zhou, Wei-Xing

    2014-06-01

    In friendship networks, individuals have different numbers of friends, and the closeness or intimacy between an individual and her friends is heterogeneous. Using a statistical filtering method to identify relationships about who depends on whom, we construct dependence networks (which are directed) from weighted friendship networks of avatars in more than two hundred virtual societies of a massively multiplayer online role-playing game (MMORPG). We investigate the evolution of triadic motifs in dependence networks. Several metrics show that the virtual societies evolved through a transient stage in the first two to three weeks and reached a relatively stable stage. We find that the unidirectional loop motif (M9) is underrepresented and does not appear, open motifs are also underrepresented, while other close motifs are overrepresented. We also find that, for most motifs, the overall level difference of the three avatars in the same motif is significantly lower than average, whereas the sum of ranks is only slightly larger than average. Our findings show that avatars' social status plays an important role in the formation of triadic motifs.

  7. Triadic motifs in the dependence networks of virtual societies

    PubMed Central

    Xie, Wen-Jie; Li, Ming-Xia; Jiang, Zhi-Qiang; Zhou, Wei-Xing

    2014-01-01

    In friendship networks, individuals have different numbers of friends, and the closeness or intimacy between an individual and her friends is heterogeneous. Using a statistical filtering method to identify relationships about who depends on whom, we construct dependence networks (which are directed) from weighted friendship networks of avatars in more than two hundred virtual societies of a massively multiplayer online role-playing game (MMORPG). We investigate the evolution of triadic motifs in dependence networks. Several metrics show that the virtual societies evolved through a transient stage in the first two to three weeks and reached a relatively stable stage. We find that the unidirectional loop motif (M9) is underrepresented and does not appear, open motifs are also underrepresented, while other close motifs are overrepresented. We also find that, for most motifs, the overall level difference of the three avatars in the same motif is significantly lower than average, whereas the sum of ranks is only slightly larger than average. Our findings show that avatars' social status plays an important role in the formation of triadic motifs. PMID:24912755

  8. One motif to bind them: A small-XXX-small motif affects transmembrane domain 1 oligomerization, function, localization, and cross-talk between two yeast GPCRs.

    PubMed

    Lock, Antonia; Forfar, Rachel; Weston, Cathryn; Bowsher, Leo; Upton, Graham J G; Reynolds, Christopher A; Ladds, Graham; Dixon, Ann M

    2014-12-01

    G protein-coupled receptors (GPCRs) are the largest family of cell-surface receptors in mammals and facilitate a range of physiological responses triggered by a variety of ligands. GPCRs were thought to function as monomers, however it is now accepted that GPCR homo- and hetero-oligomers also exist and influence receptor properties. The Schizosaccharomyces pombe GPCR Mam2 is a pheromone-sensing receptor involved in mating and has previously been shown to form oligomers in vivo. The first transmembrane domain (TMD) of Mam2 contains a small-XXX-small motif, overrepresented in membrane proteins and well-known for promoting helix-helix interactions. An ortholog of Mam2 in Saccharomyces cerevisiae, Ste2, contains an analogous small-XXX-small motif which has been shown to contribute to receptor homo-oligomerization, localization and function. Here we have used experimental and computational techniques to characterize the role of the small-XXX-small motif in function and assembly of Mam2 for the first time. We find that disruption of the motif via mutagenesis leads to reduction of Mam2 TMD1 homo-oligomerization and pheromone-responsive cellular signaling of the full-length protein. It also impairs correct targeting to the plasma membrane. Mutation of the analogous motif in Ste2 yielded similar results, suggesting a conserved mechanism for assembly. Using co-expression of the two fungal receptors in conjunction with computational models, we demonstrate a functional change in G protein specificity and propose that this is brought about through hetero-dimeric interactions of Mam2 with Ste2 via the complementary small-XXX-small motifs. This highlights the potential of these motifs to affect a range of properties that can be investigated in other GPCRs. Copyright © 2014. Published by Elsevier B.V.
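
    A small-XXX-small motif of the kind described in this abstract can be located in a sequence with a regular expression. The following is a minimal sketch under stated assumptions: the set of "small" residues is taken here to be glycine, alanine and serine (other definitions exist), and the function name and example sequence are hypothetical rather than taken from Mam2 or Ste2.

```python
import re

# Assumption: "small" residues are Gly, Ala and Ser; adjust as needed.
SMALL_RESIDUES = "GAS"

# A lookahead is used so that overlapping occurrences are all reported.
SMALL_XXX_SMALL = re.compile(
    rf"(?=([{SMALL_RESIDUES}][A-Z]{{3}}[{SMALL_RESIDUES}]))"
)


def find_small_xxx_small(sequence):
    """Return (zero-based start, matched 5-mer) for every small-XXX-small hit."""
    return [
        (match.start(), match.group(1))
        for match in SMALL_XXX_SMALL.finditer(sequence.upper())
    ]


# Hypothetical transmembrane-like fragment, not a real receptor sequence.
hits = find_small_xxx_small("LLGVVVALLS")
```

    A sequence match alone does not show that the motif drives helix-helix interaction; it only flags candidate positions for the kind of mutagenesis the abstract describes.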

  9. Newly identified essential amino acid residues affecting Δ8-sphingolipid desaturase activity revealed by site-directed mutagenesis

    USDA-ARS's Scientific Manuscript database

    In order to identify amino acid residues crucial for the enzymatic activity of Δ8-sphingolipid desaturases, a sequence comparison was performed among Δ8-sphingolipid desaturases and Δ6-fatty acid desaturase from various plants. In addition to the known conserved cytb5 (cytochrome b5) HPGG motif and...

  10. Computational Analyses of Synergism in Small Molecular Network Motifs

    PubMed Central

    Zhang, Yili; Smolen, Paul; Baxter, Douglas A.; Byrne, John H.

    2014-01-01

    Cellular functions and responses to stimuli are controlled by complex regulatory networks that comprise a large diversity of molecular components and their interactions. However, achieving an intuitive understanding of the dynamical properties and responses to stimuli of these networks is hampered by their large scale and complexity. To address this issue, analyses of regulatory networks often focus on reduced models that depict distinct, reoccurring connectivity patterns referred to as motifs. Previous modeling studies have begun to characterize the dynamics of small motifs, and to describe ways in which variations in parameters affect their responses to stimuli. The present study investigates how variations in pairs of parameters affect responses in a series of ten common network motifs, identifying concurrent variations that act synergistically (or antagonistically) to alter the responses of the motifs to stimuli. Synergism (or antagonism) was quantified using degrees of nonlinear blending and additive synergism. Simulations identified concurrent variations that maximized synergism, and examined the ways in which it was affected by stimulus protocols and the architecture of a motif. Only a subset of architectures exhibited synergism following paired changes in parameters. The approach was then applied to a model describing interlocked feedback loops governing the synthesis of the CREB1 and CREB2 transcription factors. The effects of motifs on synergism for this biologically realistic model were consistent with those for the abstract models of single motifs. These results have implications for the rational design of combination drug therapies with the potential for synergistic interactions. PMID:24651495

  11. DMINDA: an integrated web server for DNA motif identification and analyses

    PubMed Central

    Ma, Qin; Zhang, Hanyuan; Mao, Xizeng; Zhou, Chuan; Liu, Bingqiang; Chen, Xin; Xu, Ying

    2014-01-01

    DMINDA (DNA motif identification and analyses) is an integrated web server for DNA motif identification and analyses, which is accessible at http://csbl.bmb.uga.edu/DMINDA/. This web site is freely available to all users and there is no login requirement. This server provides a suite of cis-regulatory motif analysis functions on DNA sequences, which are important to elucidation of the mechanisms of transcriptional regulation: (i) de novo motif finding for a given set of promoter sequences along with statistical scores for the predicted motifs derived based on information extracted from a control set, (ii) scanning motif instances of a query motif in provided genomic sequences, (iii) motif comparison and clustering of identified motifs, and (iv) co-occurrence analyses of query motifs in given promoter sequences. The server is powered by a backend computer cluster with over 150 computing nodes, and is particularly useful for motif prediction and analyses in prokaryotic genomes. We believe that DMINDA, as a new and comprehensive web server for cis-regulatory motif finding and analyses, will benefit the genomic research community in general and prokaryotic genome researchers in particular. PMID:24753419

  12. Mining for class-specific motifs in protein sequence classification

    PubMed Central

    2013-01-01

    Background In protein sequence classification, identification of the sequence motifs or n-grams that can precisely discriminate between classes is a more interesting scientific question than the classification itself. A number of classification methods aim at accurate classification but fail to explain which sequence features indeed contribute to the accuracy. We hypothesize that sequences in lower denominations (n-grams) can be used to explore the sequence landscape and to identify class-specific motifs that discriminate between classes during classification. Discriminative n-grams are short peptide sequences that are highly frequent in one class but are either minimally present or absent in other classes. In this study, we present a new substitution-based scoring function for identifying discriminative n-grams that are highly specific to a class. Results We present a scoring function based on discriminative n-grams that can effectively discriminate between classes. The scoring function, initially, harvests the entire set of 4- to 8-grams from the protein sequences of different classes in the dataset. Similar n-grams of the same size are combined to form new n-grams, where the similarity is defined by positive amino acid substitution scores in the BLOSUM62 matrix. Substitution has resulted in a large increase in the number of discriminatory n-grams harvested. Due to the unbalanced nature of the dataset, the frequencies of the n-grams are normalized using a dampening factor, which gives more weightage to the n-grams that appear in fewer classes and vice-versa. After the n-grams are normalized, the scoring function identifies discriminative 4- to 8-grams for each class that are frequent enough to be above a selection threshold. By mapping these discriminative n-grams back to the protein sequences, we obtained contiguous n-grams that represent short class-specific motifs in protein sequences. 
    Our method fared well compared to an existing motif finding method known as
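
    The n-gram harvesting step described in this abstract can be sketched as follows. This is a deliberately simplified illustration, not the published scoring function: it omits the BLOSUM62-based merging of similar n-grams and the dampening-factor normalization, and it keeps only n-grams that are frequent in one class and entirely absent from the others. The function names, the `min_fraction` parameter, and the toy sequences are assumptions.

```python
from collections import Counter


def harvest_ngrams(sequence, n_min=4, n_max=8):
    """Return the set of all substrings of length n_min..n_max."""
    return {
        sequence[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(sequence) - n + 1)
    }


def class_specific_ngrams(classes, min_fraction=0.5):
    """Find n-grams frequent in one class and absent from every other class.

    `classes` maps a class name to a list of protein sequences. An n-gram is
    kept for a class if it occurs in at least `min_fraction` of that class's
    sequences and in no sequence of any other class (simplified criterion).
    """
    # Count, per class, in how many sequences each n-gram occurs.
    occurrence = {}
    for name, sequences in classes.items():
        counts = Counter()
        for sequence in sequences:
            counts.update(harvest_ngrams(sequence))
        occurrence[name] = counts

    result = {}
    for name, counts in occurrence.items():
        seen_elsewhere = set()
        for other, other_counts in occurrence.items():
            if other != name:
                seen_elsewhere.update(other_counts)
        threshold = min_fraction * len(classes[name])
        result[name] = {
            gram
            for gram, k in counts.items()
            if k >= threshold and gram not in seen_elsewhere
        }
    return result


# Hypothetical toy dataset with two classes.
toy_classes = {
    "A": ["MKRRLLAG", "TTKRRLQQ"],
    "B": ["MKAALLAG", "GGGGSSSS"],
}
specific = class_specific_ngrams(toy_classes, min_fraction=1.0)
```

    In the toy data the only n-gram shared by both class A sequences and missing from class B is `KRRL`; a substitution-aware version would also group near-identical n-grams before counting.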

  13. Social Network Analysis Identifies Key Participants in Conservation Development.

    PubMed

    Farr, Cooper M; Reed, Sarah E; Pejchar, Liba

    2018-05-01

    Understanding patterns of participation in private lands conservation, which is often implemented voluntarily by individual citizens and private organizations, could improve its effectiveness at combating biodiversity loss. We used social network analysis (SNA) to examine participation in conservation development (CD), a private land conservation strategy that clusters houses in a small portion of a property while preserving the remaining land as protected open space. Using data from public records for six counties in Colorado, USA, we compared CD participation patterns among counties and identified actors that most often work with others to implement CDs. We found that social network characteristics differed among counties. The network density, or proportion of connections in the network, varied from fewer than 2 to nearly 15%, and was higher in counties with smaller populations and fewer CDs. Centralization, or the degree to which connections are held disproportionately by a few key actors, was not correlated strongly with any county characteristics. Network characteristics were not correlated with the prevalence of wildlife-friendly design features in CDs. The most highly connected actors were biological and geological consultants, surveyors, and engineers. Our work demonstrates a new application of SNA to land-use planning, in which CD network patterns are examined and key actors are identified. For better conservation outcomes of CD, we recommend using network patterns to guide strategies for outreach and information dissemination, and engaging with highly connected actor types to encourage widespread adoption of best practices for CD design and stewardship.
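
    The two network measures named in this abstract can be made concrete with a short calculation. This is a minimal sketch under assumptions: the network is treated as undirected and unweighted, density is the number of observed ties divided by the number of possible ties, and centralization is computed with Freeman's degree centralization formula, which may differ from the measure the authors used. Function names and the example network are hypothetical.

```python
def _unique_ties(edges):
    """Drop self-loops and duplicate ties from an undirected edge list."""
    return {frozenset(edge) for edge in edges if edge[0] != edge[1]}


def network_density(nodes, edges):
    """Proportion of possible undirected ties that are actually present."""
    n = len(set(nodes))
    possible = n * (n - 1) / 2
    if possible == 0:
        return 0.0
    return len(_unique_ties(edges)) / possible


def degree_centralization(nodes, edges):
    """Freeman degree centralization: 1.0 for a star, 0.0 for a regular graph."""
    node_list = sorted(set(nodes))
    n = len(node_list)
    if n < 3:
        raise ValueError("degree centralization needs at least three nodes")
    degree = {node: 0 for node in node_list}
    for tie in _unique_ties(edges):
        for node in tie:
            degree[node] += 1
    max_degree = max(degree.values())
    total_gap = sum(max_degree - d for d in degree.values())
    return total_gap / ((n - 1) * (n - 2))


# Hypothetical star network: one consultant working with three other actors.
actors = ["consultant", "surveyor", "engineer", "developer"]
ties = [
    ("consultant", "surveyor"),
    ("consultant", "engineer"),
    ("consultant", "developer"),
]
density = network_density(actors, ties)
centralization = degree_centralization(actors, ties)
```

    For this star-shaped example the density is 0.5 (three of six possible ties) and the centralization is 1.0, the extreme case in which a single actor holds every connection.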

  14. Memetic algorithms for de novo motif-finding in biomedical sequences.

    PubMed

    Bi, Chengpeng

    2012-09-01

    The objectives of this study are to design and implement a new memetic algorithm for de novo motif discovery, which is then applied to detect important signals hidden in various biomedical molecular sequences. In this paper, memetic algorithms are developed and tested in de novo motif-finding problems. Several strategies are employed in the algorithm design, not only to efficiently explore the multiple sequence local alignment space but also to effectively uncover the molecular signals. As a result, there are a number of key features in the implementation of the memetic motif-finding algorithm (MaMotif), including a chromosome replacement operator, a chromosome alteration-aware local search operator, a truncated local search strategy, and a stochastic operation of local search imposed on individual learning. To test the new algorithm, we compare MaMotif with a few other similar algorithms using simulated and experimental data including genomic DNA, primary microRNA sequences (let-7 family), and transmembrane protein sequences. The new memetic motif-finding algorithm is successfully implemented in C++, and exhaustively tested with various simulated and real biological sequences. In the simulation, MaMotif is shown to be the most time-efficient of the algorithms compared: it runs 2 times faster than the expectation maximization (EM) method and 16 times faster than the genetic algorithm-based EM hybrid. In both simulated and experimental testing, results show that the new algorithm compares favorably with, or is superior to, other algorithms. Notably, MaMotif is able to successfully discover the transcription factors' binding sites in the chromatin immunoprecipitation followed by massively parallel sequencing (ChIP-Seq) data, correctly uncover the RNA splicing signals in gene expression, and precisely find the highly conserved helix motif in the transmembrane protein sequences, as well as rightly detect the palindromic segments in the primary micro

  15. SALAD database: a motif-based database of protein annotations for plant comparative genomics

    PubMed Central

    Mihara, Motohiro; Itoh, Takeshi; Izawa, Takeshi

    2010-01-01

    Proteins often have several motifs with distinct evolutionary histories. Proteins with similar motifs have similar biochemical properties and thus related biological functions. We constructed a unique comparative genomics database termed the SALAD database (http://salad.dna.affrc.go.jp/salad/) from plant-genome-based proteome data sets. We extracted evolutionarily conserved motifs by MEME software from 209 529 protein-sequence annotation groups selected by BLASTP from the proteome data sets of 10 species: rice, sorghum, Arabidopsis thaliana, grape, a lycophyte, a moss, 3 algae, and yeast. Similarity clustering of each protein group was performed by pairwise scoring of the motif patterns of the sequences. The SALAD database provides a user-friendly graphical viewer that displays a motif pattern diagram linked to the resulting bootstrapped dendrogram for each protein group. Amino-acid-sequence-based and nucleotide-sequence-based phylogenetic trees for motif combination alignment, a logo comparison diagram for each clade in the tree, and a Pfam-domain pattern diagram are also available. We also developed a viewer named ‘SALAD on ARRAYs’ to view arbitrary microarray data sets of paralogous genes linked to the same dendrogram in a window. The SALAD database is a powerful tool for comparing protein sequences and can provide valuable hints for biological analysis. PMID:19854933

  16. SALAD database: a motif-based database of protein annotations for plant comparative genomics.

    PubMed

    Mihara, Motohiro; Itoh, Takeshi; Izawa, Takeshi

    2010-01-01

    Proteins often have several motifs with distinct evolutionary histories. Proteins with similar motifs have similar biochemical properties and thus related biological functions. We constructed a unique comparative genomics database termed the SALAD database (http://salad.dna.affrc.go.jp/salad/) from plant-genome-based proteome data sets. We extracted evolutionarily conserved motifs by MEME software from 209,529 protein-sequence annotation groups selected by BLASTP from the proteome data sets of 10 species: rice, sorghum, Arabidopsis thaliana, grape, a lycophyte, a moss, 3 algae, and yeast. Similarity clustering of each protein group was performed by pairwise scoring of the motif patterns of the sequences. The SALAD database provides a user-friendly graphical viewer that displays a motif pattern diagram linked to the resulting bootstrapped dendrogram for each protein group. Amino-acid-sequence-based and nucleotide-sequence-based phylogenetic trees for motif combination alignment, a logo comparison diagram for each clade in the tree, and a Pfam-domain pattern diagram are also available. We also developed a viewer named 'SALAD on ARRAYs' to view arbitrary microarray data sets of paralogous genes linked to the same dendrogram in a window. The SALAD database is a powerful tool for comparing protein sequences and can provide valuable hints for biological analysis.

  17. Yeast One-Hybrid Gγ Recruitment System for Identification of Protein Lipidation Motifs

    PubMed Central

    Fukuda, Nobuo; Doi, Motomichi; Honda, Shinya

    2013-01-01

    Fatty acids and isoprenoids can be covalently attached to a variety of proteins. These lipid modifications regulate protein structure, localization and function. Here, we describe a yeast one-hybrid approach based on the Gγ recruitment system that is useful for identifying sequence motifs that influence lipid modification to recruit proteins to the plasma membrane. Our approach facilitates the isolation of yeast cells expressing lipid-modified proteins via a simple and easy growth selection assay utilizing G-protein signaling that induces diploid formation. In the current study, we selected the N-terminal sequence of Gα subunits as a model case to investigate dual lipid modification, i.e., myristoylation and palmitoylation, a modification that is widely conserved from yeast to higher eukaryotes. Our results suggest that both lipid modifications are required for restoration of G-protein signaling. Although we could not differentiate between myristoylation and palmitoylation, N-terminal positions 7 and 8 play some critical role. Moreover, we tested the preference for specific amino-acid residues at positions 7 and 8 using library-based screening. This new approach will be useful to explore protein-lipid associations and to determine the corresponding sequence motifs. PMID:23922919

  18. Unique Structural Features and Sequence Motifs of Proline Utilization A (PutA)

    PubMed Central

    Singh, Ranjan K.; Tanner, John J.

    2013-01-01

    Proline utilization A proteins (PutAs) are bifunctional enzymes that catalyze the oxidation of proline to glutamate using spatially separated proline dehydrogenase and pyrroline-5-carboxylate dehydrogenase active sites. Here we use the crystal structure of the minimalist PutA from Bradyrhizobium japonicum (BjPutA) along with sequence analysis to identify unique structural features of PutAs. This analysis shows that PutAs have secondary structural elements and domains not found in the related monofunctional enzymes. Some of these extra features are predicted to be important for substrate channeling in BjPutA. Multiple sequence alignment analysis shows that some PutAs have a 17-residue conserved motif in the C-terminal 20–30 residues of the polypeptide chain. The BjPutA structure shows that this motif helps seal the internal substrate-channeling cavity from the bulk medium. Finally, it is shown that some PutAs have a 100–200 residue domain of unknown function in the C-terminus that is not found in minimalist PutAs. Remote homology detection suggests that this domain is homologous to the oligomerization beta-hairpin and Rossmann fold domain of BjPutA. PMID:22201760

  19. The Proliferating Cell Nuclear Antigen (PCNA)-interacting Protein (PIP) Motif of DNA Polymerase η Mediates Its Interaction with the C-terminal Domain of Rev1

    PubMed Central

    Boehm, Elizabeth M.; Powers, Kyle T.; Kondratick, Christine M.; Spies, Maria; Houtman, Jon C. D.; Washington, M. Todd

    2016-01-01

    Y-family DNA polymerases, such as polymerase η, polymerase ι, and polymerase κ, catalyze the bypass of DNA damage during translesion synthesis. These enzymes are recruited to sites of DNA damage by interacting with the essential replication accessory protein proliferating cell nuclear antigen (PCNA) and the scaffold protein Rev1. In most Y-family polymerases, these interactions are mediated by one or more conserved PCNA-interacting protein (PIP) motifs that bind in a hydrophobic pocket on the front side of PCNA as well as by conserved Rev1-interacting region (RIR) motifs that bind in a hydrophobic pocket on the C-terminal domain of Rev1. Yeast polymerase η, a prototypical translesion synthesis polymerase, binds both PCNA and Rev1. It possesses a single PIP motif but not an RIR motif. Here we show that the PIP motif of yeast polymerase η mediates its interactions both with PCNA and with Rev1. Moreover, the PIP motif of polymerase η binds in the hydrophobic pocket on the Rev1 C-terminal domain. We also show that the RIR motif of human polymerase κ and the PIP motif of yeast Msh6 bind both PCNA and Rev1. Overall, these findings demonstrate that PIP motifs and RIR motifs have overlapping specificities and can interact with both PCNA and Rev1 in structurally similar ways. These findings also suggest that PIP motifs are a more versatile protein interaction motif than previously believed. PMID:26903512

  20. Sequence Bundles: a novel method for visualising, discovering and exploring sequence motifs

    PubMed Central

    2014-01-01

    Background: We introduce Sequence Bundles, a novel data visualisation method for representing multiple sequence alignments (MSAs). We identify and address key limitations of the existing bioinformatics data visualisation methods (i.e. the Sequence Logo) by enabling Sequence Bundles to give salient visual expression to sequence motifs and other data features, which would otherwise remain hidden. Methods: For the development of Sequence Bundles we employed research-led information design methodologies. Sequences are encoded as uninterrupted, semi-opaque lines plotted on a 2-dimensional reconfigurable grid. Each line represents a single sequence. The thickness and opacity of the stack at each residue in each position indicate the level of conservation and the lines' curved paths expose patterns in correlation and functionality. Several MSAs can be visualised in a composite image. The Sequence Bundles method is designed to favour a tangible, continuous and intuitive display of information. Results: We have developed a software demonstration application for generating a Sequence Bundles visualisation of MSAs provided for the BioVis 2013 redesign contest. A subsequent exploration of the visualised line patterns allowed for the discovery of a number of interesting features in the dataset. Reported features include the extreme conservation of sequences displaying a specific residue and bifurcations of the consensus sequence. Conclusions: Sequence Bundles is a novel method for visualisation of MSAs and the discovery of sequence motifs. It can aid in generating new insight and hypothesis making. Sequence Bundles is well disposed for future implementation as an interactive visual analytics software, which can complement existing visualisation tools. PMID:25237395
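
    The conservation signal that this kind of visualisation encodes can be illustrated with a per-column residue frequency table. The sketch below is not the Sequence Bundles software; the function name `column_frequencies` and the toy alignment are assumptions made for illustration only.

```python
from collections import Counter


def column_frequencies(alignment):
    """For each alignment column, map residue -> fraction of sequences carrying it."""
    # All rows of an alignment must have the same length (gaps included).
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n = len(alignment)
    return [
        {residue: count / n for residue, count in Counter(column).items()}
        for column in zip(*alignment)
    ]


# Hypothetical four-sequence alignment, invented for this example.
toy_msa = ["ACDE", "ACDF", "AGDE", "ACDE"]
freqs = column_frequencies(toy_msa)
for position, column in enumerate(freqs):
    print(position, column)
```

    A fully conserved column yields a single residue with frequency 1.0, which would correspond to the thickest, most opaque stack; a split column yields several smaller fractions.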

  1. SARM1-specific motifs in the TIR domain enable NAD+ loss and regulate injury-induced SARM1 activation.

    PubMed

    Summers, Daniel W; Gibson, Daniel A; DiAntonio, Aaron; Milbrandt, Jeffrey

    2016-10-11

    Axon injury in response to trauma or disease stimulates a self-destruction program that promotes the localized clearance of damaged axon segments. Sterile alpha and Toll/interleukin receptor (TIR) motif-containing protein 1 (SARM1) is an evolutionarily conserved executioner of this degeneration cascade, also known as Wallerian degeneration; however, the mechanism of SARM1-dependent neuronal destruction is still obscure. SARM1 possesses a TIR domain that is necessary for SARM1 activity. In other proteins, dimerized TIR domains serve as scaffolds for innate immune signaling. In contrast, dimerization of the SARM1 TIR domain promotes consumption of the essential metabolite NAD+ and induces neuronal destruction. This activity is unique to the SARM1 TIR domain, yet the structural elements that enable this activity are unknown. In this study, we identify fundamental properties of the SARM1 TIR domain that promote NAD+ loss and axon degeneration. Dimerization of the TIR domain from the Caenorhabditis elegans SARM1 ortholog TIR-1 leads to NAD+ loss and neuronal death, indicating these activities are an evolutionarily conserved feature of SARM1 function. Detailed analysis of sequence homology identifies canonical TIR motifs as well as a SARM1-specific (SS) loop that are required for NAD+ loss and axon degeneration. Furthermore, we identify a residue in the SARM1 BB loop that is dispensable for TIR activity yet required for injury-induced activation of full-length SARM1, suggesting that SARM1 function requires multidomain interactions. Indeed, we identify a physical interaction between the autoinhibitory N terminus and the TIR domain of SARM1, revealing a previously unrecognized direct connection between these domains that we propose mediates autoinhibition and activation upon injury.

  2. DNA motifs determining the accuracy of repeat duplication during CRISPR adaptation in Haloarcula hispanica

    PubMed Central

    Wang, Rui; Li, Ming; Gong, Luyao; Hu, Songnian; Xiang, Hua

    2016-01-01

    Clustered Regularly Interspaced Short Palindromic Repeats (CRISPRs) acquire new spacers to generate adaptive immunity in prokaryotes. During spacer integration, the leader-preceded repeat is always accurately duplicated, leading to speculations of a repeat-length ruler. Here in Haloarcula hispanica, we demonstrate that the accurate duplication of its 30-bp repeat requires two conserved mid-repeat motifs, AACCC and GTGGG. The AACCC motif was essential and needed to be ∼10 bp downstream from the leader-repeat junction site, where duplication consistently started. Interestingly, repeat duplication terminated sequence-independently and usually with a specific distance from the GTGGG motif, which seemingly served as an anchor site for a molecular ruler. Accordingly, altering the spacing between the two motifs led to an aberrant duplication size (29, 31, 32 or 33 bp). We propose the adaptation complex may recognize these mid-repeat elements to enable measuring the repeat DNA for spacer integration. PMID:27085805
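
    The spacing argument in this abstract can be made concrete by locating the two motifs in a repeat and measuring the gap between them. The sketch below is only an illustration: the helper names are assumptions, and `toy_repeat` is an invented 30-nt string, not the actual Haloarcula hispanica repeat.

```python
def motif_positions(repeat, motifs=("AACCC", "GTGGG")):
    """Return the 0-based start index of each motif in the repeat (-1 if absent)."""
    repeat = repeat.upper()
    return {motif: repeat.find(motif) for motif in motifs}


def motif_spacing(repeat, first="AACCC", second="GTGGG"):
    """Number of bases between the end of `first` and the start of `second`.

    Returns None when either motif is missing.
    """
    pos = motif_positions(repeat, (first, second))
    if pos[first] < 0 or pos[second] < 0:
        return None
    return pos[second] - (pos[first] + len(first))


# Hypothetical 30-nt sequence built for illustration; NOT the real repeat.
toy_repeat = "GCTTCAACCCCACGAGGGTGGGTCTGAAAC"

print(motif_positions(toy_repeat))   # start index of each motif
print(motif_spacing(toy_repeat))     # bases separating the two motifs

# Inserting one base between the motifs changes the spacing by one, which is
# the kind of alteration the abstract links to aberrant duplication sizes.
altered = toy_repeat[:12] + "T" + toy_repeat[12:]
print(motif_spacing(altered))
```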

  3. DMINDA: an integrated web server for DNA motif identification and analyses.

    PubMed

    Ma, Qin; Zhang, Hanyuan; Mao, Xizeng; Zhou, Chuan; Liu, Bingqiang; Chen, Xin; Xu, Ying

    2014-07-01

    DMINDA (DNA motif identification and analyses) is an integrated web server for DNA motif identification and analyses, which is accessible at http://csbl.bmb.uga.edu/DMINDA/. This web site is freely available to all users and there is no login requirement. This server provides a suite of cis-regulatory motif analysis functions on DNA sequences, which are important to elucidation of the mechanisms of transcriptional regulation: (i) de novo motif finding for a given set of promoter sequences along with statistical scores for the predicted motifs derived based on information extracted from a control set, (ii) scanning motif instances of a query motif in provided genomic sequences, (iii) motif comparison and clustering of identified motifs, and (iv) co-occurrence analyses of query motifs in given promoter sequences. The server is powered by a backend computer cluster with over 150 computing nodes, and is particularly useful for motif prediction and analyses in prokaryotic genomes. We believe that DMINDA, as a new and comprehensive web server for cis-regulatory motif finding and analyses, will benefit the genomic research community in general and prokaryotic genome researchers in particular. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  4. Physical-chemical property based sequence motifs and methods regarding same

    DOEpatents

    Braun, Werner [Friendswood, TX]; Mathura, Venkatarajan S [Sarasota, FL]; Schein, Catherine H [Friendswood, TX]

    2008-09-09

    A data analysis system, program, and/or method, e.g., a data mining/data exploration method, using physical-chemical property motifs. For example, a sequence database may be searched for identifying segments thereof having physical-chemical properties similar to the physical-chemical property motifs.

  5. Unusual Intron Conservation near Tissue-Regulated Exons Found by Splicing Microarrays

    PubMed Central

    Sugnet, Charles W; Srinivasan, Karpagam; Clark, Tyson A; O'Brien, Georgeann; Cline, Melissa S; Wang, Hui; Williams, Alan; Kulp, David; Blume, John E; Haussler, David; Ares, Manuel

    2006-01-01

    Alternative splicing contributes to both gene regulation and protein diversity. To discover broad relationships between regulation of alternative splicing and sequence conservation, we applied a systems approach, using oligonucleotide microarrays designed to capture splicing information across the mouse genome. In a set of 22 adult tissues, we observe differential expression of RNA containing at least two alternative splice junctions for about 40% of the 6,216 alternative events we could detect. Statistical comparisons identify 171 cassette exons whose inclusion or skipping is different in brain relative to other tissues and another 28 exons whose splicing is different in muscle. A subset of these exons is associated with unusual blocks of intron sequence whose conservation in vertebrates rivals that of protein-coding exons. By focusing on sets of exons with similar regulatory patterns, we have identified new sequence motifs implicated in brain and muscle splicing regulation. Of note is a motif that is strikingly similar to the branchpoint consensus but is located downstream of the 5′ splice site of exons included in muscle. Analysis of three paralogous membrane-associated guanylate kinase genes reveals that each contains a paralogous tissue-regulated exon with a similar tissue inclusion pattern. While the intron sequences flanking these exons remain highly conserved among mammalian orthologs, the paralogous flanking intron sequences have diverged considerably, suggesting unusually complex evolution of the regulation of alternative splicing in multigene families. PMID:16424921

  6. Recurring sequence-structure motifs in (βα)8-barrel proteins and experimental optimization of a chimeric protein designed based on such motifs.

    PubMed

    Wang, Jichao; Zhang, Tongchuan; Liu, Ruicun; Song, Meilin; Wang, Juncheng; Hong, Jiong; Chen, Quan; Liu, Haiyan

    2017-02-01

    An interesting way of generating novel artificial proteins is to combine sequence motifs from natural proteins, mimicking the evolutionary path suggested by natural proteins comprising recurring motifs. We analyzed the βα and αβ modules of TIM barrel proteins by structure alignment-based sequence clustering. A number of preferred motifs were identified. A chimeric TIM was designed by using recurring elements as mutually compatible interfaces. The foldability of the designed TIM protein was then significantly improved by six rounds of directed evolution. The melting temperature has been improved by more than 20°C. A variety of characteristics suggested that the resulting protein is well-folded. Our analysis provided a library of peptide motifs that is potentially useful for different protein engineering studies. The protein engineering strategy of using recurring motifs as interfaces to connect partial natural proteins may be applied to other protein folds. Copyright © 2016 Elsevier B.V. All rights reserved.

  7. SiteBinder: an improved approach for comparing multiple protein structural motifs.

    PubMed

    Sehnal, David; Vařeková, Radka Svobodová; Huber, Heinrich J; Geidl, Stanislav; Ionescu, Crina-Maria; Wimmerová, Michaela; Koča, Jaroslav

    2012-02-27

    There is a paramount need to develop new techniques and tools that will extract as much information as possible from the ever growing repository of protein 3D structures. We report here on the development of a software tool for the multiple superimposition of large sets of protein structural motifs. Our superimposition methodology performs a systematic search for the atom pairing that provides the best fit. During this search, the RMSD values for all chemically relevant pairings are calculated by quaternion algebra. The number of evaluated pairings is markedly decreased by using PDB annotations for atoms. This approach guarantees that the best fit will be found and can be applied even when sequence similarity is low or does not exist at all. We have implemented this methodology in the Web application SiteBinder, which is able to process up to thousands of protein structural motifs in a very short time, and which provides an intuitive and user-friendly interface. Our benchmarking analysis has shown the robustness, efficiency, and versatility of our methodology and its implementation by the successful superimposition of 1000 experimentally determined structures for each of 32 eukaryotic linear motifs. We also demonstrate the applicability of SiteBinder using three case studies. We first compared the structures of 61 PA-IIL sugar binding sites containing nine different sugars, and we found that the sugar binding sites of PA-IIL and its mutants have a conserved structure despite their binding different sugars. We then superimposed over 300 zinc finger central motifs and revealed that the molecular structure in the vicinity of the Zn atom is highly conserved. Finally, we superimposed 12 BH3 domains from pro-apoptotic proteins. Our findings come to support the hypothesis that there is a structural basis for the functional segregation of BH3-only proteins into activators and enablers.

  8. Identifying taxonomic and functional surrogates for spring biodiversity conservation.

    PubMed

    Jyväsjärvi, Jussi; Virtanen, Risto; Ilmonen, Jari; Paasivirta, Lauri; Muotka, Timo

    2018-02-27

    Surrogate approaches are widely used to estimate overall taxonomic diversity for conservation planning. Surrogate taxa are frequently selected based on rarity or charisma, whereas selection through statistical modeling has been applied rarely. We used boosted-regression-tree models (BRT) fitted to biological data from 165 springs to identify bryophyte and invertebrate surrogates for taxonomic and functional diversity of boreal springs. We focused on these 2 groups because they are well known and abundant in most boreal springs. The best indicators of taxonomic versus functional diversity differed. The bryophyte Bryum weigelii and the chironomid larva Paratrichocladius skirwithensis best indicated taxonomic diversity, whereas the isopod Asellus aquaticus and the chironomid Macropelopia spp. were the best surrogates of functional diversity. In a scoring algorithm for priority-site selection, taxonomic surrogates performed only slightly better than random selection for all spring-dwelling taxa, but they were very effective in representing spring specialists, providing a distinct improvement over random solutions. However, the surrogates for taxonomic diversity represented functional diversity poorly and vice versa. When combined with cross-taxon complementarity analyses, surrogate selection based on statistical modeling provides a promising approach for identifying groundwater-dependent ecosystems of special conservation value, a key requirement of the EU Water Framework Directive. © 2018 Society for Conservation Biology.

  9. A sequence-specific transcription activator motif and powerful synthetic variants that bind Mediator using a fuzzy protein interface.

    PubMed

    Warfield, Linda; Tuttle, Lisa M; Pacheco, Derek; Klevit, Rachel E; Hahn, Steven

    2014-08-26

    Although many transcription activators contact the same set of coactivator complexes, the mechanism and specificity of these interactions have been unclear. For example, do intrinsically disordered transcription activation domains (ADs) use sequence-specific motifs, or do ADs of seemingly different sequence have common properties that encode activation function? We find that the central activation domain (cAD) of the yeast activator Gcn4 functions through a short, conserved sequence-specific motif. Optimizing the residues surrounding this short motif by inserting additional hydrophobic residues creates very powerful ADs that bind the Mediator subunit Gal11/Med15 with high affinity via a "fuzzy" protein interface. In contrast to Gcn4, the activity of these synthetic ADs is not strongly dependent on any one residue of the AD, and this redundancy is similar to that of some natural ADs in which few if any sequence-specific residues have been identified. The additional hydrophobic residues in the synthetic ADs likely allow multiple faces of the AD helix to interact with the Gal11 activator-binding domain, effectively forming a fuzzier interface than that of the wild-type cAD.

  10. EBNA-2 of herpesvirus papio diverges significantly from the type A and type B EBNA-2 proteins of Epstein-Barr virus but retains an efficient transactivation domain with a conserved hydrophobic motif.

    PubMed Central

    Ling, P D; Ryon, J J; Hayward, S D

    1993-01-01

    EBNA-2 contributes to the establishment of Epstein-Barr virus (EBV) latency in B cells and to the resultant alterations in B-cell growth pattern by up-regulating expression from specific viral and cellular promoters. We have taken a comparative approach toward characterizing functional domains within EBNA-2. To this end, we have cloned and sequenced the EBNA-2 gene from the closely related baboon virus herpesvirus papio (HVP). All human EBV isolates have either a type A or type B EBNA-2 gene. However, the HVP EBNA-2 gene falls into neither the type A category nor the type B category, suggesting that the separation into these two subtypes may have been a recent evolutionary event. Comparison of the predicted amino acid sequences indicates 37% amino acid identity with EBV type A EBNA-2 and 35% amino acid identity with type B EBNA-2. To define the domains of EBNA-2 required for transcriptional activation, the DNA binding domain of the GAL4 protein was fused to overlapping segments of EBV EBNA-2. This approach identified a 40-amino-acid (40-aa) EBNA-2 activation domain located between aa 437 and 477. Transactivation ability was completely lost when the amino-terminal boundary of this domain was moved to aa 441, indicating that the motif at aa 437 to 440, Pro-Ile-Leu-Phe, contains residues critical for function. The aa 437 boundary identified in these experiments coincides precisely with a block of conserved sequences in HVP EBNA-2, and the comparable carboxy-terminal region of HVP EBNA-2 also functioned as a strong transcriptional activation domain when fused to the Gal4(1-147) protein. The EBV and HVP EBNA-2 activation domains share a mixed proline-rich, negatively charged character with a striking conservation of positionally equivalent hydrophobic residues. The importance of the individual amino acids making up the Pro-Ile-Leu-Phe motif was examined by mutagenesis. 
    Any alteration of these residues was found to reduce transactivation efficiency, with changes at the

  11. EBNA-2 of herpesvirus papio diverges significantly from the type A and type B EBNA-2 proteins of Epstein-Barr virus but retains an efficient transactivation domain with a conserved hydrophobic motif.

    PubMed

    Ling, P D; Ryon, J J; Hayward, S D

    1993-06-01

    EBNA-2 contributes to the establishment of Epstein-Barr virus (EBV) latency in B cells and to the resultant alterations in B-cell growth pattern by up-regulating expression from specific viral and cellular promoters. We have taken a comparative approach toward characterizing functional domains within EBNA-2. To this end, we have cloned and sequenced the EBNA-2 gene from the closely related baboon virus herpesvirus papio (HVP). All human EBV isolates have either a type A or type B EBNA-2 gene. However, the HVP EBNA-2 gene falls into neither the type A category nor the type B category, suggesting that the separation into these two subtypes may have been a recent evolutionary event. Comparison of the predicted amino acid sequences indicates 37% amino acid identity with EBV type A EBNA-2 and 35% amino acid identity with type B EBNA-2. To define the domains of EBNA-2 required for transcriptional activation, the DNA binding domain of the GAL4 protein was fused to overlapping segments of EBV EBNA-2. This approach identified a 40-amino-acid (40-aa) EBNA-2 activation domain located between aa 437 and 477. Transactivation ability was completely lost when the amino-terminal boundary of this domain was moved to aa 441, indicating that the motif at aa 437 to 440, Pro-Ile-Leu-Phe, contains residues critical for function. The aa 437 boundary identified in these experiments coincides precisely with a block of conserved sequences in HVP EBNA-2, and the comparable carboxy-terminal region of HVP EBNA-2 also functioned as a strong transcriptional activation domain when fused to the Gal4(1-147) protein. The EBV and HVP EBNA-2 activation domains share a mixed proline-rich, negatively charged character with a striking conservation of positionally equivalent hydrophobic residues. The importance of the individual amino acids making up the Pro-Ile-Leu-Phe motif was examined by mutagenesis. 
    Any alteration of these residues was found to reduce transactivation efficiency, with changes at the

  12. Conservation of the PTEN catalytic motif in the bacterial undecaprenyl pyrophosphate phosphatase, BacA/UppP.

    PubMed

    Bickford, Justin S; Nick, Harry S

    2013-12-01

    Isoprenoid lipid carriers are essential in protein glycosylation and bacterial cell envelope biosynthesis. The enzymes involved in their metabolism (synthases, kinases and phosphatases) are therefore critical to cell viability. In this review, we focus on two broad groups of isoprenoid pyrophosphate phosphatases. One group, containing phosphatidic acid phosphatase motifs, includes the eukaryotic dolichyl pyrophosphate phosphatases and proposed recycling bacterial undecaprenol pyrophosphate phosphatases, PgpB, YbjB and YeiU/LpxT. The second group comprises the bacterial undecaprenol pyrophosphate phosphatase, BacA/UppP, responsible for initial formation of undecaprenyl phosphate, which we predict contains a tyrosine phosphate phosphatase motif resembling that of the tumour suppressor, phosphatase and tensin homologue (PTEN). Based on protein sequence alignments across species and 2D structure predictions, we propose catalytic and lipid recognition motifs unique to BacA/UppP enzymes. The verification of our proposed active-site residues would provide new strategies for the development of substrate-specific inhibitors which mimic both the lipid and pyrophosphate moieties, leading to the development of novel antimicrobial agents.

  13. Conformational Preference of ‘CαNN’ Short Peptide Motif towards Recognition of Anions

    PubMed Central

    Banerjee, Raja

    2013-01-01

    Among several ‘anion binding motifs’, the recently described ‘CαNN’ motif occurring in the loop regions preceding a helix, is conserved through evolution both in sequence and its conformation. To establish the significance of the conserved sequence and their intrinsic affinity for anions, a series of peptides containing the naturally occurring ‘CαNN’ motif at the N-terminus of a designed helix, have been modeled and studied in a context free system using computational techniques. Appearance of a single interacting site with negative binding free-energy for both the sulfate and phosphate ions, as evidenced in docking experiments, establishes that the ‘CαNN’ segment has an intrinsic affinity for anions. Molecular Dynamics (MD) simulation studies reveal that interaction with anion triggers a conformational switch from non-helical to helical state at the ‘CαNN’ segment, which extends the length of the anchoring-helix by one turn at the N-terminus. Computational experiments substantiate the significance of sequence/structural context and justify the conserved nature of the ‘CαNN’ sequence for anion recognition through “local” interaction. PMID:23516403

  14. PSSMSearch: a server for modeling, visualization, proteome-wide discovery and annotation of protein motif specificity determinants.

    PubMed

    Krystkowiak, Izabella; Manguy, Jean; Davey, Norman E

    2018-06-05

    There is a pressing need for in silico tools that can aid in the identification of the complete repertoire of protein binding (SLiMs, MoRFs, miniMotifs) and modification (moiety attachment/removal, isomerization, cleavage) motifs. We have created PSSMSearch, an interactive web-based tool for rapid statistical modeling, visualization, discovery and annotation of protein motif specificity determinants to discover novel motifs in a proteome-wide manner. PSSMSearch analyses proteomes for regions with significant similarity to a motif specificity determinant model built from a set of aligned motif-containing peptides. Multiple scoring methods are available to build a position-specific scoring matrix (PSSM) describing the motif specificity determinant model. This model can then be modified by a user to add prior knowledge of specificity determinants through an interactive PSSM heatmap. PSSMSearch includes a statistical framework to calculate the significance of specificity determinant model matches against a proteome of interest. PSSMSearch also includes the SLiMSearch framework's annotation, motif functional analysis and filtering tools to highlight relevant discriminatory information. Additional tools to annotate statistically significant shared keywords and GO terms, or experimental evidence of interaction with a motif-recognizing protein have been added. Finally, PSSM-based conservation metrics have been created for taxonomic range analyses. The PSSMSearch web server is available at http://slim.ucd.ie/pssmsearch/.
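
    The general idea of building a position-specific scoring matrix from aligned peptides and scanning a sequence with it can be sketched as follows. This is not the PSSMSearch implementation or any of its scoring methods; it assumes a plain log-odds score against a uniform background with a pseudocount, and the peptides and query sequence are invented.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(peptides, pseudocount=1.0):
    """Log-odds PSSM (one dict per column) against a uniform background.

    Assumes equal-length peptides made of the 20 standard residues.
    """
    length = len(peptides[0])
    if any(len(p) != length for p in peptides):
        raise ValueError("peptides must be aligned to equal length")
    background = 1.0 / len(AMINO_ACIDS)
    pssm = []
    for col in range(length):
        counts = {aa: pseudocount for aa in AMINO_ACIDS}
        for peptide in peptides:
            counts[peptide[col]] += 1
        total = sum(counts.values())
        pssm.append(
            {aa: math.log2(counts[aa] / total / background) for aa in AMINO_ACIDS}
        )
    return pssm


def scan(sequence, pssm):
    """Score every window of the sequence; return (start, score), best first."""
    width = len(pssm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        # Non-standard residues contribute a neutral score of 0.0.
        score = sum(col.get(aa, 0.0) for col, aa in zip(pssm, window))
        hits.append((start, score))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical aligned peptides and query sequence, for illustration only.
toy_peptides = ["PILF", "PILF", "PVLF", "PILY"]
hits = scan("MKTAPILFGGS", build_pssm(toy_peptides))
print(hits[0])  # best-scoring window
```

    A real tool would also estimate the statistical significance of each match against a proteome, which this sketch does not attempt.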

  15. Identification and characterization of a translation arrest motif in VemP by systematic mutational analysis.

    PubMed

    Mori, Hiroyuki; Sakashita, Sohei; Ito, Jun; Ishii, Eiji; Akiyama, Yoshinori

    2018-02-23

    VemP (Vibrio protein export monitoring polypeptide) is a secretory protein comprising 159 amino acid residues, which functions as a secretion monitor in Vibrio and regulates expression of the downstream V.secDF2 genes. When VemP export is compromised, its translation specifically undergoes elongation arrest at the position where the Gln156 codon of vemP encounters the P-site in the translating ribosome, resulting in up-regulation of V.SecDF2 production. Although our previous study suggests that many residues in a highly conserved C-terminal 20-residue region of VemP contribute to its elongation arrest, the exact role of each residue remains unclear. Here, we constructed a reporter system to easily and exactly monitor the in vivo arrest efficiency of VemP. Using this reporter system, we systematically performed a mutational analysis of the 20 residues (His138-Phe157) to identify and characterize the arrest motif. Our results show that 15 residues in the conserved region participate in elongation arrest and that multiple interactions between important residues in VemP and in the interior of the exit tunnel contribute to the elongation arrest of VemP. The arrangement of these important residues induced by specific secondary structures in the ribosomal tunnel is critical for the arrest. Pro scanning analysis of the preceding segment (Met120-Phe137) revealed a minor role of this region in the arrest. Considering these results, we conclude that the arrest motif in VemP is mainly composed of the highly conserved multiple residues in the C-terminal region. © 2018 by The American Society for Biochemistry and Molecular Biology, Inc.

  16. G-quadruplex prediction in E. coli genome reveals a conserved putative G-quadruplex-Hairpin-Duplex switch.

    PubMed

    Kaplan, Oktay I; Berber, Burak; Hekim, Nezih; Doluca, Osman

    2016-11-02

    Many studies show that short non-coding sequences are widely conserved among regulatory elements. More and more conserved sequences are being discovered since the development of next generation sequencing technology. A common approach to identify conserved sequences with regulatory roles relies on topological changes such as hairpin formation at the DNA or RNA level. G-quadruplexes, non-canonical nucleic acid topologies with little established biological roles, are increasingly considered for conserved regulatory element discovery. Since the tertiary structure of G-quadruplexes is strongly dependent on the loop sequence which is disregarded by the generally accepted algorithm, we hypothesized that G-quadruplexes with similar topology and, indirectly, similar interaction patterns, can be determined using phylogenetic clustering based on differences in the loop sequences. Phylogenetic analysis of 52 G-quadruplex forming sequences in the Escherichia coli genome revealed two conserved G-quadruplex motifs with a potential regulatory role. Further analysis revealed that both motifs tend to form hairpins and G quadruplexes, as supported by circular dichroism studies. The phylogenetic analysis as described in this work can greatly improve the discovery of functional G-quadruplex structures and may explain unknown regulatory patterns. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
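
    The abstract does not spell out the "generally accepted algorithm". A widely used pattern for putative quadruplex-forming sequences is four runs of three or more guanines separated by loops of 1 to 7 nucleotides, and the sketch below assumes that pattern. It extracts the three loop sequences, which is the information the authors propose to cluster; the clustering itself is not reproduced, and the function name and toy sequence are assumptions.

```python
import re

# Assumed pattern: G3+ N1-7 G3+ N1-7 G3+ N1-7 G3+ (loops matched lazily).
G4_PATTERN = re.compile(
    r"(G{3,})([ACGT]{1,7}?)(G{3,})([ACGT]{1,7}?)(G{3,})([ACGT]{1,7}?)(G{3,})"
)


def find_g4_loops(sequence):
    """Return (start, matched_text, (loop1, loop2, loop3)) per non-overlapping match.

    Only the given strand is scanned; C-rich matches on the opposite strand
    would need the reverse complement to be scanned as well.
    """
    hits = []
    for match in G4_PATTERN.finditer(sequence.upper()):
        loops = (match.group(2), match.group(4), match.group(6))
        hits.append((match.start(), match.group(0), loops))
    return hits


# Toy sequence invented for this example.
toy = "ATGGGTTAGGGTTAGGGTTAGGGCA"
print(find_g4_loops(toy))
```

    Two candidates with identical G-runs but different loops are indistinguishable to the pattern itself, which is why the loop tuples are returned separately.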

  17. A comparative hidden Markov model analysis pipeline identifies proteins characteristic of cereal-infecting fungi

    PubMed Central

    2013-01-01

    Background: Fungal pathogens cause devastating losses in economically important cereal crops by utilising pathogen proteins to infect host plants. Secreted pathogen proteins are referred to as effectors and have thus far been identified by selecting small, cysteine-rich peptides from the secretome despite increasing evidence that not all effectors share these attributes. Results: We take advantage of the availability of sequenced fungal genomes and present an unbiased method for finding putative pathogen proteins and secreted effectors in a query genome via comparative hidden Markov model analyses followed by unsupervised protein clustering. Our method returns experimentally validated fungal effectors in Stagonospora nodorum and Fusarium oxysporum as well as the N-terminal Y/F/WxC-motif from the barley powdery mildew pathogen. Application to the cereal pathogen Fusarium graminearum reveals a secreted phosphorylcholine phosphatase that is characteristic of hemibiotrophic and necrotrophic cereal pathogens and shares an ancient selection process with bacterial plant pathogens. Three F. graminearum protein clusters are found with an enriched secretion signal. One of these putative effector clusters contains proteins that share a [SG]-P-C-[KR]-P sequence motif in the N-terminal and show features not commonly associated with fungal effectors. This motif is conserved in secreted pathogenic Fusarium proteins and a prime candidate for functional testing. Conclusions: Our pipeline has successfully uncovered conservation patterns, putative effectors and motifs of fungal pathogens that would have been overlooked by existing approaches that identify effectors as small, secreted, cysteine-rich peptides. It can be applied to any pathogenic proteome data, such as microbial pathogen data of plants and other organisms. PMID:24252298
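
    A motif written in this bracketed, hyphen-separated notation, such as [SG]-P-C-[KR]-P, can be searched for by translating it into a regular expression. The converter below is a minimal sketch that handles only a small subset of the notation (single residues, bracketed alternatives, x wildcards and fixed or ranged repeat counts); the function name and the toy protein are assumptions, and the authors' pipeline itself relies on hidden Markov models rather than this approach.

```python
import re

# One pattern element: a bracketed set, a residue letter or an x wildcard,
# optionally followed by a repeat count such as (2) or (2,4).
ELEMENT = re.compile(r"(\[[A-Z]+\]|[A-Zx])(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern):
    """Translate a small subset of PROSITE-style syntax into a Python regex."""
    parts = []
    for element in pattern.split("-"):
        match = ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported element: {element!r}")
        residue, low, high = match.groups()
        # x (or X) stands for any residue.
        token = "." if residue in ("x", "X") else residue
        if low and high:
            token += "{%s,%s}" % (low, high)
        elif low:
            token += "{%s}" % low
        parts.append(token)
    return "".join(parts)


regex = prosite_to_regex("[SG]-P-C-[KR]-P")
print(regex)

# Hypothetical protein fragment, invented for this example.
toy_protein = "MKLSPCKPTT"
hit = re.search(regex, toy_protein)
print(hit.start(), hit.group(0))
```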

  18. Flexible risk metrics for identifying and monitoring conservation-priority species

    USGS Publications Warehouse

    Stanton, Jessica C.; Semmens, Brice X.; McKann, Patrick C.; Will, Tom; Thogmartin, Wayne E.

    2016-01-01

    Region-specific conservation programs should have objective, reliable metrics for species prioritization and progress evaluation that are customizable to the goals of a program, easy to comprehend and communicate, and standardized across time. Regional programs may have vastly different goals, spatial coverage, or management agendas, and one-size-fits-all schemes may not always be the best approach. We propose a quantitative and objective framework for generating metrics for prioritizing species that is straightforward to implement and update, customizable to different spatial resolutions, and based on readily available time-series data. This framework is also well-suited to handling missing data and observer error. We demonstrate this approach using North American Breeding Bird Survey (NABBS) data to identify conservation priority species from a list of over 300 landbirds across 33 bird conservation regions (BCRs). To highlight the flexibility of the framework for different management goals and timeframes we calculate two different metrics. The first identifies species that may be inadequately monitored by NABBS protocols in the near future (TMT, time to monitoring threshold), and the other identifies species likely to decline significantly in the near future based on recent trends (TPD, time to percent decline). Within the individual BCRs we found up to 45% (mean 28%) of the species analyzed had overall declining population trajectories, which could result in up to 37 species declining below a minimum NABBS monitoring threshold in at least one currently occupied BCR within the next 50 years. Additionally, up to 26% (mean 8%) of the species analyzed within the individual BCRs may decline by 30% within the next decade. Conservation workers interested in conserving avian diversity and abundance within these BCRs can use these metrics to plan alternative monitoring schemes or highlight the urgency of those populations experiencing the fastest declines. However, this

  19. Promoter Recognition by Extracytoplasmic Function σ Factors: Analyzing DNA and Protein Interaction Motifs

    PubMed Central

    Guzina, Jelena

    2016-01-01

    ABSTRACT Extracytoplasmic function (ECF) σ factors are the largest and the most diverse group of alternative σ factors, but their mechanisms of transcription are poorly studied. This subfamily is considered to exhibit a rigid promoter structure and an absence of mixing and matching; both −35 and −10 elements are considered necessary for initiating transcription. This paradigm, however, is based on very limited data, which bias the analysis of diverse ECF σ subgroups. Here we investigate DNA and protein recognition motifs involved in ECF σ factor transcription by a computational analysis of canonical ECF subfamily members, much less studied ECF σ subgroups, and the group outliers, obtained from recently sequenced bacteriophages. The analysis identifies an extended −10 element in promoters for phage ECF σ factors; a comparison with bacterial σ factors points to a putative 6-amino-acid motif just C-terminal of domain σ2, which is responsible for the interaction with the identified extension of the −10 element. Interestingly, a similar protein motif is found C-terminal of domain σ2 in canonical ECF σ factors, at a position where it is expected to interact with a conserved motif further upstream of the −10 element. Moreover, the phiEco32 ECF σ factor lacks a recognizable −35 element and σ4 domain, which we identify in a homologous phage, 7-11, indicating that the extended −10 element can compensate for the lack of −35 element interactions. Overall, the results reveal greater flexibility in promoter recognition by ECF σ factors than previously recognized and raise the possibility that mixing and matching also apply to this group, a notion that remains to be biochemically tested. IMPORTANCE ECF σ factors are the most numerous group of alternative σ factors but have been little studied. Their promoter recognition mechanisms are obscured by the large diversity within the ECF σ factor group and the limited similarity with the well

  20. A survey of motif finding Web tools for detecting binding site motifs in ChIP-Seq data

    PubMed Central

    2014-01-01

    Abstract ChIP-Seq (chromatin immunoprecipitation sequencing) facilitates motif finding because ChIP-Seq experiments narrow the search to binding site locations. Recent motif finding tools facilitate motif detection by providing user-friendly Web interfaces. In this work, we reviewed nine motif finding Web tools that are capable of detecting binding site motifs in ChIP-Seq data. We showed that each motif finding Web tool has its own advantages for detecting motifs that other tools may not discover. We recommend that users use multiple motif finding Web tools that implement different algorithms to obtain significant motifs, overlapping similar motifs, and non-overlapping motifs. Finally, we provided suggestions for the future development of motif finding Web tools that better assist researchers in finding motifs in ChIP-Seq data. Reviewers: This article was reviewed by Prof. Sandor Pongor, Dr. Yuriy Gusev, and Dr. Shyam Prabhakar (nominated by Prof. Limsoon Wong). PMID:24555784

  1. Sequence analysis of the L protein of the Ebola 2014 outbreak: Insight into conserved regions and mutations.

    PubMed

    Ayub, Gohar; Waheed, Yasir

    2016-06-01

    The 2014 Ebola outbreak was one of the largest that have occurred; it started in Guinea and spread to Nigeria, Liberia and Sierra Leone. Phylogenetic analysis of the current virus species indicated that this outbreak is the result of a divergent lineage of the Zaire ebolavirus. The L protein of Ebola virus (EBOV) is the catalytic subunit of the RNA‑dependent RNA polymerase complex, which, with VP35, is key for the replication and transcription of viral RNA. Earlier sequence analysis demonstrated that the L protein of all non‑segmented negative‑sense (NNS) RNA viruses consists of six domains containing conserved functional motifs. The aim of the present study was to analyze the presence of these motifs in 2014 EBOV isolates, highlight their function and how they may contribute to the overall pathogenicity of the isolates. For this purpose, 81 2014 EBOV L protein sequences were aligned with 475 other NNS RNA viruses, including Paramyxoviridae and Rhabdoviridae viruses. Phylogenetic analysis of all EBOV outbreak L protein sequences was also performed. Analysis of the amino acid substitutions in the 2014 EBOV outbreak was conducted using sequence analysis. The alignment demonstrated the presence of previously conserved motifs in the 2014 EBOV isolates and novel residues. Notably, all the mutations identified in the 2014 EBOV isolates were tolerant, they were pathogenic with certain examples occurring within previously determined functional conserved motifs, possibly altering viral pathogenicity, replication and virulence. The phylogenetic analysis demonstrated that all sequences with the exception of the 2014 EBOV sequences were clustered together. The 2014 EBOV outbreak has acquired a great number of mutations, which may explain the reasons behind this unprecedented outbreak. Certain residues critical to the function of the polymerase remain conserved and may be targets for the development of antiviral therapeutic agents.

  2. Imperfect duplicate insertions type of mutations in plasmepsin V modulates binding properties of PEXEL motifs of export proteins in Indian Plasmodium vivax.

    PubMed

    Rawat, Manmeet; Vijay, Sonam; Gupta, Yash; Tiwari, Pramod Kumar; Sharma, Arun

    2013-01-01

    Plasmepsin V (PM-V) has functionally conserved orthologues across the Plasmodium genus, whose binding and antigenic processing at the PEXEL motifs for export of about 200-300 essential proteins is important for the virulence and viability of the causative Plasmodium species. This study was undertaken to determine the P. vivax plasmepsin V Ind (PvPM-V-Ind) PEXEL motif export pathway for the export of pathogenicity-related proteins/antigens, thereby altering the Plasmodium exportome during erythrocytic stages. We identify and characterize the Plasmodium vivax plasmepsin-V-Ind (mutant) gene by cloning, sequence analysis, in silico bioinformatic protocols and structural modeling predictions based on docking studies on binding capacity with PEXEL motifs processing in terms of binding and accessibility of export proteins. Cloning and sequence analysis for genetic diversity demonstrate that the PvPM-V-Ind (mutant) gene is highly conserved among all isolates from different geographical regions of India. Imperfect duplicate insertion types of mutations (SVSE from 246-249 AA and SLSE from 266-269 AA) were identified among all Indian isolates in comparison to the P. vivax Sal-1 (PvPM-V-Sal 1) isolate. In silico bioinformatics interaction studies of PEXEL peptide and active enzyme reveal that PvPM-V-Ind (mutant) is only active in the endoplasmic reticulum lumen and that membrane embedding is essential for activation of plasmepsin V. Structural modeling predictions based on docking studies with the PEXEL motif show significant variation in substrate protein binding of these imperfect mutations with data mined PEXEL sequences. The predicted variation in the docking score and interacting amino acids of PvPM-V-Ind (mutant) proteins with PEXEL and lopinavir suggests a modulation in the activity of PvPM-V in terms of binding and accessibility at these sites. Our functional modeled validation of PvPM-V-Ind (mutant) imperfect duplicate insertions with data mined PEXEL sequences leading to altered binding and substrate accessibility

  3. Imperfect Duplicate Insertions Type of Mutations in Plasmepsin V Modulates Binding Properties of PEXEL Motifs of Export Proteins in Indian Plasmodium vivax

    PubMed Central

    Rawat, Manmeet; Vijay, Sonam; Gupta, Yash; Tiwari, Pramod Kumar; Sharma, Arun

    2013-01-01

    Introduction: Plasmepsin V (PM-V) has functionally conserved orthologues across the Plasmodium genus, whose binding and antigenic processing at the PEXEL motifs for export of about 200–300 essential proteins is important for the virulence and viability of the causative Plasmodium species. This study was undertaken to determine the P. vivax plasmepsin V Ind (PvPM-V-Ind) PEXEL motif export pathway for the export of pathogenicity-related proteins/antigens, thereby altering the Plasmodium exportome during erythrocytic stages. Method: We identify and characterize the Plasmodium vivax plasmepsin-V-Ind (mutant) gene by cloning, sequence analysis, in silico bioinformatic protocols and structural modeling predictions based on docking studies on binding capacity with PEXEL motifs processing in terms of binding and accessibility of export proteins. Results: Cloning and sequence analysis for genetic diversity demonstrate that the PvPM-V-Ind (mutant) gene is highly conserved among all isolates from different geographical regions of India. Imperfect duplicate insertion types of mutations (SVSE from 246–249 AA and SLSE from 266–269 AA) were identified among all Indian isolates in comparison to the P. vivax Sal-1 (PvPM-V-Sal 1) isolate. In silico bioinformatics interaction studies of PEXEL peptide and active enzyme reveal that PvPM-V-Ind (mutant) is only active in the endoplasmic reticulum lumen and that membrane embedding is essential for activation of plasmepsin V. Structural modeling predictions based on docking studies with the PEXEL motif show significant variation in substrate protein binding of these imperfect mutations with data mined PEXEL sequences. The predicted variation in the docking score and interacting amino acids of PvPM-V-Ind (mutant) proteins with PEXEL and lopinavir suggests a modulation in the activity of PvPM-V in terms of binding and accessibility at these sites. Conclusion/Significance: Our functional modeled validation of PvPM-V-Ind (mutant) imperfect duplicate insertions with data mined PEXEL

  4. Hydrophobic Motif Phosphorylation Coordinates Activity and Polar Localization of the Neurospora crassa Nuclear Dbf2-Related Kinase COT1

    PubMed Central

    Maerz, Sabine; Dettmann, Anne

    2012-01-01

    Nuclear Dbf2p-related (NDR) kinases and associated proteins are recognized as a conserved network that regulates eukaryotic cell polarity. NDR kinases require association with MOB adaptor proteins and phosphorylation of two conserved residues in the activation segment and hydrophobic motif for activity and function. We demonstrate that the Neurospora crassa NDR kinase COT1 forms inactive dimers via a conserved N-terminal extension, which is also required for the interaction of the kinase with MOB2 to generate heterocomplexes with basal activity. Basal kinase activity also requires autophosphorylation of the COT1-MOB2 complex in the activation segment, while hydrophobic motif phosphorylation of COT1 by the germinal center kinase POD6 fully activates COT1 through induction of a conformational change. Hydrophobic motif phosphorylation is also required for plasma membrane association of the COT1-MOB2 complex. MOB2 further restricts the membrane-associated kinase complex to the hyphal apex to promote polar cell growth. These data support an integrated mechanism of NDR kinase regulation in vivo, in which kinase activation and cellular localization of COT1 are coordinated by dual phosphorylation and interaction with MOB2. PMID:22451488

  5. The MARVEL transmembrane motif of occludin mediates oligomerization and targeting to the basolateral surface in epithelia.

    PubMed

    Yaffe, Yakey; Shepshelovitch, Jeanne; Nevo-Yassaf, Inbar; Yeheskel, Adva; Shmerling, Hedva; Kwiatek, Joanna M; Gaus, Katharina; Pasmanik-Chor, Metsada; Hirschberg, Koret

    2012-08-01

    Occludin (Ocln), a MARVEL-motif-containing protein, is found in all tight junctions. MARVEL motifs are comprised of four transmembrane helices associated with the localization to or formation of diverse membrane subdomains by interacting with the proximal lipid environment. The functions of the Ocln MARVEL motif are unknown. Bioinformatics sequence- and structure-based analyses demonstrated that the MARVEL domain of Ocln family proteins has distinct evolutionarily conserved sequence features that are consistent with its basolateral membrane localization. Live-cell microscopy, fluorescence resonance energy transfer (FRET) and bimolecular fluorescence complementation (BiFC) were used to analyze the intracellular distribution and self-association of fluorescent-protein-tagged full-length human Ocln or the Ocln MARVEL motif excluding the cytosolic C- and N-termini (amino acids 60-269, FP-MARVEL-Ocln). FP-MARVEL-Ocln efficiently arrived at the plasma membrane (PM) and was sorted to the basolateral PM in filter-grown polarized MDCK cells. A series of conserved aromatic amino acids within the MARVEL domain were found to be associated with Ocln dimerization using BiFC. FP-MARVEL-Ocln inhibited membrane pore growth during Triton-X-100-induced solubilization and was shown to increase the membrane-ordered state using Laurdan, a lipid dye. These data demonstrate that the Ocln MARVEL domain mediates self-association and correct sorting to the basolateral membrane.

  6. RNA Bricks—a database of RNA 3D motifs and their interactions

    PubMed Central

    Chojnowski, Grzegorz; Waleń, Tomasz; Bujnicki, Janusz M.

    2014-01-01

    The RNA Bricks database (http://iimcb.genesilico.pl/rnabricks), stores information about recurrent RNA 3D motifs and their interactions, found in experimentally determined RNA structures and in RNA–protein complexes. In contrast to other similar tools (RNA 3D Motif Atlas, RNA Frabase, Rloom) RNA motifs, i.e. ‘RNA bricks’ are presented in the molecular environment, in which they were determined, including RNA, protein, metal ions, water molecules and ligands. All nucleotide residues in RNA bricks are annotated with structural quality scores that describe real-space correlation coefficients with the electron density data (if available), backbone geometry and possible steric conflicts, which can be used to identify poorly modeled residues. The database is also equipped with an algorithm for 3D motif search and comparison. The algorithm compares spatial positions of backbone atoms of the user-provided query structure and of stored RNA motifs, without relying on sequence or secondary structure information. This enables the identification of local structural similarities among evolutionarily related and unrelated RNA molecules. Besides, the search utility enables searching ‘RNA bricks’ according to sequence similarity, and makes it possible to identify motifs with modified ribonucleotide residues at specific positions. PMID:24220091

  7. A Conserved Structural Motif Mediates Retrograde Trafficking of Shiga Toxin Types 1 and 2.

    PubMed

    Selyunin, Andrey S; Mukhopadhyay, Somshuvra

    2015-12-01

    Shiga toxin-producing Escherichia coli (STEC) produce two types of Shiga toxin (STx): STx1 and STx2. The toxin A-subunits block protein synthesis, while the B-subunits mediate retrograde trafficking. STEC infections do not have definitive treatments, and there is growing interest in generating toxin transport inhibitors for therapy. However, a comprehensive understanding of the mechanisms of toxin trafficking is essential for drug development. While STx2 is more toxic in vivo, prior studies focused on STx1 B-subunit (STx1B) trafficking. Here, we show that, compared with STx1B, trafficking of the B-subunit of STx2 (STx2B) to the Golgi occurs with slower kinetics. Despite this difference, similar to STx1B, endosome-to-Golgi transport of STx2B does not involve transit through degradative late endosomes and is dependent on dynamin II, epsinR, retromer and syntaxin5. Importantly, additional experiments show that a surface-exposed loop in STx2B (β4-β5 loop) is required for its endosome-to-Golgi trafficking. We previously demonstrated that residues in the corresponding β4-β5 loop of STx1B are required for interaction with GPP130, the STx1B-specific endosomal receptor, and for endosome-to-Golgi transport. Overall, STx1B and STx2B share a common pathway and use a similar structural motif to traffic to the Golgi, suggesting that the underlying mechanisms of endosomal sorting may be evolutionarily conserved. © 2015 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  8. BEAM web server: a tool for structural RNA motif discovery.

    PubMed

    Pietrosanto, Marco; Adinolfi, Marta; Casula, Riccardo; Ausiello, Gabriele; Ferrè, Fabrizio; Helmer-Citterich, Manuela

    2018-03-15

    RNA structural motif finding is a relevant problem that becomes computationally hard when working on high-throughput data (e.g. eCLIP, PAR-CLIP), often represented by thousands of RNA molecules. Currently, the BEAM server is the only web tool capable of handling tens of thousands of RNAs as input, with a motif discovery procedure that is limited only by current secondary structure prediction accuracies. The recently developed method BEAM (BEAr Motifs finder) can analyze tens of thousands of RNA molecules and identify RNA secondary structure motifs associated with a measure of their statistical significance. BEAM is extremely fast thanks to the BEAR encoding, which transforms each RNA secondary structure into a string of characters. BEAM also exploits the evolutionary knowledge contained in a substitution matrix of secondary structure elements, extracted from the RFAM database of families of homologous RNAs. The BEAM web server has been designed to streamline data pre-processing by automatically handling folding and encoding of RNA sequences, giving users a choice of folding program. The server provides an intuitive and informative results page with the list of secondary structure motifs identified, the logo of each motif, its significance, graphic representation and information about its position in the RNA molecules sharing it. The web server is freely available at http://beam.uniroma2.it/ and it is implemented in NodeJS and Python with all major browsers supported. Contact: marco.pietrosanto@uniroma2.it. Supplementary data are available at Bioinformatics online.

  9. Functional Analysis of Light-harvesting-like Protein 3 (LIL3) and Its Light-harvesting Chlorophyll-binding Motif in Arabidopsis*

    PubMed Central

    Takahashi, Kaori; Takabayashi, Atsushi; Tanaka, Ayumi; Tanaka, Ryouichi

    2014-01-01

    The light-harvesting complex (LHC) constitutes the major light-harvesting antenna of photosynthetic eukaryotes. LHC contains a characteristic sequence motif, termed LHC motif, consisting of 25–30 mostly hydrophobic amino acids. This motif is shared by a number of transmembrane proteins from oxygenic photoautotrophs that are termed light-harvesting-like (LIL) proteins. To gain insights into the functions of LIL proteins and their LHC motifs, we functionally characterized a plant LIL protein, LIL3. This protein has been shown previously to stabilize geranylgeranyl reductase (GGR), a key enzyme in phytol biosynthesis. It is hypothesized that LIL3 functions to anchor GGR to membranes. First, we conjugated the transmembrane domain of LIL3 or that of ascorbate peroxidase to GGR and expressed these chimeric proteins in an Arabidopsis mutant lacking LIL3 protein. As a result, the transgenic plants restored phytol-synthesizing activity. These results indicate that GGR is active as long as it is anchored to membranes, even in the absence of LIL3. Subsequently, we addressed the question why the LHC motif is conserved in the LIL3 sequences. We modified the transmembrane domain of LIL3, which contains the LHC motif, by substituting its conserved amino acids (Glu-171, Asn-174, and Asp-189) with alanine. As a result, the Arabidopsis transgenic plants partly recovered the phytol-biosynthesizing activity. However, in these transgenic plants, the LIL3-GGR complexes were partially dissociated. Collectively, these results indicate that the LHC motif of LIL3 is involved in the complex formation of LIL3 and GGR, which might contribute to the GGR reaction. PMID:24275650

  10. Identifying species conservation strategies to reduce disease-associated declines

    USGS Publications Warehouse

    Gerber, Brian D.; Converse, Sarah J.; Muths, Erin L.; Crockett, Harry J.; Mosher, Brittany A.; Bailey, Larissa L.

    2018-01-01

    Emerging infectious diseases (EIDs) are a salient threat to many animal taxa, causing local and global extinctions and altering communities and ecosystem function. The EID chytridiomycosis, which is caused by the fungal pathogen Batrachochytrium dendrobatidis (Bd), is a prominent driver of amphibian declines. To guide conservation policy, we developed a predictive decision-analytic model that combines empirical knowledge of host-pathogen metapopulation dynamics with expert judgment regarding effects of management actions, to select from potential conservation strategies. We apply our approach to a boreal toad (Anaxyrus boreas boreas) and Bd system, identifying optimal strategies that balance tradeoffs in maximizing toad population persistence and landscape-level distribution, while considering costs. The most robust strategy is expected to reduce the decline of toad breeding sites from 53% to 21% over 50 years. Our findings are incorporated into management policy to guide conservation planning. Our online modeling application provides a template for managers of other systems challenged by EIDs.

  11. Genome-wide comparison of ferritin family from Archaea, Bacteria, Eukarya, and Viruses: its distribution, characteristic motif, and phylogenetic relationship

    NASA Astrophysics Data System (ADS)

    Bai, Lina; Xie, Ting; Hu, Qingqing; Deng, Changyan; Zheng, Rong; Chen, Wanping

    2015-10-01

    Ferritins are highly conserved proteins that are widely distributed in various species from archaea to humans. The ubiquitous characteristic of these proteins reflects the pivotal contribution of ferritins to the safe storage and timely delivery of iron to achieve iron homeostasis. This study investigated the ferritin genes in 248 genomes from various species, including viruses, archaea, bacteria, and eukarya. The distribution comparison suggests that mammals and eudicots possess abundant ferritin genes, whereas fungi contain very few ferritin genes. Archaea and bacteria show considerable numbers of ferritin genes. Generally, prokaryotes possess three types of ferritin (the typical ferritin, bacterioferritin, and DNA-binding protein from starved cell), whereas eukaryotes have various subunit types of ferritin, thereby indicating the individuation of the ferritin family during evolution. The characteristic motif analysis of ferritins suggested that all key residues specifying the unique structural motifs of ferritin are highly conserved across three domains of life. Meanwhile, the characteristic motifs were also distinguishable between ferritin groups, especially phytoferritins, which show a plant-specific motif. The phylogenetic analyses show that ferritins within the same subfamily or subunits are generally clustered together. The phylogenetic relationships among ferritin members suggest that both gene duplication and horizontal transfer contribute to the wide variety of ferritins, and their possible evolutionary scenario was also proposed. The results contribute to a better understanding of the distribution, characteristic motif, and evolutionary relationship of the ferritin family.

  12. A novel motif in the yeast mitochondrial dynamin Dnm1 is essential for adaptor binding and membrane recruitment

    PubMed Central

    Bui, Huyen T.; Karren, Mary A.; Bhar, Debjani

    2012-01-01

    To initiate mitochondrial fission, dynamin-related proteins (DRPs) must bind specific adaptors on the outer mitochondrial membrane. The structural features underlying this interaction are poorly understood. Using yeast as a model, we show that the Insert B domain of the Dnm1 guanosine triphosphatase (a DRP) contains a novel motif required for association with the mitochondrial adaptor Mdv1. Mutation of this conserved motif specifically disrupted Dnm1–Mdv1 interactions, blocking Dnm1 recruitment and mitochondrial fission. Suppressor mutations in Mdv1 that restored Dnm1–Mdv1 interactions and fission identified potential protein-binding interfaces on the Mdv1 β-propeller domain. These results define the first known function for Insert B in DRP–adaptor interactions. Based on the variability of Insert B sequences and adaptor proteins, we propose that Insert B domains and mitochondrial adaptors have coevolved to meet the unique requirements for mitochondrial fission of different organisms. PMID:23148233

  13. PISMA: A Visual Representation of Motif Distribution in DNA Sequences.

    PubMed

    Alcántara-Silva, Rogelio; Alvarado-Hermida, Moisés; Díaz-Contreras, Gibrán; Sánchez-Barrios, Martha; Carrera, Samantha; Galván, Silvia Carolina

    2017-01-01

    Because the graphical presentation and analysis of motif distribution can provide insights for experimental hypotheses, PISMA aims at identifying motifs in DNA sequences, counting them, and showing them graphically. The motif length ranges from 2 to 10 bases, and the DNA sequences range up to 10 kb. The motif distribution is shown as a bar-code-like scheme, a gene-map-like scheme, and a transcript scheme. We obtained graphical schemes of the CpG site distribution from 91 human papillomavirus genomes. Also, we present 2 analyses: one of DNA motifs associated with either methylation-resistant or methylation-sensitive CpG islands and another of motifs associated with exosome RNA secretion. PISMA is developed in Java; it is executable on any type of hardware and on diverse operating systems. PISMA is freely available to noncommercial users. The English version and the User Manual are provided in Supplementary Files 1 and 2, and a Spanish version is available at www.biomedicas.unam.mx/wp-content/software/pisma.zip and www.biomedicas.unam.mx/wp-content/pdf/manual/pisma.pdf.
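
    The counting and bar-code-style display described in this record can be illustrated with a minimal Python sketch. It is not PISMA's code (PISMA is a Java tool whose internals are not given here); the function names `find_motif_positions` and `barcode` and the example sequence are hypothetical, and only the 2 to 10 base motif length limit is taken from the abstract.

```python
def find_motif_positions(sequence: str, motif: str) -> list[int]:
    """Return 0-based start positions of every occurrence of motif,
    including overlapping ones. Hypothetical helper, not PISMA's API."""
    seq = sequence.upper()
    pat = motif.upper()
    # The abstract states that motifs range from 2 to 10 bases.
    if not 2 <= len(pat) <= 10:
        raise ValueError("motif length must be between 2 and 10 bases")
    positions = []
    start = seq.find(pat)
    while start != -1:
        positions.append(start)
        # Advance by one base so overlapping matches are also found.
        start = seq.find(pat, start + 1)
    return positions


def barcode(sequence: str, motif: str) -> str:
    """Render a one-line text 'bar code': '|' at each motif start, '.' elsewhere."""
    marks = ["."] * len(sequence)
    for pos in find_motif_positions(sequence, motif):
        marks[pos] = "|"
    return "".join(marks)


if __name__ == "__main__":
    example = "ACGTCGCGAATTCG"  # made-up sequence for illustration
    hits = find_motif_positions(example, "CG")
    print(len(hits), hits)
    print(example)
    print(barcode(example, "CG"))
```

    A plain substring scan is sufficient for exact motifs of this length on sequences up to 10 kb; degenerate motifs would need a regular expression or a position weight matrix instead.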

  14. Evolution of the Ferric Reductase Domain (FRD) Superfamily: Modularity, Functional Diversification, and Signature Motifs

    PubMed Central

    Zhang, Xuezhi; Krause, Karl-Heinz; Xenarios, Ioannis; Soldati, Thierry; Boeckmann, Brigitte

    2013-01-01

    A heme-containing transmembrane ferric reductase domain (FRD) is found in bacterial and eukaryotic protein families, including ferric reductases (FRE) and NADPH oxidases (NOX). The aim of this study was to understand the phylogeny of the FRD superfamily. Bacteria contain FRD proteins consisting only of the ferric reductase domain, such as YedZ and short bFRE proteins. Full-length FRE and NOX enzymes are mostly found in eukaryotic cells and all possess a dehydrogenase domain, allowing them to catalyze electron transfer from cytosolic NADPH to extracellular metal ions (FRE) or oxygen (NOX). Metazoa possess YedZ-related STEAP proteins, possibly derived from bacteria through horizontal gene transfer. Phylogenetic analyses suggest that FRE enzymes appeared early in evolution, followed by a transition towards EF-hand-containing NOX enzymes (NOX5- and DUOX-like). An ancestral gene of the NOX(1-4) family probably lost the EF-hands, and new regulatory mechanisms of increasing complexity evolved in this clade. Two signature motifs were identified: NOX enzymes are distinguished from FRE enzymes through a four-amino-acid motif spanning from transmembrane domain 3 (TM3) to TM4, and YedZ/STEAP proteins are identified by the replacement of the first canonical heme-spanning histidine by a highly conserved arginine. The FRD superfamily most likely originated in bacteria. PMID:23505460

  15. Evolution of the ferric reductase domain (FRD) superfamily: modularity, functional diversification, and signature motifs.

    PubMed

    Zhang, Xuezhi; Krause, Karl-Heinz; Xenarios, Ioannis; Soldati, Thierry; Boeckmann, Brigitte

    2013-01-01

    A heme-containing transmembrane ferric reductase domain (FRD) is found in bacterial and eukaryotic protein families, including ferric reductases (FRE) and NADPH oxidases (NOX). The aim of this study was to understand the phylogeny of the FRD superfamily. Bacteria contain FRD proteins consisting only of the ferric reductase domain, such as YedZ and short bFRE proteins. Full-length FRE and NOX enzymes are mostly found in eukaryotic cells and all possess a dehydrogenase domain, allowing them to catalyze electron transfer from cytosolic NADPH to extracellular metal ions (FRE) or oxygen (NOX). Metazoa possess YedZ-related STEAP proteins, possibly derived from bacteria through horizontal gene transfer. Phylogenetic analyses suggest that FRE enzymes appeared early in evolution, followed by a transition towards EF-hand-containing NOX enzymes (NOX5- and DUOX-like). An ancestral gene of the NOX(1-4) family probably lost the EF-hands, and new regulatory mechanisms of increasing complexity evolved in this clade. Two signature motifs were identified: NOX enzymes are distinguished from FRE enzymes through a four-amino-acid motif spanning from transmembrane domain 3 (TM3) to TM4, and YedZ/STEAP proteins are identified by the replacement of the first canonical heme-spanning histidine by a highly conserved arginine. The FRD superfamily most likely originated in bacteria.

  16. Process-based network decomposition reveals backbone motif structure

    PubMed Central

    Wang, Guanyu; Du, Chenghang; Chen, Hao; Simha, Rahul; Rong, Yongwu; Xiao, Yi; Zeng, Chen

    2010-01-01

    A central challenge in systems biology today is to understand the network of interactions among biomolecules and, especially, the organizing principles underlying such networks. Recent analysis of known networks has identified small motifs that occur ubiquitously, suggesting that larger networks might be constructed in the manner of electronic circuits by assembling groups of these smaller modules. Using a unique process-based approach to analyzing such networks, we show for two cell-cycle networks that each of these networks contains a giant backbone motif spanning all the network nodes that provides the main functional response. The backbone is in fact the smallest network capable of providing the desired functionality. Furthermore, the remaining edges in the network form smaller motifs whose role is to confer stability properties rather than provide function. The process-based approach used in the above analysis has additional benefits: It is scalable, analytic (resulting in a single analyzable expression that describes the behavior), and computationally efficient (all possible minimal networks for a biological process can be identified and enumerated). PMID:20498084

  17. iFORM: Incorporating Find Occurrence of Regulatory Motifs.

    PubMed

    Ren, Chao; Chen, Hebing; Yang, Bite; Liu, Feng; Ouyang, Zhangyi; Bo, Xiaochen; Shu, Wenjie

    2016-01-01

    Accurately identifying the binding sites of transcription factors (TFs) is crucial to understanding the mechanisms of transcriptional regulation and human disease. We present incorporating Find Occurrence of Regulatory Motifs (iFORM), an easy-to-use and efficient tool for scanning DNA sequences with TF motifs described as position weight matrices (PWMs). Both performance assessment with a receiver operating characteristic (ROC) curve and a correlation-based approach demonstrated that iFORM achieves higher accuracy and sensitivity by integrating five classical motif discovery programs using Fisher's combined probability test. We have used iFORM to provide accurate results on a variety of data in the ENCODE Project and the NIH Roadmap Epigenomics Project, and the tool has demonstrated its utility in further elucidating individual roles of functional elements. Both the source and binary codes for iFORM can be freely accessed at https://github.com/wenjiegroup/iFORM. The identified TF binding sites across human cell and tissue types using iFORM have been deposited in the Gene Expression Omnibus under the accession ID GSE53962.
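
    Fisher's combined probability test, which this record names as the integration step, can be sketched in a few lines of Python. The sketch assumes each program yields an independent p-value for the same candidate site; how iFORM actually derives and combines its per-program scores is not stated in the abstract, and the function name `fisher_combined_pvalue` and the example p-values are hypothetical.

```python
import math


def fisher_combined_pvalue(p_values: list[float]) -> tuple[float, float]:
    """Combine independent p-values with Fisher's method.

    Returns (statistic, combined_p), where statistic = -2 * sum(ln p_i)
    follows a chi-square distribution with 2k degrees of freedom under
    the null hypothesis. Hypothetical helper, not iFORM's API.
    """
    if not p_values:
        raise ValueError("at least one p-value is required")
    if any(not 0.0 < p <= 1.0 for p in p_values):
        raise ValueError("p-values must lie in the interval (0, 1]")
    k = len(p_values)
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    half = statistic / 2.0
    # For an even number of degrees of freedom (2k), the chi-square
    # survival function has the closed form
    #   exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return statistic, math.exp(-half) * total


if __name__ == "__main__":
    # Made-up p-values standing in for five scanning programs.
    stat, p = fisher_combined_pvalue([0.04, 0.10, 0.02, 0.30, 0.07])
    print(f"chi-square statistic = {stat:.3f}, combined p = {p:.5f}")
```

    The closed-form survival function keeps the sketch free of third-party dependencies; `scipy.stats.combine_pvalues` offers the same test where SciPy is available.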

  18. Identifying all moiety conservation laws in genome-scale metabolic networks.

    PubMed

    De Martino, Andrea; De Martino, Daniele; Mulet, Roberto; Pagnani, Andrea

    2014-01-01

    The stoichiometry of a metabolic network gives rise to a set of conservation laws for the aggregate level of specific pools of metabolites, which, on one hand, pose dynamical constraints that cross-link the variations of metabolite concentrations and, on the other, provide key insight into a cell's metabolic production capabilities. When the conserved quantity identifies with a chemical moiety, extracting all such conservation laws from the stoichiometry amounts to finding all non-negative integer solutions of a linear system, a programming problem known to be NP-hard. We present an efficient strategy to compute the complete set of integer conservation laws of a genome-scale stoichiometric matrix, also providing a certificate for correctness and maximality of the solution. Our method is deployed for the analysis of moiety conservation relationships in two large-scale reconstructions of the metabolism of the bacterium E. coli, in six tissue-specific human metabolic networks, and, finally, in the human reactome as a whole, revealing that bacterial metabolism could be evolutionarily designed to cover broader production spectra than human metabolism. Convergence to the full set of moiety conservation laws in each case is achieved in extremely reduced computing times. In addition, we uncover a scaling relation that links the size of the independent pool basis to the number of metabolites, for which we present an analytical explanation.
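
    To make the notion of a conservation law concrete: a weight vector over metabolites is conserved when its product with every reaction column of the stoichiometric matrix is zero. The sketch below only verifies candidate vectors on a toy network; the matrix, the metabolite names and the function name are illustrative assumptions, and it does not reproduce the authors' method for enumerating all non-negative integer solutions.

```python
def is_conservation_law(weights, stoich):
    """Return True if weights . S == 0 for every reaction column of S.

    `stoich` is a list of rows (one per metabolite); each row holds the
    stoichiometric coefficient of that metabolite in each reaction.
    """
    n_reactions = len(stoich[0])
    return all(
        sum(w * row[j] for w, row in zip(weights, stoich)) == 0
        for j in range(n_reactions)
    )


# Toy network (hypothetical): R1: ATP + Glc -> ADP + G6P, R2: ADP -> ATP
# Rows: ATP, ADP, Glc, G6P; columns: R1, R2
STOICH = [
    [-1, 1],
    [1, -1],
    [-1, 0],
    [1, 0],
]

adenylate_pool = [1, 1, 0, 0]  # ATP + ADP stays constant
atp_alone = [1, 0, 0, 0]       # ATP on its own is not conserved
```

    Here `is_conservation_law(adenylate_pool, STOICH)` holds while `is_conservation_law(atp_alone, STOICH)` does not, which mirrors the idea of a conserved moiety pool.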

  19. Computational and experimental analysis of short peptide motifs for enzyme inhibition.

    PubMed

    Fu, Jinglin; Larini, Luca; Cooper, Anthony J; Whittaker, John W; Ahmed, Azka; Dong, Junhao; Lee, Minyoung; Zhang, Ting

    2017-01-01

    The metabolism of living systems involves many enzymes that play key roles as catalysts and are essential to biological function. Searching for ligands with the ability to modulate enzyme activities is central to diagnosis and therapeutics. Peptides represent a promising class of potential enzyme modulators due to their large chemical diversity and well-established methods for library synthesis. Peptides and their derivatives are found to play critical roles in modulating enzymes and mediating cellular uptake, which are increasingly valuable in therapeutics. We present a methodology that uses molecular dynamics (MD) and point-variant screening to identify short peptide motifs that are critical for inhibiting β-galactosidase (β-Gal). MD was used to simulate the conformations of peptides and to suggest short motifs that were most populated in simulated conformations. The function of the simulated motifs was further validated by the experimental point-variant screening as critical segments for inhibiting the enzyme. Based on the validated motifs, we eventually identified a 7-mer short peptide for inhibiting an enzyme with low μM IC50. The advantage of our methodology is the relatively simplified simulation that is informative enough to identify the critical sequence of a peptide inhibitor, with a precision comparable to truncation and alanine scanning experiments. Our combined experimental and computational approach does not rely on a detailed understanding of mechanistic and structural details. The MD simulation suggests the populated motifs that are consistent with the results of the experimental alanine and truncation scanning. This approach appears to be applicable to both natural and artificial peptides. With more discovered short motifs in the future, they could be exploited for modulating biocatalysis, and developing new medicine.

  20. Motif mismatches in microsatellites: insights from genome-wide investigation among 20 insect species.

    PubMed

    Behura, Susanta K; Severson, David W

    2015-02-01

    We present a detailed genome-wide comparative study of motif mismatches of microsatellites among 20 insect species representing five taxonomic orders. The results show that varying proportions (∼15-46%) of microsatellites identified in these species are imperfect in motif structure, and that they also vary in chromosomal distribution within genomes. It was observed that the genomic abundance of imperfect repeats is significantly associated with the length and number of motif mismatches of microsatellites. Furthermore, microsatellites with a higher number of mismatches tend to have lower abundance in the genome, suggesting that sequence heterogeneity of repeat motifs is a key determinant of genomic abundance of microsatellites. This relationship seems to be a general feature of microsatellites even in unrelated species such as yeast, roundworm, mouse and human. We provide a mechanistic explanation of the evolutionary link between motif heterogeneity and genomic abundance of microsatellites by examining the patterns of motif mismatches and allele sequences of single-nucleotide polymorphisms identified within microsatellite loci. Using Drosophila Reference Genetic Panel data, we further show that pattern of allelic variation modulates motif heterogeneity of microsatellites, and provide estimates of allele age of specific imperfect microsatellites found within protein-coding genes. © The Author 2014. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.

  1. Structural complexity of Dengue virus untranslated regions: cis-acting RNA motifs and pseudoknot interactions modulating functionality of the viral genome

    PubMed Central

    Sztuba-Solinska, Joanna; Teramoto, Tadahisa; Rausch, Jason W.; Shapiro, Bruce A.; Padmanabhan, Radhakrishnan; Le Grice, Stuart F. J.

    2013-01-01

    The Dengue virus (DENV) genome contains multiple cis-acting elements required for translation and replication. Previous studies indicated that a 719-nt subgenomic minigenome (DENV-MINI) is an efficient template for translation and (−) strand RNA synthesis in vitro. We performed a detailed structural analysis of DENV-MINI RNA, combining chemical acylation techniques, Pb2+ ion-induced hydrolysis and site-directed mutagenesis. Our results highlight protein-independent 5′–3′ terminal interactions involving hybridization between recognized cis-acting motifs. Probing analyses identified tandem dumbbell structures (DBs) within the 3′ terminus spaced by single-stranded regions, internal loops and hairpins with embedded GNRA-like motifs. Analysis of conserved motifs and top loops (TLs) of these dumbbells, and their proposed interactions with downstream pseudoknot (PK) regions, predicted an H-type pseudoknot involving TL1 of the 5′ DB and the complementary region, PK2. As disrupting the TL1/PK2 interaction, via ‘flipping’ mutations of PK2, previously attenuated DENV replication, this pseudoknot may participate in regulation of RNA synthesis. Computer modeling implied that this motif might function as autonomous structural/regulatory element. In addition, our studies targeting elements of the 3′ DB and its complementary region PK1 indicated that communication between 5′–3′ terminal regions strongly depends on structure and sequence composition of the 5′ cyclization region. PMID:23531545

  2. Infection of capilloviruses requires subgenomic RNAs whose transcription is controlled by promoter-like sequences conserved among flexiviruses.

    PubMed

    Komatsu, Ken; Hirata, Hisae; Fukagawa, Takako; Yamaji, Yasuyuki; Okano, Yukari; Ishikawa, Kazuya; Adachi, Tatsushi; Maejima, Kensaku; Hashimoto, Masayoshi; Namba, Shigetou

    2012-07-01

    The first open-reading frame (ORF) of apple stem grooving virus (ASGV), of the genus Capillovirus, encodes an apparently chimeric polyprotein containing conserved regions for replicase (Rep) and coat protein (CP). However, our previous study revealed that ASGV mutants with distinct and discontinuous Rep- and CP-coding regions successfully infect plants, indicating that CP expressed via a subgenomic RNA (sgRNA) is sufficient for viability of the virus. Here we identified a transcription start site of the CP sgRNA and revealed that CP translated from the sgRNA is essential for ASGV infection. We mapped the transcription start sites of both the CP and the movement protein (MP) sgRNAs of ASGV and found a hexanucleotide motif, UUAGGU, conserved upstream from both sgRNA transcription start sites. Mutational analysis of the putative CP initiation codon and of the UUAGGU sequence upstream from the transcription start site of CP sgRNA demonstrated their importance for ASGV accumulation. Our results also demonstrated that potato virus T (PVT), an unassigned species closely related to ASGV, produces two sgRNAs putatively deployed for the CP and MP expression and that the same hexanucleotide motif as found in ASGV is located upstream from the transcription start sites of both sgRNAs. This motif, which constituted putative core elements of the sgRNA promoter, is broadly conserved among viruses in the families Alphaflexiviridae and Betaflexiviridae, suggesting that the gene expression strategy of the viruses in both families has been conserved throughout evolution. Copyright © 2012 Elsevier B.V. All rights reserved.

  3. Biological network motif detection and evaluation

    PubMed Central

    2011-01-01

    Background: Molecular level of biological data can be constructed into system level of data as biological networks. Network motifs are defined as over-represented small connected subgraphs in networks and they have been used for many biological applications. Since network motif discovery involves computationally challenging processes, previous algorithms have focused on computational efficiency. However, we believe that the biological quality of network motifs is also very important. Results: We define biological network motifs as biologically significant subgraphs and traditional network motifs are differentiated as structural network motifs in this paper. We develop five algorithms, namely, EDGEGO-BNM, EDGEBETWEENNESS-BNM, NMF-BNM, NMFGO-BNM and VOLTAGE-BNM, for efficient detection of biological network motifs, and introduce several evaluation measures including motifs included in complex, motifs included in functional module and GO term clustering score in this paper. Experimental results show that EDGEGO-BNM and EDGEBETWEENNESS-BNM perform better than existing algorithms and all of our algorithms are applicable to find structural network motifs as well. Conclusion: We provide new approaches to finding network motifs in biological networks. Our algorithms efficiently detect biological network motifs and further improve existing algorithms to find high quality structural network motifs, which would be impossible using existing algorithms. The performances of the algorithms are compared based on our new evaluation measures in biological contexts. We believe that our work gives some guidelines of network motifs research for the biological networks. PMID:22784624

  4. Conserved structural and functional aspects of the tripartite motif gene family point towards therapeutic applications in multiple diseases.

    PubMed

    Gushchina, Liubov V; Kwiatkowski, Thomas A; Bhattacharya, Sayak; Weisleder, Noah L

    2018-05-01

    The tripartite motif (TRIM) gene family is a highly conserved group of E3 ubiquitin ligase proteins that can establish substrate specificity for the ubiquitin-proteasome complex and also have proteasome-independent functions. While several family members were studied previously, it is relatively recent that over 80 genes, based on sequence homology, were grouped to establish the TRIM gene family. Functional studies of various TRIM genes linked these proteins to modulation of inflammatory responses showing that they can contribute to a wide variety of disease states including cardiovascular, neurological and musculoskeletal diseases, as well as various forms of cancer. Given the fundamental role of the ubiquitin-proteasome complex in protein turnover and the importance of this regulation in most aspects of cellular physiology, it is not surprising that TRIM proteins display a wide spectrum of functions in a variety of cellular processes. This broad range of function and the highly conserved primary amino acid sequence of family members, particularly in the canonical TRIM E3 ubiquitin ligase domain, complicates the development of therapeutics that specifically target these proteins. A more comprehensive understanding of the structure and function of TRIM proteins will help guide therapeutic development for a number of different diseases. This review summarizes the structural organization of TRIM proteins, their domain architecture, common and unique post-translational modifications within the family, and potential binding partners and targets. Further discussion is provided on efforts to target TRIM proteins as therapeutic agents and how our increasing understanding of the nature of TRIM proteins can guide discovery of other therapeutics in the future. Copyright © 2017 Elsevier Inc. All rights reserved.

  5. In vivo functional mapping of the conserved protein domains within murine Themis1.

    PubMed

    Zvezdova, Ekaterina; Lee, Jan; El-Khoury, Dalal; Barr, Valarie; Akpan, Itoro; Samelson, Lawrence; Love, Paul E

    2014-09-01

    Thymocyte development requires the coordinated input of signals that originate from numerous cell surface molecules. Although the majority of thymocyte signal-initiating receptors are lineage-specific, most trigger 'ubiquitous' downstream signaling pathways. T-lineage-specific receptors are coupled to these signaling pathways by lymphocyte-restricted adapter molecules. We and others recently identified a new putative adapter protein, Themis1, whose expression is largely restricted to the T lineage. Mice lacking Themis1 exhibit a severe block in thymocyte development and a striking paucity of mature T cells revealing a critical role for Themis1 in T-cell maturation. Themis1 orthologs contain three conserved domains: a proline-rich region (PRR) that binds to the ubiquitous cytosolic adapter Grb2, a nuclear localization sequence (NLS), and two copies of a novel cysteine-containing globular (CABIT) domain. In the present study, we evaluated the functional importance of each of these motifs by retroviral reconstitution of Themis1(-/-) progenitor cells. The results demonstrate an essential requirement for the PRR and NLS motifs but not the conserved CABIT cysteines for Themis1 function.

  6. Using a color-coded ambigraphic nucleic acid notation to visualize conserved palindromic motifs within and across genomes

    PubMed Central

    2014-01-01

    Background: Ambiscript is a graphically-designed nucleic acid notation that uses symbol symmetries to support sequence complementation, highlight biologically-relevant palindromes, and facilitate the analysis of consensus sequences. Although the original Ambiscript notation was designed to easily represent consensus sequences for multiple sequence alignments, the notation’s black-on-white ambiguity characters are unable to reflect the statistical distribution of nucleotides found at each position. We now propose a color-augmented ambigraphic notation to encode the frequency of positional polymorphisms in these consensus sequences. Results: We have implemented this color-coding approach by creating an Adobe Flash® application (http://www.ambiscript.org) that shades and colors modified Ambiscript characters according to the prevalence of the encoded nucleotide at each position in the alignment. The resulting graphic helps viewers perceive biologically-relevant patterns in multiple sequence alignments by uniquely combining color, shading, and character symmetries to highlight palindromes and inverted repeats in conserved DNA motifs. Conclusion: Juxtaposing an intuitive color scheme over the deliberate character symmetries of an ambigraphic nucleic acid notation yields a highly-functional nucleic acid notation that maximizes information content and successfully embodies key principles of graphic excellence put forth by the statistician and graphic design theorist, Edward Tufte. PMID:24447494

  7. Systems analysis of cis-regulatory motifs in C4 photosynthesis genes using maize and rice leaf transcriptomic data during a process of de-etiolation

    PubMed Central

    Xu, Jiajia; Bräutigam, Andrea; Weber, Andreas P. M.; Zhu, Xin-Guang

    2016-01-01

    Identification of potential cis-regulatory motifs controlling the development of C4 photosynthesis is a major focus of current research. In this study, we used time-series RNA-seq data collected from etiolated maize and rice leaf tissues sampled during a de-etiolation process to systematically characterize the expression patterns of C4-related genes and to further identify potential cis elements in five different genomic regions (i.e. promoter, 5′UTR, 3′UTR, intron, and coding sequence) of C4 orthologous genes. The results demonstrate that although most of the C4 genes show similar expression patterns, a number of them, including chloroplast dicarboxylate transporter 1, aspartate aminotransferase, and triose phosphate transporter, show shifted expression patterns compared with their C3 counterparts. A number of conserved short DNA motifs between maize C4 genes and their rice orthologous genes were identified not only in the promoter, 5′UTR, 3′UTR, and coding sequences, but also in the introns of core C4 genes. We also identified cis-regulatory motifs that exist in maize C4 genes and also in genes showing similar expression patterns as maize C4 genes but that do not exist in rice C3 orthologs, suggesting a possible recruitment of pre-existing cis-elements from genes unrelated to C4 photosynthesis into C4 photosynthesis genes during C4 evolution. PMID:27436282

  8. Dynamic motifs in socio-economic networks

    NASA Astrophysics Data System (ADS)

    Zhang, Xin; Shao, Shuai; Stanley, H. Eugene; Havlin, Shlomo

    2014-12-01

    Socio-economic networks are of central importance in economic life. We develop a method of identifying and studying motifs in socio-economic networks by focusing on “dynamic motifs,” i.e., evolutionary connection patterns that, because of “node acquaintances” in the network, occur much more frequently than random patterns. We examine two evolving bipartite networks: i) the world-wide commercial ship chartering market and ii) the ship build-to-order market. We find similar dynamic motifs in both bipartite networks, even though they describe different economic activities. We also find that “influence” and “persistence” are strong factors in the interaction behavior of organizations. When two companies are doing business with the same customer, it is highly probable that another customer who currently has a business relationship with only one of these two companies will become a customer of the second in the future. This is the effect of influence. Persistence means that companies with close business ties to customers tend to maintain their relationships over a long period of time.

  9. Acidic and uncharged polar residues in the consensus motifs of the yeast Ca2+ transporter Gdt1p are required for calcium transport.

    PubMed

    Colinet, Anne-Sophie; Thines, Louise; Deschamps, Antoine; Flémal, Gaëlle; Demaegd, Didier; Morsomme, Pierre

    2017-07-01

    The UPF0016 family is a recently identified group of poorly characterized membrane proteins whose function is conserved through evolution and that are defined by the presence of 1 or 2 copies of the E-φ-G-D-[KR]-[TS] consensus motif in their transmembrane domain. We showed that 2 members of this family, the human TMEM165 and the budding yeast Gdt1p, are functionally related and are likely to form a new group of Ca2+ transporters. Mutations in TMEM165 have been demonstrated to cause a new type of rare human genetic diseases denominated as Congenital Disorders of Glycosylation. Using site-directed mutagenesis, we generated 17 mutations in the yeast Golgi-localized Ca2+ transporter Gdt1p. Single alanine substitutions were targeted to the highly conserved consensus motifs, 4 acidic residues localized in the central cytosolic loop, and the arginine at position 71. The mutants were screened in a yeast strain devoid of both the endogenous Gdt1p exchanger and Pmr1p, the Ca2+-ATPase of the Golgi apparatus. We show here that acidic and polar uncharged residues of the consensus motifs play a crucial role in calcium tolerance and calcium transport activity and are therefore likely to be architectural components of the cation binding site of Gdt1p. Importantly, we confirm the essential role of the E53 residue whose mutation in humans triggers congenital disorders of glycosylation. © 2017 John Wiley & Sons Ltd.

  10. Identification of cancer-specific motifs in mimotope profiles of serum antibody repertoire.

    PubMed

    Gerasimov, Ekaterina; Zelikovsky, Alex; Măndoiu, Ion; Ionov, Yurij

    2017-06-07

    For fighting cancer, earlier detection is crucial. Circulating auto-antibodies produced by the patient's own immune system after exposure to cancer proteins are promising bio-markers for the early detection of cancer. Since an antibody recognizes not the whole antigen but 4-7 critical amino acids within the antigenic determinant (epitope), the whole proteome can be represented by a random peptide phage display library. This opens the possibility to develop an early cancer detection test based on a set of peptide sequences identified by comparing cancer patients' and healthy donors' global peptide profiles of antibody specificities. Due to the enormously large number of peptide sequences contained in global peptide profiles generated by next-generation sequencing, a large number of cancer and control sera is required to identify cancer-specific peptides with a high degree of statistical significance. To decrease the number of peptides in profiles generated by next-generation sequencing without losing cancer-specific sequences, we generated the profiles from a phage library enriched by panning on a pool of cancer sera. To further decrease the complexity of profiles, we used computational methods to transform the list of peptides constituting the mimotope profiles into a list of motifs formed by similar peptide sequences. We have shown that the amino-acid order is meaningful in mimotope motifs, since they contain significantly more peptides than motifs among peptides whose amino acids are randomly permuted. Also, the single-sample motifs differ significantly from motifs in peptides drawn from multiple samples. Finally, multiple cancer-specific motifs have been identified.

  11. Functional Conservation of PISTILLATA Activity in a Pea Homolog Lacking the PI Motif

    PubMed Central

    Berbel, Ana; Navarro, Cristina; Ferrándiz, Cristina; Cañas, Luis Antonio; Beltrán, José-Pío; Madueño, Francisco

    2005-01-01

    Current understanding of floral development is mainly based on what we know from Arabidopsis (Arabidopsis thaliana) and Antirrhinum majus. However, we can learn more by comparing developmental mechanisms that may explain morphological differences between species. A good example comes from the analysis of genes controlling flower development in pea (Pisum sativum), a plant with more complex leaves and inflorescences than Arabidopsis and Antirrhinum, and a different floral ontogeny. The analysis of UNIFOLIATA (UNI) and STAMINA PISTILLOIDA (STP), the pea orthologs of LEAFY and UNUSUAL FLORAL ORGANS, has revealed a common link in the regulation of flower and leaf development not apparent in Arabidopsis. While the Arabidopsis genes mainly behave as key regulators of flower development, where they control the expression of B-function genes, UNI and STP also contribute to the development of the pea compound leaf. Here, we describe the characterization of P. sativum PISTILLATA (PsPI), a pea MADS-box gene homologous to B-function genes like PI and GLOBOSA (GLO), from Arabidopsis and Antirrhinum, respectively. PsPI encodes for an atypical PI-type polypeptide that lacks the highly conserved C-terminal PI motif. Nevertheless, constitutive expression of PsPI in tobacco (Nicotiana tabacum) and Arabidopsis shows that it can specifically replace the function of PI, being able to complement the strong pi-1 mutant. Accordingly, PsPI expression in pea flowers, which is dependent on STP, is identical to PI and GLO. Interestingly, PsPI is also transiently expressed in young leaves, suggesting a role of PsPI in pea leaf development, a possibility that fits with the established role of UNI and STP in the control of this process. PMID:16113230

  12. Selection of functional 2A sequences within foot-and-mouth disease virus; requirements for the NPGP motif with a distinct codon bias.

    PubMed

    Kjær, Jonas; Belsham, Graham J

    2018-01-01

    Foot-and-mouth disease virus (FMDV) has a positive-sense ssRNA genome including a single, large, open reading frame. Splitting of the encoded polyprotein at the 2A/2B junction is mediated by the 2A peptide (18 residues long), which induces a nonproteolytic, cotranslational "cleavage" at its own C terminus. A conserved feature among variants of 2A is the C-terminal motif N16P17G18/P19, where P19 is the first residue of 2B. It has been shown previously that certain amino acid substitutions can be tolerated at residues E14, S15, and N16 within the 2A sequence of infectious FMDVs, but no variants at residues P17, G18, or P19 have been identified. In this study, using highly degenerate primers, we analyzed if any other residues can be present at each position of the NPG/P motif within infectious FMDV. No alternative forms of this motif were found to be encoded by rescued FMDVs after two, three, or four passages. However, surprisingly, a clear codon preference for the wt nucleotide sequence encoding the NPGP motif within these viruses was observed. Indeed, the codons selected to code for P17 and P19 within this motif were distinct; thus the synonymous codons are not equivalent. © 2018 Kjær and Belsham; Published by Cold Spring Harbor Laboratory Press for the RNA Society.

  13. PISMA: A Visual Representation of Motif Distribution in DNA Sequences

    PubMed Central

    Alcántara-Silva, Rogelio; Alvarado-Hermida, Moisés; Díaz-Contreras, Gibrán; Sánchez-Barrios, Martha; Carrera, Samantha; Galván, Silvia Carolina

    2017-01-01

    Background: Because the graphical presentation and analysis of motif distribution can provide insights for experimental hypothesis, PISMA aims at identifying motifs on DNA sequences, counting and showing them graphically. The motif length ranges from 2 to 10 bases, and the DNA sequences range up to 10 kb. The motif distribution is shown as a bar-code–like, as a gene-map–like, and as a transcript scheme. Results: We obtained graphical schemes of the CpG site distribution from 91 human papillomavirus genomes. Also, we present 2 analyses: one of DNA motifs associated with either methylation-resistant or methylation-sensitive CpG islands and another analysis of motifs associated with exosome RNA secretion. Availability and Implementation: PISMA is developed in Java; it is executable in any type of hardware and in diverse operating systems. PISMA is freely available to noncommercial users. The English version and the User Manual are provided in Supplementary Files 1 and 2, and a Spanish version is available at www.biomedicas.unam.mx/wp-content/software/pisma.zip and www.biomedicas.unam.mx/wp-content/pdf/manual/pisma.pdf. PMID:28469418
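
    As an illustration of the kind of counting and bar-code-like display described above, the sketch below locates every occurrence of a short motif in a DNA string and marks the start positions on a text track. It is not PISMA's code (PISMA is a Java program); the function names and the example sequence are hypothetical, and only the 2 to 10 base motif length limit is taken from the abstract.

```python
def motif_positions(sequence, motif):
    """Return 0-based start positions of every (overlapping) motif occurrence."""
    sequence = sequence.upper()
    motif = motif.upper()
    if not 2 <= len(motif) <= 10:
        raise ValueError("motif length must be between 2 and 10 bases")
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start)
        # Advance by one base so overlapping occurrences are also found.
        start = sequence.find(motif, start + 1)
    return positions


def barcode(sequence, motif):
    """Render a one-line track with '|' at each motif start and '-' elsewhere."""
    track = ["-"] * len(sequence)
    for pos in motif_positions(sequence, motif):
        track[pos] = "|"
    return "".join(track)


# Hypothetical fragment; CpG sites are simply occurrences of the motif "CG".
cpg_sites = motif_positions("ACGCGTTCG", "CG")
cpg_track = barcode("ACGCGTTCG", "CG")
```

    The track is only a plain-text stand-in for a graphical bar code; a real viewer would map positions to pixel columns instead.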

  14. Identification of new members of the MAPK gene family in plants shows diverse conserved domains and novel activation loop variants.

    PubMed

    Mohanta, Tapan Kumar; Arora, Pankaj Kumar; Mohanta, Nibedita; Parida, Pratap; Bae, Hanhong

    2015-02-06

    Mitogen Activated Protein Kinase (MAPK) signaling is of critical importance in plants and other eukaryotic organisms. The MAPK cascade plays an indispensable role in the growth and development of plants, as well as in biotic and abiotic stress responses. The MAPKs constitute the most downstream module of the three-tier MAPK cascade and are phosphorylated by upstream MAP kinase kinases (MAPKK), which are in turn phosphorylated by MAP kinase kinase kinases (MAPKKK). The MAPKs play pivotal roles in regulation of many cytoplasmic and nuclear substrates, thus regulating several biological processes. A total of 589 MAPK genes were identified from the genome-wide analysis of 40 species. The sequence analysis revealed the presence of several N- and C-terminal conserved domains. The MAPKs were previously believed to be characterized by the presence of TEY/TDY activation loop motifs. The present study showed that, in addition to the TEY/TDY activation loop motifs, MAPKs also contain MEY, TEM, TQM, TRM, TVY, TSY, TEC and TQY activation loop motifs. Phylogenetic analysis clustered all predicted MAPKs into six different groups (group A, B, C, D, E and F), and all predicted MAPKs were assigned specific names based on their orthology-based evolutionary relationships with Arabidopsis or Oryza MAPKs. We conducted a global analysis of the MAPK gene family of plants from lower eukaryotes to higher eukaryotes and analyzed their genomic and evolutionary aspects. Our study showed the presence of several new activation loop motifs and diverse conserved domains in MAPKs. Further study of the newly identified activation loop motifs can provide more information regarding the downstream signaling cascade activated in response to a wide array of stress conditions, as well as plant growth and development.

  15. Conserved thioredoxin fold is present in Pisum sativum L. sieve element occlusion-1 protein

    PubMed Central

    Umate, Pavan; Tuteja, Renu

    2010-01-01

    A homology-based three-dimensional model for the Pisum sativum sieve element occlusion 1 (Ps.SEO1) (forisome) protein was constructed. A stretch of amino acids (residues 320 to 456) which is well conserved in all known members of the forisome proteins was used to model the 3D structure of Ps.SEO1. The structural prediction was done using the Protein Homology/analogY Recognition Engine (PHYRE) web server. Based on studies of local sequence alignment, the thioredoxin-fold containing protein [Structural Classification of Proteins (SCOP) code d1o73a_], a member of the glutathione peroxidase family, was selected as a template for modeling the spatial structure of Ps.SEO1. Selection was based on comparison of primary sequence, higher match quality and alignment accuracy. Motif 1 (EVF) is conserved in Ps.SEO1, Vicia faba (Vf.For1) and Medicago truncatula (MT.SEO3); motif 2 (KKED) is well conserved across all forisome proteins and motif 3 (IGYIGNP) is conserved in Ps.SEO1 and Vf.For1. PMID:20404566

  16. Disparate requirements for the Walker A and B ATPase motifs of human RAD51D in homologous recombination.

    PubMed

    Wiese, Claudia; Hinz, John M; Tebbs, Robert S; Nham, Peter B; Urbin, Salustra S; Collins, David W; Thompson, Larry H; Schild, David

    2006-01-01

    In vertebrates, homologous recombinational repair (HRR) requires RAD51 and five RAD51 paralogs (XRCC2, XRCC3, RAD51B, RAD51C and RAD51D) that all contain conserved Walker A and B ATPase motifs. In human RAD51D we examined the requirement for these motifs in interactions with XRCC2 and RAD51C, and for survival of cells in response to DNA interstrand crosslinks (ICLs). Ectopic expression of wild-type human RAD51D or mutants having a non-functional A or B motif was used to test for complementation of a rad51d knockout hamster CHO cell line. Although A-motif mutants complement very efficiently, B-motif mutants do not. Consistent with these results, experiments using the yeast two- and three-hybrid systems show that the interactions between RAD51D and its XRCC2 and RAD51C partners also require a functional RAD51D B motif, but not motif A. Similarly, hamster Xrcc2 is unable to bind to the non-complementing human RAD51D B-motif mutants in co-immunoprecipitation assays. We conclude that a functional Walker B motif, but not A motif, is necessary for RAD51D's interactions with other paralogs and for efficient HRR. We present a model in which ATPase sites are formed in a bipartite manner between RAD51D and other RAD51 paralogs.

  17. Efficient exact motif discovery.

    PubMed

    Marschall, Tobias; Rahmann, Sven

    2009-06-15

    The motif discovery problem consists of finding over-represented patterns in a collection of biosequences. It is one of the classical sequence analysis problems, but still has not been satisfactorily solved in an exact and efficient manner. This is partly due to the large number of possibilities of defining the motif search space and the notion of over-representation. Even for well-defined formalizations, the problem is frequently solved in an ad hoc manner with heuristics that do not guarantee to find the best motif. We show how to solve the motif discovery problem (almost) exactly on a practically relevant space of IUPAC generalized string patterns, using the p-value with respect to an i.i.d. model or a Markov model as the measure of over-representation. In particular, (i) we use a highly accurate compound Poisson approximation for the null distribution of the number of motif occurrences. We show how to compute the exact clump size distribution using a recently introduced device called probabilistic arithmetic automaton (PAA). (ii) We define two p-value scores for over-representation, the first one based on the total number of motif occurrences, the second one based on the number of sequences in a collection with at least one occurrence. (iii) We describe an algorithm to discover the optimal pattern with respect to either of the scores. The method exploits monotonicity properties of the compound Poisson approximation and is by orders of magnitude faster than exhaustive enumeration of IUPAC strings (11.8 h compared with an extrapolated runtime of 4.8 years). (iv) We justify the use of the proposed scores for motif discovery by showing our method to outperform other motif discovery algorithms (e.g. MEME, Weeder) on benchmark datasets. We also propose new motifs on Mycobacterium tuberculosis. The method has been implemented in Java. It can be obtained from http://ls11-www.cs.tu-dortmund.de/people/marschal/paa_md/.
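    As a rough illustration of the two occurrence statistics named in the abstract above (total motif occurrences, and number of sequences with at least one occurrence), the following Python sketch counts matches of an IUPAC generalized string pattern. It is not the authors' Java implementation and computes no p-values or compound Poisson approximation; the function names `iupac_to_regex` and `occurrence_counts` are hypothetical, and overlapping occurrences are counted.

```python
import re

# Standard IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def iupac_to_regex(pattern):
    """Compile an IUPAC pattern into a regex (hypothetical helper).

    The pattern is wrapped in a lookahead so that overlapping
    occurrences are each reported once.
    """
    body = "".join(IUPAC[c] for c in pattern.upper())
    return re.compile("(?=" + body + ")")


def occurrence_counts(pattern, sequences):
    """Return (total occurrences, sequences with at least one occurrence)."""
    rx = iupac_to_regex(pattern)
    per_sequence = [len(rx.findall(s.upper())) for s in sequences]
    total = sum(per_sequence)
    sequences_hit = sum(1 for n in per_sequence if n > 0)
    return total, sequences_hit
```

    A scoring step such as the p-values described in the abstract would take these raw counts as input; that step is deliberately left out here.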

  18. Cellular automata simulation of topological effects on the dynamics of feed-forward motifs

    PubMed Central

    Apte, Advait A; Cain, John W; Bonchev, Danail G; Fong, Stephen S

    2008-01-01

    Background: Feed-forward motifs are important functional modules in biological and other complex networks. The functionality of feed-forward motifs and other network motifs is largely dictated by the connectivity of the individual network components. While studies on the dynamics of motifs and networks are usually devoted to the temporal or spatial description of processes, this study focuses on the relationship between the specific architecture and the overall rate of the processes of the feed-forward family of motifs, including double and triple feed-forward loops. The search for the most efficient network architecture could be of particular interest for regulatory or signaling pathways in biology, as well as in computational and communication systems. Results: Feed-forward motif dynamics were studied using cellular automata and compared with differential equation modeling. The number of cellular automata iterations needed for a 100% conversion of a substrate into a target product was used as an inverse measure of the transformation rate. Several basic topological patterns were identified that order the specific feed-forward constructions according to the rate of dynamics they enable. At the same number of network nodes and constant other parameters, the bi-parallel and tri-parallel motifs provide higher network efficacy than single feed-forward motifs. Additionally, a topological property of isodynamicity was identified for feed-forward motifs where different network architectures resulted in the same overall rate of the target production. Conclusion: It was shown for classes of structural motifs with feed-forward architecture that network topology affects the overall rate of a process in a quantitatively predictable manner. These fundamental results can be used as a basis for simulating larger networks as combinations of smaller network modules, with implications for studying synthetic gene circuits, small regulatory systems, and eventually dynamic whole-cell models.

  19. Counting motifs in dynamic networks.

    PubMed

    Mukherjee, Kingshuk; Hasan, Md Mahmudul; Boucher, Christina; Kahveci, Tamer

    2018-04-11

    A network motif is a sub-network that occurs frequently in a given network. Detection of such motifs is important since they uncover functions and local properties of the given biological network. Finding motifs is, however, a computationally challenging task as it requires solving the costly subgraph isomorphism problem. Moreover, the topology of biological networks changes over time. These changing networks are called dynamic biological networks. As the network evolves, the frequency of each motif in the network also changes. Computing the frequency of a given motif from scratch in a dynamic network as the network topology evolves is infeasible, particularly for large and fast-evolving networks. In this article, we design and develop a scalable method for counting the number of motifs in a dynamic biological network. Our method incrementally updates the frequency of each motif as the underlying network's topology evolves. Our experiments demonstrate that our method can update the frequency of each motif orders of magnitude faster than counting the motif embeddings every time the network changes. The more frequently the network evolves, the larger the margin by which our method outperforms the existing static methods. We evaluated our method extensively using synthetic and real datasets, and show that our method is highly accurate (≥ 96%) and that it can be scaled to large dense networks. The results on real data demonstrate the utility of our method in revealing interesting insights on the evolution of biological processes.
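    The idea of updating a motif frequency incrementally, rather than recounting after every change, can be sketched for the simplest possible motif, the undirected triangle. This is an illustrative stand-in only: it is exact, handles a single motif type, and is not the method of the article above. The class name `DynamicTriangleCounter` is hypothetical.

```python
from collections import defaultdict


class DynamicTriangleCounter:
    """Maintain the triangle count of an undirected simple graph
    while edges are inserted and deleted (illustrative sketch)."""

    def __init__(self):
        self.adj = defaultdict(set)  # node -> set of neighbours
        self.triangles = 0

    def add_edge(self, u, v):
        # Ignore self-loops and edges that already exist.
        if u == v or v in self.adj[u]:
            return
        # Each common neighbour of u and v closes one new triangle.
        self.triangles += len(self.adj[u] & self.adj[v])
        self.adj[u].add(v)
        self.adj[v].add(u)

    def remove_edge(self, u, v):
        if v not in self.adj[u]:
            return
        self.adj[u].discard(v)
        self.adj[v].discard(u)
        # Each common neighbour loses the triangle it formed with (u, v).
        self.triangles -= len(self.adj[u] & self.adj[v])
```

    Each update touches only the neighbourhoods of the two endpoints, which is why the cost per change stays far below that of recounting the whole network.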

  20. Composite Structural Motifs of Binding Sites for Delineating Biological Functions of Proteins

    PubMed Central

    Kinjo, Akira R.; Nakamura, Haruki

    2012-01-01

    Most biological processes are described as a series of interactions between proteins and other molecules, and interactions are in turn described in terms of atomic structures. To annotate protein functions as sets of interaction states at atomic resolution, and thereby to better understand the relation between protein interactions and biological functions, we conducted exhaustive all-against-all atomic structure comparisons of all known binding sites for ligands including small molecules, proteins and nucleic acids, and identified recurring elementary motifs. By integrating the elementary motifs associated with each subunit, we defined composite motifs that represent context-dependent combinations of elementary motifs. It is demonstrated that function similarity can be better inferred from composite motif similarity compared to the similarity of protein sequences or of individual binding sites. By integrating the composite motifs associated with each protein function, we define meta-composite motifs each of which is regarded as a time-independent diagrammatic representation of a biological process. It is shown that meta-composite motifs provide richer annotations of biological processes than sequence clusters. The present results serve as a basis for bridging atomic structures to higher-order biological phenomena by classification and integration of binding site structures. PMID:22347478

  1. The Calmodulin-Binding, Short Linear Motif, NSCaTE Is Conserved in L-Type Channel Ancestors of Vertebrate Cav1.2 and Cav1.3 Channels

    PubMed Central

    Taiakina, Valentina; Boone, Adrienne N.; Fux, Julia; Senatore, Adriano; Weber-Adrian, Danielle

    2013-01-01

    NSCaTE is a short linear motif of (xWxxx(I or L)xxxx), composed of residues with a high helix-forming propensity within a mostly disordered N-terminus that is conserved in L-type calcium channels from protostome invertebrates to humans. NSCaTE is an optional, lower affinity and calcium-sensitive binding site for calmodulin (CaM) which competes for CaM binding with a more ancient, C-terminal IQ domain on L-type channels. CaM molecules bound to the N- and C-terminal tails serve as dual detectors of changing intracellular Ca2+ concentrations, promoting calcium-dependent inactivation of L-type calcium channels. NSCaTE is absent in some arthropod species, and is also lacking in vertebrate L-type isoforms, Cav1.1 and Cav1.4 channels. The pervasiveness of a methionine just downstream from NSCaTE suggests that L-type channels could generate alternative N-termini lacking NSCaTE through the choice of translational start sites. The long N-terminus with an NSCaTE motif in the L-type calcium channel homolog LCav1 from the pond snail Lymnaea stagnalis has a faster calcium-dependent inactivation than a shortened N-terminus lacking NSCaTE. NSCaTE effects are present in low concentrations of internal buffer (0.5 mM EGTA), but disappear in high buffer conditions (10 mM EGTA). Snail and mammalian NSCaTE have an alpha-helical propensity upon binding Ca2+-CaM and can saturate both CaM N-terminal and C-terminal domains in the absence of a competing IQ motif. NSCaTE evolved in ancestors of the first animals with internal organs for promoting a more rapid, calcium-sensitive inactivation of L-type channels. PMID:23626724
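    A pattern written as xWxxx(I or L)xxxx translates directly into a regular expression over one-letter amino acid codes. The sketch below is an assumption-laden illustration: the function name `find_nscate_like` is hypothetical, "x" is taken to mean any uppercase letter, and a match only shows that a sequence fits the pattern, not that it binds calmodulin.

```python
import re

# x W x x x [I or L] x x x x, with x read as any one-letter residue code.
# The lookahead lets overlapping candidate sites all be reported.
NSCATE_LIKE = re.compile(r"(?=([A-Z]W[A-Z]{3}[IL][A-Z]{4}))")


def find_nscate_like(protein):
    """Return (0-based start, matched 10-mer) for every pattern match."""
    return [(m.start(), m.group(1)) for m in NSCATE_LIKE.finditer(protein.upper())]
```

    For example, on the made-up peptide "MSEWQKALRTYKG" the only tryptophan is at index 3, so the single candidate site starts at index 2.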

  2. The Leu-Arg-Glu (LRE) adhesion motif in proteins of the neuromuscular junction with special reference to proteins of the carboxylesterase/cholinesterase family.

    PubMed

    Johnson, Glynis; Moore, Samuel W

    2013-09-01

    Short linear motifs confer evolutionary flexibility on proteins as they can be added with relative ease allowing the acquisition of new functions. Such motifs may mediate a variety of signalling functions. The adhesion-mediating Leu-Arg-Glu (LRE) motif is enriched in laminin beta 2, and has been observed in other proteins, including members of the carboxylesterase/cholinesterase family. It acts as a stop signal for growing axons in the developing neuromuscular junction, binding to the voltage-gated calcium channel. In this bioinformatic analysis, we have investigated the presence of the motif in proteins of the neuromuscular junction, and have also examined its structural position and potential for ligand interaction, as well as phylogenetic conservation, in the carboxylesterase/cholinesterase family. The motif was observed to occur with a significantly higher frequency than expected in the UniProt/Swiss-Prot database, as well as in four individual species (human, mouse, Caenorhabditis elegans and Drosophila melanogaster). Examination of its presence in neuromuscular junction proteins showed it to be enriched in certain proteins of the synaptic basement membrane, including laminin, agrin, acetylcholinesterase and tenascin. A highly significant enrichment was observed in cytoskeletal proteins, particularly intermediate filament proteins and members of the spectrin family. In the carboxylesterase/cholinesterase family, the motif was observed in four conserved positions in the protein structure. It is present in the majority of mammalian acetylcholinesterases, as well as acetylcholinesterases from electric fish and a number of invertebrates. In insects, it is present in the ace-2, rather than in the synaptic ace-1, enzyme. It is also observed in the cholinesterase-like adhesion molecules (neuroligins, neurotactin and glutactin). It is never seen in butyrylcholinesterases, which do not mediate cell adhesion. In conclusion, the significant enrichment of the motif in

  3. Identifying conservation priorities and management strategies based on ecosystem services to improve urban sustainability in Harbin, China.

    PubMed

    Qu, Yi; Lu, Ming

    2018-01-01

    Rapid urbanization and agricultural development has resulted in the degradation of ecosystems, while also negatively impacting ecosystem services (ES) and urban sustainability. Identifying conservation priorities for ES and applying reasonable management strategies have been found to be effective methods for mitigating this phenomenon. The purpose of this study is to propose a comprehensive framework for identifying ES conservation priorities and associated management strategies for these planning areas. First, we incorporated 10 ES indicators within a systematic conservation planning (SCP) methodology in order to identify ES conservation priorities with high irreplaceability values based on conservation target goals associated with the potential distribution of ES indicators. Next, we assessed the efficiency of the ES conservation priorities for meeting the designated conservation target goals. Finally, ES conservation priorities were clustered into groups using a K-means clustering analysis in an effort to identify the dominant ES per location before formulating management strategies. We effectively identified 12 ES priorities to best represent conservation target goals for the ES indicators. These 12 priorities had a total areal coverage of 13,364 km², representing 25.16% of the study area. The 12 priorities were further clustered into five significantly different groups (p-values between groups < 0.05), which helped to refine management strategies formulated to best enhance ES across the study area. The proposed method allows conservation and management plans to easily adapt to a wide variety of quantitative ES target goals within urban and agricultural areas, thereby preventing urban and agriculture sprawl and guiding sustainable urban development.

  4. Identifying conservation priorities and management strategies based on ecosystem services to improve urban sustainability in Harbin, China

    PubMed Central

    2018-01-01

    Rapid urbanization and agricultural development has resulted in the degradation of ecosystems, while also negatively impacting ecosystem services (ES) and urban sustainability. Identifying conservation priorities for ES and applying reasonable management strategies have been found to be effective methods for mitigating this phenomenon. The purpose of this study is to propose a comprehensive framework for identifying ES conservation priorities and associated management strategies for these planning areas. First, we incorporated 10 ES indicators within a systematic conservation planning (SCP) methodology in order to identify ES conservation priorities with high irreplaceability values based on conservation target goals associated with the potential distribution of ES indicators. Next, we assessed the efficiency of the ES conservation priorities for meeting the designated conservation target goals. Finally, ES conservation priorities were clustered into groups using a K-means clustering analysis in an effort to identify the dominant ES per location before formulating management strategies. We effectively identified 12 ES priorities to best represent conservation target goals for the ES indicators. These 12 priorities had a total areal coverage of 13,364 km², representing 25.16% of the study area. The 12 priorities were further clustered into five significantly different groups (p-values between groups < 0.05), which helped to refine management strategies formulated to best enhance ES across the study area. The proposed method allows conservation and management plans to easily adapt to a wide variety of quantitative ES target goals within urban and agricultural areas, thereby preventing urban and agriculture sprawl and guiding sustainable urban development. PMID:29682412

  5. Canonical Bcl-2 motifs of the Na+/K+ pump revealed by the BH3 mimetic chelerythrine: early signal transducers of apoptosis?

    PubMed

    Lauf, Peter K; Heiny, Judith; Meller, Jarek; Lepera, Michael A; Koikov, Leonid; Alter, Gerald M; Brown, Thomas L; Adragna, Norma C

    2013-01-01

    Chelerythrine [CET], a protein kinase C [PKC] inhibitor, is a pro-apoptotic BH3-mimetic binding to BH1-like motifs of Bcl-2 proteins. CET action was examined on PKC phosphorylation-dependent membrane transporters (Na+/K+ pump/ATPase [NKP, NKA], Na+-K+-2Cl- [NKCC] and K+-Cl- [KCC] cotransporters, and channel-supported K+ loss) in human lens epithelial cells [LECs]. K+ loss and K+ uptake, using Rb+ as congener, were measured by atomic absorption/emission spectrophotometry with NKP and NKCC inhibitors, and Cl- replacement by NO3- to determine KCC. 3H-Ouabain binding was performed on a pig renal NKA in the presence and absence of CET. Bcl-2 protein and NKA sequences were aligned and motifs identified and mapped using PROSITE in conjunction with BLAST alignments and analysis of conservation and structural similarity based on prediction of secondary and crystal structures. CET inhibited NKP and NKCC by >90% (IC50 values ~35 and ~15 μM, respectively) without significant KCC activity change, and stimulated K+ loss by ~35% at 10-30 μM. Neither ATP levels nor phosphorylation of the NKA α1 subunit changed. 3H-ouabain was displaced from pig renal NKA only at 100-fold higher CET concentrations than the ligand. Sequence alignments of NKA with BH1- and BH3-like motifs containing pro-survival Bcl-2 and BclXl proteins showed more than one BH1-like motif within NKA for interaction with CET or with BH3 motifs. One NKA BH1-like motif (ARAAEILARDGPN) was also found in all P-type ATPases. Also, NKA possessed a second motif similar to that near the BH3 region of Bcl-2. Findings support the hypothesis that CET inhibits NKP by binding to BH1-like motifs and disrupting the α1 subunit catalytic activity through conformational changes. By interacting with Bcl-2 proteins through their complementary BH1- or BH3-like-motifs, NKP proteins may be sensors of normal and pathological cell functions, becoming important yet unrecognized signal transducers in the initial phases of apoptosis. CET

  6. Unfolding Kinetics of the Human Telomere i-Motif Under a 10 pN Force Imposed by the α-Hemolysin Nanopore Identify Transient Folded-State Lifetimes at Physiological pH.

    PubMed

    Ding, Yun; Fleming, Aaron M; He, Lidong; Burrows, Cynthia J

    2015-07-22

    Cytosine (C)-rich DNA can adopt i-motif folds under acidic conditions, with the human telomere i-motif providing a well-studied example. The dimensions of this i-motif are appropriate for capture in the nanocavity of the α-hemolysin (α-HL) protein pore under an electrophoretic force. Interrogation of the current vs time (i-t) traces when the i-motif interacts with α-HL identified characteristic signals that were pH dependent. These features were evaluated from pH 5.0 to 7.2, a region surrounding the transition pH of the i-motif (6.1). When the i-motif without polynucleotide tails was studied at pH 5.0, the folded structure entered the nanocavity of α-HL from either the top or bottom face to yield characteristic current patterns. Addition of a 5' 25-mer poly-2'-deoxyadenosine tail allowed capture of the i-motif from the unfolded terminus, and this was used to analyze the pH dependency of unfolding. At pH values below the transition point, only folded strands were observed, and when the pH was increased above the transition pH, the number of folded events decreased, while the unfolded events increased. At pH 6.8 and 7.2, 4% and 2% of the strands were still folded, respectively. The lifetimes for the folded states at pH 6.8 and 7.2 were 21 and 9 ms, respectively, at 160 mV electrophoretic force. These lifetimes are sufficiently long to affect enzymes operating on DNA. Furthermore, these transient lifetimes are readily obtained using the α-HL nanopore, a feature that is not easily achievable by other methods.

  7. Detection and Preliminary Analysis of Motifs in Promoters of Anaerobically Induced Genes of Different Plant Species

    PubMed Central

    MOHANTY, BIJAYALAXMI; KRISHNAN, S. P. T.; SWARUP, SANJAY; BAJIC, VLADIMIR B.

    2005-01-01

    • Background and Aims Plants can suffer from oxygen limitation during flooding or more complete submergence and may therefore switch from Krebs cycle respiration to fermentation in association with the expression of anaerobically inducible genes coding for enzymes involved in glycolysis and fermentation. The aim of this study was to clarify mechanisms of transcriptional regulation of these anaerobic genes by identifying motifs shared by their promoter regions. • Methods Statistically significant motifs were detected by an in silico method from 13 promoters of anaerobic genes. The selected motifs were common for the majority of analysed promoters. Their significance was evaluated by searching for their presence in transcription factor-binding site databases (TRANSFAC, PlantCARE and PLACE). Using several negative control data sets, it was tested whether the motifs found were specific to the anaerobic group. • Key Results Previously, anaerobic response elements have been identified in maize (Zea mays) and arabidopsis (Arabidopsis thaliana) genes. Known functional motifs were detected, such as GT and GC motifs, but also other motifs shared by most of the genes examined. Five motifs detected have not been found in plants hitherto but are present in the promoters of animal genes with various functions. The consensus sequences of these novel motifs are 5′-AAACAAA-3′, 5′-AGCAGC-3′, 5′-TCATCAC-3′, 5′-GTTT(A/C/T)GCAA-3′ and 5′-TTCCCTGTT-3′. • Conclusions It is believed that the promoter motifs identified could be functional by conferring anaerobic sensitivity to the genes that possess them. This proposal now requires experimental verification. PMID:16027132

  8. Sequence motifs and prokaryotic expression of the reptilian paramyxovirus fusion protein

    USGS Publications Warehouse

    Franke, J.; Batts, W.N.; Ahne, W.; Kurath, G.; Winton, J.R.

    2006-01-01

    Fourteen reptilian paramyxovirus isolates were chosen to represent the known extent of genetic diversity among this novel group of viruses. Selected regions of the fusion (F) gene were sequenced, analyzed and compared. The F gene of all isolates contained conserved motifs homologous to those described for other members of the family Paramyxoviridae including: signal peptide, transmembrane domain, furin cleavage site, fusion peptide, N-linked glycosylation sites, and two heptad repeats, the second of which (HRB-LZ) had the characteristics of a leucine zipper. Selected regions of the fusion gene of isolate Gono-GER85 were inserted into a prokaryotic expression system to generate three recombinant protein fragments of various sizes. The longest recombinant protein was cleaved by furin into two fragments of predicted length. Western blot analysis with virus-neutralizing rabbit-antiserum against this isolate demonstrated that only the longest construct reacted with the antiserum. This construct was unique in containing 30 additional C-terminal amino acids that included most of the HRB-LZ. These results indicate that the F genes of reptilian paramyxoviruses contain highly conserved motifs typical of other members of the family and suggest that the HRB-LZ domain of the reptilian paramyxovirus F protein contains a linear antigenic epitope. © Springer-Verlag 2005.

  9. Grafting of functional motifs onto protein scaffolds identified by PDB screening--an efficient route to design optimizable protein binders.

    PubMed

    Tlatli, Rym; Nozach, Hervé; Collet, Guillaume; Beau, Fabrice; Vera, Laura; Stura, Enrico; Dive, Vincent; Cuniasse, Philippe

    2013-01-01

    Artificial miniproteins that are able to target catalytic sites of matrix metalloproteinases (MMPs) were designed using a functional motif-grafting approach. The motif corresponded to the four N-terminal residues of TIMP-2, a broad-spectrum protein inhibitor of MMPs. Scaffolds that are able to reproduce the functional topology of this motif were obtained by exhaustive screening of the Protein Data Bank (PDB) using STAMPS software (search for three-dimensional atom motifs in protein structures). Ten artificial protein binders were produced. The designed proteins bind catalytic sites of MMPs with affinities ranging from 450 nM to 450 μM prior to optimization. The crystal structure of one artificial binder in complex with the catalytic domain of MMP-12 showed that the inter-molecular interactions established by the functional motif in the artificial binder corresponded to those found in the MMP-14-TIMP-2 complex, albeit with some differences in geometry. Molecular dynamics simulations of the ten binders in complex with MMP-14 suggested that these scaffolds may allow partial reproduction of native inter-molecular interactions, but differences in geometry and stability may contribute to the lower affinity of the artificial protein binders compared to the natural protein binder. Nevertheless, these results show that the in silico design method used provides sets of protein binders that target a specific binding site with a good rate of success. This approach may constitute the first step of an efficient hybrid computational/experimental approach to protein binder design. © 2012 The Authors Journal compilation © 2012 FEBS.

  10. Identification and preliminary characterization of a protein motif related to the zinc finger.

    PubMed Central

    Lovering, R; Hanson, I M; Borden, K L; Martin, S; O'Reilly, N J; Evan, G I; Rahman, D; Pappin, D J; Trowsdale, J; Freemont, P S

    1993-01-01

    We have identified a protein motif, related to the zinc finger, which defines a newly discovered family of proteins. The motif was found in the sequence of the human RING1 gene, which is proximal to the major histocompatibility complex region on chromosome six. We propose naming this motif the "RING finger" and it is found in 27 proteins, all of which have putative DNA binding functions. We have synthesized a peptide corresponding to the RING1 motif and examined a number of properties, including metal and DNA binding. We provide evidence to support the suggestion that the RING finger motif is the DNA binding domain of this newly defined family of proteins. PMID:7681583

  11. Detection of core-periphery structure in networks based on 3-tuple motifs

    NASA Astrophysics Data System (ADS)

    Ma, Chuang; Xiang, Bing-Bing; Chen, Han-Shuang; Small, Michael; Zhang, Hai-Feng

    2018-05-01

    Detecting mesoscale structure, such as community structure, is of vital importance for analyzing complex networks. Recently, a new mesoscale structure, core-periphery (CP) structure, has been identified in many real-world systems. In this paper, we propose an effective algorithm for detecting CP structure based on a 3-tuple motif. In this algorithm, we first define a 3-tuple motif in terms of the patterns of edges as well as the property of nodes, and then a motif adjacency matrix is constructed based on the 3-tuple motif. Finally, the problem is converted to finding a cluster that minimizes the motif conductance. Our algorithm works well for different CP structures, including single or multiple CP structures and local or global CP structures. Results on synthetic and empirical networks validate the high performance of our method.
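    The two building blocks mentioned above, a motif adjacency matrix and the motif conductance of a node set, can be illustrated with a small pure-Python sketch. The article's 3-tuple motif depends on edge patterns and node properties that the abstract does not spell out, so the undirected triangle is used here as a stand-in motif; all function names are hypothetical, and the graph is assumed to be simple (no self-loops).

```python
def adjacency_from_edges(n, edges):
    """Build a symmetric 0/1 adjacency matrix for n nodes."""
    adj = [[0] * n for _ in range(n)]
    for u, v in edges:
        adj[u][v] = adj[v][u] = 1
    return adj


def triangle_motif_adjacency(adj):
    """W[i][j] = number of triangles that contain the edge (i, j)."""
    n = len(adj)
    w = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and adj[i][j]:
                # Common neighbours of i and j each close one triangle.
                w[i][j] = sum(1 for k in range(n) if adj[i][k] and adj[k][j])
    return w


def motif_conductance(w, subset):
    """cut(S) / min(vol(S), vol(complement)) on the weighted matrix w."""
    n = len(w)
    inside = set(subset)
    cut = sum(w[i][j] for i in inside for j in range(n) if j not in inside)
    vol_in = sum(sum(w[i]) for i in inside)
    vol_out = sum(sum(w[i]) for i in range(n) if i not in inside)
    denom = min(vol_in, vol_out)
    return cut / denom if denom else float("inf")
```

    A search procedure would then look for the node set with the lowest conductance; that optimization step is not shown.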

  12. A motif detection and classification method for peptide sequences using genetic programming.

    PubMed

    Tomita, Yasuyuki; Kato, Ryuji; Okochi, Mina; Honda, Hiroyuki

    2008-08-01

    An exploration of common rules (property motifs) in amino acid sequences has been required for the design of novel sequences and elucidation of the interactions between molecules controlled by the structural or physical environment. In the present study, we developed a new method to search for property motifs that are common in peptide sequence data. Our method has the following two characteristics: (i) the automatic determination of the position and length of common property motifs by calculating the physicochemical similarity of amino acids, and (ii) the quick and effective exploration of motif candidates that discriminate the positives from the negatives by the introduction of genetic programming (GP). Our method was evaluated with two types of model data sets. First, property motifs were searched for in artificially derived peptide data in which they had been intentionally buried. As a result, the expected property motifs were correctly extracted by our algorithm. Second, the peptide data that interact with MHC class II molecules were analyzed as one of the models of biologically active peptides with buried motifs of various lengths. Twice as many MHC class II binding peptides were identified with the rule obtained using our method as with the existing scoring matrix method. In conclusion, our GP-based motif searching approach made it possible to obtain knowledge of functional aspects of the peptides without any prior knowledge.

  13. The conserved RNA recognition motif and C3H1 domain of the Not4 ubiquitin ligase regulate in vivo ligase function.

    PubMed

    Chen, Hongfeng; Sirupangi, Tirupataiah; Wu, Zhao-Hui; Johnson, Daniel L; Laribee, R Nicholas

    2018-05-25

    The Ccr4-Not complex controls RNA polymerase II (Pol II) dependent gene expression and proteasome function. The Not4 ubiquitin ligase is a Ccr4-Not subunit that has both a RING domain and a conserved RNA recognition motif and C3H1 domain (referred to as the RRM-C domain) with unknown function. We demonstrate that while individual Not4 RING or RRM-C mutants fail to replicate the proteasomal defects found in Not4 deficient cells, mutation of both exhibits a Not4 loss of function phenotype. Transcriptome analysis revealed that the Not4 RRM-C affects a specific subset of Pol II-regulated genes, including those involved in transcription elongation, cyclin-dependent kinase regulated nutrient responses, and ribosomal biogenesis. The Not4 RING, RRM-C, or RING/RRM-C mutations cause a generalized increase in Pol II binding at a subset of these genes, yet their impact on gene expression does not always correlate with Pol II recruitment which suggests Not4 regulates their expression through additional mechanisms. Intriguingly, we find that while the Not4 RRM-C is dispensable for Ccr4-Not association with RNA Pol II, the Not4 RING domain is required for these interactions. Collectively, these data elucidate previously unknown roles for the conserved Not4 RRM-C and RING domains in regulating Ccr4-Not dependent functions in vivo.

  14. Distinct Contributions of Conserved Modules to Runt Transcription Factor Activity

    PubMed Central

    Walrad, Pegine B.; Hang, Saiyu; Joseph, Genevieve S.; Salas, Julia

    2010-01-01

    Runx proteins play vital roles in regulating transcription in numerous developmental pathways throughout the animal kingdom. Two Runx protein hallmarks are the DNA-binding Runt domain and a C-terminal VWRPY motif that mediates interaction with TLE/Gro corepressor proteins. A phylogenetic analysis of Runt, the founding Runx family member, identifies four distinct regions C-terminal to the Runt domain that are conserved in Drosophila and other insects. We used a series of previously described ectopic expression assays to investigate the functions of these different conserved regions in regulating gene expression during embryogenesis and in controlling axonal projections in the developing eye. The results indicate each conserved region is required for a different subset of activities and identify distinct regions that participate in the transcriptional activation and repression of the segmentation gene sloppy-paired-1 (slp1). Interestingly, the C-terminal VWRPY-containing region is not required for repression but instead plays a role in slp1 activation. Genetic experiments indicating that Groucho (Gro) does not participate in slp1 regulation further suggest that Runt's conserved C-terminus interacts with other factors to promote transcriptional activation. These results provide a foundation for further studies on the molecular interactions that contribute to the context-dependent properties of Runx proteins as developmental regulators. PMID:20462957

  15. An integrative and applicable phylogenetic footprinting framework for cis-regulatory motifs identification in prokaryotic genomes.

    PubMed

    Liu, Bingqiang; Zhang, Hanyuan; Zhou, Chuan; Li, Guojun; Fennell, Anne; Wang, Guanghui; Kang, Yu; Liu, Qi; Ma, Qin

    2016-08-09

    Phylogenetic footprinting is an important computational technique for identifying cis-regulatory motifs in orthologous regulatory regions from multiple genomes, as motifs tend to evolve slower than their surrounding non-functional sequences. Its application, however, has several difficulties for optimizing the selection of orthologous data and reducing the false positives in motif prediction. Here we present an integrative phylogenetic footprinting framework for accurate motif predictions in prokaryotic genomes (MP(3)). The framework includes a new orthologous data preparation procedure, an additional promoter scoring and pruning method and an integration of six existing motif finding algorithms as basic motif search engines. Specifically, we collected orthologous genes from available prokaryotic genomes and built the orthologous regulatory regions based on sequence similarity of promoter regions. This procedure made full use of the large-scale genomic data and taxonomy information and filtered out the promoters with limited contribution to produce a high quality orthologous promoter set. The promoter scoring and pruning is implemented through motif voting by a set of complementary predicting tools that mine as many motif candidates as possible and simultaneously eliminate the effect of random noise. We have applied the framework to Escherichia coli k12 genome and evaluated the prediction performance through comparison with seven existing programs. This evaluation was systematically carried out at the nucleotide and binding site level, and the results showed that MP(3) consistently outperformed other popular motif finding tools. We have integrated MP(3) into our motif identification and analysis server DMINDA, allowing users to efficiently identify and analyze motifs in 2,072 completely sequenced prokaryotic genomes. The performance evaluation indicated that MP(3) is effective for predicting regulatory motifs in prokaryotic genomes. Its application may enhance

  16. The human RNA-binding protein and E3 ligase MEX-3C binds the MEX-3-recognition element (MRE) motif with high affinity.

    PubMed

    Yang, Lingna; Wang, Chongyuan; Li, Fudong; Zhang, Jiahai; Nayab, Anam; Wu, Jihui; Shi, Yunyu; Gong, Qingguo

    2017-09-29

    MEX-3 is a K-homology (KH) domain-containing RNA-binding protein first identified as a translational repressor in Caenorhabditis elegans, and its four orthologs (MEX-3A-D) in human and mouse were subsequently found to have E3 ubiquitin ligase activity mediated by a RING domain and critical for RNA degradation. Current evidence implicates human MEX-3C in many essential biological processes and suggests a strong connection with immune diseases and carcinogenesis. The highly conserved dual KH domains in MEX-3 proteins enable RNA binding and are essential for the recognition of the 3'-UTR and post-transcriptional regulation of MEX-3 target transcripts. However, the molecular mechanisms of translational repression and the consensus RNA sequence recognized by the MEX-3C KH domain are unknown. Here, using X-ray crystallography and isothermal titration calorimetry, we investigated the RNA-binding activity and selectivity of human MEX-3C dual KH domains. Our high-resolution crystal structures of individual KH domains complexed with a noncanonical U-rich and a GA-rich RNA sequence revealed that the KH1/2 domains of human MEX-3C bound MRE10, a 10-mer RNA (5'-CAGAGUUUAG-3') consisting of an eight-nucleotide MEX-3-recognition element (MRE) motif, with high affinity. Of note, we also identified a consensus RNA motif recognized by human MEX-3C. The potential RNA-binding sites in the 3'-UTR of the human leukocyte antigen serotype (HLA-A2) mRNA were mapped with this RNA-binding motif and further confirmed by fluorescence polarization. The binding motif identified here will provide valuable information for future investigations of the functional pathways controlled by human MEX-3C and for predicting potential mRNAs regulated by this enzyme. © 2017 by The American Society for Biochemistry and Molecular Biology, Inc.

  17. Overlapping activation-induced cytidine deaminase hotspot motifs in Ig class-switch recombination

    PubMed Central

    Han, Li; Masani, Shahnaz; Yu, Kefei

    2011-01-01

    Ig class-switch recombination (CSR) is directed by the long and repetitive switch regions and requires activation-induced cytidine deaminase (AID). One of the conserved switch-region sequence motifs (AGCT) is a preferred site for AID-mediated DNA-cytosine deamination. By using somatic gene targeting and recombinase-mediated cassette exchange, we established a cell line-based CSR assay that allows manipulation of switch sequences at the endogenous locus. We show that AGCT is only one of a family of four WGCW motifs in the switch region that can facilitate CSR. We go on to show that it is the overlap of AID hotspots at WGCW sites on the top and bottom strands that is critical. This finding leads to a much clearer model for the difference between CSR and somatic hypermutation. PMID:21709240
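
    The strand-overlap argument in this abstract can be checked with a short scan. The sketch below is only an illustration: it assumes the commonly used WRC definition of an AID hotspot (W = A/T, R = A/G, deaminated C last), which the abstract itself does not spell out, and the example sequence and function names are made up.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def find_sites(seq, pattern):
    """0-based start positions of all matches, overlapping ones included."""
    return [m.start() for m in re.finditer(f"(?=({pattern}))", seq)]

WGCW = "[AT]GC[AT]"   # the four motifs AGCA, AGCT, TGCA, TGCT
WRC = "[AT][AG]C"     # assumed AID hotspot; the C is the deamination target

def hotspot_cytosines(seq):
    """Top-strand coordinates of hotspot cytosines on the top and bottom strands."""
    seq = seq.upper()
    n = len(seq)
    top = {p + 2 for p in find_sites(seq, WRC)}
    # A position q on the reverse complement maps to top-strand index n - 1 - q.
    bottom = {n - 1 - (p + 2) for p in find_sites(reverse_complement(seq), WRC)}
    return top, bottom

def wgcw_overlaps(seq):
    """For each WGCW start, report whether both strands carry a hotspot C inside it."""
    seq = seq.upper()
    top, bottom = hotspot_cytosines(seq)
    return {s: (s + 2 in top and s + 1 in bottom) for s in find_sites(seq, WGCW)}

example = "GGAGCTGGTGCAAA"   # hypothetical sequence, not a real switch region
overlaps = wgcw_overlaps(example)
```

    Because the reverse complement of any WGCW word is again a WGCW word, every such site places a hotspot cytosine on each strand at adjacent base pairs, which is the overlap the abstract refers to.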

  18. IndeCut evaluates performance of network motif discovery algorithms.

    PubMed

    Ansariola, Mitra; Megraw, Molly; Koslicki, David

    2018-05-01

    Genomic networks represent a complex map of molecular interactions which are descriptive of the biological processes occurring in living cells. Identifying the small over-represented circuitry patterns in these networks helps generate hypotheses about the functional basis of such complex processes. Network motif discovery is a systematic way of achieving this goal. However, a reliable network motif discovery outcome requires generating random background networks which are the result of a uniform and independent graph sampling method. To date, there has been no method to numerically evaluate whether any network motif discovery algorithm performs as intended on realistically sized datasets, so it was not possible to assess the validity of resulting network motifs. In this work, we present IndeCut, the first method to date that characterizes network motif finding algorithm performance in terms of uniform sampling on realistically sized networks. We demonstrate that it is critical to use IndeCut prior to running any network motif finder for two reasons. First, IndeCut indicates the number of samples needed for a tool to produce an outcome that is both reproducible and accurate. Second, IndeCut allows users to choose the tool that generates samples in the most independent fashion for their network of interest among many available options. The open source software package is available at https://github.com/megrawlab/IndeCut. Contact: megrawm@science.oregonstate.edu or david.koslicki@math.oregonstate.edu. Supplementary data are available at Bioinformatics online.

  19. A Conserved Acidic Motif in the N-Terminal Domain of Nitrate Reductase Is Necessary for the Inactivation of the Enzyme in the Dark by Phosphorylation and 14-3-3 Binding

    PubMed Central

    Pigaglio, Emmanuelle; Durand, Nathalie; Meyer, Christian

    1999-01-01

    It has previously been shown that the N-terminal domain of tobacco (Nicotiana tabacum) nitrate reductase (NR) is involved in the inactivation of the enzyme by phosphorylation, which occurs in the dark (L. Nussaume, M. Vincentz, C. Meyer, J.P. Boutin, and M. Caboche [1995] Plant Cell 7: 611–621). The activity of a mutant NR protein lacking this N-terminal domain was no longer regulated by light-dark transitions. In this study smaller deletions were performed in the N-terminal domain of tobacco NR that removed protein motifs conserved among higher plant NRs. The resulting truncated NR-coding sequences were then fused to the cauliflower mosaic virus 35S RNA promoter and introduced in NR-deficient mutants of the closely related species Nicotiana plumbaginifolia. We found that the deletion of a conserved stretch of acidic residues led to an active NR protein that was more thermosensitive than the wild-type enzyme, but it was relatively insensitive to the inactivation by phosphorylation in the dark. Therefore, the removal of this acidic stretch seems to have the same effects on NR activation state as the deletion of the N-terminal domain. A hypothetical explanation for these observations is that a specific factor that impedes inactivation remains bound to the truncated enzyme. A synthetic peptide derived from this acidic protein motif was also found to be a good substrate for casein kinase II. PMID:9880364

  20. Stress-Responsive Mitogen-Activated Protein Kinases Interact with the EAR Motif of a Poplar Zinc Finger Protein and Mediate Its Degradation through the 26S Proteasome

    PubMed Central

    Hamel, Louis-Philippe; Benchabane, Meriem; Nicole, Marie-Claude; Major, Ian T.; Morency, Marie-Josée; Pelletier, Gervais; Beaudoin, Nathalie; Sheen, Jen; Séguin, Armand

    2011-01-01

    Mitogen-activated protein kinases (MAPKs) contribute to the establishment of plant disease resistance by regulating downstream signaling components, including transcription factors. In this study, we identified MAPK-interacting proteins, and among the newly discovered candidates was a Cys-2/His-2-type zinc finger protein named PtiZFP1. This putative transcription factor belongs to a family of transcriptional repressors that rely on an ERF-associated amphiphilic repression (EAR) motif for their repression activity. Amino acids located within this repression motif were also found to be essential for MAPK binding. Close examination of the primary protein sequence revealed a functional bipartite MAPK docking site that partially overlaps with the EAR motif. Transient expression assays in Arabidopsis (Arabidopsis thaliana) protoplasts suggest that MAPKs promote PtiZFP1 degradation through the 26S proteasome. Since features of the MAPK docking site are conserved among other EAR repressors, our study suggests a novel mode of defense mechanism regulation involving stress-responsive MAPKs and EAR repressors. PMID:21873571

  1. Identification of helix capping and β-turn motifs from NMR chemical shifts

    PubMed Central

    Shen, Yang; Bax, Ad

    2012-01-01

    We present an empirical method for identification of distinct structural motifs in proteins on the basis of experimentally determined backbone and 13Cβ chemical shifts. Elements identified include the N-terminal and C-terminal helix capping motifs and five types of β-turns: I, II, I′, II′ and VIII. Using a database of proteins of known structure, the NMR chemical shifts, together with the PDB-extracted amino acid preference of the helix capping and β-turn motifs are used as input data for training an artificial neural network algorithm, which outputs the statistical probability of finding each motif at any given position in the protein. The trained neural networks, contained in the MICS (motif identification from chemical shifts) program, also provide a confidence level for each of their predictions, and values ranging from ca 0.7–0.9 for the Matthews correlation coefficient of its predictions far exceed that attainable by sequence analysis. MICS is anticipated to be useful both in the conventional NMR structure determination process and for enhancing on-going efforts to determine protein structures solely on the basis of chemical shift information, where it can aid in identifying protein database fragments suitable for use in building such structures. PMID:22314702
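
    For readers unfamiliar with the Matthews correlation coefficient quoted above, a minimal sketch of how it is computed from a confusion matrix follows. The counts in the example are invented for illustration and are not results from MICS.

```python
import math

def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined case.
    """
    denom = (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / math.sqrt(denom)

# Hypothetical counts for one motif class (not taken from the paper).
mcc = matthews_corrcoef(tp=90, tn=85, fp=15, fn=10)
```

    The coefficient ranges from -1 (total disagreement) through 0 (no better than chance) to +1 (perfect prediction), so values of 0.7 to 0.9 indicate strong agreement even when motif and non-motif positions are unevenly represented.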

  2. Systems analysis of cis-regulatory motifs in C4 photosynthesis genes using maize and rice leaf transcriptomic data during a process of de-etiolation.

    PubMed

    Xu, Jiajia; Bräutigam, Andrea; Weber, Andreas P M; Zhu, Xin-Guang

    2016-09-01

    Identification of potential cis-regulatory motifs controlling the development of C4 photosynthesis is a major focus of current research. In this study, we used time-series RNA-seq data collected from etiolated maize and rice leaf tissues sampled during a de-etiolation process to systematically characterize the expression patterns of C4-related genes and to further identify potential cis elements in five different genomic regions (i.e. promoter, 5'UTR, 3'UTR, intron, and coding sequence) of C4 orthologous genes. The results demonstrate that although most of the C4 genes show similar expression patterns, a number of them, including chloroplast dicarboxylate transporter 1, aspartate aminotransferase, and triose phosphate transporter, show shifted expression patterns compared with their C3 counterparts. A number of conserved short DNA motifs between maize C4 genes and their rice orthologous genes were identified not only in the promoter, 5'UTR, 3'UTR, and coding sequences, but also in the introns of core C4 genes. We also identified cis-regulatory motifs that exist in maize C4 genes and also in genes showing similar expression patterns as maize C4 genes but that do not exist in rice C3 orthologs, suggesting a possible recruitment of pre-existing cis-elements from genes unrelated to C4 photosynthesis into C4 photosynthesis genes during C4 evolution. © The Author 2016. Published by Oxford University Press on behalf of the Society for Experimental Biology.

  3. Crystal structures of two novel dye-decolorizing peroxidases reveal a beta-barrel fold with a conserved heme-binding motif.

    PubMed

    Zubieta, Chloe; Krishna, S Sri; Kapoor, Mili; Kozbial, Piotr; McMullan, Daniel; Axelrod, Herbert L; Miller, Mitchell D; Abdubek, Polat; Ambing, Eileen; Astakhova, Tamara; Carlton, Dennis; Chiu, Hsiu-Ju; Clayton, Thomas; Deller, Marc C; Duan, Lian; Elsliger, Marc-André; Feuerhelm, Julie; Grzechnik, Slawomir K; Hale, Joanna; Hampton, Eric; Han, Gye Won; Jaroszewski, Lukasz; Jin, Kevin K; Klock, Heath E; Knuth, Mark W; Kumar, Abhinav; Marciano, David; Morse, Andrew T; Nigoghossian, Edward; Okach, Linda; Oommachen, Silvya; Reyes, Ron; Rife, Christopher L; Schimmel, Paul; van den Bedem, Henry; Weekes, Dana; White, Aprilfawn; Xu, Qingping; Hodgson, Keith O; Wooley, John; Deacon, Ashley M; Godzik, Adam; Lesley, Scott A; Wilson, Ian A

    2007-11-01

    BtDyP from Bacteroides thetaiotaomicron (strain VPI-5482) and TyrA from Shewanella oneidensis are dye-decolorizing peroxidases (DyPs), members of a new family of heme-dependent peroxidases recently identified in fungi and bacteria. Here, we report the crystal structures of BtDyP and TyrA at 1.6 and 2.7 Å, respectively. BtDyP assembles into a hexamer, while TyrA assembles into a dimer; the dimerization interface is conserved between the two proteins. Each monomer exhibits a two-domain, alpha+beta ferredoxin-like fold. A site for heme binding was identified computationally, and modeling of a heme into the proposed active site allowed for identification of residues likely to be functionally important. Structural and sequence comparisons with other DyPs demonstrate a conservation of putative heme-binding residues, including an absolutely conserved histidine. Isothermal titration calorimetry experiments confirm heme binding, but with a stoichiometry of 0.3:1 (heme:protein). (c) 2007 Wiley-Liss, Inc.

  4. A cross-species bi-clustering approach to identifying conserved co-regulated genes.

    PubMed

    Sun, Jiangwen; Jiang, Zongliang; Tian, Xiuchun; Bi, Jinbo

    2016-06-15

    A growing number of studies have explored the process of pre-implantation embryonic development of multiple mammalian species. However, the conservation and variation among different species in their developmental programming are poorly defined due to the lack of effective computational methods for detecting co-regulated genes that are conserved across species. The most sophisticated method to date for identifying conserved co-regulated genes is a two-step approach. This approach first identifies gene clusters for each species by a cluster analysis of gene expression data, and subsequently computes the overlaps of clusters identified from different species to reveal common subgroups. This approach deals poorly with the noise in the expression data introduced by the complicated procedures in quantifying gene expression. Furthermore, due to the sequential nature of the approach, the gene clusters identified in the first step may have little overlap among different species in the second step, making it difficult to detect conserved co-regulated genes. We propose a cross-species bi-clustering approach which first denoises the gene expression data of each species into a data matrix. The rows of the data matrices of different species represent the same set of genes that are characterized by their expression patterns over the developmental stages of each species as columns. A novel bi-clustering method is then developed to cluster genes into subgroups by a joint sparse rank-one factorization of all the data matrices. This method decomposes a data matrix into a product of a column vector and a row vector where the column vector is a consistent indicator across the matrices (species) to identify the same gene cluster and the row vector specifies for each species the developmental stages that the clustered genes co-regulate. An efficient optimization algorithm has been developed with convergence analysis. This approach was first validated on synthetic data and compared

  5. Searching RNA motifs and their intermolecular contacts with constraint networks.

    PubMed

    Thébault, P; de Givry, S; Schiex, T; Gaspin, C

    2006-09-01

    Searching for RNA gene occurrences in genomic sequences is a task whose importance has been renewed by the recent discovery of numerous functional RNAs, often interacting with other ligands. Although several programs exist for RNA motif search, none can represent and solve the problem of searching for occurrences of RNA motifs in interaction with other molecules. We present a constraint network formulation of this problem. RNAs are represented as structured motifs that can occur on more than one sequence and which are related together by possible hybridization. The implemented tool MilPat is used to search for several sRNA families in genomic sequences. Results show that MilPat allows efficient searching for interacting motifs in large genomic sequences and offers a simple and extensible framework to solve such problems. New and known sRNAs are identified as H/ACA candidates in Methanocaldococcus jannaschii. Availability: http://carlit.toulouse.inra.fr/MilPaT/MilPat.pl.

  6. BLSSpeller: exhaustive comparative discovery of conserved cis-regulatory elements.

    PubMed

    De Witte, Dieter; Van de Velde, Jan; Decap, Dries; Van Bel, Michiel; Audenaert, Pieter; Demeester, Piet; Dhoedt, Bart; Vandepoele, Klaas; Fostier, Jan

    2015-12-01

    The accurate discovery and annotation of regulatory elements remains a challenging problem. The growing number of sequenced genomes creates new opportunities for comparative approaches to motif discovery. Putative binding sites are then considered to be functional if they are conserved in orthologous promoter sequences of multiple related species. Existing methods for comparative motif discovery usually rely on pregenerated multiple sequence alignments, which are difficult to obtain for more diverged species such as plants. As a consequence, misaligned regulatory elements often remain undetected. We present a novel algorithm that supports both alignment-free and alignment-based motif discovery in the promoter sequences of related species. Putative motifs are exhaustively enumerated as words over the IUPAC alphabet and screened for conservation using the branch length score. Additionally, a confidence score is established in a genome-wide fashion. In order to take advantage of a cloud computing infrastructure, the MapReduce programming model is adopted. The method is applied to four monocotyledon plant species and it is shown that high-scoring motifs are significantly enriched for open chromatin regions in Oryza sativa and for transcription factor binding sites inferred through protein-binding microarrays in O. sativa and Zea mays. Furthermore, the method is shown to recover experimentally profiled ga2ox1-like KN1 binding sites in Z. mays. BLSSpeller was written in Java. Source code and manual are available at http://bioinformatics.intec.ugent.be/blsspeller. Contact: Klaas.Vandepoele@psb.vib-ugent.be or jan.fostier@intec.ugent.be. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.
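
    The two ingredients named in this abstract, words over the IUPAC alphabet and the branch length score, can be illustrated in a few lines. This is a rough sketch and not BLSSpeller itself (which is written in Java and enumerates words exhaustively): the tree, its branch lengths, the species labels, and the promoter strings are all hypothetical, and the score is taken here as the fraction of total tree length spanned by the species in which the word occurs.

```python
import re

# Standard IUPAC nucleotide ambiguity codes expressed as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

def iupac_to_regex(word):
    """Compile an IUPAC word into a regular expression."""
    return re.compile("".join(IUPAC[ch] for ch in word.upper()))

def species_with_word(word, promoters):
    """Set of species whose promoter contains at least one match (alignment-free)."""
    rx = iupac_to_regex(word)
    return {sp for sp, seq in promoters.items() if rx.search(seq.upper())}

# Hypothetical tree: child -> (parent, branch length); the root has no entry.
TREE = {
    "osa": ("n1", 0.2), "bdi": ("n1", 0.3), "n1": ("root", 0.1),
    "zma": ("n2", 0.25), "sbi": ("n2", 0.15), "n2": ("root", 0.1),
}

def branch_length_score(species, tree):
    """Length of the minimal subtree connecting `species`, divided by total tree length."""
    total = sum(length for _, length in tree.values())
    if len(species) < 2:
        return 0.0
    below = {}  # node -> number of target leaves at or below it
    for leaf in species:
        node = leaf
        below[node] = below.get(node, 0) + 1
        while node in tree:
            node = tree[node][0]
            below[node] = below.get(node, 0) + 1
    k = len(species)
    # A branch is spanned when target leaves lie on both of its sides.
    spanned = sum(length for child, (_, length) in tree.items()
                  if 0 < below.get(child, 0) < k)
    return spanned / total

promoters = {  # made-up orthologous promoter fragments
    "osa": "AATGACAGTT", "bdi": "CCTGACGGAA",
    "zma": "GGGTTTAAAC", "sbi": "TTTGACAGCC",
}
conserved_in = species_with_word("TGACRG", promoters)
bls = branch_length_score(conserved_in, TREE)
```

    A word found only in two sister species scores low, while one found across both clades of the toy tree scores high, which is why the measure rewards conservation over larger evolutionary distances.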

  7. Assessing local structure motifs using order parameters for motif recognition, interstitial identification, and diffusion path characterization

    NASA Astrophysics Data System (ADS)

    Zimmermann, Nils E. R.; Horton, Matthew K.; Jain, Anubhav; Haranczyk, Maciej

    2017-11-01

    Structure-property relationships form the basis of many design rules in materials science, including synthesizability and long-term stability of catalysts, control of electrical and optoelectronic behavior in semiconductors as well as the capacity of and transport properties in cathode materials for rechargeable batteries. The immediate atomic environments (i.e., the first coordination shells) of a few atomic sites are often a key factor in achieving a desired property. Some of the most frequently encountered coordination patterns are tetrahedra, octahedra, body- and face-centered cubic as well as hexagonal close-packed-like environments. Here, we showcase the usefulness of local order parameters to identify these basic structural motifs in inorganic solid materials by developing classification criteria. We introduce a systematic testing framework, the Einstein crystal test rig, that probes the response of order parameters to distortions in perfect motifs to validate our approach. Subsequently, we highlight three important application cases. First, we map basic crystal structure information of a large materials database in an intuitive manner by screening the Materials Project (MP) database (61,422 compounds) for element-specific motif distributions. Second, we use the structure-motif recognition capabilities to automatically find interstitials in metal, semiconductor, and insulator materials. Our Interstitialcy Finding Tool (InFiT) facilitates high-throughput screenings of defect properties. Third, the order parameters are reliable and compact quantitative structure descriptors for characterizing diffusion hops of intercalants, as our example of magnesium in MnO2-spinel indicates. Finally, the tools developed in our work are readily and freely available as software implementations in the pymatgen library, and we expect them to be further applied to machine-learning approaches for emerging applications in materials science.

  8. CombiMotif: A new algorithm for network motifs discovery in protein-protein interaction networks

    NASA Astrophysics Data System (ADS)

    Luo, Jiawei; Li, Guanghui; Song, Dan; Liang, Cheng

    2014-12-01

    Discovering motifs in protein-protein interaction networks has become a major challenge in computational biology, since the distribution of the number of network motifs can reveal significant systemic differences among species. However, this task can be computationally expensive because it involves graph isomorphism detection. In this paper, we present a new algorithm (CombiMotif) that incorporates combinatorial techniques to count non-induced occurrences of subgraph topologies in the form of trees. The efficiency of our algorithm is demonstrated by comparing the obtained results with the current state-of-the-art subgraph counting algorithms. We also show major differences between unicellular and multicellular organisms. The datasets and source code of CombiMotif are freely available upon request.
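
    To make "counting non-induced occurrences of tree-shaped subgraphs by combinatorial techniques" concrete, here is a small sketch for the two simplest cases. It is not the CombiMotif algorithm, whose details the abstract does not give; it only shows that some tree counts follow directly from node degrees without any isomorphism testing. The toy edge list is invented.

```python
from collections import defaultdict
from math import comb

def degrees(edges):
    """Node degrees of a simple undirected graph given as an edge list."""
    deg = defaultdict(int)
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return dict(deg)

def count_three_node_paths(edges):
    """Non-induced 3-node paths: choose 2 neighbours around each centre node."""
    return sum(comb(d, 2) for d in degrees(edges).values())

def count_four_node_stars(edges):
    """Non-induced 4-node stars: choose 3 neighbours around each centre node."""
    return sum(comb(d, 3) for d in degrees(edges).values())

# Hypothetical toy interaction network: a triangle a-b-c plus a pendant node d.
toy_edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c")]
paths = count_three_node_paths(toy_edges)
stars = count_four_node_stars(toy_edges)
```

    Non-induced counting means an occurrence is counted even when extra edges exist among its nodes, so the path b-a-c above is counted although b and c are also directly connected.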

  9. Assessment of composite motif discovery methods.

    PubMed

    Klepper, Kjetil; Sandve, Geir K; Abul, Osman; Johansen, Jostein; Drablos, Finn

    2008-02-26

    Computational discovery of regulatory elements is an important area of bioinformatics research and more than a hundred motif discovery methods have been published. Traditionally, most of these methods have addressed the problem of single motif discovery, that is, discovering binding motifs for individual transcription factors. In higher organisms, however, transcription factors usually act in combination with nearby bound factors to induce specific regulatory behaviours. Hence, recent focus has shifted from single motifs to the discovery of sets of motifs bound by multiple cooperating transcription factors, so-called composite motifs or cis-regulatory modules. Given the large number and diversity of methods available, independent assessment of methods becomes important. Although there have been several benchmark studies of single motif discovery, no similar studies have previously been conducted concerning composite motif discovery. We have developed a benchmarking framework for composite motif discovery and used it to evaluate the performance of eight published module discovery tools. Benchmark datasets were constructed based on real genomic sequences containing experimentally verified regulatory modules, and the module discovery programs were asked both to predict the locations of these modules and to specify the single motifs involved. To aid the programs in their search, we provided position weight matrices corresponding to the binding motifs of the transcription factors involved. In addition, selections of decoy matrices were mixed with the genuine matrices on one dataset to test the response of programs to varying levels of noise. Although some of the methods tested tended to score somewhat better than others overall, there were still large variations between individual datasets and no single method performed consistently better than the rest in all situations. The variation in performance on individual datasets also shows that the new benchmark datasets represent a

  10. Multivariate ordination identifies vegetation types associated with spider conservation in brassica crops

    PubMed Central

    Saqib, Hafiz Sohaib Ahmed; You, Minsheng

    2017-01-01

    Conservation biological control emphasizes natural and other non-crop vegetation as a source of natural enemies to focal crops. There is an unmet need for better methods to identify the types of vegetation that are optimal to support specific natural enemies that may colonize the crops. Here we explore the commonality of the spider assemblage, considering abundance and diversity (H), in brassica crops with that of adjacent non-crop and non-brassica crop vegetation. We employ spatial-based multivariate ordination approaches, hierarchical clustering and spatial eigenvector analysis. The small-scale mixed cropping and high disturbance frequency of southern Chinese vegetable farming offered a setting to test the role of alternate vegetation for spider conservation. Our findings indicate that spider families differ markedly in occurrence with respect to vegetation type. Grassy field margins, non-crop vegetation, taro and sweetpotato harbour spider morphospecies and functional groups that are also present in brassica crops. In contrast, pumpkin and litchi contain spiders not found in brassicas, and so may have little benefit for conservation biological control services for brassicas. Our findings also illustrate the utility of advanced statistical approaches for identifying spatial relationships between natural enemies and the land uses most likely to offer alternative habitats for conservation biological control efforts, generating testable hypotheses for future studies. PMID:29085741

  11. Distribution of CpG Motifs in Upstream Gene Domains in a Reef Coral and Sea Anemone: Implications for Epigenetics in Cnidarians.

    PubMed

    Marsh, Adam G; Hoadley, Kenneth D; Warner, Mark E

    2016-01-01

    Coral reefs are under assault from stressors including global warming, ocean acidification, and urbanization. Knowing how these factors impact the future fate of reefs requires delineating stress responses across ecological, organismal and cellular scales. Recent advances in coral reef biology have integrated molecular processes with ecological fitness and have identified putative suites of temperature acclimation genes in a Scleractinian coral Acropora hyacinthus. We wondered what unique characteristics of these genes determined their coordinate expression in response to temperature acclimation, and whether or not other corals and cnidarians would likewise possess these features. Here, we focus on cytosine methylation as an epigenetic DNA modification that is responsive to environmental stressors. We identify common conserved patterns of cytosine-guanosine dinucleotide (CpG) motif frequencies in upstream promoter domains of different functional gene groups in two cnidarian genomes: a coral (Acropora digitifera) and an anemone (Nematostella vectensis). Our analyses show that CpG motif frequencies are prominent in the promoter domains of functional genes associated with environmental adaptation, particularly those identified in A. hyacinthus. Densities of CpG sites in upstream promoter domains near the transcriptional start site (TSS) are 1.38x higher than genomic background levels upstream of -2000 bp from the TSS. The increase in CpG usage suggests selection to allow for DNA methylation events to occur more frequently within 1 kb of the TSS. In addition, observed shifts in CpG densities among functional groups of genes suggest a potential role for epigenetic DNA methylation within promoter domains to impact functional gene expression responses in A. digitifera and N. vectensis. Identifying promoter epigenetic sequence motifs among genes within specific functional groups establishes an approach to describe integrated cellular responses to environmental stress in
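
    The density comparison described in this abstract (CpG sites near the TSS versus a distal background) amounts to a ratio of dinucleotide frequencies. The sketch below is one simple way to compute such a ratio; the window handling, function names, and toy sequence are assumptions for illustration and do not reproduce the authors' pipeline.

```python
def cpg_density(seq):
    """CpG dinucleotides per base pair on the given strand."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return seq.count("CG") / len(seq)

def proximal_enrichment(upstream, proximal_len):
    """Ratio of CpG density in the TSS-proximal window to the distal remainder.

    `upstream` is read 5' to 3' and ends at the TSS, so the proximal
    window is its last `proximal_len` bases.
    """
    proximal = upstream[-proximal_len:]
    distal = upstream[:-proximal_len]
    background = cpg_density(distal)
    if background == 0:
        raise ValueError("no CpG sites in the distal background window")
    return cpg_density(proximal) / background

# Toy 16 bp 'upstream region' (hypothetical); real windows would be e.g. 1 kb.
toy_upstream = "ATATCGAT" + "CGCGATCG"
ratio = proximal_enrichment(toy_upstream, proximal_len=8)
```

    With real data the same ratio would be averaged over many genes in a functional group before comparing groups.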

  12. Rapid motif compliance scoring with match weight sets.

    PubMed

    Venezia, D; O'Hara, P J

    1993-02-01

    Most current implementations of motif matching in biological sequences have sacrificed the generality of weight matrix scoring for shorter runtimes. The program MOTIF incorporates a weight matrix and a rapid, backtracking tree-search algorithm to score motif compliance with greatly enhanced performance while placing no constraints on the motif. In addition, any positions within a motif can be marked as 'inviolate', thereby requiring an exact match. MOTIF allows a choice of regular expression formats and can use both motif and sequence libraries as either targets or queries. Nucleic acid sequences can optionally be translated by MOTIF in any frame(s) and used against peptide motifs.
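
    Weight matrix scoring with 'inviolate' positions, as described above, can be sketched with a plain sliding window. This is not the MOTIF program: its backtracking tree search is replaced here by a straightforward scan, and the matrix, threshold, and sequence are hypothetical.

```python
def score_window(window, weights, inviolate):
    """Sum per-position weights; return None if an inviolate position mismatches."""
    total = 0.0
    for i, residue in enumerate(window):
        if i in inviolate and residue != inviolate[i]:
            return None  # exact match required at this position
        total += weights[i].get(residue, 0.0)
    return total

def scan(sequence, weights, inviolate=None, threshold=0.0):
    """Return (start, score) for every window scoring at or above the threshold."""
    inviolate = inviolate or {}
    width = len(weights)
    hits = []
    for start in range(len(sequence) - width + 1):
        score = score_window(sequence[start:start + width], weights, inviolate)
        if score is not None and score >= threshold:
            hits.append((start, score))
    return hits

# Hypothetical 3-position weight matrix; position 1 must be exactly 'C'.
weights = [{"A": 2, "G": 1}, {"C": 2}, {"G": 2, "T": 1}]
hits = scan("TACGGCTACA", weights, inviolate={1: "C"}, threshold=4)
```

    Rejecting a window as soon as an inviolate position fails is the same pruning idea that makes a tree search fast: no further positions are scored once a required residue is missing.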

  13. Structural and immunochemical relatedness suggests a conserved pathogenicity motif for secondary cell wall polysaccharides in Bacillus anthracis and infection-associated Bacillus cereus

    PubMed Central

    Saile, Elke; Klee, Silke R.; Hoffmaster, Alex; Kannenberg, Elmar L.

    2017-01-01

    Bacillus anthracis (Ba) and human infection-associated Bacillus cereus (Bc) strains Bc G9241 and Bc 03BB87 have secondary cell wall polysaccharides (SCWPs) comprising an aminoglycosyl trisaccharide repeat: →4)-β-d-ManpNAc-(1→4)-β-d-GlcpNAc-(1→6)-α-d-GlcpNAc-(1→, substituted at GlcNAc residues with both α- and β-Galp. In Bc G9241 and Bc 03BB87, an additional α-Galp is attached to O-3 of ManNAc. Using NMR spectroscopy, mass spectrometry and immunochemical methods, we compared these structures to SCWPs from Bc biovar anthracis strains isolated from great apes displaying “anthrax-like” symptoms in Cameroon (Bc CA) and Côte d’Ivoire (Bc CI). The SCWPs of Bc CA/CI contained the identical HexNAc trisaccharide backbone and Gal modifications found in Ba, together with the α-Gal-(1→3) substitution observed previously at ManNAc residues only in Bc G9241/03BB87. Interestingly, the great ape derived strains displayed a unique α-Gal-(1→3)-α-Gal-(1→3) disaccharide substitution at some ManNAc residues, a modification not found in any previously examined Ba or Bc strain. Immuno-analysis with specific polyclonal anti-Ba SCWP antiserum demonstrated a reactivity hierarchy: high reactivity with SCWPs from Ba 7702 and Ba Sterne 34F2, and Bc G9241 and Bc 03BB87; intermediate reactivity with SCWPs from Bc CI/CA; and low reactivity with the SCWPs from structurally distinct Ba CDC684 (a unique strain producing an SCWP lacking all Gal substitutions) and non-infection-associated Bc ATCC10987 and Bc 14579 SCWPs. Ba-specific monoclonal antibody EAII-6G6-2-3 demonstrated a 10–20 fold reduced reactivity to Bc G9241 and Bc 03BB87 SCWPs compared to Ba 7702/34F2, and low/undetectable reactivity to SCWPs from Bc CI, Bc CA, Ba CDC684, and non-infection-associated Bc strains. Our data indicate that the HexNAc motif is conserved among infection-associated Ba and Bc isolates (regardless of human or great ape origin), and that the number, positions and structures of Gal

  14. Structural and immunochemical relatedness suggests a conserved pathogenicity motif for secondary cell wall polysaccharides in Bacillus anthracis and infection-associated Bacillus cereus.

    PubMed

    Kamal, Nazia; Ganguly, Jhuma; Saile, Elke; Klee, Silke R; Hoffmaster, Alex; Carlson, Russell W; Forsberg, Lennart S; Kannenberg, Elmar L; Quinn, Conrad P

    2017-01-01

    Bacillus anthracis (Ba) and human infection-associated Bacillus cereus (Bc) strains Bc G9241 and Bc 03BB87 have secondary cell wall polysaccharides (SCWPs) comprising an aminoglycosyl trisaccharide repeat: →4)-β-d-ManpNAc-(1→4)-β-d-GlcpNAc-(1→6)-α-d-GlcpNAc-(1→, substituted at GlcNAc residues with both α- and β-Galp. In Bc G9241 and Bc 03BB87, an additional α-Galp is attached to O-3 of ManNAc. Using NMR spectroscopy, mass spectrometry and immunochemical methods, we compared these structures to SCWPs from Bc biovar anthracis strains isolated from great apes displaying "anthrax-like" symptoms in Cameroon (Bc CA) and Côte d'Ivoire (Bc CI). The SCWPs of Bc CA/CI contained the identical HexNAc trisaccharide backbone and Gal modifications found in Ba, together with the α-Gal-(1→3) substitution observed previously at ManNAc residues only in Bc G9241/03BB87. Interestingly, the great ape derived strains displayed a unique α-Gal-(1→3)-α-Gal-(1→3) disaccharide substitution at some ManNAc residues, a modification not found in any previously examined Ba or Bc strain. Immuno-analysis with specific polyclonal anti-Ba SCWP antiserum demonstrated a reactivity hierarchy: high reactivity with SCWPs from Ba 7702 and Ba Sterne 34F2, and Bc G9241 and Bc 03BB87; intermediate reactivity with SCWPs from Bc CI/CA; and low reactivity with the SCWPs from structurally distinct Ba CDC684 (a unique strain producing an SCWP lacking all Gal substitutions) and non-infection-associated Bc ATCC10987 and Bc 14579 SCWPs. Ba-specific monoclonal antibody EAII-6G6-2-3 demonstrated a 10-20 fold reduced reactivity to Bc G9241 and Bc 03BB87 SCWPs compared to Ba 7702/34F2, and low/undetectable reactivity to SCWPs from Bc CI, Bc CA, Ba CDC684, and non-infection-associated Bc strains. Our data indicate that the HexNAc motif is conserved among infection-associated Ba and Bc isolates (regardless of human or great ape origin), and that the number, positions and structures of Gal

  15. Combining endangered plants and animals as surrogates to identify priority conservation areas in Yunnan, China

    PubMed Central

    Yang, Feiling; Hu, Jinming; Wu, Ruidong

    2016-01-01

    Suitable surrogates are critical for identifying optimal priority conservation areas (PCAs) to protect regional biodiversity. This study explored the efficiency of using endangered plants and animals as surrogates for identifying PCAs at the county level in Yunnan, southwest China. We ran the Dobson algorithm under three surrogate scenarios at 75% and 100% conservation levels and identified four types of PCAs. Assessment of the protection efficiencies of the four types of PCAs showed that endangered plants had higher surrogacy values than endangered animals but that the two were not substitutable; coupled endangered plants and animals as surrogates yielded a higher surrogacy value than endangered plants or animals as surrogates; the plant-animal priority areas (PAPAs) was the optimal among the four types of PCAs for conserving both endangered plants and animals in Yunnan. PAPAs could well represent overall species diversity distribution patterns and overlap with critical biogeographical regions in Yunnan. Fourteen priority units in PAPAs should be urgently considered as optimizing Yunnan’s protected area system. The spatial pattern of PAPAs at the 100% conservation level could be conceptualized into three connected conservation belts, providing a valuable reference for optimizing the layout of the in situ protected area system in Yunnan. PMID:27538537

  16. Combining endangered plants and animals as surrogates to identify priority conservation areas in Yunnan, China

    NASA Astrophysics Data System (ADS)

    Yang, Feiling; Hu, Jinming; Wu, Ruidong

    2016-08-01

    Suitable surrogates are critical for identifying optimal priority conservation areas (PCAs) to protect regional biodiversity. This study explored the efficiency of using endangered plants and animals as surrogates for identifying PCAs at the county level in Yunnan, southwest China. We ran the Dobson algorithm under three surrogate scenarios at 75% and 100% conservation levels and identified four types of PCAs. Assessment of the protection efficiencies of the four types of PCAs showed that endangered plants had higher surrogacy values than endangered animals but that the two were not substitutable; coupled endangered plants and animals as surrogates yielded a higher surrogacy value than endangered plants or animals as surrogates; the plant-animal priority areas (PAPAs) was the optimal among the four types of PCAs for conserving both endangered plants and animals in Yunnan. PAPAs could well represent overall species diversity distribution patterns and overlap with critical biogeographical regions in Yunnan. Fourteen priority units in PAPAs should be urgently considered as optimizing Yunnan’s protected area system. The spatial pattern of PAPAs at the 100% conservation level could be conceptualized into three connected conservation belts, providing a valuable reference for optimizing the layout of the in situ protected area system in Yunnan.

  17. Expression patterns of TEL genes in Poaceae suggest a conserved association with cell differentiation.

    PubMed

    Paquet, Nicolas; Bernadet, Marie; Morin, Halima; Traas, Jan; Dron, Michel; Charon, Celine

    2005-06-01

    Poaceae species present a conserved distichous phyllotaxy (leaf position along the stem) and share common properties with respect to leaf initiation. The goal of this work was to determine if these common traits imply common genes. Therefore, homologues of the maize TERMINAL EAR1 gene in Poaceae were studied. This gene encodes an RNA-binding motif (RRM) protein, that is suggested to regulate leaf initiation. Using degenerate primers, one unique tel (terminal ear1-like) gene from seven Poaceae members, covering almost all the phylogenetic tree of the family, was identified by PCR. These genes present a very high degree of similarity, a much conserved exon-intron structure, and the three RRMs and TEL characteristic motifs. The evolution of tel sequences in Poaceae strongly correlates with the known phylogenetic tree of this family. RT-PCR gene expression analyses show conserved tel expression in the shoot apex in all species, suggesting functional orthology between these genes. In addition, in situ hybridization experiments with specific antisense probes show tel transcript accumulation in all differentiating cells of the leaf, from the recruitment of leaf founder cells to leaf margins cells. Tel expression is not restricted to initiating leaves as it is also found in pro-vascular tissues, root meristems, and immature inflorescences. Therefore, these results suggest that TEL is not only associated with leaf initiation but more generally with cell differentiation in Poaceae.

  18. Role of conserved cysteine residues in Herbaspirillum seropedicae NifA activity.

    PubMed

    Oliveira, Marco A S; Baura, Valter A; Aquino, Bruno; Huergo, Luciano F; Kadowaki, Marco A S; Chubatsu, Leda S; Souza, Emanuel M; Dixon, Ray; Pedrosa, Fábio O; Wassem, Roseli; Monteiro, Rose A

    2009-01-01

    Herbaspirillum seropedicae is an endophytic diazotrophic bacterium that associates with economically important crops. NifA protein, the transcriptional activator of nif genes in H. seropedicae, binds to nif promoters and, together with RNA polymerase-sigma(54) holoenzyme, catalyzes the formation of open complexes to allow transcription initiation. The activity of H. seropedicae NifA is controlled by ammonium and oxygen levels, but the mechanisms of such control are unknown. Oxygen sensitivity is attributed to a conserved motif of cysteine residues in NifA that spans the central AAA+ domain and the interdomain linker that connects the AAA+ domain to the C-terminal DNA binding domain. Here we mutagenized this conserved motif of cysteines and assayed the activity of mutant proteins in vivo. We also purified the mutant variants of NifA and tested their capacity to bind to the nifB promoter region. Chimeric proteins between H. seropedicae NifA, an oxygen-sensitive protein, and Azotobacter vinelandii NifA, an oxygen-tolerant protein, were constructed and showed that the oxygen response is conferred by the central AAA+ and C-terminal DNA binding domains of H. seropedicae NifA. We conclude that the conserved cysteine motif is essential for NifA activity, although single cysteine-to-serine mutants are still competent at binding DNA.

  19. An analysis of the positional distribution of DNA motifs in promoter regions and its biological relevance.

    PubMed

    Casimiro, Ana C; Vinga, Susana; Freitas, Ana T; Oliveira, Arlindo L

    2008-02-07

    Motif finding algorithms have developed in their ability to use computationally efficient methods to detect patterns in biological sequences. However the posterior classification of the output still suffers from some limitations, which makes it difficult to assess the biological significance of the motifs found. Previous work has highlighted the existence of positional bias of motifs in the DNA sequences, which might indicate not only that the pattern is important, but also provide hints of the positions where these patterns occur preferentially. We propose to integrate position uniformity tests and over-representation tests to improve the accuracy of the classification of motifs. Using artificial data, we have compared three different statistical tests (Chi-Square, Kolmogorov-Smirnov and a Chi-Square bootstrap) to assess whether a given motif occurs uniformly in the promoter region of a gene. Using the test that performed better in this dataset, we proceeded to study the positional distribution of several well known cis-regulatory elements, in the promoter sequences of different organisms (S. cerevisiae, H. sapiens, D. melanogaster, E. coli and several Dicotyledons plants). The results show that position conservation is relevant for the transcriptional machinery. We conclude that many biologically relevant motifs appear heterogeneously distributed in the promoter region of genes, and therefore, that non-uniformity is a good indicator of biological relevance and can be used to complement over-representation tests commonly used. In this article we present the results obtained for the S. cerevisiae data sets.
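
    The position-uniformity idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name, the equal-width binning, and the default of 10 bins are assumptions made for illustration. It computes only a plain Pearson chi-square statistic for motif start positions within a promoter region of fixed length.

```python
def chi_square_uniformity(positions, region_length, n_bins=10):
    """Pearson chi-square statistic for 'motif positions are uniform'.

    positions     -- 0-based motif start positions within the promoter region
    region_length -- length of the promoter region in bp
    n_bins        -- number of equal-width bins (an illustrative choice)
    Returns (statistic, degrees_of_freedom).
    """
    if not positions:
        raise ValueError("need at least one position")
    counts = [0] * n_bins
    for p in positions:
        if not 0 <= p < region_length:
            raise ValueError(f"position {p} outside region")
        # Integer arithmetic maps each position to its bin index.
        counts[p * n_bins // region_length] += 1
    expected = len(positions) / n_bins  # same expected count in every bin
    stat = sum((c - expected) ** 2 / expected for c in counts)
    return stat, n_bins - 1


# Evenly spread positions give a statistic of 0; tightly clustered ones do not.
even = list(range(0, 1000, 10))   # 100 sites, 10 per bin
clustered = list(range(0, 100))   # 100 sites, all in the first bin
print(chi_square_uniformity(even, 1000))
print(chi_square_uniformity(clustered, 1000))
```

    A large statistic relative to the chi-square distribution with the returned degrees of freedom suggests non-uniform positioning, which the abstract proposes as a complement to over-representation tests.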

  20. CSTminer: a web tool for the identification of coding and noncoding conserved sequence tags through cross-species genome comparison

    PubMed Central

    Castrignanò, Tiziana; Canali, Alessandro; Grillo, Giorgio; Liuni, Sabino; Mignone, Flavio; Pesole, Graziano

    2004-01-01

    The identification and characterization of genome tracts that are highly conserved across species during evolution may contribute significantly to the functional annotation of whole-genome sequences. Indeed, such sequences are likely to correspond to known or unknown coding exons or regulatory motifs. Here, we present a web server implementing a previously developed algorithm that, by comparing user-submitted genome sequences, is able to identify statistically significant conserved blocks and assess their coding or noncoding nature through the measure of a coding potential score. The web tool, available at http://www.caspur.it/CSTminer/, is dynamically interconnected with the Ensembl genome resources and produces a graphical output showing a map of detected conserved sequences and annotated gene features. PMID:15215464

  1. iLIR@viral: A web resource for LIR motif-containing proteins in viruses.

    PubMed

    Jacomin, Anne-Claire; Samavedam, Siva; Charles, Hannah; Nezis, Ioannis P

    2017-10-03

    Macroautophagy/autophagy has been shown to mediate the selective lysosomal degradation of pathogenic bacteria and viruses (xenophagy), and to contribute to the activation of innate and adaptative immune responses. Autophagy can serve as an antiviral defense mechanism but also as a proviral process during infection. Atg8-family proteins play a central role in the autophagy process due to their ability to interact with components of the autophagy machinery as well as selective autophagy receptors and adaptor proteins. Such interactions are usually mediated through LC3-interacting region (LIR) motifs. So far, only one viral protein has been experimentally shown to have a functional LIR motif, leaving open a vast field for investigation. Here, we have developed the iLIR@viral database ( http://ilir.uk/virus/ ) as a freely accessible web resource listing all the putative canonical LIR motifs identified in viral proteins. Additionally, we used a curated text-mining analysis of the literature to identify novel putative LIR motif-containing proteins (LIRCPs) in viruses. We anticipate that iLIR@viral will assist with elucidating the full complement of LIRCPs in viruses.

  2. Staufen1 dimerizes via a conserved motif and a degenerate dsRNA-binding domain to promote mRNA decay

    PubMed Central

    Gleghorn, Michael L.; Gong, Chenguang; Kielkopf, Clara L.; Maquat, Lynne E.

    2014-01-01

    Staufen (STAU)1-mediated mRNA decay (SMD) degrades mammalian-cell mRNAs that bind the double-stranded (ds)RNA-binding protein STAU1 in their 3′-untranslated region. We report a new motif, which typifies STAU homologs from all vertebrate classes, that is responsible for human (h)STAU1 homodimerization. Our crystal structure and mutagenesis analyses reveal that this motif, now named the Staufen-swapping motif (SSM), and dsRNA-binding domain 5 (‘RBD’5) mediate protein dimerization: the two SSM α-helices of one molecule interact primarily through a hydrophobic patch with the two ‘RBD’5 α-helices of a second molecule. ‘RBD’5 adopts the canonical α-β-β-β-α fold of a functional RBD, but it lacks residues and features needed to bind duplex RNA. In cells, SSM-mediated hSTAU1 dimerization increases the efficiency of SMD by augmenting hSTAU1 binding to the ATP-dependent RNA helicase hUPF1. Dimerization regulates keratinocyte-mediated wound-healing and, undoubtedly, many other cellular processes. PMID:23524536

  3. RSAT matrix-clustering: dynamic exploration and redundancy reduction of transcription factor binding motif collections

    PubMed Central

    Jaeger, Sébastien; Thieffry, Denis

    2017-01-01

    Transcription factor (TF) databases contain multitudes of binding motifs (TFBMs) from various sources, from which non-redundant collections are derived by manual curation. The advent of high-throughput methods stimulated the production of novel collections with increasing numbers of motifs. Meta-databases, built by merging these collections, contain redundant versions, because available tools are not suited to automatically identify and explore biologically relevant clusters among thousands of motifs. Motif discovery from genome-scale data sets (e.g. ChIP-seq) also produces redundant motifs, hampering the interpretation of results. We present matrix-clustering, a versatile tool that clusters similar TFBMs into multiple trees, and automatically creates non-redundant TFBM collections. A feature unique to matrix-clustering is its dynamic visualisation of aligned TFBMs, and its capability to simultaneously treat multiple collections from various sources. We demonstrate that matrix-clustering considerably simplifies the interpretation of combined results from multiple motif discovery tools, and highlights biologically relevant variations of similar motifs. We also ran a large-scale application to cluster ∼11 000 motifs from 24 entire databases, showing that matrix-clustering correctly groups motifs belonging to the same TF families, and drastically reduced motif redundancy. matrix-clustering is integrated within the RSAT suite (http://rsat.eu/), accessible through a user-friendly web interface or command-line for its integration in pipelines. PMID:28591841

  4. Substitution of Asp-309 by Asn in the Arg-Asp-Pro (RDP) motif of Acetobacter diazotrophicus levansucrase affects sucrose hydrolysis, but not enzyme specificity.

    PubMed Central

    Batista, F R; Hernández, L; Fernández, J R; Arrieta, J; Menéndez, C; Gómez, R; Támbara, Y; Pons, T

    1999-01-01

    beta-Fructofuranosidases share a conserved aspartic acid-containing motif (Arg-Asp-Pro; RDP) which is absent from alpha-glucopyranosidases. The role of Asp-309 located in the RDP motif of levansucrase (EC 2.4.1.10) from Acetobacter diazotrophicus SRT4 was studied by site-directed mutagenesis. Substitution of Asp-309 by Asn did not affect enzyme secretion. The kcat of the mutant levansucrase was reduced 75-fold, but its Km was similar to that of the wild-type enzyme, indicating that Asp-309 plays a major role in catalysis. The two levansucrases showed optimal activity at pH 5.0 and yielded similar product profiles. Thus the mutation D309N affected the efficiency of sucrose hydrolysis, but not the enzyme specificity. Since the RDP motif is present in a conserved position in fructosyltransferases, invertases, levanases, inulinases and sucrose-6-phosphate hydrolases, it is likely to have a common functional role in beta-fructofuranosidases. PMID:9895294

  5. Peptide-binding motifs of two common equine class I MHC molecules in Thoroughbred horses.

    PubMed

    Bergmann, Tobias; Lindvall, Mikaela; Moore, Erin; Moore, Eugene; Sidney, John; Miller, Donald; Tallmadge, Rebecca L; Myers, Paisley T; Malaker, Stacy A; Shabanowitz, Jeffrey; Osterrieder, Nikolaus; Peters, Bjoern; Hunt, Donald F; Antczak, Douglas F; Sette, Alessandro

    2017-05-01

    Quantitative peptide-binding motifs of MHC class I alleles provide a valuable tool to efficiently identify putative T cell epitopes. Detailed information on equine MHC class I alleles is still very limited, and to date, only a single equine MHC class I allele, Eqca-1*00101 (ELA-A3 haplotype), has been characterized. The present study extends the number of characterized ELA class I specificities in two additional haplotypes found commonly in the Thoroughbred breed. Accordingly, we here report quantitative binding motifs for the ELA-A2 allele Eqca-16*00101 and the ELA-A9 allele Eqca-1*00201. Utilizing analyses of endogenously bound and eluted ligands and the screening of positional scanning combinatorial libraries, detailed and quantitative peptide-binding motifs were derived for both alleles. Eqca-16*00101 preferentially binds peptides with aliphatic/hydrophobic residues in position 2 and at the C-terminus, and Eqca-1*00201 has a preference for peptides with arginine in position 2 and hydrophobic/aliphatic residues at the C-terminus. Interestingly, the Eqca-16*00101 motif resembles that of the human HLA A02-supertype, while the Eqca-1*00201 motif resembles that of the HLA B27-supertype and two macaque class I alleles. It is expected that the identified motifs will facilitate the selection of candidate epitopes for the study of immune responses in horses.

  6. Assessing Local Structure Motifs Using Order Parameters for Motif Recognition, Interstitial Identification, and Diffusion Path Characterization

    DOE PAGES

    Zimmermann, Nils E. R.; Horton, Matthew K.; Jain, Anubhav; ...

    2017-11-13

    Structure–property relationships form the basis of many design rules in materials science, including synthesizability and long-term stability of catalysts, control of electrical and optoelectronic behavior in semiconductors, as well as the capacity of and transport properties in cathode materials for rechargeable batteries. The immediate atomic environments (i.e., the first coordination shells) of a few atomic sites are often a key factor in achieving a desired property. Some of the most frequently encountered coordination patterns are tetrahedra, octahedra, body and face-centered cubic as well as hexagonal close packed-like environments. Here, we showcase the usefulness of local order parameters to identify these basic structural motifs in inorganic solid materials by developing classification criteria. We introduce a systematic testing framework, the Einstein crystal test rig, that probes the response of order parameters to distortions in perfect motifs to validate our approach. Subsequently, we highlight three important application cases. First, we map basic crystal structure information of a large materials database in an intuitive manner by screening the Materials Project (MP) database (61,422 compounds) for element-specific motif distributions. Second, we use the structure-motif recognition capabilities to automatically find interstitials in metals, semiconductor, and insulator materials. Our Interstitialcy Finding Tool (InFiT) facilitates high-throughput screenings of defect properties. Third, the order parameters are reliable and compact quantitative structure descriptors for characterizing diffusion hops of intercalants as our example of magnesium in MnO2-spinel indicates. Finally, the tools developed in our work are readily and freely available as software implementations in the pymatgen library, and we expect them to be further applied to machine-learning approaches for emerging applications in materials science.

  7. Assessing Local Structure Motifs Using Order Parameters for Motif Recognition, Interstitial Identification, and Diffusion Path Characterization

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zimmermann, Nils E. R.; Horton, Matthew K.; Jain, Anubhav

    Structure–property relationships form the basis of many design rules in materials science, including synthesizability and long-term stability of catalysts, control of electrical and optoelectronic behavior in semiconductors, as well as the capacity of and transport properties in cathode materials for rechargeable batteries. The immediate atomic environments (i.e., the first coordination shells) of a few atomic sites are often a key factor in achieving a desired property. Some of the most frequently encountered coordination patterns are tetrahedra, octahedra, body and face-centered cubic as well as hexagonal close packed-like environments. Here, we showcase the usefulness of local order parameters to identify these basic structural motifs in inorganic solid materials by developing classification criteria. We introduce a systematic testing framework, the Einstein crystal test rig, that probes the response of order parameters to distortions in perfect motifs to validate our approach. Subsequently, we highlight three important application cases. First, we map basic crystal structure information of a large materials database in an intuitive manner by screening the Materials Project (MP) database (61,422 compounds) for element-specific motif distributions. Second, we use the structure-motif recognition capabilities to automatically find interstitials in metals, semiconductor, and insulator materials. Our Interstitialcy Finding Tool (InFiT) facilitates high-throughput screenings of defect properties. Third, the order parameters are reliable and compact quantitative structure descriptors for characterizing diffusion hops of intercalants as our example of magnesium in MnO2-spinel indicates. Finally, the tools developed in our work are readily and freely available as software implementations in the pymatgen library, and we expect them to be further applied to machine-learning approaches for emerging applications in materials science.

  8. Phylum-Level Conservation of Regulatory Information in Nematodes despite Extensive Non-coding Sequence Divergence

    PubMed Central

    Gordon, Kacy L.; Arthur, Robert K.; Ruvinsky, Ilya

    2015-01-01

    Gene regulatory information guides development and shapes the course of evolution. To test conservation of gene regulation within the phylum Nematoda, we compared the functions of putative cis-regulatory sequences of four sets of orthologs (unc-47, unc-25, mec-3 and elt-2) from distantly-related nematode species. These species, Caenorhabditis elegans, its congeneric C. briggsae, and three parasitic species Meloidogyne hapla, Brugia malayi, and Trichinella spiralis, represent four of the five major clades in the phylum Nematoda. Despite the great phylogenetic distances sampled and the extensive sequence divergence of nematode genomes, all but one of the regulatory elements we tested are able to drive at least a subset of the expected gene expression patterns. We show that functionally conserved cis-regulatory elements have no more extended sequence similarity to their C. elegans orthologs than would be expected by chance, but they do harbor motifs that are important for proper expression of the C. elegans genes. These motifs are too short to be distinguished from the background level of sequence similarity, and while identical in sequence they are not conserved in orientation or position. Functional tests reveal that some of these motifs contribute to proper expression. Our results suggest that conserved regulatory circuitry can persist despite considerable turnover within cis elements. PMID:26020930

  9. Dynamic motif occupancy (DynaMO) analysis identifies transcription factors and their binding sites driving dynamic biological processes

    PubMed Central

    Kuang, Zheng; Ji, Zhicheng

    2018-01-01

    Biological processes are usually associated with genome-wide remodeling of transcription driven by transcription factors (TFs). Identifying key TFs and their spatiotemporal binding patterns is indispensable to understanding how dynamic processes are programmed. However, most methods are designed to predict TF binding sites only. We present a computational method, dynamic motif occupancy analysis (DynaMO), to infer important TFs and their spatiotemporal binding activities in dynamic biological processes using chromatin profiling data from multiple biological conditions such as time-course histone modification ChIP-seq data. In the first step, DynaMO predicts TF binding sites with a random forests approach. Next and uniquely, DynaMO infers dynamic TF binding activities at predicted binding sites using their local chromatin profiles from multiple biological conditions. Another landmark of DynaMO is to identify key TFs in a dynamic process using a clustering and enrichment analysis of dynamic TF binding patterns. Application of DynaMO to the yeast ultradian cycle, mouse circadian clock and human neural differentiation exhibits its accuracy and versatility. We anticipate DynaMO will be generally useful for elucidating transcriptional programs in dynamic processes. PMID:29325176

  10. Biophysical studies and NMR structure of YAP2 WW domain - LATS1 PPxY motif complexes reveal the basis of their interaction.

    PubMed

    Verma, Apoorva; Jing-Song, Fan; Finch-Edmondson, Megan L; Velazquez-Campoy, Adrian; Balasegaran, Shanker; Sudol, Marius; Sivaraman, Jayaraman

    2018-01-30

    YES-associated protein (YAP) is a major effector protein of the Hippo tumor suppressor pathway, and is phosphorylated by the serine/threonine kinase LATS. Their binding is mediated by the interaction between WW domains of YAP and PPxY motifs of LATS. Their isoforms, YAP2 and LATS1 contain two WW domains and two PPxY motifs respectively. Here, we report the study of the interaction of these domains both in vitro and in human cell lines, to better understand the mechanism of their binding. We show that there is a reciprocal binding preference of YAP2-WW1 with LATS1-PPxY2, and YAP2-WW2 with LATS1-PPxY1. We solved the NMR structures of these complexes and identified several conserved residues that play a critical role in binding. We further created a YAP2 mutant by swapping the WW domains, and found that YAP2 phosphorylation at S127 by LATS1 is not affected by the spatial configuration of its WW domains. This is likely because the region between the PPxY motifs of LATS1 is unstructured, even upon binding with its partner. Based on our observations, we propose possible models for the interaction between YAP2 and LATS1.

  11. Biophysical studies and NMR structure of YAP2 WW domain - LATS1 PPxY motif complexes reveal the basis of their interaction

    PubMed Central

    Verma, Apoorva; Jing-Song, Fan; Finch-Edmondson, Megan L.; Velazquez-Campoy, Adrian; Balasegaran, Shanker; Sudol, Marius; Sivaraman, Jayaraman

    2018-01-01

    YES-associated protein (YAP) is a major effector protein of the Hippo tumor suppressor pathway, and is phosphorylated by the serine/threonine kinase LATS. Their binding is mediated by the interaction between WW domains of YAP and PPxY motifs of LATS. Their isoforms, YAP2 and LATS1 contain two WW domains and two PPxY motifs respectively. Here, we report the study of the interaction of these domains both in vitro and in human cell lines, to better understand the mechanism of their binding. We show that there is a reciprocal binding preference of YAP2-WW1 with LATS1-PPxY2, and YAP2-WW2 with LATS1-PPxY1. We solved the NMR structures of these complexes and identified several conserved residues that play a critical role in binding. We further created a YAP2 mutant by swapping the WW domains, and found that YAP2 phosphorylation at S127 by LATS1 is not affected by the spatial configuration of its WW domains. This is likely because the region between the PPxY motifs of LATS1 is unstructured, even upon binding with its partner. Based on our observations, we propose possible models for the interaction between YAP2 and LATS1. PMID:29487715

  12. Genomic Context Analysis of de Novo STXBP1 Mutations Identifies Evidence of Splice Site DNA-Motif Associated Hotspots.

    PubMed

    Uddin, Mohammed; Woodbury-Smith, Marc; Chan, Ada J S; Albanna, Ammar; Minassian, Berge; Boelman, Cyrus; Scherer, Stephen W

    2018-03-28

    Mutations within STXBP1 have been associated with a range of neurodevelopmental disorders implicating the pleiotropic impact of this gene. Although the frequency of de novo mutations within STXBP1 for selective cohorts with early onset epileptic encephalopathy is more than 1%, there is no evidence for a hotspot within the gene. In this study, we analyzed the genomic context of de novo STXBP1 mutations to examine whether certain motifs indicated a greater risk of mutation. Through a comprehensive context analysis of 136 de novo/rare mutation (SNV/Indels) sites in this gene, strikingly 26.92% of all SNV mutations occurred within 5bp upstream or downstream of a 'GTA' motif (P < 0.0005). This implies a genomic context modulated mutagenesis. Moreover, 51.85% (14 out of 27) of the 'GTA' mutations are splicing mutations, compared to 14.70% (20 out of 136) of all reported mutations within STXBP1. We also noted that 11 of these 14 'GTA' associated mutations are de novo in origin. Our analysis provides strong evidence of DNA motif modulated mutagenesis for STXBP1 de novo splicing mutations. Copyright © 2018 Uddin et al.
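
    The context analysis described above (mutation sites falling within 5 bp of a 'GTA' motif) can be sketched roughly as follows. This is an illustrative reading, not the study's pipeline: the function names are hypothetical, coordinates are 0-based, only the given strand is scanned, and a site counts as "near" if it lies inside the motif or within `flank` bases of either end.

```python
def motif_starts(seq, motif):
    """Return all 0-based start positions of motif in seq (overlaps allowed)."""
    starts = []
    i = seq.find(motif)
    while i != -1:
        starts.append(i)
        i = seq.find(motif, i + 1)
    return starts


def near_motif(seq, site, motif="GTA", flank=5):
    """True if the 0-based site is inside the motif or within flank bp of it."""
    for s in motif_starts(seq.upper(), motif):
        if s - flank <= site <= s + len(motif) - 1 + flank:
            return True
    return False


def fraction_near_motif(seq, sites, motif="GTA", flank=5):
    """Fraction of mutation sites that fall near at least one motif occurrence."""
    hits = sum(near_motif(seq, s, motif, flank) for s in sites)
    return hits / len(sites)


# Toy sequence: a single GTA at positions 10-12 flanked by 10 C's on each side.
toy = "C" * 10 + "GTA" + "C" * 10
print(fraction_near_motif(toy, [4, 5, 17, 18]))
```

    Assessing significance, as the abstract does with its reported P value, would additionally require a background model of where mutations could fall; that step is not shown here.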

  13. Identifying Important Atlantic Areas for the conservation of Balearic shearwaters: Spatial overlap with conservation areas

    NASA Astrophysics Data System (ADS)

    Pérez-Roda, Amparo; Delord, Karine; Boué, Amélie; Arcos, José Manuel; García, David; Micol, Thierry; Weimerskirch, Henri; Pinaud, David; Louzao, Maite

    2017-07-01

    Marine protected areas (MPAs) are considered one of the main tools in both fisheries and conservation management to protect threatened species and their habitats around the globe. However, MPAs are underrepresented in marine environments compared to terrestrial environments. Within this context, we studied the Atlantic non-breeding distribution of the southern population of Balearic shearwaters (Puffinus mauretanicus) breeding in Eivissa during the 2011-2012 period based on global location sensing (GLS) devices. Our objectives were (1) to identify overall Important Atlantic Areas (IAAs) from a southern population, (2) to describe spatio-temporal patterns of oceanographic habitat use, and (3) to assess whether existing conservation areas (Natura 2000 sites and marine Important Bird Areas (IBAs)) cover the main IAAs of Balearic shearwaters. Our results highlighted that the Atlantic staging (from June to October in 2011) dynamic of the southern population was driven by individual segregation at both spatial and temporal scales. Individuals ranged in the North-East Atlantic over four main IAAs (Bay of Biscay: BoB, Western Iberian shelf: WIS, Gulf of Cadiz: GoC, West of Morocco: WoM). While most individuals spent more time on the WIS or in the GoC, a small number of birds visited IAAs at the extremes of their Atlantic distribution range (i.e., BoB and WoM). The chronology of the arrivals to the IAAs showed a latitudinal gradient with northern areas reached earlier during the Atlantic staging. The IAAs coincided with the most productive areas (higher chlorophyll a values) in the NE Atlantic between July and October. The spatial overlap between IAAs and conservation areas was higher for Natura 2000 sites than marine IBAs (areas with and without legal protection, respectively). Concerning the use of these areas, a slightly higher proportion of estimated positions fell within marine IBAs compared to designated Natura 2000 sites, with Spanish and Portuguese conservation

  14. The valine and lysine residues in the conserved FxVTxK motif are important for the function of phylogenetically distant plant cellulose synthases

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Slabaugh, Erin; Scavuzzo-Duggan, Tess; Chaves, Arielle

    2015-12-08

    Cellulose synthases (CESAs) synthesize the β-1,4-glucan chains that coalesce to form cellulose microfibrils in plant cell walls. In addition to a large cytosolic (catalytic) domain, CESAs have eight predicted transmembrane helices (TMHs). However, analogous to the structure of BcsA, a bacterial CESA, predicted TMH5 in CESA may instead be an interfacial helix. This would place the conserved FxVTxK motif in the plant cell cytosol where it could function as a substrate-gating loop as occurs in BcsA. To define the functional importance of the CESA region containing FxVTxK, we tested five parallel mutations in Arabidopsis thaliana CESA1 and Physcomitrella patens CESA5 in complementation assays of the relevant cesa mutants. In both organisms, the substitution of the valine or lysine residues in FxVTxK severely affected CESA function. In Arabidopsis roots, both changes were correlated with lower cellulose anisotropy, as revealed by Pontamine Fast Scarlet. Analysis of hypocotyl inner cell wall layers by atomic force microscopy showed that two altered versions of Atcesa1 could rescue cell wall phenotypes observed in the mutant background line. Overall, the data show that the FxVTxK motif is functionally important in two phylogenetically distant plant CESAs. The results show that Physcomitrella provides an efficient model for assessing the effects of engineered CESA mutations affecting primary cell wall synthesis and that diverse testing systems can lead to nuanced insights into CESA structure–function relationships. Although CESA membrane topology needs to be experimentally determined, the results support the possibility that the FxVTxK region functions similarly in CESA and BcsA.
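
    Motif shorthand such as FxVTxK, where a lowercase x stands for any residue, maps directly onto a regular expression. The following is a minimal sketch of that mapping, not the authors' method; the function names and the toy peptide are hypothetical and chosen only for illustration.

```python
import re

def motif_to_regex(motif):
    """Translate shorthand such as 'FxVTxK' into a regex source string.

    A lowercase 'x' is read as 'any residue'; every other character
    must match literally.
    """
    return "".join("." if ch == "x" else re.escape(ch) for ch in motif)

def find_motif(motif, sequence):
    """Return 0-based start positions of all matches, including overlapping ones."""
    # The lookahead lets the scan report hits that overlap each other.
    pattern = re.compile("(?=" + motif_to_regex(motif) + ")")
    return [m.start() for m in pattern.finditer(sequence)]

# Hypothetical peptide fragment, not a real CESA sequence.
toy_sequence = "MAGLFAVTEKSSFQVTGKLL"
print(find_motif("FxVTxK", toy_sequence))  # [4, 12]
```

    The same helper accepts any shorthand of this form, for example the SxxK or KTG active-site motifs of penicillin-binding proteins.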

  15. Cheap and Nasty? The Potential Perils of Using Management Costs to Identify Global Conservation Priorities

    PubMed Central

    McCreless, Erin; Visconti, Piero; Carwardine, Josie; Wilcox, Chris; Smith, Robert J.

    2013-01-01

    The financial cost of biodiversity conservation varies widely around the world and such costs should be considered when identifying countries to best focus conservation investments. Previous global prioritizations have been based on global models for protected area management costs, but this metric may be related to other factors that negatively influence the effectiveness and social impacts of conservation. Here we investigate such relationships and first show that countries with low predicted costs are less politically stable. Local support and capacity can mitigate the impacts of such instability, but we also found that these countries have less civil society involvement in conservation. Therefore, externally funded projects in these countries must rely on government agencies for implementation. This can be problematic, as our analyses show that governments in countries with low predicted costs score poorly on indices of corruption, bureaucratic quality and human rights. Taken together, our results demonstrate that using national-level estimates for protected area management costs to set global conservation priorities is simplistic, as projects in apparently low-cost countries are less likely to succeed and more likely to have negative impacts on people. We identify the need for an improved approach to develop global conservation cost metrics that better capture the true costs of avoiding or overcoming such problems. Critically, conservation scientists must engage with practitioners to better understand and implement context-specific solutions. This approach assumes that measures of conservation costs, like measures of conservation value, are organization specific, and would bring a much-needed focus on reducing the negative impacts of conservation to develop projects that benefit people and biodiversity. PMID:24260502

  16. BLSSpeller: exhaustive comparative discovery of conserved cis-regulatory elements

    PubMed Central

    De Witte, Dieter; Van de Velde, Jan; Decap, Dries; Van Bel, Michiel; Audenaert, Pieter; Demeester, Piet; Dhoedt, Bart; Vandepoele, Klaas; Fostier, Jan

    2015-01-01

    Motivation: The accurate discovery and annotation of regulatory elements remains a challenging problem. The growing number of sequenced genomes creates new opportunities for comparative approaches to motif discovery. Putative binding sites are then considered to be functional if they are conserved in orthologous promoter sequences of multiple related species. Existing methods for comparative motif discovery usually rely on pregenerated multiple sequence alignments, which are difficult to obtain for more diverged species such as plants. As a consequence, misaligned regulatory elements often remain undetected. Results: We present a novel algorithm that supports both alignment-free and alignment-based motif discovery in the promoter sequences of related species. Putative motifs are exhaustively enumerated as words over the IUPAC alphabet and screened for conservation using the branch length score. Additionally, a confidence score is established in a genome-wide fashion. In order to take advantage of a cloud computing infrastructure, the MapReduce programming model is adopted. The method is applied to four monocotyledon plant species and it is shown that high-scoring motifs are significantly enriched for open chromatin regions in Oryza sativa and for transcription factor binding sites inferred through protein-binding microarrays in O.sativa and Zea mays. Furthermore, the method is shown to recover experimentally profiled ga2ox1-like KN1 binding sites in Z.mays. Availability and implementation: BLSSpeller was written in Java. Source code and manual are available at http://bioinformatics.intec.ugent.be/blsspeller Contact: Klaas.Vandepoele@psb.vib-ugent.be or jan.fostier@intec.ugent.be Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26254488
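
    The idea of treating candidate motifs as words over the IUPAC alphabet and asking in how many related species they occur can be sketched in a few lines. This is a deliberately simplified illustration: it reports the fraction of species with a match and is not the branch length score that BLSSpeller computes. The species labels and promoter fragments are hypothetical.

```python
import re

# Standard IUPAC nucleotide codes expressed as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

def iupac_to_regex(word):
    """Compile a degenerate IUPAC word into a regular expression."""
    return re.compile("".join(IUPAC[ch] for ch in word.upper()))

def conserved_fraction(word, promoters):
    """Fraction of species whose promoter contains at least one match.

    Simplification: every species counts equally; a branch length score
    would instead weight species by their position in the phylogeny.
    """
    pattern = iupac_to_regex(word)
    hits = sum(1 for seq in promoters.values() if pattern.search(seq.upper()))
    return hits / len(promoters)

# Hypothetical orthologous promoter fragments keyed by species label.
promoters = {
    "species_a": "TTGACGTGGCAA",
    "species_b": "CCCACGTGTTTT",
    "species_c": "GGGTACGAGGGA",
    "species_d": "AATACGTGCCAT",
}
print(conserved_fraction("NACGTG", promoters))  # 0.75
```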

  17. Identification of high-efficiency 3'GG gRNA motifs in indexed FASTA files with ngg2.

    PubMed

    Roberson, Elisha D O

    CRISPR/Cas9 is emerging as one of the most-used methods of genome modification in organisms ranging from bacteria to human cells. However, the efficiency of editing varies tremendously site-to-site. A recent report identified a novel motif, called the 3'GG motif, which substantially increases the efficiency of editing at all sites tested in C. elegans. Furthermore, they highlighted that previously published gRNAs with high editing efficiency also had this motif. I designed a python command-line tool, ngg2, to identify 3'GG gRNA sites from indexed FASTA files. As a proof-of-concept, I screened for these motifs in six model genomes: Saccharomyces cerevisiae, Caenorhabditis elegans, Drosophila melanogaster, Danio rerio, Mus musculus, and Homo sapiens. I also scanned the genomes of pig (Sus scrofa) and African elephant (Loxodonta africana) to demonstrate the utility in non-model organisms. I identified more than 60 million single match 3'GG motifs in these genomes. Greater than 61% of all protein coding genes in the reference genomes had at least one unique 3'GG gRNA site overlapping an exon. In particular, more than 96% of mouse and 93% of human protein coding genes have at least one unique, overlapping 3'GG gRNA. These identified sites can be used as a starting point in gRNA selection, and the ngg2 tool provides an important ability to identify 3'GG editing sites in any species with an available genome sequence.
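
    A scan for candidate 3'GG sites can be illustrated with a short regular-expression search. The sketch below is not the ngg2 implementation and does not reproduce its command-line interface. It assumes a site layout of a 20-nt protospacer ending in GG followed by an NGG PAM, searches the forward strand only, and performs no uniqueness filtering; the function name and toy sequence are hypothetical.

```python
import re

# Assumed site layout (forward strand only): 18 arbitrary bases, then GG
# (the 3'GG motif closing the 20-nt protospacer), then an NGG PAM.
# The outer lookahead allows overlapping sites to be reported.
SITE = re.compile(r"(?=([ACGT]{18}GG)[ACGT]GG)")

def find_3gg_sites(seq):
    """Return (0-based start, 20-nt protospacer) for each forward-strand hit."""
    return [(m.start(), m.group(1)) for m in SITE.finditer(seq.upper())]

# Hypothetical toy sequence: 2 flanking bases, an 18-mer, GG, a TGG PAM, 2 flanking bases.
toy = "TT" + "ACGTACGTACGTACGTAC" + "GG" + "TGG" + "AA"
print(find_3gg_sites(toy))  # [(2, 'ACGTACGTACGTACGTACGG')]
```

    A fuller scan would also search the reverse complement and check each protospacer for uniqueness in the genome.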

  18. Conservation of the Human Integrin-Type Beta-Propeller Domain in Bacteria

    PubMed Central

    Chouhan, Bhanupratap; Denesyuk, Alexander; Heino, Jyrki; Johnson, Mark S.; Denessiouk, Konstantin

    2011-01-01

    Integrins are heterodimeric cell-surface receptors with key functions in cell-cell and cell-matrix adhesion. Integrin α and β subunits are present throughout the metazoans, but it is unclear whether the subunits predate the origin of multicellular organisms. Several component domains have been detected in bacteria, one of which, a specific 7-bladed β-propeller domain, is a unique feature of the integrin α subunits. Here, we describe a structure-derived motif, which incorporates key features of each blade from the X-ray structures of human αIIbβ3 and αVβ3, includes elements of the FG-GAP/Cage and Ca2+-binding motifs, and is specific only for the metazoan integrin domains. Separately, we searched for the metazoan integrin type β-propeller domains among all available sequences from bacteria and unicellular eukaryotic organisms, which must incorporate seven repeats, corresponding to the seven blades of the β-propeller domain, and so that the newly found structure-derived motif would exist in every repeat. As the result, among 47 available genomes of unicellular eukaryotes we could not find a single instance of seven repeats with the motif. Several sequences contained three repeats, a predicted transmembrane segment, and a short cytoplasmic motif associated with some integrins, but otherwise differ from the metazoan integrin α subunits. Among the available bacterial sequences, we found five examples containing seven sequential metazoan integrin-specific motifs within the seven repeats. The motifs differ in having one Ca2+-binding site per repeat, whereas metazoan integrins have three or four sites. The bacterial sequences are more conserved in terms of motif conservation and loop length, suggesting that the structure is more regular and compact than those example structures from human integrins. Although the bacterial examples are not full-length integrins, the full-length metazoan-type 7-bladed β-propeller domains are present, and sometimes two tandem

  19. Comparative genomics of metabolic capacities of regulons controlled by cis-regulatory RNA motifs in bacteria.

    PubMed

    Sun, Eric I; Leyn, Semen A; Kazanov, Marat D; Saier, Milton H; Novichkov, Pavel S; Rodionov, Dmitry A

    2013-09-02

    In silico comparative genomics approaches have been efficiently used for functional prediction and reconstruction of metabolic and regulatory networks. Riboswitches are metabolite-sensing structures often found in bacterial mRNA leaders controlling gene expression on transcriptional or translational levels. An increasing number of riboswitches and other cis-regulatory RNAs have been recently classified into numerous RNA families in the Rfam database. High conservation of these RNA motifs provides a unique advantage for their genomic identification and comparative analysis. A comparative genomics approach implemented in the RegPredict tool was used for reconstruction and functional annotation of regulons controlled by RNAs from 43 Rfam families in diverse taxonomic groups of Bacteria. The inferred regulons include ~5200 cis-regulatory RNAs and more than 12000 target genes in 255 microbial genomes. All predicted RNA-regulated genes were classified into specific and overall functional categories. Analysis of taxonomic distribution of these categories allowed us to establish major functional preferences for each analyzed cis-regulatory RNA motif family. Overall, most RNA motif regulons showed predictable functional content in accordance with their experimentally established effector ligands. Our results suggest that some RNA motifs (including thiamin pyrophosphate and cobalamin riboswitches that control the cofactor metabolism) are widespread and likely originated from the last common ancestor of all bacteria. However, many more analyzed RNA motifs are restricted to a narrow taxonomic group of bacteria and likely represent more recent evolutionary innovations. The reconstructed regulatory networks for major known RNA motifs substantially expand the existing knowledge of transcriptional regulation in bacteria. The inferred regulons can be used for genetic experiments, functional annotations of genes, metabolic reconstruction and evolutionary analysis.
The obtained genome

  20. Identifying and prioritizing ungulate migration routes for landscape-level conservation

    USGS Publications Warehouse

    Sawyer, H.; Kauffman, M.J.; Nielson, R.M.; Horne, J.S.

    2009-01-01

    As habitat loss and fragmentation increase across ungulate ranges, identifying and prioritizing migration routes for conservation has taken on new urgency. Here we present a general framework using the Brownian bridge movement model (BBMM) that: (1) provides a probabilistic estimate of the migration routes of a sampled population, (2) distinguishes between route segments that function as stopover sites vs. those used primarily as movement corridors, and (3) prioritizes routes for conservation based upon the proportion of the sampled population that uses them. We applied this approach to a migratory mule deer (Odocoileus hemionus) population in a pristine area of southwest Wyoming, USA, where 2000 gas wells and 1609 km of pipelines and roads have been proposed for development. Our analysis clearly delineated where migration routes occurred relative to proposed development and provided guidance for on-the-ground conservation efforts. Mule deer migration routes were characterized by a series of stopover sites where deer spent most of their time, connected by movement corridors through which deer moved quickly. Our findings suggest management strategies that differentiate between stopover sites and movement corridors may be warranted. Because some migration routes were used by more mule deer than others, proportional level of use may provide a reasonable metric by which routes can be prioritized for conservation. The methods we outline should be applicable to a wide range of species that inhabit regions where migration routes are threatened or poorly understood. © 2009 by the Ecological Society of America.

  1. Identifying and prioritizing ungulate migration routes for landscape-level conservation

    USGS Publications Warehouse

    Sawyer, Hall; Kauffman, Matthew J.; Nielson, Ryan M.; Horne, Jon S.

    2009-01-01

    As habitat loss and fragmentation increase across ungulate ranges, identifying and prioritizing migration routes for conservation has taken on new urgency. Here we present a general framework using the Brownian bridge movement model (BBMM) that: (1) provides a probabilistic estimate of the migration routes of a sampled population, (2) distinguishes between route segments that function as stopover sites vs. those used primarily as movement corridors, and (3) prioritizes routes for conservation based upon the proportion of the sampled population that uses them. We applied this approach to a migratory mule deer (Odocoileus hemionus) population in a pristine area of southwest Wyoming, USA, where 2000 gas wells and 1609 km of pipelines and roads have been proposed for development. Our analysis clearly delineated where migration routes occurred relative to proposed development and provided guidance for on-the-ground conservation efforts. Mule deer migration routes were characterized by a series of stopover sites where deer spent most of their time, connected by movement corridors through which deer moved quickly. Our findings suggest management strategies that differentiate between stopover sites and movement corridors may be warranted. Because some migration routes were used by more mule deer than others, proportional level of use may provide a reasonable metric by which routes can be prioritized for conservation. The methods we outline should be applicable to a wide range of species that inhabit regions where migration routes are threatened or poorly understood.

  2. MONKEY: Identifying conserved transcription-factor binding sites in multiple alignments using a binding site-specific evolutionary model

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Moses, Alan M.; Chiang, Derek Y.; Pollard, Daniel A.

    2004-10-28

    We introduce a method (MONKEY) to identify conserved transcription-factor binding sites in multispecies alignments. MONKEY employs probabilistic models of factor specificity and binding site evolution, on which basis we compute the likelihood that putative sites are conserved and assign statistical significance to each hit. Using genomes from the genus Saccharomyces, we illustrate how the significance of real sites increases with evolutionary distance and explore the relationship between conservation and function.

  3. Identification and analysis of Eimeria nieschulzi gametocyte genes reveal splicing events of gam genes and conserved motifs in the wall-forming proteins within the genus Eimeria (Coccidia, Apicomplexa)

    PubMed Central

    Wiedmer, Stefanie; Erdbeer, Alexander; Volke, Beate; Randel, Stephanie; Kapplusch, Franz; Hanig, Sacha; Kurth, Michael

    2017-01-01

    The genus Eimeria (Apicomplexa, Coccidia) provides a wide range of different species with different hosts to study common and variable features within the genus and its species. A common characteristic of all known Eimeria species is the oocyst, the infectious stage where its life cycle starts and ends. In our study, we utilized Eimeria nieschulzi as a model organism. This rat-specific parasite has complex oocyst morphology and can be transfected and even cultivated in vitro up to the oocyst stage. We wanted to elucidate how the known oocyst wall-forming proteins are preserved in this rodent Eimeria species compared to other Eimeria. In newly obtained genomics data, we were able to identify different gametocyte genes that are orthologous to already known gam genes involved in the oocyst wall formation of avian Eimeria species. These genes appeared putatively as single exon genes, but cDNA analysis showed alternative splicing events in the transcripts. The analysis of the translated sequence revealed different conserved motifs but also dissimilar regions in GAM proteins, as well as polymorphic regions. The occurrence of an underrepresented gam56 gene version suggests the existence of a second distinct E. nieschulzi genotype within the E. nieschulzi Landers isolate that we maintain. PMID:29210668

  4. A Novel Family in Medicago truncatula Consisting of More Than 300 Nodule-Specific Genes Coding for Small, Secreted Polypeptides with Conserved Cysteine Motifs

    PubMed Central

    Mergaert, Peter; Nikovics, Krisztina; Kelemen, Zsolt; Maunoury, Nicolas; Vaubert, Danièle; Kondorosi, Adam; Kondorosi, Eva

    2003-01-01

    Transcriptome analysis of Medicago truncatula nodules has led to the discovery of a gene family named NCR (nodule-specific cysteine rich) with more than 300 members. The encoded polypeptides were short (60–90 amino acids), carried a conserved signal peptide, and, except for a conserved cysteine motif, displayed otherwise extensive sequence divergence. Family members were found in pea (Pisum sativum), broad bean (Vicia faba), white clover (Trifolium repens), and Galega orientalis but not in other plants, including other legumes, suggesting that the family might be specific for galegoid legumes forming indeterminate nodules. Gene expression of all family members was restricted to nodules except for two, also expressed in mycorrhizal roots. NCR genes exhibited distinct temporal and spatial expression patterns in nodules and, thus, were coupled to different stages of development. The signal peptide targeted the polypeptides in the secretory pathway, as shown by green fluorescent protein fusions expressed in onion (Allium cepa) epidermal cells. Coregulation of certain NCR genes with genes coding for a potentially secreted calmodulin-like protein and for a signal peptide peptidase suggests a concerted action in nodule development. Potential functions of the NCR polypeptides in cell-to-cell signaling and creation of a defense system are discussed. PMID:12746522

  5. Comparative analysis of cis-regulation following stroke and seizures in subspaces of conserved eigensystems

    PubMed Central

    2010-01-01

    Background: It is often desirable to separate effects of different regulators on gene expression, or to identify effects of the same regulator across several systems. Here, we focus on the rat brain following stroke or seizures, and demonstrate how the two tasks can be approached simultaneously. Results: We applied SVD to time-series gene expression datasets from the rat experimental models of stroke and seizures. We demonstrate conservation of two eigensystems, reflecting inflammation and/or apoptosis (eigensystem 2) and neuronal synaptic activity (eigensystem 3), between the stroke and seizures. We analyzed cis-regulation of gene expression in the subspaces of the conserved eigensystems. Bayesian networks analysis was performed separately for either experimental model, with cross-system validation of the highest-ranking features. In this way, we correctly re-discovered the role of AP1 in the regulation of apoptosis, and the involvement of Creb and Egr in the regulation of synaptic activity-related genes. We identified a novel antagonistic effect of the motif recognized by the nuclear matrix attachment region-binding protein Satb1 on AP1-driven transcriptional activation, suggesting a link between chromatin loop structure and gene activation by AP1. The effects of motifs binding Satb1 and Creb on gene expression in brain conform to the assumption of the linear response model of gene regulation. Our data also suggest that numerous enhancers of neuronal-specific genes are important for their responsiveness to the synaptic activity. Conclusion: Eigensystems conserved between stroke and seizures separate effects of inflammation/apoptosis and neuronal synaptic activity, exerted by different transcription factors, on gene expression in rat brain. PMID:20565733

  6. Discovery of T Cell Receptor β Motifs Specific to HLA-B27-Positive Ankylosing Spondylitis by Deep Repertoire Sequence Analysis.

    PubMed

    Faham, Malek; Carlton, Victoria; Moorhead, Martin; Zheng, Jianbiao; Klinger, Mark; Pepin, Francois; Asbury, Thomas; Vignali, Marissa; Emerson, Ryan O; Robins, Harlan S; Ireland, James; Baechler-Gillespie, Emily; Inman, Robert D

    2017-04-01

    Ankylosing spondylitis (AS), a chronic inflammatory disorder, has a notable association with HLA-B27. One hypothesis suggests that a common antigen that binds to HLA-B27 is important for AS disease pathogenesis. This study was undertaken to determine sequences and motifs that are shared among HLA-B27-positive AS patients, using T cell repertoire next-generation sequencing. To identify motifs enriched among B27-positive AS patients, we performed T cell receptor β (TCRβ) repertoire sequencing on samples from 191 B27-positive AS patients, 43 B27-negative AS patients, and 227 controls, and we obtained >77 million TCRβ clonotype sequences. First, we assessed whether any of 50 previously published sequences were enriched in B27-positive AS patients. We then used training and test cohorts to identify discovered motifs that were enriched in B27-positive AS patients versus controls. Six previously published and 11 discovered motifs were enriched in the B27-positive AS samples as compared to controls. After combining motifs related by sequence, we identified a total of 15 independent motifs. Both the full set of 15 motifs and a set of 6 published motifs were enriched in the B27-positive AS patients as compared to B27-positive healthy individuals (P = 0.049 and P = 0.001, respectively). Using an independent cohort, we validated that at least some of these motifs were associated with AS, and not simply with B27-positive status. We identified TCRβ motifs that are enriched in B27-positive AS patients as compared to B27-positive healthy controls. This suggests that a common antigen, presented by HLA-B27 and detected by CD8+ T cells, may be associated with AS disease pathogenesis. © 2016, American College of Rheumatology.

  7. Unusual conformation of the SxN motif in the crystal structure of penicillin-binding protein A from Mycobacterium tuberculosis.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fedarovich, Alena; Nicholas, Robert A.; Davies, Christopher

    PBPA from Mycobacterium tuberculosis is a class B-like penicillin-binding protein (PBP) that is not essential for cell growth in M. tuberculosis, but is important for proper cell division in Mycobacterium smegmatis. We have determined the crystal structure of PBPA at 2.05 Å resolution, the first published structure of a PBP from this important pathogen. Compared to other PBPs, PBPA has a relatively small N-terminal domain, and conservation of a cluster of charged residues within this domain suggests that PBPA is more related to class B PBPs than previously inferred from sequence analysis. The C-terminal domain is a typical transpeptidase fold and contains the three conserved active-site motifs characteristic of penicillin-interacting enzymes. While the arrangement of the SxxK and KTG motifs is similar to that observed in other PBPs, the SxN motif is markedly displaced away from the active site, such that its serine (Ser281) is not involved in hydrogen bonding with residues of the other two motifs. A disulfide bridge between Cys282 (the 'x' of the SxN motif) and Cys266, which resides on an adjacent loop, may be responsible for this unusual conformation. Another interesting feature of the structure is a relatively long connection between β5 and α11, which restricts the space available in the active site of PBPA and suggests that conformational changes would be required to accommodate peptide substrate or β-lactam antibiotics during acylation. Finally, the structure shows that one of the two threonines postulated to be targets for phosphorylation is inaccessible (Thr362), whereas the other (Thr437) is well placed on a surface loop near the active site.

  8. ProMotE: an efficient algorithm for counting independent motifs in uncertain network topologies.

    PubMed

    Ren, Yuanfang; Sarkar, Aisharjya; Kahveci, Tamer

    2018-06-26

    Identifying motifs in biological networks is essential in uncovering key functions served by these networks. Finding non-overlapping motif instances is however a computationally challenging task. The fact that biological interactions are uncertain events further complicates the problem, as it makes the existence of an embedding of a given motif an uncertain event as well. In this paper, we develop a novel method, ProMotE (Probabilistic Motif Embedding), to count non-overlapping embeddings of a given motif in probabilistic networks. We utilize a polynomial model to capture the uncertainty. We develop three strategies to scale our algorithm to large networks. Our experiments demonstrate that our method scales to large networks in practical time with high accuracy where existing methods fail. Moreover, our experiments on cancer and degenerative disease networks show that our method helps in uncovering key functional characteristics of biological networks.

  9. RSAT matrix-clustering: dynamic exploration and redundancy reduction of transcription factor binding motif collections.

    PubMed

    Castro-Mondragon, Jaime Abraham; Jaeger, Sébastien; Thieffry, Denis; Thomas-Chollier, Morgane; van Helden, Jacques

    2017-07-27

    Transcription factor (TF) databases contain multitudes of binding motifs (TFBMs) from various sources, from which non-redundant collections are derived by manual curation. The advent of high-throughput methods stimulated the production of novel collections with increasing numbers of motifs. Meta-databases, built by merging these collections, contain redundant versions, because available tools are not suited to automatically identify and explore biologically relevant clusters among thousands of motifs. Motif discovery from genome-scale data sets (e.g. ChIP-seq) also produces redundant motifs, hampering the interpretation of results. We present matrix-clustering, a versatile tool that clusters similar TFBMs into multiple trees, and automatically creates non-redundant TFBM collections. A feature unique to matrix-clustering is its dynamic visualisation of aligned TFBMs, and its capability to simultaneously treat multiple collections from various sources. We demonstrate that matrix-clustering considerably simplifies the interpretation of combined results from multiple motif discovery tools, and highlights biologically relevant variations of similar motifs. We also ran a large-scale application to cluster ∼11 000 motifs from 24 entire databases, showing that matrix-clustering correctly groups motifs belonging to the same TF families, and drastically reduced motif redundancy. matrix-clustering is integrated within the RSAT suite (http://rsat.eu/), accessible through a user-friendly web interface or command-line for its integration in pipelines. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  10. Identification of sequence-structure RNA binding motifs for SELEX-derived aptamers.

    PubMed

    Hoinka, Jan; Zotenko, Elena; Friedman, Adam; Sauna, Zuben E; Przytycka, Teresa M

    2012-06-15

    Systematic Evolution of Ligands by EXponential Enrichment (SELEX) represents a state-of-the-art technology to isolate single-stranded (ribo)nucleic acid fragments, named aptamers, which bind to a molecule (or molecules) of interest via specific structural regions induced by their sequence-dependent fold. This powerful method has applications in designing protein inhibitors, molecular detection systems, therapeutic drugs and antibody replacement among others. However, full understanding and consequently optimal utilization of the process has lagged behind its wide application due to the lack of dedicated computational approaches. At the same time, the combination of SELEX with novel sequencing technologies is beginning to provide the data that will allow the examination of a variety of properties of the selection process. To close this gap we developed, Aptamotif, a computational method for the identification of sequence-structure motifs in SELEX-derived aptamers. To increase the chances of identifying functional motifs, Aptamotif uses an ensemble-based approach. We validated the method using two published aptamer datasets containing experimentally determined motifs of increasing complexity. We were able to recreate the author's findings to a high degree, thus proving the capability of our approach to identify binding motifs in SELEX data. Additionally, using our new experimental dataset, we illustrate the application of Aptamotif to elucidate several properties of the selection process.

  11. Identification of sequence–structure RNA binding motifs for SELEX-derived aptamers

    PubMed Central

    Hoinka, Jan; Zotenko, Elena; Friedman, Adam; Sauna, Zuben E.; Przytycka, Teresa M.

    2012-01-01

    Motivation: Systematic Evolution of Ligands by EXponential Enrichment (SELEX) represents a state-of-the-art technology to isolate single-stranded (ribo)nucleic acid fragments, named aptamers, which bind to a molecule (or molecules) of interest via specific structural regions induced by their sequence-dependent fold. This powerful method has applications in designing protein inhibitors, molecular detection systems, therapeutic drugs and antibody replacement among others. However, full understanding and consequently optimal utilization of the process has lagged behind its wide application due to the lack of dedicated computational approaches. At the same time, the combination of SELEX with novel sequencing technologies is beginning to provide the data that will allow the examination of a variety of properties of the selection process. Results: To close this gap we developed, Aptamotif, a computational method for the identification of sequence–structure motifs in SELEX-derived aptamers. To increase the chances of identifying functional motifs, Aptamotif uses an ensemble-based approach. We validated the method using two published aptamer datasets containing experimentally determined motifs of increasing complexity. We were able to recreate the author's findings to a high degree, thus proving the capability of our approach to identify binding motifs in SELEX data. Additionally, using our new experimental dataset, we illustrate the application of Aptamotif to elucidate several properties of the selection process. Contact: przytyck@ncbi.nlm.nih.gov, Zuben.Sauna@fda.hhs.gov PMID:22689764

  12. De novo truncating variants in the AHDC1 gene encoding the AT-hook DNA-binding motif-containing protein 1 are associated with intellectual disability and developmental delay.

    PubMed

    Yang, Hui; Douglas, Ganka; Monaghan, Kristin G; Retterer, Kyle; Cho, Megan T; Escobar, Luis F; Tucker, Megan E; Stoler, Joan; Rodan, Lance H; Stein, Diane; Marks, Warren; Enns, Gregory M; Platt, Julia; Cox, Rachel; Wheeler, Patricia G; Crain, Carrie; Calhoun, Amy; Tryon, Rebecca; Richard, Gabriele; Vitazka, Patrik; Chung, Wendy K

    2015-10-01

    Whole-exome sequencing (WES) represents a significant breakthrough in clinical genetics, and identifies a genetic etiology in up to 30% of cases of intellectual disability (ID). Using WES, we identified seven unrelated patients with a similar clinical phenotype of severe intellectual disability or neurodevelopmental delay who were all heterozygous for de novo truncating variants in the AT-hook DNA-binding motif-containing protein 1 (AHDC1). The patients were all minimally verbal or nonverbal and had variable neurological problems including spastic quadriplegia, ataxia, nystagmus, seizures, autism, and self-injurious behaviors. Additional common clinical features include dysmorphic facial features and feeding difficulties associated with failure to thrive and short stature. The AHDC1 gene has only one coding exon, and the protein contains conserved regions including AT-hook motifs and a PDZ binding domain. We postulate that all seven variants detected in these patients result in a truncated protein missing critical functional domains, disrupting interactions with other proteins important for brain development. Our study demonstrates that truncating variants in AHDC1 are associated with ID and are primarily associated with a neurodevelopmental phenotype.

  13. Optimized mixed Markov models for motif identification

    PubMed Central

    Huang, Weichun; Umbach, David M; Ohler, Uwe; Li, Leping

    2006-01-01

    Background: Identifying functional elements, such as transcriptional factor binding sites, is a fundamental step in reconstructing gene regulatory networks and remains a challenging issue, largely due to limited availability of training samples. Results: We introduce a novel and flexible model, the Optimized Mixture Markov model (OMiMa), and related methods to allow adjustment of model complexity for different motifs. In comparison with other leading methods, OMiMa can incorporate more than the NNSplice's pairwise dependencies; OMiMa avoids model over-fitting better than the Permuted Variable Length Markov Model (PVLMM); and OMiMa requires smaller training samples than the Maximum Entropy Model (MEM). Testing on both simulated and actual data (regulatory cis-elements and splice sites), we found OMiMa's performance superior to the other leading methods in terms of prediction accuracy, required size of training data or computational time. Our OMiMa system, to our knowledge, is the only motif finding tool that incorporates automatic selection of the best model. OMiMa is freely available at [1]. Conclusion: Our optimized mixture of Markov models represents an alternative to the existing methods for modeling dependent structures within a biological motif. Our model is conceptually simple and effective, and can improve prediction accuracy and/or computational speed over other leading methods. PMID:16749929

  14. The Drosophila Juvenile Hormone Receptor Candidates Methoprene-tolerant (MET) and Germ Cell-expressed (GCE) Utilize a Conserved LIXXL Motif to Bind the FTZ-F1 Nuclear Receptor*

    PubMed Central

    Bernardo, Travis J.; Dubrovsky, Edward B.

    2012-01-01

    Juvenile hormone (JH) has been implicated in many developmental processes in holometabolous insects, but its mechanism of signaling remains controversial. We previously found that in Drosophila Schneider 2 cells, the nuclear receptor FTZ-F1 is required for activation of the E75A gene by JH. Here, we utilized insect two-hybrid assays to show that FTZ-F1 interacts with two JH receptor candidates, the bHLH-PAS paralogs MET and GCE, in a JH-dependent manner. These interactions are severely reduced when helix 12 of the FTZ-F1 activation function 2 (AF2) is removed, implicating AF2 as an interacting site. Through homology modeling, we found that MET and GCE possess a C-terminal α-helix featuring a conserved motif LIXXL that represents a novel nuclear receptor (NR) box. Docking simulations supported by two-hybrid experiments revealed that FTZ-F1·MET and FTZ-F1·GCE heterodimer formation involves a typical NR box-AF2 interaction but does not require the canonical charge clamp residues of FTZ-F1 and relies primarily on hydrophobic contacts, including a unique interaction with helix 4. Moreover, we identified paralog-specific features, including a secondary interaction site found only in MET. Our findings suggest that a novel NR box enables MET and GCE to interact JH-dependently with the AF2 of FTZ-F1. PMID:22249180
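
    A short sequence pattern such as LIXXL (where X stands for any residue) can be located in a protein sequence with a regular expression. The following is a minimal sketch in Python, not the homology-modeling procedure used in the study; the function name `find_motif` and the example peptide are hypothetical and chosen only for illustration (the peptide is not the MET or GCE sequence).

```python
import re

# LIXXL: leucine, isoleucine, any two residues, leucine.
# The pattern is wrapped in a lookahead so overlapping hits are also reported.
LIXXL = re.compile(r"(?=(LI[A-Z]{2}L))")

def find_motif(seq, pattern=LIXXL):
    """Return (1-based start position, matched text) for each occurrence."""
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(seq.upper())]

# Hypothetical peptide, for illustration only.
example = "MSTLIDELAKQLIPPLG"
hits = find_motif(example)
print(hits)
```

    The same approach works for other fixed-spacing motifs by swapping the pattern, for example `F[A-Z]VT[A-Z]K` for a motif written as FxVTxK.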

  15. Identifying Conservation and Restoration Priorities for Saproxylic and Old-Growth Forest Species: A Case Study in Switzerland

    NASA Astrophysics Data System (ADS)

    Lachat, Thibault; Bütler, Rita

    2009-07-01

    Saproxylic (dead-wood-associated) and old-growth species are among the most threatened species in European forest ecosystems, as they are susceptible to intensive forest management. Identifying areas with particular relevant features of biodiversity is of prime concern when developing species conservation and habitat restoration strategies and in optimizing resource investments. We present an approach to identify regional conservation and restoration priorities even if knowledge on species distribution is weak, such as for saproxylic and old-growth species in Switzerland. Habitat suitability maps were modeled for an expert-based selection of 55 focal species, using an ecological niche factor analyses (ENFA). All the maps were then overlaid, in order to identify potential species’ hotspots for different species groups of the 55 focal species (e.g., birds, fungi, red-listed species). We found that hotspots for various species groups did not correspond. Our results indicate that an approach based on “richness hotspots” may fail to conserve specific species groups. We hence recommend defining a biodiversity conservation strategy prior to implementing conservation/restoration efforts in specific regions. The conservation priority setting of the five biogeographical regions in Switzerland, however, did not differ when different hotspot definitions were applied. This observation emphasizes that the chosen method is robust. Since the ENFA needs only presence data, this species prediction method seems to be useful for any situation where the species distribution is poorly known and/or absence data are lacking. In order to identify priorities for either conservation or restoration efforts, we recommend a method based on presence data only, because absence data may reflect factors unrelated to species presence.

  16. Identifying conservation and restoration priorities for saproxylic and old-growth forest species: a case study in Switzerland.

    PubMed

    Lachat, Thibault; Bütler, Rita

    2009-07-01

    Saproxylic (dead-wood-associated) and old-growth species are among the most threatened species in European forest ecosystems, as they are susceptible to intensive forest management. Identifying areas with particular relevant features of biodiversity is of prime concern when developing species conservation and habitat restoration strategies and in optimizing resource investments. We present an approach to identify regional conservation and restoration priorities even if knowledge on species distribution is weak, such as for saproxylic and old-growth species in Switzerland. Habitat suitability maps were modeled for an expert-based selection of 55 focal species, using an ecological niche factor analyses (ENFA). All the maps were then overlaid, in order to identify potential species' hotspots for different species groups of the 55 focal species (e.g., birds, fungi, red-listed species). We found that hotspots for various species groups did not correspond. Our results indicate that an approach based on "richness hotspots" may fail to conserve specific species groups. We hence recommend defining a biodiversity conservation strategy prior to implementing conservation/restoration efforts in specific regions. The conservation priority setting of the five biogeographical regions in Switzerland, however, did not differ when different hotspot definitions were applied. This observation emphasizes that the chosen method is robust. Since the ENFA needs only presence data, this species prediction method seems to be useful for any situation where the species distribution is poorly known and/or absence data are lacking. In order to identify priorities for either conservation or restoration efforts, we recommend a method based on presence data only, because absence data may reflect factors unrelated to species presence.

  17. Enrichment of Circular Code Motifs in the Genes of the Yeast Saccharomyces cerevisiae.

    PubMed

    Michel, Christian J; Ngoune, Viviane Nguefack; Poch, Olivier; Ripp, Raymond; Thompson, Julie D

    2017-12-03

    A set X of 20 trinucleotides has been found to have the highest average occurrence in the reading frame, compared to the two shifted frames, of genes of bacteria, archaea, eukaryotes, plasmids and viruses. This set X has an interesting mathematical property, since X is a maximal C3 self-complementary trinucleotide circular code. Furthermore, any motif obtained from this circular code X has the capacity to retrieve, maintain and synchronize the original (reading) frame. Since 1996, the theory of circular codes in genes has mainly been developed by analysing the properties of the 20 trinucleotides of X, using combinatorics and statistical approaches. For the first time, we test this theory by analysing the X motifs, i.e., motifs from the circular code X, in the complete genome of the yeast Saccharomyces cerevisiae. Several properties of X motifs are identified by basic statistics (at the frequency level), and evaluated by comparison to R motifs, i.e., random motifs generated from 30 different random codes R. We first show that the frequency of X motifs is significantly greater than that of R motifs in the genome of S. cerevisiae. We then verify that no significant difference is observed between the frequencies of X and R motifs in the non-coding regions of S. cerevisiae, but that the occurrence number of X motifs is significantly higher than R motifs in the genes (protein-coding regions). This property is true for all cardinalities of X motifs (from 4 to 20) and for all 16 chromosomes. We further investigate the distribution of X motifs in the three frames of S. cerevisiae genes and show that they occur more frequently in the reading frame, regardless of their cardinality or their length. Finally, the ratio of X genes, i.e., genes with at least one X motif, to non-X genes, in the set of verified genes is significantly different to that observed in the set of putative or dubious genes with no experimental evidence. These results, taken together, represent the first
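
    The comparison of occurrences in the reading frame against the two shifted frames can be illustrated with a simple count. This is a minimal sketch in Python under stated assumptions: the function `frame_counts`, the two-element code set and the toy sequence are hypothetical, and the study's actual X motif definition (runs of X trinucleotides of a given cardinality) and its statistics are not reproduced here.

```python
def frame_counts(seq, code):
    """Count trinucleotides belonging to `code` in frames 0, 1 and 2 of `seq`."""
    seq = seq.upper()
    counts = []
    for frame in range(3):
        # Step through the sequence three bases at a time from this frame offset.
        codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
        counts.append(sum(1 for c in codons if c in code))
    return counts

# Hypothetical toy code set and sequence, for illustration only.
toy_code = {"GAA", "GTC"}
toy_seq = "GAAGTCGAAGTC"
counts = frame_counts(toy_seq, toy_code)
print(counts)
```

    A real analysis would pass the full set of 20 trinucleotides as `code` and compare the per-frame counts against those obtained with random codes.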

  18. Dynamic motif occupancy (DynaMO) analysis identifies transcription factors and their binding sites driving dynamic biological processes.

    PubMed

    Kuang, Zheng; Ji, Zhicheng; Boeke, Jef D; Ji, Hongkai

    2018-01-09

    Biological processes are usually associated with genome-wide remodeling of transcription driven by transcription factors (TFs). Identifying key TFs and their spatiotemporal binding patterns is indispensable to understanding how dynamic processes are programmed. However, most methods are designed to predict TF binding sites only. We present a computational method, dynamic motif occupancy analysis (DynaMO), to infer important TFs and their spatiotemporal binding activities in dynamic biological processes using chromatin profiling data from multiple biological conditions such as time-course histone modification ChIP-seq data. In the first step, DynaMO predicts TF binding sites with a random forests approach. Next and uniquely, DynaMO infers dynamic TF binding activities at predicted binding sites using their local chromatin profiles from multiple biological conditions. Another landmark of DynaMO is to identify key TFs in a dynamic process using a clustering and enrichment analysis of dynamic TF binding patterns. Application of DynaMO to the yeast ultradian cycle, mouse circadian clock and human neural differentiation exhibits its accuracy and versatility. We anticipate DynaMO will be generally useful for elucidating transcriptional programs in dynamic processes. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  19. The valine and lysine residues in the conserved FxVTxK motif are important for the function of phylogenetically distant plant cellulose synthases.

    PubMed

    Slabaugh, Erin; Scavuzzo-Duggan, Tess; Chaves, Arielle; Wilson, Liza; Wilson, Carmen; Davis, Jonathan K; Cosgrove, Daniel J; Anderson, Charles T; Roberts, Alison W; Haigler, Candace H

    2016-05-01

    Cellulose synthases (CESAs) synthesize the β-1,4-glucan chains that coalesce to form cellulose microfibrils in plant cell walls. In addition to a large cytosolic (catalytic) domain, CESAs have eight predicted transmembrane helices (TMHs). However, analogous to the structure of BcsA, a bacterial CESA, predicted TMH5 in CESA may instead be an interfacial helix. This would place the conserved FxVTxK motif in the plant cell cytosol where it could function as a substrate-gating loop as occurs in BcsA. To define the functional importance of the CESA region containing FxVTxK, we tested five parallel mutations in Arabidopsis thaliana CESA1 and Physcomitrella patens CESA5 in complementation assays of the relevant cesa mutants. In both organisms, the substitution of the valine or lysine residues in FxVTxK severely affected CESA function. In Arabidopsis roots, both changes were correlated with lower cellulose anisotropy, as revealed by Pontamine Fast Scarlet. Analysis of hypocotyl inner cell wall layers by atomic force microscopy showed that two altered versions of Atcesa1 could rescue cell wall phenotypes observed in the mutant background line. Overall, the data show that the FxVTxK motif is functionally important in two phylogenetically distant plant CESAs. The results show that Physcomitrella provides an efficient model for assessing the effects of engineered CESA mutations affecting primary cell wall synthesis and that diverse testing systems can lead to nuanced insights into CESA structure-function relationships. Although CESA membrane topology needs to be experimentally determined, the results support the possibility that the FxVTxK region functions similarly in CESA and BcsA. © The Author 2015. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  20. Informative priors based on transcription factor structural class improve de novo motif discovery.

    PubMed

    Narlikar, Leelavati; Gordân, Raluca; Ohler, Uwe; Hartemink, Alexander J

    2006-07-15

    An important problem in molecular biology is to identify the locations at which a transcription factor (TF) binds to DNA, given a set of DNA sequences believed to be bound by that TF. In previous work, we showed that information in the DNA sequence of a binding site is sufficient to predict the structural class of the TF that binds it. In particular, this suggests that we can predict which locations in any DNA sequence are more likely to be bound by certain classes of TFs than others. Here, we argue that traditional methods for de novo motif finding can be significantly improved by adopting an informative prior probability that a TF binding site occurs at each sequence location. To demonstrate the utility of such an approach, we present priority, a powerful new de novo motif finding algorithm. Using data from TRANSFAC, we train three classifiers to recognize binding sites of basic leucine zipper, forkhead, and basic helix loop helix TFs. These classifiers are used to equip priority with three class-specific priors, in addition to a default prior to handle TFs of other classes. We apply priority and a number of popular motif finding programs to sets of yeast intergenic regions that are reported by ChIP-chip to be bound by particular TFs. priority identifies motifs the other methods fail to identify, and correctly predicts the structural class of the TF recognizing the identified binding sites. Supplementary material and code can be found at http://www.cs.duke.edu/~amink/.

  1. Functional significance of the E loop, a novel motif conserved in the lantibiotic immunity ATP-binding cassette transport systems.

    PubMed

    Okuda, Ken-ichi; Yanagihara, Sae; Sugayama, Tomomichi; Zendo, Takeshi; Nakayama, Jiro; Sonomoto, Kenji

    2010-06-01

    Lantibiotics are peptide-derived antibacterial substances produced by some Gram-positive bacteria and characterized by the presence of unusual amino acids, like lanthionines and dehydrated amino acids. Because lantibiotic producers may be attacked by self-produced lantibiotics, they express immunity proteins on the cytoplasmic membrane. An ATP-binding cassette (ABC) transport system mediated by the LanFEG protein complex is a major system in lantibiotic immunity. Multiple-sequence alignment analysis revealed that LanF proteins contain the E loop, a variant of the Q loop, which is a well-conserved motif in the nucleotide-binding domains (NBDs) of general ABC transporters. To elucidate E loop function, we introduced a mutation in the NukF protein, which is involved in the nukacin-ISK-1 immunity system. Amino acid replacement of glutamic acid in the E loop with glutamine (E85Q) resulted in slight decreases in the immunity level and transport activity. Additionally, the E85A mutation severely impaired the immunity level and transport activity. On the other hand, ATPase activities of purified E85Q and E85A mutants were almost similar to that of the wild type. These results suggested that the E loop found in ABC transporters involved in lantibiotic immunity plays a significant role in the function of these transporters, especially in the structural change of transmembrane domains.

  2. An intracellular motif of GLUT4 regulates fusion of GLUT4-containing vesicles.

    PubMed

    Heyward, Catherine A; Pettitt, Trevor R; Leney, Sophie E; Welsh, Gavin I; Tavaré, Jeremy M; Wakelam, Michael J O

    2008-05-20

    Insulin stimulates glucose uptake by adipocytes through increasing translocation of the glucose transporter GLUT4 from an intracellular compartment to the plasma membrane. Fusion of GLUT4-containing vesicles at the cell surface is thought to involve phospholipase D activity, generating the signalling lipid phosphatidic acid, although the mechanism of action is not yet clear. Here we report the identification of a putative phosphatidic acid-binding motif in a GLUT4 intracellular loop. Mutation of this motif causes a decrease in the insulin-induced exposure of GLUT4 at the cell surface of 3T3-L1 adipocytes via an effect on vesicle fusion. The potential phosphatidic acid-binding motif identified in this study is unique to GLUT4 among the sugar transporters, therefore this motif may provide a unique mechanism for regulating insulin-induced translocation by phospholipase D signalling.

  3. The LINKS motif zippers trans-acyltransferase polyketide synthase assembly lines into a biosynthetic megacomplex

    PubMed Central

    Gay, Darren C.; Wagner, Drew T.; Meinke, Jessica L.; Zogzas, Charles E.; Gay, Glen R.; Keatinge-Clay, Adrian T.

    2016-01-01

    Polyketides such as the clinically-valuable antibacterial agent mupirocin are constructed by architecturally-sophisticated assembly lines known as trans-acyltransferase polyketide synthases. Organelle-sized megacomplexes composed of several copies of trans-acyltransferase polyketide synthase assembly lines have been observed by others through transmission electron microscopy to be located at the Bacillus subtilis plasma membrane, where the synthesis and export of the antibacterial polyketide bacillaene takes place. In this work we analyze ten crystal structures of trans-acyltransferase polyketide synthase ketosynthase domains, seven of which are reported here for the first time, to characterize a motif capable of zippering assembly lines into a megacomplex. While each of the three-helix LINKS (Laterally-INteracting Ketosynthase Sequence) motifs is observed to similarly dock with a spatially-reversed copy of itself through hydrophobic and ionic interactions, the amino acid sequences of this motif are not conserved. Such a code is appropriate for mediating homotypic contacts between assembly lines to ensure the ordered self-assembly of a noncovalent, yet tightly-knit, enzymatic network. LINKS-mediated lateral interactions would also have the effect of bolstering the vertical association of the polypeptides that comprise a polyketide synthase assembly line. PMID:26724270

  4. Trend Motif: A Graph Mining Approach for Analysis of Dynamic Complex Networks

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jin, R; McCallen, S; Almaas, E

    2007-05-28

    Complex networks have been used successfully in scientific disciplines ranging from sociology to microbiology to describe systems of interacting units. Until recently, studies of complex networks have mainly focused on their network topology. However, in many real world applications, the edges and vertices have associated attributes that are frequently represented as vertex or edge weights. Furthermore, these weights are often not static, instead changing with time and forming a time series. Hence, to fully understand the dynamics of the complex network, we have to consider both network topology and related time series data. In this work, we propose a motif mining approach to identify trend motifs for such purposes. Simply stated, a trend motif describes a recurring subgraph where each of its vertices or edges displays similar dynamics over a user-defined period. Given this, each trend motif occurrence can help reveal significant events in a complex system; frequent trend motifs may aid in uncovering dynamic rules of change for the system, and the distribution of trend motifs may characterize the global dynamics of the system. Here, we have developed efficient mining algorithms to extract trend motifs. Our experimental validation using three disparate empirical datasets, ranging from the stock market, world trade, to a protein interaction network, has demonstrated the efficiency and effectiveness of our approach.

  5. Identifying conserved gene clusters in the presence of homology families.

    PubMed

    He, Xin; Goldwasser, Michael H

    2005-01-01

    The study of conserved gene clusters is important for understanding the forces behind genome organization and evolution, as well as the function of individual genes or gene groups. In this paper, we present a new model and algorithm for identifying conserved gene clusters from pairwise genome comparison. This generalizes a recent model called "gene teams." A gene team is a set of genes that appear homologously in two or more species, possibly in a different order yet with the distance of adjacent genes in the team for each chromosome always no more than a certain threshold. We remove the constraint in the original model that each gene must have a unique occurrence in each chromosome and thus allow the analysis on complex prokaryotic or eukaryotic genomes with extensive paralogs. Our algorithm analyzes a pair of chromosomes in O(mn) time and uses O(m+n) space, where m and n are the number of genes in the respective chromosomes. We demonstrate the utility of our methods by studying two bacterial genomes, E. coli K-12 and B. subtilis. Many of the teams identified by our algorithm correlate with documented E. coli operons, while several others match predicted operons, previously suggested by computational techniques. Our implementation and data are publicly available at euler.slu.edu/~goldwasser/homologyteams/.
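
    The distance condition in the gene team definition can be checked directly for a candidate gene set. The sketch below, in Python, only tests that condition under the original model's assumption of one occurrence per gene per chromosome; it is not the authors' O(mn) algorithm, it does not handle paralogs, and it does not test maximality. The function names, gene labels and positions are hypothetical.

```python
def is_delta_chain(positions, delta):
    """True if consecutive sorted positions are never more than delta apart."""
    ordered = sorted(positions)
    return all(b - a <= delta for a, b in zip(ordered, ordered[1:]))

def satisfies_team_distance(genes, chrom_a, chrom_b, delta):
    """Check the adjacency-distance condition on two chromosomes.

    chrom_a and chrom_b map a gene label to its position; each gene is
    assumed to occur at most once per chromosome (no paralogs).
    """
    if not all(g in chrom_a and g in chrom_b for g in genes):
        return False
    return (is_delta_chain([chrom_a[g] for g in genes], delta)
            and is_delta_chain([chrom_b[g] for g in genes], delta))

# Hypothetical positions, for illustration only.
chrom_a = {"a": 1, "b": 3, "c": 4, "d": 10}
chrom_b = {"c": 2, "a": 3, "b": 6, "d": 7}
small = satisfies_team_distance({"a", "b", "c"}, chrom_a, chrom_b, delta=3)
large = satisfies_team_distance({"a", "b", "c", "d"}, chrom_a, chrom_b, delta=3)
print(small, large)
```

    In this toy case the set {a, b, c} meets the threshold on both chromosomes even though the gene order differs, while adding d breaks the condition on the first chromosome.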

  6. Insights into the Activity and Substrate Binding of Xylella fastidiosa Polygalacturonase by Modification of a Unique QMK Amino Acid Motif Using Protein Chimeras.

    PubMed

    Warren, Jeremy G; Lincoln, James E; Kirkpatrick, Bruce C

    2015-01-01

    Polygalacturonases (EC 3.2.1.15) catalyze the random hydrolysis of 1,4-alpha-D-galactosiduronic linkages in pectate and other galacturonans. Xylella fastidiosa possesses a single polygalacturonase gene, pglA (PD1485), and X. fastidiosa mutants deficient in the production of polygalacturonase are non-pathogenic and show a compromised ability to systemically infect grapevines. These results suggested that grapevines expressing sufficient amounts of an inhibitor of X. fastidiosa polygalacturonase might be protected from disease. Previous work in our laboratory and others have tried without success to produce soluble active X. fastidiosa polygalacturonase for use in inhibition assays. In this study, we created two enzymatically active X. fastidiosa / A. vitis polygalacturonase chimeras, AX1A and AX2A, to explore the functionality of X. fastidiosa polygalacturonase in vitro. The AX1A chimera was constructed to specifically test if recombinant chimeric protein, produced in Escherichia coli, is soluble and if the X. fastidiosa polygalacturonase catalytic amino acids are able to hydrolyze polygalacturonic acid. The AX2A chimera was constructed to evaluate the ability of a unique QMK motif of X. fastidiosa polygalacturonase (most polygalacturonases have an R(I/L)K motif) to bind to and allow the hydrolysis of polygalacturonic acid. Furthermore, the AX2A chimera was also used to explore what effect modification of the QMK motif of X. fastidiosa polygalacturonase to a conserved RIK motif has on enzymatic activity. These experiments showed that both the AX1A and AX2A polygalacturonase chimeras were soluble and able to hydrolyze the polygalacturonic acid substrate. Additionally, the modification of the QMK motif to the conserved RIK motif eliminated hydrolytic activity, suggesting that the QMK motif is important for the activity of X. fastidiosa polygalacturonase. This result suggests X. fastidiosa polygalacturonase may preferentially hydrolyze a different pectic substrate or

  7. Insights into the Activity and Substrate Binding of Xylella fastidiosa Polygalacturonase by Modification of a Unique QMK Amino Acid Motif Using Protein Chimeras

    PubMed Central

    Warren, Jeremy G.; Lincoln, James E.; Kirkpatrick, Bruce C.

    2015-01-01

    Polygalacturonases (EC 3.2.1.15) catalyze the random hydrolysis of 1,4-alpha-D-galactosiduronic linkages in pectate and other galacturonans. Xylella fastidiosa possesses a single polygalacturonase gene, pglA (PD1485), and X. fastidiosa mutants deficient in the production of polygalacturonase are non-pathogenic and show a compromised ability to systemically infect grapevines. These results suggested that grapevines expressing sufficient amounts of an inhibitor of X. fastidiosa polygalacturonase might be protected from disease. Previous work in our laboratory and others have tried without success to produce soluble active X. fastidiosa polygalacturonase for use in inhibition assays. In this study, we created two enzymatically active X. fastidiosa / A. vitis polygalacturonase chimeras, AX1A and AX2A, to explore the functionality of X. fastidiosa polygalacturonase in vitro. The AX1A chimera was constructed to specifically test if recombinant chimeric protein, produced in Escherichia coli, is soluble and if the X. fastidiosa polygalacturonase catalytic amino acids are able to hydrolyze polygalacturonic acid. The AX2A chimera was constructed to evaluate the ability of a unique QMK motif of X. fastidiosa polygalacturonase (most polygalacturonases have an R(I/L)K motif) to bind to and allow the hydrolysis of polygalacturonic acid. Furthermore, the AX2A chimera was also used to explore what effect modification of the QMK motif of X. fastidiosa polygalacturonase to a conserved RIK motif has on enzymatic activity. These experiments showed that both the AX1A and AX2A polygalacturonase chimeras were soluble and able to hydrolyze the polygalacturonic acid substrate. Additionally, the modification of the QMK motif to the conserved RIK motif eliminated hydrolytic activity, suggesting that the QMK motif is important for the activity of X. fastidiosa polygalacturonase. This result suggests X. fastidiosa polygalacturonase may preferentially hydrolyze a different pectic substrate or

  8. Mouse TCOF1 is expressed widely, has motifs conserved in nucleolar phosphoproteins, and maps to chromosome 18.

    PubMed

    Paznekas, W A; Zhang, N; Gridley, T; Jabs, E W

    1997-09-08

    Mutations in the human TCOF1 gene have been identified in patients with Treacher Collins Syndrome (Mandibulofacial Dysostosis), an autosomal dominant condition affecting the craniofacial region. We report the isolation of the entire mouse Tcof1 coding sequence (3960 bp) by performing a computer-based search for mouse cDNA clones homologous to TCOF1 and generating overlapping RT-PCR products from mouse RNA. Tcof1 is a 1320 amino acid protein of 135 kDa with 61.4% identity to TCOF1 and displays repeating motifs enriched for serine- and acidic amino acid-rich regions with potential phosphorylation sites and putative nuclear localization signals. Tcof1 maps to the mouse chromosome 18 region syntenic with human chromosome 5q32→q33, which contains the TCOF1 locus. Northern blot hybridization indicates Tcof1 expression is ubiquitous in adult tissues and, in the embryonic stage, is elevated at 11 dpc when the branchial arches and facial swellings are present in mouse. Our results are consistent with TCOF1 mutations leading to the Treacher Collins syndrome phenotype.

  9. Discovery of phosphorylation motif mixtures in phosphoproteomics data

    PubMed Central

    Ritz, Anna; Shakhnarovich, Gregory; Salomon, Arthur R.; Raphael, Benjamin J.

    2009-01-01

    Motivation: Modification of proteins via phosphorylation is a primary mechanism for signal transduction in cells. Phosphorylation sites on proteins are determined in part through particular patterns, or motifs, present in the amino acid sequence. Results: We describe an algorithm that simultaneously discovers multiple motifs in a set of peptides that were phosphorylated by several different kinases. Such sets of peptides are routinely produced in proteomics experiments. Our motif-finding algorithm uses the principle of minimum description length to determine a mixture of sequence motifs that distinguish a foreground set of phosphopeptides from a background set of unphosphorylated peptides. We show that our algorithm outperforms existing motif-finding algorithms on synthetic datasets consisting of mixtures of known phosphorylation sites. We also derive a motif specificity score that quantifies whether or not the phosphoproteins containing an instance of a motif have a significant number of known interactions. Application of our motif-finding algorithm to recently published human and mouse proteomic studies recovers several known phosphorylation motifs and reveals a number of novel motifs that are enriched for interactions with a particular kinase or phosphatase. Our tools provide a new approach for uncovering the sequence specificities of uncharacterized kinases or phosphatases. Availability: Software is available at http://cs.brown.edu/people/braphael/software.html. Contact: aritz@cs.brown.edu; braphael@cs.brown.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:18996944

  10. Helix-packing motifs in membrane proteins.

    PubMed

    Walters, R F S; DeGrado, W F

    2006-09-12

    The fold of a helical membrane protein is largely determined by interactions between membrane-imbedded helices. To elucidate recurring helix-helix interaction motifs, we dissected the crystallographic structures of membrane proteins into a library of interacting helical pairs. The pairs were clustered according to their three-dimensional similarity (rmsd motifs whose structural features can be understood in terms of simple principles of helix-helix packing. Thus, the universe of common transmembrane helix-pairing motifs is relatively simple. The largest cluster, which comprises 29% of the library members, consists of an antiparallel motif with left-handed packing angles, and it is frequently stabilized by packing of small side chains occurring every seven residues in the sequence. Right-handed parallel and antiparallel structures show a similar tendency to segregate small residues to the helix-helix interface but spaced at four-residue intervals. Position-specific sequence propensities were derived for the most populated motifs. These structural and sequential motifs should be quite useful for the design and structural prediction of membrane proteins.

  11. The RXL motif of the African cassava mosaic virus Rep protein is necessary for rereplication of yeast DNA and viral infection in plants

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hipp, Katharina; Rau, Peter; Schäfer, Benjamin

    Geminiviruses, single-stranded DNA plant viruses, encode a replication-initiator protein (Rep) that is indispensable for virus replication. A potential cyclin interaction motif (RXL) in the sequence of African cassava mosaic virus Rep may be an alternative link to cell cycle controls to the known interaction with plant homologs of retinoblastoma protein (pRBR). Mutation of this motif abrogated rereplication in fission yeast induced by expression of wildtype Rep suggesting that Rep interacts via its RXL motif with one or several yeast proteins. The RXL motif is essential for viral infection of Nicotiana benthamiana plants, since mutation of this motif in infectious clones prevented any symptomatic infection. The cell-cycle link (Clink) protein of a nanovirus (faba bean necrotic yellows virus) was investigated that activates the cell cycle by binding via its LXCXE motif to pRBR. Expression of wildtype Clink and a Clink mutant deficient in pRBR-binding did not trigger rereplication in fission yeast. - Highlights: • A potential cyclin interaction motif is conserved in geminivirus Rep proteins. • In ACMV Rep, this motif (RXL) is essential for rereplication of fission yeast DNA. • Mutating RXL abrogated viral infection completely in Nicotiana benthamiana. • Expression of a nanovirus Clink protein in yeast did not induce rereplication. • Plant viruses may have evolved multiple routes to exploit host DNA synthesis.

  12. Identifying mRNA sequence elements for target recognition by human Argonaute proteins

    PubMed Central

    Li, Jingjing; Kim, TaeHyung; Nutiu, Razvan; Ray, Debashish; Hughes, Timothy R.; Zhang, Zhaolei

    2014-01-01

    It is commonly known that mammalian microRNAs (miRNAs) guide the RNA-induced silencing complex (RISC) to target mRNAs through the seed-pairing rule. However, recent experiments that coimmunoprecipitate the Argonaute proteins (AGOs), the central catalytic component of RISC, have consistently revealed extensive AGO-associated mRNAs that lack seed complementarity with miRNAs. We herein test the hypothesis that AGO has its own binding preference within target mRNAs, independent of guide miRNAs. By systematically analyzing the data from in vivo cross-linking experiments with human AGOs, we have identified a structurally accessible and evolutionarily conserved region (∼10 nucleotides in length) that alone can accurately predict AGO–mRNA associations, independent of the presence of miRNA binding sites. Within this region, we further identified an enriched motif that was replicable on independent AGO-immunoprecipitation data sets. We used RNAcompete to enumerate the RNA-binding preference of human AGO2 to all possible 7-mer RNA sequences and validated the AGO motif in vitro. These findings reveal a novel function of AGOs as sequence-specific RNA-binding proteins, which may aid miRNAs in recognizing their targets with high specificity. PMID:24663241

  13. Solution structure of a DNA mimicking motif of an RNA aptamer against transcription factor AML1 Runt domain.

    PubMed

    Nomura, Yusuke; Tanaka, Yoichiro; Fukunaga, Jun-ichi; Fujiwara, Kazuya; Chiba, Manabu; Iibuchi, Hiroaki; Tanaka, Taku; Nakamura, Yoshikazu; Kawai, Gota; Kozu, Tomoko; Sakamoto, Taiichi

    2013-12-01

    AML1/RUNX1 is an essential transcription factor involved in the differentiation of hematopoietic cells. AML1 binds to the Runt-binding double-stranded DNA element (RDE) of target genes through its N-terminal Runt domain. In a previous study, we obtained RNA aptamers against the AML1 Runt domain by systematic evolution of ligands by exponential enrichment and revealed that RNA aptamers exhibit higher affinity for the Runt domain than that for RDE and possess the 5'-GCGMGNN-3' and 5'-N'N'CCAC-3' conserved motif (M: A or C; N and N' form Watson-Crick base pairs) that is important for Runt domain binding. In this study, to understand the structural basis of recognition of the Runt domain by the aptamer motif, the solution structure of a 22-mer RNA was determined using nuclear magnetic resonance. The motif contains the AH(+)-C mismatch and base triple and adopts an unusual backbone structure. Structural analysis of the aptamer motif indicated that the aptamer binds to the Runt domain by mimicking the RDE sequence and structure. Our data should enhance the understanding of the structural basis of DNA mimicry by RNA molecules.
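
    The two-part conserved motif described above can be illustrated with a short search routine. This is only a minimal sketch and not the authors' method: it assumes that the 5'-GCGMGNN-3' element lies upstream of the 5'-N'N'CCAC-3' element, and that N'N' is the reverse complement of NN (an antiparallel Watson-Crick stem). The abstract states neither point explicitly, and the function names are hypothetical.

```python
import re

# Watson-Crick partners for RNA bases.
WC = {"A": "U", "U": "A", "G": "C", "C": "G"}


def revcomp(seq):
    """Reverse complement of an RNA string (assumed antiparallel pairing)."""
    return "".join(WC[base] for base in reversed(seq))


def find_motif_pairs(rna):
    """Return (start_first, start_second) index pairs for GCGMGNN ... N'N'CCAC.

    M is A or C. The order of the two elements and the pairing register of
    N with N' are assumptions made for this illustration.
    """
    rna = rna.upper().replace("T", "U")
    hits = []
    # Lookaheads let overlapping candidate sites be reported.
    for first in re.finditer(r"(?=GCG[AC]G([ACGU]{2}))", rna):
        nn = first.group(1)
        for second in re.finditer(r"(?=([ACGU]{2})CCAC)", rna):
            downstream = second.start() >= first.start() + 7
            if downstream and second.group(1) == revcomp(nn):
                hits.append((first.start(), second.start()))
    return hits


if __name__ == "__main__":
    # Made-up test sequence, not taken from the paper.
    print(find_motif_pairs("GGGCGAGUCAAAAGACCACCC"))
```

    The example sequence is invented for illustration; with real aptamer sequences the pairing assumption should be checked against the published secondary structure.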

  14. Tetrapods on the EDGE: Overcoming data limitations to identify phylogenetic conservation priorities

    PubMed Central

    Gray, Claudia L.; Wearn, Oliver R.; Owen, Nisha R.

    2018-01-01

    The scale of the ongoing biodiversity crisis requires both effective conservation prioritisation and urgent action. As extinction is non-random across the tree of life, it is important to prioritise threatened species which represent large amounts of evolutionary history. The EDGE metric prioritises species based on their Evolutionary Distinctiveness (ED), which measures the relative contribution of a species to the total evolutionary history of their taxonomic group, and Global Endangerment (GE), or extinction risk. EDGE prioritisations rely on adequate phylogenetic and extinction risk data to generate meaningful priorities for conservation. However, comprehensive phylogenetic trees of large taxonomic groups are extremely rare and, even when available, become quickly out-of-date due to the rapid rate of species descriptions and taxonomic revisions. Thus, it is important that conservationists can use the available data to incorporate evolutionary history into conservation prioritisation. We compared published and new methods to estimate missing ED scores for species absent from a phylogenetic tree whilst simultaneously correcting the ED scores of their close taxonomic relatives. We found that following artificial removal of species from a phylogenetic tree, the new method provided the closest estimates of their “true” ED score, differing from the true ED score by an average of less than 1%, compared to the 31% and 38% difference of the previous methods. The previous methods also substantially under- and over-estimated scores as more species were artificially removed from a phylogenetic tree. We therefore used the new method to estimate ED scores for all tetrapods. From these scores we updated EDGE prioritisation rankings for all tetrapod species with IUCN Red List assessments, including the first EDGE prioritisation for reptiles. Further, we identified criteria to identify robust priority species in an effort to further inform conservation action whilst
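
    As a worked illustration of how ED and GE combine into a single score, the sketch below uses the formulation commonly given in the EDGE literature, EDGE = ln(1 + ED) + GE × ln(2), with GE coded from the IUCN Red List category (Least Concern = 0 up to Critically Endangered = 4). The abstract itself does not state the formula, so treat it as an assumption here; the species names and ED values are hypothetical and the function name is illustrative only.

```python
import math

# Red List category weights as commonly used for the GE term (assumed here).
GE_WEIGHT = {"LC": 0, "NT": 1, "VU": 2, "EN": 3, "CR": 4}


def edge_score(ed, category):
    """EDGE = ln(1 + ED) + GE * ln(2); ED is in millions of years."""
    if ed < 0:
        raise ValueError("ED must be non-negative")
    return math.log(1.0 + ed) + GE_WEIGHT[category] * math.log(2.0)


if __name__ == "__main__":
    # Hypothetical species with made-up ED values, for illustration only.
    species = {
        "species_a": (40.0, "CR"),
        "species_b": (40.0, "LC"),
        "species_c": (5.0, "EN"),
    }
    ranked = sorted(species, key=lambda s: edge_score(*species[s]), reverse=True)
    for name in ranked:
        print(name, round(edge_score(*species[name]), 3))
```

    Under this formulation each step up in threat category adds ln(2) to the score, which is equivalent to doubling (1 + ED).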

  15. Identifiability of conservative linear mechanical systems. [applied to large flexible spacecraft structures

    NASA Technical Reports Server (NTRS)

    Sirlin, S. W.; Longman, R. W.; Juang, J. N.

    1985-01-01

    With a sufficiently great number of sensors and actuators, any finite dimensional dynamic system is identifiable on the basis of input-output data. It is presently indicated that, for conservative nongyroscopic linear mechanical systems, the number of sensors and actuators required for identifiability is very large, where 'identifiability' is understood as a unique determination of the mass and stiffness matrices. The required number of sensors and actuators drops by a factor of two, given a relaxation of the identifiability criterion so that identification can fail only if the system parameters being identified lie in a set of measure zero. When the mass matrix is known a priori, this additional information does not significantly affect the requirements for guaranteed identifiability, though the number of parameters to be determined is reduced by a factor of two.

  16. Residential energy consumption and conservation programs: A systematic approach to identify inefficient households, provide meaningful feedback, and prioritize homes for conservation intervention

    NASA Astrophysics Data System (ADS)

    Macsleyne, Amelia Chadbourne Carus

    There are three main objectives for residential energy conservation policies: to reduce the use of fossil fuels, reduce greenhouse gas emissions, and reduce the energy costs seen by the consumer (U.S. Department of Energy: Strategic Objectives, 2006). A prominent difficulty currently facing conservation policy makers and program managers is how to identify and communicate with households that would be good candidates for conservation intervention, in such a way that affects a change in consumption patterns and is cost-effective. This research addresses this issue by separating the problem into three components: how to identify houses that are significantly more inefficient than comparable households; how to find the maximum financially-feasible investment in energy efficiency for a household in order to reduce annual energy costs and/or improve indoor comfort; and how to prioritize low-income households for a subsidized weatherization program. Each component of the problem is presented as a paper prepared for publication. Household consumption related to physical house efficiency, thermostat settings, and daily appliance usage is studied in the first and second paper by analyzing natural gas utility meter readings associated with over 10,000 households from 2001-2006. A rich description of a house's architectural characteristics and household demographics is attained by integrating publicly available databases based on the house address. This combination of information allows for the largest number of individual households studied at this level of detail to date. The third paper uses conservation program data from two natural gas utilities that administer and sponsor the program; over 1,000 weatherized households are included in this sample. This research focuses on natural gas-related household conservation. However, the same principles and methods could be applied for electricity-related conservation programs. We find positive policy implications from each of

  17. Deciphering functional glycosaminoglycan motifs in development.

    PubMed

    Townley, Robert A; Bülow, Hannes E

    2018-03-23

    Glycosaminoglycans (GAGs) such as heparan sulfate, chondroitin/dermatan sulfate, and keratan sulfate are linear glycans, which when attached to protein backbones form proteoglycans. GAGs are essential components of the extracellular space in metazoans. Extensive modifications of the glycans such as sulfation, deacetylation and epimerization create structural GAG motifs. These motifs regulate protein-protein interactions and are thereby responsible for many of the essential functions of GAGs. This review focusses on recent genetic approaches to characterize GAG motifs and their function in defined signaling pathways during development. We discuss a coding approach for GAGs that would enable computational analyses of GAG sequences such as alignments and the computation of position weight matrices to describe GAG motifs. Copyright © 2018 Elsevier Ltd. All rights reserved.
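
    To make the idea of a position weight matrix over coded GAG sequences concrete, here is a minimal sketch. It assumes that each modified disaccharide has already been mapped to a single-letter code (the four-letter alphabet below is hypothetical, not the coding proposed in the review), that the sequences are pre-aligned and of equal length, and that the background is uniform.

```python
import math
from collections import Counter


def position_weight_matrix(aligned, alphabet, pseudocount=1.0):
    """Log2-odds PWM from equal-length aligned strings, uniform background."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all sequences must have the same length")
    background = 1.0 / len(alphabet)
    total = len(aligned) + pseudocount * len(alphabet)
    pwm = []
    for i in range(length):
        counts = Counter(seq[i] for seq in aligned)
        # Pseudocounts keep unseen symbols from producing log(0).
        pwm.append({
            sym: math.log2(((counts[sym] + pseudocount) / total) / background)
            for sym in alphabet
        })
    return pwm


def score(pwm, seq):
    """Sum of per-position log-odds for a sequence of matching length."""
    return sum(column[sym] for column, sym in zip(pwm, seq))


if __name__ == "__main__":
    # Hypothetical coded GAG sequences over a made-up alphabet "ABCD".
    training = ["ABCA", "ABDA", "ABCA"]
    pwm = position_weight_matrix(training, "ABCD")
    print(round(score(pwm, "ABCA"), 3), round(score(pwm, "DCBD"), 3))
```

    Sequences resembling the training set receive higher scores than unrelated ones; a real analysis would need a background model that reflects the actual frequencies of GAG modifications.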

  18. Compendium of Immune Signatures Identifies Conserved and Species-Specific Biology in Response to Inflammation.

    PubMed

    Godec, Jernej; Tan, Yan; Liberzon, Arthur; Tamayo, Pablo; Bhattacharya, Sanchita; Butte, Atul J; Mesirov, Jill P; Haining, W Nicholas

    2016-01-19

    Gene-expression profiling has become a mainstay in immunology, but subtle changes in gene networks related to biological processes are hard to discern when comparing various datasets. For instance, conservation of the transcriptional response to sepsis in mouse models and human disease remains controversial. To improve transcriptional analysis in immunology, we created ImmuneSigDB: a manually annotated compendium of ∼5,000 gene-sets from diverse cell states, experimental manipulations, and genetic perturbations in immunology. Analysis using ImmuneSigDB identified signatures induced in activated myeloid cells and differentiating lymphocytes that were highly conserved between humans and mice. Sepsis triggered conserved patterns of gene expression in humans and mouse models. However, we also identified species-specific biological processes in the sepsis transcriptional response: although both species upregulated phagocytosis-related genes, a mitosis signature was specific to humans. ImmuneSigDB enables granular analysis of transcriptomic data to improve biological understanding of immune processes of the human and mouse immune systems. Copyright © 2016 Elsevier Inc. All rights reserved.

  19. SVM2Motif—Reconstructing Overlapping DNA Sequence Motifs by Mimicking an SVM Predictor

    PubMed Central

    Vidovic, Marina M. -C.; Görnitz, Nico; Müller, Klaus-Robert; Rätsch, Gunnar; Kloft, Marius

    2015-01-01

    Identifying discriminative motifs underlying the functionality and evolution of organisms is a major challenge in computational biology. Machine learning approaches such as support vector machines (SVMs) achieve state-of-the-art performances in genomic discrimination tasks, but—due to its black-box character—motifs underlying its decision function are largely unknown. As a remedy, positional oligomer importance matrices (POIMs) allow us to visualize the significance of position-specific subsequences. Although being a major step towards the explanation of trained SVM models, they suffer from the fact that their size grows exponentially in the length of the motif, which renders their manual inspection feasible only for comparably small motif sizes, typically k ≤ 5. In this work, we extend the work on positional oligomer importance matrices, by presenting a new machine-learning methodology, entitled motifPOIM, to extract the truly relevant motifs—regardless of their length and complexity—underlying the predictions of a trained SVM model. Our framework thereby considers the motifs as free parameters in a probabilistic model, a task which can be phrased as a non-convex optimization problem. The exponential dependence of the POIM size on the oligomer length poses a major numerical challenge, which we address by an efficient optimization framework that allows us to find possibly overlapping motifs consisting of up to hundreds of nucleotides. We demonstrate the efficacy of our approach on a synthetic data set as well as a real-world human splice site data set. PMID:26690911

  20. Learning cellular sorting pathways using protein interactions and sequence motifs.

    PubMed

    Lin, Tien-Ho; Bar-Joseph, Ziv; Murphy, Robert F

    2011-11-01

    Proper subcellular localization is critical for proteins to perform their roles in cellular functions. Proteins are transported by different cellular sorting pathways, some of which take a protein through several intermediate locations until reaching its final destination. The pathway a protein is transported through is determined by carrier proteins that bind to specific sequence motifs. In this article, we present a new method that integrates protein interaction and sequence motif data to model how proteins are sorted through these sorting pathways. We use a hidden Markov model (HMM) to represent protein sorting pathways. The model is able to determine intermediate sorting states and to assign carrier proteins and motifs to the sorting pathways. In simulation studies, we show that the method can accurately recover an underlying sorting model. Using data for yeast, we show that our model leads to accurate prediction of subcellular localization. We also show that the pathways learned by our model recover many known sorting pathways and correctly assign proteins to the path they utilize. The learned model identified new pathways and their putative carriers and motifs and these may represent novel protein sorting mechanisms. Supplementary results and software implementation are available from http://murphylab.web.cmu.edu/software/2010_RECOMB_pathways/.

  1. Learning Cellular Sorting Pathways Using Protein Interactions and Sequence Motifs

    PubMed Central

    Lin, Tien-Ho; Bar-Joseph, Ziv

    2011-01-01

    Proper subcellular localization is critical for proteins to perform their roles in cellular functions. Proteins are transported by different cellular sorting pathways, some of which take a protein through several intermediate locations until reaching its final destination. The pathway a protein is transported through is determined by carrier proteins that bind to specific sequence motifs. In this article, we present a new method that integrates protein interaction and sequence motif data to model how proteins are sorted through these sorting pathways. We use a hidden Markov model (HMM) to represent protein sorting pathways. The model is able to determine intermediate sorting states and to assign carrier proteins and motifs to the sorting pathways. In simulation studies, we show that the method can accurately recover an underlying sorting model. Using data for yeast, we show that our model leads to accurate prediction of subcellular localization. We also show that the pathways learned by our model recover many known sorting pathways and correctly assign proteins to the path they utilize. The learned model identified new pathways and their putative carriers and motifs and these may represent novel protein sorting mechanisms. Supplementary results and software implementation are available from http://murphylab.web.cmu.edu/software/2010_RECOMB_pathways/. PMID:21999284

  2. Comparing spatially explicit ecological and social values for natural areas to identify effective conservation strategies.

    PubMed

    Bryan, Brett Anthony; Raymond, Christopher Mark; Crossman, Neville David; King, Darran

    2011-02-01

    Consideration of the social values people assign to relatively undisturbed native ecosystems is critical for the success of science-based conservation plans. We used an interview process to identify and map social values assigned to 31 ecosystem services provided by natural areas in an agricultural landscape in southern Australia. We then modeled the spatial distribution of 12 components of ecological value commonly used in setting spatial conservation priorities. We used the analytical hierarchy process to weight these components and used multiattribute utility theory to combine them into a single spatial layer of ecological value. Social values assigned to natural areas were negatively correlated with ecological values overall, but were positively correlated with some components of ecological value. In terms of the spatial distribution of values, people valued protected areas, whereas those natural areas underrepresented in the reserve system were of higher ecological value. The habitats of threatened animal species were assigned both high ecological value and high social value. Only small areas were assigned both high ecological value and high social value in the study area, whereas large areas of high ecological value were of low social value, and vice versa. We used the assigned ecological and social values to identify different conservation strategies (e.g., information sharing, community engagement, incentive payments) that may be effective for specific areas. We suggest that consideration of both ecological and social values in selection of conservation strategies can enhance the success of science-based conservation planning. ©2010 Society for Conservation Biology.

  3. Redemptive Journey: The Storytelling Motif in Andersen's "The Snow Queen."

    ERIC Educational Resources Information Center

    Misheff, Sue

    1989-01-01

    Discusses how Hans Christian Andersen's "The Snow Queen" uses the motif of storytelling to describe the journey taken by the heroine Gerda. Identifies a story as that which is alive and active and which causes catharsis for those who participate in it. (MG)

  4. A private DNA motif finding algorithm.

    PubMed

    Chen, Rui; Peng, Yun; Choi, Byron; Xu, Jianliang; Hu, Haibo

    2014-08-01

    With the increasing availability of genomic sequence data, numerous methods have been proposed for finding DNA motifs. The discovery of DNA motifs serves a critical step in many biological applications. However, the privacy implication of DNA analysis is normally neglected in the existing methods. In this work, we propose a private DNA motif finding algorithm in which a DNA owner's privacy is protected by a rigorous privacy model, known as ∊-differential privacy. It provides provable privacy guarantees that are independent of adversaries' background knowledge. Our algorithm makes use of the n-gram model and is optimized for processing large-scale DNA sequences. We evaluate the performance of our algorithm over real-life genomic data and demonstrate the promise of integrating privacy into DNA motif finding. Copyright © 2014 Elsevier Inc. All rights reserved.

  5. Positive evolutionary selection of an HD motif on Alzheimer precursor protein orthologues suggests a functional role.

    PubMed

    Miklós, István; Zádori, Zoltán

    2012-02-01

    HD amino acid duplex has been found in the active center of many different enzymes. The dyad plays remarkably different roles in their catalytic processes that usually involve metal coordination. An HD motif is positioned directly on the amyloid beta fragment (Aβ) and on the carboxy-terminal region of the extracellular domain (CAED) of the human amyloid precursor protein (APP) and a taxonomically well defined group of APP orthologues (APPOs). In human Aβ HD is part of a presumed, RGD-like integrin-binding motif RHD; however, neither RHD nor RXD demonstrates reasonable conservation in APPOs. The sequences of CAEDs and the position of the HD are not particularly conserved either, yet we show with a novel statistical method using evolutionary modeling that the presence of HD on CAEDs cannot be the result of neutral evolutionary forces (p<0.0001). The motif is positively selected along the evolutionary process in the majority of APPOs, despite the fact that HD motif is underrepresented in the proteomes of all species of the animal kingdom. Position migration can be explained by high probability occurrence of multiple copies of HD on intermediate sequences, from which only one is kept by selective evolutionary forces, in a similar way as in the case of the "transcription binding site turnover." CAED of all APP orthologues and homologues are predicted to bind metal ions including Amyloid-like protein 1 (APLP1) and Amyloid-like protein 2 (APLP2). Our results suggest that HDs on the CAEDs are most probably key components of metal-binding domains, which facilitate and/or regulate inter- or intra-molecular interactions in a metal ion-dependent or metal ion concentration-dependent manner. The involvement of naturally occurring mutations of HD (Tottori (D7N) and English (H6R) mutations) in early onset Alzheimer's disease gives additional support to our finding that HD has an evolutionary preserved function on APPOs.

  6. Positive Evolutionary Selection of an HD Motif on Alzheimer Precursor Protein Orthologues Suggests a Functional Role

    PubMed Central

    Miklós, István; Zádori, Zoltán

    2012-01-01

    HD amino acid duplex has been found in the active center of many different enzymes. The dyad plays remarkably different roles in their catalytic processes that usually involve metal coordination. An HD motif is positioned directly on the amyloid beta fragment (Aβ) and on the carboxy-terminal region of the extracellular domain (CAED) of the human amyloid precursor protein (APP) and a taxonomically well defined group of APP orthologues (APPOs). In human Aβ HD is part of a presumed, RGD-like integrin-binding motif RHD; however, neither RHD nor RXD demonstrates reasonable conservation in APPOs. The sequences of CAEDs and the position of the HD are not particularly conserved either, yet we show with a novel statistical method using evolutionary modeling that the presence of HD on CAEDs cannot be the result of neutral evolutionary forces (p<0.0001). The motif is positively selected along the evolutionary process in the majority of APPOs, despite the fact that HD motif is underrepresented in the proteomes of all species of the animal kingdom. Position migration can be explained by high probability occurrence of multiple copies of HD on intermediate sequences, from which only one is kept by selective evolutionary forces, in a similar way as in the case of the “transcription binding site turnover.” CAED of all APP orthologues and homologues are predicted to bind metal ions including Amyloid-like protein 1 (APLP1) and Amyloid-like protein 2 (APLP2). Our results suggest that HDs on the CAEDs are most probably key components of metal-binding domains, which facilitate and/or regulate inter- or intra-molecular interactions in a metal ion-dependent or metal ion concentration-dependent manner. The involvement of naturally occurring mutations of HD (Tottori (D7N) and English (H6R) mutations) in early onset Alzheimer's disease gives additional support to our finding that HD has an evolutionary preserved function on APPOs. PMID:22319430

  7. FoldMiner and LOCK 2: protein structure comparison and motif discovery on the web.

    PubMed

    Shapiro, Jessica; Brutlag, Douglas

    2004-07-01

    The FoldMiner web server (http://foldminer.stanford.edu/) provides remote access to methods for protein structure alignment and unsupervised motif discovery. FoldMiner is unique among such algorithms in that it improves both the motif definition and the sensitivity of a structural similarity search by combining the search and motif discovery methods and using information from each process to enhance the other. In a typical run, a query structure is aligned to all structures in one of several databases of single domain targets in order to identify its structural neighbors and to discover a motif that is the basis for the similarity among the query and statistically significant targets. This process is fully automated, but options for manual refinement of the results are available as well. The server uses the Chime plugin and customized controls to allow for visualization of the motif and of structural superpositions. In addition, we provide an interface to the LOCK 2 algorithm for rapid alignments of a query structure to smaller numbers of user-specified targets.

  8. A Novel Protein Interaction between Nucleotide Binding Domain of Hsp70 and p53 Motif

    PubMed Central

    Elengoe, Asita; Naser, Mohammed Abu; Hamdan, Salehhuddin

    2015-01-01

    Currently, protein interaction of Homo sapiens nucleotide binding domain (NBD) of heat shock 70 kDa protein (PDB: 1HJO) with p53 motif remains to be elucidated. The NBD-p53 motif complex enhances the p53 stabilization, thereby increasing the tumor suppression activity in cancer treatment. Therefore, we identified the interaction between NBD and p53 using STRING version 9.1 program. Then, we modeled the three-dimensional structure of p53 motif through homology modeling and determined the binding affinity and stability of NBD-p53 motif complex structure via molecular docking and dynamics (MD) simulation. Human DNA binding domain of p53 motif (SCMGGMNR) retrieved from UniProt (UniProtKB: P04637) was docked with the NBD protein, using the Autodock version 4.2 program. The binding energy and intermolecular energy for the NBD-p53 motif complex were −0.44 Kcal/mol and −9.90 Kcal/mol, respectively. Moreover, RMSD, RMSF, hydrogen bonds, salt bridge, and secondary structure analyses revealed that the NBD protein had a strong bond with p53 motif and the protein-ligand complex was stable. Thus, the current data would be highly encouraging for designing Hsp70 structure based drug in cancer therapy. PMID:26098630

  9. A Novel Protein Interaction between Nucleotide Binding Domain of Hsp70 and p53 Motif.

    PubMed

    Elengoe, Asita; Naser, Mohammed Abu; Hamdan, Salehhuddin

    2015-01-01

    Currently, protein interaction of Homo sapiens nucleotide binding domain (NBD) of heat shock 70 kDa protein (PDB: 1HJO) with p53 motif remains to be elucidated. The NBD-p53 motif complex enhances the p53 stabilization, thereby increasing the tumor suppression activity in cancer treatment. Therefore, we identified the interaction between NBD and p53 using STRING version 9.1 program. Then, we modeled the three-dimensional structure of p53 motif through homology modeling and determined the binding affinity and stability of NBD-p53 motif complex structure via molecular docking and dynamics (MD) simulation. Human DNA binding domain of p53 motif (SCMGGMNR) retrieved from UniProt (UniProtKB: P04637) was docked with the NBD protein, using the Autodock version 4.2 program. The binding energy and intermolecular energy for the NBD-p53 motif complex were -0.44 Kcal/mol and -9.90 Kcal/mol, respectively. Moreover, RMSD, RMSF, hydrogen bonds, salt bridge, and secondary structure analyses revealed that the NBD protein had a strong bond with p53 motif and the protein-ligand complex was stable. Thus, the current data would be highly encouraging for designing Hsp70 structure based drug in cancer therapy.

  10. smRNAome profiling to identify conserved and novel microRNAs in Stevia rebaudiana Bertoni

    PubMed Central

    2012-01-01

    Background: MicroRNAs (miRNAs) constitute a family of small RNA (sRNA) population that regulates the gene expression and plays an important role in plant development, metabolism, signal transduction and stress response. Extensive studies on miRNAs have been performed in different plants such as Arabidopsis thaliana, Oryza sativa etc. and volume of the miRNA database, mirBASE, has been increasing on day to day basis. Stevia rebaudiana Bertoni is an important perennial herb which accumulates high concentrations of diterpene steviol glycosides which contributes to its high indexed sweetening property with no calorific value. Several studies have been carried out for understanding molecular mechanism involved in biosynthesis of these glycosides, however, information about miRNAs has been lacking in S. rebaudiana. Deep sequencing of small RNAs combined with transcriptomic data is a powerful tool for identifying conserved and novel miRNAs irrespective of availability of genome sequence data. Results: To identify miRNAs in S. rebaudiana, sRNA library was constructed and sequenced using Illumina genome analyzer II. A total of 30,472,534 reads representing 2,509,190 distinct sequences were obtained from sRNA library. Based on sequence similarity, we identified 100 miRNAs belonging to 34 highly conserved families. Also, we identified 12 novel miRNAs whose precursors were potentially generated from stevia EST and nucleotide sequences. All novel sequences have not been earlier described in other plant species. Putative target genes were predicted for most conserved and novel miRNAs. The predicted targets are mainly mRNA encoding enzymes regulating essential plant metabolic and signaling pathways. Conclusions: This study led to the identification of 34 highly conserved miRNA families and 12 novel potential miRNAs indicating that specific miRNAs exist in stevia species. Our results provided information on stevia miRNAs and their targets building a foundation for future studies to

  11. smRNAome profiling to identify conserved and novel microRNAs in Stevia rebaudiana Bertoni.

    PubMed

    Mandhan, Vibha; Kaur, Jagdeep; Singh, Kashmir

    2012-11-01

    MicroRNAs (miRNAs) constitute a family of small RNAs (sRNAs) that regulate gene expression and play an important role in plant development, metabolism, signal transduction and stress response. Extensive studies on miRNAs have been performed in different plants such as Arabidopsis thaliana and Oryza sativa, and the volume of the miRNA database, miRBase, has been increasing on a day-to-day basis. Stevia rebaudiana Bertoni is an important perennial herb that accumulates high concentrations of diterpene steviol glycosides, which contribute to its high-intensity sweetening property with no calorific value. Several studies have been carried out to understand the molecular mechanisms involved in the biosynthesis of these glycosides; however, information about miRNAs has been lacking in S. rebaudiana. Deep sequencing of small RNAs combined with transcriptomic data is a powerful tool for identifying conserved and novel miRNAs irrespective of the availability of genome sequence data. To identify miRNAs in S. rebaudiana, an sRNA library was constructed and sequenced using the Illumina Genome Analyzer II. A total of 30,472,534 reads representing 2,509,190 distinct sequences were obtained from the sRNA library. Based on sequence similarity, we identified 100 miRNAs belonging to 34 highly conserved families. We also identified 12 novel miRNAs whose precursors were potentially generated from stevia EST and nucleotide sequences. None of the novel sequences has been described previously in other plant species. Putative target genes were predicted for most conserved and novel miRNAs. The predicted targets are mainly mRNAs encoding enzymes that regulate essential plant metabolic and signaling pathways. This study led to the identification of 34 highly conserved miRNA families and 12 novel potential miRNAs, indicating that specific miRNAs exist in stevia species. Our results provided information on stevia miRNAs and their targets, building a foundation for future studies to understand their roles in key

  12. Sequential visibility-graph motifs

    NASA Astrophysics Data System (ADS)

    Iacovacci, Jacopo; Lacasa, Lucas

    2016-04-01

    Visibility algorithms transform time series into graphs and encode dynamical information in their topology, paving the way for graph-theoretical time series analysis as well as building a bridge between nonlinear dynamics and network science. In this work we introduce and study the concept of sequential visibility-graph motifs, smaller substructures of n consecutive nodes that appear with characteristic frequencies. We develop a theory to compute in an exact way the motif profiles associated with general classes of deterministic and stochastic dynamics. We find that this simple property is indeed a highly informative and computationally efficient feature capable of distinguishing among different dynamics and robust against noise contamination. We finally confirm that it can be used in practice to perform unsupervised learning, by extracting motif profiles from experimental heart-rate series and being able, accordingly, to disentangle meditative from other relaxation states. Applications of this general theory include the automatic classification and description of physical, biological, and financial time series.
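
    The idea of a sequential motif profile can be sketched in a few lines. The example below is a minimal illustration, not the authors' implementation: it assumes the natural visibility criterion (two samples are linked when every sample between them lies strictly below the straight line joining them), and the function names `natural_visibility_edges` and `sequential_motif_profile` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def natural_visibility_edges(series):
    """Return the set of edges (i, j), i < j, of the natural visibility graph."""
    n = len(series)
    edges = set()
    for i in range(n):
        for j in range(i + 1, n):
            # i and j "see" each other if every sample between them lies
            # strictly below the line joining (i, y_i) and (j, y_j).
            visible = all(
                series[k] < series[j] + (series[i] - series[j]) * (j - k) / (j - i)
                for k in range(i + 1, j)
            )
            if visible:
                edges.add((i, j))
    return edges


def sequential_motif_profile(series, size=4):
    """Relative frequency of each subgraph induced on `size` consecutive nodes."""
    edges = natural_visibility_edges(series)
    counts = Counter()
    for start in range(len(series) - size + 1):
        # Encode the induced subgraph as a tuple of window-relative edges.
        pattern = tuple(
            (a, b)
            for a, b in combinations(range(size), 2)
            if (start + a, start + b) in edges
        )
        counts[pattern] += 1
    total = sum(counts.values())
    return {pattern: count / total for pattern, count in counts.items()}
```

    Calling `sequential_motif_profile(series, size=4)` on two time series and comparing the resulting frequency dictionaries gives a simple feature vector of the kind the abstract describes. The quadratic edge search is chosen for readability rather than speed.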

  13. Occurrence probability of structured motifs in random sequences.

    PubMed

    Robin, S; Daudin, J-J; Richard, H; Sagot, M-F; Schbath, S

    2002-01-01

    The problem of extracting from a set of nucleic acid sequences motifs which may have biological function is more and more important. In this paper, we are interested in particular motifs that may be implicated in the transcription process. These motifs, called structured motifs, are composed of two ordered parts separated by a variable distance and allowing for substitutions. In order to assess their statistical significance, we propose approximations of the probability of occurrences of such a structured motif in a given sequence. An application of our method to evaluate candidate promoters in E. coli and B. subtilis is presented. Simulations show the goodness of the approximations.
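
    A structured motif as defined above (two ordered boxes, a variable spacer, substitutions allowed) can be located with a short scan. This is a minimal sketch of the search step only; it does not implement the probability approximations proposed in the paper. The function name `find_structured_motif` and its parameters are hypothetical, and the boxes in the usage note are the textbook E. coli -35/-10 consensus hexamers, used purely as an illustration.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_structured_motif(seq, left, right, min_gap, max_gap, max_subs=1):
    """Return (start, gap) for each occurrence of left-spacer-right in seq.

    Each box may differ from its consensus by at most `max_subs` substitutions,
    and the spacer length must lie in [min_gap, max_gap].
    """
    hits = []
    for start in range(len(seq) - len(left) + 1):
        if hamming(seq[start:start + len(left)], left) > max_subs:
            continue
        for gap in range(min_gap, max_gap + 1):
            r_start = start + len(left) + gap
            window = seq[r_start:r_start + len(right)]
            # Skip windows that run off the end of the sequence.
            if len(window) == len(right) and hamming(window, right) <= max_subs:
                hits.append((start, gap))
    return hits
```

    For example, `find_structured_motif(seq, "TTGACA", "TATAAT", 16, 18)` reports every position where the two boxes occur 16 to 18 bases apart with at most one mismatch per box.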

  14. The nuclear OXPHOS genes in insecta: a common evolutionary origin, a common cis-regulatory motif, a common destiny for gene duplicates

    PubMed Central

    Porcelli, Damiano; Barsanti, Paolo; Pesole, Graziano; Caggese, Corrado

    2007-01-01

    Background When orthologous sequences from species distributed throughout an optimal range of divergence times are available, comparative genomics is a powerful tool to address problems such as the identification of the forces that shape gene structure during evolution, although the functional constraints involved may vary in different genes and lineages. Results We identified and annotated in the MitoComp2 dataset the orthologs of 68 nuclear genes controlling oxidative phosphorylation in 11 Drosophilidae species and in five non-Drosophilidae insects, and compared them with each other and with their counterparts in three vertebrates (Fugu rubripes, Danio rerio and Homo sapiens) and in the cnidarian Nematostella vectensis, taking into account conservation of gene structure and regulatory motifs, and preservation of gene paralogs in the genome. Comparative analysis indicates that the ancestral insect OXPHOS genes were intron rich and that extensive intron loss and lineage-specific intron gain occurred during evolution. Comparison with vertebrates and cnidarians also shows that many OXPHOS gene introns predate the cnidarian/Bilateria evolutionary split. The nuclear respiratory gene element (NRG) has played a key role in the evolution of the insect OXPHOS genes; it is constantly conserved in the OXPHOS orthologs of all the insect species examined, while their duplicates either completely lack the element or possess only relics of the motif. Conclusion Our observations reinforce the notion that the common ancestor of most animal phyla had intron-rich gene, and suggest that changes in the pattern of expression of the gene facilitate the fixation of duplications in the genome and the development of novel genetic functions. PMID:18315839

  15. The LINKS motif zippers trans-acyltransferase polyketide synthase assembly lines into a biosynthetic megacomplex.

    PubMed

    Gay, Darren C; Wagner, Drew T; Meinke, Jessica L; Zogzas, Charles E; Gay, Glen R; Keatinge-Clay, Adrian T

    2016-03-01

    Polyketides such as the clinically-valuable antibacterial agent mupirocin are constructed by architecturally-sophisticated assembly lines known as trans-acyltransferase polyketide synthases. Organelle-sized megacomplexes composed of several copies of trans-acyltransferase polyketide synthase assembly lines have been observed by others through transmission electron microscopy to be located at the Bacillus subtilis plasma membrane, where the synthesis and export of the antibacterial polyketide bacillaene takes place. In this work we analyze ten crystal structures of trans-acyltransferase polyketide synthases ketosynthase domains, seven of which are reported here for the first time, to characterize a motif capable of zippering assembly lines into a megacomplex. While each of the three-helix LINKS (Laterally-INteracting Ketosynthase Sequence) motifs is observed to similarly dock with a spatially-reversed copy of itself through hydrophobic and ionic interactions, the amino acid sequences of this motif are not conserved. Such a code is appropriate for mediating homotypic contacts between assembly lines to ensure the ordered self-assembly of a noncovalent, yet tightly-knit, enzymatic network. LINKS-mediated lateral interactions would also have the effect of bolstering the vertical association of the polypeptides that comprise a polyketide synthase assembly line. Copyright © 2015 Elsevier Inc. All rights reserved.

  16. Helix–hairpin–helix motifs confer salt resistance and processivity on chimeric DNA polymerases

    PubMed Central

    Pavlov, Andrey R.; Belova, Galina I.; Kozyavkin, Sergei A.; Slesarev, Alexei I.

    2002-01-01

    Helix–hairpin–helix (HhH) is a widespread motif involved in sequence-nonspecific DNA binding. The majority of HhH motifs function as DNA-binding modules with typical occurrence of one HhH motif or one or two (HhH)2 domains in proteins. We recently identified 24 HhH motifs in DNA topoisomerase V (Topo V). Although these motifs are dispensable for the topoisomerase activity of Topo V, their removal narrows the salt concentration range for topoisomerase activity tenfold. Here, we demonstrate the utility of Topo V's HhH motifs for modulating DNA-binding properties of the Stoffel fragment of TaqDNA polymerase and Pfu DNA polymerase. Different HhH cassettes fused with either NH2 terminus or COOH terminus of DNA polymerases broaden the salt concentration range of the polymerase activity significantly (up to 0.5 M NaCl or 1.8 M potassium glutamate). We found that anions play a major role in the inhibition of DNA polymerase activity. The resistance of initial extension rates and the processivity of chimeric polymerases to salts depend on the structure of added HhH motifs. Regardless of the type of the construct, the thermal stability of chimeric Taq polymerases increases under the optimal ionic conditions, as compared with that of TaqDNA polymerase or its Stoffel fragment. Our approach to raise the salt tolerance, processivity, and thermostability of Taq and Pfu DNA polymerases may be applied to all pol1- and polB-type polymerases, as well as to other DNA processing enzymes. PMID:12368475

  17. Divergence and Conservative Evolution of XTNX Genes in Land Plants.

    PubMed

    Zhang, Yan-Mei; Xue, Jia-Yu; Liu, Li-Wei; Sun, Xiao-Qin; Zhou, Guang-Can; Chen, Min; Shao, Zhu-Qing; Hang, Yue-Yu

    2017-01-01

    The Toll-interleukin-1 receptor (TIR) and Nucleotide-binding site (NBS) domains are two major components of the TIR-NBS-leucine-rich repeat family plant disease resistance genes. Extensive functional and evolutionary studies have been performed on these genes; however, the characterization of a small group of genes that are composed of atypical TIR and NBS domains, namely XTNX genes, is limited. The present study investigated this specific gene family by conducting genome-wide analyses of 59 green plant genomes. A total of 143 XTNX genes were identified in 51 of the 52 land plant genomes, whereas no XTNX gene was detected in any green algae genomes, which indicated that XTNX genes originated upon emergence of land plants. Phylogenetic analysis revealed that the ancestral XTNX gene underwent two rounds of ancient duplications in land plants, which resulted in the formation of clades I/II and clades IIa/IIb successively. Although clades I and IIb have evolved conservatively in angiosperms, the motif composition difference and sequence divergence at the amino acid level suggest that functional divergence may have occurred since the separation of the two clades. In contrast, several features of the clade IIa genes, including the absence in the majority of dicots, the long branches in the tree, the frequent loss of ancestral motifs, and the loss of expression in all detected tissues of Zea mays, all suggest that the genes in this lineage might have undergone pseudogenization. This study highlights that XTNX genes are a gene family that originated anciently in land plants and followed a specific conservative pattern of evolution.

  18. GSHSite: Exploiting an Iteratively Statistical Method to Identify S-Glutathionylation Sites with Substrate Specificity

    PubMed Central

    Chen, Yi-Ju; Lu, Cheng-Tsung; Huang, Kai-Yao; Wu, Hsin-Yi; Chen, Yu-Ju; Lee, Tzong-Yi

    2015-01-01

    S-glutathionylation, the covalent attachment of a glutathione (GSH) to the sulfur atom of cysteine, is a selective and reversible protein post-translational modification (PTM) that regulates protein activity, localization, and stability. Despite its implication in the regulation of protein functions and cell signaling, the substrate specificity of cysteine S-glutathionylation remains unknown. Based on a total of 1783 experimentally identified S-glutathionylation sites from mouse macrophages, this work presents an informatics investigation of S-glutathionylation sites, including structural factors such as the flanking amino acid composition and the accessible surface area (ASA). TwoSampleLogo analysis shows that positively charged amino acids flanking the S-glutathionylated cysteine may influence the formation of S-glutathionylation in a closed three-dimensional environment. A statistical method is further applied to iteratively detect the conserved substrate motifs with statistical significance. A support vector machine (SVM) is then applied to generate a predictive model that considers the substrate motifs. According to five-fold cross-validation, the SVMs trained with substrate motifs achieve enhanced sensitivity, specificity, and accuracy, and provide promising performance on an independent test set. The effectiveness of the proposed method is demonstrated by the correct identification of previously reported S-glutathionylation sites of mouse thioredoxin (TXN) and human protein tyrosine phosphatase 1b (PTP1B). Finally, the constructed models are adopted to implement an effective web-based tool, named GSHSite (http://csb.cse.yzu.edu.tw/GSHSite/), for identifying uncharacterized GSH substrate sites on the protein sequences. PMID:25849935
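
    Analyses of flanking amino acid composition usually start by cutting a fixed-width window around each candidate residue. The sketch below shows that preprocessing step only; it is not the GSHSite code. The function name `cysteine_windows`, the default flank of 10 residues, and the `-` padding character are assumptions chosen for illustration, since the abstract does not state the window size.

```python
def cysteine_windows(protein, flank=10, pad="-"):
    """Return (1-based position, window) for every cysteine in `protein`.

    Each window holds `flank` residues on either side of the cysteine;
    positions beyond the sequence ends are filled with `pad`.
    """
    padded = pad * flank + protein + pad * flank
    windows = []
    for pos, residue in enumerate(protein):
        if residue == "C":
            # In `padded`, residue `pos` sits at index pos + flank, so the
            # window centred on it starts at index pos.
            windows.append((pos + 1, padded[pos:pos + 2 * flank + 1]))
    return windows
```

    Windows produced this way all have length `2 * flank + 1` with the cysteine in the centre, so they can be stacked for position-specific composition counts or encoded as features for a classifier.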

  19. Simultaneously learning DNA motif along with its position and sequence rank preferences through expectation maximization algorithm.

    PubMed

    Zhang, ZhiZhuo; Chang, Cheng Wei; Hugo, Willy; Cheung, Edwin; Sung, Wing-Kin

    2013-03-01

    Although de novo motifs can be discovered through mining over-represented sequence patterns, this approach misses some real motifs and generates many false positives. To improve accuracy, one solution is to consider some additional binding features (i.e., position preference and sequence rank preference). This information is usually required from the user. This article presents a de novo motif discovery algorithm called SEME (sampling with expectation maximization for motif elicitation), which uses a pure probabilistic mixture model to model the motif's binding features and uses expectation maximization (EM) algorithms to simultaneously learn the sequence motif, position, and sequence rank preferences without asking for any prior knowledge from the user. SEME is both efficient and accurate thanks to two important techniques: the variable motif length extension and importance sampling. Using 75 large-scale synthetic datasets, 32 metazoan compendium benchmark datasets, and 164 chromatin immunoprecipitation sequencing (ChIP-Seq) libraries, we demonstrated the superior performance of SEME over existing programs in finding transcription factor (TF) binding sites. SEME is further applied to a more difficult problem of finding the co-regulated TF (coTF) motifs in 15 ChIP-Seq libraries. It identified significantly more correct coTF motifs and, at the same time, predicted coTF motifs with better matching to the known motifs. Finally, we show that the learned position and sequence rank preferences of each coTF reveal potential interaction mechanisms between the primary TF and the coTF within these sites. Some of these findings were further validated by the ChIP-Seq experiments of the coTFs. The application is available online.

  20. LDsplit: screening for cis-regulatory motifs stimulating meiotic recombination hotspots by analysis of DNA sequence polymorphisms.

    PubMed

    Yang, Peng; Wu, Min; Guo, Jing; Kwoh, Chee Keong; Przytycka, Teresa M; Zheng, Jie

    2014-02-17

    As a fundamental genomic element, meiotic recombination hotspot plays important roles in life sciences. Thus uncovering its regulatory mechanisms has broad impact on biomedical research. Despite the recent identification of the zinc finger protein PRDM9 and its 13-mer binding motif as major regulators for meiotic recombination hotspots, other regulators remain to be discovered. Existing methods for finding DNA sequence motifs of recombination hotspots often rely on the enrichment of co-localizations between hotspots and short DNA patterns, which ignore the cross-individual variation of recombination rates and sequence polymorphisms in the population. Our objective in this paper is to capture signals encoded in genetic variations for the discovery of recombination-associated DNA motifs. Recently, an algorithm called "LDsplit" has been designed to detect the association between single nucleotide polymorphisms (SNPs) and proximal meiotic recombination hotspots. The association is measured by the difference of population recombination rates at a hotspot between two alleles of a candidate SNP. Here we present an open source software tool of LDsplit, with integrative data visualization for recombination hotspots and their proximal SNPs. Applying LDsplit on SNPs inside an established 7-mer motif bound by PRDM9 we observed that SNP alleles preserving the original motif tend to have higher recombination rates than the opposite alleles that disrupt the motif. Running on SNP windows around hotspots each containing an occurrence of the 7-mer motif, LDsplit is able to guide the established motif finding algorithm of MEME to recover the 7-mer motif. In contrast, without LDsplit the 7-mer motif could not be identified. LDsplit is a software tool for the discovery of cis-regulatory DNA sequence motifs stimulating meiotic recombination hotspots by screening and narrowing down to hotspot associated SNPs. It is the first computational method that utilizes the genetic variation of

  1. LDsplit: screening for cis-regulatory motifs stimulating meiotic recombination hotspots by analysis of DNA sequence polymorphisms

    PubMed Central

    2014-01-01

    Background As a fundamental genomic element, meiotic recombination hotspot plays important roles in life sciences. Thus uncovering its regulatory mechanisms has broad impact on biomedical research. Despite the recent identification of the zinc finger protein PRDM9 and its 13-mer binding motif as major regulators for meiotic recombination hotspots, other regulators remain to be discovered. Existing methods for finding DNA sequence motifs of recombination hotspots often rely on the enrichment of co-localizations between hotspots and short DNA patterns, which ignore the cross-individual variation of recombination rates and sequence polymorphisms in the population. Our objective in this paper is to capture signals encoded in genetic variations for the discovery of recombination-associated DNA motifs. Results Recently, an algorithm called “LDsplit” has been designed to detect the association between single nucleotide polymorphisms (SNPs) and proximal meiotic recombination hotspots. The association is measured by the difference of population recombination rates at a hotspot between two alleles of a candidate SNP. Here we present an open source software tool of LDsplit, with integrative data visualization for recombination hotspots and their proximal SNPs. Applying LDsplit on SNPs inside an established 7-mer motif bound by PRDM9 we observed that SNP alleles preserving the original motif tend to have higher recombination rates than the opposite alleles that disrupt the motif. Running on SNP windows around hotspots each containing an occurrence of the 7-mer motif, LDsplit is able to guide the established motif finding algorithm of MEME to recover the 7-mer motif. In contrast, without LDsplit the 7-mer motif could not be identified. Conclusions LDsplit is a software tool for the discovery of cis-regulatory DNA sequence motifs stimulating meiotic recombination hotspots by screening and narrowing down to hotspot associated SNPs. It is the first computational method that

  2. Relevance of CARC and CRAC Cholesterol-Recognition Motifs in the Nicotinic Acetylcholine Receptor and Other Membrane-Bound Receptors.

    PubMed

    Di Scala, Coralie; Baier, Carlos J; Evans, Luke S; Williamson, Philip T F; Fantini, Jacques; Barrantes, Francisco J

    2017-01-01

    Cholesterol is a ubiquitous neutral lipid, which finely tunes the activity of a wide range of membrane proteins, including neurotransmitter and hormone receptors and ion channels. Given the scarcity of available X-ray crystallographic structures and the even fewer in which cholesterol sites have been directly visualized, application of in silico computational methods remains a valid alternative for the detection and thermodynamic characterization of cholesterol-specific sites in functionally important membrane proteins. The membrane-embedded segments of the paradigm neurotransmitter receptor for acetylcholine display a series of cholesterol consensus domains (which we have coined "CARC"). The CARC motif exhibits a preference for the outer membrane leaflet and its mirror motif, CRAC, for the inner one. Some membrane proteins possess the double CARC-CRAC sequences within the same transmembrane domain. In addition to in silico molecular modeling, the affinity, concentration dependence, and specificity of the cholesterol-recognition motif-protein interaction have recently found experimental validation in other biophysical approaches like monolayer techniques and nuclear magnetic resonance spectroscopy. From the combined studies, it becomes apparent that the CARC motif is now more firmly established as a high-affinity cholesterol-binding domain for membrane-bound receptors and remarkably conserved along phylogenetic evolution. © 2017 Elsevier Inc. All rights reserved.
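
    The abstract does not spell out the two consensus patterns. The sketch below assumes the definitions commonly given in the literature: CRAC as (L/V)-X(1-5)-Y-X(1-5)-(K/R) and CARC as (K/R)-X(1-5)-(Y/F)-X(1-5)-(L/V). It scans a protein sequence for both with regular expressions. The function name `scan` is hypothetical, and a sequence match is only a candidate site, not evidence of cholesterol binding.

```python
import re

# Assumed consensus definitions (not stated in the abstract itself).
# A lookahead is used so that overlapping candidates are all reported.
CRAC = re.compile(r"(?=([LV].{1,5}Y.{1,5}[KR]))")
CARC = re.compile(r"(?=([KR].{1,5}[YF].{1,5}[LV]))")


def scan(pattern, protein):
    """Return (1-based start, matched segment) for each candidate site.

    For each start position only the longest segment is reported, because
    the quantifiers in the pattern are greedy.
    """
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(protein)]
```

    Running `scan(CARC, seq)` and `scan(CRAC, seq)` on a transmembrane segment gives a quick list of candidates to inspect against the predicted membrane leaflet.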

  3. RNA motif search with data-driven element ordering.

    PubMed

    Rampášek, Ladislav; Jimenez, Randi M; Lupták, Andrej; Vinař, Tomáš; Brejová, Broňa

    2016-05-18

    In this paper, we study the problem of RNA motif search in long genomic sequences. This approach uses a combination of sequence and structure constraints to uncover new distant homologs of known functional RNAs. The problem is NP-hard and is traditionally solved by backtracking algorithms. We have designed a new algorithm for RNA motif search and implemented a new motif search tool RNArobo. The tool enhances the RNAbob descriptor language, allowing insertions in helices, which enables better characterization of ribozymes and aptamers. A typical RNA motif consists of multiple elements and the running time of the algorithm is highly dependent on their ordering. By approaching the element ordering problem in a principled way, we demonstrate more than 100-fold speedup of the search for complex motifs compared to previously published tools. We have developed a new method for RNA motif search that allows for a significant speedup of the search of complex motifs that include pseudoknots. Such speed improvements are crucial at a time when the rate of DNA sequencing outpaces growth in computing. RNArobo is available at http://compbio.fmph.uniba.sk/rnarobo.

  4. Highly Conserved Arg Residue of ERFNIN Motif of Pro-Domain is Important for pH-Induced Zymogen Activation Process in Cysteine Cathepsins K and L.

    PubMed

    Aich, Pulakesh; Biswas, Sampa

    2018-06-01

    The pro-domain of a cysteine cathepsin contains a highly conserved Ex2Rx2Fx2Nx3Ix3N (ERFNIN) motif. The zymogen structure of cathepsins revealed that the Arg (R) residue of the motif is a central residue of a salt-bridge/H-bond network, stabilizing the scaffold of the pro-domain. The importance of the arginine is also demonstrated in studies where a single mutation (Arg → Trp) in human lysosomal cathepsin K (hCTSK) is linked to a bone-related genetic disorder, pycnodysostosis. In the present study, we have characterized in vitro the Arg → Trp mutant of hCTSK and the same mutant of hCTSL. The R → W mutant of hCTSK revealed that this mutation leads to an unstable zymogen that is spontaneously activated and auto-proteolytically degraded rapidly. In contrast, the same mutant of hCTSL is sufficiently stable and has proteolytic activity almost like its wild-type counterpart; however, it shows an altered zymogen activation condition in terms of pH, temperature and time. Far and near UV circular dichroism and intrinsic tryptophan fluorescence experiments have revealed that the mutation has minimal effect on the structure of the protease hCTSL. Molecular modeling studies show that the mutated Trp31 in hCTSL forms an aromatic cluster with Tyr23 and Trp30, leading to a local stabilization of the pro-domain, and supplements the loss of the salt-bridge interaction mediated by Arg31 in the wild type. In the hCTSK-R31W mutant, due to the presence of a non-aromatic Ser30 residue, such an interaction is not possible, which may be responsible for local instability. These differences may cause detrimental effects of the R31W mutation on the regulation of the hCTSK auto-activation process compared to the altered activation process in hCTSL.
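
    A spaced motif of this kind translates directly into a regular expression. The sketch below transcribes the spacing exactly as the abstract writes it (E-x2-R-x2-F-x2-N-x3-I-x3-N, where x is any residue); the spacing should be checked against the full paper before real use, and the function name `find_erfnin` is hypothetical.

```python
import re

# Spacing transcribed from the abstract: E x2 R x2 F x2 N x3 I x3 N.
# This is an assumption about the exact motif geometry, not a verified profile.
ERFNIN = re.compile(r"E.{2}R.{2}F.{2}N.{3}I.{3}N")


def find_erfnin(protein):
    """Return (0-based start, matched segment) for each non-overlapping hit."""
    return [(m.start(), m.group()) for m in ERFNIN.finditer(protein)]
```

    Because the arginine is a fixed position in the pattern, an Arg to Trp substitution at that position removes the match, which makes the scan a quick check for whether a variant sequence still carries the intact motif.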

  5. Crystal structure of bacterial cell-surface alginate-binding protein with an M75 peptidase motif

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Maruyama, Yukie; Ochiai, Akihito; Mikami, Bunzo

    Research highlights: Bacterial alginate-binding Algp7 is similar to component EfeO of Fe²⁺ transporter. We determined the crystal structure of Algp7 with a metal-binding motif. Algp7 consists of two helical bundles formed through duplication of a single bundle. A deep cleft involved in alginate binding locates around the metal-binding site. Algp7 may function as a Fe²⁺-chelated alginate-binding protein. Abstract: A gram-negative Sphingomonas sp. A1 directly incorporates alginate polysaccharide into the cytoplasm via the cell-surface pit and ABC transporter. A cell-surface alginate-binding protein, Algp7, functions as a concentrator of the polysaccharide in the pit. Based on the primary structure and genetic organization in the bacterial genome, Algp7 was found to be homologous to an M75 peptidase motif-containing EfeO, a component of a ferrous ion transporter. Despite the presence of an M75 peptidase motif with high similarity, the Algp7 protein purified from recombinant Escherichia coli cells was inert on insulin B chain and N-benzoyl-Phe-Val-Arg-p-nitroanilide, both of which are substrates for a typical M75 peptidase, imelysin, from Pseudomonas aeruginosa. The X-ray crystallographic structure of Algp7 was determined at 2.10 Å resolution by single-wavelength anomalous diffraction. Although a metal-binding motif, HxxE, conserved in zinc ion-dependent M75 peptidases is also found in Algp7, the crystal structure of Algp7 contains no metal even at the motif. The protein consists of two structurally similar up-and-down helical bundles as the basic scaffold. A deep cleft between the bundles is sufficiently large to accommodate macromolecules such as alginate polysaccharide. This is the first structural report on a bacterial cell-surface alginate-binding protein with an M75 peptidase motif.

  6. Multilayer motif analysis of brain networks

    NASA Astrophysics Data System (ADS)

    Battiston, Federico; Nicosia, Vincenzo; Chavez, Mario; Latora, Vito

    2017-04-01

    In the last decade, network science has shed new light both on the structural (anatomical) and on the functional (correlations in the activity) connectivity among the different areas of the human brain. The analysis of brain networks has made it possible to detect the central areas of a neural system and to identify its building blocks by looking at overabundant small subgraphs, known as motifs. However, network analysis of the brain has so far mainly focused on anatomical and functional networks as separate entities. The recently developed mathematical framework of multi-layer networks allows us to perform an analysis of the human brain where the structural and functional layers are considered together. In this work, we describe how to classify the subgraphs of a multiplex network, and we extend the motif analysis to networks with an arbitrary number of layers. We then extract multi-layer motifs in brain networks of healthy subjects by considering networks with two layers, anatomical and functional, respectively, obtained from diffusion and functional magnetic resonance imaging. Results indicate that subgraphs in which the presence of a physical connection between brain areas (links at the structural layer) coexists with a non-trivial positive correlation in their activities are statistically overabundant. Finally, we investigate the existence of a reinforcement mechanism between the two layers by looking at how the probability of finding a link in one layer depends on the intensity of the connection in the other one. By showing that functional connectivity is non-trivially constrained by the underlying anatomical network, our work contributes to a better understanding of the interplay between structure and function in the human brain.

  7. A Second Las17 Monomeric Actin-Binding Motif Functions in Arp2/3-Dependent Actin Polymerization During Endocytosis

    PubMed Central

    Feliciano, Daniel; Tolsma, Thomas O.; Farrell, Kristen B.; Aradi, Al; Di Pietro, Santiago M.

    2018-01-01

    During clathrin-mediated endocytosis (CME), actin assembly provides force to drive vesicle internalization. Members of the Wiskott–Aldrich syndrome protein (WASP) family play a fundamental role stimulating actin assembly. WASP family proteins contain a WH2 motif that binds globular actin (G-actin) and a central-acidic motif that binds the Arp2/3 complex, thus promoting the formation of branched actin filaments. Yeast WASP (Las17) is the strongest of five factors promoting Arp2/3-dependent actin polymerization during CME. It was suggested that this strong activity may be caused by a putative second G-actin-binding motif in Las17. Here, we describe the in vitro and in vivo characterization of such Las17 G-actin-binding motif (LGM) and its dependence on a group of conserved arginine residues. Using the yeast two-hybrid system, GST-pulldown, fluorescence polarization and pyrene-actin polymerization assays, we show that LGM binds G-actin and is necessary for normal Arp2/3-mediated actin polymerization in vitro. Live-cell fluorescence microscopy experiments demonstrate that LGM is required for normal dynamics of actin polymerization during CME. Further, LGM is necessary for normal dynamics of endocytic machinery components that are recruited at early, intermediate and late stages of endocytosis, as well as for optimal endocytosis of native CME cargo. Both in vitro and in vivo experiments show that LGM has relatively lower potency compared to the previously known Las17 G-actin-binding motif, WH2. These results establish a second G-actin-binding motif in Las17 and advance our knowledge on the mechanism of actin assembly during CME. PMID:25615019

  8. Identification of sequence motifs significantly associated with antisense activity.

    PubMed

    McQuisten, Kyle A; Peek, Andrew S

    2007-06-07

    Predicting the suppression activity of antisense oligonucleotide sequences is the main goal of the rational design of nucleic acids. To create an effective predictive model, it is important to know what properties of an oligonucleotide sequence associate significantly with antisense activity. Also, for the model to be efficient we must know what properties do not associate significantly and can be omitted from the model. This paper discusses the results of a randomization procedure to find motifs that associate significantly with either high or low antisense suppression activity, an analysis of their properties, and the results of support vector machine modelling using these significant motifs as features. We discovered 155 motifs that associate significantly with high antisense suppression activity and 202 motifs that associate significantly with low suppression activity. The motifs range in length from 2 to 5 bases, include several motifs that have been previously discovered as associating highly with antisense activity, and have thermodynamic properties consistent with previous work associating thermodynamic properties of sequences with their antisense activity. Statistical analysis revealed no correlation between a motif's position within an antisense sequence and that sequence's antisense activity. Also, many significant motifs existed as subwords of other significant motifs. Support vector regression experiments indicated that the feature set of significant motifs increased correlation compared to all possible motifs as well as several subsets of the significant motifs. The thermodynamic properties of the significantly associated motifs support existing data correlating the thermodynamic properties of the antisense oligonucleotide with antisense efficiency, reinforcing our hypothesis that antisense suppression is strongly associated with probe/target thermodynamics, as there are no enzymatic mediators to speed the process along like the RNA Induced

  9. Conserved structure and inferred evolutionary history of long terminal repeats (LTRs)

    PubMed Central

    2013-01-01

    Background Long terminal repeats (LTRs, consisting of U3-R-U5 portions) are important elements of retroviruses and related retrotransposons. They are difficult to analyse due to their variability. The aim was to obtain a more comprehensive view of structure, diversity and phylogeny of LTRs than hitherto possible. Results Hidden Markov models (HMM) were created for 11 clades of LTRs belonging to Retroviridae (class III retroviruses), animal Metaviridae (Gypsy/Ty3) elements and plant Pseudoviridae (Copia/Ty1) elements, complementing our work with Orthoretrovirus HMMs. The great variation in LTR length of plant Metaviridae and the few divergent animal Pseudoviridae prevented building HMMs from both of these groups. Animal Metaviridae LTRs had the same conserved motifs as retroviral LTRs, confirming that the two groups are closely related. The conserved motifs were the short inverted repeats (SIRs), integrase recognition signals (5´TGTTRNR…YNYAACA 3´); the polyadenylation signal or AATAAA motif; a GT-rich stretch downstream of the polyadenylation signal; and a less conserved AT-rich stretch corresponding to the core promoter element, the TATA box. Plant Pseudoviridae LTRs differed slightly in having a conserved TATA-box, TATATA, but no conserved polyadenylation signal, plus a much shorter R region. The sensitivity of the HMMs for detection in genomic sequences was around 50% for most models, at a relatively high specificity, suitable for genome screening. The HMMs yielded consensus sequences, which were aligned by creating an HMM model (a ‘Superviterbi’ alignment). This yielded a phylogenetic tree that was compared with a Pol-based tree. Both LTR and Pol trees supported monophyly of retroviruses. In both, Pseudoviridae was ancestral to all other LTR retrotransposons. However, the LTR trees showed the chromovirus portion of Metaviridae clustering together with Pseudoviridae, dividing Metaviridae into two portions with distinct phylogeny. Conclusion The HMMs
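
    The integrase recognition signals quoted above (TGTTRNR and YNYAACA) are written in IUPAC nucleotide codes, where R is A or G, Y is C or T, and N is any base. The following is a minimal sketch, not the HMM approach used in the paper, of how such a consensus could be scanned for with a regular expression. The function names `iupac_to_regex` and `find_sites` are hypothetical, and only the IUPAC codes needed for these two motifs are included.

```python
import re

# Subset of the IUPAC nucleotide code sufficient for the two motifs above.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]",    # purine
    "Y": "[CT]",    # pyrimidine
    "N": "[ACGT]",  # any base
}


def iupac_to_regex(pattern: str) -> str:
    """Translate an IUPAC consensus string into a regular expression."""
    return "".join(IUPAC[code] for code in pattern.upper())


def find_sites(sequence: str, pattern: str) -> list[int]:
    """Return 0-based start positions of all (possibly overlapping) matches."""
    # A lookahead lets overlapping occurrences be reported as well.
    regex = re.compile("(?=(" + iupac_to_regex(pattern) + "))")
    return [m.start() for m in regex.finditer(sequence.upper())]


if __name__ == "__main__":
    print(iupac_to_regex("TGTTRNR"))
    print(find_sites("GGTGTTACGCC", "TGTTRNR"))
    print(find_sites("AACACAACATT", "YNYAACA"))
```

    An exact-consensus scan like this is far less tolerant of variation than a profile HMM, which is presumably why the authors built HMMs for these highly variable elements.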

  10. The glycine-rich motif of Pyrococcus abyssi DNA polymerase D is critical for protein stability.

    PubMed

    Castrec, Benoît; Laurent, Sébastien; Henneke, Ghislaine; Flament, Didier; Raffin, Jean-Paul

    2010-03-05

    A glycine-rich motif described as being involved in human polymerase delta proliferating cell nuclear antigen (PCNA) binding has also been identified in all euryarchaeal DNA polymerase D (Pol D) family members. We redefined the motif as the (G)-PYF box. In the present study, Pol D (G)-PYF box motif mutants from Pyrococcus abyssi were generated to investigate its role in functional interactions with the cognate PCNA. We demonstrated that this motif is not essential for interactions between PabPol D (P. abyssi Pol D) and PCNA, using surface plasmon resonance and primer extension studies. Interestingly, the (G)-PYF box is located in a hydrophobic region close to the active site. The (G)-PYF box mutants exhibited altered DNA binding properties. In addition, the thermal stability of all mutants was reduced compared to that of wild type, and this effect could be attributed to increased exposure of the hydrophobic region. These studies suggest that the (G)-PYF box motif mediates intersubunit interactions and that it may be crucial for the thermostability of PabPol D. (c) 2010 Elsevier Ltd. All rights reserved.

  11. A two-helix motif positions the active site of lysophosphatidic acid acyltransferase for catalysis within the membrane bilayer

    PubMed Central

    Robertson, Rosanna M.; Yao, Jiangwei; Gajewski, Stefan; Kumar, Gyanendra; Martin, Erik W.; Rock, Charles O.; White, Stephen W.

    2017-01-01

    Phosphatidic acid is the central intermediate in membrane phospholipid synthesis and is generated by two acyltransferases in a pathway conserved in all life forms. The second step in this pathway is catalyzed by 1-acyl-sn-glycero-3-phosphate acyltransferase, called PlsC in bacteria. The crystal structure of PlsC from Thermotoga maritima reveals an unusual hydrophobic/aromatic N-terminal two-helix motif linked to an acyltransferase αβ domain that contains the catalytic HX4D motif. PlsC dictates the acyl chain composition of the 2-position of phospholipids, and the acyl chain selectivity ‘ruler’ is an appropriately placed and closed hydrophobic tunnel. This was confirmed by site-directed mutagenesis and membrane composition analysis of Escherichia coli cells expressing the mutated proteins. MD simulations reveal that the two-helix motif represents a novel substructure that firmly anchors the protein to one leaflet of the membrane. This binding mode allows the PlsC active site to acylate lysophospholipids within the membrane bilayer using soluble acyl donors. PMID:28714993
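
    The HX4D notation above denotes a histidine, any four residues, then an aspartate. As an illustration only, the sketch below scans a one-letter protein sequence for that pattern. The helper name `find_hx4d` and the example sequence are made up for demonstration and are not taken from the PlsC structure.

```python
import re

# H, then any four residues, then D. The lookahead reports overlapping hits.
HX4D = re.compile(r"(?=(H.{4}D))")


def find_hx4d(protein: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched hexapeptide) for every HX4D occurrence."""
    return [(m.start(), m.group(1)) for m in HX4D.finditer(protein.upper())]


if __name__ == "__main__":
    # Toy sequence, not a real acyltransferase.
    print(find_hx4d("MAHAAAADKKHGGGGD"))
```

    A match on sequence alone does not imply catalytic function. In the abstract the motif is identified within the context of the acyltransferase αβ domain.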

  12. Identifying designatable units for intraspecific conservation prioritization: a hierarchical approach applied to the lake whitefish species complex (Coregonus spp.)

    PubMed Central

    Mee, Jonathan A; Bernatchez, Louis; Reist, Jim D; Rogers, Sean M; Taylor, Eric B

    2015-01-01

    The concept of the designatable unit (DU) affords a practical approach to identifying diversity below the species level for conservation prioritization. However, its suitability for defining conservation units in ecologically diverse, geographically widespread and taxonomically challenging species complexes has not been broadly evaluated. The lake whitefish species complex (Coregonus spp.) is geographically widespread in the Northern Hemisphere, and it contains a great deal of variability in ecology and evolutionary legacy within and among populations, as well as a great deal of taxonomic ambiguity. Here, we employ a set of hierarchical criteria to identify DUs within the Canadian distribution of the lake whitefish species complex. We identified 36 DUs based on (i) reproductive isolation, (ii) phylogeographic groupings, (iii) local adaptation and (iv) biogeographic regions. The identification of DUs is required for clear discussion regarding the conservation prioritization of lake whitefish populations. We suggest conservation priorities among lake whitefish DUs based on biological consequences of extinction, risk of extinction and distinctiveness. Our results exemplify the need for extensive genetic and biogeographic analyses for any species with broad geographic distributions and the need for detailed evaluation of evolutionary history and adaptive ecological divergence when defining intraspecific conservation units. PMID:26029257

  13. cWINNOWER Algorithm for Finding Fuzzy DNA Motifs

    NASA Technical Reports Server (NTRS)

    Liang, Shoudan

    2003-01-01

    The cWINNOWER algorithm detects fuzzy motifs in DNA sequences rich in protein-binding signals. A signal is defined as any short nucleotide pattern having up to d mutations differing from a motif of length l. The algorithm finds such motifs if multiple mutated copies of the motif (i.e., the signals) are present in the DNA sequence in sufficient abundance. The cWINNOWER algorithm substantially improves the sensitivity of the winnower method of Pevzner and Sze by imposing a consensus constraint, enabling it to detect much weaker signals. We studied the minimum number of detectable motifs qc as a function of sequence length N for random sequences. We found that qc increases linearly with N for a fast version of the algorithm based on counting three-member sub-cliques. Imposing consensus constraints reduces qc, by a factor of three in this case, which makes the algorithm dramatically more sensitive. Our most sensitive algorithm, which counts four-member sub-cliques, needs a minimum of only 13 signals to detect motifs in a sequence of length N = 12000 for (l,d) = (15,4).
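
    The signal definition above (a window of length l differing from the motif in at most d positions) can be made concrete with a Hamming-distance scan. The sketch below illustrates only that definition. It assumes the motif is already known, whereas cWINNOWER discovers an unknown motif through clique finding with a consensus constraint, which is not reproduced here. The names `hamming` and `find_signals` are hypothetical.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def find_signals(sequence: str, motif: str, d: int) -> list[int]:
    """Return 0-based starts of windows within d mismatches of the motif.

    With l = len(motif), each reported window is an (l, d) signal in the
    sense used above.
    """
    l = len(motif)
    return [
        i
        for i in range(len(sequence) - l + 1)
        if hamming(sequence[i:i + l], motif) <= d
    ]


if __name__ == "__main__":
    # Toy example with (l, d) = (4, 1).
    print(find_signals("ACGTACGTTT", "ACGA", 1))
```

    For the (15,4) case mentioned in the abstract, a window would count as a signal if it matched the motif in at least 11 of its 15 positions.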

  14. Transient α-helices in the disordered RPEL motifs of the serum response factor coactivator MKL1

    NASA Astrophysics Data System (ADS)

    Mizuguchi, Mineyuki; Fuju, Takahiro; Obita, Takayuki; Ishikawa, Mitsuru; Tsuda, Masaaki; Tabuchi, Akiko

    2014-06-01

    The megakaryoblastic leukemia 1 (MKL1) protein functions as a transcriptional coactivator of the serum response factor. MKL1 has three RPEL motifs (RPEL1, RPEL2, and RPEL3) in its N-terminal region. MKL1 binds to monomeric G-actin through RPEL motifs, and the dissociation of MKL1 from G-actin promotes the translocation of MKL1 to the nucleus. Although structural data are available for RPEL motifs of MKL1 in complex with G-actin, the structural characteristics of RPEL motifs in the free state have been poorly defined. Here we characterized the structures of free RPEL motifs using NMR and CD spectroscopy. NMR and CD measurements showed that free RPEL motifs are largely unstructured in solution. However, NMR analysis identified transient α-helices in the regions where helices α1 and α2 are induced upon binding to G-actin. Proline mutagenesis showed that the transient α-helices are locally formed without helix-helix interactions. The helix content is higher in the order of RPEL1, RPEL2, and RPEL3. The amount of preformed structure may correlate with the binding affinity between the intrinsically disordered protein and its target molecule.

  15. Conserved structures formed by heterogeneous RNA sequences drive silencing of an inflammation responsive post-transcriptional operon

    PubMed Central

    Basu, Abhijit; Jain, Niyati; Tolbert, Blanton S.; Komar, Anton A.

    2017-01-01

    RNA–protein interactions with physiological outcomes usually rely on conserved sequences within the RNA element. By contrast, activity of the diverse gamma-interferon-activated inhibitor of translation (GAIT)-elements relies on the conserved RNA folding motifs rather than the conserved sequence motifs. These elements drive the translational silencing of a group of chemokine (CC/CXC) and chemokine receptor (CCR) mRNAs, thereby helping to resolve physiological inflammation. Despite sequence dissimilarity, these RNA elements adopt common secondary structures (as revealed by 2D-1H NMR spectroscopy), providing a basis for their interaction with the RNA-binding GAIT complex. However, many of these elements (e.g. those derived from CCL22, CXCL13, CCR4 and ceruloplasmin (Cp) mRNAs) have substantially different affinities for GAIT complex binding. Toeprinting analysis shows that different positions within the overall conserved GAIT element structure contribute to differential affinities of the GAIT protein complex towards the elements. Thus, heterogeneity of GAIT elements may provide hierarchical fine-tuning of the resolution of inflammation. PMID:29069516

  16. TrawlerWeb: an online de novo motif discovery tool for next-generation sequencing datasets.

    PubMed

    Dang, Louis T; Tondl, Markus; Chiu, Man Ho H; Revote, Jerico; Paten, Benedict; Tano, Vincent; Tokolyi, Alex; Besse, Florence; Quaife-Ryan, Greg; Cumming, Helen; Drvodelic, Mark J; Eichenlaub, Michael P; Hallab, Jeannette C; Stolper, Julian S; Rossello, Fernando J; Bogoyevitch, Marie A; Jans, David A; Nim, Hieu T; Porrello, Enzo R; Hudson, James E; Ramialison, Mirana

    2018-04-05

    A strong focus of the post-genomic era is mining of the non-coding regulatory genome in order to unravel the function of regulatory elements that coordinate gene expression (Nat 489:57-74, 2012; Nat 507:462-70, 2014; Nat 507:455-61, 2014; Nat 518:317-30, 2015). Whole-genome approaches based on next-generation sequencing (NGS) have provided insight into the genomic location of regulatory elements throughout different cell types, organs and organisms. These technologies are now widespread and commonly used in laboratories from various fields of research. This highlights the need for fast and user-friendly software tools dedicated to extracting cis-regulatory information contained in these regulatory regions; for instance transcription factor binding site (TFBS) composition. Ideally, such tools should not require prior programming knowledge to ensure they are accessible for all users. We present TrawlerWeb, a web-based version of the Trawler_standalone tool (Nat Methods 4:563-5, 2007; Nat Protoc 5:323-34, 2010), to allow for the identification of enriched motifs in DNA sequences obtained from next-generation sequencing experiments in order to predict their TFBS composition. TrawlerWeb is designed for online queries with standard options common to web-based motif discovery tools. In addition, TrawlerWeb provides three unique new features: 1) TrawlerWeb allows the input of BED files directly generated from NGS experiments, 2) it automatically generates an input-matched biologically relevant background, and 3) it displays resulting conservation scores for each instance of the motif found in the input sequences, which assists the researcher in prioritising the motifs to validate experimentally. Finally, to date, this web-based version of Trawler_standalone remains the fastest online de novo motif discovery tool compared to other popular web-based software, while generating predictions with high accuracy. TrawlerWeb provides users with a fast, simple and easy-to-use web

  17. Modeling gene regulatory network motifs using statecharts

    PubMed Central

    2012-01-01

    Background Gene regulatory networks are widely used by biologists to describe the interactions among genes, proteins and other components at the intra-cellular level. Recently, a great effort has been devoted to give gene regulatory networks a formal semantics based on existing computational frameworks. For this purpose, we consider Statecharts, which are a modular, hierarchical and executable formal model widely used to represent software systems. We use Statecharts for modeling small and recurring patterns of interactions in gene regulatory networks, called motifs. Results We present an improved method for modeling gene regulatory network motifs using Statecharts and we describe the successful modeling of several motifs, including those which could not be modeled or whose models could not be distinguished using the method of a previous proposal. We model motifs in an easy and intuitive way by taking advantage of the visual features of Statecharts. Our modeling approach is able to simulate some interesting temporal properties of gene regulatory network motifs: the delay in the activation and the deactivation of the "output" gene in the coherent type-1 feedforward loop, the pulse in the incoherent type-1 feedforward loop, the bistability nature of double positive and double negative feedback loops, the oscillatory behavior of the negative feedback loop, and the "lock-in" effect of positive autoregulation. Conclusions We present a Statecharts-based approach for the modeling of gene regulatory network motifs in biological systems. The basic motifs used to build more complex networks (that is, simple regulation, reciprocal regulation, feedback loop, feedforward loop, and autoregulation) can be faithfully described and their temporal dynamics can be analyzed. PMID:22536967

  18. FPGA implementation of motifs-based neuronal network and synchronization analysis

    NASA Astrophysics Data System (ADS)

    Deng, Bin; Zhu, Zechen; Yang, Shuangming; Wei, Xile; Wang, Jiang; Yu, Haitao

    2016-06-01

    Motifs in complex networks play a crucial role in determining brain function. In this paper, 13 kinds of motifs are implemented on a Field Programmable Gate Array (FPGA) to investigate the relationships between network properties and motif properties. We use a discretization method and a pipelined architecture to construct various motifs with the Hindmarsh-Rose (HR) neuron as the node model. We also build a small-world network based on these motifs and conduct a synchronization analysis of the motifs as well as the constructed network. We find that the synchronization properties of the motifs determine those of the motif-based small-world network, which demonstrates the effectiveness of our proposed hardware simulation platform. By imitating some vital nuclei in the brain to generate normal discharges, our proposed FPGA-based artificial neuronal networks have the potential to replace injured nuclei and complete brain function in the treatment of Parkinson's disease and epilepsy.

  19. TFII-I regulates target genes in the PI-3K and TGF-β signaling pathways through a novel DNA binding motif.

    PubMed

    Segura-Puimedon, Maria; Borralleras, Cristina; Pérez-Jurado, Luis A; Campuzano, Victoria

    2013-09-25

    General transcription factor (TFII-I) is a multi-functional protein involved in the transcriptional regulation of critical developmental genes, encoded by the GTF2I gene located on chromosome 7q11.23. Haploinsufficiency at GTF2I has been shown to play a major role in the neurodevelopmental features of Williams-Beuren syndrome (WBS). Identification of genes regulated by TFII-I is thus critical to detect molecular determinants of WBS as well as to identify potential new targets for specific pharmacological interventions, which are currently absent. We performed a microarray screening for transcriptional targets of TFII-I in cortex and embryonic cells from Gtf2i mutant and wild-type mice. Candidate genes with altered expression were verified using real-time PCR. A novel motif shared by deregulated genes was found and chromatin immunoprecipitation assays in embryonic fibroblasts were used to document in vitro TFII-I binding to this motif in the promoter regions of deregulated genes. Interestingly, the PI3K and TGFβ signaling pathways were over-represented among TFII-I-modulated genes. In this study we have found a highly conserved DNA element, common to a set of genes regulated by TFII-I, and identified and validated novel in vivo neuronal targets of this protein affecting the PI3K and TGFβ signaling pathways. Overall, our data further contribute to unravel the complexity and variability of the different genetic programs orchestrated by TFII-I. © 2013 Elsevier B.V. All rights reserved.

  20. Conserved Noncoding Elements in the Most Distant Genera of Cephalochordates: The Goldilocks Principle

    PubMed Central

    Yue, Jia-Xing; Kozmikova, Iryna; Ono, Hiroki; Nossa, Carlos W.; Kozmik, Zbynek; Putnam, Nicholas H.; Yu, Jr-Kai; Holland, Linda Z.

    2016-01-01

    Cephalochordates, the sister group of vertebrates + tunicates, are evolving particularly slowly. Therefore, genome comparisons between two congeners of Branchiostoma revealed so many conserved noncoding elements (CNEs), that it was not clear how many are functional regulatory elements. To more effectively identify CNEs with potential regulatory functions, we compared noncoding sequences of genomes of the most phylogenetically distant cephalochordate genera, Asymmetron and Branchiostoma, which diverged approximately 120–160 million years ago. We found 113,070 noncoding elements conserved between the two species, amounting to 3.3% of the genome. The genomic distribution, target gene ontology, and enriched motifs of these CNEs all suggest that many of them are probably cis-regulatory elements. More than 90% of previously verified amphioxus regulatory elements were re-captured in this study. A search of the cephalochordate CNEs around 50 developmental genes in several vertebrate genomes revealed eight CNEs conserved between cephalochordates and vertebrates, indicating sequence conservation over >500 million years of divergence. The function of five CNEs was tested in reporter assays in zebrafish, and one was also tested in amphioxus. All five CNEs proved to be tissue-specific enhancers. Taken together, these findings indicate that even though Branchiostoma and Asymmetron are distantly related, as they are evolving slowly, comparisons between them are likely optimal for identifying most of their tissue-specific cis-regulatory elements laying the foundation for functional characterizations and a better understanding of the evolution of developmental regulation in cephalochordates. PMID:27412606

  1. Functional synthetic Antennapedia genes and the dual roles of YPWM motif and linker size in transcriptional activation and repression

    PubMed Central

    Papadopoulos, Dimitrios K.; Reséndez-Pérez, Diana; Cárdenas-Chávez, Diana L.; Villanueva-Segura, Karina; Canales-del-Castillo, Ricardo; Felix, Daniel A.; Fünfschilling, Raphael; Gehring, Walter J.

    2011-01-01

    Segmental identity along the anteroposterior axis of bilateral animals is specified by Hox genes. These genes encode transcription factors, harboring the conserved homeodomain and, generally, a YPWM motif, which binds Hox cofactors and increases Hox transcriptional specificity in vivo. Here we derive synthetic Drosophila Antennapedia genes, consisting only of the YPWM motif and homeodomain, and investigate their functional role throughout development. Synthetic peptides and full-length Antennapedia proteins cause head-to-thorax transformations in the embryo, as well as antenna-to-tarsus and eye-to-wing transformations in the adult, thus converting the entire head to a mesothorax. This conversion is achieved by repression of genes required for head and antennal development and ectopic activation of genes promoting thoracic and tarsal fates, respectively. Synthetic Antennapedia peptides bind DNA specifically and interact with Extradenticle and Bric-à-brac interacting protein 2 cofactors in vitro and ex vivo. Substitution of the YPWM motif by alanines abolishes Antennapedia homeotic function, whereas substitution of YPWM by the WRPW repressor motif, which binds the transcriptional corepressor Groucho, allows all proteins to act as repressors only. Finally, naturally occurring variations in the size of the linker between the homeodomain and YPWM motif enhance Antennapedia repressive or activating efficiency, emphasizing the importance of linker size, rather than sequence, for specificity. Our results clearly show that synthetic Antennapedia genes are functional in vivo and therefore provide powerful tools for synthetic biology. Moreover, the YPWM motif is necessary—whereas the entire N terminus of the protein is dispensable—for Antennapedia homeotic function, indicating its dual role in transcriptional activation and repression by recruiting either coactivators or corepressors. PMID:21712439

  2. A flexible motif search technique based on generalized profiles.

    PubMed

    Bucher, P; Karplus, K; Moeri, N; Hofmann, K

    1996-03-01

    A flexible motif search technique is presented which has two major components: (1) a generalized profile syntax serving as a motif definition language; and (2) a motif search method specifically adapted to the problem of finding multiple instances of a motif in the same sequence. The new profile structure, which is the core of the generalized profile syntax, combines the functions of a variety of motif descriptors implemented in other methods, including regular expression-like patterns, weight matrices, previously used profiles, and certain types of hidden Markov models (HMMs). The relationship between generalized profiles and other biomolecular motif descriptors is analyzed in detail, with special attention to HMMs. Generalized profiles are shown to be equivalent to a particular class of HMMs, and conversion procedures in both directions are given. The conversion procedures provide an interpretation for local alignment in the framework of stochastic models, allowing for clear, simple significance tests. A mathematical statement of the motif search problem defines the new method exactly without linking it to a specific algorithmic solution. Part of the definition includes a new definition of disjointness of alignments.

  3. Identification of a novel mitotic phosphorylation motif associated with protein localization to the mitotic apparatus

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yang, Feng; Camp, David G.; Gritsenko, Marina A.

    2007-11-16

    The chromosomal passenger complex (CPC) is a critical regulator of chromosome, cytoskeleton and membrane dynamics during mitosis. Here, we identified phosphopeptides and phosphoprotein complexes recognized by a phosphorylation specific antibody that labels the CPC using liquid chromatography coupled to mass spectrometry. A mitotic phosphorylation motif (PX{G/T/S}{L/M}[pS]P or WGL[pS]P) was identified in 11 proteins including Fzr/Cdh1 and RIC-8, two proteins with potential links to the CPC. Phosphoprotein complexes contained known CPC components INCENP, Aurora-B and TD-60, as well as SMAD2, 14-3-3 proteins, PP2A, and Cdk1, a likely kinase for this motif. Protein sequence analysis identified phosphorylation motifs in additional proteins including SMAD2, Plk3 and INCENP. Mitotic SMAD2 and Plk3 phosphorylation was confirmed using phosphorylation specific antibodies, and in the case of Plk3, phosphorylation correlates with its localization to the mitotic apparatus. A mutagenesis approach was used to show INCENP phosphorylation is required for midbody localization. These results provide evidence for a shared phosphorylation event that regulates localization of critical proteins during mitosis.

  4. The PDZ-binding motif of Yes-associated protein is required for its co-activation of TEAD-mediated CTGF transcription and oncogenic cell transforming activity

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shimomura, Tadanori; Miyamura, Norio; Hata, Shoji

    2014-01-17

    Highlights: • Loss of the PDZ-binding motif inhibits constitutively active YAP (5SA)-induced oncogenic cell transformation. • The PDZ-binding motif of YAP promotes its nuclear localization in cultured cells and mouse liver. • Loss of the PDZ-binding motif inhibits YAP (5SA)-induced CTGF transcription in cultured cells and mouse liver. -- Abstract: YAP is a transcriptional co-activator that acts downstream of the Hippo signaling pathway and regulates multiple cellular processes, including proliferation. Hippo pathway-dependent phosphorylation of YAP negatively regulates its function. Conversely, attenuation of Hippo-mediated phosphorylation of YAP increases its ability to stimulate proliferation and eventually induces oncogenic transformation. The C-terminus of YAP contains a highly conserved PDZ-binding motif that regulates YAP’s functions in multiple ways. However, to date, the importance of the PDZ-binding motif to the oncogenic cell transforming activity of YAP has not been determined. In this study, we disrupted the PDZ-binding motif in the YAP (5SA) protein, in which the sites normally targeted by Hippo pathway-dependent phosphorylation are mutated. We found that loss of the PDZ-binding motif significantly inhibited the oncogenic transformation of cultured cells induced by YAP (5SA). In addition, the increased nuclear localization of YAP (5SA) and its enhanced activation of TEAD-dependent transcription of the cell proliferation gene CTGF were strongly reduced when the PDZ-binding motif was deleted. Similarly, in mouse liver, deletion of the PDZ-binding motif suppressed nuclear localization of YAP (5SA) and YAP (5SA)-induced CTGF expression. Taken together, our results indicate that the PDZ-binding motif of YAP is critical for YAP-mediated oncogenesis, and that this effect is mediated by YAP’s co-activation of TEAD-mediated CTGF transcription.

  5. Direct AUC optimization of regulatory motifs.

    PubMed

    Zhu, Lin; Zhang, Hong-Bo; Huang, De-Shuang

    2017-07-15

    The discovery of transcription factor binding site (TFBS) motifs is essential for untangling the complex mechanism of genetic variation under different developmental and environmental conditions. Among the large number of computational approaches for de novo identification of TFBS motifs, discriminative motif learning (DML) methods have proven promising for harnessing the discovery power of the large amount of accumulated high-throughput binding data. However, they have to sacrifice accuracy for speed and could fail to fully utilize the information of the input sequences. We propose a novel algorithm called CDAUC for optimizing DML-learned motifs based on the area under the receiver-operating characteristic curve (AUC) criterion, which has been widely used in the literature to evaluate the significance of extracted motifs. We show that when the considered AUC loss function is optimized in a coordinate-wise manner, the cost function of each resultant sub-problem is a piece-wise constant function, whose optimal value can be found exactly and efficiently. Further, a key step of each iteration of CDAUC can be efficiently solved as a computational geometry problem. Experimental results on real world high-throughput datasets illustrate that CDAUC outperforms competing methods for refining DML motifs, while being one order of magnitude faster. Meanwhile, preliminary results also show that CDAUC may be useful for improving the interpretability of convolutional kernels generated by the emerging deep learning approaches for predicting TF sequence specificities. CDAUC is available at: https://drive.google.com/drive/folders/0BxOW5MtIZbJjNFpCeHlBVWJHeW8 . Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press. All rights reserved.
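
    The AUC criterion mentioned above has a simple pairwise interpretation: it is the fraction of (bound, unbound) sequence pairs in which the bound sequence receives the higher motif score, with ties counted as one half. The sketch below computes that quantity directly. It illustrates the evaluation criterion only, not CDAUC's coordinate-wise optimization, and the function name `auc` and the example scores are hypothetical.

```python
def auc(pos_scores: list[float], neg_scores: list[float]) -> float:
    """Area under the ROC curve via pairwise comparison of scores.

    pos_scores: motif scores of sequences labelled as bound.
    neg_scores: motif scores of background (unbound) sequences.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie counts as half a correct ranking
    return wins / (len(pos_scores) * len(neg_scores))


if __name__ == "__main__":
    # Made-up scores for illustration only.
    print(auc([0.9, 0.3], [0.5, 0.1]))
```

    Because this value only changes when the ordering of a positive and a negative score flips, it is piece-wise constant in any single motif parameter, which is consistent with the property the abstract describes for each coordinate-wise sub-problem.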

  6. Reversibly bound chloride in the atrial natriuretic peptide receptor hormone-binding domain: possible allosteric regulation and a conserved structural motif for the chloride-binding site.

    PubMed

    Ogawa, Haruo; Qiu, Yue; Philo, John S; Arakawa, Tsutomu; Ogata, Craig M; Misono, Kunio S

    2010-03-01

    The binding of atrial natriuretic peptide (ANP) to its receptor requires chloride, and it is chloride concentration dependent. The extracellular domain (ECD) of the ANP receptor (ANPR) contains a chloride near the ANP-binding site, suggesting a possible regulatory role. The bound chloride, however, is completely buried in the polypeptide fold, and its functional role has remained unclear. Here, we have confirmed that chloride is necessary for ANP binding to the recombinant ECD or the full-length ANPR expressed in CHO cells. ECD without chloride (ECD(-)) did not bind ANP. Its binding activity was fully restored by bromide or chloride addition. A new X-ray structure of the bromide-bound ECD is essentially identical to that of the chloride-bound ECD. Furthermore, bromide atoms are localized at the same positions as chloride atoms both in the apo and in the ANP-bound structures, indicating exchangeable and reversible halide binding. Far-UV CD and thermal unfolding data show that ECD(-) largely retains the native structure. Sedimentation equilibrium in the absence of chloride shows that ECD(-) forms a strongly associated dimer, possibly preventing the structural rearrangement of the two monomers that is necessary for ANP binding. The primary and tertiary structures of the chloride-binding site in ANPR are highly conserved among receptor-guanylate cyclases and metabotropic glutamate receptors. The chloride-dependent ANP binding, reversible chloride binding, and the highly conserved chloride-binding site motif suggest a regulatory role for the receptor bound chloride. Chloride-dependent regulation of ANPR may operate in the kidney, modulating ANP-induced natriuresis.

  7. Reversibly Bound Chloride in the Atrial Natriuretic Peptide Receptor Hormone Binding Domain: Possible Allosteric Regulation and a Conserved Structural Motif for the Chloride-binding Site

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ogawa, H.; Qiu, Y; Philo, J

    2010-01-01

    The binding of atrial natriuretic peptide (ANP) to its receptor requires chloride, and it is chloride concentration dependent. The extracellular domain (ECD) of the ANP receptor (ANPR) contains a chloride near the ANP-binding site, suggesting a possible regulatory role. The bound chloride, however, is completely buried in the polypeptide fold, and its functional role has remained unclear. Here, we have confirmed that chloride is necessary for ANP binding to the recombinant ECD or the full-length ANPR expressed in CHO cells. ECD without chloride (ECD(-)) did not bind ANP. Its binding activity was fully restored by bromide or chloride addition. A new X-ray structure of the bromide-bound ECD is essentially identical to that of the chloride-bound ECD. Furthermore, bromide atoms are localized at the same positions as chloride atoms both in the apo and in the ANP-bound structures, indicating exchangeable and reversible halide binding. Far-UV CD and thermal unfolding data show that ECD(-) largely retains the native structure. Sedimentation equilibrium in the absence of chloride shows that ECD(-) forms a strongly associated dimer, possibly preventing the structural rearrangement of the two monomers that is necessary for ANP binding. The primary and tertiary structures of the chloride-binding site in ANPR are highly conserved among receptor-guanylate cyclases and metabotropic glutamate receptors. The chloride-dependent ANP binding, reversible chloride binding, and the highly conserved chloride-binding site motif suggest a regulatory role for the receptor bound chloride. Chloride-dependent regulation of ANPR may operate in the kidney, modulating ANP-induced natriuresis.

  8. Reversibly bound chloride in the atrial natriuretic peptide receptor hormone-binding domain: Possible allosteric regulation and a conserved structural motif for the chloride-binding site

    PubMed Central

    Ogawa, Haruo; Qiu, Yue; Philo, John S; Arakawa, Tsutomu; Ogata, Craig M; Misono, Kunio S

    2010-01-01

    The binding of atrial natriuretic peptide (ANP) to its receptor requires chloride, and it is chloride concentration dependent. The extracellular domain (ECD) of the ANP receptor (ANPR) contains a chloride near the ANP-binding site, suggesting a possible regulatory role. The bound chloride, however, is completely buried in the polypeptide fold, and its functional role has remained unclear. Here, we have confirmed that chloride is necessary for ANP binding to the recombinant ECD or the full-length ANPR expressed in CHO cells. ECD without chloride (ECD(−)) did not bind ANP. Its binding activity was fully restored by bromide or chloride addition. A new X-ray structure of the bromide-bound ECD is essentially identical to that of the chloride-bound ECD. Furthermore, bromide atoms are localized at the same positions as chloride atoms both in the apo and in the ANP-bound structures, indicating exchangeable and reversible halide binding. Far-UV CD and thermal unfolding data show that ECD(−) largely retains the native structure. Sedimentation equilibrium in the absence of chloride shows that ECD(−) forms a strongly associated dimer, possibly preventing the structural rearrangement of the two monomers that is necessary for ANP binding. The primary and tertiary structures of the chloride-binding site in ANPR are highly conserved among receptor-guanylate cyclases and metabotropic glutamate receptors. The chloride-dependent ANP binding, reversible chloride binding, and the highly conserved chloride-binding site motif suggest a regulatory role for the receptor bound chloride. Chloride-dependent regulation of ANPR may operate in the kidney, modulating ANP-induced natriuresis. PMID:20066666

  9. A freshwater biodiversity hotspot under pressure - assessing threats and identifying conservation needs for ancient Lake Ohrid

    NASA Astrophysics Data System (ADS)

    Kostoski, G.; Albrecht, C.; Trajanovski, S.; Wilke, T.

    2010-07-01

    Freshwater habitats and species living in freshwater are generally more prone to extinction than terrestrial or marine ones. Immediate conservation measures for world-wide freshwater resources are thus of eminent importance. This is particularly true for so called ancient lakes. While these lakes are famous for being evolutionary theatres, often displaying an extraordinarily high degree of biodiversity and endemism, in many cases these biota are also experiencing extreme anthropogenic impact. Lake Ohrid, the European biodiversity hotspot, is a prime example for a lake with a magnitude of narrow range endemic taxa that are under increasing anthropogenic pressure. Unfortunately, evidence for a "creeping biodiversity crisis" has accumulated over the last decades, and major socio-political changes have gone along with human-mediated environmental changes. Based on field surveys, monitoring data, published records, and expert interviews, we aimed to (1) assess threats to Lake Ohrid's (endemic) biodiversity, (2) summarize existing conservation activities and strategies, and (3) outline future conservation needs for Lake Ohrid. We compiled threats to both specific taxa (and in cases to particular species) as well as to the lake ecosystem itself. Major conservation concerns identified for Lake Ohrid are: (1) watershed impacts, (2) agriculture and forestry, (3) tourism and population growth, (4) non-indigenous species, (5) habitat alteration or loss, (6) unsustainable exploitation of fisheries, and (7) global climate change. Of the 11 IUCN (International Union for Conservation of Nature and Natural Resources) threat classes scored, seven have moderate and three severe impacts. These latter threat classes are energy production and mining, biological resource use, and pollution. We review and discuss institutional responsibilities, environmental monitoring and ecosystem management, existing parks and reserves, biodiversity and species measures, international conservation

  10. A freshwater biodiversity hotspot under pressure - assessing threats and identifying conservation needs for ancient Lake Ohrid

    NASA Astrophysics Data System (ADS)

    Kostoski, G.; Albrecht, C.; Trajanovski, S.; Wilke, T.

    2010-12-01

    Immediate conservation measures for world-wide freshwater resources are of eminent importance. This is particularly true for so-called ancient lakes. While these lakes are famous for being evolutionary theatres, often displaying an extraordinarily high degree of biodiversity and endemism, in many cases these biota are also experiencing extreme anthropogenic impact. Lake Ohrid, a major European biodiversity hotspot situated in a trans-frontier setting on the Balkans, is a prime example for a lake with a magnitude of narrow range endemic taxa that are under increasing anthropogenic pressure. Unfortunately, evidence for a "creeping biodiversity crisis" has accumulated over the last decades, and major socio-political changes have gone along with human-mediated environmental changes. Based on field surveys, monitoring data, published records, and expert interviews, we aimed to (1) assess threats to Lake Ohrid's (endemic) biodiversity, (2) summarize existing conservation activities and strategies, and (3) outline future conservation needs for Lake Ohrid. We compiled threats to both specific taxa (and in cases to particular species) as well as to the lake ecosystem itself. Major conservation concerns identified for Lake Ohrid are: (1) watershed impacts, (2) agriculture and forestry, (3) tourism and population growth, (4) non-indigenous species, (5) habitat alteration or loss, (6) unsustainable exploitation of fisheries, and (7) global climate change. Among the major (well-known) threats with high impact are nutrient input (particularly of phosphorus), habitat conversion and silt load. Other threats are potentially of high impact but less well known. Such threats include pollution with hazardous substances (from sources such as mines, former industries, agriculture) or climate change. We review and discuss institutional responsibilities, environmental monitoring and ecosystem management, existing parks and reserves, biodiversity and species measures, international

  11. Unravelling daily human mobility motifs

    PubMed Central

    Schneider, Christian M.; Belik, Vitaly; Couronné, Thomas; Smoreda, Zbigniew; González, Marta C.

    2013-01-01

    Human mobility is differentiated by time scales. While the mechanism for long time scales has been studied, the underlying mechanism on the daily scale is still unrevealed. Here, we uncover the mechanism responsible for the daily mobility patterns by analysing the temporal and spatial trajectories of thousands of persons as individual networks. Using the concept of motifs from network theory, we find only 17 unique networks are present in daily mobility and they follow simple rules. These networks, called here motifs, are sufficient to capture up to 90 per cent of the population in surveys and mobile phone datasets for different countries. Each individual exhibits a characteristic motif, which seems to be stable over several months. Consequently, daily human mobility can be reproduced by an analytically tractable framework for Markov chains by modelling periods of high-frequency trips followed by periods of lower activity as the key ingredient. PMID:23658117

  12. Global analyses of TetR family transcriptional regulators in mycobacteria indicates conservation across species and diversity in regulated functions.

    PubMed

    Balhana, Ricardo J C; Singla, Ashima; Sikder, Mahmudul Hasan; Withers, Mike; Kendall, Sharon L

    2015-06-27

    Mycobacteria inhabit diverse niches and display high metabolic versatility. They can colonise both humans and animals and are also able to survive in the environment. In order to succeed, response to environmental cues via transcriptional regulation is required. In this study we focused on the TetR family of transcriptional regulators (TFTRs) in mycobacteria. We used InterPro to classify the entire complement of transcriptional regulators in 10 mycobacterial species and these analyses showed that TFTRs are the most abundant family of regulators in all species. We identified those TFTRs that are conserved across all species analysed and those that are unique to the pathogens included in the analysis. We examined genomic contexts of 663 of the conserved TFTRs and observed that the majority of TFTRs are separated by 200 bp or less from divergently oriented genes. Analyses of divergent genes indicated that the TFTRs control diverse biochemical functions not limited to efflux pumps. TFTRs typically bind to palindromic motifs and we identified 11 highly significant novel motifs in the upstream regions of divergently oriented TFTRs. The C-terminal ligand binding domain from the TFTR complement in M. tuberculosis showed great diversity in amino acid sequence but with an overall architecture common to other TFTRs. This study suggests that mycobacteria depend on TFTRs for the transcriptional control of a number of metabolic functions, yet the physiological role of the majority of these regulators remains unknown.
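
    The statement that TFTRs typically bind palindromic motifs can be illustrated with a short check for reverse-complement palindromes. This is only a sketch and not the method used in the study: the function names are hypothetical, and it accepts perfect palindromes only, whereas real operator sites are often imperfect.

```python
# Map each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic(site: str) -> bool:
    """True if the site reads the same on both strands (perfect palindrome)."""
    site = site.upper()
    return site == reverse_complement(site)


def find_palindromes(seq: str, length: int):
    """List (start, site) for every window of `length` that is a perfect palindrome."""
    seq = seq.upper()
    windows = ((i, seq[i:i + length]) for i in range(len(seq) - length + 1))
    return [(i, site) for i, site in windows if is_palindromic(site)]
```

    Scanning an upstream region with several window lengths would give candidate sites; allowing mismatches between the two half-sites would be a natural extension.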

  13. G4 motifs affect origin positioning and efficiency in two vertebrate replicators

    PubMed Central

    Valton, Anne-Laure; Hassan-Zadeh, Vahideh; Lema, Ingrid; Boggetto, Nicole; Alberti, Patrizia; Saintomé, Carole; Riou, Jean-François; Prioleau, Marie-Noëlle

    2014-01-01

    DNA replication ensures the accurate duplication of the genome at each cell cycle. It begins at specific sites called replication origins. Genome-wide studies in vertebrates have recently identified a consensus G-rich motif potentially able to form G-quadruplexes (G4) in most replication origins. However, there is no experimental evidence to demonstrate that G4 are actually required for replication initiation. We show here, with two model origins, that G4 motifs are required for replication initiation. Two G4 motifs cooperate in one of our model origins. The other contains only one critical G4, and its orientation determines the precise position of the replication start site. Point mutations affecting the stability of this G4 in vitro also impair origin function. Finally, this G4 is not sufficient for origin activity and must cooperate with a 200-bp cis-regulatory element. In conclusion, our study strongly supports the predicted essential role of G4 in replication initiation. PMID:24521668

  14. A three-dimensional RNA motif in Potato spindle tuber viroid mediates trafficking from palisade mesophyll to spongy mesophyll in Nicotiana benthamiana.

    PubMed

    Takeda, Ryuta; Petrov, Anton I; Leontis, Neocles B; Ding, Biao

    2011-01-01

    Cell-to-cell trafficking of RNA is an emerging biological principle that integrates systemic gene regulation, viral infection, antiviral response, and cell-to-cell communication. A key mechanistic question is how an RNA is specifically selected for trafficking from one type of cell into another type. Here, we report the identification of an RNA motif in Potato spindle tuber viroid (PSTVd) required for trafficking from palisade mesophyll to spongy mesophyll in Nicotiana benthamiana leaves. This motif, called loop 6, has the sequence 5'-CGA-3'...5'-GAC-3' flanked on both sides by cis Watson-Crick G/C and G/U wobble base pairs. We present a three-dimensional (3D) structural model of loop 6 that specifies all non-Watson-Crick base pair interactions, derived by isostericity-based sequence comparisons with 3D RNA motifs from the RNA x-ray crystal structure database. The model is supported by available chemical modification patterns, natural sequence conservation/variations in PSTVd isolates and related species, and functional characterization of all possible mutants for each of the loop 6 base pairs. Our findings and approaches have broad implications for studying the 3D RNA structural motifs mediating trafficking of diverse RNA species across specific cellular boundaries and for studying the structure-function relationships of RNA motifs in other biological processes.

  15. Multiple TPR motifs characterize the Fanconi anemia FANCG protein.

    PubMed

    Blom, Eric; van de Vrugt, Henri J; de Vries, Yne; de Winter, Johan P; Arwert, Fré; Joenje, Hans

    2004-01-05

    The genome protection pathway that is defective in patients with Fanconi anemia (FA) is controlled by at least eight genes, including BRCA2. A key step in the pathway involves the monoubiquitylation of FANCD2, which critically depends on a multi-subunit nuclear 'core complex' of at least six FANC proteins (FANCA, -C, -E, -F, -G, and -L). Except for FANCL, which has WD40 repeats and a RING finger domain, no significant domain structure has so far been recognized in any of the core complex proteins. By using a homology search strategy comparing the human FANCG protein sequence with its ortholog sequences in Oryzias latipes (Japanese rice fish) and Danio rerio (zebrafish) we identified at least seven tetratricopeptide repeat motifs (TPRs) covering a major part of this protein. TPRs are degenerate 34-amino acid repeat motifs which function as scaffolds mediating protein-protein interactions, often found in multiprotein complexes. In four out of five TPR motifs tested (TPR1, -2, -5, and -6), targeted missense mutagenesis disrupting the motifs at the critical position 8 of each TPR caused complete or partial loss of FANCG function. Loss of function was evident from failure of the mutant proteins to complement the cellular FA phenotype in FA-G lymphoblasts, which was correlated with loss of binding to FANCA. Although the TPR4 mutant fully complemented the cells, it showed a reduced interaction with FANCA, suggesting that this TPR may also be of functional importance. The recognition of FANCG as a typical TPR protein predicts this protein to play a key role in the assembly and/or stabilization of the nuclear FA protein core complex.

  16. Truncation- and motif-based pan-cancer analysis reveals tumor-suppressing kinases.

    PubMed

    Hudson, Andrew M; Stephenson, Natalie L; Li, Cynthia; Trotter, Eleanor; Fletcher, Adam J; Katona, Gitta; Bieniasz-Krzywiec, Patrycja; Howell, Matthew; Wirth, Chris; Furney, Simon; Miller, Crispin J; Brognard, John

    2018-04-17

    A major challenge in cancer genomics is identifying "driver" mutations from the many neutral "passenger" mutations within a given tumor. To identify driver mutations that would otherwise be lost within mutational noise, we filtered genomic data by motifs that are critical for kinase activity. In the first step of our screen, we used data from the Cancer Cell Line Encyclopedia and The Cancer Genome Atlas to identify kinases with truncation mutations occurring within or before the kinase domain. The top 30 tumor-suppressing kinases were aligned, and hotspots for loss-of-function (LOF) mutations were identified on the basis of amino acid conservation and mutational frequency. The functional consequences of new LOF mutations were biochemically validated, and the top 15 hotspot LOF residues were used in a pan-cancer analysis to define the tumor-suppressing kinome. A ranked list revealed MAP2K7, an essential mediator of the c-Jun N-terminal kinase (JNK) pathway, as a candidate tumor suppressor in gastric cancer, despite its mutational frequency falling within the mutational noise for this cancer type. The majority of mutations in MAP2K7 abolished its catalytic activity, and reactivation of the JNK pathway in gastric cancer cells harboring LOF mutations in MAP2K7 or the downstream kinase JNK suppressed clonogenicity and growth in soft agar, demonstrating the functional relevance of inactivating the JNK pathway in gastric cancer. Together, our data highlight a broadly applicable strategy to identify functional cancer driver mutations and define the JNK pathway as tumor-suppressive in gastric cancer. Copyright © 2018 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works.

  17. The Malarial Host-Targeting Signal Is Conserved in the Irish Potato Famine Pathogen

    PubMed Central

    Liolios, Konstantinos; Win, Joe; Kanneganti, Thirumala-Devi; Young, Carolyn; Kamoun, Sophien; Haldar, Kasturi

    2006-01-01

    Animal and plant eukaryotic pathogens, such as the human malaria parasite Plasmodium falciparum and the potato late blight agent Phytophthora infestans, are widely divergent eukaryotic microbes. Yet they both produce secretory virulence and pathogenic proteins that alter host cell functions. In P. falciparum, export of parasite proteins to the host erythrocyte is mediated by leader sequences shown to contain a host-targeting (HT) motif centered on an RxLx (E, D, or Q) core: this motif appears to signify a major pathogenic export pathway with hundreds of putative effectors. Here we show that a secretory protein of P. infestans, which is perceived by plant disease resistance proteins and induces hypersensitive plant cell death, contains a leader sequence that is equivalent to the Plasmodium HT-leader in its ability to export fusion of green fluorescent protein (GFP) from the P. falciparum parasite to the host erythrocyte. This export is dependent on an RxLR sequence conserved in P. infestans leaders, as well as in leaders of all ten secretory oomycete proteins shown to function inside plant cells. The RxLR motif is also detected in hundreds of secretory proteins of P. infestans, Phytophthora sojae, and Phytophthora ramorum and has high value in predicting host-targeted leaders. A consensus motif further reveals E/D residues enriched within ~25 amino acids downstream of the RxLR, which are also needed for export. Together the data suggest that in these plant pathogenic oomycetes, a consensus HT motif may reside in an extended sequence of ~25–30 amino acids, rather than in a short linear sequence. Evidence is presented that although the consensus is much shorter in P. falciparum, information sufficient for vacuolar export is contained in a region of ~30 amino acids, which includes sequences flanking the HT core. Finally, positional conservation between Phytophthora RxLR and P. falciparum RxLx (E, D, Q) is consistent with the idea that the context of their

  18. Condensin II Regulates Interphase Chromatin Organization Through the Mrg-Binding Motif of Cap-H2

    PubMed Central

    Wallace, Heather A.; Klebba, Joseph E.; Kusch, Thomas; Rogers, Gregory C.; Bosco, Giovanni

    2015-01-01

    The spatial organization of the genome within the eukaryotic nucleus is a dynamic process that plays a central role in cellular processes such as gene expression, DNA replication, and chromosome segregation. Condensins are conserved multi-subunit protein complexes that contribute to chromosome organization by regulating chromosome compaction and homolog pairing. Previous work in our laboratory has shown that the Cap-H2 subunit of condensin II physically and genetically interacts with the Drosophila homolog of human MORF4-related gene on chromosome 15 (MRG15). Like Cap-H2, Mrg15 is required for interphase chromosome compaction and homolog pairing. However, the mechanism by which Mrg15 and Cap-H2 cooperate to maintain interphase chromatin organization remains unclear. Here, we show that Cap-H2 localizes to interband regions on polytene chromosomes and co-localizes with Mrg15 at regions of active transcription across the genome. We show that co-localization of Cap-H2 on polytene chromosomes is partially dependent on Mrg15. We have identified a binding motif within Cap-H2 that is essential for its interaction with Mrg15, and have found that mutation of this motif results in loss of localization of Cap-H2 on polytene chromosomes and results in partial suppression of Cap-H2-mediated compaction and homolog unpairing. Our data are consistent with a model in which Mrg15 acts as a loading factor to facilitate Cap-H2 binding to chromatin and mediate changes in chromatin organization. PMID:25758823

  19. A novel paired domain DNA recognition motif can mediate Pax2 repression of gene transcription.

    PubMed

    Håvik, B; Ragnhildstveit, E; Lorens, J B; Saelemyr, K; Fauske, O; Knudsen, L K; Fjose, A

    1999-12-20

    The paired domain (PD) is an evolutionarily conserved DNA-binding domain encoded by the Pax gene family of developmental regulators. The Pax proteins are transcription factors and are involved in a variety of processes such as brain development, patterning of the central nervous system (CNS), and B-cell development. In this report we demonstrate that the zebrafish Pax2 PD can interact with a novel type of DNA sequences in vitro, the triple-A motif, consisting of a heptameric nucleotide sequence G/CAAACA/TC with an invariant core of three adjacent adenosines. This recognition sequence was found to be conserved in known natural Pax5 repressor elements involved in controlling the expression of the p53 and J-chain genes. By identifying similar high affinity binding sites in potential target genes of the Pax2 protein, including the pax2 gene itself, we obtained further evidence that the triple-A sites are biologically significant. The putative natural target sites also provide a basis for defining an extended consensus recognition sequence. In addition, we observed in transformation assays a direct correlation between Pax2 repressor activity and the presence of triple-A sites. The results suggest that a transcriptional regulatory function of Pax proteins can be modulated by PD binding to different categories of target sequences. Copyright 1999 Academic Press.
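
    The triple-A heptamer above can be turned into a simple sequence scan. The sketch below assumes that "G/CAAACA/TC" means G or C at position 1 and A or T at position 6, with the invariant AAA core at positions 2 to 4; that reading, the function name, and the restriction to the given strand are assumptions, not details from the paper.

```python
import re

# Assumed reading of the consensus G/CAAACA/TC: [GC] A A A C [AT] C.
# The lookahead lets overlapping sites be reported as well.
TRIPLE_A = re.compile(r"(?=([GC]AAAC[AT]C))")


def find_triple_a(seq: str):
    """Return (start, site) for every triple-A heptamer on the given strand."""
    return [(m.start(), m.group(1)) for m in TRIPLE_A.finditer(seq.upper())]
```

    To cover both strands, the same function could be applied to the reverse complement of the sequence.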

  20. A dileucine motif is involved in plasma membrane expression and endocytosis of rat sodium taurocholate cotransporting polypeptide (Ntcp).

    PubMed

    Stross, Claudia; Kluge, Stefanie; Weissenberger, Katrin; Winands, Elisabeth; Häussinger, Dieter; Kubitz, Ralf

    2013-11-15

    The sodium taurocholate cotransporting polypeptide (Ntcp) is the major uptake transporter for bile salts into liver parenchymal cells, and PKC-mediated endocytosis was shown to regulate the number of Ntcp molecules at the plasma membrane. In this study, mechanisms of Ntcp internalization were analyzed by flow cytometry, immunofluorescence, and Western blot analyses in HepG2 cells. PKC activation induced endocytosis of Ntcp from the plasma membrane by ~30%. Endocytosis of Ntcp was clathrin dependent and was followed by lysosomal degradation. A dileucine motif located in the third intracellular loop of Ntcp was essential for endocytosis but also for processing and plasma membrane targeting, suggesting a dual function of this motif for intracellular trafficking of Ntcp. Mutation of two of five potential phosphorylation sites surrounding the dileucine motif (Thr225 and Ser226) inhibited PKC-mediated endocytosis. In conclusion, we could identify a motif, which is critical for Ntcp plasma membrane localization. Endocytic retrieval protects hepatocytes from elevated bile salt concentrations and is of special interest, because NTCP has been identified as a receptor for the hepatitis B and D virus.

  1. Comparison of the receptor FGFRL1 from sea urchins and humans illustrates evolution of a zinc binding motif in the intracellular domain

    PubMed Central

    2009-01-01

    Background: FGFRL1, the gene for the fifth member of the fibroblast growth factor receptor (FGFR) family, is found in all vertebrates from fish to man and in the cephalochordate amphioxus. Since it does not occur in more distantly related invertebrates such as insects and nematodes, we have speculated that FGFRL1 might have evolved just before branching of the vertebrate lineage from the other invertebrates (Beyeler and Trueb, 2006). Results: We identified the gene for FGFRL1 also in the sea urchin Strongylocentrotus purpuratus and cloned its mRNA. The deduced amino acid sequence shares 62% sequence similarity with the human protein and shows conservation of all disulfides and N-linked carbohydrate attachment sites. Similar to the human protein, the S. purpuratus protein contains a histidine-rich motif at the C-terminus, but this motif is much shorter than the human counterpart. To analyze the function of the novel motif, recombinant fusion proteins were prepared in a bacterial expression system. The human fusion protein bound to nickel and zinc affinity columns, whereas the sea urchin protein barely interacted with such columns. Direct determination of metal ions by atomic absorption revealed 2.6 mole zinc/mole protein for human FGFRL1 and 1.7 mole zinc/mole protein for sea urchin FGFRL1. Conclusion: The FGFRL1 gene has evolved much earlier than previously assumed. A comparison of the intracellular domain between sea urchin and human FGFRL1 provides interesting insights into the shaping of a novel zinc binding domain. PMID:20021659

  2. A dinucleotide motif in oligonucleotides shows potent immunomodulatory activity and overrides species-specific recognition observed with CpG motif.

    PubMed

    Kandimalla, Ekambar R; Bhagat, Lakshmi; Zhu, Fu-Gang; Yu, Dong; Cong, Yan-Ping; Wang, Daqing; Tang, Jimmy X; Tang, Jin-Yan; Knetter, Cathrine F; Lien, Egil; Agrawal, Sudhir

    2003-11-25

    Bacterial and synthetic DNAs containing CpG dinucleotides in specific sequence contexts activate the vertebrate immune system through Toll-like receptor 9 (TLR9). In the present study, we used a synthetic nucleoside with a bicyclic heterobase [1-(2'-deoxy-beta-d-ribofuranosyl)-2-oxo-7-deaza-8-methyl-purine; R] to replace the C in CpG, resulting in an RpG dinucleotide. The RpG dinucleotide was incorporated in mouse- and human-specific motifs in oligodeoxynucleotides (oligos) and 3'-3-linked oligos, referred to as immunomers. Oligos containing the RpG motif induced cytokine secretion in mouse spleen-cell cultures. Immunomers containing RpG dinucleotides showed activity in transfected-HEK293 cells stably expressing mouse TLR9, suggesting direct involvement of TLR9 in the recognition of RpG motif. In J774 macrophages, RpG motifs activated NF-kappa B and mitogen-activated protein kinase pathways. Immunomers containing the RpG dinucleotide induced high levels of IL-12 and IFN-gamma, but lower IL-6 in time- and concentration-dependent fashion in mouse spleen-cell cultures costimulated with IL-2. Importantly, immunomers containing GTRGTT and GARGTT motifs were recognized to a similar extent by both mouse and human immune systems. Additionally, both mouse- and human-specific RpG immunomers potently stimulated proliferation of peripheral blood mononuclear cells obtained from diverse vertebrate species, including monkey, pig, horse, sheep, goat, rat, and chicken. An immunomer containing GTRGTT motif prevented conalbumin-induced and ragweed allergen-induced allergic inflammation in mice. We show that a synthetic bicyclic nucleotide is recognized in the C position of a CpG dinucleotide by immune cells from diverse vertebrate species without bias for flanking sequences, suggesting a divergent nucleotide motif recognition pattern of TLR9.

  3. Identification of an Electrostatic Ruler Motif for Sequence-Specific Binding of Collagenase to Collagen.

    PubMed

    Subramanian, Sundar Raman; Singam, Ettayapuram Ramaprasad Azhagiya; Berinski, Michael; Subramanian, Venkatesan; Wade, Rebecca C

    2016-08-25

    Sequence-specific cleavage of collagen by mammalian collagenase plays a pivotal role in cell function. Collagenases are matrix metalloproteinases that cleave the peptide bond at a specific position on fibrillar collagen. The collagenase Hemopexin-like (HPX) domain has been proposed to be responsible for substrate recognition, but the mechanism by which collagenases identify the cleavage site on fibrillar collagen is not clearly understood. In this study, Brownian dynamics simulations coupled with atomic-detail and coarse-grained molecular dynamics simulations were performed to dock matrix metalloproteinase-1 (MMP-1) on a collagen IIIα1 triple helical peptide. We find that the HPX domain recognizes the collagen triple helix at a conserved R-X11-R motif C-terminal to the cleavage site to which the HPX domain of collagenase is guided electrostatically. The binding of the HPX domain between the two arginine residues is energetically stabilized by hydrophobic contacts with collagen. From the simulations and analysis of the sequences and structural flexibility of collagen and collagenase, a mechanistic scheme by which MMP-1 can recognize and bind collagen for proteolysis is proposed.
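
    The R-X11-R spacing described above (two arginines separated by exactly eleven residues) is easy to locate in a one-letter protein sequence. The following is a minimal sketch under that reading of the motif; the function name is hypothetical, and the scan says nothing about whether a hit lies C-terminal to a cleavage site.

```python
import re

# R, then exactly 11 residues of any kind, then R (13 residues in total).
# The lookahead reports overlapping occurrences too.
R_X11_R = re.compile(r"(?=(R[A-Z]{11}R))")


def find_r_x11_r(protein: str):
    """Return (start, segment) for every R-X11-R occurrence in the sequence."""
    return [(m.start(), m.group(1)) for m in R_X11_R.finditer(protein.upper())]
```

    On a real collagen chain, the reported start positions would still need to be compared with the known cleavage site to decide which occurrence is relevant.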

  4. Structural motifs of pre-nucleation clusters.

    PubMed

    Zhang, Y; Türkmen, I R; Wassermann, B; Erko, A; Rühl, E

    2013-10-07

    Structural motifs of pre-nucleation clusters prepared in single, optically levitated supersaturated aqueous aerosol microparticles containing CaBr2 as a model system are reported. Cluster formation is identified by means of X-ray absorption in the Br K-edge regime. The salt concentration beyond the saturation point is varied by controlling the humidity in the ambient atmosphere surrounding the 15-30 μm microdroplets. This leads to the formation of metastable supersaturated liquid particles. Distinct spectral shifts in near-edge spectra as a function of salt concentration are observed, in which the energy position of the Br K-edge is red-shifted by up to 7.1 ± 0.4 eV if the dilute solution is compared to the solid. The K-edge positions of supersaturated solutions are found between these limits. The changes in electronic structure are rationalized in terms of the formation of pre-nucleation clusters. This assumption is verified by spectral simulations using first-principle density functional theory and molecular dynamics calculations, in which structural motifs are considered, explaining the experimental results. These consist of solvated CaBr2 moieties, rather than building blocks forming calcium bromide hexahydrates, the crystal system that is formed by drying aqueous CaBr2 solutions.

  5. STEME: A Robust, Accurate Motif Finder for Large Data Sets

    PubMed Central

    Reid, John E.; Wernisch, Lorenz

    2014-01-01

    Motif finding is a difficult problem that has been studied for over 20 years. Some older popular motif finders are not suitable for analysis of the large data sets generated by next-generation sequencing. We recently published an efficient approximation (STEME) to the EM algorithm that is at the core of many motif finders such as MEME. This approximation allows the EM algorithm to be applied to large data sets. In this work we describe several efficient extensions to STEME that are based on the MEME algorithm. Together with the original STEME EM approximation, these extensions make STEME a fully-fledged motif finder with similar properties to MEME. We discuss the difficulty of objectively comparing motif finders. We show that STEME performs comparably to existing prominent discriminative motif finders, DREME and Trawler, on 13 sets of transcription factor binding data in mouse ES cells. We demonstrate the ability of STEME to find long degenerate motifs which these discriminative motif finders do not find. As part of our method, we extend an earlier method due to Nagarajan et al. for the efficient calculation of motif E-values. STEME's source code is available under an open source license and STEME is available via a web interface. PMID:24625410

  6. Phospholipid composition and a polybasic motif determine D6 PROTEIN KINASE polar association with the plasma membrane and tropic responses.

    PubMed

    Barbosa, Inês C R; Shikata, Hiromasa; Zourelidou, Melina; Heilmann, Mareike; Heilmann, Ingo; Schwechheimer, Claus

    2016-12-15

    Polar transport of the phytohormone auxin through PIN-FORMED (PIN) auxin efflux carriers is essential for the spatiotemporal control of plant development. The Arabidopsis thaliana serine/threonine kinase D6 PROTEIN KINASE (D6PK) is polarly localized at the plasma membrane of many cells where it colocalizes with PINs and activates PIN-mediated auxin efflux. Here, we show that the association of D6PK with the basal plasma membrane and PINs is dependent on the phospholipid composition of the plasma membrane as well as on the phosphatidylinositol phosphate 5-kinases PIP5K1 and PIP5K2 in epidermis cells of the primary root. We further show that D6PK directly binds polyacidic phospholipids through a polybasic lysine-rich motif in the middle domain of the kinase. The lysine-rich motif is required for proper PIN3 phosphorylation and for auxin transport-dependent tropic growth. Polybasic motifs are also present at a conserved position in other D6PK-related kinases and required for membrane and phospholipid binding. Thus, phospholipid-dependent recruitment to membranes through polybasic motifs might not only be required for D6PK-mediated auxin transport but also other processes regulated by these, as yet, functionally uncharacterized kinases. © 2016. Published by The Company of Biologists Ltd.

  7. DNA motif alignment by evolving a population of Markov chains.

    PubMed

    Bi, Chengpeng

    2009-01-30

    Deciphering cis-regulatory elements or de novo motif-finding in genomes still remains elusive although much algorithmic effort has been expended. Markov chain Monte Carlo (MCMC) methods such as Gibbs motif samplers have been widely employed to solve the de novo motif-finding problem through sequence local alignment. Nonetheless, the MCMC-based motif samplers still suffer from local maxima like EM. Therefore, as a prerequisite for finding good local alignments, these motif algorithms are often independently run a multitude of times, but without information exchange between different chains. Hence a new algorithm design enabling such information exchange would be worthwhile. This paper presents a novel motif-finding algorithm by evolving a population of Markov chains with information exchange (PMC), each of which is initialized as a random alignment and run by the Metropolis-Hastings sampler (MHS). It is progressively updated through a series of local alignments stochastically sampled. Explicitly, the PMC motif algorithm performs stochastic sampling as specified by a population-based proposal distribution rather than individual ones, and adaptively evolves the population as a whole towards a global maximum. The alignment information exchange is accomplished by taking advantage of the pooled motif site distributions. A distinct method for running multiple independent Markov chains (IMC) without information exchange, dubbed the IMC motif algorithm, is also devised to compare with its PMC counterpart. Experimental studies demonstrate that the performance could be improved if pooled information were used to run a population of motif samplers. The new PMC algorithm was able to improve the convergence and outperformed other popular algorithms tested using simulated and biological motif sequences.

  8. STD-NMR experiments identify a structural motif with novel second-site activity against West Nile virus NS2B-NS3 protease.

    PubMed

    Schöne, Tobias; Grimm, Lena Lisbeth; Sakai, Naoki; Zhang, Linlin; Hilgenfeld, Rolf; Peters, Thomas

    2017-10-01

    West Nile virus (WNV) belongs to the genus Flavivirus of the family Flaviviridae. This mosquito-borne virus that is highly pathogenic to humans has been evolving into a global threat during the past two decades. Despite many efforts, neither antiviral drugs nor vaccines are available. The viral protease NS2B-NS3pro is essential for viral replication, and therefore it is considered a prime drug target. However, success in the development of specific NS2B-NS3pro inhibitors had been moderate so far. In the search for new structural motifs with binding affinity for NS2B-NS3pro, we have screened a fragment library, the Maybridge Ro5 library, employing saturation transfer difference (STD) NMR experiments as readout. About 30% of 429 fragments showed binding to NS2B-NS3pro. Subsequent STD-NMR competition experiments using the known active site fragment A as reporter ligand yielded 14 competitively binding fragments, and 22 fragments not competing with A. In a fluorophore-based protease assay, all of these fragments showed inhibition in the micromolar range. Interestingly, 10 of these 22 fragments showed a notable increase of STD intensities in the presence of compound A suggesting cooperative binding. The most promising non-competitive inhibitors 1 and 2 (IC50 ∼ 500 μM) share a structural motif that may guide the development of novel second-site (potentially allosteric) inhibitors of NS2B-NS3pro. To identify the matching protein binding site, chemical shift perturbation studies employing 1H,15N-TROSY-HSQC experiments with uniformly 2H,15N-labeled protease were performed in the presence of 1, and in the concomitant absence or presence of A. The data suggest that 1 interacts with Met 52* of NS2B, identifying a secondary site adjacent to the binding site of A. Therefore, our study paves the way for the synthesis of novel bidentate NS2B-NS3pro inhibitors. Copyright © 2017 Elsevier B.V. All rights reserved.

  9. Subtle Changes in Motif Positioning Cause Tissue-Specific Effects on Robustness of an Enhancer's Activity

    PubMed Central

    Erceg, Jelena; Saunders, Timothy E.; Girardot, Charles; Devos, Damien P.; Hufnagel, Lars; Furlong, Eileen E. M.

    2014-01-01

    Deciphering the specific contribution of individual motifs within cis-regulatory modules (CRMs) is crucial to understanding how gene expression is regulated and how this process is affected by sequence variation. But despite vast improvements in the ability to identify where transcription factors (TFs) bind throughout the genome, we are limited in our ability to relate information on motif occupancy to function from sequence alone. Here, we engineered 63 synthetic CRMs to systematically assess the relationship between variation in the content and spacing of motifs within CRMs to CRM activity during development using Drosophila transgenic embryos. In over half the cases, very simple elements containing only one or two types of TF binding motifs were capable of driving specific spatio-temporal patterns during development. Different motif organizations provide different degrees of robustness to enhancer activity, ranging from binary on-off responses to more subtle effects including embryo-to-embryo and within-embryo variation. By quantifying the effects of subtle changes in motif organization, we were able to model biophysical rules that explain CRM behavior and may contribute to the spatial positioning of CRM activity in vivo. For the same enhancer, the effects of small differences in motif positions varied in developmentally related tissues, suggesting that gene expression may be more susceptible to sequence variation in one tissue compared to another. This result has important implications for human eQTL studies in which many associated mutations are found in cis-regulatory regions, though the mechanism for how they affect tissue-specific gene expression is often not understood. PMID:24391522

  10. Forkhead Box Transcription Factors of the FOXA Class Are Required for Basal Transcription of Angiotensin-Converting Enzyme 2

    PubMed Central

    Pedersen, Kim Brint; Chodavarapu, Harshita

    2017-01-01

    Angiotensin-converting enzyme 2 (ACE2) has protective effects on a wide range of morbidities associated with elevated angiotensin-II signaling. Most tissues, including pancreatic islets, express ACE2 mainly from the proximal promoter region. We previously found that hepatocyte nuclear factors 1α and 1β stimulate ACE2 expression from three highly conserved hepatocyte nuclear factor 1 binding motifs in the proximal promoter region. We hypothesized that other highly conserved motifs would also affect ACE2 expression. By systematic mutation of conserved elements, we identified five regions affecting ACE2 expression, of which two regions bound transcriptional activators. One of these is a functional FOXA binding motif. We further identified the main protein binding the FOXA motif in 832/13 insulinoma cells as well as in mouse pancreatic islets as FOXA2. PMID:29082356

  11. DNA motif elucidation using belief propagation.

    PubMed

    Wong, Ka-Chun; Chan, Tak-Ming; Peng, Chengbin; Li, Yue; Zhang, Zhaolei

    2013-09-01

    Protein-binding microarray (PBM) is a high-throughput platform that can measure the DNA-binding preference of a protein in a comprehensive and unbiased manner. A typical PBM experiment can measure binding signal intensities of a protein to all the possible DNA k-mers (k=8∼10); such comprehensive binding affinity data usually need to be reduced and represented as motif models before they can be further analyzed and applied. Since proteins can often bind to DNA in multiple modes, one of the major challenges is to decompose the comprehensive affinity data into multimodal motif representations. Here, we describe a new algorithm that uses Hidden Markov Models (HMMs) and can derive precise and multimodal motifs using belief propagations. We describe an HMM-based approach using belief propagations (kmerHMM), which accepts and preprocesses PBM probe raw data into median-binding intensities of individual k-mers. The k-mers are ranked and aligned for training an HMM as the underlying motif representation. Multiple motifs are then extracted from the HMM using belief propagations. Comparisons of kmerHMM with other leading methods on several data sets demonstrated its effectiveness and uniqueness. In particular, it achieved the best performance on more than half of the data sets. In addition, the multiple binding modes derived by kmerHMM are biologically meaningful and will be useful in interpreting other genome-wide data such as those generated from ChIP-seq. The executables and source codes are available at the authors' websites: e.g. http://www.cs.toronto.edu/~wkc/kmerHMM.
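
    The preprocessing step described above, reducing probe-level intensities to one median intensity per k-mer and then ranking the k-mers, can be sketched roughly as follows. This is a minimal Python illustration and not the kmerHMM implementation: the function names `kmer_medians` and `rank_kmers`, the toy probe data, and the decision to ignore reverse complements are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import median


def kmer_medians(probes, k):
    """Map each k-mer to the median intensity of the probes that contain it.

    `probes` is an iterable of (sequence, intensity) pairs (hypothetical format).
    A k-mer occurring several times in one probe is counted once for that probe.
    """
    by_kmer = defaultdict(list)
    for seq, intensity in probes:
        kmers_in_probe = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        for kmer in kmers_in_probe:
            by_kmer[kmer].append(intensity)
    return {kmer: median(values) for kmer, values in by_kmer.items()}


def rank_kmers(medians):
    """Sort k-mers by decreasing median intensity, ties broken alphabetically."""
    return sorted(medians.items(), key=lambda item: (-item[1], item[0]))


# Made-up probes, purely for illustration.
probes = [("ACGT", 10.0), ("CGTA", 4.0), ("TACG", 1.0)]
ranked = rank_kmers(kmer_medians(probes, k=3))
for kmer, value in ranked:
    print(kmer, value)
```

    In a real PBM analysis the probes are much longer and k is typically 8 to 10, as the abstract notes; the ranked list would then feed the alignment and HMM training steps, which this sketch does not attempt.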

  12. A Three-Dimensional RNA Motif in Potato spindle tuber viroid Mediates Trafficking from Palisade Mesophyll to Spongy Mesophyll in Nicotiana benthamiana

    PubMed Central

    Takeda, Ryuta; Petrov, Anton I.; Leontis, Neocles B.; Ding, Biao

    2011-01-01

    Cell-to-cell trafficking of RNA is an emerging biological principle that integrates systemic gene regulation, viral infection, antiviral response, and cell-to-cell communication. A key mechanistic question is how an RNA is specifically selected for trafficking from one type of cell into another type. Here, we report the identification of an RNA motif in Potato spindle tuber viroid (PSTVd) required for trafficking from palisade mesophyll to spongy mesophyll in Nicotiana benthamiana leaves. This motif, called loop 6, has the sequence 5′-CGA-3′...5′-GAC-3′ flanked on both sides by cis Watson-Crick G/C and G/U wobble base pairs. We present a three-dimensional (3D) structural model of loop 6 that specifies all non-Watson-Crick base pair interactions, derived by isostericity-based sequence comparisons with 3D RNA motifs from the RNA x-ray crystal structure database. The model is supported by available chemical modification patterns, natural sequence conservation/variations in PSTVd isolates and related species, and functional characterization of all possible mutants for each of the loop 6 base pairs. Our findings and approaches have broad implications for studying the 3D RNA structural motifs mediating trafficking of diverse RNA species across specific cellular boundaries and for studying the structure-function relationships of RNA motifs in other biological processes. PMID:21258006

  13. The Verrucomicrobia LexA-Binding Motif: Insights into the Evolutionary Dynamics of the SOS Response.

    PubMed

    Erill, Ivan; Campoy, Susana; Kılıç, Sefa; Barbé, Jordi

    2016-01-01

    The SOS response is the primary bacterial mechanism to address DNA damage, coordinating multiple cellular processes that include DNA repair, cell division, and translesion synthesis. In contrast to other regulatory systems, the composition of the SOS genetic network and the binding motif of its transcriptional repressor, LexA, have been shown to vary greatly across bacterial clades, making it an ideal system to study the co-evolution of transcription factors and their regulons. Leveraging comparative genomics approaches and prior knowledge on the core SOS regulon, here we define the binding motif of the Verrucomicrobia, a recently described phylum of emerging interest due to its association with eukaryotic hosts. Site directed mutagenesis of the Verrucomicrobium spinosum recA promoter confirms that LexA binds a 14 bp palindromic motif with consensus sequence TGTTC-N4-GAACA. Computational analyses suggest that recognition of this novel motif is determined primarily by changes in base-contacting residues of the third alpha helix of the LexA helix-turn-helix DNA binding motif. In conjunction with comparative genomics analysis of the LexA regulon in the Verrucomicrobia phylum, electrophoretic shift assays reveal that LexA binds to operators in the promoter region of DNA repair genes and a mutagenesis cassette in this organism, and identify previously unreported components of the SOS response. The identification of tandem LexA-binding sites generating instances of other LexA-binding motifs in the lexA gene promoter of Verrucomicrobia species leads us to postulate a novel mechanism for LexA-binding motif evolution. This model, based on gene duplication, successfully addresses outstanding questions in the intricate co-evolution of the LexA protein, its binding motif and the regulatory network it controls.
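
    The 14 bp consensus reported above, TGTTC-N4-GAACA, translates directly into a pattern search. The following Python sketch only illustrates exact consensus matching; it is not the comparative genomics method used by the authors, and the names `find_sites` and `reverse_complement` as well as the example sequence are invented for the illustration.

```python
import re

# Consensus from the abstract: TGTTC, four unspecified bases, GAACA (14 bp).
# The lookahead lets overlapping matches be reported as well.
LEXA_CONSENSUS = re.compile(r"(?=(TGTTC[ACGT]{4}GAACA))")


def reverse_complement(seq):
    """Return the reverse complement of an upper- or lower-case DNA string."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_sites(seq):
    """Return (0-based start, matched site) for every exact consensus match."""
    return [(m.start(), m.group(1)) for m in LEXA_CONSENSUS.finditer(seq.upper())]


# Made-up promoter fragment, not taken from any genome.
promoter = "ccgaTGTTCatgcGAACAttg"
sites = find_sites(promoter)
print(sites)

# Because TGTTC and GAACA are reverse complements of each other, a site on
# one strand is also a consensus match on the opposite strand.
print(find_sites(reverse_complement(promoter)))
```

    Real binding sites usually deviate from the consensus, so a position weight matrix or a mismatch-tolerant search would normally replace the exact pattern used here.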

  14. The 5e motif of eukaryotic signal recognition particle RNA contains a conserved adenosine for the binding of SRP72

    PubMed Central

    Iakhiaeva, Elena; Wower, Jacek; Wower, Iwona K.; Zwieb, Christian

    2008-01-01

    The signal recognition particle (SRP) plays a pivotal role in transporting proteins to cell membranes. In higher eukaryotes, SRP consists of an RNA molecule and six proteins. The largest of the SRP proteins, SRP72, was found previously to bind to the SRP RNA. A fragment of human SRP72 (72c′) bound effectively to human SRP RNA but only weakly to the similar SRP RNA of the archaeon Methanococcus jannaschii. Chimeras between the human and M. jannaschii SRP RNAs were constructed and used as substrates for 72c′. SRP RNA helical section 5e contained the 72c′ binding site. Systematic alteration within 5e revealed that the A240G and A240C changes dramatically reduced the binding of 72c′. Human SRP RNA with a single A240G change was unable to form a complex with full-length human SRP72. Two small RNA fragments, one composed of helical section 5ef, the other of section 5e, competed equally well for the binding of 72c′, demonstrating that no other regions of the SRP RNA were required. The biochemical data completely agreed with the nucleotide conservation pattern observed across the phylogenetic spectrum. Thus, most eukaryotic SRP RNAs are likely to require for function an adenosine within their 5e motifs. The human 5ef RNA was remarkably resistant to ribonucleolytic attack suggesting that the 240-AUC-242 “loop” and its surrounding nucleotides form a peculiar compact structure recognized only by SRP72. PMID:18441046

  15. Motivated Proteins: A web application for studying small three-dimensional protein motifs

    PubMed Central

    Leader, David P; Milner-White, E James

    2009-01-01

    Background Small loop-shaped motifs are common constituents of the three-dimensional structure of proteins. Typically they comprise between three and seven amino acid residues, and are defined by a combination of dihedral angles and hydrogen bonding partners. The most abundant of these are αβ-motifs, asx-motifs, asx-turns, β-bulges, β-bulge loops, β-turns, nests, niches, Schellmann loops, ST-motifs, ST-staples and ST-turns. We have constructed a database of such motifs from a range of high-quality protein structures and built a web application as a visual interface to this. Description The web application, Motivated Proteins, provides access to these 12 motifs (with 48 sub-categories) in a database of over 400 representative proteins. Queries can be made for specific categories or sub-categories of motif, motifs in the vicinity of ligands, motifs which include part of an enzyme active site, overlapping motifs, or motifs which include a particular amino acid sequence. Individual proteins can be specified, or, where appropriate, motifs for all proteins listed. The results of queries are presented in textual form as an (X)HTML table, and may be saved as parsable plain text or XML. Motifs can be viewed and manipulated either individually or in the context of the protein in the Jmol applet structural viewer. Cartoons of the motifs imposed on a linear representation of protein secondary structure are also provided. Summary information for the motifs is available, as are histograms of amino acid distribution, and graphs of dihedral angles at individual positions in the motifs. Conclusion Motivated Proteins is a publicly and freely accessible web application that enables protein scientists to study small three-dimensional motifs without requiring knowledge of either Structured Query Language or the underlying database schema. PMID:19210785

  16. Proteome-wide search for functional motifs altered in tumors: Prediction of nuclear export signals inactivated by cancer-related mutations

    PubMed Central

    Prieto, Gorka; Fullaondo, Asier; Rodríguez, Jose A.

    2016-01-01

    Large-scale sequencing projects are uncovering a growing number of missense mutations in human tumors. Understanding the phenotypic consequences of these alterations represents a formidable challenge. In silico prediction of functionally relevant amino acid motifs disrupted by cancer mutations could provide insight into the potential impact of a mutation, and guide functional tests. We have previously described Wregex, a tool for the identification of potential functional motifs, such as nuclear export signals (NESs), in proteins. Here, we present an improved version that allows motif prediction to be combined with data from large repositories, such as the Catalogue of Somatic Mutations in Cancer (COSMIC), and to be applied to a whole proteome scale. As an example, we have searched the human proteome for candidate NES motifs that could be altered by cancer-related mutations included in the COSMIC database. A subset of the candidate NESs identified was experimentally tested using an in vivo nuclear export assay. A significant proportion of the selected motifs exhibited nuclear export activity, which was abrogated by the COSMIC mutations. In addition, our search identified a cancer mutation that inactivates the NES of the human deubiquitinase USP21, and leads to the aberrant accumulation of this protein in the nucleus. PMID:27174732

  17. DNA containing CpG motifs induces angiogenesis

    NASA Astrophysics Data System (ADS)

    Zheng, Mei; Klinman, Dennis M.; Gierynska, Malgorzata; Rouse, Barry T.

    2002-06-01

    New blood vessel formation in the cornea is an essential step in the pathogenesis of a blinding immunoinflammatory reaction caused by ocular infection with herpes simplex virus (HSV). By using a murine corneal micropocket assay, we found that HSV DNA (which contains a significant excess of potentially bioactive "CpG" motifs when compared with mammalian DNA) induces angiogenesis. Moreover, synthetic oligodeoxynucleotides containing CpG motifs attract inflammatory cells and stimulate the release of vascular endothelial growth factor (VEGF), which in turn triggers new blood vessel formation. In vitro, CpG DNA induces the J774A.1 murine macrophage cell line to produce VEGF. In vivo CpG-induced angiogenesis was blocked by the administration of anti-mVEGF Ab or the inclusion of "neutralizing" oligodeoxynucleotides that specifically oppose the stimulatory activity of CpG DNA. These findings establish that DNA containing bioactive CpG motifs induces angiogenesis, and suggest that CpG motifs in HSV DNA may contribute to the blinding lesions of stromal keratitis.

  18. Feature extraction using gray-level co-occurrence matrix of wavelet coefficients and texture matching for batik motif recognition

    NASA Astrophysics Data System (ADS)

    Suciati, Nanik; Herumurti, Darlis; Wijaya, Arya Yudhi

    2017-02-01

    Batik is one of Indonesia's traditional cloths. Each motif, or pattern, drawn on a piece of batik fabric has a specific name and philosophy. Although batik cloths are widely used in everyday life, only a few people understand their motifs and philosophy. This research is intended to develop a batik motif recognition system that can identify the motif of a batik image automatically. First, a batik image is decomposed into sub-images using a wavelet transform. Six texture descriptors, i.e. max probability, correlation, contrast, uniformity, homogeneity and entropy, are extracted from the gray-level co-occurrence matrix of each sub-image. The texture features are then matched to the template features using the Canberra distance. The experiment is performed on a batik dataset consisting of 1088 batik images grouped into seven motifs. The best recognition rate, 92.1%, is achieved using a feature extraction process with 5-level wavelet decomposition and a 4-directional gray-level co-occurrence matrix.
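
    To make the feature pipeline above more concrete, the sketch below computes a gray-level co-occurrence matrix for a single pixel offset, derives five of the six named descriptors from it, and compares feature vectors with the Canberra distance. It is a simplified Python illustration under several assumptions: the wavelet decomposition and the correlation descriptor are omitted, homogeneity uses one common definition (weights 1/(1+|i-j|)) that may differ from the authors' choice, and all function names and the tiny test image are hypothetical.

```python
import math
from collections import Counter


def glcm(image, dx=1, dy=0):
    """Normalised co-occurrence probabilities for pixel pairs offset by (dx, dy)."""
    counts = Counter()
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[(image[r][c], image[r2][c2])] += 1
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


def descriptors(p):
    """Texture descriptors from a GLCM given as {(i, j): probability}."""
    return [
        max(p.values()),                                       # max probability
        sum((i - j) ** 2 * v for (i, j), v in p.items()),      # contrast
        sum(v * v for v in p.values()),                        # uniformity
        sum(v / (1 + abs(i - j)) for (i, j), v in p.items()),  # homogeneity
        -sum(v * math.log2(v) for v in p.values()),            # entropy
    ]


def canberra(a, b):
    """Canberra distance; terms where both values are zero are skipped."""
    return sum(abs(x - y) / (abs(x) + abs(y)) for x, y in zip(a, b) if x or y)


# Tiny made-up image with three gray levels.
image = [[0, 0, 1],
         [0, 0, 1],
         [2, 2, 2]]
features = descriptors(glcm(image))
print(features)
print(canberra(features, features))
```

    Matching a query image then amounts to picking the template whose feature vector has the smallest Canberra distance to the query's features; in the setup described in the abstract, the vectors would be concatenated over several wavelet sub-images and four GLCM directions.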

  19. Newly identified allatostatin Bs and their receptor in the two-spotted cricket, Gryllus bimaculatus.

    PubMed

    Tsukamoto, Yusuke; Nagata, Shinji

    2016-06-01

    A cDNA encoding allatostatin Bs (ASTBs) containing the W(X)6W motif was identified using a database generated by a next generation sequencer (NGS) in the two-spotted cricket, Gryllus bimaculatus. The contig sequence revealed the presence of five novel putative ASTBs (GbASTBs) in addition to GbASTBs previously identified in G. bimaculatus. MALDI-TOF MS analyses revealed the presence of these novel and previously identified GbASTBs with three missing GbASTBs. We also identified a cDNA encoding G. bimaculatus GbASTB receptor (GbASTBR) in the NGS data. Phylogenetic analysis demonstrated that this receptor was highly conserved with other insect ASTBRs, including the sex peptide receptor of Drosophila melanogaster. Calcium imaging analyses indicated that the GbASTBR heterologously expressed in HEK293 cells exhibited responses to all identified GbASTBs at a concentration range of 10^-10 to 10^-5 M. Copyright © 2016 Elsevier Inc. All rights reserved.

  20. Ca2+-binding Motif of βγ-Crystallins

    PubMed Central

    Srivastava, Shanti Swaroop; Mishra, Amita; Krishnan, Bal; Sharma, Yogendra

    2014-01-01

    βγ-Crystallin-type double clamp (N/D)(N/D)XX(S/T)S motif is an established but sparsely investigated motif for Ca2+ binding. A βγ-crystallin domain is formed of two Greek key motifs, accommodating two Ca2+-binding sites. βγ-Crystallins make a separate class of Ca2+-binding proteins (CaBP), apparently a major group of CaBP in bacteria. Paralleling the diversity in βγ-crystallin domains, these motifs also show great diversity, both in structure and in function. Although the expression of some of them has been associated with stress, virulence, and adhesion, the functional implications of Ca2+ binding to βγ-crystallins in mediating biological processes are yet to be elucidated. PMID:24567326

  1. Mutational Analysis of the QRRQ Motif in the Yeast Hig1 Type 2 Protein Rcf1 Reveals a Regulatory Role for the Cytochrome c Oxidase Complex

    PubMed Central

    Garlich, Joshua; Strecker, Valentina; Wittig, Ilka; Stuart, Rosemary A.

    2017-01-01

    The yeast Rcf1 protein is a member of the conserved family of proteins termed the hypoxia-induced gene (domain) 1 (Hig1 or HIGD1) family. Rcf1 interacts with components of the mitochondrial oxidative phosphorylation system, in particular the cytochrome bc1 (complex III)-cytochrome c oxidase (complex IV) supercomplex (termed III-IV) and the ADP/ATP carrier proteins. Rcf1 plays a role in the assembly and modulation of the activity of complex IV; however, the molecular basis for how Rcf1 influences the activity of complex IV is currently unknown. Hig1 type 2 isoforms, which include the Rcf1 protein, are characterized in part by the presence of a conserved motif, (Q/I)X3(R/H)XRX3Q, termed here the QRRQ motif. We show that mutation of conserved residues within the Rcf1 QRRQ motif alters the interactions between Rcf1 and partner proteins and results in the destabilization of complex IV and alteration of its enzymatic properties. Our findings indicate that Rcf1 does not serve as a stoichiometric component, i.e. as a subunit of complex IV, to support its activity. Rather, we propose that Rcf1 serves to dynamically interact with complex IV during its assembly process and, in doing so, regulates a late maturation step of complex IV. We speculate that the Rcf1/Hig1 proteins play a role in the incorporation and/or remodeling of lipids, in particular cardiolipin, into complex IV and, possibly, other mitochondrial proteins such as ADP/ATP carrier proteins. PMID:28167530
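
    Motif shorthand such as (Q/I)X3(R/H)XRX3Q can be turned mechanically into a regular expression for scanning protein sequences. The Python sketch below shows one way to do this; the helper name `motif_to_regex`, the small grammar it accepts (parenthesised alternatives, X, X followed by a count, and single residues), and the example peptide are assumptions made for illustration and are not drawn from the paper.

```python
import re

# One token per match: "(A/B)" alternatives, "Xn" runs, single "X", or a residue.
TOKEN = re.compile(r"\(([A-Z/]+)\)|X(\d+)|(X)|([A-Z])")


def motif_to_regex(motif):
    """Translate e.g. '(Q/I)X3(R/H)XRX3Q' into a regular expression string."""
    parts = []
    for alternatives, x_count, x_single, residue in TOKEN.findall(motif):
        if alternatives:
            parts.append("[" + alternatives.replace("/", "") + "]")
        elif x_count:
            parts.append(".{" + x_count + "}")
        elif x_single:
            parts.append(".")
        else:
            parts.append(residue)
    return "".join(parts)


pattern = motif_to_regex("(Q/I)X3(R/H)XRX3Q")
print(pattern)

# Made-up peptide containing one instance of the motif.
peptide = "MAQLLKRARTESQG"
match = re.search(pattern, peptide)
print(match.start(), match.group())
```

    A match found this way only shows that a sequence is compatible with the motif; whether the residues are functionally important still has to be established experimentally, as in the mutational analysis described above.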

  2. HRD Motif as the Central Hub of the Signaling Network for Activation Loop Autophosphorylation in Abl Kinase.

    PubMed

    La Sala, Giuseppina; Riccardi, Laura; Gaspari, Roberto; Cavalli, Andrea; Hantschel, Oliver; De Vivo, Marco

    2016-11-08

    A number of structural factors modulate the activity of Abelson (Abl) tyrosine kinase, whose deregulation is often related to oncogenic processes. First, only the open conformation of the Abl kinase domain's activation loop (A-loop) favors ATP binding to the catalytic cleft. In this regard, the trans-autophosphorylation of the Y412 residue, which is located along the A-loop, favors the stability of the open conformation, in turn enhancing Abl activity. Another key factor for full Abl activity is the formation of active conformations of the catalytic DFG motif in the Abl kinase domain. Furthermore, binding of the SH2 domain to the N-lobe of the Abl kinase was recently demonstrated to have a long-range allosteric effect on the stabilization of the A-loop open state. Intriguingly, these distinct structural factors imply a complex signal transmission network for controlling the A-loop's flexibility and conformational preference for optimal Abl function. However, the exact dynamical features of this signal transmission network structure remain unclear. Here, we report on microsecond-long molecular dynamics coupled with enhanced sampling simulations of multiple Abl model systems, in the presence or absence of the SH2 domain and with the DFG motif flipped in two ways (in or out conformation). Through comparative analysis, our simulations augment the interpretation of the existing Abl experimental data, revealing a dynamical network of interactions that interconnect SH2 domain binding with A-loop plasticity and Y412 autophosphorylation in Abl. This signaling network engages the DFG motif and, importantly, other conserved structural elements of the kinase domain, namely, the EPK-ELK H-bond network and the HRD motif. Our results show that the signal propagation for modulating the A-loop spatial localization is highly dependent on the HRD motif conformation, which thus acts as the central hub of this (allosteric) signaling network controlling Abl activation and function.

  3. Opuntia in México: Identifying Priority Areas for Conserving Biodiversity in a Multi-Use Landscape

    PubMed Central

    Illoldi-Rangel, Patricia; Ciarleglio, Michael; Sheinvar, Leia; Linaje, Miguel; Sánchez-Cordero, Victor; Sarkar, Sahotra

    2012-01-01

    Background México is one of the world's centers of species diversity (richness) for Opuntia cacti. Yet, in spite of their economic and ecological importance, Opuntia species remain poorly studied and protected in México. Many of the species are sparsely but widely distributed across the landscape and are subject to a variety of human uses, so devising implementable conservation plans for them presents formidable difficulties. Multi-criteria analysis can be used to design a spatially coherent conservation area network while permitting sustainable human usage. Methods and Findings Species distribution models were created for 60 Opuntia species using MaxEnt. Targets of representation within conservation area networks were assigned at 100% for the geographically rarest species and 10% for the most common ones. Three different conservation plans were developed to represent the species within these networks using total area, shape, and connectivity as relevant criteria. Multi-criteria analysis and a metaheuristic adaptive tabu search algorithm were used to search for optimal solutions. The plans were built on the existing protected areas of México and prioritized additional areas for management for the persistence of Opuntia species. All plans required around one-third of México's total area to be prioritized for attention for Opuntia conservation, underscoring the implausibility of Opuntia conservation through traditional land reservation. Tabu search turned out to be both computationally tractable and easily implementable for search problems of this kind. Conclusions Opuntia conservation in México requires the management of large areas of land for multiple uses. The multi-criteria analyses identified priority areas and organized them in large contiguous blocks that can be effectively managed. A high level of connectivity was established among the prioritized areas resulting in the enhancement of possible modes of plant dispersal as well as only a small number

  4. The Regulatory Factor ZFHX3 Modifies Circadian Function in SCN via an AT Motif-Driven Axis

    PubMed Central

    Parsons, Michael J.; Brancaccio, Marco; Sethi, Siddharth; Maywood, Elizabeth S.; Satija, Rahul; Edwards, Jessica K.; Jagannath, Aarti; Couch, Yvonne; Finelli, Mattéa J.; Smyllie, Nicola J.; Esapa, Christopher; Butler, Rachel; Barnard, Alun R.; Chesham, Johanna E.; Saito, Shoko; Joynson, Greg; Wells, Sara; Foster, Russell G.; Oliver, Peter L.; Simon, Michelle M.; Mallon, Ann-Marie; Hastings, Michael H.; Nolan, Patrick M.

    2015-01-01

    Summary We identified a dominant missense mutation in the SCN transcription factor Zfhx3, termed short circuit (Zfhx3Sci), which accelerates circadian locomotor rhythms in mice. ZFHX3 regulates transcription via direct interaction with predicted AT motifs in target genes. The mutant protein has a decreased ability to activate consensus AT motifs in vitro. Using RNA sequencing, we found minimal effects on core clock genes in Zfhx3Sci/+ SCN, whereas the expression of neuropeptides critical for SCN intercellular signaling was significantly disturbed. Moreover, mutant ZFHX3 had a decreased ability to activate AT motifs in the promoters of these neuropeptide genes. Lentiviral transduction of SCN slices showed that the ZFHX3-mediated activation of AT motifs is circadian, with decreased amplitude and robustness of these oscillations in Zfhx3Sci/+ SCN slices. In conclusion, by cloning Zfhx3Sci, we have uncovered a circadian transcriptional axis that determines the period and robustness of behavioral and SCN molecular rhythms. PMID:26232227

  5. Hybrid DNA i-motif: Aminoethylprolyl-PNA (pC5) enhance the stability of DNA (dC5) i-motif structure.

    PubMed

    Gade, Chandrasekhar Reddy; Sharma, Nagendra K

    2017-12-15

    This report describes the synthesis of a C-rich sequence, the cytosine pentamer, of aep-PNA and its biophysical studies for the formation of a hybrid DNA:aep-PNA i-motif structure with the DNA cytosine pentamer (dC5) under acidic pH conditions. Herein, the CD/UV/NMR/ESI-Mass studies strongly support the formation of a stable hybrid DNA i-motif structure with aep-PNA even near acidic conditions. Hence the aep-PNA C-rich cytosine sequence could be considered a potential DNA i-motif stabilizing agent under in vivo conditions. Copyright © 2017 Elsevier Ltd. All rights reserved.

  6. ELM: the status of the 2010 eukaryotic linear motif resource

    PubMed Central

    Gould, Cathryn M.; Diella, Francesca; Via, Allegra; Puntervoll, Pål; Gemünd, Christine; Chabanis-Davidson, Sophie; Michael, Sushama; Sayadi, Ahmed; Bryne, Jan Christian; Chica, Claudia; Seiler, Markus; Davey, Norman E.; Haslam, Niall; Weatheritt, Robert J.; Budd, Aidan; Hughes, Tim; Paś, Jakub; Rychlewski, Leszek; Travé, Gilles; Aasland, Rein; Helmer-Citterich, Manuela; Linding, Rune; Gibson, Toby J.

    2010-01-01

    Linear motifs are short segments of multidomain proteins that provide regulatory functions independently of protein tertiary structure. Much of intracellular signalling passes through protein modifications at linear motifs. Many thousands of linear motif instances, most notably phosphorylation sites, have now been reported. Although clearly very abundant, linear motifs are difficult to predict de novo in protein sequences due to the difficulty of obtaining robust statistical assessments. The ELM resource at http://elm.eu.org/ provides an expanding knowledge base, currently covering 146 known motifs, with annotation that includes >1300 experimentally reported instances. ELM is also an exploratory tool for suggesting new candidates of known linear motifs in proteins of interest. Information about protein domains, protein structure and native disorder, cellular and taxonomic contexts is used to reduce or deprecate false positive matches. Results are graphically displayed in a ‘Bar Code’ format, which also displays known instances from homologous proteins through a novel ‘Instance Mapper’ protocol based on PHI-BLAST. ELM server output provides links to the ELM annotation as well as to a number of remote resources. Using the links, researchers can explore the motifs, proteins, complex structures and associated literature to evaluate whether candidate motifs might be worth experimental investigation. PMID:19920119
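    The idea of scanning a protein of interest for candidate instances of a known linear motif can be sketched with a regular expression. This is only an illustration: the function name `find_motif_instances`, the toy sequence and the pattern `[ST]P.[KR]` (a commonly cited proline-directed kinase consensus) are assumptions chosen for the example and are not taken from the ELM resource, and the context filtering described in the abstract is not modelled.

```python
import re


def find_motif_instances(sequence: str, pattern: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched text) for every match.

    A lookahead is used so that overlapping matches are also reported.
    """
    regex = re.compile(f"(?=({pattern}))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]


# Hypothetical toy sequence and illustrative pattern (not an ELM entry).
toy_sequence = "MASPAKTPERSPVR"
hits = find_motif_instances(toy_sequence, r"[ST]P.[KR]")
for position, text in hits:
    print(position, text)
```

    Raw pattern matches of this kind are only candidates; as the abstract notes, information such as domain context and native disorder is needed to reduce false positives.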

  7. Automated Recognition of RNA Structure Motifs by Their SHAPE Data Signatures.

    PubMed

    Radecki, Pierce; Ledda, Mirko; Aviran, Sharon

    2018-06-14

    High-throughput structure profiling (SP) experiments that provide information at nucleotide resolution are revolutionizing our ability to study RNA structures. Of particular interest are RNA elements whose underlying structures are necessary for their biological functions. We previously introduced patteRNA, an algorithm for rapidly mining SP data for patterns characteristic of such motifs. This work provided a proof-of-concept for the detection of motifs and the capability of distinguishing structures displaying pronounced conformational changes. Here, we describe several improvements and automation routines to patteRNA. We then consider more elaborate biological situations starting with the comparison or integration of results from searches for distinct motifs and across datasets. To facilitate such analyses, we characterize patteRNA’s outputs and describe a normalization framework that regularizes results. We then demonstrate that our algorithm successfully discerns between highly similar structural variants of the human immunodeficiency virus type 1 (HIV-1) Rev response element (RRE) and readily identifies its exact location in whole-genome structure profiles of HIV-1. This work highlights the breadth of information that can be gleaned from SP data and broadens the utility of data-driven methods as tools for the detection of novel RNA elements.

  8. Self-assembly of multi-stranded RNA motifs into lattices and tubular structures

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Stewart, Jaimie Marie; Subramanian, Hari K. K.; Franco, Elisa

    Rational design of nucleic acid molecules yields self-assembling scaffolds with increasing complexity, size and functionality. It is an open question whether design methods tailored to build DNA nanostructures can be adapted to build RNA nanostructures with comparable features. We demonstrate the formation of RNA lattices and tubular assemblies from double crossover (DX) tiles, a canonical motif in DNA nanotechnology. Tubular structures can exceed 1 μm in length, suggesting that this DX motif can produce very robust lattices. Some of these tubes spontaneously form with left-handed chirality. We obtain assemblies by using two methods: a protocol where gel-extracted RNA strands are slowly annealed, and a one-pot transcription and anneal procedure. We then identify the tile nick position as a structural requirement for lattice formation. These results demonstrate that stable RNA structures can be obtained with design tools imported from DNA nanotechnology. These large assemblies could be potentially integrated with a variety of functional RNA motifs for drug or nanoparticle delivery, or for colocalization of cellular components.

  9. Self-assembly of multi-stranded RNA motifs into lattices and tubular structures

    DOE PAGES

    Stewart, Jaimie Marie; Subramanian, Hari K. K.; Franco, Elisa

    2017-02-16

    Rational design of nucleic acid molecules yields self-assembling scaffolds with increasing complexity, size and functionality. It is an open question whether design methods tailored to build DNA nanostructures can be adapted to build RNA nanostructures with comparable features. We demonstrate the formation of RNA lattices and tubular assemblies from double crossover (DX) tiles, a canonical motif in DNA nanotechnology. Tubular structures can exceed 1 μm in length, suggesting that this DX motif can produce very robust lattices. Some of these tubes spontaneously form with left-handed chirality. We obtain assemblies by using two methods: a protocol where gel-extracted RNA strands are slowly annealed, and a one-pot transcription and anneal procedure. We then identify the tile nick position as a structural requirement for lattice formation. These results demonstrate that stable RNA structures can be obtained with design tools imported from DNA nanotechnology. These large assemblies could be potentially integrated with a variety of functional RNA motifs for drug or nanoparticle delivery, or for colocalization of cellular components.

  10. Self-assembly of multi-stranded RNA motifs into lattices and tubular structures

    PubMed Central

    Stewart, Jaimie Marie; Subramanian, Hari K. K.

    2017-01-01

    Abstract Rational design of nucleic acid molecules yields self-assembling scaffolds with increasing complexity, size and functionality. It is an open question whether design methods tailored to build DNA nanostructures can be adapted to build RNA nanostructures with comparable features. Here we demonstrate the formation of RNA lattices and tubular assemblies from double crossover (DX) tiles, a canonical motif in DNA nanotechnology. Tubular structures can exceed 1 μm in length, suggesting that this DX motif can produce very robust lattices. Some of these tubes spontaneously form with left-handed chirality. We obtain assemblies by using two methods: a protocol where gel-extracted RNA strands are slowly annealed, and a one-pot transcription and anneal procedure. We identify the tile nick position as a structural requirement for lattice formation. Our results demonstrate that stable RNA structures can be obtained with design tools imported from DNA nanotechnology. These large assemblies could be potentially integrated with a variety of functional RNA motifs for drug or nanoparticle delivery, or for colocalization of cellular components. PMID:28204562

  11. Identification of the divergent calmodulin binding motif in yeast Ssb1/Hsp75 protein and in other HSP70 family members.

    PubMed

    Heinen, R C; Diniz-Mendes, L; Silva, J T; Paschoalin, V M F

    2006-11-01

    Yeast soluble proteins were fractionated by calmodulin-agarose affinity chromatography and the Ca2+/calmodulin-binding proteins were analyzed by SDS-PAGE. One prominent protein of 66 kDa was excised from the gel and digested with trypsin, and the masses of the resultant fragments were determined by MALDI/MS. Twenty-one of 38 monoisotopic peptide masses obtained after tryptic digestion were matched to the heat shock protein Ssb1/Hsp75, covering 37% of its sequence. Computational analysis of the primary structure of Ssb1/Hsp75 identified a unique potential amphipathic alpha-helix in its N-terminal ATPase domain with features of target regions for Ca2+/calmodulin binding. This region, which shares 89% similarity with the experimentally determined calmodulin-binding domain from mouse Hsc70, is conserved in nearly half of the 113 members of the HSP70 family investigated, from yeast to plants and animals. Based on the sequence of this region, phylogenetic analysis grouped the HSP70s in three distinct branches. Two of them comprise the non-calmodulin-binding Hsp70s: BiP/GRP78, a subfamily of eukaryotic HSP70 localized in the endoplasmic reticulum, and DnaK, a subfamily of prokaryotic HSP70. A third, heterogeneous group is formed by eukaryotic cytosolic HSP70s containing the new calmodulin-binding motif and other cytosolic HSP70s whose sequences do not conform to the conserved motif, indicating that not all eukaryotic cytosolic Hsp70s are targets for calmodulin regulation. Furthermore, the calmodulin-binding domain found in eukaryotic HSP70s is also the target for binding of Bag-1, an enhancer of the ADP/ATP exchange activity of Hsp70s. A model in which calmodulin displaces Bag-1 and modulates Ssb1/Hsp75 chaperone activity is discussed.

  12. TFBSshape: a motif database for DNA shape features of transcription factor binding sites.

    PubMed

    Yang, Lin; Zhou, Tianyin; Dror, Iris; Mathelier, Anthony; Wasserman, Wyeth W; Gordân, Raluca; Rohs, Remo

    2014-01-01

    Transcription factor binding sites (TFBSs) are most commonly characterized by the nucleotide preferences at each position of the DNA target. Whereas these sequence motifs are quite accurate descriptions of DNA binding specificities of transcription factors (TFs), proteins recognize DNA as a three-dimensional object. DNA structural features refine the description of TF binding specificities and provide mechanistic insights into protein-DNA recognition. Existing motif databases contain extensive nucleotide sequences identified in binding experiments based on their selection by a TF. To utilize DNA shape information when analysing the DNA binding specificities of TFs, we developed a new tool, the TFBSshape database (available at http://rohslab.cmb.usc.edu/TFBSshape/), for calculating DNA structural features from nucleotide sequences provided by motif databases. The TFBSshape database can be used to generate heat maps and quantitative data for DNA structural features (i.e., minor groove width, roll, propeller twist and helix twist) for 739 TF datasets from 23 different species derived from the motif databases JASPAR and UniPROBE. As demonstrated for the basic helix-loop-helix and homeodomain TF families, our TFBSshape database can be used to compare, qualitatively and quantitatively, the DNA binding specificities of closely related TFs and, thus, uncover differential DNA binding specificities that are not apparent from nucleotide sequence alone.

  13. TFBSshape: a motif database for DNA shape features of transcription factor binding sites

    PubMed Central

    Yang, Lin; Zhou, Tianyin; Dror, Iris; Mathelier, Anthony; Wasserman, Wyeth W.; Gordân, Raluca; Rohs, Remo

    2014-01-01

    Transcription factor binding sites (TFBSs) are most commonly characterized by the nucleotide preferences at each position of the DNA target. Whereas these sequence motifs are quite accurate descriptions of DNA binding specificities of transcription factors (TFs), proteins recognize DNA as a three-dimensional object. DNA structural features refine the description of TF binding specificities and provide mechanistic insights into protein–DNA recognition. Existing motif databases contain extensive nucleotide sequences identified in binding experiments based on their selection by a TF. To utilize DNA shape information when analysing the DNA binding specificities of TFs, we developed a new tool, the TFBSshape database (available at http://rohslab.cmb.usc.edu/TFBSshape/), for calculating DNA structural features from nucleotide sequences provided by motif databases. The TFBSshape database can be used to generate heat maps and quantitative data for DNA structural features (i.e., minor groove width, roll, propeller twist and helix twist) for 739 TF datasets from 23 different species derived from the motif databases JASPAR and UniPROBE. As demonstrated for the basic helix-loop-helix and homeodomain TF families, our TFBSshape database can be used to compare, qualitatively and quantitatively, the DNA binding specificities of closely related TFs and, thus, uncover differential DNA binding specificities that are not apparent from nucleotide sequence alone. PMID:24214955
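    The "nucleotide preferences at each position" that the abstract contrasts with DNA shape can be sketched as a position frequency matrix built from aligned binding sites. This is a minimal illustration only: the function name and the four toy sites are assumptions made for the example, and the code does not compute any of the structural features (minor groove width, roll, propeller twist, helix twist) that TFBSshape provides.

```python
from collections import Counter


def position_frequency_matrix(sites: list[str]) -> list[dict[str, float]]:
    """Return per-position base frequencies for equal-length aligned sites."""
    if not sites:
        raise ValueError("at least one site is required")
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")

    matrix = []
    for column in zip(*sites):  # iterate over alignment columns
        counts = Counter(column)
        matrix.append({base: counts.get(base, 0) / len(sites) for base in "ACGT"})
    return matrix


# Hypothetical aligned binding sites, used only to exercise the function.
toy_sites = ["ACGT", "ACGA", "ACGT", "TCGT"]
pfm = position_frequency_matrix(toy_sites)
for index, column in enumerate(pfm, start=1):
    print(index, column)
```

    Two factors can share a nearly identical matrix of this kind and still differ in binding specificity, which is the gap that shape features are meant to address.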

  14. A methodological approach to identify agro-biodiversity hotspots for priority in situ conservation of plant genetic resources

    PubMed Central

    Pacicco, Luca; Bodesmo, Mara; Torricelli, Renzo

    2018-01-01

    Agro-biodiversity is seriously threatened worldwide and strategies to preserve it are dramatically required. We propose here a methodological approach aimed to identify areas with a high level of agro-biodiversity in which to set or enhance in situ conservation of plant genetic resources. These areas are identified using three criteria: Presence of Landrace diversity, Presence of wild species and Agro-ecosystem ecological diversity. A Restrictive and an Additive prioritization strategy has been applied on the entire Italian territory and has resulted in establishing nationwide 53 and 197 agro-biodiversity hotspots respectively. At present the strategies can easily be applied at a European level and can be helpful to develop conservation strategies everywhere. PMID:29856765

  15. Permuting the PGF Signature Motif Blocks both Archaeosortase-Dependent C-Terminal Cleavage and Prenyl Lipid Attachment for the Haloferax volcanii S-Layer Glycoprotein.

    PubMed

    Abdul Halim, Mohd Farid; Karch, Kelly R; Zhou, Yitian; Haft, Daniel H; Garcia, Benjamin A; Pohlschroder, Mechthild

    2015-12-28

    For years, the S-layer glycoprotein (SLG), the sole component of many archaeal cell walls, was thought to be anchored to the cell surface by a C-terminal transmembrane segment. Recently, however, we demonstrated that the Haloferax volcanii SLG C terminus is removed by an archaeosortase (ArtA), a novel peptidase. SLG, which was previously shown to be lipid modified, contains a C-terminal tripartite structure, including a highly conserved proline-glycine-phenylalanine (PGF) motif. Here, we demonstrate that ArtA does not process an SLG variant where the PGF motif is replaced with a PFG motif (slg(G796F,F797G)). Furthermore, using radiolabeling, we show that SLG lipid modification requires the PGF motif and is ArtA dependent, lending confirmation to the use of a novel C-terminal lipid-mediated protein-anchoring mechanism by prokaryotes. Similar to the case for the ΔartA strain, the growth, cellular morphology, and cell wall of the slg(G796F,F797G) strain, in which modifications of additional H. volcanii ArtA substrates should not be altered, are adversely affected, demonstrating the importance of these posttranslational SLG modifications. Our data suggest that ArtA is either directly or indirectly involved in a novel proteolysis-coupled, covalent lipid-mediated anchoring mechanism. Given that archaeosortase homologs are encoded by a broad range of prokaryotes, it is likely that this anchoring mechanism is widely conserved. Prokaryotic proteins bound to cell surfaces through intercalation, covalent attachment, or protein-protein interactions play critical roles in essential cellular processes. Unfortunately, the molecular mechanisms that anchor proteins to archaeal cell surfaces remain poorly characterized. Here, using the archaeon H. volcanii as a model system, we report the first in vivo studies of a novel protein-anchoring pathway involving lipid modification of a peptidase-processed C terminus. Our findings not only yield important insights into poorly understood

  16. Scale-dependent complementarity of climatic velocity and environmental diversity for identifying priority areas for conservation under climate change.

    PubMed

    Carroll, Carlos; Roberts, David R; Michalak, Julia L; Lawler, Joshua J; Nielsen, Scott E; Stralberg, Diana; Hamann, Andreas; Mcrae, Brad H; Wang, Tongli

    2017-11-01

    As most regions of the earth transition to altered climatic conditions, new methods are needed to identify refugia and other areas whose conservation would facilitate persistence of biodiversity under climate change. We compared several common approaches to conservation planning focused on climate resilience over a broad range of ecological settings across North America and evaluated how commonalities in the priority areas identified by different methods varied with regional context and spatial scale. Our results indicate that priority areas based on different environmental diversity metrics differed substantially from each other and from priorities based on spatiotemporal metrics such as climatic velocity. Refugia identified by diversity or velocity metrics were not strongly associated with the current protected area system, suggesting the need for additional conservation measures including protection of refugia. Despite the inherent uncertainties in predicting future climate, we found that variation among climatic velocities derived from different general circulation models and emissions pathways was less than the variation among the suite of environmental diversity metrics. To address uncertainty created by this variation, planners can combine priorities identified by alternative metrics at a single resolution and downweight areas of high variation between metrics. Alternately, coarse-resolution velocity metrics can be combined with fine-resolution diversity metrics in order to leverage the respective strengths of the two groups of metrics as tools for identification of potential macro- and microrefugia that in combination maximize both transient and long-term resilience to climate change. Planners should compare and integrate approaches that span a range of model complexity and spatial scale to match the range of ecological and physical processes influencing persistence of biodiversity and identify a conservation network resilient to threats operating at

  17. The MiiA motif is a common marker present in polytopic surface proteins of oral and urinary tract invasive bacteria.

    PubMed

    Martín-Galiano, Antonio J

    2017-04-01

    Many surface virulence factors of bacterial pathogens show mosaicism and confounding phylogenetic origin. The Streptococcus gordonii platelet-binding GspB protein, the Streptococcus sanguinis SrpA adhesin and the Streptococcus pneumoniae DiiA protein share an imperfect 27-residue motif. Given the disparate domain architectures of these proteins and their association with invasive disease, this motif was named MiiA, from Multiarchitecture invasion-involved motif A. MiiA is predicted to adopt a beta-sheet fold, probably related to the Ig-like fold, with a symmetrical positioning of two conserved aspartic residues. A specific hidden Markov model profiling MiiA was built, which specifically detected the motif in proteins from 58 species, mainly in cell-wall proteins from Gram-positive bacteria. These proteins contained one to ten MiiA motifs, which were embedded within larger repeat units of 70-82 residues. MiiA motifs were combined with other domains and elements such as coiled-coils and low-complexity regions. The species carrying MiiA proteins included commensals from the urogenital tract and the oral cavity, which can cause opportunistic endocarditis and sepsis. Intra-protein MiiA repeats showed a complex mixture of orthologous, paralogous and inter-species relationships, suggestive of a multistep origin. The presence of these repeats in proteins involved in oligosaccharide recognition, together with the lifestyle of the species, suggests a putative function for MiiA repeats in sugar binding, probably of sugars present in receptors of epithelial and blood cells. MiiA modules appear to have been transferred horizontally between species co-habiting in the same niche to create their own MiiA-containing determinants. The present work provides a global study and a catalog of potential MiiA virulence factors that should be analyzed experimentally. Copyright © 2017 Elsevier B.V. All rights reserved.

  18. Mechanisms of Zero-Lag Synchronization in Cortical Motifs

    PubMed Central

    Gollo, Leonardo L.; Mirasso, Claudio; Sporns, Olaf; Breakspear, Michael

    2014-01-01

    Zero-lag synchronization between distant cortical areas has been observed in a diversity of experimental data sets and between many different regions of the brain. Several computational mechanisms have been proposed to account for such isochronous synchronization in the presence of long conduction delays: Of these, the phenomenon of “dynamical relaying” – a mechanism that relies on a specific network motif – has proven to be the most robust with respect to parameter mismatch and system noise. Surprisingly, despite a contrary belief in the community, the common driving motif is an unreliable means of establishing zero-lag synchrony. Although dynamical relaying has been validated in empirical and computational studies, the deeper dynamical mechanisms and comparison to dynamics on other motifs is lacking. By systematically comparing synchronization on a variety of small motifs, we establish that the presence of a single reciprocally connected pair – a “resonance pair” – plays a crucial role in disambiguating those motifs that foster zero-lag synchrony in the presence of conduction delays (such as dynamical relaying) from those that do not (such as the common driving triad). Remarkably, minor structural changes to the common driving motif that incorporate a reciprocal pair recover robust zero-lag synchrony. The findings are observed in computational models of spiking neurons, populations of spiking neurons and neural mass models, and arise whether the oscillatory systems are periodic, chaotic, noise-free or driven by stochastic inputs. The influence of the resonance pair is also robust to parameter mismatch and asymmetrical time delays amongst the elements of the motif. We call this manner of facilitating zero-lag synchrony resonance-induced synchronization, outline the conditions for its occurrence, and propose that it may be a general mechanism to promote zero-lag synchrony in the brain. PMID:24763382

  19. The effect of orthology and coregulation on detecting regulatory motifs.

    PubMed

    Storms, Valerie; Claeys, Marleen; Sanchez, Aminael; De Moor, Bart; Verstuyf, Annemieke; Marchal, Kathleen

    2010-02-03

    Computational de novo discovery of transcription factor binding sites is still a challenging problem. The growing number of sequenced genomes allows integrating orthology evidence with coregulation information when searching for motifs. Moreover, the more advanced motif detection algorithms explicitly model the phylogenetic relatedness between the orthologous input sequences and thus should be well adapted towards using orthologous information. In this study, we evaluated the conditions under which complementing coregulation with orthologous information improves motif detection for the class of probabilistic motif detection algorithms with an explicit evolutionary model. We designed datasets (real and synthetic) covering different degrees of coregulation and orthologous information to test how well Phylogibbs and Phylogenetic sampler, as representatives of the motif detection algorithms with an evolutionary model, performed as compared to MEME, a more classical motif detection algorithm that treats orthologs independently. Under certain conditions detecting motifs in the combined coregulation-orthology space is indeed more efficient than using each space separately, but this is not always the case. Moreover, the difference in success rate between the advanced algorithms and MEME is still marginal. The success rate of motif detection depends on the complex interplay between the added information and the specificities of the applied algorithms. Insights into this relation provide information useful to both developers and users. All benchmark datasets are available at http://homes.esat.kuleuven.be/~kmarchal/Supplementary_Storms_Valerie_PlosONE.

  20. Allele frequencies of variants in ultra conserved elements identify selective pressure on transcription factor binding.

    PubMed

    Silla, Toomas; Kepp, Katrin; Tai, E Shyong; Goh, Liang; Davila, Sonia; Catela Ivkovic, Tina; Calin, George A; Voorhoeve, P Mathijs

    2014-01-01

    Ultra-conserved genes or elements (UCGs/UCEs) in the human genome are extreme examples of conservation. We characterized natural variations in 2884 UCEs and UCGs in two distinct populations, Singaporean Chinese (n = 280) and Italian (n = 501), by using a pooled-sample, targeted-capture sequencing approach. We identify, with high confidence, an abundance of rare SNVs (MAF<0.5%) in these regions, 75% of which are not present in dbSNP137. UCE association studies for complex human traits can use this information to model expected background variation and thus the necessary power for association studies. By combining our data with 1000 Genome Project data, we show in three independent datasets that prevalent UCE variants (MAF>5%) are more often found in relatively less-conserved nucleotides within UCEs, compared to rare variants. Moreover, prevalent variants are less likely to overlap transcription factor binding sites. Using SNPfold we found no significant influence of RNA secondary structure on UCE conservation. Altogether, these results suggest UCEs are not under selective pressure as a stretch of DNA but are under differential evolutionary pressure on the single nucleotide level.
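    The two minor allele frequency (MAF) thresholds quoted in the abstract (rare: MAF<0.5%; prevalent: MAF>5%) can be applied as in the sketch below. The function names, the "intermediate" label for variants between the two cut-offs, and the assumption of diploid samples (so that 280 individuals contribute 560 alleles) are choices made for this example and are not stated in the source.

```python
def minor_allele_frequency(alt_count: int, total_alleles: int) -> float:
    """Return the frequency of the less common allele at a biallelic site."""
    if total_alleles <= 0:
        raise ValueError("total_alleles must be positive")
    if not 0 <= alt_count <= total_alleles:
        raise ValueError("alt_count must lie between 0 and total_alleles")
    frequency = alt_count / total_alleles
    return min(frequency, 1.0 - frequency)


def classify_variant(maf: float,
                     rare_cutoff: float = 0.005,
                     prevalent_cutoff: float = 0.05) -> str:
    """Label a variant using the abstract's cut-offs (0.5% and 5%)."""
    if maf < rare_cutoff:
        return "rare"
    if maf > prevalent_cutoff:
        return "prevalent"
    return "intermediate"  # label assumed for this sketch


# Assumed diploid pool: 280 individuals -> 560 alleles.
maf = minor_allele_frequency(alt_count=2, total_alleles=560)
print(round(maf, 5), classify_variant(maf))
```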

  1. cWINNOWER algorithm for finding fuzzy dna motifs

    NASA Technical Reports Server (NTRS)

    Liang, S.; Samanta, M. P.; Biegel, B. A.

    2004-01-01

    The cWINNOWER algorithm detects fuzzy motifs in DNA sequences rich in protein-binding signals. A signal is defined as any short nucleotide pattern having up to d mutations differing from a motif of length l. The algorithm finds such motifs if a clique consisting of a sufficiently large number of mutated copies of the motif (i.e., the signals) is present in the DNA sequence. The cWINNOWER algorithm substantially improves the sensitivity of the winnower method of Pevzner and Sze by imposing a consensus constraint, enabling it to detect much weaker signals. We studied the minimum detectable clique size qc as a function of sequence length N for random sequences. We found that qc increases linearly with N for a fast version of the algorithm based on counting three-member sub-cliques. Imposing consensus constraints reduces qc by a factor of three in this case, which makes the algorithm dramatically more sensitive. Our most sensitive algorithm, which counts four-member sub-cliques, needs a minimum of only 13 signals to detect motifs in a sequence of length N = 12,000 for (l, d) = (15, 4). Copyright Imperial College Press.
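    The signal definition above (a pattern differing from a motif of length l by at most d mutations) can be sketched with a Hamming-distance test. Because two signals that each lie within d mutations of the same motif can differ from each other by at most 2d positions, candidate pairs can be linked in a graph in which cliques are then sought. The code below only builds such pairwise links; it is not the cWINNOWER algorithm itself, it omits the consensus constraint and sub-clique counting, and the function names and the restriction to non-overlapping windows are assumptions of this sketch.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def is_signal(candidate: str, motif: str, d: int) -> bool:
    """True if candidate differs from motif by at most d mutations."""
    return hamming(candidate, motif) <= d


def similarity_edges(sequence: str, l: int, d: int) -> list[tuple[int, int]]:
    """Link non-overlapping l-mers whose Hamming distance is at most 2d."""
    lmers = [(i, sequence[i:i + l]) for i in range(len(sequence) - l + 1)]
    return [
        (i, j)
        for (i, a), (j, b) in combinations(lmers, 2)
        if j - i >= l and hamming(a, b) <= 2 * d
    ]


# Hypothetical toy input, far smaller than the (l, d) = (15, 4) case above.
print(is_signal("ACGA", "ACGT", d=1))
print(similarity_edges("AAAATTTTAAAC", l=4, d=1))
```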

  2. The conserved role of Krox-20 in directing Hox gene expression during vertebrate hindbrain segmentation.

    PubMed

    Nonchev, S; Maconochie, M; Vesque, C; Aparicio, S; Ariza-McNaughton, L; Manzanares, M; Maruthainar, K; Kuroiwa, A; Brenner, S; Charnay, P; Krumlauf, R

    1996-09-03

    Transient segmentation in the hindbrain is a fundamental morphogenetic phenomenon in the vertebrate embryo, and the restricted expression of subsets of Hox genes in the developing rhombomeric units and their derivatives is linked with regional specification. Here we show that patterning of the vertebrate hindbrain involves the direct upregulation of the chicken and pufferfish group 2 paralogous genes, Hoxb-2 and Hoxa-2, in rhombomeres 3 and 5 (r3 and r5) by the zinc finger gene Krox-20. We identified evolutionarily conserved r3/r5 enhancers that contain high affinity Krox-20 binding sites capable of mediating transactivation by Krox-20. In addition to conservation of binding sites critical for Krox-20 activity in the chicken Hoxa-2 and pufferfish Hoxb-2 genes, the r3/r5 enhancers are also characterized by the presence of a number of identical motifs likely to be involved in cooperative interactions with Krox-20 during the process of hindbrain patterning in vertebrates.

  3. Sequence information gain based motif analysis.

    PubMed

    Maynou, Joan; Pairó, Erola; Marco, Santiago; Perera, Alexandre

    2015-11-09

    The detection of regulatory regions in candidate sequences is essential for the understanding of the regulation of a particular gene and the mechanisms involved. This paper proposes a novel methodology based on information theoretic metrics for finding regulatory sequences in promoter regions. This methodology (SIGMA) has been tested on genomic sequence data for Homo sapiens and Mus musculus. SIGMA has been compared with different publicly available alternatives for motif detection, such as MEME/MAST, Biostrings (Bioconductor package), MotifRegressor, and previous work such as Qresiduals projections or information theoretic based detectors. Comparative results, in the form of Receiver Operating Characteristic curves, show how, in 70% of the studied Transcription Factor Binding Sites, the SIGMA detector has a better performance and behaves more robustly than the methods compared, while having a similar computational time. The performance of SIGMA can be explained by its parametric simplicity in the modelling of the non-linear co-variability in the binding motif positions. Sequence Information Gain based Motif Analysis is a generalisation of a non-linear model of the cis-regulatory sequences detection based on Information Theory. This generalisation allows us to detect transcription factor binding sites with maximum performance disregarding the covariability observed in the positions of the training set of sequences. SIGMA is freely available to the public at http://b2slab.upc.edu.
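    As background for the "information theoretic metrics" mentioned above, the sketch below computes the standard per-position information content of a motif column, i.e. the reduction in Shannon entropy relative to a uniform four-letter background. This is a generic textbook quantity and not the SIGMA method, whose details are not given in the abstract; the function name and the uniform-background assumption are choices made for this example.

```python
import math


def column_information(frequencies: dict[str, float]) -> float:
    """Information content (bits) of one motif position.

    Assumes a DNA alphabet with a uniform background, so the maximum
    is log2(4) = 2 bits for a fully conserved position.
    """
    if not math.isclose(sum(frequencies.values()), 1.0, abs_tol=1e-9):
        raise ValueError("frequencies must sum to 1")
    entropy = -sum(p * math.log2(p) for p in frequencies.values() if p > 0)
    return math.log2(4) - entropy


conserved = {"A": 1.0, "C": 0.0, "G": 0.0, "T": 0.0}
uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
print(column_information(conserved))  # fully conserved position
print(column_information(uniform))    # uninformative position
```

    Summing this quantity over positions treats them as independent, which is exactly the simplification that methods modelling co-variability between positions aim to relax.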

  4. Magnesium-binding architectures in RNA crystal structures: validation, binding preferences, classification and motif detection

    PubMed Central

    Zheng, Heping; Shabalin, Ivan G.; Handing, Katarzyna B.; Bujnicki, Janusz M.; Minor, Wladek

    2015-01-01

    The ubiquitous presence of magnesium ions in RNA has long been recognized as a key factor governing RNA folding, and is crucial for many diverse functions of RNA molecules. In this work, Mg2+-binding architectures in RNA were systematically studied using a database of RNA crystal structures from the Protein Data Bank (PDB). Due to the abundance of poorly modeled or incorrectly identified Mg2+ ions, the set of all sites was comprehensively validated and filtered to identify a benchmark dataset of 15 334 ‘reliable’ RNA-bound Mg2+ sites. The normalized frequencies by which specific RNA atoms coordinate Mg2+ were derived for both the inner and outer coordination spheres. A hierarchical classification system of Mg2+ sites in RNA structures was designed and applied to the benchmark dataset, yielding a set of 41 types of inner-sphere and 95 types of outer-sphere coordinating patterns. This classification system has also been applied to describe six previously reported Mg2+-binding motifs and detect them in new RNA structures. Investigation of the most populous site types resulted in the identification of seven novel Mg2+-binding motifs, and all RNA structures in the PDB were screened for the presence of these motifs. PMID:25800744

  5. Automated conserved non-coding sequence (CNS) discovery reveals differences in gene content and promoter evolution among grasses

    PubMed Central

    Turco, Gina; Schnable, James C.; Pedersen, Brent; Freeling, Michael

    2013-01-01

    Conserved non-coding sequences (CNS) are islands of non-coding sequence that, like protein coding exons, show less divergence in sequence between related species than functionless DNA. Several CNSs have been demonstrated experimentally to function as cis-regulatory regions. However, the specific functions of most CNSs remain unknown. Previous searches for CNS in plants have either anchored on exons and only identified nearby sequences or required years of painstaking manual annotation. Here we present an open source tool that can accurately identify CNSs between any two related species with sequenced genomes, including both those immediately adjacent to exons and distal sequences separated by >12 kb of non-coding sequence. We have used this tool to characterize new motifs, associate CNSs with additional functions, and identify previously undetected genes encoding RNA and protein in the genomes of five grass species. We provide a list of 15,363 orthologous CNSs conserved across all grasses tested. We were also able to identify regulatory sequences present in the common ancestor of grasses that have been lost in one or more extant grass lineages. Lists of orthologous gene pairs and associated CNSs are provided for reference inbred lines of arabidopsis, Japonica rice, foxtail millet, sorghum, brachypodium, and maize. PMID:23874343

  6. The Effect of Orthology and Coregulation on Detecting Regulatory Motifs

    PubMed Central

    Storms, Valerie; Claeys, Marleen; Sanchez, Aminael; De Moor, Bart; Verstuyf, Annemieke; Marchal, Kathleen

    2010-01-01

    Background Computational de novo discovery of transcription factor binding sites is still a challenging problem. The growing number of sequenced genomes allows integrating orthology evidence with coregulation information when searching for motifs. Moreover, the more advanced motif detection algorithms explicitly model the phylogenetic relatedness between the orthologous input sequences and thus should be well adapted towards using orthologous information. In this study, we evaluated the conditions under which complementing coregulation with orthologous information improves motif detection for the class of probabilistic motif detection algorithms with an explicit evolutionary model. Methodology We designed datasets (real and synthetic) covering different degrees of coregulation and orthologous information to test how well Phylogibbs and Phylogenetic sampler, as representatives of the motif detection algorithms with evolutionary model performed as compared to MEME, a more classical motif detection algorithm that treats orthologs independently. Results and Conclusions Under certain conditions detecting motifs in the combined coregulation-orthology space is indeed more efficient than using each space separately, but this is not always the case. Moreover, the difference in success rate between the advanced algorithms and MEME is still marginal. The success rate of motif detection depends on the complex interplay between the added information and the specificities of the applied algorithms. Insights in this relation provide information useful to both developers and users. All benchmark datasets are available at http://homes.esat.kuleuven.be/~kmarchal/Supplementary_Storms_Valerie_PlosONE. PMID:20140085

  7. Principles of regulatory information conservation between mouse and human.

    PubMed

    Cheng, Yong; Ma, Zhihai; Kim, Bong-Hyun; Wu, Weisheng; Cayting, Philip; Boyle, Alan P; Sundaram, Vasavi; Xing, Xiaoyun; Dogan, Nergiz; Li, Jingjing; Euskirchen, Ghia; Lin, Shin; Lin, Yiing; Visel, Axel; Kawli, Trupti; Yang, Xinqiong; Patacsil, Dorrelyn; Keller, Cheryl A; Giardine, Belinda; Kundaje, Anshul; Wang, Ting; Pennacchio, Len A; Weng, Zhiping; Hardison, Ross C; Snyder, Michael P

    2014-11-20

    To broaden our understanding of the evolution of gene regulation mechanisms, we generated occupancy profiles for 34 orthologous transcription factors (TFs) in human-mouse erythroid progenitor, lymphoblast and embryonic stem-cell lines. By combining the genome-wide transcription factor occupancy repertoires, associated epigenetic signals, and co-association patterns, here we deduce several evolutionary principles of gene regulatory features operating since the mouse and human lineages diverged. The genomic distribution profiles, primary binding motifs, chromatin states, and DNA methylation preferences are well conserved for TF-occupied sequences. However, the extent to which orthologous DNA segments are bound by orthologous TFs varies both among TFs and with genomic location: binding at promoters is more highly conserved than binding at distal elements. Notably, occupancy-conserved TF-occupied sequences tend to be pleiotropic; they function in several tissues and also co-associate with many TFs. Single nucleotide variants at sites with potential regulatory functions are enriched in occupancy-conserved TF-occupied sequences.

  8. Principles of regulatory information conservation between mouse and human

    DOE PAGES

    Cheng, Yong; Ma, Zhihai; Kim, Bong-Hyun; ...

    2014-11-19

    To broaden our understanding of the evolution of gene regulation mechanisms, we generated occupancy profiles for 34 orthologous transcription factors (TFs) in human–mouse erythroid progenitor, lymphoblast and embryonic stem-cell lines. By combining the genome-wide transcription factor occupancy repertoires, associated epigenetic signals, and co-association patterns, here we deduce several evolutionary principles of gene regulatory features operating since the mouse and human lineages diverged. The genomic distribution profiles, primary binding motifs, chromatin states, and DNA methylation preferences are well conserved for TF-occupied sequences. However, the extent to which orthologous DNA segments are bound by orthologous TFs varies both among TFs and with genomic location: binding at promoters is more highly conserved than binding at distal elements. Notably, occupancy-conserved TF-occupied sequences tend to be pleiotropic; they function in several tissues and also co-associate with many TFs. Lastly, single nucleotide variants at sites with potential regulatory functions are enriched in occupancy-conserved TF-occupied sequences.

  9. Cross-reactions vs co-sensitization evaluated by in silico motifs and in vitro IgE microarray testing.

    PubMed

    Pfiffner, P; Stadler, B M; Rasi, C; Scala, E; Mari, A

    2012-02-01

    Using an in silico allergen clustering method, we have recently shown that allergen extracts are highly cross-reactive. Here we used serological data from a multi-array IgE test based on recombinant or highly purified natural allergens to evaluate whether co-reactions are true cross-reactions or co-sensitizations by allergens with the same motifs. The serum database consisted of 3142 samples, each tested against 103 highly purified natural or recombinant allergens. Cross-reactivity was predicted by an iterative motif-finding algorithm through sequence motifs identified in 2708 known allergens. Allergen proteins containing the same motifs cross-reacted as predicted. However, proteins with identical motifs revealed a hierarchy in the degree of cross-reaction: The more frequent an allergen was positive in the allergic population, the less frequently it was cross-reacting and vice versa. Co-sensitization was analyzed by splitting the dataset into patient groups that were most likely sensitized through geographical occurrence of allergens. Interestingly, most co-reactions are cross-reactions but not co-sensitizations. The observed hierarchy of cross-reactivity may play an important role for the future management of allergic diseases. © 2011 John Wiley & Sons A/S.

  10. Fox-2 Splicing Factor Binds to a Conserved Intron Motif to Promote Inclusion of Protein 4.1R Alternative Exon 16

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ponthier, Julie L.; Schluepen, Christina; Chen, Weiguo

    Activation of protein 4.1R exon 16 (E16) inclusion during erythropoiesis represents a physiologically important splicing switch that increases 4.1R affinity for spectrin and actin. Previous studies showed that negative regulation of E16 splicing is mediated by the binding of hnRNP A/B proteins to silencer elements in the exon and that downregulation of hnRNP A/B proteins in erythroblasts leads to activation of E16 inclusion. This paper demonstrates that positive regulation of E16 splicing can be mediated by Fox-2 or Fox-1, two closely related splicing factors that possess identical RNA recognition motifs. SELEX experiments with human Fox-1 revealed highly selective binding to the hexamer UGCAUG. Both Fox-1 and Fox-2 were able to bind the conserved UGCAUG elements in the proximal intron downstream of E16, and both could activate E16 splicing in HeLa cell co-transfection assays in a UGCAUG-dependent manner. Conversely, knockdown of Fox-2 expression, achieved with two different siRNA sequences, resulted in decreased E16 splicing. Moreover, immunoblot experiments demonstrate that mouse erythroblasts express Fox-2, but not Fox-1. These findings suggest that Fox-2 is a physiological activator of E16 splicing in differentiating erythroid cells in vivo. Recent experiments show that UGCAUG is present in the proximal intron sequence of many tissue-specific alternative exons, and we propose that the Fox family of splicing enhancers plays an important role in alternative splicing switches during differentiation in metazoan organisms.
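
    The hexamer search described in this abstract can be illustrated with a minimal sketch. It assumes the intron is available as a plain RNA or DNA string; the function name find_hexamer_sites and the example sequence are hypothetical and are not taken from the paper.

```python
import re

# Hexamer reported in the abstract as the Fox-1/Fox-2 binding element.
FOX_HEXAMER = "UGCAUG"


def find_hexamer_sites(sequence, motif=FOX_HEXAMER):
    """Return 0-based start positions of every motif occurrence.

    Overlapping occurrences are reported, and DNA input (T) is
    converted to RNA (U) before matching.
    """
    rna = sequence.upper().replace("T", "U")
    # A zero-width lookahead lets the regex engine report overlapping hits.
    pattern = re.compile("(?=" + re.escape(motif) + ")")
    return [match.start() for match in pattern.finditer(rna)]


if __name__ == "__main__":
    # Made-up intron fragment, used only to exercise the function.
    intron = "AAUGCAUGCAUGCC"
    print(find_hexamer_sites(intron))
```

    Exact string matching is enough here because the abstract reports a single fixed hexamer; a degenerate motif would need a character-class pattern or a position weight matrix instead.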

  11. Crystal structure and novel recognition motif of rho ADP-ribosylating C3 exoenzyme from Clostridium botulinum: structural insights for recognition specificity and catalysis.

    PubMed

    Han, S; Arvai, A S; Clancy, S B; Tainer, J A

    2001-01-05

    Clostridium botulinum C3 exoenzyme inactivates the small GTP-binding protein family Rho by ADP-ribosylating asparagine 41, which depolymerizes the actin cytoskeleton. C3 thus represents a major family of the bacterial toxins that transfer the ADP-ribose moiety of NAD to specific amino acids in acceptor proteins to modify key biological activities in eukaryotic cells, including protein synthesis, differentiation, transformation, and intracellular signaling. The 1.7 A resolution C3 exoenzyme structure establishes the conserved features of the core NAD-binding beta-sandwich fold with other ADP-ribosylating toxins despite little sequence conservation. Importantly, the central core of the C3 exoenzyme structure is distinguished by the absence of an active site loop observed in many other ADP-ribosylating toxins. Unlike the ADP-ribosylating toxins that possess the active site loop near the central core, the C3 exoenzyme replaces the active site loop with an alpha-helix, alpha3. Moreover, structural and sequence similarities with the catalytic domain of vegetative insecticidal protein 2 (VIP2), an actin ADP-ribosyltransferase, unexpectedly implicates two adjacent, protruding turns, which join beta5 and beta6 of the toxin core fold, as a novel recognition specificity motif for this newly defined toxin family. Turn 1 evidently positions the solvent-exposed, aromatic side-chain of Phe209 to interact with the hydrophobic region of Rho adjacent to its GTP-binding site. Turn 2 evidently both places the Gln212 side-chain for hydrogen bonding to recognize Rho Asn41 for nucleophilic attack on the anomeric carbon of NAD ribose and holds the key Glu214 catalytic side-chain in the adjacent catalytic pocket. This proposed bipartite ADP-ribosylating toxin turn-turn (ARTT) motif places the VIP2 and C3 toxin classes into a single ARTT family characterized by analogous target protein recognition via turn 1 aromatic and turn 2 hydrogen-bonding side-chain moieties. 
Turn 2 centrally anchors

  12. Identification of a conserved B-cell epitope on the GapC protein of Streptococcus dysgalactiae.

    PubMed

    Zhang, Limeng; Zhou, Xue; Fan, Ziyao; Tang, Wei; Chen, Liang; Dai, Jian; Wei, Yuhua; Zhang, Jianxin; Yang, Xuan; Yang, Xijing; Liu, Daolong; Yu, Liquan; Zhang, Hua; Wu, Zhijun; Yu, Yongzhong; Sun, Hunan; Cui, Yudong

    2015-01-01

    Streptococcus dysgalactiae (S. dysgalactiae) GapC is a highly conserved surface dehydrogenase among the streptococcus spp., which is responsible for inducing protective antibody immune responses in animals. However, the B-cell epitope of S. dysgalactiae GapC has not been well characterized. In this study, a monoclonal antibody 1F2 (mAb1F2) against S. dysgalactiae GapC was generated by the hybridoma technique and used to screen a phage-displayed 12-mer random peptide library (Ph.D.-12) for mapping the linear B-cell epitope. The mAb1F2 recognized phages displaying peptides with the consensus motif TRINDLT. Amino acid sequence of the motif exactly matched (30)TRINDLT(36) of the S. dysgalactiae GapC. Subsequently, site-directed mutagenic analysis further demonstrated that residues R31, I32, N33, D34 and L35 formed the core of (30)TRINDLT(36), and this core motif was the minimal determinant of the B-cell epitope recognized by the mAb1F2. The epitope (30)TRINDLT(36) showed high homology among different streptococcus species. Overall, our findings characterized a conserved B-cell epitope, which will be useful for the further study of epitope-based vaccines. Copyright © 2015 Elsevier Ltd. All rights reserved.

  13. Evidence of function for conserved noncoding sequences in Arabidopsis thaliana.

    PubMed

    Spangler, Jacob B; Subramaniam, Sabarinath; Freeling, Michael; Feltus, F Alex

    2012-01-01

    • Whole genome duplication events provide a lineage with a large reservoir of genes that can be molded by evolutionary forces into phenotypes that fit alternative environments. A well-studied whole genome duplication, the α-event, occurred in an ancestor of the model plant Arabidopsis thaliana. Retained segments of the α-event have been defined in recent years in the form of duplicate protein coding sequences (α-pairs) and associated conserved noncoding DNA sequences (CNSs). Our aim was to identify any association between CNSs and α-pair co-functionality at the gene expression level. • Here, we tested for correlation between CNS counts and α-pair co-expression and expression intensity across nine expression datasets: aerial tissue, flowers, leaves, roots, rosettes, seedlings, seeds, shoots and whole plants. • We provide evidence for a putative regulatory role of the CNSs. The association of CNSs with α-pair co-expression and expression intensity varied by gene function, subgene position and the presence of transcription factor binding motifs. A range of possible CNS regulatory mechanisms, including intron-mediated enhancement, messenger RNA fold stability and transcriptional regulation, are discussed. • This study provides a framework to understand how CNS motifs are involved in the maintenance of gene expression after a whole genome duplication event. © 2011 The Authors. New Phytologist © 2011 New Phytologist Trust.

  14. Cancer-related marketing centrality motifs acting as pivot units in the human signaling network and mediating cross-talk between biological pathways.

    PubMed

    Li, Wan; Chen, Lina; Li, Xia; Jia, Xu; Feng, Chenchen; Zhang, Liangcai; He, Weiming; Lv, Junjie; He, Yuehan; Li, Weiguo; Qu, Xiaoli; Zhou, Yanyan; Shi, Yuchen

    2013-12-01

    Network motifs in central positions are considered to not only have more in-coming and out-going connections but are also localized in an area where more paths reach the networks. These central motifs have been extensively investigated to determine their consistent functions or associations with specific function categories. However, their functional potentials in the maintenance of cross-talk between different functional communities are unclear. In this paper, we constructed an integrated human signaling network from the Pathway Interaction Database. We identified 39 essential cancer-related motifs in central roles, which we called cancer-related marketing centrality motifs, using combined centrality indices on the system level. Our results demonstrated that these cancer-related marketing centrality motifs were pivotal units in the signaling network, and could mediate cross-talk between 61 biological pathways (25 could be mediated by one motif on average), most of which were cancer-related pathways. Further analysis showed that molecules of most marketing centrality motifs were in the same or adjacent subcellular localizations, such as the motif containing PI3K, PDK1 and AKT1 in the plasma membrane, to mediate signal transduction between 32 cancer-related pathways. Finally, we analyzed the pivotal roles of cancer genes in these marketing centrality motifs in the pathogenesis of cancers, and found that non-cancer genes were potential cancer-related genes.

  15. Use of a Drosophila Genome-Wide Conserved Sequence Database to Identify Functionally Related cis-Regulatory Enhancers

    PubMed Central

    Brody, Thomas; Yavatkar, Amarendra S; Kuzin, Alexander; Kundu, Mukta; Tyson, Leonard J; Ross, Jermaine; Lin, Tzu-Yang; Lee, Chi-Hon; Awasaki, Takeshi; Lee, Tzumin; Odenwald, Ward F

    2012-01-01

    Background: Phylogenetic footprinting has revealed that cis-regulatory enhancers consist of conserved DNA sequence clusters (CSCs). Currently, there is no systematic approach for enhancer discovery and analysis that takes full advantage of the sequence information within enhancer CSCs. Results: We have generated a Drosophila genome-wide database of conserved DNA consisting of >100,000 CSCs derived from EvoPrints spanning over 90% of the genome. cis-Decoder database search and alignment algorithms enable the discovery of functionally related enhancers. The program first identifies conserved repeat elements within an input enhancer and then searches the database for CSCs that score highly against the input CSC. Scoring is based on shared repeats as well as uniquely shared matches, and includes measures of the balance of shared elements, a diagnostic that has proven to be useful in predicting cis-regulatory function. To demonstrate the utility of these tools, a temporally-restricted CNS neuroblast enhancer was used to identify other functionally related enhancers and analyze their structural organization. Conclusions: cis-Decoder reveals that co-regulating enhancers consist of combinations of overlapping shared sequence elements, providing insights into the mode of integration of multiple regulating transcription factors. The database and accompanying algorithms should prove useful in the discovery and analysis of enhancers involved in any developmental process. Developmental Dynamics 241:169–189, 2012. © 2011 Wiley Periodicals, Inc. Key findings A genome-wide catalog of Drosophila conserved DNA sequence clusters. cis-Decoder discovers functionally related enhancers. Functionally related enhancers share balanced sequence element copy numbers. Many enhancers function during multiple phases of development. PMID:22174086

  16. Limitations and potentials of current motif discovery algorithms

    PubMed Central

    Hu, Jianjun; Li, Bin; Kihara, Daisuke

    2005-01-01

    Computational methods for de novo identification of gene regulation elements, such as transcription factor binding sites, have proved to be useful for deciphering genetic regulatory networks. However, despite the availability of a large number of algorithms, their strengths and weaknesses are not sufficiently understood. Here, we designed a comprehensive set of performance measures and benchmarked five modern sequence-based motif discovery algorithms using large datasets generated from Escherichia coli RegulonDB. Factors that affect the prediction accuracy, scalability and reliability are characterized. It is revealed that the nucleotide and the binding site level accuracy are very low, while the motif level accuracy is relatively high, which indicates that the algorithms can usually capture at least one correct motif in an input sequence. To exploit diverse predictions from multiple runs of one or more algorithms, a consensus ensemble algorithm has been developed, which achieved 6–45% improvement over the base algorithms by increasing both the sensitivity and specificity. Our study illustrates limitations and potentials of existing sequence-based motif discovery algorithms. Taking advantage of the revealed potentials, several promising directions for further improvements are discussed. Since the sequence-based algorithms are the baseline of most of the modern motif discovery algorithms, this paper suggests substantial improvements would be possible for them. PMID:16284194

  17. Convergent evolution and mimicry of protein linear motifs in host-pathogen interactions.

    PubMed

    Chemes, Lucía Beatriz; de Prat-Gay, Gonzalo; Sánchez, Ignacio Enrique

    2015-06-01

    Pathogen linear motif mimics are highly evolvable elements that facilitate rewiring of host protein interaction networks. Host linear motifs and pathogen mimics differ in sequence, leading to thermodynamic and structural differences in the resulting protein-protein interactions. Moreover, the functional output of a mimic depends on the motif and domain repertoire of the pathogen protein. Regulatory evolution mediated by linear motifs can be understood by measuring evolutionary rates, quantifying positive and negative selection and performing phylogenetic reconstructions of linear motif natural history. Convergent evolution of linear motif mimics is widespread among unrelated proteins from viral, prokaryotic and eukaryotic pathogens and can also take place within individual protein phylogenies. Statistics, biochemistry and laboratory models of infection link pathogen linear motifs to phenotypic traits such as tropism, virulence and oncogenicity. In vitro evolution experiments and analysis of natural sequences suggest that changes in linear motif composition underlie pathogen adaptation to a changing environment. Copyright © 2015 Elsevier Ltd. All rights reserved.

  18. A kinesin-1 binding motif in vaccinia virus that is widespread throughout the human genome

    PubMed Central

    Dodding, Mark P; Mitter, Richard; Humphries, Ashley C; Way, Michael

    2011-01-01

    Transport of cargoes by kinesin-1 is essential for many cellular processes. Nevertheless, the number of proteins known to recruit kinesin-1 via its cargo binding light chain (KLC) is still quite small. We also know relatively little about the molecular features that define kinesin-1 binding. We now show that a bipartite tryptophan-based kinesin-1 binding motif, originally identified in Calsyntenin is present in A36, a vaccinia integral membrane protein. This bipartite motif in A36 is required for kinesin-1-dependent transport of the virus to the cell periphery. Bioinformatic analysis reveals that related bipartite tryptophan-based motifs are present in over 450 human proteins. Using vaccinia as a surrogate cargo, we show that regions of proteins containing this motif can function to recruit KLC and promote virus transport in the absence of A36. These proteins interact with the kinesin light chain outside the context of infection and have distinct preferences for KLC1 and KLC2. Our observations demonstrate that KLC binding can be conferred by a common set of features that are found in a wide range of proteins associated with diverse cellular functions and human diseases. PMID:21915095

  19. Methods and statistics for combining motif match scores.

    PubMed

    Bailey, T L; Gribskov, M

    1998-01-01

    Position-specific scoring matrices are useful for representing and searching for protein sequence motifs. A sequence family can often be described by a group of one or more motifs, and an effective search must combine the scores for matching a sequence to each of the motifs in the group. We describe three methods for combining match scores and estimating the statistical significance of the combined scores and evaluate the search quality (classification accuracy) and the accuracy of the estimate of statistical significance of each. The three methods are: 1) sum of scores, 2) sum of reduced variates, 3) product of score p-values. We show that method 3) is superior to the other two methods in both regards, and that combining motif scores indeed gives better search accuracy. The MAST sequence homology search algorithm utilizing the product of p-values scoring method is available for interactive use and downloading at URL http://www.sdsc.edu/MEME.
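
    Method 3 above, the product of score p-values, can be sketched as follows. The abstract does not give a formula, so this example assumes the n individual p-values are independent and uniformly distributed under the null hypothesis; in that case the probability of seeing a product as small as p is p times the sum of (-ln p)^i / i! for i from 0 to n-1. The function name product_pvalue is hypothetical, and the code may differ from the exact computation used in MAST.

```python
import math


def product_pvalue(pvalues):
    """Significance of the product of n independent p-values.

    Assumes each p-value is uniform on (0, 1] under the null, so that
    P(product <= p) = p * sum_{i=0}^{n-1} (-ln p)^i / i!.
    """
    n = len(pvalues)
    if n == 0:
        raise ValueError("need at least one p-value")
    if any(p <= 0.0 or p > 1.0 for p in pvalues):
        raise ValueError("p-values must lie in (0, 1]")

    # Work in log space so that very small products do not underflow early.
    log_product = sum(math.log(p) for p in pvalues)
    x = -log_product  # equals -ln(product), always >= 0

    # Accumulate sum_{i=0}^{n-1} x^i / i! term by term.
    term = 1.0
    total = 1.0
    for i in range(1, n):
        term *= x / i
        total += term
    return math.exp(log_product) * total


if __name__ == "__main__":
    # Two motifs, each matching with p = 0.5: the raw product is 0.25,
    # but the combined p-value is larger because two draws were made.
    print(product_pvalue([0.5, 0.5]))
```

    The correction matters because the raw product shrinks as more motifs are added even when no individual match is significant; the sum term compensates for the number of p-values multiplied.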

  20. A conserved MCM single-stranded DNA binding element is essential for replication initiation.

    PubMed

    Froelich, Clifford A; Kang, Sukhyun; Epling, Leslie B; Bell, Stephen P; Enemark, Eric J

    2014-04-01

    The ring-shaped MCM helicase is essential to all phases of DNA replication. The complex loads at replication origins as an inactive double-hexamer encircling duplex DNA. Helicase activation converts this species to two active single hexamers that encircle single-stranded DNA (ssDNA). The molecular details of MCM DNA interactions during these events are unknown. We determined the crystal structure of the Pyrococcus furiosus MCM N-terminal domain hexamer bound to ssDNA and define a conserved MCM-ssDNA binding motif (MSSB). Intriguingly, ssDNA binds the MCM ring interior perpendicular to the central channel with defined polarity. In eukaryotes, the MSSB is conserved in several Mcm2-7 subunits, and MSSB mutant combinations in S. cerevisiae Mcm2-7 are not viable. Mutant Mcm2-7 complexes assemble and are recruited to replication origins, but are defective in helicase loading and activation. Our findings identify an important MCM-ssDNA interaction and suggest it functions during helicase activation to select the strand for translocation. DOI: http://dx.doi.org/10.7554/eLife.01993.001.

  1. A conserved MCM single-stranded DNA binding element is essential for replication initiation

    PubMed Central

    Froelich, Clifford A; Kang, Sukhyun; Epling, Leslie B; Bell, Stephen P; Enemark, Eric J

    2014-01-01

    The ring-shaped MCM helicase is essential to all phases of DNA replication. The complex loads at replication origins as an inactive double-hexamer encircling duplex DNA. Helicase activation converts this species to two active single hexamers that encircle single-stranded DNA (ssDNA). The molecular details of MCM DNA interactions during these events are unknown. We determined the crystal structure of the Pyrococcus furiosus MCM N-terminal domain hexamer bound to ssDNA and define a conserved MCM-ssDNA binding motif (MSSB). Intriguingly, ssDNA binds the MCM ring interior perpendicular to the central channel with defined polarity. In eukaryotes, the MSSB is conserved in several Mcm2-7 subunits, and MSSB mutant combinations in S. cerevisiae Mcm2-7 are not viable. Mutant Mcm2-7 complexes assemble and are recruited to replication origins, but are defective in helicase loading and activation. Our findings identify an important MCM-ssDNA interaction and suggest it functions during helicase activation to select the strand for translocation. DOI: http://dx.doi.org/10.7554/eLife.01993.001 PMID:24692448

  2. A Monte Carlo-based framework enhances the discovery and interpretation of regulatory sequence motifs

    PubMed Central

    2012-01-01

    Background Discovery of functionally significant short, statistically overrepresented subsequence patterns (motifs) in a set of sequences is a challenging problem in bioinformatics. Oftentimes, not all sequences in the set contain a motif. These non-motif-containing sequences complicate the algorithmic discovery of motifs. Filtering the non-motif-containing sequences from the larger set of sequences while simultaneously determining the identity of the motif is, therefore, desirable and a non-trivial problem in motif discovery research. Results We describe MotifCatcher, a framework that extends the sensitivity of existing motif-finding tools by employing random sampling to effectively remove non-motif-containing sequences from the motif search. We developed two implementations of our algorithm; each built around a commonly used motif-finding tool, and applied our algorithm to three diverse chromatin immunoprecipitation (ChIP) data sets. In each case, the motif finder with the MotifCatcher extension demonstrated improved sensitivity over the motif finder alone. Our approach organizes candidate functionally significant discovered motifs into a tree, which allowed us to make additional insights. In all cases, we were able to support our findings with experimental work from the literature. Conclusions Our framework demonstrates that additional processing at the sequence entry level can significantly improve the performance of existing motif-finding tools. For each biological data set tested, we were able to propose novel biological hypotheses supported by experimental work from the literature. Specifically, in Escherichia coli, we suggested binding site motifs for 6 non-traditional LexA protein binding sites; in Saccharomyces cerevisiae, we hypothesize 2 disparate mechanisms for novel binding sites of the Cse4p protein; and in Halobacterium sp. NRC-1, we discovered subtle differences in a general transcription factor (GTF) binding site motif across several data sets.

  3. MOCCS: Clarifying DNA-binding motif ambiguity using ChIP-Seq data.

    PubMed

    Ozaki, Haruka; Iwasaki, Wataru

    2016-08-01

    As a key mechanism of gene regulation, transcription factors (TFs) bind to DNA by recognizing specific short sequence patterns that are called DNA-binding motifs. A single TF can accept ambiguity within its DNA-binding motifs, which comprise both canonical (typical) and non-canonical motifs. Clarification of such DNA-binding motif ambiguity is crucial for revealing gene regulatory networks and evaluating mutations in cis-regulatory elements. Although chromatin immunoprecipitation sequencing (ChIP-seq) now provides abundant data on the genomic sequences to which a given TF binds, existing motif discovery methods are unable to directly answer whether a given TF can bind to a specific DNA-binding motif. Here, we report a method for clarifying the DNA-binding motif ambiguity, MOCCS. Given ChIP-Seq data of any TF, MOCCS comprehensively analyzes and describes every k-mer to which that TF binds. Analysis of simulated datasets revealed that MOCCS is applicable to various ChIP-Seq datasets, requiring only a few minutes per dataset. Application to the ENCODE ChIP-Seq datasets proved that MOCCS directly evaluates whether a given TF binds to each DNA-binding motif, even if known position weight matrix models do not provide sufficient information on DNA-binding motif ambiguity. Furthermore, users are not required to provide numerous parameters or background genomic sequence models that are typically unavailable. MOCCS is implemented in Perl and R and is freely available via https://github.com/yuifu/moccs. By complementing existing motif-discovery software, MOCCS will contribute to the basic understanding of how the genome controls diverse cellular processes via DNA-protein interactions. Copyright © 2016 Elsevier Ltd. All rights reserved.

  4. CircularLogo: A lightweight web application to visualize intra-motif dependencies.

    PubMed

    Ye, Zhenqing; Ma, Tao; Kalmbach, Michael T; Dasari, Surendra; Kocher, Jean-Pierre A; Wang, Liguo

    2017-05-22

    The sequence logo has been widely used to represent DNA or RNA motifs for more than three decades. Despite its intelligibility and intuitiveness, the traditional sequence logo is unable to display the intra-motif dependencies and therefore is insufficient to fully characterize nucleotide motifs. Many methods have been developed to quantify the intra-motif dependencies, but fewer tools are available for visualization. We developed CircularLogo, a web-based interactive application, which is able to not only visualize the position-specific nucleotide consensus and diversity but also display the intra-motif dependencies. Applying CircularLogo to HNF6 binding sites and tRNA sequences demonstrated its ability to show intra-motif dependencies and intuitively reveal biomolecular structure. CircularLogo is implemented in JavaScript and Python based on the Django web framework. The program's source code and user's manual are freely available at http://circularlogo.sourceforge.net . CircularLogo web server can be accessed from http://bioinformaticstools.mayo.edu/circularlogo/index.html . CircularLogo is an innovative web application that is specifically designed to visualize and interactively explore intra-motif dependencies.
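
    For context, the traditional sequence logo that this abstract contrasts with CircularLogo scales each column by its information content, computed independently per position. The sketch below shows that per-position calculation for aligned DNA sites; it is not CircularLogo's code, it omits the small-sample correction some logo tools apply, and the function name information_content and the example sites are hypothetical.

```python
import math
from collections import Counter


def information_content(aligned_sites, alphabet="ACGT"):
    """Per-position information content (bits) of equal-length aligned sites.

    For each column, IC = log2(|alphabet|) - H, where H is the Shannon
    entropy of the observed symbol frequencies in that column.
    """
    if not aligned_sites:
        raise ValueError("need at least one site")
    sites = [site.upper() for site in aligned_sites]
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    if any(symbol not in alphabet for site in sites for symbol in site):
        raise ValueError("site contains a symbol outside the alphabet")

    max_bits = math.log2(len(alphabet))
    content = []
    for column in zip(*sites):
        counts = Counter(column)
        total = len(column)
        entropy = 0.0
        for count in counts.values():
            frequency = count / total
            entropy -= frequency * math.log2(frequency)
        content.append(max_bits - entropy)
    return content


if __name__ == "__main__":
    # Made-up binding sites: two fixed positions, one partly variable, one random.
    print(information_content(["ACGT", "ACGA", "ACCC", "ACTG"]))
```

    Because every column is scored on its own, two positions that always vary together look the same as two that vary independently, which is the limitation the abstract says intra-motif dependency visualization addresses.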

  5. Cryo-EM near-atomic structure of a dsRNA fungal virus shows ancient structural motifs preserved in the dsRNA viral lineage

    PubMed Central

    Luque, Daniel; Gómez-Blanco, Josué; Garriga, Damiá; Brilot, Axel F.; González, José M.; Havens, Wendy M.; Carrascosa, José L.; Trus, Benes L.; Verdaguer, Nuria; Ghabrial, Said A.; Castón, José R.

    2014-01-01

    Viruses evolve so rapidly that sequence-based comparison is not suitable for detecting relatedness among distant viruses. Structure-based comparisons suggest that evolution led to a small number of viral classes or lineages that can be grouped by capsid protein (CP) folds. Here, we report that the CP structure of the fungal dsRNA Penicillium chrysogenum virus (PcV) shows the progenitor fold of the dsRNA virus lineage and suggests a relationship between lineages. Cryo-EM structure at near-atomic resolution showed that the 982-aa PcV CP is formed by a repeated α-helical core, indicative of gene duplication despite lack of sequence similarity between the two halves. Superimposition of secondary structure elements identified a single “hotspot” at which variation is introduced by insertion of peptide segments. Structural comparison of PcV and other distantly related dsRNA viruses detected preferential insertion sites at which the complexity of the conserved α-helical core, made up of ancestral structural motifs that have acted as a skeleton, might have increased, leading to evolution of the highly varied current structures. Analyses of structural motifs only apparent after systematic structural comparisons indicated that the hallmark fold preserved in the dsRNA virus lineage shares a long (spinal) α-helix tangential to the capsid surface with the head-tailed phage and herpesvirus viral lineage. PMID:24821769

  6. Unusual occurrence of a DAG motif in the Ipomovirus Cassava brown streak virus and implications for its vector transmission.

    PubMed

    Ateka, Elijah; Alicai, Titus; Ndunguru, Joseph; Tairo, Fred; Sseruwagi, Peter; Kiarie, Samuel; Makori, Timothy; Kehoe, Monica A; Boykin, Laura M

    2017-01-01

    Cassava is the main staple food for over 800 million people globally. Its production in eastern Africa is being constrained by two devastating Ipomoviruses that cause cassava brown streak disease (CBSD): Cassava brown streak virus (CBSV) and Ugandan cassava brown streak virus (UCBSV), with up to 100% yield loss for smallholder farmers in the region. To date, vector studies have not resulted in reproducible and highly efficient transmission of CBSV and UCBSV. Most virus transmission studies have used Bemisia tabaci (whitefly), but a maximum of 41% U/CBSV transmission efficiency has been documented for this vector. With the advent of next generation sequencing, researchers are generating whole genome sequences for both CBSV and UCBSV from throughout eastern Africa. Our initial goal for this study was to characterize U/CBSV whole genomes from CBSD symptomatic cassava plants sampled in Kenya. We have generated 8 new whole genomes (3 CBSV and 5 UCBSV) from Kenya, and in the process of analyzing these genomes together with 26 previously published sequences, we uncovered the aphid transmission associated DAG motif within coat protein genes of all CBSV whole genomes at amino acid positions 52-54, but not in UCBSV. Upon further investigation, the DAG motif was also found at the same positions in two other Ipomoviruses: Squash vein yellowing virus (SqVYV) and Coccinia mottle virus (CocMoV). Until this study, the highly conserved DAG motif, which is associated with aphid transmission, was only noticed once, in SqVYV, but discounted as being of minimal importance. This study represents the first comprehensive look at Ipomovirus genomes to determine the extent of DAG motif presence and significance for vector relations. The presence of this motif suggests that aphids could potentially be a vector of CBSV, SqVYV and CocMoV. Further transmission and ipomoviral protein evolutionary studies are needed to confirm this hypothesis.
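
    A positional motif check of the kind reported here (a tripeptide at amino acid positions 52-54) can be sketched in a few lines. The helper name `has_motif_at` and both sequences are hypothetical placeholders, not real coat protein sequences. Positions are assumed to be 1-based, as is usual for protein coordinates.

```python
def has_motif_at(protein, motif="DAG", start=52):
    """Return True if `motif` occupies 1-based positions start..start+len(motif)-1."""
    begin = start - 1  # convert 1-based position to 0-based index
    return protein[begin:begin + len(motif)].upper() == motif


# Placeholder sequences (not real coat proteins): 51 filler residues, then the tripeptide.
with_dag = "M" + "S" * 50 + "DAG" + "KLT"
without_dag = "M" + "S" * 50 + "DTG" + "KLT"

result_with = has_motif_at(with_dag)
result_without = has_motif_at(without_dag)
```

    When comparing many genomes, such a check would normally run on columns of a multiple sequence alignment rather than on raw coordinates, since insertions and deletions shift positions between isolates.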

  7. Role of Conserved Glycine in Zinc-dependent Medium Chain Dehydrogenase/Reductase Superfamily

    PubMed Central

    Tiwari, Manish Kumar; Singh, Raushan Kumar; Singh, Ranjitha; Jeya, Marimuthu; Zhao, Huimin; Lee, Jung-Kul

    2012-01-01

    The medium-chain dehydrogenase/reductase (MDR) superfamily consists of a large group of enzymes with a broad range of activities. Members of this superfamily are currently the subject of intensive investigation, but many aspects, including the zinc dependence of MDR superfamily proteins, have not yet been adequately investigated. Using a density functional theory-based screening strategy, we have identified a strictly conserved glycine residue (Gly) in the zinc-dependent MDR superfamily. To elucidate the role of this conserved Gly in MDR, we carried out a comprehensive structural, functional, and computational analysis of four MDR enzymes through a series of studies including site-directed mutagenesis, isothermal titration calorimetry, electron paramagnetic resonance (EPR), quantum mechanics, and molecular mechanics analysis. Gly substitution by other amino acids posed a significant threat to the metal binding affinity and activity of MDR superfamily enzymes. Mutagenesis at the conserved Gly resulted in alterations in the coordination of the catalytic zinc ion, with concomitant changes in metal-ligand bond length, bond angle, and the affinity (Kd) toward the zinc ion. The Gly mutants also showed different spectroscopic properties in EPR compared with those of the wild type, indicating that the binding geometries of the zinc to the zinc binding ligands were changed by the mutation. The present results demonstrate that the conserved Gly in the GHE motif plays a role in maintaining the metal binding affinity and the electronic state of the catalytic zinc ion during catalysis of the MDR superfamily enzymes. PMID:22500022

  8. Comparative analyses of Legionella species identifies genetic features of strains causing Legionnaires' disease.

    PubMed

    Gomez-Valero, Laura; Rusniok, Christophe; Rolando, Monica; Neou, Mario; Dervins-Ravault, Delphine; Demirtas, Jasmin; Rouy, Zoe; Moore, Robert J; Chen, Honglei; Petty, Nicola K; Jarraud, Sophie; Etienne, Jerome; Steinert, Michael; Heuner, Klaus; Gribaldo, Simonetta; Médigue, Claudine; Glöckner, Gernot; Hartland, Elizabeth L; Buchrieser, Carmen

    2014-01-01

    The genus Legionella comprises over 60 species. However, L. pneumophila and L. longbeachae alone cause over 95% of Legionnaires’ disease. To identify the genetic bases underlying the different capacities to cause disease we sequenced and compared the genomes of L. micdadei, L. hackeliae and L. fallonii (LLAP10), which are all rarely isolated from humans. We show that these Legionella species possess different virulence capacities in amoeba and macrophages, correlating with their occurrence in humans. Our comparative analysis of 11 Legionella genomes belonging to five species reveals highly heterogeneous genome content with over 60% representing species-specific genes; these comprise a complete prophage in L. micdadei, the first ever identified in a Legionella genome. Mobile elements are abundant in Legionella genomes; many encode type IV secretion systems for conjugative transfer, pointing to their importance for adaptation of the genus. The Dot/Icm secretion system is conserved, although the core set of substrates is small, as only 24 out of over 300 described Dot/Icm effector genes are present in all Legionella species. We also identified new eukaryotic motifs including thaumatin, synaptobrevin or clathrin/coatomer adaptine like domains. Legionella genomes are highly dynamic due to a large mobilome mainly comprising type IV secretion systems, while a minority of core substrates is shared among the diverse species. Eukaryotic like proteins and motifs remain a hallmark of the genus Legionella. Key factors such as proteins involved in oxygen binding, iron storage, host membrane transport and certain Dot/Icm substrates are specific features of disease-related strains.

  9. Functional analysis of the Arabidopsis PLDZ2 promoter reveals an evolutionarily conserved low-Pi-responsive transcriptional enhancer element

    PubMed Central

    Oropeza-Aburto, Araceli; Cruz-Ramírez, Alfredo; Acevedo-Hernández, Gustavo J.; Pérez-Torres, Claudia-Anahí; Caballero-Pérez, Juan; Herrera-Estrella, Luis

    2012-01-01

    Plants have evolved a plethora of responses to cope with phosphate (Pi) deficiency, including the transcriptional activation of a large set of genes. Among Pi-responsive genes, the expression of the Arabidopsis phospholipase DZ2 (PLDZ2) is activated to participate in the degradation of phospholipids in roots in order to release Pi to support other cellular activities. A deletion analysis was performed to identify the regions determining the strength, tissue-specific expression, and Pi responsiveness of this regulatory region. This study also reports the identification and characterization of a transcriptional enhancer element that is present in the PLDZ2 promoter and able to confer Pi responsiveness to a minimal, inactive 35S promoter. This enhancer also shares the cytokinin and sucrose responsive properties observed for the intact PLDZ2 promoter. The EZ2 element contains two P1BS motifs, each of which is the DNA binding site of transcription factor PHR1. Mutation analysis showed that the P1BS motifs present in EZ2 are necessary but not sufficient for the enhancer function, revealing the importance of adjacent sequences. The structural organization of EZ2 is conserved in the orthologous genes of at least eight families of rosids, suggesting that architectural features such as the distance between the two P1BS motifs are also important for the regulatory properties of this enhancer element. PMID:22210906

  10. Analysis of the linker region joining the adenylation and carrier protein domains of the modular nonribosomal peptide synthetases.

    PubMed

    Miller, Bradley R; Sundlov, Jesse A; Drake, Eric J; Makin, Thomas A; Gulick, Andrew M

    2014-10-01

    Nonribosomal peptide synthetases (NRPSs) are multimodular proteins capable of producing important peptide natural products. Using an assembly line process, the amino acid substrate and peptide intermediates are passed between the active sites of different catalytic domains of the NRPS while bound covalently to a peptidyl carrier protein (PCP) domain. Examination of the linker sequences that join the NRPS adenylation and PCP domains identified several conserved proline residues that are not found in standalone adenylation domains. We examined the roles of these proline residues and neighboring conserved sequences through mutagenesis and biochemical analysis of the reaction catalyzed by the adenylation domain and the fully reconstituted NRPS pathway. In particular, we identified a conserved LPxP motif at the start of the adenylation-PCP linker. The LPxP motif interacts with a region on the adenylation domain to stabilize a critical catalytic lysine residue belonging to the A10 motif that immediately precedes the linker. Further, this interaction with the C-terminal subdomain of the adenylation domain may coordinate movement of the PCP with the conformational change of the adenylation domain. Through this work, we extend the conserved A10 motif of the adenylation domain and identify residues that enable proper adenylation domain function. © 2014 Wiley Periodicals, Inc.

  11. Structural modelling and phylogenetic analyses of PgeIF4A2 (Eukaryotic translation initiation factor) from Pennisetum glaucum reveal signature motifs with a role in stress tolerance and development

    PubMed Central

    Agarwal, Aakrati; Mudgil, Yashwanti; Pandey, Saurabh; Fartyal, Dhirendra; Reddy, Malireddy K

    2016-01-01

    Eukaryotic translation initiation factor 4A (eIF4A) is an indispensable component of the translation machinery and also plays a role in developmental processes and stress alleviation in plants and animals. Different eIF4A isoforms are present in the cytosol of the cell, namely, eIF4A1, eIF4A2, and eIF4A3, and their expression is tightly regulated in cap-dependent translation. We revealed the structural model of PgeIF4A2 protein using the crystal structure of Homo sapiens eIF4A3 (PDB ID: 2J0S) as template by Modeller 9.12. The resultant PgeIF4A2 model structure was refined by PROCHECK, ProSA, Verify3D and RMSD that showed the model structure is reliable with 77% amino acid sequence identity with template. Investigation revealed two conserved signatures for ATP-dependent RNA Helicase DEAD-box conserved site (VLDEADEML) and RNA helicase DEAD-box type, Q-motif in sheet-turn-helix and α-helical region respectively. All these conserved motifs are responsible for response during developmental stages and stress tolerance in plants. PMID:28358146

  12. Structural modelling and phylogenetic analyses of PgeIF4A2 (Eukaryotic translation initiation factor) from Pennisetum glaucum reveal signature motifs with a role in stress tolerance and development.

    PubMed

    Agarwal, Aakrati; Mudgil, Yashwanti; Pandey, Saurabh; Fartyal, Dhirendra; Reddy, Malireddy K

    2016-01-01

    Eukaryotic translation initiation factor 4A (eIF4A) is an indispensable component of the translation machinery and also plays a role in developmental processes and stress alleviation in plants and animals. Different eIF4A isoforms are present in the cytosol of the cell, namely, eIF4A1, eIF4A2, and eIF4A3, and their expression is tightly regulated in cap-dependent translation. We revealed the structural model of PgeIF4A2 protein using the crystal structure of Homo sapiens eIF4A3 (PDB ID: 2J0S) as template by Modeller 9.12. The resultant PgeIF4A2 model structure was refined by PROCHECK, ProSA, Verify3D and RMSD that showed the model structure is reliable with 77% amino acid sequence identity with template. Investigation revealed two conserved signatures for ATP-dependent RNA Helicase DEAD-box conserved site (VLDEADEML) and RNA helicase DEAD-box type, Q-motif in sheet-turn-helix and α-helical region respectively. All these conserved motifs are responsible for response during developmental stages and stress tolerance in plants.

  13. Symmetry compression method for discovering network motifs.

    PubMed

    Wang, Jianxin; Huang, Yuannan; Wu, Fang-Xiang; Pan, Yi

    2012-01-01

    Discovering network motifs could provide a significant insight into systems biology. Interestingly, many biological networks have been found to have a high degree of symmetry (automorphism), which is inherent in biological network topologies. The symmetry due to the large number of basic symmetric subgraphs (BSSs) causes redundant calculation in discovering network motifs. Therefore, we compress all basic symmetric subgraphs before extracting compressed subgraphs and propose an efficient decompression algorithm to decompress all compressed subgraphs without loss of any information. In contrast to previous approaches, the novel Symmetry Compression method for Motif Detection, named SCMD, eliminates most redundant calculations caused by widespread symmetry of biological networks. We use SCMD to improve three notable exact algorithms and two efficient sampling algorithms. Results of all exact algorithms with SCMD are the same as those of the original algorithms, since SCMD is a lossless method. The sampling results show that the use of SCMD has almost no effect on the quality of sampling results. For highly symmetric networks, we find that SCMD used in both exact and sampling algorithms can yield a remarkable speedup. Furthermore, SCMD enables us to find larger motifs in biological networks with notable symmetry than previously possible.

  14. Stakeholder-led science: engaging resource managers to identify science needs for long-term management of floodplain conservation lands

    USGS Publications Warehouse

    Bouska, Kristin L.; Lindner, Garth; Paukert, Craig P.; Jacobson, Robert B.

    2016-01-01

    Floodplains pose challenges to managers of conservation lands because of constantly changing interactions with their rivers. Although scientific knowledge and understanding of the dynamics and drivers of river-floodplain systems can provide guidance to floodplain managers, the scientific process often occurs in isolation from management. Further, communication barriers between scientists and managers can be obstacles to appropriate application of scientific knowledge. With the coproduction of science in mind, our objectives were the following: (1) to document management priorities of floodplain conservation lands, and (2) identify science needs required to better manage the identified management priorities under nonstationary conditions, i.e., climate change, through stakeholder queries and interactions. We conducted an online survey with 80 resource managers of floodplain conservation lands along the Upper and Middle Mississippi River and Lower Missouri River, USA, to evaluate management priority, management intensity, and available scientific information for management objectives and conservation targets. Management objectives with the least information available relative to priority included controlling invasive species, maintaining respectful relationships with neighbors, and managing native, nongame species. Conservation targets with the least information available to manage relative to management priority included pollinators, marsh birds, reptiles, and shore birds. A follow-up workshop and survey focused on clarifying science needs to achieve management objectives under nonstationary conditions. Managers agreed that metrics of inundation, including depth and extent of inundation, and frequency, duration, and timing of inundation would be the most useful metrics for management of floodplain conservation lands with multiple objectives. This assessment provides guidance for developing relevant and accessible science products to inform management of highly

  15. Finding specific RNA motifs: Function in a zeptomole world?

    PubMed Central

    Knight, Rob; Yarus, Michael

    2003-01-01

    We have developed a new method for estimating the abundance of any modular (piecewise) RNA motif within a longer random region. We have used this method to estimate the size of the active motifs available to modern SELEX experiments (picomoles of unique sequences) and to a plausible RNA World (zeptomoles of unique sequences: 1 zmole = 602 sequences). Unexpectedly, activities such as specific isoleucine binding are almost certainly present in zeptomoles of molecules, and even ribozymes such as self-cleavage motifs may appear (depending on assumptions about the minimal structures). The number of specified nucleotides is not the only important determinant of a motif’s rarity: The number of modules into which it is divided, and the details of this division, are also crucial. We propose three maxims for easily isolated motifs: the Maxim of Minimization, the Maxim of Multiplicity, and the Maxim of the Median. These maxims together state that selected motifs should be small and composed of as many separate, equally sized modules as possible. For evenly divided motifs with four modules, the largest accessible activity in picomole scale (1–1000 pmole) pools of length 100 is about 34 nucleotides; while for zeptomole scale (1–1000 zmole) pools it is about 20 specific nucleotides (50% probability of occurrence). This latter figure includes some ribozymes and aptamers. Consequently, an RNA metabolism apparently could have begun with only zeptomoles of RNA molecules. PMID:12554865
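
    The unit conversion behind "1 zmole = 602 sequences" is a one-line calculation with Avogadro's number. The short sketch below works it out and compares the zeptomole and picomole scales mentioned in the abstract. The helper name `molecules` is an assumption for illustration; the motif-abundance estimate itself is not reproduced here.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI value)


def molecules(amount, prefix_exponent):
    """Number of molecules in `amount` x 10**prefix_exponent moles."""
    return amount * 10.0 ** prefix_exponent * AVOGADRO


per_zeptomole = molecules(1, -21)  # about 602 molecules
per_picomole = molecules(1, -12)   # about 6.02e11 molecules
scale_ratio = per_picomole / per_zeptomole  # picomole pools are ~1e9 times larger
```

    The nine orders of magnitude between the two pool sizes are what separate the roughly 34-nucleotide and roughly 20-nucleotide accessible motif sizes quoted in the abstract.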

  16. SSMART: Sequence-structure motif identification for RNA-binding proteins.

    PubMed

    Munteanu, Alina; Mukherjee, Neelanjan; Ohler, Uwe

    2018-06-11

    RNA-binding proteins (RBPs) regulate every aspect of RNA metabolism and function. There are hundreds of RBPs encoded in the eukaryotic genomes, and each recognizes its RNA targets through a specific mixture of RNA sequence and structure properties. For most RBPs, however, only a primary sequence motif has been determined, while the structure of the binding sites is uncharacterized. We developed SSMART, an RNA motif finder that simultaneously models the primary sequence and the structural properties of the RNA target sites. The sequence-structure motifs are represented as consensus strings over a degenerate alphabet, extending the IUPAC codes for nucleotides to account for secondary structure preferences. Evaluation on synthetic data showed that SSMART is able to recover both sequence and structure motifs implanted into 3'UTR-like sequences, for various degrees of structured/unstructured binding sites. In addition, we successfully used SSMART on high-throughput in vivo and in vitro data, showing that we not only recover the known sequence motif, but also gain insight into the structural preferences of the RBP. Availability: SSMART is freely available at https://ohlerlab.mdc-berlin.de/software/SSMART_137/. Supplementary data are available at Bioinformatics online.
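
    A consensus string over the standard IUPAC nucleotide codes can be matched against a sequence by expanding each degenerate code into a character class. The sketch below covers only the standard sequence codes. The structure-aware extension of the alphabet is specific to SSMART and is not reproduced, and the function names, consensus, and transcript fragment are made up for illustration.

```python
import re

# Standard IUPAC nucleotide codes (RNA alphabet, U in place of T).
IUPAC_RNA = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "R": "AG", "Y": "CU", "S": "CG", "W": "AU",
    "K": "GU", "M": "AC",
    "B": "CGU", "D": "AGU", "H": "ACU", "V": "ACG",
    "N": "ACGU",
}


def iupac_to_regex(consensus):
    """Translate an IUPAC consensus string into a compiled regular expression."""
    parts = []
    for code in consensus.upper():
        bases = IUPAC_RNA[code]
        parts.append(bases if len(bases) == 1 else "[" + bases + "]")
    return re.compile("".join(parts))


def find_sites(consensus, rna):
    """Return 0-based start positions of all (possibly overlapping) matches."""
    # A zero-width lookahead lets finditer report overlapping occurrences.
    pattern = re.compile("(?=" + iupac_to_regex(consensus).pattern + ")")
    return [m.start() for m in pattern.finditer(rna.upper())]


# Made-up consensus and transcript fragment, for illustration only.
hits = find_sites("UGYAN", "AAUGCAUUGUAAG")
```

    A consensus string is compact and readable but treats every allowed base at a position as equally good, which is the usual trade-off against weight-matrix representations.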

  17. Functionally conserved cis-regulatory elements of COL18A1 identified through zebrafish transgenesis.

    PubMed

    Kague, Erika; Bessling, Seneca L; Lee, Josephine; Hu, Gui; Passos-Bueno, Maria Rita; Fisher, Shannon

    2010-01-15

    Type XVIII collagen is a component of basement membranes, and expressed prominently in the eye, blood vessels, liver, and the central nervous system. Homozygous mutations in COL18A1 lead to Knobloch Syndrome, characterized by ocular defects and occipital encephalocele. However, relatively little has been described on the role of type XVIII collagen in development, and nothing is known about the regulation of its tissue-specific expression pattern. We have used zebrafish transgenesis to identify and characterize cis-regulatory sequences controlling expression of the human gene. Candidate enhancers were selected from non-coding sequence associated with COL18A1 based on sequence conservation among mammals. Although these displayed no overt conservation with orthologous zebrafish sequences, four regions nonetheless acted as tissue-specific transcriptional enhancers in the zebrafish embryo, and together recapitulated the major aspects of col18a1 expression. Additional post-hoc computational analysis on positive enhancer sequences revealed alignments between mammalian and teleost sequences, which we hypothesize predict the corresponding zebrafish enhancers; for one of these, we demonstrate functional overlap with the orthologous human enhancer sequence. Our results provide important insight into the biological function and regulation of COL18A1, and point to additional sequences that may contribute to complex diseases involving COL18A1. More generally, we show that combining functional data with targeted analyses for phylogenetic conservation can reveal conserved cis-regulatory elements in the large number of cases where computational alignment alone falls short. Copyright 2009 Elsevier Inc. All rights reserved.

  18. Identifying core habitat and connectivity for focal species in the interior cedar-hemlock forest of North America to complete a conservation area design

    Treesearch

    Lance Craighead; Baden Cross

    2007-01-01

    To identify the remaining areas of the Interior Cedar-Hemlock Forest of North America and prioritize them for conservation planning, the Craighead Environmental Research Institute has developed a 2-scale method for mapping critical habitat utilizing 1) a broad-scale model to identify important regional locations as the basis for a Conservation Area Design (CAD), and 2...

  19. Noroviruses Co-opt the Function of Host Proteins VAPA and VAPB for Replication via a Phenylalanine-Phenylalanine-Acidic-Tract-Motif Mimic in Nonstructural Viral Protein NS1/2.

    PubMed

    McCune, Broc T; Tang, Wei; Lu, Jia; Eaglesham, James B; Thorne, Lucy; Mayer, Anne E; Condiff, Emily; Nice, Timothy J; Goodfellow, Ian; Krezel, Andrzej M; Virgin, Herbert W

    2017-07-11

    The Norovirus genus contains important human pathogens, but the role of host pathways in norovirus replication is largely unknown. Murine noroviruses provide the opportunity to study norovirus replication in cell culture and in small animals. The human norovirus nonstructural protein NS1/2 interacts with the host protein VAMP-associated protein A (VAPA), but the significance of the NS1/2-VAPA interaction is unexplored. Here we report decreased murine norovirus replication in VAPA- and VAPB-deficient cells. We characterized the role of VAPA in detail. VAPA was required for the efficiency of a step(s) in the viral replication cycle after entry of viral RNA into the cytoplasm but before the synthesis of viral minus-sense RNA. The interaction of VAPA with viral NS1/2 proteins is conserved between murine and human noroviruses. Murine norovirus NS1/2 directly bound the major sperm protein (MSP) domain of VAPA through its NS1 domain. Mutations within NS1 that disrupted interaction with VAPA inhibited viral replication. Structural analysis revealed that the viral NS1 domain contains a mimic of the phenylalanine-phenylalanine-acidic-tract (FFAT) motif that enables host proteins to bind to the VAPA MSP domain. The NS1/2-FFAT mimic region interacted with the VAPA-MSP domain in a manner similar to that seen with bona fide host FFAT motifs. Amino acids in the FFAT mimic region of the NS1 domain that are important for viral replication are highly conserved across murine norovirus strains. Thus, VAPA interaction with a norovirus protein that functionally mimics host FFAT motifs is important for murine norovirus replication. IMPORTANCE Human noroviruses are a leading cause of gastroenteritis worldwide, but host factors involved in norovirus replication are incompletely understood. Murine noroviruses have been studied to define mechanisms of norovirus replication. Here we defined the importance of the interaction between the hitherto poorly studied NS1/2 norovirus protein and the

  20. MDD-carb: a combinatorial model for the identification of protein carbonylation sites with substrate motifs.

    PubMed

    Kao, Hui-Ju; Weng, Shun-Long; Huang, Kai-Yao; Kaunang, Fergie Joanda; Hsu, Justin Bo-Kai; Huang, Chien-Hsun; Lee, Tzong-Yi

    2017-12-21

    Carbonylation, which takes place through oxidation of reactive oxygen species (ROS) on specific residues, is an irreversibly oxidative modification of proteins. It has been reported that the carbonylation is related to a number of metabolic or aging diseases including diabetes, chronic lung disease, Parkinson's disease, and Alzheimer's disease. Due to the lack of computational methods dedicated to exploring motif signatures of protein carbonylation sites, we were motivated to exploit an iterative statistical method to characterize and identify carbonylated sites with motif signatures. By manually curating experimental data from research articles, we obtained 332, 144, 135, and 140 verified substrate sites for K (lysine), R (arginine), T (threonine), and P (proline) residues, respectively, from 241 carbonylated proteins. In order to examine the informative attributes for classifying between carbonylated and non-carbonylated sites, multifarious features including composition of twenty amino acids (AAC), composition of amino acid pairs (AAPC), position-specific scoring matrix (PSSM), and positional weighted matrix (PWM) were investigated in this study. Additionally, in an attempt to explore the motif signatures of carbonylation sites, an iterative statistical method was adopted to detect statistically significant dependencies of amino acid compositions between specific positions around substrate sites. Profile hidden Markov model (HMM) was then utilized to train a predictive model from each motif signature. Moreover, based on the method of support vector machine (SVM), we adopted it to construct an integrative model by combining the values of bit scores obtained from profile HMMs. The combinatorial model could provide an enhanced performance with evenly predictive sensitivity and specificity in the evaluation of cross-validation and independent testing. This study provides a new scheme for exploring potential motif signatures at substrate sites of protein

  1. A Comparison Study for DNA Motif Modeling on Protein Binding Microarray.

    PubMed

    Wong, Ka-Chun; Li, Yue; Peng, Chengbin; Wong, Hau-San

    2016-01-01

    Transcription factor binding sites (TFBSs) are relatively short (5-15 bp) and degenerate. Identifying them is a computationally challenging task. In particular, protein binding microarray (PBM) is a high-throughput platform that can measure the DNA binding preference of a protein in a comprehensive and unbiased manner; for instance, a typical PBM experiment can measure binding signal intensities of a protein to all possible DNA k-mers (k = 8∼10). Since proteins can often bind to DNA with different binding intensities, one of the major challenges is to build TFBS (also known as DNA motif) models which can fully capture the quantitative binding affinity data. To learn DNA motif models from the non-convex objective function landscape, several optimization methods are compared and applied to the PBM motif model building problem. In particular, representative methods from different optimization paradigms have been chosen for modeling performance comparison on hundreds of PBM datasets. The results suggest that the multimodal optimization methods are very effective for capturing the binding preference information from PBM data. In particular, we observe a general performance improvement if choosing di-nucleotide modeling over mono-nucleotide modeling. In addition, the models learned by the best-performing method are applied to two independent applications: PBM probe rotation testing and ChIP-Seq peak sequence prediction, demonstrating its biological applicability.
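
    The size of the search space behind "all possible DNA k-mers" is easy to work out: there are 4^k sequences of length k, and when a k-mer and its reverse complement are treated as the same double-stranded site, the count roughly halves. The sketch below enumerates both figures for k = 8. Collapsing by reverse complement is a common convention for double-stranded probes and is an assumption here, since the abstract does not state it. The function names are illustrative.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(kmer):
    """Reverse complement of a DNA k-mer."""
    return kmer.translate(COMPLEMENT)[::-1]


def count_kmers(k):
    """Number of distinct DNA k-mers, with and without strand collapsing."""
    all_kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # Keep one representative per k-mer / reverse-complement pair.
    canonical = {min(kmer, reverse_complement(kmer)) for kmer in all_kmers}
    return len(all_kmers), len(canonical)


total_8, canonical_8 = count_kmers(8)
```

    For even k the collapsed count is (4^k + 4^(k/2)) / 2, because the 4^(k/2) palindromic k-mers are their own reverse complements. For k = 8 that gives 32,896 strand-independent 8-mers out of 65,536.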

  2. Antibody Light-Chain-Restricted Recognition of the Site of Immune Pressure in the RV144 HIV-1 Vaccine Trial Is Phylogenetically Conserved

    DOE PAGES

    Wiehe, Kevin; Easterhoff, David; Luo, Kan; ...

    2014-11-29

    In HIV-1, the ability to mount antibody responses to conserved, neutralizing epitopes is critical for protection. Here we have studied the light chain usage of human and rhesus macaque antibodies targeted to a dominant region of the HIV-1 envelope second variable (V2) region involving lysine (K) 169, the site of immune pressure in the RV144 vaccine efficacy trial. We found that humans and rhesus macaques used orthologous lambda variable gene segments encoding a glutamic acid-aspartic acid (ED) motif for K169 recognition. Structure determination of an unmutated ancestor antibody demonstrated that the V2 binding site was preconfigured for ED motif-mediated recognition prior to maturation. Thus, light chain usage for recognition of the site of immune pressure in the RV144 trial is highly conserved across species. In conclusion, these data indicate that the HIV-1 K169-recognizing ED motif has persisted over the diversification between rhesus macaques and humans, suggesting an evolutionary advantage of this antibody recognition mode.

  3. Antibody Light-Chain-Restricted Recognition of the Site of Immune Pressure in the RV144 HIV-1 Vaccine Trial Is Phylogenetically Conserved

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wiehe, Kevin; Easterhoff, David; Luo, Kan

    In HIV-1, the ability to mount antibody responses to conserved, neutralizing epitopes is critical for protection. Here we have studied the light chain usage of human and rhesus macaque antibodies targeted to a dominant region of the HIV-1 envelope second variable (V2) region involving lysine (K) 169, the site of immune pressure in the RV144 vaccine efficacy trial. We found that humans and rhesus macaques used orthologous lambda variable gene segments encoding a glutamic acid-aspartic acid (ED) motif for K169 recognition. Structure determination of an unmutated ancestor antibody demonstrated that the V2 binding site was preconfigured for ED motif-mediated recognition prior to maturation. Thus, light chain usage for recognition of the site of immune pressure in the RV144 trial is highly conserved across species. In conclusion, these data indicate that the HIV-1 K169-recognizing ED motif has persisted over the diversification between rhesus macaques and humans, suggesting an evolutionary advantage of this antibody recognition mode.

  4. A subclass of plant heat shock cognate 70 chaperones carries a motif that facilitates trafficking through plasmodesmata

    PubMed Central

    Aoki, Koh; Kragler, Friedrich; Xoconostle-Cázares, Beatriz; Lucas, William J.

    2002-01-01

    Plasmodesmata establish a pathway for the trafficking of non-cell-autonomously acting proteins and ribonucleoprotein complexes. Plasmodesmal enriched cell fractions and the contents of enucleate sieve elements, in the form of phloem sap, were used to isolate and characterize heat shock cognate 70 (Hsc70) chaperones associated with this cell-to-cell transport pathway. Three Cucurbita maxima Hsc70 chaperones were cloned and functional and sequence analysis led to the identification of a previously uncharacterized subclass of non-cell-autonomous chaperones. The highly conserved nature of the heat shock protein 70 (Hsp70) family, in conjunction with mutant analysis, permitted the characterization of a motif that allows these Hsc70 chaperones to engage the plasmodesmal non-cell-autonomous translocation machinery. Proof of concept that this motif is necessary for Hsp70 gain-of-movement function was obtained through the engineering of a human Hsp70 that acquired the capacity to traffic through plasmodesmata. These results are discussed in terms of the roles likely played by this subclass of Hsc70 chaperones in the trafficking of non-cell-autonomous proteins. PMID:12456884

  5. Identification and Targeting of an Interaction between a Tyrosine Motif within Hepatitis C Virus Core Protein and AP2M1 Essential for Viral Assembly

    PubMed Central

    Ziv-Av, Amotz; Gerber, Doron; Jacob, Yves; Einav, Shirit

    2012-01-01

    Novel therapies are urgently needed against hepatitis C virus infection (HCV), a major global health problem. The current model of infectious virus production suggests that HCV virions are assembled on or near the surface of lipid droplets, acquire their envelope at the ER, and egress through the secretory pathway. The mechanisms of HCV assembly and particularly the role of viral-host protein-protein interactions in mediating this process are, however, poorly understood. We identified a conserved heretofore unrecognized YXXΦ motif (Φ is a bulky hydrophobic residue) within the core protein. This motif is homologous to sorting signals within host cargo proteins known to mediate binding of AP2M1, the μ subunit of clathrin adaptor protein complex 2 (AP-2), and intracellular trafficking. Using microfluidics affinity analysis, protein-fragment complementation assays, and co-immunoprecipitations in infected cells, we show that this motif mediates core binding to AP2M1. YXXΦ mutations, silencing AP2M1 expression or overexpressing a dominant negative AP2M1 mutant had no effect on HCV RNA replication, however, they dramatically inhibited intra- and extracellular infectivity, consistent with a defect in viral assembly. Quantitative confocal immunofluorescence analysis revealed that core's YXXΦ motif mediates recruitment of AP2M1 to lipid droplets and that the observed defect in HCV assembly following disruption of core-AP2M1 binding correlates with accumulation of core on lipid droplets, reduced core colocalization with E2 and reduced core localization to trans-Golgi network (TGN), the presumed site of viral particles maturation. Furthermore, AAK1 and GAK, serine/threonine kinases known to stimulate binding of AP2M1 to host cargo proteins, regulate core-AP2M1 binding and are essential for HCV assembly. Last, approved anti-cancer drugs that inhibit AAK1 or GAK not only disrupt core-AP2M1 binding, but also significantly inhibit HCV assembly and infectious virus production
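    As an illustration of the YXXΦ pattern described in this abstract, the following minimal Python sketch scans a protein sequence for candidate matches. It assumes that Φ means one of L, I, M, F or V (the abstract says only "bulky hydrophobic residue"); the function name and any input sequence are hypothetical, and a sequence match alone says nothing about AP2M1 binding.

```python
import re

# Assumption: "bulky hydrophobic" (Φ) is taken to mean L, I, M, F or V.
# The lookahead wrapper lets overlapping candidates be reported as well.
YXXPHI = re.compile(r"(?=(Y..[LIMFV]))")


def find_yxxphi(seq):
    """Return (1-based start, tetrapeptide) for each YXXΦ candidate in seq."""
    return [(m.start() + 1, m.group(1)) for m in YXXPHI.finditer(seq.upper())]
```

    Calling find_yxxphi on a one-letter protein string returns the position and text of every candidate tetrapeptide, which can then be checked against experimental evidence.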

  6. Modeling protein homopolymeric repeats: possible polyglutamine structural motifs for Huntington's disease.

    PubMed

    Lathrop, R H; Casale, M; Tobias, D J; Marsh, J L; Thompson, L M

    1998-01-01

    We describe a prototype system (Poly-X) for assisting an expert user in modeling protein repeats. Poly-X reduces the large number of degrees of freedom required to specify a protein motif in complete atomic detail. The result is a small number of parameters that are easily understood by, and under the direct control of, a domain expert. The system was applied to the polyglutamine (poly-Q) repeat in the first exon of huntingtin, the gene implicated in Huntington's disease. We present four poly-Q structural motifs: two poly-Q beta-sheet motifs (parallel and antiparallel) that constitute plausible alternatives to a similar previously published poly-Q beta-sheet motif, and two novel poly-Q helix motifs (alpha-helix and pi-helix). To our knowledge, helical forms of polyglutamine have not been proposed before. The motifs suggest that there may be several plausible aggregation structures for the intranuclear inclusion bodies which have been found in diseased neurons, and may help in the effort to understand the structural basis for Huntington's disease.

  7. Functional structural motifs for protein-ligand, protein-protein, and protein-nucleic acid interactions and their connection to supersecondary structures.

    PubMed

    Kinjo, Akira R; Nakamura, Haruki

    2013-01-01

    Protein functions are mediated by interactions between proteins and other molecules. One useful approach to analyze protein functions is to compare and classify the structures of interaction interfaces of proteins. Here, we describe the procedures for compiling a database of interface structures and efficiently comparing the interface structures. To do so requires a good understanding of the data structures of the Protein Data Bank (PDB). Therefore, we also provide a detailed account of the PDB exchange dictionary necessary for extracting data that are relevant for analyzing interaction interfaces and secondary structures. We identify recurring structural motifs by classifying similar interface structures, and we define a coarse-grained representation of supersecondary structures (SSS) which represents a sequence of two or three secondary structure elements including their relative orientations as a string of four to seven letters. By examining the correspondence between structural motifs and SSS strings, we show that no SSS string has particularly high propensity to be found in interaction interfaces in general, indicating any SSS can be used as a binding interface. When individual structural motifs are examined, there are some SSS strings that have high propensity for particular groups of structural motifs. In addition, it is shown that while the SSS strings found in particular structural motifs for nonpolymer and protein interfaces are as abundant as in other structural motifs that belong to the same subunit, structural motifs for nucleic acid interfaces exhibit somewhat stronger preference for SSS strings. In regard to protein folds, many motif-specific SSS strings were found across many folds, suggesting that SSS may be a useful description to investigate the universality of ligand binding modes.

  8. Transcriptome-wide analysis of the Trypanosoma cruzi proliferative cycle identifies the periodically expressed mRNAs and their multiple levels of control

    PubMed Central

    Chávez, Santiago; Eastman, Guillermo; Smircich, Pablo; Becco, Lorena Lourdes; Oliveira-Rizzo, Carolina; Fort, Rafael; Potenza, Mariana; Garat, Beatriz; Sotelo-Silveira, José Roberto

    2017-01-01

    Trypanosoma cruzi is the protozoan parasite causing American trypanosomiasis or Chagas disease, a neglected parasitosis with important human health impact in Latin America. The efficacy of current therapy is limited, and its toxicity is high. Since parasite proliferation is a fundamental target for rational drug design, we sought to advance its understanding by applying a genome-wide approach. Treating a TcI lineage strain with hydroxyurea, we isolated epimastigotes in late G1, S and G2/M cell cycle stages at 70% purity. The sequencing of each phase identified 305 stage-specific transcripts (1.5-fold change, p≤0.01), coding for conserved cell cycle regulated proteins and numerous proteins whose cell cycle dependence has not been recognized before. Comparisons with the parasite T. brucei and the human host reveal important differences. The meta-analysis of T. cruzi transcriptomic and ribonomic data indicates that cell cycle regulated mRNAs are subject to sub-cellular compartmentalization. Compositional and structural biases of these genes (including CAI, GC content, UTR length, and polycistron position) may contribute to their regulation. To discover nucleotide motifs responsible for the co-regulation of cell cycle regulated genes, we looked for overrepresented motifs at their UTRs and found a variant of the cell cycle sequence motif at the 3' UTR of most of the S and G2 stage genes. We additionally identified hairpin structures at the 5' UTRs of a high proportion of the transcripts, suggesting that periodic gene expression might also rely on translation initiation in T. cruzi. In summary, we report a comprehensive list of T. cruzi cell cycle regulated genes, including many previously unstudied proteins, we show evidence favoring a multi-step control of their expression, and we identify mRNA motifs that may mediate their regulation. Our results provide novel information of the T. cruzi proliferative proteins and the integrated levels of their gene expression

  9. Structural Integrity of the Greek Key Motif in βγ-Crystallins Is Vital for Central Eye Lens Transparency

    PubMed Central

    Vendra, Venkata Pulla Rao; Agarwal, Garima; Chandani, Sushil; Talla, Venu; Srinivasan, Narayanaswamy; Balasubramanian, Dorairajan

    2013-01-01

    Background: We highlight an unrecognized physiological role for the Greek key motif, an evolutionarily conserved super-secondary structural topology of the βγ-crystallins. These proteins constitute the bulk of the human eye lens, packed at very high concentrations in a compact, globular, short-range order, generating transparency. Congenital cataract (affecting 400,000 newborns yearly worldwide), associated with 54 mutations in βγ-crystallins, occurs in two major phenotypes: nuclear cataract, which blocks the central visual axis, hampering the development of the growing eye and demanding earliest intervention, and the milder peripheral progressive cataract where surgery can wait. In order to understand this phenotypic dichotomy at the molecular level, we have studied the structural and aggregation features of representative mutations. Methods: Wild type and several representative mutant proteins were cloned, expressed and purified and their secondary and tertiary structural details, as well as structural stability, were compared in solution, using spectroscopy. Their tendencies to aggregate in vitro and in cellulo were also compared. In addition, we analyzed their structural differences by molecular modeling in silico. Results: Based on their properties, mutants are seen to fall into two classes. Mutants A36P, L45PL54P, R140X, and G165fs display lowered solubility and structural stability, expose several buried residues to the surface, aggregate in vitro and in cellulo, and disturb/distort the Greek key motif, and they are associated with nuclear cataract. In contrast, mutants P24T and R77S, associated with peripheral cataract, behave quite similarly to the wild type molecule, and do not affect the Greek key topology. Conclusion: When a mutation distorts even one of the four Greek key motifs, the protein readily self-aggregates and precipitates, consistent with the phenotype of nuclear cataract, while mutations not affecting the motif display ‘native state

  10. Nucleophosmin integrates within the nucleolus via multi-modal interactions with proteins displaying R-rich linear motifs and rRNA.

    PubMed

    Mitrea, Diana M; Cika, Jaclyn A; Guy, Clifford S; Ban, David; Banerjee, Priya R; Stanley, Christopher B; Nourse, Amanda; Deniz, Ashok A; Kriwacki, Richard W

    2016-02-02

    The nucleolus is a membrane-less organelle formed through liquid-liquid phase separation of its components from the surrounding nucleoplasm. Here, we show that nucleophosmin (NPM1) integrates within the nucleolus via a multi-modal mechanism involving multivalent interactions with proteins containing arginine-rich linear motifs (R-motifs) and ribosomal RNA (rRNA). Importantly, these R-motifs are found in canonical nucleolar localization signals. Based on a novel combination of biophysical approaches, we propose a model for the molecular organization within liquid-like droplets formed by the N-terminal domain of NPM1 and R-motif peptides, thus providing insights into the structural organization of the nucleolus. We identify multivalency of acidic tracts and folded nucleic acid binding domains, mediated by N-terminal domain oligomerization, as structural features required for phase separation of NPM1 with other nucleolar components in vitro and for localization within mammalian nucleoli. We propose that one mechanism of nucleolar localization involves phase separation of proteins within the nucleolus.

  11. The Rho ADP-ribosylating C3 exoenzyme binds cells via an Arg-Gly-Asp motif.

    PubMed

    Rohrbeck, Astrid; Höltje, Markus; Adolf, Andrej; Oms, Elisabeth; Hagemann, Sandra; Ahnert-Hilger, Gudrun; Just, Ingo

    2017-10-27

    The Rho ADP-ribosylating C3 exoenzyme (C3bot) is a bacterial protein toxin devoid of a cell-binding or -translocation domain. Nevertheless, C3 can efficiently enter intact cells, including neurons, but the mechanism of C3 binding and uptake is not yet understood. Previously, we identified the intermediate filament vimentin as an extracellular membranous interaction partner of C3. However, uptake of C3 into cells still occurs (although reduced) in the absence of vimentin, indicating involvement of an additional host cell receptor. C3 harbors an Arg-Gly-Asp (RGD) motif, which is the major integrin-binding site, present in a variety of integrin ligands. To check whether the RGD motif of C3 is involved in binding to cells, we performed a competition assay with C3 and RGD peptide or with a monoclonal antibody binding to β1-integrin subunit and binding assays in different cell lines, primary neurons, and synaptosomes with C3-RGD mutants. Here, we report that preincubation of cells with the GRGDNP peptide strongly reduced C3 binding to cells. Moreover, mutation of the RGD motif reduced C3 binding to intact cells and also to recombinant vimentin. Anti-integrin antibodies also lowered the C3 binding to cells. Our results indicate that the RGD motif of C3 is at least one essential C3 motif for binding to host cells and that integrin is an additional receptor for C3 besides vimentin. © 2017 by The American Society for Biochemistry and Molecular Biology, Inc.

  12. The BaMM web server for de-novo motif discovery and regulatory sequence analysis.

    PubMed

    Kiesel, Anja; Roth, Christian; Ge, Wanwan; Wess, Maximilian; Meier, Markus; Söding, Johannes

    2018-05-28

    The BaMM web server offers four tools: (i) de-novo discovery of enriched motifs in a set of nucleotide sequences, (ii) scanning a set of nucleotide sequences with motifs to find motif occurrences, (iii) searching with an input motif for similar motifs in our BaMM database with motifs for >1000 transcription factors, trained from the GTRD ChIP-seq database and (iv) browsing and keyword searching the motif database. In contrast to most other servers, we represent sequence motifs not by position weight matrices (PWMs) but by Bayesian Markov Models (BaMMs) of order 4, which we showed previously to perform substantially better in ROC analyses than PWMs or first order models. To address the inadequacy of P- and E-values as measures of motif quality, we introduce the AvRec score, the average recall over the TP-to-FP ratio between 1 and 100. The BaMM server is freely accessible without registration at https://bammmotif.mpibpc.mpg.de.

  13. Identifying Priority Areas for Conservation: A Global Assessment for Forest-Dependent Birds

    PubMed Central

    Buchanan, Graeme M.; Donald, Paul F.; Butchart, Stuart H. M.

    2011-01-01

    Limited resources are available to address the world's growing environmental problems, requiring conservationists to identify priority sites for action. Using new distribution maps for all of the world's forest-dependent birds (60.6% of all bird species), we quantify the contribution of remaining forest to conserving global avian biodiversity. For each of the world's partly or wholly forested 5-km cells, we estimated an impact score of its contribution to the distribution of all the forest bird species estimated to occur within it, and so is proportional to the impact on the conservation status of the world's forest-dependent birds were the forest it contains lost. The distribution of scores was highly skewed, a very small proportion of cells having scores several orders of magnitude above the global mean. Ecoregions containing the highest values of this score included relatively species-poor islands such as Hawaii and Palau, the relatively species-rich islands of Indonesia and the Philippines, and the megadiverse Atlantic Forests and northern Andes of South America. Ecoregions with high impact scores and high deforestation rates (2000–2005) included montane forests in Cameroon and the Eastern Arc of Tanzania, although deforestation data were not available for all ecoregions. Ecoregions with high impact scores, high rates of recent deforestation and low coverage by the protected area network included Indonesia's Seram rain forests and the moist forests of Trinidad and Tobago. Key sites in these ecoregions represent some of the most urgent priorities for expansion of the global protected areas network to meet Convention on Biological Diversity targets to increase the proportion of land formally protected to 17% by 2020. Areas with high impact scores, rapid deforestation, low protection and high carbon storage values may represent significant opportunities for both biodiversity conservation and climate change mitigation, for example through Reducing Emissions from

  14. hfAIM: A reliable bioinformatics approach for in silico genome-wide identification of autophagy-associated Atg8-interacting motifs in various organisms

    PubMed Central

    Xie, Qingjun; Tzfadia, Oren; Levy, Matan; Weithorn, Efrat; Peled-Zehavi, Hadas; Van Parys, Thomas; Van de Peer, Yves; Galili, Gad

    2016-01-01

    Most of the proteins that are specifically turned over by selective autophagy are recognized by the presence of short Atg8 interacting motifs (AIMs) that facilitate their association with the autophagy apparatus. Such AIMs can be identified by bioinformatics methods based on their defined degenerate consensus F/W/Y-X-X-L/I/V sequences in which X represents any amino acid. Achieving reliability and/or fidelity of the prediction of such AIMs on a genome-wide scale represents a major challenge. Here, we present a bioinformatics approach, high fidelity AIM (hfAIM), which uses additional sequence requirements (the presence of acidic amino acids and the absence of positively charged amino acids in certain positions) to reliably identify AIMs in proteins. We demonstrate that the use of the hfAIM method allows for in silico high fidelity prediction of AIMs in AIM-containing proteins (ACPs) on a genome-wide scale in various organisms. Furthermore, by using hfAIM to identify putative AIMs in the Arabidopsis proteome, we illustrate a potential contribution of selective autophagy to various biological processes. More specifically, we identified 9 peroxisomal PEX proteins that contain hfAIM motifs, among which AtPEX1, AtPEX6 and AtPEX10 possess evolutionary-conserved AIMs. Bimolecular fluorescence complementation (BiFC) results verified that AtPEX6 and AtPEX10 indeed interact with Atg8 in planta. In addition, we show that mutations occurring within or nearby hfAIMs in PEX1, PEX6 and PEX10 caused defects in the growth and development of various organisms. Taken together, the above results suggest that the hfAIM tool can be used to effectively perform genome-wide in silico screens of proteins that are potentially regulated by selective autophagy. The hfAIM system is a web tool that can be accessed at http://bioinformatics.psb.ugent.be/hfAIM/. PMID:27071037
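    To see why the degenerate consensus alone is a weak predictor, a back-of-the-envelope estimate helps. The sketch below matches only the core F/W/Y-X-X-L/I/V consensus quoted in the abstract (not the additional hfAIM requirements, whose exact positions are not given here) and estimates how many matches would arise by chance, under the simplifying assumption that all 20 amino acids are equally likely and independent. The function names are hypothetical.

```python
import re

# Core consensus from the abstract: [FWY]-X-X-[LIV]. The lookahead keeps
# overlapping matches. This is NOT the hfAIM rule set, only the bare consensus.
CORE_AIM = re.compile(r"(?=([FWY]..[LIV]))")


def find_core_aims(seq):
    """Return (1-based start, tetrapeptide) for each core-consensus match."""
    return [(m.start() + 1, m.group(1)) for m in CORE_AIM.finditer(seq.upper())]


def expected_chance_hits(length):
    """Expected core matches in a random sequence of the given length.

    Assumption: residues are uniform and independent, so a window matches
    with probability (3/20) * (3/20).
    """
    p_match = (3 / 20) * (3 / 20)
    windows = max(length - 3, 0)  # number of tetrapeptide windows
    return windows * p_match
```

    Under these assumptions a 500-residue protein would be expected to contain about 11 chance matches, which is consistent with the abstract's point that additional sequence requirements are needed for reliable genome-wide prediction.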

  15. Rules for the recognition of dilysine retrieval motifs by coatomer

    PubMed Central

    Ma, Wenfu; Goldberg, Jonathan

    2013-01-01

    Cytoplasmic dilysine motifs on transmembrane proteins are captured by coatomer α-COP and β′-COP subunits and packaged into COPI-coated vesicles for Golgi-to-ER retrieval. Numerous ER/Golgi proteins contain K(x)Kxx motifs, but the rules for their recognition are unclear. We present crystal structures of α-COP and β′-COP bound to a series of naturally occurring retrieval motifs—encompassing KKxx, KxKxx and non-canonical RKxx and viral KxHxx sequences. Binding experiments show that α-COP and β′-COP have generally the same specificity for KKxx and KxKxx, but only β′-COP recognizes the RKxx signal. Dilysine motif recognition involves lysine side-chain interactions with two acidic patches. Surprisingly, however, KKxx and KxKxx motifs bind differently, with their lysine residues transposed at the binding patches. We derive rules for retrieval motif recognition from key structural features: the reversed binding modes, the recognition of the C-terminal carboxylate group which enforces lysine positional context, and the tolerance of the acidic patches for non-lysine residues. PMID:23481256
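    The C-terminal patterns named in this abstract can be expressed as anchored regular expressions. The following is a minimal sketch, assuming one-letter amino-acid codes and treating "x" as any residue; the dictionary and function names are hypothetical, and a pattern match does not by itself imply coatomer binding.

```python
import re

# Patterns named in the abstract. "$" pins each match to the carboxy
# terminus, reflecting the positional context described there.
RETRIEVAL_PATTERNS = {
    "KKxx": re.compile(r"KK..$"),
    "KxKxx": re.compile(r"K.K..$"),
    "RKxx": re.compile(r"RK..$"),
    "KxHxx": re.compile(r"K.H..$"),
}


def classify_c_terminus(seq):
    """Return the names of all retrieval patterns matched by the C-terminus."""
    seq = seq.strip().upper()
    return [name for name, pat in RETRIEVAL_PATTERNS.items() if pat.search(seq)]
```

    A sequence can match more than one pattern (a tail such as KKKxx fits both KKxx and KxKxx), so the function returns a list instead of a single label.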

  16. Lariat sequencing in a unicellular yeast identifies regulated alternative splicing of exons that are evolutionarily conserved with humans.

    PubMed

    Awan, Ali R; Manfredo, Amanda; Pleiss, Jeffrey A

    2013-07-30

    Alternative splicing is a potent regulator of gene expression that vastly increases proteomic diversity in multicellular eukaryotes and is associated with organismal complexity. Although alternative splicing is widespread in vertebrates, little is known about the evolutionary origins of this process, in part because of the absence of phylogenetically conserved events that cross major eukaryotic clades. Here we describe a lariat-sequencing approach, which offers high sensitivity for detecting splicing events, and its application to the unicellular fungus, Schizosaccharomyces pombe, an organism that shares many of the hallmarks of alternative splicing in mammalian systems but for which no previous examples of exon-skipping had been demonstrated. Over 200 previously unannotated splicing events were identified, including examples of regulated alternative splicing. Remarkably, an evolutionary analysis of four of the exons identified here as subject to skipping in S. pombe reveals high sequence conservation and perfect length conservation with their homologs in scores of plants, animals, and fungi. Moreover, alternative splicing of two of these exons has been documented in multiple vertebrate organisms, making these the first demonstrations of identical alternative-splicing patterns in species that are separated by over 1 billion years of evolution.

  17. A conserved intronic U1 snRNP-binding sequence promotes trans-splicing in Drosophila

    PubMed Central

    Gao, Jun-Li; Fan, Yu-Jie; Wang, Xiu-Ye; Zhang, Yu; Pu, Jia; Li, Liang; Shao, Wei; Zhan, Shuai; Hao, Jianjiang

    2015-01-01

    Unlike typical cis-splicing, trans-splicing joins exons from two separate transcripts to produce chimeric mRNA and has been detected in most eukaryotes. Trans-splicing in trypanosomes and nematodes has been characterized as a spliced leader RNA-facilitated reaction; in contrast, its mechanism in higher eukaryotes remains unclear. Here we investigate mod(mdg4), a classic trans-spliced gene in Drosophila, and report that two critical RNA sequences in the middle of the last 5′ intron, TSA and TSB, promote trans-splicing of mod(mdg4). In TSA, a 13-nucleotide (nt) core motif is conserved across Drosophila species and is essential and sufficient for trans-splicing, which binds U1 small nuclear RNP (snRNP) through strong base-pairing with U1 snRNA. In TSB, a conserved secondary structure acts as an enhancer. Deletions of TSA and TSB using the CRISPR/Cas9 system result in developmental defects in flies. Although it is not clear how the 5′ intron finds the 3′ introns, compensatory changes in U1 snRNA rescue trans-splicing of TSA mutants, demonstrating that U1 recruitment is critical to promote trans-splicing in vivo. Furthermore, TSA core-like motifs are found in many other trans-spliced Drosophila genes, including lola. These findings represent a novel mechanism of trans-splicing, in which RNA motifs in the 5′ intron are sufficient to bring separate transcripts into close proximity to promote trans-splicing. PMID:25838544

  18. Identification of amino acid residues in protein SRP72 required for binding to a kinked 5e motif of the human signal recognition particle RNA.

    PubMed

    Iakhiaeva, Elena; Iakhiaev, Alexei; Zwieb, Christian

    2010-11-13

    Human cells depend critically on the signal recognition particle (SRP) for the sorting and delivery of their proteins. The SRP is a ribonucleoprotein complex which binds to signal sequences of secretory polypeptides as they emerge from the ribosome. Among the six proteins of the eukaryotic SRP, the largest protein, SRP72, is essential for protein targeting and possesses a poorly characterized RNA binding domain. We delineated the minimal region of SRP72 capable of forming a stable complex with an SRP RNA fragment. The region encompassed residues 545 to 585 of the full-length human SRP72 and contained a lysine-rich cluster (KKKKKKKKGK) at positions 552 to 561 as well as a conserved Pfam motif with the sequence PDPXRWLPXXER at positions 572 to 583. We demonstrated by site-directed mutagenesis that both regions participated in the formation of a complex with the RNA. In agreement with biochemical data and results from chymotryptic digestion experiments, molecular modeling of SRP72 implied that the invariant W577 was located inside the predicted structure of an RNA binding domain. The 11-nucleotide 5e motif contained within the SRP RNA fragment was shown by comparative electrophoresis on native polyacrylamide gels to conform to an RNA kink-turn. The model of the complex suggested that the conserved A240 of the K-turn, previously identified as being essential for the binding to SRP72, could protrude into a groove of the SRP72 RNA binding domain, similar but not identical to how other K-turn recognizing proteins interact with RNA. The results from the presented experiments provided insights into the molecular details of a functionally important and structurally interesting RNA-protein interaction. A model for how a ligand binding pocket of SRP72 can accommodate a new RNA K-turn in the 5e region of the eukaryotic SRP RNA is proposed.

  19. Identifying protein phosphorylation sites with kinase substrate specificity on human viruses.

    PubMed

    Bretaña, Neil Arvin; Lu, Cheng-Tsung; Chiang, Chiu-Yun; Su, Min-Gang; Huang, Kai-Yao; Lee, Tzong-Yi; Weng, Shun-Long

    2012-01-01

    Viruses infect humans and progress inside the body leading to various diseases and complications. The phosphorylation of viral proteins catalyzed by host kinases plays crucial regulatory roles in enhancing replication and inhibition of normal host-cell functions. Due to its biological importance, there is a desire to identify the protein phosphorylation sites on human viruses. However, the use of mass spectrometry-based experiments has proven to be expensive and labor-intensive. Furthermore, previous studies which have identified phosphorylation sites in human viruses do not include the investigation of the responsible kinases. Thus, we are motivated to propose a new method to identify protein phosphorylation sites with their kinase substrate specificity on human viruses. The experimentally verified phosphorylation data were extracted from virPTM, a database containing 301 experimentally verified phosphorylation data on 104 human kinase-phosphorylated virus proteins. In an attempt to investigate kinase substrate specificities in viral protein phosphorylation sites, maximal dependence decomposition (MDD) is employed to cluster a large set of phosphorylation data into subgroups containing significantly conserved motifs. The experimental human phosphorylation sites are collected from Phospho.ELM, grouped according to their kinase annotation, and compared with the virus MDD clusters. This investigation identifies human kinases such as CK2, PKB, CDK, and MAPK as potential kinases for catalyzing virus protein substrates as confirmed by published literature. A profile hidden Markov model is then applied to learn a predictive model for each subgroup. A five-fold cross validation evaluation on the MDD-clustered HMMs yields an average accuracy of 84.93% for Serine, and 78.05% for Threonine. Furthermore, an independent testing data set collected from UniProtKB and Phospho.ELM is used to make a comparison of predictive performance on three popular kinase-specific phosphorylation site

  20. Mutation of the Conserved Calcium-Binding Motif in Neisseria gonorrhoeae PilC1 Impacts Adhesion but Not Piliation

    PubMed Central

    Cheng, Yuan; Johnson, Michael D. L.; Burillo-Kirch, Christine; Mocny, Jeffrey C.; Anderson, James E.; Garrett, Christopher K.; Redinbo, Matthew R.

    2013-01-01

    Neisseria gonorrhoeae PilC1 is a member of the PilC family of type IV pilus-associated adhesins found in Neisseria species and other type IV pilus-producing genera. Previously, a calcium-binding domain was described in the C-terminal domains of PilY1 of Pseudomonas aeruginosa and in PilC1 and PilC2 of Kingella kingae. Genetic analysis of N. gonorrhoeae revealed a similar calcium-binding motif in PilC1. To evaluate the potential significance of this calcium-binding region in N. gonorrhoeae, we produced recombinant full-length PilC1 and a PilC1 C-terminal domain fragment. We show that, while alterations of the calcium-binding motif disrupted the ability of PilC1 to bind calcium, they did not grossly affect the secondary structure of the protein. Furthermore, we demonstrate that both full-length wild-type PilC1 and full-length calcium-binding-deficient PilC1 inhibited gonococcal adherence to cultured human cervical epithelial cells, unlike the truncated PilC1 C-terminal domain. Similar to PilC1 in K. kingae, but in contrast to the calcium-binding mutant of P. aeruginosa PilY1, an equivalent mutation in N. gonorrhoeae PilC1 produced normal amounts of pili. However, the N. gonorrhoeae PilC1 calcium-binding mutant still had partial defects in gonococcal adhesion to ME180 cells and genetic transformation, which are both essential virulence factors in this human pathogen. Thus, we conclude that calcium binding to PilC1 plays a critical role in pilus function in N. gonorrhoeae. PMID:24002068