Science.gov

Sample records for compo composite motif

  1. Composite Materials Processing of Cast Iron and Ceramics Using Compo-Casting Technology

    NASA Astrophysics Data System (ADS)

    Tomita, Yoshihiro; Sumimoto, Haruyoshi

    The compo-casting technology of ceramics and cast iron is expected to be one of the major casting technologies that can expand the application fields of cast iron. This technique uses the heat energy of the molten metal to produce cast iron products that gain the functions of ceramic materials. The largest problem in compo-casting technology is crack generation caused by thermal shock. Although cracking can be prevented by preheating the ceramics to reduce thermal stress, the necessary preheating temperature is considerably high and precise control of it is difficult at practical foundry sites. In this study, we tried to numerically predict the critical preheating temperature of ceramics using thermal stress analysis in unsteady heat transfer and Newman's diagram, and found that preheating the ceramics to reduce thermal stress could be replaced by placing an appropriate cast iron cover around the ceramics. Excellent results were obtained with a method in which a ceramic bar was enclosed in a flake graphite cast iron cover and fixed in a sand mold before the molten metal was poured. Two or three ceramics were then examined simultaneously under the compo-casting condition; three specimens could be cast at the same time by adjusting the cover spacing to 15 mm. Irregularly shaped ceramics were also examined, and compo-casting was achieved by adapting the cover shape. In each condition, temperature measurements of the specimens confirmed that the cover shape derived from the analytical result was effective for compo-casting.

  2. Factoring local sequence composition in motif significance analysis.

    PubMed

    Ng, Patrick; Keich, Uri

    2008-01-01

    We recently introduced a biologically realistic and reliable significance analysis of the output of a popular class of motif finders. In this paper we further improve our significance analysis by incorporating local base composition information. Relying on realistic biological data simulation, as well as on FDR analysis applied to real data, we show that our method is significantly better than the increasingly popular practice of using the normal approximation to estimate the significance of a finder's output. Finally we turn to leveraging our reliable significance analysis to improve the actual motif finding task. Specifically, endowing a variant of the Gibbs Sampler with our improved significance analysis we demonstrate that de novo finders can perform better than has been perceived. Significantly, our new variant outperforms all the finders reviewed in a recently published comprehensive analysis of the Harbison genome-wide binding location data. Interestingly, many of these finders incorporate additional information such as nucleosome positioning and the significance of binding data.

  3. Motif types, motif locations and base composition patterns around the RNA polyadenylation site in microorganisms, plants and animals

    PubMed Central

    2014-01-01

    Background: The polyadenylation of RNA is critical for gene functioning, but the conserved sequence motifs (often called signal or signature motifs), motif locations and abundances, and base composition patterns around mRNA polyadenylation [poly(A)] sites are still uncharacterized in most species. The evolutionary tendency for poly(A) site selection is still largely unknown. Results: We analyzed the poly(A) site regions of 31 species or phyla. Different groups of species showed different poly(A) signal motifs: UUACUU at the poly(A) site in the parasite Trypanosoma cruzi; UGUAAC (approximately 13 bases upstream of the site) in the alga Chlamydomonas reinhardtii; UGUUUG (or UGUUUGUU) at mainly the fourth base downstream of the poly(A) site in the parasite Blastocystis hominis; and AAUAAA at approximately 16 bases and approximately 19 bases upstream of the poly(A) site in animals and plants, respectively. Polyadenylation signal motifs are usually several hundred times more abundant around poly(A) sites than in whole genomes. These predominant motifs usually had very specific locations, whether upstream of, at, or downstream of poly(A) sites, depending on the species or phylum. The poly(A) site was usually an adenosine (A) in all analyzed species except for B. hominis, and there was weak A predominance in C. reinhardtii. Fungi, animals, plants, and the protist Phytophthora infestans shared a general base abundance pattern (or base composition pattern) of “U-rich—A-rich—U-rich—Poly(A) site—U-rich regions”, or U-A-U-A-U for short, with some variation for each kingdom or subkingdom. Conclusion: This study identified the poly(A) signal motifs, motif locations, and base composition patterns around mRNA poly(A) sites in protists, fungi, plants, and animals and provided insight into poly(A) site evolution. PMID:25052519
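
    The motif-location result above (for example, AAUAAA roughly 16 bases upstream of the poly(A) site in animals) can be illustrated with a small scan. This is only a sketch, not the authors' pipeline: the function name, the 40-base window, the toy sequence, and the choice to measure distance from the first base of the motif to the site are all assumptions made for illustration.

```python
def find_signal_offsets(rna, site, motif="AAUAAA", window=40):
    """Return distances (in bases) from each motif start to the poly(A) site.

    `site` is the 0-based index of the poly(A) site. Only motif occurrences
    that lie entirely within `window` bases upstream of the site are reported.
    Measuring from the motif's first base is an assumption of this sketch.
    """
    start = max(0, site - window)
    upstream = rna[start:site]
    offsets = []
    pos = upstream.find(motif)
    while pos != -1:
        offsets.append(site - (start + pos))
        pos = upstream.find(motif, pos + 1)  # allow overlapping occurrences
    return offsets


# Toy sequence invented for illustration: 8 filler bases, the signal,
# 10 filler bases, then the poly(A)-site adenosine at index 24.
toy = "GGCCGGCC" + "AAUAAA" + "GCGCGCGCGC" + "A" + "UUUU"
site = 8 + 6 + 10
print(find_signal_offsets(toy, site))  # prints [16]
```

    Collecting such offsets over many annotated sites and histogramming them would show whether a motif has the kind of preferred position the abstract describes.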

  4. Composite Structural Motifs of Binding Sites for Delineating Biological Functions of Proteins

    PubMed Central

    Kinjo, Akira R.; Nakamura, Haruki

    2012-01-01

    Most biological processes are described as a series of interactions between proteins and other molecules, and interactions are in turn described in terms of atomic structures. To annotate protein functions as sets of interaction states at atomic resolution, and thereby to better understand the relation between protein interactions and biological functions, we conducted exhaustive all-against-all atomic structure comparisons of all known binding sites for ligands including small molecules, proteins and nucleic acids, and identified recurring elementary motifs. By integrating the elementary motifs associated with each subunit, we defined composite motifs that represent context-dependent combinations of elementary motifs. It is demonstrated that function similarity can be better inferred from composite motif similarity compared to the similarity of protein sequences or of individual binding sites. By integrating the composite motifs associated with each protein function, we define meta-composite motifs each of which is regarded as a time-independent diagrammatic representation of a biological process. It is shown that meta-composite motifs provide richer annotations of biological processes than sequence clusters. The present results serve as a basis for bridging atomic structures to higher-order biological phenomena by classification and integration of binding site structures. PMID:22347478

  5. Phospholipid composition and a polybasic motif determine D6 PROTEIN KINASE polar association with the plasma membrane and tropic responses.

    PubMed

    Barbosa, Inês C R; Shikata, Hiromasa; Zourelidou, Melina; Heilmann, Mareike; Heilmann, Ingo; Schwechheimer, Claus

    2016-12-15

    Polar transport of the phytohormone auxin through PIN-FORMED (PIN) auxin efflux carriers is essential for the spatiotemporal control of plant development. The Arabidopsis thaliana serine/threonine kinase D6 PROTEIN KINASE (D6PK) is polarly localized at the plasma membrane of many cells where it colocalizes with PINs and activates PIN-mediated auxin efflux. Here, we show that the association of D6PK with the basal plasma membrane and PINs is dependent on the phospholipid composition of the plasma membrane as well as on the phosphatidylinositol phosphate 5-kinases PIP5K1 and PIP5K2 in epidermis cells of the primary root. We further show that D6PK directly binds polyacidic phospholipids through a polybasic lysine-rich motif in the middle domain of the kinase. The lysine-rich motif is required for proper PIN3 phosphorylation and for auxin transport-dependent tropic growth. Polybasic motifs are also present at a conserved position in other D6PK-related kinases and required for membrane and phospholipid binding. Thus, phospholipid-dependent recruitment to membranes through polybasic motifs might not only be required for D6PK-mediated auxin transport but also other processes regulated by these, as yet, functionally uncharacterized kinases. © 2016. Published by The Company of Biologists Ltd.

  6. Composition of the Hemagglutinin Polybasic Proteolytic Cleavage Motif Mediates Variable Virulence of H7N7 Avian Influenza Viruses

    PubMed Central

    Abdelwhab, E. M.; Veits, Jutta; Ulrich, Reiner; Kasbohm, Elisa; Teifke, Jens P.; Mettenleiter, Thomas C.

    2016-01-01

    Acquisition of a polybasic cleavage site (pCS) in the hemagglutinin (HA) is a prerequisite for the shift of low pathogenic (LP) avian influenza virus (AIV) to the highly pathogenic (HP) form in chickens. Whereas presence of a pCS is required for high pathogenicity, less is known about the effect of the composition of the pCS on the virulence of AIV, particularly H7N7. Here, we investigated the virulence of four avian H7N7 viruses after insertion of different naturally occurring pCS from two HPAIV H7N7 (designated pCSGE and pCSUK) or from H7N1 (pCSIT). In vitro, the different pCS motifs modulated viral replication and HA cleavability independent of the HA background. However, in vivo, the level of virulence conferred by the different pCS varied significantly. Within the respective viral backgrounds, viruses with pCSIT and pCSGE were more virulent than those coding for pCSUK. The latter also showed the most restricted spread in inoculated birds. Besides the pCS, other gene segments modulated virulence of these H7N7 viruses. Together, the specific composition of the pCS significantly influences virulence of H7N7 viruses. Eurasian LPAIV H7N7 may shift to high pathogenicity after acquisition of “specific” pCS motifs and/or other gene segments from HPAIV. PMID:28004772

  7. Skin grafting in severely contracted socket with the use of 'Compo'.

    PubMed

    Betharia, S M; Kanthamani; Prakash, H; Kumar, S

    1990-01-01

    The results of split-thickness autologous skin grafting combined with the use of a dental impression material (Compo), a thermoplastic substance, are presented in a series of 11 patients with acquired, severely contracted anophthalmic sockets. Only fornix fixation sutures and a central tarsorrhaphy were employed to place the graft properly, without the use of retention devices. Artificial eyes were successfully fitted and retained 6 weeks after grafting.

  8. Plant community composition determines the strength of top-down control in a soil food web motif.

    PubMed

    Thakur, Madhav Prakash; Eisenhauer, Nico

    2015-03-16

    Top-down control of prey by predators is magnified in productive ecosystems due to higher sustenance of prey communities. In soil micro-arthropod food webs, plant communities regulate the availability of basal resources like soil microbial biomass. Mixed plant communities are often associated with higher microbial biomass than monocultures. Therefore, top-down control is expected to be higher in soil food webs of mixed plant communities. Moreover, higher predator densities can increase the suppression of prey, which can induce interactive effects between predator densities and plant community composition on prey populations. Here, we tested the effects of predator density (predatory mites) on prey populations (Collembola) in monoculture and mixed plant communities. We hypothesized that top-down control would increase with predator density but only in the mixed plant community. Our results revealed two contrasting patterns of top-down control: stronger top-down control of prey communities in the mixed plant community, but weaker top-down control in plant monocultures in high predator density treatments. As expected, higher microbial community biomass in the mixed plant community sustained sufficiently high prey populations to support high predator density. Our results highlight the roles of plant community composition and predator densities in regulating top-down control of prey in soil food webs.

  9. Composite Conserved Promoter–Terminator Motifs (PeSLs) that Mediate Modular Shuffling in the Diverse T4-Like Myoviruses

    PubMed Central

    Comeau, André M.; Arbiol, Christine; Krisch, Henry M.

    2014-01-01

    The diverse T4-like phages (Tquatrovirinae) infect a wide array of gram-negative bacterial hosts. The genome architecture of these phages is generally well conserved, most of the phylogenetically variable genes being grouped together in a series of hyperplastic regions (HPRs) that are interspersed among large blocks of conserved core genes. Recent evidence from a pair of closely related T4-like phages has suggested that small, composite terminator/promoter sequences (promoter-early stem loops [PeSLs]) were implicated in mediating the high levels of genetic plasticity by indels occurring within the HPRs. Here, we present the genome sequence analysis of two T4-like phages, PST (168 kb, 272 open reading frames [ORFs]) and nt-1 (248 kb, 405 ORFs). These two phages were chosen for comparative sequence analysis because, although they are closely related to phages that have been previously sequenced (T4 and KVP40, respectively), they have different host ranges. In each case, one member of the pair infects a bacterial strain that is a human pathogen, whereas the other phage’s host is a nonpathogen. Despite belonging to phylogenetically distant branches of the T4-likes, these pairs of phage have diverged from each other in part by a mechanism apparently involving PeSL-mediated recombination. This analysis confirms a role of PeSL sequences in the generation of genomic diversity by serving as a point of genetic exchange between otherwise unrelated sequences within the HPRs. Finally, the palette of divergent genes swapped by PeSL-mediated homologous recombination is discussed in the context of the PeSLs’ potentially important role in facilitating phage adaptation to new hosts and environments. PMID:24951563

  10. FastMotif: spectral sequence motif discovery.

    PubMed

    Colombo, Nicoló; Vlassis, Nikos

    2015-08-15

    Sequence discovery tools play a central role in several fields of computational biology. In the framework of Transcription Factor binding studies, most of the existing motif finding algorithms are computationally demanding, and they may not be able to support the increasingly large datasets produced by modern high-throughput sequencing technologies. We present FastMotif, a new motif discovery algorithm that is built on a recent machine learning technique referred to as Method of Moments. Based on spectral decompositions, our method is robust to model misspecifications and is not prone to locally optimal solutions. We obtain an algorithm that is extremely fast and designed for the analysis of big sequencing data. On HT-Selex data, FastMotif extracts motif profiles that match those computed by various state-of-the-art algorithms, but one order of magnitude faster. We provide a theoretical and numerical analysis of the algorithm's robustness and discuss its sensitivity with respect to the free parameters. The Matlab code of FastMotif is available from http://lcsb-portal.uni.lu/bioinformatics. Contact: vlassis@adobe.com. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.

  11. Genome editing with CompoZr custom zinc finger nucleases (ZFNs).

    PubMed

    Hansen, Keith; Coussens, Matthew J; Sago, Jack; Subramanian, Shilpi; Gjoka, Monika; Briner, Dave

    2012-06-14

    Genome editing is a powerful technique that can be used to elucidate gene function and the genetic basis of disease. Traditional gene editing methods such as chemical-based mutagenesis or random integration of DNA sequences confer indiscriminate genetic changes in an overall inefficient manner and require incorporation of undesirable synthetic sequences or use of aberrant culture conditions, potentially confusing biological study. By contrast, transient ZFN expression in a cell can facilitate precise, heritable gene editing in a highly efficient manner without the need for administration of chemicals or integration of synthetic transgenes. Zinc finger nucleases (ZFNs) are enzymes which bind and cut distinct sequences of double-stranded DNA (dsDNA). A functional CompoZr ZFN unit consists of two individual monomeric proteins that bind a DNA "half-site" of approximately 15-18 nucleotides (see Figure 1). When two ZFN monomers "home" to their adjacent target sites the DNA-cleavage domains dimerize and create a double-strand break (DSB) in the DNA. Introduction of ZFN-mediated DSBs in the genome lays a foundation for highly efficient genome editing. Imperfect repair of DSBs in a cell via the non-homologous end-joining (NHEJ) DNA repair pathway can result in small insertions and deletions (indels). Creation of indels within the gene coding sequence of a cell can result in frameshift and subsequent functional knockout of a gene locus at high efficiency. While this protocol describes the use of ZFNs to create a gene knockout, integration of transgenes may also be conducted via homology-directed repair at the ZFN cut site. The CompoZr Custom ZFN Service represents a systematic, comprehensive, and well-characterized approach to targeted gene editing for the scientific community with ZFN technology. Sigma scientists work closely with investigators to 1) perform due diligence analysis including analysis of relevant gene structure, biology, and model system pursuant to the
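
    The frameshift reasoning in the abstract above (an NHEJ indel inside a coding sequence knocks out the gene when it shifts the reading frame) comes down to whether the indel length is a multiple of three. A minimal sketch follows; the function names and the toy coding sequence are hypothetical and are not part of the CompoZr protocol.

```python
def is_frameshift(indel_length):
    """An indel shifts the reading frame unless its length is a multiple of 3."""
    return indel_length % 3 != 0


def apply_deletion(cds, start, length):
    """Return the coding sequence with `length` bases removed at index `start`."""
    return cds[:start] + cds[start + length:]


def codons(cds):
    """Split a coding sequence into complete codons, dropping any trailing bases."""
    usable = len(cds) - len(cds) % 3
    return [cds[i:i + 3] for i in range(0, usable, 3)]


# Toy coding sequence invented for illustration: ATG GCC AAA GGG TTT TGA
cds = "ATGGCCAAAGGGTTTTGA"

one_base = apply_deletion(cds, 4, 1)    # 1-bp deletion: downstream codons change
three_base = apply_deletion(cds, 3, 3)  # 3-bp deletion: one codon lost, frame kept

print(is_frameshift(1), codons(one_base))
print(is_frameshift(3), codons(three_base))
```

    In the one-base case every codon after the deletion is altered and the original stop codon is no longer read in frame, which is the knockout outcome the abstract refers to; the three-base deletion only removes a single codon.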

  12. The Motif of Meeting in Digital Education

    ERIC Educational Resources Information Center

    Sheail, Philippa

    2015-01-01

    This article draws on theoretical work which considers the composition of meetings, in order to think about the form of the meeting in digital environments for higher education. To explore the motif of meeting, I undertake a "compositional interpretation" (Rose, 2012) of the default interface offered by "Collaborate", an…

  14. Unitary circular code motifs in genomes of eukaryotes.

    PubMed

    El Soufi, Karim; Michel, Christian J

    ) are identified in the X motifs of low composition (cardinality less than 10) in the genomes of eukaryotes. Furthermore, identical trinucleotide pairs of the circular code X are preferentially used in the gene sequences of eukaryotes. These two results suggest that the unitary circular codes of trinucleotides may have been involved in the formation of the trinucleotide circular code X. Indeed, repeated trinucleotides in the X motifs in the genomes of eukaryotes may represent an intermediary evolution from repeated trinucleotides of cardinality 1 (T(+) motifs) in the genomes of eukaryotes up to the X motifs of cardinality 20 in the gene sequences of eukaryotes. Copyright © 2017 Elsevier B.V. All rights reserved.

  15. Protospacer recognition motifs

    PubMed Central

    Shah, Shiraz A.; Erdmann, Susanne; Mojica, Francisco J.M.; Garrett, Roger A.

    2013-01-01

    Protospacer adjacent motifs (PAMs) were originally characterized for CRISPR-Cas systems that were classified on the basis of their CRISPR repeat sequences. A few short 2–5 bp sequences were identified adjacent to one end of the protospacers. Experimental and bioinformatical results linked the motif to the excision of protospacers and their insertion into CRISPR loci. Subsequently, evidence accumulated from different virus- and plasmid-targeting assays, suggesting that these motifs were also recognized during DNA interference, at least for the recently classified type I and type II CRISPR-based systems. The two processes, spacer acquisition and protospacer interference, employ different molecular mechanisms, and there is increasing evidence to suggest that the sequence motifs that are recognized, while overlapping, are unlikely to be identical. In this article, we consider the properties of PAM sequences and summarize the evidence for their dual functional roles. It is proposed to use the terms protospacer associated motif (PAM) for the conserved DNA sequence and to employ spacer acquisition motif (SAM) and target interference motif (TIM), respectively, for acquisition and interference recognition sites. PMID:23403393

  16. Motif enrichment tool.

    PubMed

    Blatti, Charles; Sinha, Saurabh

    2014-07-01

    The Motif Enrichment Tool (MET) provides an online interface that enables users to find major transcriptional regulators of their gene sets of interest. MET searches the appropriate regulatory region around each gene and identifies which transcription factor DNA-binding specificities (motifs) are statistically overrepresented. Motif enrichment analysis is currently available for many metazoan species including human, mouse, fruit fly, planaria and flowering plants. MET also leverages high-throughput experimental data such as ChIP-seq and DNase-seq from ENCODE and ModENCODE to identify the regulatory targets of a transcription factor with greater precision. The results from MET are produced in real time and are linked to a genome browser for easy follow-up analysis. Use of the web tool is free and open to all, and there is no login requirement. ADDRESS: http://veda.cs.uiuc.edu/MET/.
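
    The abstract above does not say which statistical test MET uses to call a motif "overrepresented". One common choice for this kind of gene-set question is a one-sided hypergeometric test, sketched here purely as an assumption; the function name and the example counts are invented.

```python
from math import comb


def hypergeom_enrichment_p(population, population_hits, sample, sample_hits):
    """P(X >= sample_hits) when `sample` genes are drawn without replacement.

    population      -- number of genes in the background set
    population_hits -- background genes whose regulatory region contains the motif
    sample          -- number of genes in the user's gene set
    sample_hits     -- genes in the user's set that contain the motif
    """
    total = comb(population, sample)
    upper = min(sample, population_hits)
    tail = sum(
        comb(population_hits, k) * comb(population - population_hits, sample - k)
        for k in range(sample_hits, upper + 1)
    )
    return tail / total


# Invented counts: 5 of 20 background genes carry the motif, 3 of 4 set genes do.
p = hypergeom_enrichment_p(population=20, population_hits=5, sample=4, sample_hits=3)
print(round(p, 4))  # prints 0.032
```

    A small p-value here means the motif occurs in the gene set more often than random sampling from the background would suggest. Testing many motifs at once would additionally call for a multiple-testing correction.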

  17. Seeing the B-A-C-H motif

    NASA Astrophysics Data System (ADS)

    Catravas, Palmyra

    2005-09-01

    Musical compositions can be thought of as complex, multidimensional data sets. Compositions based on the B-A-C-H motif (a four-note motif of the pitches of the last name of Johann Sebastian Bach) span several centuries of evolving compositional styles and provide an intriguing set for analysis since they contain a common feature, the motif, buried in dissimilar contexts. We will present analyses which highlight the content of this unusual set of pieces, with emphasis on visual display of information.
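
    Treating a composition as a data set, as the abstract above suggests, makes locating the motif a pattern-matching problem. In German note naming, B is B-flat and H is B-natural, so B-A-C-H moves by -1, +3 and -1 semitones. The sketch below is not the author's analysis: it assumes a melody given as MIDI note numbers, and the function name and toy melody are invented.

```python
# Semitone steps of B-flat -> A -> C -> B-natural (German: B, A, C, H).
BACH_INTERVALS = (-1, 3, -1)


def find_bach(midi_pitches):
    """Return start indices where four consecutive notes follow the B-A-C-H
    interval shape, at any transposition."""
    hits = []
    for i in range(len(midi_pitches) - 3):
        window = midi_pitches[i:i + 4]
        steps = tuple(b - a for a, b in zip(window, window[1:]))
        if steps == BACH_INTERVALS:
            hits.append(i)
    return hits


# Invented melody: the motif at pitch (70, 69, 72, 71) and transposed
# down a fourth (65, 64, 67, 66).
melody = [60, 62, 70, 69, 72, 71, 67, 65, 64, 67, 66]
print(find_bach(melody))  # prints [2, 7]
```

    Matching on intervals rather than absolute pitches finds transposed statements of the motif. It would miss occurrences with octave displacement or intervening notes, which a fuller analysis would need to handle.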

  18. Aztec, Incan and Mayan Motifs...Lead to Distinctive Designs.

    ERIC Educational Resources Information Center

    Shields, Joanne

    2001-01-01

    Describes an art project for seventh-grade students in which they choose motifs based on Incan, Aztec, and Mayan Indian materials to incorporate into two-dimensional designs. Explains that the activity objective is to create a unified, balanced and pleasing composition using a minimum of three motifs. (CMK)

  19. [Personal motif in art].

    PubMed

    Gerevich, József

    2015-01-01

    One of the basic questions of art psychology is whether a personal motif is to be found behind works of art and, if so, how openly or indirectly it appears in the work itself. Analysis of examples and documents from the fine arts and literature allows us to conclude that the personal motif, which the viewer can identify through symbols, at times easily and at others with more difficulty, adds emotional value to the artistic product. The personal motif may be found in traumatic experiences, in communication with the model or with other emotionally important persons (mourning, disappointment, revenge, hatred, rivalry, revolt, etc.), in self-searching, or in self-analysis. The emotions are expressed in artistic activity either directly or indirectly. The intention nourished by the artist's identity (Kunstwollen) may stand in the way of spontaneous self-expression, channelling it into hidden paths. Under the influence of certain circumstances, the artist may arouse in the viewer, consciously or unconsciously, an illusory, misleading image of himself. An examination of the personal motif is one of the important research areas of art therapy.

  20. Ballast: A Ball-based Algorithm for Structural Motifs

    PubMed Central

    He, Lu; Vandin, Fabio; Pandurangan, Gopal

    2013-01-01

    Structural motifs encapsulate local sequence-structure-function relationships characteristic of related proteins, enabling the prediction of functional characteristics of new proteins, providing molecular-level insights into how those functions are performed, and supporting the development of variants specifically maintaining or perturbing function in concert with other properties. Numerous computational methods have been developed to search through databases of structures for instances of specified motifs. However, it remains an open problem how best to leverage the local geometric and chemical constraints underlying structural motifs in order to develop motif-finding algorithms that are both theoretically and practically efficient. We present a simple, general, efficient approach, called Ballast (ball-based algorithm for structural motifs), to match given structural motifs to given structures. Ballast combines the best properties of previously developed methods, exploiting the composition and local geometry of a structural motif and its possible instances in order to effectively filter candidate matches. We show that on a wide range of motif-matching problems, Ballast efficiently and effectively finds good matches, and we provide theoretical insights into why it works well. By supporting generic measures of compositional and geometric similarity, Ballast provides a powerful substrate for the development of motif-matching algorithms. PMID:23383999

  1. Motifs from the deep

    PubMed Central

    Hwang, Tony W; Codrea, Vlad; Ellington, Andrew D

    2009-01-01

    Because of the increasing recognition of the importance of non-coding RNAs in gene regulation, there is considerable interest in identifying RNA motifs in genomic data. In a recent report in BMC Genomics, Breaker and colleagues describe a new algorithm for identifying functional noncoding RNAs in metagenomic sequences of marine organisms, a strategy that may be particularly effective for discovering new and unique riboswitches. PMID:19735583

  2. Germination et texture du composé supraconducteur Nb3Sn [Nucleation and texture of the superconducting compound Nb3Sn]

    NASA Astrophysics Data System (ADS)

    Taillard, R.; Ustinov, A. I.

    2002-07-01

    The composite design and/or manufacturing process of Nb3Sn multifilamentary strands is continuously changed to improve superconducting behaviour. Such improvement depends on both the amount and the microstructure of the superconducting phase. Studying the parameters and mechanisms of the phase transformations is therefore of the highest importance. The nucleation and growth stages of the Nb3Sn grains in the internal tin source process are investigated mainly by thin-foil transmission electron microscopy and by X-ray determination of crystallographic orientations. The results obtained with the various techniques are shown to agree with and complement each other. An example establishes their usefulness in explaining the evolution of the critical current density. The effect of grain misorientation on the critical current density is also considered.

  3. Cell-specific expression of the macrophage scavenger receptor gene is dependent on PU.1 and a composite AP-1/ets motif.

    PubMed Central

    Moulton, K S; Semple, K; Wu, H; Glass, C K

    1994-01-01

    The type I and II scavenger receptors (SRs) are highly restricted to cells of monocyte origin and become maximally expressed during the process of monocyte-to-macrophage differentiation. In this report, we present evidence that SR genomic sequences from -245 to +46 bp relative to the major transcriptional start site were sufficient to confer preferential expression of a reporter gene to cells of monocyte and macrophage origin. This profile of expression resulted from the combinatorial actions of multiple positive and negative regulatory elements. Positive transcriptional control was primarily determined by two elements, located 181 and 46 bp upstream of the major transcriptional start site. Transcriptional control via the -181 element was mediated by PU.1/Spi-1, a macrophage and B-cell-specific transcription factor that is a member of the ets domain gene family. Intriguingly, the -181 element represented a relatively low-affinity binding site for Spi-B, a closely related member of the ets domain family that has been shown to bind with relatively high affinity to other PU.1/Spi-1 binding sites. These observations support the idea that PU.1/Spi-1 and Spi-B regulate overlapping but nonidentical sets of genes. The -46 element represented a composite binding site for a distinct set of ets domain proteins that were preferentially expressed in monocyte and macrophage cell lines and that formed ternary complexes with members of the AP-1 gene family. In concert, these observations suggest a model for how interactions between cell-specific and more generally expressed transcription factors function to dictate the appropriate temporal and cell-specific patterns of SR expression during the process of macrophage differentiation. PMID:8007948

  4. Convergent evolution and mimicry of protein linear motifs in host-pathogen interactions.

    PubMed

    Chemes, Lucía Beatriz; de Prat-Gay, Gonzalo; Sánchez, Ignacio Enrique

    2015-06-01

    Pathogen linear motif mimics are highly evolvable elements that facilitate rewiring of host protein interaction networks. Host linear motifs and pathogen mimics differ in sequence, leading to thermodynamic and structural differences in the resulting protein-protein interactions. Moreover, the functional output of a mimic depends on the motif and domain repertoire of the pathogen protein. Regulatory evolution mediated by linear motifs can be understood by measuring evolutionary rates, quantifying positive and negative selection and performing phylogenetic reconstructions of linear motif natural history. Convergent evolution of linear motif mimics is widespread among unrelated proteins from viral, prokaryotic and eukaryotic pathogens and can also take place within individual protein phylogenies. Statistics, biochemistry and laboratory models of infection link pathogen linear motifs to phenotypic traits such as tropism, virulence and oncogenicity. In vitro evolution experiments and analysis of natural sequences suggest that changes in linear motif composition underlie pathogen adaptation to a changing environment.

  5. Apport des neutrons à l'analyse structurale des composés partiellement désordonnés [Contribution of neutrons to the structural analysis of partially disordered compounds]

    NASA Astrophysics Data System (ADS)

    Cousson, A.

    2003-02-01

    Crystallography is an extremely powerful tool that could be used by many scientists whose research subjects are in fact quite remote from it. The evolution of techniques in recent years has relegated, for example, X-ray crystallography of small molecules to a minor, service role. Some even seem to feel that all the necessary knowledge is contained in the many software packages, which are capable by themselves of carrying a structural analysis through to a single correct result. It is desirable that everyone be able to carry out the structural study of the compound that interests them, and it is of course necessary to understand what one is doing: the quality of the results and of their analysis depends on it. The purpose of this presentation is to show the specific contribution of single-crystal neutron diffraction to the study of disorder, in particular of hydrogen atoms, and its consequences for understanding physical properties, drawing on recent developments and examples.

  6. MISCORE: a new scoring function for characterizing DNA regulatory motifs in promoter sequences

    PubMed Central

    2012-01-01

    Background Computational approaches for finding DNA regulatory motifs in promoter sequences are useful to biologists in terms of reducing the experimental costs and speeding up the discovery process of de novo binding sites. It is important for rule-based or clustering-based motif searching schemes to effectively and efficiently evaluate the similarity between a k-mer (a k-length subsequence) and a motif model, without assuming the independence of nucleotides in motif models or without employing computationally expensive Markov chain models to estimate the background probabilities of k-mers. Also, it is interesting and beneficial to use a priori knowledge in developing advanced searching tools. Results This paper presents a new scoring function, termed as MISCORE, for functional motif characterization and evaluation. Our MISCORE is free from: (i) any assumption on model dependency; and (ii) the use of Markov chain model for background modeling. It integrates the compositional complexity of motif instances into the function. Performance evaluations with comparison to the well-known Maximum a Posteriori (MAP) score and Information Content (IC) have shown that MISCORE has promising capabilities to separate and recognize functional DNA motifs and its instances from non-functional ones. Conclusions MISCORE is a fast computational tool for candidate motif characterization, evaluation and selection. It enables to embed priori known motif models for computing motif-to-motif similarity, which is more advantageous than IC and MAP score. In addition to these merits mentioned above, MISCORE can automatically filter out some repetitive k-mers from a motif model due to the introduction of the compositional complexity in the function. Consequently, the merits of our proposed MISCORE in terms of both motif signal modeling power and computational efficiency will make it more applicable in the development of computational motif discovery tools. PMID:23282090
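    The abstract above compares MISCORE against Information Content (IC) but does not give the MISCORE formula. As a point of reference only, the sketch below computes the standard per-column IC of a set of aligned motif instances under a uniform background. The function name, the uniform background and the optional pseudocount are assumptions made for illustration; this is not MISCORE itself.

```python
import math
from collections import Counter


def information_content(instances, alphabet="ACGT", pseudocount=0.0):
    """Per-column information content (bits) of aligned motif instances.

    Assumes a uniform background, so each column scores
    log2(len(alphabet)) minus its Shannon entropy.
    """
    length = len(instances[0])
    if any(len(s) != length for s in instances):
        raise ValueError("all motif instances must have the same length")
    max_bits = math.log2(len(alphabet))
    total = len(instances) + pseudocount * len(alphabet)
    columns = []
    for i in range(length):
        counts = Counter(s[i] for s in instances)
        entropy = 0.0
        for symbol in alphabet:
            p = (counts.get(symbol, 0) + pseudocount) / total
            if p > 0:
                entropy -= p * math.log2(p)
        columns.append(max_bits - entropy)
    return columns


# Three fully conserved columns followed by one uniformly random column.
ic = information_content(["ACGT", "ACGA", "ACGC", "ACGG"])
```

    A fully conserved DNA column scores 2 bits and a uniformly random column scores 0 bits, which is why IC alone cannot distinguish a low-complexity repeat from a functional site with the same conservation.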

  7. Motif Yggdrasil: sampling sequence motifs from a tree mixture model.

    PubMed

    Andersson, Samuel A; Lagergren, Jens

    2007-06-01

    In phylogenetic foot-printing, putative regulatory elements are found in upstream regions of orthologous genes by searching for common motifs. Motifs in different upstream sequences are subject to mutations along the edges of the corresponding phylogenetic tree, consequently taking advantage of the tree in the motif search is an appealing idea. We describe the Motif Yggdrasil sampler; the first Gibbs sampler based on a general tree that uses unaligned sequences. Previous tree-based Gibbs samplers have assumed a star-shaped tree or partially aligned upstream regions. We give a probabilistic model (MY model) describing upstream sequences with regulatory elements and build a Gibbs sampler with respect to this model. The model allows toggling, i.e., the restriction of a position to a subset of nucleotides, but does not require aligned sequences nor edge lengths, which may be difficult to come by. We apply the collapsing technique to eliminate the need to sample nuisance parameters, and give a derivation of the predictive update formula. We show that the MY model improves the modeling of difficult motif instances and that the use of the tree achieves a substantial increase in nucleotide level correlation coefficient both for synthetic data and 37 bacterial lexA genes. We investigate the sensitivity to errors in the tree and show that using random trees MY sampler still has a performance similar to the original version.

  8. Redox active motifs in selenoproteins.

    PubMed

    Li, Fei; Lutz, Patricia B; Pepelyayeva, Yuliya; Arnér, Elias S J; Bayse, Craig A; Rozovsky, Sharon

    2014-05-13

    Selenoproteins use the rare amino acid selenocysteine (Sec) to act as the first line of defense against oxidants, which are linked to aging, cancer, and neurodegenerative diseases. Many selenoproteins are oxidoreductases in which the reactive Sec is connected to a neighboring Cys and able to form a ring. These Sec-containing redox motifs govern much of the reactivity of selenoproteins. To study their fundamental properties, we have used (77)Se NMR spectroscopy in concert with theoretical calculations to determine the conformational preferences and mobility of representative motifs. This use of (77)Se as a probe enables the direct recording of the properties of Sec as its environment is systematically changed. We find that all motifs have several ring conformations in their oxidized state. These ring structures are most likely stabilized by weak, nonbonding interactions between the selenium and the amide carbon. To examine how the presence of selenium and ring geometric strain governs the motifs' reactivity, we measured the redox potentials of Sec-containing motifs and their corresponding Cys-only variants. The comparisons reveal that for C-terminal motifs the redox potentials increased between 20-25 mV when the selenenylsulfide bond was changed to a disulfide bond. Changes of similar magnitude arose when we varied ring size or the motifs' flanking residues. This suggests that the presence of Sec is not tied to unusually low redox potentials. The unique roles of selenoproteins in human health and their chemical reactivities may therefore not necessarily be explained by lower redox potentials, as has often been claimed.

  9. [Prediction of Promoter Motifs in Virophages].

    PubMed

    Gong, Chaowen; Zhou, Xuewen; Pan, Yingjie; Wang, Yongjie

    2015-07-01

    Virophages have crucial roles in ecosystems and are the transport vectors of genetic materials. To shed light on regulation and control mechanisms in virophage-host systems as well as evolution between virophages and their hosts, the promoter motifs of virophages were predicted on the upstream regions of start codons using an analytical tool for prediction of promoter motifs: Multiple EM for Motif Elicitation. Seventeen potential promoter motifs were identified based on the E-value, location, number and length of promoters in genomes. Sputnik and zamilon motif 2 with AT-rich regions were distributed widely on genomes, suggesting that these motifs may be associated with regulation of the expression of various genes. Motifs containing the TCTA box were predicted to be late promoter motifs in mavirus; motifs containing the ATCT box were the potential late promoter motifs in the Ace Lake mavirus. AT-rich regions were identified on motif 2 in the Organic Lake virophage, motif 3 in Yellowstone Lake virophage (YSLV) 1 and 2, motif 1 in YSLV3, and motifs 1 and 2 in YSLV4, respectively. AT-rich regions were distributed widely on the genomes of virophages. All of these motifs may be promoter motifs of virophages. Our results provide insights into further exploration of temporal expression of genes in virophages as well as associations between virophages and giant viruses.

  10. Knowledge discovery of multilevel protein motifs

    SciTech Connect

    Conklin, D.; Glasgow, J.; Fortier, S.

    1994-12-31

    A new category of protein motif is introduced. This type of motif captures, in addition to global structure, the nested structure of its component parts. A dataset of four proteins is represented using this scheme. A structured machine discovery procedure is used to discover recurrent amino acid motifs and this knowledge is utilized for the expression of subsequent protein motif discoveries. Examples of discovered multilevel motifs are presented.

  11. Unravelling daily human mobility motifs

    PubMed Central

    Schneider, Christian M.; Belik, Vitaly; Couronné, Thomas; Smoreda, Zbigniew; González, Marta C.

    2013-01-01

    Human mobility is differentiated by time scales. While the mechanism for long time scales has been studied, the underlying mechanism on the daily scale is still unrevealed. Here, we uncover the mechanism responsible for the daily mobility patterns by analysing the temporal and spatial trajectories of thousands of persons as individual networks. Using the concept of motifs from network theory, we find only 17 unique networks are present in daily mobility and they follow simple rules. These networks, called here motifs, are sufficient to capture up to 90 per cent of the population in surveys and mobile phone datasets for different countries. Each individual exhibits a characteristic motif, which seems to be stable over several months. Consequently, daily human mobility can be reproduced by an analytically tractable framework for Markov chains by modelling periods of high-frequency trips followed by periods of lower activity as the key ingredient. PMID:23658117

  12. Sequential visibility-graph motifs

    NASA Astrophysics Data System (ADS)

    Iacovacci, Jacopo; Lacasa, Lucas

    2016-04-01

    Visibility algorithms transform time series into graphs and encode dynamical information in their topology, paving the way for graph-theoretical time series analysis as well as building a bridge between nonlinear dynamics and network science. In this work we introduce and study the concept of sequential visibility-graph motifs, smaller substructures of n consecutive nodes that appear with characteristic frequencies. We develop a theory to compute in an exact way the motif profiles associated with general classes of deterministic and stochastic dynamics. We find that this simple property is indeed a highly informative and computationally efficient feature capable of distinguishing among different dynamics and robust against noise contamination. We finally confirm that it can be used in practice to perform unsupervised learning, by extracting motif profiles from experimental heart-rate series and being able, accordingly, to disentangle meditative from other relaxation states. Applications of this general theory include the automatic classification and description of physical, biological, and financial time series.
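    The idea of a sequential motif profile can be sketched as follows. The abstract does not say which visibility criterion is used, so this example assumes the horizontal visibility rule (two points are linked when every value between them is lower than both) and a window of n = 4 consecutive nodes. The function names are hypothetical, and the exact motif labelling used by the authors may differ.

```python
from collections import Counter


def horizontally_visible(series, a, b):
    """True if points a < b are linked: all values strictly between them are lower than both."""
    return all(series[c] < min(series[a], series[b]) for c in range(a + 1, b))


def motif_profile(series, n=4):
    """Relative frequency of each edge pattern over windows of n consecutive nodes.

    Each pattern is a frozenset of (i, j) pairs, with i < j given as
    offsets inside the window.
    """
    windows = len(series) - n + 1
    if windows < 1:
        raise ValueError("series is shorter than the motif size")
    counts = Counter()
    for start in range(windows):
        edges = frozenset(
            (i, j)
            for i in range(n)
            for j in range(i + 1, n)
            if horizontally_visible(series, start + i, start + j)
        )
        counts[edges] += 1
    return {pattern: count / windows for pattern, count in counts.items()}


# A monotonic series only ever links neighbouring points.
profile = motif_profile([1, 2, 3, 4, 5])
```

    Because visibility between two points depends only on the values between them, each window can be classified locally, which is what makes the profile cheap to extract from a long series.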

  13. The Overall Response of Composite Materials Undergoing Large Elastic Deformations

    DTIC Science & Technology

    1990-10-30

    procedure in general to estimate the energy of the composite W (i). For the linear case, it has been shown (MILTON, 1985; AVELLANEDA, 1987) that the DSC...No. 89-0288. 26 P. . CASTAÑEDA REFERENCES AVELLANEDA, M. 1987 Commun. Pure Appl. Math. 40, 527. BOUCHER, S. 1974 J. Compos. Mater. 8, 82. BUDIANSKY, B

  14. Efficient Generation of Hyperbolic Patterns from a Single Asymmetric Motif

    NASA Astrophysics Data System (ADS)

    Chen, Ning; Chung, K. W.

    2016-11-01

    We present an efficient method of constructing hyperbolic patterns based on an asymmetric motif designed in the central hyperbolic polygon. Since there is no rotational symmetry in each hyperbolic polygon, a subset of the hyperbolic group elements has to be selected carefully so that the central hyperbolic polygon is transformed to the other polygons once and only once. An efficient labeling procedure is proved by considering the group presentation and can be easily implemented using the computer. Illustrative hyperbolic patterns are constructed from given asymmetric motifs for the symmetry group [p, q]+ which consists of all compositions of an even number of reflections.

  15. Neural Circuits: Male Mating Motifs.

    PubMed

    Benton, Richard

    2015-09-02

    Characterizing microcircuit motifs in intact nervous systems is essential to relate neural computations to behavior. In this issue of Neuron, Clowney et al. (2015) identify recurring, parallel feedforward excitatory and inhibitory pathways in male Drosophila's courtship circuitry, which might explain decisive mate choice.

  16. Parametric bootstrapping for biological sequence motifs.

    PubMed

    O'Neill, Patrick K; Erill, Ivan

    2016-10-06

    Biological sequence motifs drive the specific interactions of proteins and nucleic acids. Accordingly, the effective computational discovery and analysis of such motifs is a central theme in bioinformatics. Many practical questions about the properties of motifs can be recast as random sampling problems. In this light, the task is to determine for a given motif whether a certain feature of interest is statistically unusual among relevantly similar alternatives. Despite the generality of this framework, its use has been frustrated by the difficulties of defining an appropriate reference class of motifs for comparison and of sampling from it effectively. We define two distributions over the space of all motifs of given dimension. The first is the maximum entropy distribution subject to mean information content, and the second is the truncated uniform distribution over all motifs having information content within a given interval. We derive exact sampling algorithms for each. As a proof of concept, we employ these sampling methods to analyze a broad collection of prokaryotic and eukaryotic transcription factor binding site motifs. In addition to positional information content, we consider the informational Gini coefficient of the motif, a measure of the degree to which information is evenly distributed throughout a motif's positions. We find that both prokaryotic and eukaryotic motifs tend to exhibit higher informational Gini coefficients (IGC) than would be expected by chance under either reference distribution. As a second application, we apply maximum entropy sampling to the motif p-value problem and use it to give elementary derivations of two new estimators. Despite the historical centrality of biological sequence motif analysis, this study constitutes to our knowledge the first use of principled null hypotheses for sequence motifs given information content. Through their use, we are able to characterize for the first time differences in global motif statistics
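    The informational Gini coefficient mentioned above can be illustrated with the textbook Gini formula applied to per-position information content values. The abstract does not state the exact normalisation the authors use, so the mean-absolute-difference form below is an assumption, and the function name is hypothetical.

```python
def informational_gini(column_ic):
    """Gini coefficient of per-position information content values.

    Uses the standard definition: the sum of all pairwise absolute
    differences divided by 2 * n * sum(values). Returns 0.0 when the
    information is spread evenly and approaches 1 as it concentrates
    in a single position.
    """
    n = len(column_ic)
    if n == 0:
        raise ValueError("motif has no positions")
    total = sum(column_ic)
    if total == 0:
        return 0.0  # no information anywhere, treat as perfectly even
    pairwise = sum(abs(a - b) for a in column_ic for b in column_ic)
    return pairwise / (2 * n * total)


even = informational_gini([1.0, 1.0, 1.0, 1.0])     # information spread evenly
skewed = informational_gini([0.0, 0.0, 0.0, 4.0])   # same total, one position
```

    Both example motifs carry 4 bits in total, yet the second concentrates them in one position and so receives a much higher coefficient. This is the kind of feature that is then compared against motifs sampled from a reference distribution with matching information content.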

  17. Observability of Neuronal Network Motifs

    PubMed Central

    Whalen, Andrew J.; Brennan, Sean N.; Sauer, Timothy D.; Schiff, Steven J.

    2014-01-01

    We quantify observability in small (3 node) neuronal networks as a function of 1) the connection topology and symmetry, 2) the measured nodes, and 3) the nodal dynamics (linear and nonlinear). We find that typical observability metrics for 3 neuron motifs range over several orders of magnitude, depending upon topology, and for motifs containing symmetry the network observability decreases when observing from particularly confounded nodes. Nonlinearities in the nodal equations generally decrease the average network observability and full network information becomes available only in limited regions of the system phase space. Our findings demonstrate that such networks are partially observable, and suggest their potential efficacy in reconstructing network dynamics from limited measurement data. How well such strategies can be used to reconstruct and control network dynamics in experimental settings is a subject for future experimental work. PMID:25909092
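    For the linear case only, the dependence of observability on topology and on the measured node can be sketched with the classical Kalman rank test. This is a simplification: the abstract refers to graded observability metrics and to nonlinear nodal dynamics, neither of which is reproduced here. The convention that A[i][j] is the influence of node j on node i, and all function names, are assumptions made for illustration.

```python
def row_times_matrix(row, matrix):
    """Multiply a row vector by a square matrix."""
    n = len(matrix)
    return [sum(row[k] * matrix[k][j] for k in range(n)) for j in range(n)]


def observability_matrix(A, c):
    """Stack c, cA, cA^2, ... for a system x' = A x measured through y = c x."""
    rows = [list(c)]
    for _ in range(len(A) - 1):
        rows.append(row_times_matrix(rows[-1], A))
    return rows


def matrix_rank(matrix, tol=1e-9):
    """Rank by Gaussian elimination with partial pivoting."""
    m = [[float(v) for v in row] for row in matrix]
    n_rows, n_cols = len(m), len(m[0])
    rank = 0
    for col in range(n_cols):
        if rank == n_rows:
            break
        pivot = max(range(rank, n_rows), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < tol:
            continue  # no usable pivot in this column
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, n_rows):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


# Chain motif: node 1 drives node 2, node 2 drives node 3.
chain = [[0, 0, 0],
         [1, 0, 0],
         [0, 1, 0]]
rank_from_last = matrix_rank(observability_matrix(chain, [0, 0, 1]))
rank_from_first = matrix_rank(observability_matrix(chain, [1, 0, 0]))
```

    Measuring the last node of the chain gives full rank, so the whole state can be reconstructed in principle, while measuring the first node gives rank 1 because nothing downstream feeds back into it.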

  18. The Thiamin Pyrophosphate-Motif

    NASA Technical Reports Server (NTRS)

    Dominiak, Paulina M.; Ciszak, Ewa M.

    2003-01-01

    Using databases the authors have identified a common thiamin pyrophosphate (TPP)-motif in the family of functionally diverse TPP-dependent enzymes. This common motif consists of multimeric organization of subunits, two catalytic centers, common amino acid sequence, and specific contacts to provide a flip-flop, or alternate site, mechanism of action. Each catalytic center [PP:PYR] is formed at the interface of the PP-domain binding the magnesium ion, pyrophosphate and aminopyrimidine ring of TPP, and the PYR-domain binding the aminopyrimidine ring of that cofactor. A pair of these catalytic centers constitutes the catalytic core [PP:PYR](sub 2) within these enzymes. Analysis of the structural elements of this catalytic core reveals novel definition of the common amino acid sequences, which are GXPhiX(sub 4)(G)PhiXXGQ and GDGX(sub 25-30) within the PP-domain, and the EX(sub 4)(G)PhiXXGPhi within the PYR-domain, where Phi corresponds to a hydrophobic amino acid. This TPP-motif provides a novel tool for annotation of TPP-dependent enzymes useful in advancing functional proteomics.

  20. The Thiamin Pyrophosphate-Motif

    NASA Technical Reports Server (NTRS)

    Dominiak, P.; Ciszak, E.

    2003-01-01

    Using databases the authors have identified a common thiamin pyrophosphate (TPP)-motif in the family of functionally diverse TPP-dependent enzymes. This common motif consists of multimeric organization of subunits and two catalytic centers. Each catalytic center (PP:PYR) is formed at the interface of the PP-domain binding the magnesium ion, pyrophosphate and aminopyrimidine ring of TPP, and the PYR-domain binding the aminopyrimidine ring of that cofactor. A pair of these catalytic centers constitutes the catalytic core (PP:PYR)(sub 2) within these enzymes. Analysis of the structural elements of this catalytic core reveals novel definition of the common amino acid sequences, which are GXPhiX(sub 4)(G)PhiXXGQ and GDGX(sub 25-30)NN in the PP-domain, and the EX(sub 4)(G)PhiXXGPhi in the PYR-domain, where Phi corresponds to a hydrophobic amino acid. This TPP-motif provides a novel tool for annotation of TPP-dependent enzymes useful in advancing functional proteomics.

  2. [Application of molecular topology to the prediction of the liquid viscosity of organic compounds]

    NASA Astrophysics Data System (ADS)

    García-Domenech, R.; Villanueva, A.; Gálvez, J.; Gozalbes, R.

    1999-07-01

    Molecular topology has been applied to search for a mathematical model able to predict liquid viscosity values for an extensive group of organic compounds containing C, H, O, N, S and halogen atoms. The topological descriptors used in this quantitative structure-property relationship (QSPR) study are the Kier and Hall connectivity indices, up to fourth order, and the atomic electrotopological indices. The quality of the multilinear regression equation finally selected (also called the "connectivity function") was evaluated by a cross-validation study. The viscosity of all compounds except 1,1,2-trichlorotrifluoroethane is correctly predicted by the proposed model.

  3. Comprehensive discovery of DNA motifs in 349 human cells and tissues reveals new features of motifs.

    PubMed

    Zheng, Yiyu; Li, Xiaoman; Hu, Haiyan

    2015-01-01

    Comprehensive motif discovery under experimental conditions is critical for the global understanding of gene regulation. To generate a nearly complete list of human DNA motifs under given conditions, we employed a novel approach to de novo discover significant co-occurring DNA motifs in 349 human DNase I hypersensitive site datasets. We predicted 845 to 1325 motifs in each dataset, for a total of 2684 non-redundant motifs. These 2684 motifs contained 54.02 to 75.95% of the known motifs in seven large collections including TRANSFAC. In each dataset, we also discovered 43 663 to 2 013 288 motif modules, groups of motifs with their binding sites co-occurring in a significant number of short DNA regions. Compared with known interacting transcription factors in eight resources, the predicted motif modules on average included 84.23% of known interacting motifs. We further showed new features of the predicted motifs, such as motifs enriched in proximal regions rarely overlapped with motifs enriched in distal regions, motifs enriched in 5' distal regions were often enriched in 3' distal regions, etc. Finally, we observed that the 2684 predicted motifs classified the cell or tissue types of the datasets with an accuracy of 81.29%. The resources generated in this study are available at http://server.cs.ucf.edu/predrem/. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  4. Detecting correlations among functional-sequence motifs

    NASA Astrophysics Data System (ADS)

    Pirino, Davide; Rigosa, Jacopo; Ledda, Alice; Ferretti, Luca

    2012-06-01

    Sequence motifs are words of nucleotides in DNA with biological functions, e.g., gene regulation. Identification of such words proceeds through rejection of Markov models on the expected motif frequency along the genome. Additional biological information can be extracted from the correlation structure among patterns of motif occurrences. In this paper a log-linear multivariate intensity Poisson model is estimated via expectation maximization on a set of motifs along the genome of E. coli K12. The proposed approach allows for excitatory as well as inhibitory interactions among motifs and between motifs and other genomic features like gene occurrences. Our findings confirm previous stylized facts about such types of interactions and shed new light on genome-maintenance functions of some particular motifs. We expect these methods to be applicable to a wider set of genomic features.

  5. The Thiamine-Pyrophosphate-Motif

    NASA Technical Reports Server (NTRS)

    Ciszak, Ewa; Dominiak, Paulina

    2004-01-01

    Thiamin pyrophosphate (TPP), a derivative of vitamin B1, is a cofactor for enzymes performing catalysis in pathways of energy production including the well known decarboxylation of α-keto acid dehydrogenases followed by transketolation. TPP-dependent enzymes constitute a structurally and functionally diverse group exhibiting multimeric subunit organization, multiple domains and two chemically equivalent catalytic centers. Annotation of functional TPP-dependent enzymes, therefore, has not been trivial due to low sequence similarity related to this complex organization. Our approach to analysis of structures of known TPP-dependent enzymes reveals for the first time features common to this group, which we have termed the TPP-motif. The TPP-motif consists of specific spatial arrangements of structural elements and their specific contacts to provide for a flip-flop, or alternate site, enzymatic mechanism of action. Analysis of structural elements entrained in the flip-flop action displayed by TPP-dependent enzymes reveals a novel definition of the common amino acid sequences. These sequences allow for annotation of TPP-dependent enzymes, thus advancing functional proteomics. Further details of three-dimensional structures of TPP-dependent enzymes will be discussed.

  6. rMotifGen: random motif generator for DNA and protein sequences.

    PubMed

    Rouchka, Eric C; Hardin, C Timothy

    2007-08-07

    Detection of short, subtle conserved motif regions within a set of related DNA or amino acid sequences can lead to discoveries about important regulatory domains such as transcription factor and DNA binding sites as well as conserved protein domains. In order to help assess motif detection algorithms on motifs with varying properties and levels of conservation, we have developed a computational tool, rMotifGen, with the sole purpose of generating a number of random DNA or protein sequences containing short sequence motifs. Each motif consensus can be user-defined, randomly generated, or created from a position-specific scoring matrix (PSSM). Insertions and mutations within these motifs are created according to user-defined parameters and substitution matrices. The resulting sequences can be helpful in mutational simulations and in testing the limits of motif detection algorithms. Two implementations of rMotifGen have been created, one providing a graphical user interface (GUI) for random motif construction, and the other serving as a command line interface. The second implementation has the added advantages of platform independence and being able to be called in a batch mode. rMotifGen was used to construct sample sets of sequences containing DNA motifs and amino acid motifs that were then tested against the Gibbs sampler and MEME packages. rMotifGen provides an efficient and convenient method for creating random DNA or amino acid sequences with a variable number of motifs, where the instance of each motif can be incorporated using a position-specific scoring matrix (PSSM) or by creating an instance mutated from its corresponding consensus using an evolutionary model based on substitution matrices. rMotifGen is freely available at: http://bioinformatics.louisville.edu/brg/rMotifGen/.
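    The core idea described above, planting a motif instance drawn from a position-specific scoring matrix into a random background sequence, can be sketched in a few lines. This is not rMotifGen's code or interface. The function names, the uniform background, and the row layout of the matrix (one row of A, C, G, T weights per motif position) are assumptions, and the insertions and substitution-matrix mutations that the tool supports are left out.

```python
import random


def sample_site(pssm, rng, alphabet="ACGT"):
    """Draw one motif instance, choosing each position from its PSSM row of weights."""
    return "".join(rng.choices(alphabet, weights=row, k=1)[0] for row in pssm)


def implant_motif(length, pssm, rng, alphabet="ACGT"):
    """Return (sequence, start, site): a uniform random sequence with one planted site."""
    width = len(pssm)
    if width > length:
        raise ValueError("motif is longer than the requested sequence")
    background = [rng.choice(alphabet) for _ in range(length)]
    start = rng.randrange(length - width + 1)
    site = sample_site(pssm, rng, alphabet)
    background[start:start + width] = site  # overwrite the background with the site
    return "".join(background), start, site


# A fully conserved PSSM for the consensus TATA (rows are positions, columns are A, C, G, T).
tata = [[0, 0, 0, 1],
        [1, 0, 0, 0],
        [0, 0, 0, 1],
        [1, 0, 0, 0]]
rng = random.Random(0)  # fixed seed so that a test set is reproducible
sequence, start, site = implant_motif(30, tata, rng)
```

    Returning the planted position alongside the sequence is what makes such data useful for benchmarking: a motif finder's predictions can be scored against the known start coordinates.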

  7. [Scanning electronmicroscopic study of 3 composite filling materials after 1 year's use].

    PubMed

    Triadan, H

    1976-05-01

    This is an in vivo comparative test of two test composites, Compo-Cap and Cosmic, against Adaptic in a monkey (Macaca speciosa) over one year. No significant differences were found, and the defects at the margins and in the surface were similar. Unlike in a previous test with Epoxylite, no indubitable secondary caries was found with these fillings.

  8. Discovering novel sequence motifs with MEME.

    PubMed

    Bailey, Timothy L

    2002-11-01

    This unit illustrates how to use MEME to discover motifs in a group of related nucleotide or peptide sequences. A MEME motif is a sequence pattern that occurs repeatedly in one or more sequences in the input group. MEME can be used to discover novel patterns because it bases its discoveries only on the input sequences, not on any prior knowledge (such as databases of known motifs). The input to MEME is a set of unaligned sequences of the same type (peptide or nucleotide). For each motif it discovers, MEME reports the occurrences (sites), consensus sequence, and the level of conservation (information content) at each position in the pattern. MEME also produces block diagrams showing where all of the discovered motifs occur in the training set sequences. MEME's hypertext (HTML) output also contains buttons that allow for the convenient use of the motifs in other searches.

  9. rMotifGen: random motif generator for DNA and protein sequences

    PubMed Central

    Rouchka, Eric C; Hardin, C Timothy

    2007-01-01

    Background Detection of short, subtle conserved motif regions within a set of related DNA or amino acid sequences can lead to discoveries about important regulatory domains such as transcription factor and DNA binding sites as well as conserved protein domains. In order to help assess motif detection algorithms on motifs with varying properties and levels of conservation, we have developed a computational tool, rMotifGen, with the sole purpose of generating a number of random DNA or protein sequences containing short sequence motifs. Each motif consensus can be user-defined, randomly generated, or created from a position-specific scoring matrix (PSSM). Insertions and mutations within these motifs are created according to user-defined parameters and substitution matrices. The resulting sequences can be helpful in mutational simulations and in testing the limits of motif detection algorithms. Results Two implementations of rMotifGen have been created, one providing a graphical user interface (GUI) for random motif construction, and the other serving as a command line interface. The second implementation has the added advantages of platform independence and being able to be called in a batch mode. rMotifGen was used to construct sample sets of sequences containing DNA motifs and amino acid motifs that were then tested against the Gibbs sampler and MEME packages. Conclusion rMotifGen provides an efficient and convenient method for creating random DNA or amino acid sequences with a variable number of motifs, where the instance of each motif can be incorporated using a position-specific scoring matrix (PSSM) or by creating an instance mutated from its corresponding consensus using an evolutionary model based on substitution matrices. rMotifGen is freely available at: http://bioinformatics.louisville.edu/brg/rMotifGen/. PMID:17683637

  10. BayesMotif: de novo protein sorting motif discovery from impure datasets

    PubMed Central

    2010-01-01

    Background Protein sorting is the process that newly synthesized proteins are transported to their target locations within or outside of the cell. This process is precisely regulated by protein sorting signals in different forms. A major category of sorting signals are amino acid sub-sequences usually located at the N-terminals or C-terminals of protein sequences. Genome-wide experimental identification of protein sorting signals is extremely time-consuming and costly. Effective computational algorithms for de novo discovery of protein sorting signals is needed to improve the understanding of protein sorting mechanisms. Methods We formulated the protein sorting motif discovery problem as a classification problem and proposed a Bayesian classifier based algorithm (BayesMotif) for de novo identification of a common type of protein sorting motifs in which a highly conserved anchor is present along with a less conserved motif regions. A false positive removal procedure is developed to iteratively remove sequences that are unlikely to contain true motifs so that the algorithm can identify motifs from impure input sequences. Results Experiments on both implanted motif datasets and real-world datasets showed that the enhanced BayesMotif algorithm can identify anchored sorting motifs from pure or impure protein sequence dataset. It also shows that the false positive removal procedure can help to identify true motifs even when there is only 20% of the input sequences containing true motif instances. Conclusion We proposed BayesMotif, a novel Bayesian classification based algorithm for de novo discovery of a special category of anchored protein sorting motifs from impure datasets. Compared to conventional motif discovery algorithms such as MEME, our algorithm can find less-conserved motifs with short highly conserved anchors. Our algorithm also has the advantage of easy incorporation of additional meta-sequence features such as hydrophobicity or charge of the motifs which

  11. A Further Study on Mining DNA Motifs Using Fuzzy Self-Organizing Maps.

    PubMed

    Tapan, Sarwar; Wang, Dianhui

    2016-01-01

    Self-organizing map (SOM)-based motif mining, despite being a promising approach for problem solving, mostly fails to offer a consistent interpretation of clusters with respect to the mixed composition of signal and noise in the nodes. The main reason behind this shortcoming comes from the similarity metrics used in data assignment, specially designed with the biological interpretation for this domain, which are not meant to consider the inevitable noise mixture in the clusters. This limits the explicability of the majority of clusters that are supposedly noise dominated, degrading the overall system clarity in motif discovery. This paper aims to improve the explicability aspect of the learning process by introducing a composite similarity function (CSF) that is specially designed for the k-mer-to-cluster similarity measure with respect to the degree of motif properties and embedded noise in the cluster. Our proposed motif finding algorithm in this paper is built on our previous work, robust elicitation algorithms for discovering (READ) [1], and is termed READ Deoxyribonucleic acid motifs using CSFs (READ(csf)). It performs slightly better than READ and shows some remarkable improvements over the SOM-based SOMBRERO and SOMEA tools in terms of F-measure on the testing data sets. A real data set containing multiple motifs is used to explore the potential of READ(csf) for more challenging biological data mining tasks. Visual comparisons with the verified logos extracted from the JASPAR database demonstrate that our algorithm is promising for discovering multiple motifs simultaneously.

  12. MSDmotif: exploring protein sites and motifs

    PubMed Central

    Golovin, Adel; Henrick, Kim

    2008-01-01

    Background Protein structures have conserved features – motifs, which have a sufficient influence on the protein function. These motifs can be found in sequence as well as in 3D space. Understanding of these fragments is essential for 3D structure prediction, modelling and drug-design. The Protein Data Bank (PDB) is the source of this information; however, present search tools have limited 3D options to integrate protein sequence with its 3D structure. Results We describe here a web application for querying the PDB for ligands, binding sites, small 3D structural and sequence motifs and the underlying database. Novel algorithms for chemical fragments, 3D motifs, ϕ/ψ sequences, super-secondary structure motifs and for small 3D structural motif association searches are incorporated. The interface provides functionality for visualization, search criteria creation, sequence and 3D multiple alignment options. MSDmotif is an integrated system where a results page is also a search form. A set of motif statistics is available for analysis. This set includes molecule and motif binding statistics, distribution of motif sequences, occurrence of an amino-acid within a motif, correlation of amino-acid side-chain charges within a motif and Ramachandran plots for each residue. The binding statistics are presented in association with properties that include a ligand fragment library. Access is also provided through the Distributed Annotation System (DAS) protocol. An additional entry point facilitates XML requests with XML responses. Conclusion MSDmotif is unique in combining chemical, sequence and 3D data in a single search engine with a range of search and visualisation options. It provides multiple views of data found in the PDB archive for exploring protein structures. PMID:18637174

  13. A survey of motif finding Web tools for detecting binding site motifs in ChIP-Seq data.

    PubMed

    Tran, Ngoc Tam L; Huang, Chun-Hsi

    2014-02-20

    ChIP-Seq (chromatin immunoprecipitation sequencing) is advantageous for motif finding because ChIP-Seq experiments narrow the search down to binding site locations. Recent motif finding tools facilitate motif detection by providing user-friendly Web interfaces. In this work, we reviewed nine motif finding Web tools that are capable of detecting binding site motifs in ChIP-Seq data. We showed that each motif finding Web tool has its own advantages for detecting motifs that other tools may not discover. We recommend that users apply multiple motif finding Web tools that implement different algorithms to obtain significant motifs, overlapping similar motifs, and non-overlapping motifs. Finally, we provide suggestions for the future development of motif finding Web tools that better assist researchers in finding motifs in ChIP-Seq data.

  14. Mining, compressing and classifying with extensible motifs

    PubMed Central

    Apostolico, Alberto; Comin, Matteo; Parida, Laxmi

    2006-01-01

    Background Motif patterns of maximal saturation emerged originally in contexts of pattern discovery in biomolecular sequences and have recently proven a valuable notion also in the design of data compression schemes. Informally, a motif is a string of intermittently solid and wild characters that recurs more or less frequently in an input sequence or family of sequences. Motif discovery techniques and tools tend to be computationally imposing, however, special classes of "rigid" motifs have been identified of which the discovery is affordable in low polynomial time. Results In the present work, "extensible" motifs are considered such that each sequence of gaps comes endowed with some elasticity, whereby the same pattern may be stretched to fit segments of the source that match all the solid characters but are otherwise of different lengths. A few applications of this notion are then described. In applications of data compression by textual substitution, extensible motifs are seen to bring savings on the size of the codebook, and hence to improve compression. In germane contexts, in which compressibility is used in its dual role as a basis for structural inference and classification, extensible motifs are seen to support unsupervised classification and phylogeny reconstruction. Conclusion Off-line compression based on extensible motifs can be used advantageously to compress and classify biological sequences. PMID:16722593
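
    The notion of an extensible motif described above can be illustrated with a small sketch. It assumes a hypothetical notation in which a motif is a list of solid blocks separated by gaps of bounded, variable length, and it simply translates that notation into a regular expression. The helper names `extensible_to_regex` and `occurrences` are illustrative and are not taken from the paper, and the sketch does not implement the discovery or compression methods themselves.

```python
import re

def extensible_to_regex(solids, gaps):
    """Compile an extensible motif into a regular expression.

    solids: solid blocks in order, e.g. ["AC", "G"]
    gaps:   one (min_len, max_len) wildcard run between consecutive blocks
    This notation is a hypothetical one chosen for illustration.
    """
    if len(gaps) != len(solids) - 1:
        raise ValueError("need exactly one gap between consecutive solid blocks")
    parts = [re.escape(solids[0])]
    for (lo, hi), block in zip(gaps, solids[1:]):
        parts.append(".{%d,%d}" % (lo, hi))   # elastic run of wild characters
        parts.append(re.escape(block))        # next solid block
    return re.compile("".join(parts))

def occurrences(pattern, text):
    """Return (start, matched_segment) for each start position where the motif fits.

    Only one segment is reported per start position (the regex engine's
    greedy choice), even if several gap lengths would fit.
    """
    hits = []
    for i in range(len(text)):
        m = pattern.match(text, i)
        if m:
            hits.append((i, m.group(0)))
    return hits

# "AC", then 1 to 3 wild characters, then "G".
motif = extensible_to_regex(["AC", "G"], [(1, 3)])
print(occurrences(motif, "ACTGxxACTTTGxxACG"))
```

    The same pattern matches segments of different lengths (here "ACTG" and "ACTTTG"), which is the elasticity the abstract refers to.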

  15. Sampling Motif-Constrained Ensembles of Networks

    NASA Astrophysics Data System (ADS)

    Fischer, Rico; Leitão, Jorge C.; Peixoto, Tiago P.; Altmann, Eduardo G.

    2015-10-01

    The statistical significance of network properties is conditioned on null models which satisfy specified properties but are otherwise random. Exponential random graph models are a principled theoretical framework to generate such constrained ensembles, but they often fail in practice, either due to model inconsistency or due to the impossibility of sampling networks from them. These problems affect the important case of networks with prescribed clustering coefficient or number of small connected subgraphs (motifs). In this Letter we use the Wang-Landau method to obtain a multicanonical sampling that overcomes both these problems. We sample, in polynomial time, networks with arbitrary degree sequences from ensembles with imposed motif counts. Applying this method to social networks, we investigate the relation between transitivity and homophily, and we quantify the correlation between different types of motifs, finding that single motifs can explain up to 60% of the variation of motif profiles.

  16. Temporal motifs in time-dependent networks

    NASA Astrophysics Data System (ADS)

    Kovanen, Lauri; Karsai, Márton; Kaski, Kimmo; Kertész, János; Saramäki, Jari

    2011-11-01

    Temporal networks are commonly used to represent systems where connections between elements are active only for restricted periods of time, such as telecommunication, neural signal processing, biochemical reaction and human social interaction networks. We introduce the framework of temporal motifs to study the mesoscale topological-temporal structure of temporal networks in which the events of nodes do not overlap in time. Temporal motifs are classes of similar event sequences, where the similarity refers not only to topology but also to the temporal order of the events. We provide a mapping from event sequences to coloured directed graphs that enables an efficient algorithm for identifying temporal motifs. We discuss some aspects of temporal motifs, including causality and null models, and present basic statistics of temporal motifs in a large mobile call network.

  17. A comparison between corn and grain sorghum fermentation rates, distillers dried grains with solubles composition, and lipid profiles

    USDA-ARS's Scientific Manuscript database

    Interest in utilization of feedstocks other than corn for fuel ethanol production has been increasing due to political as well as environmental reasons. Grain sorghum is an identified alternative that has a number of potential benefits relative to corn in both composition and agronomic traits. Compo...

  18. MotifNet: a web-server for network motif analysis.

    PubMed

    Smoly, Ilan Y; Lerman, Eugene; Ziv-Ukelson, Michal; Yeger-Lotem, Esti

    2017-06-15

    Network motifs are small topological patterns that recur in a network significantly more often than expected by chance. Their identification emerged as a powerful approach for uncovering the design principles underlying complex networks. However, available tools for network motif analysis typically require download and execution of computationally intensive software on a local computer. We present MotifNet, the first open-access web-server for network motif analysis. MotifNet allows researchers to analyze integrated networks, where nodes and edges may be labeled, and to search for motifs of up to eight nodes. The output motifs are presented graphically and the user can interactively filter them by their significance, number of instances, node and edge labels, and node identities, and view their instances. MotifNet also allows the user to distinguish between motifs that are centered on specific nodes and motifs that recur in distinct parts of the network. MotifNet is freely available at http://netbio.bgu.ac.il/motifnet . The website was implemented using ReactJs and supports all major browsers. The server interface was implemented in Python with data stored on a MySQL database. Contact: estiyl@bgu.ac.il or michaluz@cs.bgu.ac.il. Supplementary data are available at Bioinformatics online.
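
    The definition in the first sentence (a pattern that recurs more often than expected by chance) can be sketched as follows. This is a minimal illustration and not MotifNet's implementation: it assumes the pattern of interest is the three-node feed-forward loop, uses degree-preserving edge swaps as the null model, and reports a z-score. The function names and the toy edge list are hypothetical.

```python
import random
from statistics import mean, pstdev

def count_ffl(edges):
    """Count feed-forward loops (a->b, b->c, a->c) in a directed edge list."""
    succ = {}
    for u, v in set(edges):
        succ.setdefault(u, set()).add(v)
    total = 0
    for a, outs in succ.items():
        for b in outs:
            if b == a:
                continue  # ignore self-loops
            for c in succ.get(b, ()):
                if c != a and c != b and c in outs:
                    total += 1
    return total

def rewire(edges, n_swaps, rng):
    """Randomize a network while keeping every node's in- and out-degree."""
    edges = list(set(edges))
    present = set(edges)
    for _ in range(n_swaps):
        i, j = rng.randrange(len(edges)), rng.randrange(len(edges))
        (a, b), (c, d) = edges[i], edges[j]
        new1, new2 = (a, d), (c, b)
        # Skip swaps that would create self-loops or duplicate edges.
        if a == d or c == b or new1 in present or new2 in present:
            continue
        present -= {edges[i], edges[j]}
        present |= {new1, new2}
        edges[i], edges[j] = new1, new2
    return edges

def motif_zscore(edges, n_random=100, seed=0):
    """Compare the observed count with counts in randomized networks."""
    rng = random.Random(seed)
    observed = count_ffl(edges)
    null = [count_ffl(rewire(edges, 10 * len(edges), rng)) for _ in range(n_random)]
    sd = pstdev(null)
    z = (observed - mean(null)) / sd if sd > 0 else float("nan")
    return observed, z

# Hypothetical toy network, for illustration only.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"),
             ("d", "e"), ("a", "e"), ("b", "d")]
print(motif_zscore(toy_edges))
```

    On a network this small the z-score is not meaningful; the sketch only shows the shape of the comparison between an observed count and a randomized ensemble.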

  19. Efficient motif search in ranked lists and applications to variable gap motifs.

    PubMed

    Leibovich, Limor; Yakhini, Zohar

    2012-07-01

    Sequence elements, at all levels (DNA, RNA and protein), play a central role in mediating molecular recognition and thereby molecular regulation and signaling. Studies that focus on measuring and investigating sequence-based recognition make use of statistical and computational tools, including approaches to searching sequence motifs. State-of-the-art motif searching tools are limited in their coverage and ability to address large motif spaces. We develop and present statistical and algorithmic approaches that take as input ranked lists of sequences and return significant motifs. The efficiency of our approach, based on suffix trees, allows searches over motif spaces that are not covered by existing tools. This includes searching variable gap motifs (two half sites with a flexible length gap in between) and searching long motifs over large alphabets. We used our approach to analyze several high-throughput measurement data sets and report some validation results as well as novel suggested motifs and motif refinements. We suggest a refinement of the known estrogen receptor 1 motif in humans, where we observe gaps other than three nucleotides that also serve as significant recognition sites, as well as a variable length motif related to potential tyrosine phosphorylation.
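
    A rough sketch of the two ideas in this abstract, a variable gap motif and enrichment at the top of a ranked list, is given below. The abstract does not name the statistic or the suffix-tree details, so the hypergeometric tail used here is an assumption (it is one common choice for ranked-list enrichment), and the brute-force regular-expression scan stands in for the efficient search. All function names and the example sequences are hypothetical.

```python
import re
from math import comb

def has_gapped_motif(seq, left, right, min_gap, max_gap):
    """True if seq contains left half site, a gap of min_gap..max_gap bases, right half site."""
    pattern = re.escape(left) + "[ACGT]{%d,%d}" % (min_gap, max_gap) + re.escape(right)
    return re.search(pattern, seq) is not None

def hypergeom_tail(N, B, n, b):
    """P(X >= b) when n of N items are drawn without replacement and B items are hits."""
    upper = min(n, B)
    return sum(comb(B, k) * comb(N - B, n - k) for k in range(b, upper + 1)) / comb(N, n)

def best_cutoff(hits):
    """hits: 0/1 flags in rank order (1 = sequence contains the motif).

    Returns (smallest tail probability, cutoff) over all list prefixes.
    The value is NOT corrected for having tried every cutoff.
    """
    N, B = len(hits), sum(hits)
    best = (1.0, 0)
    seen = 0
    for n, flag in enumerate(hits, start=1):
        seen += flag
        p = hypergeom_tail(N, B, n, seen)
        if p < best[0]:
            best = (p, n)
    return best

# Hypothetical ranked sequences (best-ranked first), for illustration only.
ranked = ["TTAGGTCACAGTGACCTAA", "AGGTCATTGACCT", "AGGTCACCTGACCT",
          "ACGTACGTACGT", "TTTTTTTTTT", "GGGGCCCC"]
flags = [int(has_gapped_motif(s, "AGGTCA", "TGACCT", 0, 5)) for s in ranked]
print(flags, best_cutoff(flags))
```

    Allowing the gap to range over several lengths is what lets a single query cover half-site spacings other than the canonical one.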

  20. CompleteMOTIFs: DNA motif discovery platform for transcription factor binding experiments.

    PubMed

    Kuttippurathu, Lakshmi; Hsing, Michael; Liu, Yongchao; Schmidt, Bertil; Maskell, Douglas L; Lee, Kyungjoon; He, Aibin; Pu, William T; Kong, Sek Won

    2011-03-01

    CompleteMOTIFs (cMOTIFs) is an integrated web tool developed to facilitate systematic discovery of overrepresented transcription factor binding motifs from high-throughput chromatin immunoprecipitation experiments. Comprehensive annotations and Boolean logic operations on multiple peak locations enable users to focus on genomic regions of interest for de novo motif discovery using tools such as MEME, Weeder and ChIPMunk. The pipeline incorporates a scanning tool for known motifs from TRANSFAC and JASPAR databases, and performs an enrichment test using local or precalculated background models that significantly improve the motif scanning result. Furthermore, using the cMOTIFs pipeline, we demonstrated that multiple transcription factors could cooperatively bind to the upstream of important stem cell differentiation regulators. http://cmotifs.tchlab.org.

  1. [Psychopathological study of lie motif in schizophrenia].

    PubMed

    Otsuka, Koichiro; Kato, Satoshi

    2006-01-01

    The theme of a statement is called "lie motif" by the authors when schizophrenic patients say "I have lied to anybody". We tried to analyse the psychopathological characteristics and anthropological meanings of the lie motifs in schizophrenia, which have not been thematically examined until now, based on 4 cases, and contrasting with the lie motif (Lügenmotiv) in depression taken up by A. Kraus (1989). We classified the lie motifs in schizophrenia into the following two types: a) the past directive lie motif: the patients speak about their real lie regarding it as a 'petty fault' in their distant past with self-guilty feeling, b) the present directive lie motif: the patients say repeatedly 'I have lied' (about their present speech and behavior), retreating from their previous commitments. The observed false confessions of innocent fault by the patients seem to belong to the present directive lie motif. In comparison with the lie motif in depression, it is characteristic for the lie motif in schizophrenia that the patients feel themselves to already have been caught out by others before they confess the lie. The lie motif in schizophrenia seems to come into being through the attribution process of taking the others' blame on one's own shoulders, which has been pointed out to be common in the guilt experience in schizophrenia. The others' blame on this occasion is due to "the others' gaze" in the experience of the initial self-centralization (i.e. non delusional self-referential experience) in the early stage of schizophrenia (S. Kato 1999). The others' gaze is supposed to bring about the feeling of amorphous self-revelation which could also be regarded as the guilt feeling without content, to the patients. When the guilt feeling is bound with a past concrete fault, the patients tell the past directive lie motif. On the other hand, when the patients cannot find a past fixed content, and feel their present actions as uncertain and experience them as lies, the

  2. Network motifs in integrated cellular networks of transcription-regulation and protein-protein interaction

    NASA Astrophysics Data System (ADS)

    Yeger-Lotem, Esti; Sattath, Shmuel; Kashtan, Nadav; Itzkovitz, Shalev; Milo, Ron; Pinter, Ron Y.; Alon, Uri; Margalit, Hanah

    2004-04-01

    Genes and proteins generate molecular circuitry that enables the cell to process information and respond to stimuli. A major challenge is to identify characteristic patterns in this network of interactions that may shed light on basic cellular mechanisms. Previous studies have analyzed aspects of this network, concentrating on either transcription-regulation or protein-protein interactions. Here we search for composite network motifs: characteristic network patterns consisting of both transcription-regulation and protein-protein interactions that recur significantly more often than in random networks. To this end we developed algorithms for detecting motifs in networks with two or more types of interactions and applied them to an integrated data set of protein-protein interactions and transcription regulation in Saccharomyces cerevisiae. We found a two-protein mixed-feedback loop motif, five types of three-protein motifs exhibiting coregulation and complex formation, and many motifs involving four proteins. Virtually all four-protein motifs consisted of combinations of smaller motifs. This study presents a basic framework for detecting the building blocks of networks with multiple types of interactions.
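
    To make the idea of a composite motif concrete, the sketch below looks for two simple patterns in a network with two edge types: a pair of proteins linked by both a transcription-regulation edge and a protein-protein interaction, and a regulator whose two targets interact with each other. This reading of the motif shapes is an assumption based on the abstract, the code does not test statistical significance against random networks, and the node names are placeholders.

```python
def mixed_feedback_pairs(regulation, interactions):
    """Pairs (regulator, target) that also share a protein-protein interaction.

    regulation:   directed (regulator, target) edges
    interactions: undirected protein-protein pairs
    """
    ppi = {frozenset(pair) for pair in interactions if pair[0] != pair[1]}
    return sorted((r, t) for r, t in set(regulation)
                  if r != t and frozenset((r, t)) in ppi)

def coregulated_interacting(regulation, interactions):
    """Triples (regulator, x, y) where one regulator targets both partners of an interaction."""
    targets = {}
    for r, t in regulation:
        targets.setdefault(r, set()).add(t)
    found = set()
    for x, y in interactions:
        if x == y:
            continue
        for r, ts in targets.items():
            if x in ts and y in ts and r not in (x, y):
                found.add((r,) + tuple(sorted((x, y))))
    return sorted(found)

# Placeholder network, for illustration only.
regulation = [("R", "X"), ("R", "Y"), ("A", "B")]
interactions = [("Y", "X"), ("A", "B")]
print(mixed_feedback_pairs(regulation, interactions))
print(coregulated_interacting(regulation, interactions))
```

    Treating each interaction type as its own edge set keeps the pattern definitions explicit; a significance test would then compare these counts with counts in suitably randomized networks.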

  3. Stochastic motif extraction using hidden Markov model

    SciTech Connect

    Fujiwara, Yukiko; Asogawa, Minoru; Konagaya, Akihiko

    1994-12-31

    In this paper, we study the application of an HMM (hidden Markov model) to the problem of representing protein sequences by a stochastic motif. A stochastic protein motif represents the small segments of protein sequences that have a certain function or structure. The stochastic motif, represented by an HMM, has conditional probabilities to deal with the stochastic nature of the motif. This HMM directly reflects the characteristics of the motif, such as a protein periodical structure or grouping. In order to obtain the optimal HMM, we developed the "iterative duplication method" for HMM topology learning. It starts from a small fully-connected network and iterates the network generation and parameter optimization until it achieves sufficient discrimination accuracy. Using this method, we obtained an HMM for a leucine zipper motif. Compared to a symbolic pattern representation, which had an accuracy of 14.8 percent, an HMM achieved 79.3 percent in prediction. Additionally, the method can obtain an HMM for various types of zinc finger motifs, and it might separate the mixed data. We demonstrated that this approach is applicable to the validation of the protein databases; a constructed HMM has indicated that one protein sequence annotated as "leucine-zipper like sequence" in the database is quite different from other leucine-zipper sequences in terms of likelihood, and we found this discrimination is plausible.
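
    The likelihood comparison mentioned at the end of this abstract rests on computing the probability of a sequence under an HMM. A minimal sketch of that computation, the standard forward algorithm, is shown below. The two-state model, its probabilities, and the reduced alphabet ("L" for leucine, "x" for any other residue) are invented for illustration; the sketch does not implement the iterative duplication method for topology learning.

```python
def forward_likelihood(seq, states, start_p, trans_p, emit_p):
    """Return P(seq | HMM) using the forward algorithm."""
    if not seq:
        return 1.0
    # alpha[s] = probability of the symbols seen so far, ending in state s
    alpha = {s: start_p[s] * emit_p[s].get(seq[0], 0.0) for s in states}
    for symbol in seq[1:]:
        alpha = {
            s: emit_p[s].get(symbol, 0.0)
               * sum(alpha[r] * trans_p[r].get(s, 0.0) for r in states)
            for s in states
        }
    return sum(alpha.values())

# Hypothetical toy model: all numbers below are assumptions for illustration.
states = ("motif", "background")
start_p = {"motif": 0.5, "background": 0.5}
trans_p = {"motif": {"motif": 0.7, "background": 0.3},
           "background": {"motif": 0.2, "background": 0.8}}
emit_p = {"motif": {"L": 0.6, "x": 0.4},
          "background": {"L": 0.1, "x": 0.9}}

for s in ("LxxLxxL", "xxxxxxx"):
    print(s, forward_likelihood(s, states, start_p, trans_p, emit_p))
```

    For long sequences the products underflow, so practical implementations work in log space or rescale `alpha` at each step.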

  4. VARUN: discovering extensible motifs under saturation constraints.

    PubMed

    Apostolico, Alberto; Comin, Matteo; Parida, Laxmi

    2010-01-01

    The discovery of motifs in biosequences is frequently torn between the rigidity of the model on one hand and the abundance of candidates on the other hand. In particular, motifs that include wild cards or "don't cares" escalate exponentially with their number, and this gets only worse if a don't care is allowed to stretch up to some prescribed maximum length. In this paper, a notion of extensible motif in a sequence is introduced and studied, which tightly combines the structure of the motif pattern, as described by its syntactic specification, with the statistical measure of its occurrence count. It is shown that a combination of appropriate saturation conditions and the monotonicity of probabilistic scores over regions of constant frequency afford us significant parsimony in the generation and testing of candidate overrepresented motifs. A suite of software programs called Varun is described, implementing the discovery of extensible motifs of the type considered. The merits of the method are then documented by results obtained in a variety of experiments primarily targeting protein sequence families. Of equal importance seems the fact that the sets of all surprising motifs returned in each experiment are extracted faster and come in much more manageable sizes than would be obtained in the absence of saturation constraints.

  5. Efficient motif search in ranked lists and applications to variable gap motifs

    PubMed Central

    Leibovich, Limor; Yakhini, Zohar

    2012-01-01

    Sequence elements, at all levels—DNA, RNA and protein, play a central role in mediating molecular recognition and thereby molecular regulation and signaling. Studies that focus on measuring and investigating sequence-based recognition make use of statistical and computational tools, including approaches to searching sequence motifs. State-of-the-art motif searching tools are limited in their coverage and ability to address large motif spaces. We develop and present statistical and algorithmic approaches that take as input ranked lists of sequences and return significant motifs. The efficiency of our approach, based on suffix trees, allows searches over motif spaces that are not covered by existing tools. This includes searching variable gap motifs—two half sites with a flexible length gap in between—and searching long motifs over large alphabets. We used our approach to analyze several high-throughput measurement data sets and report some validation results as well as novel suggested motifs and motif refinements. We suggest a refinement of the known estrogen receptor 1 motif in humans, where we observe gaps other than three nucleotides that also serve as significant recognition sites, as well as a variable length motif related to potential tyrosine phosphorylation. PMID:22416066

  6. RSAT peak-motifs: motif analysis in full-size ChIP-seq datasets.

    PubMed

    Thomas-Chollier, Morgane; Herrmann, Carl; Defrance, Matthieu; Sand, Olivier; Thieffry, Denis; van Helden, Jacques

    2012-02-01

    ChIP-seq is increasingly used to characterize transcription factor binding and chromatin marks at a genomic scale. Various tools are now available to extract binding motifs from peak data sets. However, most approaches are only available as command-line programs, or via a website but with size restrictions. We present peak-motifs, a computational pipeline that discovers motifs in peak sequences, compares them with databases, exports putative binding sites for visualization in the UCSC genome browser and generates an extensive report suited for both naive and expert users. It relies on time- and memory-efficient algorithms enabling the treatment of several thousand peaks within minutes. Regarding time efficiency, peak-motifs outperforms all comparable tools by several orders of magnitude. We demonstrate its accuracy by analyzing data sets ranging from 4000 to 128,000 peaks for 12 embryonic stem cell-specific transcription factors. In all cases, the program finds the expected motifs and returns additional motifs potentially bound by cofactors. We further apply peak-motifs to discover tissue-specific motifs in peak collections for the p300 transcriptional co-activator. To our knowledge, peak-motifs is the only tool that performs a complete motif analysis and offers a user-friendly web interface without any restriction on sequence size or number of peaks.

  7. RSAT peak-motifs: motif analysis in full-size ChIP-seq datasets

    PubMed Central

    Thomas-Chollier, Morgane; Herrmann, Carl; Defrance, Matthieu; Sand, Olivier; Thieffry, Denis; van Helden, Jacques

    2012-01-01

    ChIP-seq is increasingly used to characterize transcription factor binding and chromatin marks at a genomic scale. Various tools are now available to extract binding motifs from peak data sets. However, most approaches are only available as command-line programs, or via a website but with size restrictions. We present peak-motifs, a computational pipeline that discovers motifs in peak sequences, compares them with databases, exports putative binding sites for visualization in the UCSC genome browser and generates an extensive report suited for both naive and expert users. It relies on time- and memory-efficient algorithms enabling the treatment of several thousand peaks within minutes. Regarding time efficiency, peak-motifs outperforms all comparable tools by several orders of magnitude. We demonstrate its accuracy by analyzing data sets ranging from 4000 to 128,000 peaks for 12 embryonic stem cell-specific transcription factors. In all cases, the program finds the expected motifs and returns additional motifs potentially bound by cofactors. We further apply peak-motifs to discover tissue-specific motifs in peak collections for the p300 transcriptional co-activator. To our knowledge, peak-motifs is the only tool that performs a complete motif analysis and offers a user-friendly web interface without any restriction on sequence size or number of peaks. PMID:22156162

  8. Automated classification of RNA 3D motifs and the RNA 3D Motif Atlas

    PubMed Central

    Petrov, Anton I.; Zirbel, Craig L.; Leontis, Neocles B.

    2013-01-01

    The analysis of atomic-resolution RNA three-dimensional (3D) structures reveals that many internal and hairpin loops are modular, recurrent, and structured by conserved non-Watson–Crick base pairs. Structurally similar loops define RNA 3D motifs that are conserved in homologous RNA molecules, but can also occur at nonhomologous sites in diverse RNAs, and which often vary in sequence. To further our understanding of RNA motif structure and sequence variability and to provide a useful resource for structure modeling and prediction, we present a new method for automated classification of internal and hairpin loop RNA 3D motifs and a new online database called the RNA 3D Motif Atlas. To classify the motif instances, a representative set of internal and hairpin loops is automatically extracted from a nonredundant list of RNA-containing PDB files. Their structures are compared geometrically, all-against-all, using the FR3D program suite. The loops are clustered into motif groups, taking into account geometric similarity and structural annotations and making allowance for a variable number of bulged bases. The automated procedure that we have implemented identifies all hairpin and internal loop motifs previously described in the literature. All motif instances and motif groups are assigned unique and stable identifiers and are made available in the RNA 3D Motif Atlas (http://rna.bgsu.edu/motifs), which is automatically updated every four weeks. The RNA 3D Motif Atlas provides an interactive user interface for exploring motif diversity and tools for programmatic data access. PMID:23970545

  9. Automated classification of RNA 3D motifs and the RNA 3D Motif Atlas.

    PubMed

    Petrov, Anton I; Zirbel, Craig L; Leontis, Neocles B

    2013-10-01

    The analysis of atomic-resolution RNA three-dimensional (3D) structures reveals that many internal and hairpin loops are modular, recurrent, and structured by conserved non-Watson-Crick base pairs. Structurally similar loops define RNA 3D motifs that are conserved in homologous RNA molecules, but can also occur at nonhomologous sites in diverse RNAs, and which often vary in sequence. To further our understanding of RNA motif structure and sequence variability and to provide a useful resource for structure modeling and prediction, we present a new method for automated classification of internal and hairpin loop RNA 3D motifs and a new online database called the RNA 3D Motif Atlas. To classify the motif instances, a representative set of internal and hairpin loops is automatically extracted from a nonredundant list of RNA-containing PDB files. Their structures are compared geometrically, all-against-all, using the FR3D program suite. The loops are clustered into motif groups, taking into account geometric similarity and structural annotations and making allowance for a variable number of bulged bases. The automated procedure that we have implemented identifies all hairpin and internal loop motifs previously described in the literature. All motif instances and motif groups are assigned unique and stable identifiers and are made available in the RNA 3D Motif Atlas (http://rna.bgsu.edu/motifs), which is automatically updated every four weeks. The RNA 3D Motif Atlas provides an interactive user interface for exploring motif diversity and tools for programmatic data access.

  10. Understanding the Effect of Preparative Approaches in the Formation of “Flower-like” Li4Ti5O12 —Multiwalled Carbon Nanotube Composite Motifs with Performance as High-Rate Anode Materials for Li-Ion Battery Applications

    DOE PAGES

    Wang, Lei; Zhang, Yiman; McBean, Coray L.; ...

    2017-01-18

    Herein we highlight the significance of nanoscale attachment modality as an important determinant of observed electrochemical performance. Specifically, controlled loading ratios of multi-walled carbon nanotubes (MWNTs) have been successfully anchored onto the surfaces of a unique “flower-like” Li4Ti5O12 (LTO) micro-scale sphere motif, for the first time, using a number of different and distinctive preparative approaches, including (i) a sonication method, (ii) an in situ direct-deposition approach, (iii) a covalent attachment protocol, as well as (iv) a π-π interaction strategy. In terms of structural characterization, the composites generated by physical sonication as well as non-covalent π-π interactions retained the intrinsic hierarchical “flower-like” morphology and exhibited a similar crystallinity profile as compared with that of pure LTO. By comparison, the composite prepared by an in situ direct deposition approach yielded not only a fragmented LTO structure, likely due to the possible interfering presence of the MWNTs themselves during the relevant hydrothermal reaction, but also a larger crystallite size, owing to the higher annealing temperature associated with its preparation. Finally, the composite created via covalent attachment was covered with an amorphous insulating linker, which probably led to a decreased contact area between the LTO and the MWNTs and hence, a lower crystallinity in the resulting composite. In addition electrode tests suggested that the composite generated by π-π interactions out-performed the other three analogous heterostructures, due to a smaller charge transfer resistance as well as a faster Li-ion diffusion. In particular, the LTO-MWNT composite, produced by π-π interactions, exhibited a reproducibly high rate capability as well as a reliably solid cycling stability, delivering 132 mA h g-1 at 50 C, after 100 discharge/charge cycles, including 40 cycles at a high (>20 C) rate. To conclude, such data

  11. Short sequence motifs, overrepresented in mammalian conserved non-coding sequences

    SciTech Connect

    Minovitsky, Simon; Stegmaier, Philip; Kel, Alexander; Kondrashov, Alexey S.; Dubchak, Inna

    2007-02-21

    Background: A substantial fraction of non-coding DNA sequences of multicellular eukaryotes is under selective constraint. In particular, ~5 percent of the human genome consists of conserved non-coding sequences (CNSs). CNSs differ from other genomic sequences in their nucleotide composition and must play important functional roles, which mostly remain obscure. Results: We investigated relative abundances of short sequence motifs in all human CNSs present in the human/mouse whole-genome alignments vs. three background sets of sequences: (i) weakly conserved or unconserved non-coding sequences (non-CNSs); (ii) near-promoter sequences (located between nucleotides -500 and -1500, relative to a start of transcription); and (iii) random sequences with the same nucleotide composition as that of CNSs. When compared to non-CNSs and near-promoter sequences, CNSs possess an excess of AT-rich motifs, often containing runs of identical nucleotides. In contrast, when compared to random sequences, CNSs contain an excess of GC-rich motifs which, however, lack CpG dinucleotides. Thus, abundance of short sequence motifs in human CNSs, taken as a whole, is mostly determined by their overall compositional properties and not by overrepresentation of any specific short motifs. These properties are: (i) high AT-content of CNSs, (ii) a tendency, probably due to context-dependent mutation, of A's and T's to clump, (iii) presence of short GC-rich regions, and (iv) avoidance of CpG contexts, due to their hypermutability. Only a small number of short motifs, overrepresented in all human CNSs are similar to binding sites of transcription factors from the FOX family. Conclusion: Human CNSs as a whole appear to be too broad a class of sequences to possess strong footprints of any short sequence-specific functions. Such footprints should be studied at the level of functional subclasses of CNSs, such as those which flank genes with a particular pattern of expression. Overall properties of CNSs are affected by patterns in
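
    The comparison of short-motif abundance between a foreground set and a background set can be sketched as a ratio of k-mer frequencies. This is an illustrative simplification, not the authors' procedure: the pseudocount, the function names, and the tiny example sequences are assumptions, and no significance test is applied.

```python
from collections import Counter

def kmer_freqs(seqs, k):
    """Relative frequency of every k-mer over a collection of sequences."""
    counts = Counter()
    for s in seqs:
        s = s.upper()
        for i in range(len(s) - k + 1):
            counts[s[i:i + k]] += 1
    total = sum(counts.values())
    return {kmer: c / total for kmer, c in counts.items()} if total else {}

def relative_abundance(foreground, background, k, pseudo=1e-6):
    """Ratio of foreground to background k-mer frequency (values > 1 mean enriched).

    `pseudo` is a small assumed pseudocount that avoids division by zero
    for k-mers absent from one of the two sets.
    """
    fg, bg = kmer_freqs(foreground, k), kmer_freqs(background, k)
    return {kmer: (fg.get(kmer, 0.0) + pseudo) / (bg.get(kmer, 0.0) + pseudo)
            for kmer in set(fg) | set(bg)}

# Made-up sequences, for illustration only.
foreground = ["AAAATTTTAAAT", "TTTTAAAATTTA"]
background = ["ACGTACGTACGT", "GATTACAGATTA"]
ratios = relative_abundance(foreground, background, 2)
for kmer, ratio in sorted(ratios.items(), key=lambda kv: -kv[1])[:3]:
    print(kmer, round(ratio, 2))
```

    Because such ratios mix motif-specific signal with overall base composition, a composition-matched random background (set iii in the abstract) is needed to separate the two effects.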

  12. Network motif identification in stochastic networks

    NASA Astrophysics Data System (ADS)

    Jiang, Rui; Tu, Zhidong; Chen, Ting; Sun, Fengzhu

    2006-06-01

    Network motifs have been identified in a wide range of networks across many scientific disciplines and are suggested to be the basic building blocks of most complex networks. Nonetheless, many networks come with intrinsic and/or experimental uncertainties and should be treated as stochastic networks. The building blocks in these networks thus may also have stochastic properties. In this article, we study stochastic network motifs derived from families of mutually similar but not necessarily identical patterns of interconnections. We establish a finite mixture model for stochastic networks and develop an expectation-maximization algorithm for identifying stochastic network motifs. We apply this approach to the transcriptional regulatory networks of Escherichia coli and Saccharomyces cerevisiae, as well as the protein-protein interaction networks of seven species, and identify several stochastic network motifs that are consistent with current biological knowledge. Keywords: expectation-maximization algorithm | mixture model | transcriptional regulatory network | protein-protein interaction network

  13. DNA Motif Databases and Their Uses.

    PubMed

    Stormo, Gary D

    2015-09-03

    Transcription factors (TFs) recognize and bind to specific DNA sequences. The specificity of a TF is usually represented as a position weight matrix (PWM). Several databases of DNA motifs exist and are used in biological research to address important biological questions. This overview describes PWMs and some of the most commonly used motif databases, as well as a few of their common applications. Copyright © 2015 John Wiley & Sons, Inc.
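
    A position weight matrix can be sketched in a few lines. The example below is a minimal illustration, assuming a uniform background of 0.25 per base and a pseudocount of 1; the count matrix is made up and does not come from any motif database.

```python
import math


def pwm_from_counts(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into log2-odds weights."""
    pwm = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        pwm.append({
            base: math.log2(((column.get(base, 0) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm


def best_hit(pwm, sequence):
    """Return (score, offset) of the highest-scoring window in the sequence."""
    width = len(pwm)
    scored = [
        (sum(pwm[i][sequence[start + i]] for i in range(width)), start)
        for start in range(len(sequence) - width + 1)
    ]
    return max(scored)


# Hypothetical count matrix for a 4-bp site (toy numbers, for illustration only).
toy_counts = [
    {"A": 8, "C": 0, "G": 1, "T": 1},
    {"A": 0, "C": 9, "G": 1, "T": 0},
    {"A": 1, "C": 0, "G": 9, "T": 0},
    {"A": 0, "C": 1, "G": 0, "T": 9},
]
toy_pwm = pwm_from_counts(toy_counts)
score, offset = best_hit(toy_pwm, "TTGACGTAAC")
```

    The best window here is the ACGT starting at offset 3, which is the consensus of the toy matrix. Real scanners typically also score the reverse strand and apply a score threshold.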

  14. Chaotic Motifs in Gene Regulatory Networks

    PubMed Central

    Zhang, Zhaoyang; Ye, Weiming; Qian, Yu; Zheng, Zhigang; Huang, Xuhui; Hu, Gang

    2012-01-01

    Chaos should occur often in gene regulatory networks (GRNs) which have been widely described by nonlinear coupled ordinary differential equations, if their dimensions are no less than 3. It is therefore puzzling that chaos has never been reported in GRNs in nature and is also extremely rare in models of GRNs. On the other hand, the topic of motifs has attracted great attention in studying biological networks, and network motifs are suggested to be elementary building blocks that carry out some key functions in the network. In this paper, chaotic motifs (subnetworks with chaos) in GRNs are systematically investigated. The conclusion is that: (i) chaos can only appear through competitions between different oscillatory modes with rivaling intensities. Conditions required for chaotic GRNs are found to be very strict, which make chaotic GRNs extremely rare. (ii) Chaotic motifs are explored as the simplest few-node structures capable of producing chaos, and serve as the intrinsic source of chaos of random few-node GRNs. Several optimal motifs causing chaos with atypically high probability are figured out. (iii) Moreover, we discovered that a number of special oscillators can never produce chaos. These structures bring some advantages on rhythmic functions and may help us understand the robustness of diverse biological rhythms. (iv) The methods of dominant phase-advanced driving (DPAD) and DPAD time fraction are proposed to quantitatively identify chaotic motifs and to explain the origin of chaotic behaviors in GRNs. PMID:22792171

  15. Chaotic motifs in gene regulatory networks.

    PubMed

    Zhang, Zhaoyang; Ye, Weiming; Qian, Yu; Zheng, Zhigang; Huang, Xuhui; Hu, Gang

    2012-01-01

    Chaos should occur often in gene regulatory networks (GRNs) which have been widely described by nonlinear coupled ordinary differential equations, if their dimensions are no less than 3. It is therefore puzzling that chaos has never been reported in GRNs in nature and is also extremely rare in models of GRNs. On the other hand, the topic of motifs has attracted great attention in studying biological networks, and network motifs are suggested to be elementary building blocks that carry out some key functions in the network. In this paper, chaotic motifs (subnetworks with chaos) in GRNs are systematically investigated. The conclusion is that: (i) chaos can only appear through competitions between different oscillatory modes with rivaling intensities. Conditions required for chaotic GRNs are found to be very strict, which make chaotic GRNs extremely rare. (ii) Chaotic motifs are explored as the simplest few-node structures capable of producing chaos, and serve as the intrinsic source of chaos of random few-node GRNs. Several optimal motifs causing chaos with atypically high probability are figured out. (iii) Moreover, we discovered that a number of special oscillators can never produce chaos. These structures bring some advantages on rhythmic functions and may help us understand the robustness of diverse biological rhythms. (iv) The methods of dominant phase-advanced driving (DPAD) and DPAD time fraction are proposed to quantitatively identify chaotic motifs and to explain the origin of chaotic behaviors in GRNs.

  16. Helix-packing motifs in membrane proteins.

    PubMed

    Walters, R F S; DeGrado, W F

    2006-09-12

    The fold of a helical membrane protein is largely determined by interactions between membrane-imbedded helices. To elucidate recurring helix-helix interaction motifs, we dissected the crystallographic structures of membrane proteins into a library of interacting helical pairs. The pairs were clustered according to their three-dimensional similarity (rmsd motifs whose structural features can be understood in terms of simple principles of helix-helix packing. Thus, the universe of common transmembrane helix-pairing motifs is relatively simple. The largest cluster, which comprises 29% of the library members, consists of an antiparallel motif with left-handed packing angles, and it is frequently stabilized by packing of small side chains occurring every seven residues in the sequence. Right-handed parallel and antiparallel structures show a similar tendency to segregate small residues to the helix-helix interface but spaced at four-residue intervals. Position-specific sequence propensities were derived for the most populated motifs. These structural and sequential motifs should be quite useful for the design and structural prediction of membrane proteins.

  17. MotifHyades: Expectation Maximization for de novo DNA Motif Pair Discovery on Paired Sequences.

    PubMed

    Wong, Ka-Chun

    2017-06-13

    In higher eukaryotes, protein-DNA binding interactions are the central activities in gene regulation. In particular, DNA motifs such as transcription factor binding sites are the key components in gene transcription. Harnessing the recently available chromatin interaction data, computational methods are desired for systematically identifying the coupling DNA motif pairs enriched on long-range chromatin-interacting sequence pairs (e.g. promoter-enhancer pairs). To fill the void, a novel probabilistic model (namely, MotifHyades) is proposed and developed for de novo DNA motif pair discovery on paired sequences. In particular, two expectation maximization algorithms are derived for efficient model training with linear computational complexity. Under diverse scenarios, MotifHyades is demonstrated to be faster and more accurate than the existing ad hoc computational pipeline. In addition, MotifHyades is applied to discover thousands of DNA motif pairs with higher gold standard motif matching ratio, higher DNase accessibility, and higher evolutionary conservation than the previous ones in the human K562 cell line. Lastly, it has been run on five other human cell lines (i.e. GM12878, HeLa-S3, HUVEC, IMR90, and NHEK), revealing thousands more novel DNA motif pairs which are characterized across a broad spectrum of genomic features on long-range promoter-enhancer pairs. The matrix-algebra-optimized versions of MotifHyades and the discovered DNA motif pairs can be found at http://bioinfo.cs.cityu.edu.hk/MotifHyades . Contact: kc.w@cityu.edu.hk. Supplementary data are available at Bioinformatics online.

  18. iMotifs: an integrated sequence motif visualization and analysis environment

    PubMed Central

    Piipari, Matias; Down, Thomas A.; Saini, Harpreet; Enright, Anton; Hubbard, Tim J.P.

    2010-01-01

    Motivation: Short sequence motifs are an important class of models in molecular biology, used most commonly for describing transcription factor binding site specificity patterns. High-throughput methods have been recently developed for detecting regulatory factor binding sites in vivo and in vitro and consequently high-quality binding site motif data are becoming available for increasing number of organisms and regulatory factors. Development of intuitive tools for the study of sequence motifs is therefore important. iMotifs is a graphical motif analysis environment that allows visualization of annotated sequence motifs and scored motif hits in sequences. It also offers motif inference with the sensitive NestedMICA algorithm, as well as overrepresentation and pairwise motif matching capabilities. All of the analysis functionality is provided without the need to convert between file formats or learn different command line interfaces. The application includes a bundled and graphically integrated version of the NestedMICA motif inference suite that has no outside dependencies. Problems associated with local deployment of software are therefore avoided. Availability: iMotifs is licensed with the GNU Lesser General Public License v2.0 (LGPL 2.0). The software and its source is available at http://wiki.github.com/mz2/imotifs and can be run on Mac OS X Leopard (Intel/PowerPC). We also provide a cross-platform (Linux, OS X, Windows) LGPL 2.0 licensed library libxms for the Perl, Ruby, R and Objective-C programming languages for input and output of XMS formatted annotated sequence motif set files. Contact: matias.piipari@gmail.com; imotifs@googlegroups.com PMID:20106815

  19. Inhibition de la corrosion d'acier au carbone en milieu H3PO4 2M par des composés organiques de type "triazine"

    NASA Astrophysics Data System (ADS)

    Bekkouch, K.; Aouniti, A.; Hammouti, B.; Kertit, S.

    1999-05-01

    The effect of addition of some triazine compounds on the corrosion behaviour of steel in 2M H3PO4 has been studied by weight loss and electrochemical polarisation methods. Both methods showed that the dissolution rate was dependent on the chemical properties and concentration of the product. From comparison of results, it was found that 6-azathymine (T6) is the best inhibitor and its inhibition efficiency reaches a maximum value of 86% at 10⁻³ M. Polarisation measurements indicated that T6 acts as cathodic inhibitor by merely blocking the reaction sites without changing the mechanism of the hydrogen evolution reaction. It was found that T6 was adsorbed on steel surface according to a Langmuir isotherm model. The effect of temperature indicated that inhibition efficiency of T6 is dependent on the temperature in the range 25-50 °C. The effect of adding certain organic triazine-type compounds on the corrosion of a steel in 2M H3PO4 was studied using electrochemical and gravimetric methods. The results showed that the dissolution rate of the steel depends on the molecular structure and concentration of the product. Comparison of the inhibition efficiencies shows that 6-azathymine (T6) is the best inhibitor of the triazine series tested. The inhibition efficiency of T6 reaches a maximum value of 86% at 10⁻³ M. The shape of the polarisation curves indicates that T6 acts essentially as a cathodic-type inhibitor by adsorption on the steel surface according to the Langmuir isotherm model. The inhibition efficiency of T6 depends on temperature in the range from 25 to 50 °C.

  20. Characteristic motifs for families of allergenic proteins

    PubMed Central

    Ivanciuc, Ovidiu; Garcia, Tzintzuni; Torres, Miguel; Schein, Catherine H.; Braun, Werner

    2008-01-01

    The identification of potential allergenic proteins is usually done by scanning a database of allergenic proteins and locating known allergens with a high sequence similarity. However, there is no universally accepted cut-off value for sequence similarity to indicate potential IgE cross-reactivity. Further, overall sequence similarity may be less important than discrete areas of similarity in proteins with homologous structure. To identify such areas, we first classified all allergens and their subdomains in the Structural Database of Allergenic Proteins (SDAP, http://fermi.utmb.edu/SDAP/) to their closest protein families as defined in Pfam, and identified conserved physicochemical property motifs characteristic of each group of sequences. Allergens populate only a small subset of all known Pfam families, as all allergenic proteins in SDAP could be grouped to only 130 (of 9318 total) Pfams, and 31 families contain more than four allergens. Conserved physicochemical property motifs for the aligned sequences of the most populated Pfam families were identified with the PCPMer program suite and catalogued in the webserver Motif-Mate (http://born.utmb.edu/motifmate/summary.php). We also determined specific motifs for allergenic members of a family that could distinguish them from non-allergenic ones. These allergen specific motifs should be most useful in database searches for potential allergens. We found that sequence motifs unique to the allergens in three families (seed storage proteins, Bet v 1, and tropomyosin) overlap with known IgE epitopes, thus providing evidence that our motif based approach can be used to assess the potential allergenicity of novel proteins. PMID:18951633

  1. Modeling gene regulatory network motifs using statecharts

    PubMed Central

    2012-01-01

    Background: Gene regulatory networks are widely used by biologists to describe the interactions among genes, proteins and other components at the intra-cellular level. Recently, a great effort has been devoted to give gene regulatory networks a formal semantics based on existing computational frameworks. For this purpose, we consider Statecharts, which are a modular, hierarchical and executable formal model widely used to represent software systems. We use Statecharts for modeling small and recurring patterns of interactions in gene regulatory networks, called motifs. Results: We present an improved method for modeling gene regulatory network motifs using Statecharts and we describe the successful modeling of several motifs, including those which could not be modeled or whose models could not be distinguished using the method of a previous proposal. We model motifs in an easy and intuitive way by taking advantage of the visual features of Statecharts. Our modeling approach is able to simulate some interesting temporal properties of gene regulatory network motifs: the delay in the activation and the deactivation of the "output" gene in the coherent type-1 feedforward loop, the pulse in the incoherent type-1 feedforward loop, the bistability nature of double positive and double negative feedback loops, the oscillatory behavior of the negative feedback loop, and the "lock-in" effect of positive autoregulation. Conclusions: We present a Statecharts-based approach for the modeling of gene regulatory network motifs in biological systems. The basic motifs used to build more complex networks (that is, simple regulation, reciprocal regulation, feedback loop, feedforward loop, and autoregulation) can be faithfully described and their temporal dynamics can be analyzed. PMID:22536967
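
    The delay behaviour of the coherent type-1 feedforward loop mentioned above can be illustrated with a toy simulation. The sketch below uses a synchronous Boolean update, not Statecharts, so it is a stand-in for the paper's formalism rather than a reproduction of it. The function name, the one-step update rule, and the choice between an AND gate and an OR gate at the output are assumptions of the sketch.

```python
def simulate_c1_ffl(x_input, gate="and"):
    """Synchronously update Y and Z for a Boolean input trace of X.

    X activates Y and Z; Z combines X and Y through an AND or OR gate.
    Returns the value of Z observed at each time step.
    """
    y, z = False, False
    z_trace = []
    for x in x_input:
        z_trace.append(z)
        combined = (x and y) if gate == "and" else (x or y)
        # Both genes respond one step after their inputs change.
        y, z = x, combined
    return z_trace


# Hypothetical input: X is on for four steps, then off for three.
x_trace = [True] * 4 + [False] * 3
z_and = simulate_c1_ffl(x_trace, gate="and")
z_or = simulate_c1_ffl(x_trace, gate="or")
```

    With the AND gate, Z switches on two steps after X (one step later than direct regulation would give) but switches off promptly. With the OR gate the extra delay appears on deactivation instead.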

  2. Identifying combinatorial regulation of transcription factors and binding motifs

    PubMed Central

    Kato, Mamoru; Hata, Naoya; Banerjee, Nilanjana; Futcher, Bruce; Zhang, Michael Q

    2004-01-01

    Background: Combinatorial interaction of transcription factors (TFs) is important for gene regulation. Although various genomic datasets are relevant to this issue, each dataset provides relatively weak evidence on its own. Developing methods that can integrate different sequence, expression and localization data has become important. Results: Here we use a novel method that integrates chromatin immunoprecipitation (ChIP) data with microarray expression data and with combinatorial TF-motif analysis. We systematically identify combinations of transcription factors and of motifs. The various combinations of TFs involved multiple binding mechanisms. We reconstruct a new combinatorial regulatory map of the yeast cell cycle in which cell-cycle regulation can be drawn as a chain of extended TF modules. We find that the pairwise combination of a TF for an early cell-cycle phase and a TF for a later phase is often used to control gene expression at intermediate times. Thus the number of distinct times of gene expression is greater than the number of transcription factors. We also see that some TF modules control branch points (cell-cycle entry and exit), and in the presence of appropriate signals they can allow progress along alternative pathways. Conclusions: Combining different data sources can increase statistical power as demonstrated by detecting TF interactions and composite TF-binding motifs. The original picture of a chain of simple cell-cycle regulators can be extended to a chain of composite regulatory modules: different modules may share a common TF component in the same pathway or a TF component cross-talking to other pathways. PMID:15287978

  3. The Verrucomicrobia LexA-Binding Motif: Insights into the Evolutionary Dynamics of the SOS Response

    PubMed Central

    Erill, Ivan; Campoy, Susana; Kılıç, Sefa; Barbé, Jordi

    2016-01-01

    The SOS response is the primary bacterial mechanism to address DNA damage, coordinating multiple cellular processes that include DNA repair, cell division, and translesion synthesis. In contrast to other regulatory systems, the composition of the SOS genetic network and the binding motif of its transcriptional repressor, LexA, have been shown to vary greatly across bacterial clades, making it an ideal system to study the co-evolution of transcription factors and their regulons. Leveraging comparative genomics approaches and prior knowledge on the core SOS regulon, here we define the binding motif of the Verrucomicrobia, a recently described phylum of emerging interest due to its association with eukaryotic hosts. Site directed mutagenesis of the Verrucomicrobium spinosum recA promoter confirms that LexA binds a 14 bp palindromic motif with consensus sequence TGTTC-N4-GAACA. Computational analyses suggest that recognition of this novel motif is determined primarily by changes in base-contacting residues of the third alpha helix of the LexA helix-turn-helix DNA binding motif. In conjunction with comparative genomics analysis of the LexA regulon in the Verrucomicrobia phylum, electrophoretic shift assays reveal that LexA binds to operators in the promoter region of DNA repair genes and a mutagenesis cassette in this organism, and identify previously unreported components of the SOS response. The identification of tandem LexA-binding sites generating instances of other LexA-binding motifs in the lexA gene promoter of Verrucomicrobia species leads us to postulate a novel mechanism for LexA-binding motif evolution. This model, based on gene duplication, successfully addresses outstanding questions in the intricate co-evolution of the LexA protein, its binding motif and the regulatory network it controls. PMID:27489856
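
    The consensus TGTTC-N4-GAACA quoted above is palindromic in the DNA sense: the second half-site is the reverse complement of the first. A short sketch can check this and scan a sequence for exact consensus matches. The helper names and the toy sequence are assumptions, and real binding sites are degenerate, so exact matching would miss most of them.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


# Lookahead so that overlapping sites would also be reported.
LEXA_CONSENSUS = re.compile(r"(?=(TGTTC[ACGT]{4}GAACA))")


def find_consensus_sites(sequence):
    """Return (offset, site) for every exact match to TGTTC-N4-GAACA."""
    return [(m.start(), m.group(1)) for m in LEXA_CONSENSUS.finditer(sequence.upper())]


# Hypothetical promoter fragment, for illustration only.
toy_promoter = "ccatTGTTCagctGAACAttg"
hits = find_consensus_sites(toy_promoter)
```

    The single hit is the 14 bp site at offset 4, and `reverse_complement("TGTTC")` returns `GAACA`, which is what makes the motif an inverted repeat.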

  4. Unsupervised statistical discovery of spaced motifs in prokaryotic genomes.

    PubMed

    Tong, Hao; Schliekelman, Paul; Mrázek, Jan

    2017-01-05

    DNA sequences contain repetitive motifs which have various functions in the physiology of the organism. A number of methods have been developed for discovery of such sequence motifs with a primary focus on detection of regulatory motifs and particularly transcription factor binding sites. Most motif-finding methods apply probabilistic models to detect motifs characterized by unusually high number of copies of the motif in the analyzed sequences. We present a novel method for detection of pairs of motifs separated by spacers of variable nucleotide sequence but conserved length. Unlike existing methods for motif discovery, the motifs themselves are not required to occur at unusually high frequency but only to exhibit a significant preference to occur at a specific distance from each other. In the present implementation of the method, motifs are represented by pentamers and all pairs of pentamers are evaluated for statistically significant preference for a specific distance. An important step of the algorithm eliminates motif pairs where the spacers separating the two motifs exhibit a high degree of sequence similarity; such motif pairs likely arise from duplications of the whole segment including the motifs and the spacer rather than due to selective constraints indicative of a functional importance of the motif pair. The method was used to scan 569 complete prokaryotic genomes for novel sequence motifs. Some motifs detected were previously known but other motifs found in the search appear to be novel. Selected motif pairs were subjected to further investigation and in some cases their possible biological functions were proposed. We present a new motif-finding technique that is applicable to scanning complete genomes for sequence motifs. The results from analysis of 569 genomes suggest that the method detects previously known motifs that are expected to be found as well as new motifs that are unlikely to be discovered by traditional motif-finding methods. 
We conclude
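
    The core idea of the method, a preferred distance between two pentamers, can be sketched as a simple tally of spacer lengths. This is only the counting step: the statistical test and the spacer-similarity filter described in the abstract are not reproduced, and the function name, the `max_spacer` limit, and the toy sequence are assumptions.

```python
from collections import Counter


def spacer_histogram(sequence, left, right, max_spacer=30):
    """Count how often `right` starts exactly d bases after `left` ends."""
    k = len(left)
    left_starts = [i for i in range(len(sequence) - k + 1) if sequence.startswith(left, i)]
    histogram = Counter()
    for start in left_starts:
        for d in range(max_spacer + 1):
            if sequence.startswith(right, start + k + d):
                histogram[d] += 1
    return histogram


# Hypothetical sequence: three pentamer pairs, each separated by a different
# 6-bp spacer, so the spacer length is conserved but its sequence is not.
toy_genome = (
    "TTGAC" + "GGCCGG" + "TATAA" + "CC"
    + "TTGAC" + "ACACAC" + "TATAA" + "CC"
    + "TTGAC" + "GTGTGT" + "TATAA"
)
histogram = spacer_histogram(toy_genome, "TTGAC", "TATAA")
```

    In the toy sequence the spacer length 6 is seen three times, while the only other observed distance (24, between non-adjacent copies) is seen twice. A real analysis would compare such counts with an expectation under a null model.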

  5. Sequential motif profile of natural visibility graphs.

    PubMed

    Iacovacci, Jacopo; Lacasa, Lucas

    2016-11-01

    The concept of sequential visibility graph motifs (subgraphs appearing with characteristic frequencies in the visibility graphs associated to time series) has been advanced recently along with a theoretical framework to compute analytically the motif profiles associated to horizontal visibility graphs (HVGs). Here we develop a theory to compute the profile of sequential visibility graph motifs in the context of natural visibility graphs (VGs). This theory gives exact results for deterministic aperiodic processes with a smooth invariant density or stochastic processes that fulfill the Markov property and have a continuous marginal distribution. The framework also allows for a linear time numerical estimation in the case of empirical time series. A comparison between the HVG and the VG case (including evaluation of their robustness for short series polluted with measurement noise) is also presented.
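
    For readers unfamiliar with natural visibility graphs, the sketch below builds one from a short series using the standard visibility criterion: two samples are linked if every sample between them lies strictly below the straight line joining them. It only constructs the graph and does not compute motif profiles. It is a naive implementation, not the linear-time estimation mentioned in the abstract, and the function name and toy series are assumptions.

```python
def natural_visibility_edges(series):
    """Return the edges (a, b), a < b, of the natural visibility graph."""
    n = len(series)
    edges = []
    for a in range(n):
        for b in range(a + 1, n):
            # Sample c is an obstacle unless it lies strictly below the
            # line segment joining (a, series[a]) and (b, series[b]).
            visible = all(
                series[c] < series[b] + (series[a] - series[b]) * (b - c) / (b - a)
                for c in range(a + 1, b)
            )
            if visible:
                edges.append((a, b))
    return edges


# Hypothetical four-point series.
toy_series = [4, 1, 3, 2]
edges = natural_visibility_edges(toy_series)
```

    Here the third sample (value 3) blocks the view from the first two samples to the last one, so the pairs (0, 3) and (1, 3) are not connected.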

  6. The telomere repeat motif of basal Metazoa.

    PubMed

    Traut, Walther; Szczepanowski, Monika; Vítková, Magda; Opitz, Christian; Marec, Frantisek; Zrzavý, Jan

    2007-01-01

    In most eukaryotes the telomeres consist of short DNA tandem repeats and associated proteins. Telomeric repeats are added to the chromosome ends by telomerase, a specialized reverse transcriptase. We examined telomerase activity and telomere repeat sequences in representatives of basal metazoan groups. Our results show that the 'vertebrate' telomere motif (TTAGGG)n is present in all basal metazoan groups, i.e. sponges, Cnidaria, Ctenophora, and Placozoa, and also in the unicellular metazoan sister group, the Choanozoa. Thus it can be considered the ancestral telomere repeat motif of Metazoa. It has been conserved from the metazoan radiation in most animal phylogenetic lineages, and replaced by other motifs, according to our present knowledge, only in two major lineages, Arthropoda and Nematoda.

  7. Motif-based embedding for graph clustering

    NASA Astrophysics Data System (ADS)

    Lim, Sungsu; Lee, Jae-Gil

    2016-12-01

    Community detection in complex networks is a fundamental problem that has been extensively studied owing to its wide range of applications. However, because community detection methods typically rely on the relations between vertices in networks, they may fail to discover higher-order graph substructures, called the network motifs. In this paper, we propose a novel embedding method for graph clustering that considers higher-order relationships involving multiple vertices. We show that our embedding method, which we call motif-based embedding, is more effective in detecting communities than existing graph embedding methods, spectral embedding and force-directed embedding, both theoretically and experimentally.

  8. MoTeX-II: structured MoTif eXtraction from large-scale datasets.

    PubMed

    Pissis, Solon P

    2014-07-08

    Identifying repeated factors that occur in a string of letters or common factors that occur in a set of strings represents an important task in computer science and biology. Such patterns are called motifs, and the process of identifying them is called motif extraction. In biology, motif extraction constitutes a fundamental step in understanding regulation of gene expression. State-of-the-art tools for motif extraction have their own constraints. Most of these tools are only designed for single motif extraction; structured motifs additionally allow for distance intervals between their single motif components. Moreover, motif extraction from large-scale datasets (for instance, large-scale ChIP-Seq datasets) cannot be performed by current tools. Other constraints include high time and/or space complexity for identifying long motifs with higher error thresholds. In this article, we introduce MoTeX-II, a word-based high-performance computing tool for structured MoTif eXtraction from large-scale datasets. Similar to its predecessor for single motif extraction, it uses state-of-the-art algorithms for solving the fixed-length approximate string matching problem. It produces similar and partially identical results to state-of-the-art tools for structured motif extraction with respect to accuracy as quantified by statistical significance measures. Moreover, we show that it matches or outperforms these tools in terms of runtime efficiency by merging single motif occurrences efficiently. MoTeX-II comes in three flavors: a standard CPU version; an OpenMP-based version; and an MPI-based version. For instance, the MPI-based version of MoTeX-II requires only a couple of hours to process all human genes for structured motif extraction on 1056 processors, while current sequential tools require more than a week for this task. Finally, we show that MoTeX-II is successful in extracting known composite transcription factor binding sites from real datasets.
Use of MoTeX-II in biological
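
    The fixed-length approximate string matching problem named in this abstract can be stated concretely with a naive sketch under Hamming distance. This is a quadratic-time illustration of the problem, not the algorithm MoTeX-II uses, and the function name and toy strings are assumptions.

```python
def hamming_matches(text, pattern, max_mismatches):
    """Return (offset, mismatches) for every window of the text that matches
    the pattern with at most `max_mismatches` substitutions."""
    m = len(pattern)
    hits = []
    for start in range(len(text) - m + 1):
        window = text[start:start + m]
        mismatches = sum(1 for x, y in zip(window, pattern) if x != y)
        if mismatches <= max_mismatches:
            hits.append((start, mismatches))
    return hits


# Hypothetical text and pattern, for illustration only.
toy_text = "ACGTTGCAACGA"
hits = hamming_matches(toy_text, "ACGT", max_mismatches=1)
```

    The pattern occurs exactly at offset 0 and with one substitution at offset 8. A structured motif would additionally constrain the distance between two such occurrences.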

  9. Calendar motifs on Getashen hydria

    NASA Astrophysics Data System (ADS)

    Vrtanesyan, Garegin

    2015-07-01

    The Getashen hydria was found in Middle Bronze Age tombs (the first third of the second millennium B.C.) in Armenia (Lake Sevan). It shows a scene consisting of three friezes. The lower frieze depicts six zoomorphic figures, the middle frieze six waterfowl, and the top frieze graphic signs. The calendar motifs of this composition have a numeric expression: six figures on each of the lower and middle friezes. Division of the annual cycle into two parts ("the great summer" and "the great winter") is known from ancient Indo-Iranian calendars. The animals on the lower frieze mark the second, "winter" road of the Sun, because the events most important for the reproduction of the society's economy occur in this period: the rut of ungulates, both wild (deer) and domestic (goats). Moreover, the rut of goats ends in December, almost coinciding with the onset of the winter solstice. A pair of dogs on the lower frieze marks a version of the myth of the hero imprisoned in the rock, the Sun (Mihr - Artavazd), whose dogs must chew through his chains in anticipation of his release at the winter solstice. This is indicated by the direction of their movement: for an observer, the Sun moves from left to right only when it is in the southern part of the sky (i.e., beginning with the autumnal equinox). The most important event of the "summer road" of the Sun is the vernal equinox, which coincides with the arrival of waterfowl (ducks, geese). Their direction on the second frieze (left to right) corresponds to the position of an observer facing north.

  10. MEME SUITE: tools for motif discovery and searching.

    PubMed

    Bailey, Timothy L; Boden, Mikael; Buske, Fabian A; Frith, Martin; Grant, Charles E; Clementi, Luca; Ren, Jingyuan; Li, Wilfred W; Noble, William S

    2009-07-01

    The MEME Suite web server provides a unified portal for online discovery and analysis of sequence motifs representing features such as DNA binding sites and protein interaction domains. The popular MEME motif discovery algorithm is now complemented by the GLAM2 algorithm which allows discovery of motifs containing gaps. Three sequence scanning algorithms--MAST, FIMO and GLAM2SCAN--allow scanning numerous DNA and protein sequence databases for motifs discovered by MEME and GLAM2. Transcription factor motifs (including those discovered using MEME) can be compared with motifs in many popular motif databases using the motif database scanning algorithm TOMTOM. Transcription factor motifs can be further analyzed for putative function by association with Gene Ontology (GO) terms using the motif-GO term association tool GOMO. MEME output now contains sequence LOGOS for each discovered motif, as well as buttons to allow motifs to be conveniently submitted to the sequence and motif database scanning algorithms (MAST, FIMO and TOMTOM), or to GOMO, for further analysis. GLAM2 output similarly contains buttons for further analysis using GLAM2SCAN and for rerunning GLAM2 with different parameters. All of the motif-based tools are now implemented as web services via Opal. Source code, binaries and a web server are freely available for noncommercial use at http://meme.nbcr.net.

  11. MEME Suite: tools for motif discovery and searching

    PubMed Central

    Bailey, Timothy L.; Boden, Mikael; Buske, Fabian A.; Frith, Martin; Grant, Charles E.; Clementi, Luca; Ren, Jingyuan; Li, Wilfred W.; Noble, William S.

    2009-01-01

    The MEME Suite web server provides a unified portal for online discovery and analysis of sequence motifs representing features such as DNA binding sites and protein interaction domains. The popular MEME motif discovery algorithm is now complemented by the GLAM2 algorithm which allows discovery of motifs containing gaps. Three sequence scanning algorithms—MAST, FIMO and GLAM2SCAN—allow scanning numerous DNA and protein sequence databases for motifs discovered by MEME and GLAM2. Transcription factor motifs (including those discovered using MEME) can be compared with motifs in many popular motif databases using the motif database scanning algorithm Tomtom. Transcription factor motifs can be further analyzed for putative function by association with Gene Ontology (GO) terms using the motif-GO term association tool GOMO. MEME output now contains sequence LOGOS for each discovered motif, as well as buttons to allow motifs to be conveniently submitted to the sequence and motif database scanning algorithms (MAST, FIMO and Tomtom), or to GOMO, for further analysis. GLAM2 output similarly contains buttons for further analysis using GLAM2SCAN and for rerunning GLAM2 with different parameters. All of the motif-based tools are now implemented as web services via Opal. Source code, binaries and a web server are freely available for noncommercial use at http://meme.nbcr.net. PMID:19458158

  12. Genomic analysis of membrane protein families: abundance and conserved motifs

    PubMed Central

    Liu, Yang; Engelman, Donald M; Gerstein, Mark

    2002-01-01

    Background: Polytopic membrane proteins can be related to each other on the basis of the number of transmembrane helices and sequence similarities. Building on the Pfam classification of protein domain families, and using transmembrane-helix prediction and sequence-similarity searching, we identified a total of 526 well-characterized membrane protein families in 26 recently sequenced genomes. To this we added a clustering of a number of predicted but unclassified membrane proteins, resulting in a total of 637 membrane protein families. Results: Analysis of the occurrence and composition of these families revealed several interesting trends. The number of assigned membrane protein domains has an approximately linear relationship to the total number of open reading frames (ORFs) in 26 genomes studied. Caenorhabditis elegans is an apparent outlier, because of its high representation of seven-span transmembrane (7-TM) chemoreceptor families. In all genomes, including that of C. elegans, the number of distinct membrane protein families has a logarithmic relation to the number of ORFs. Glycine, proline, and tyrosine locations tend to be conserved in transmembrane regions within families, whereas isoleucine, valine, and methionine locations are relatively mutable. Analysis of motifs in putative transmembrane helices reveals that GxxxG and GxxxxxxG (which can be written GG4 and GG7, respectively; see Materials and methods) are among the most prevalent. This was noted in earlier studies; we now find these motifs are particularly well conserved in families, however, especially those corresponding to transporters, symporters, and channels. Conclusions: We carried out a genome-wide analysis on patterns of the classified polytopic membrane protein families and analyzed the distribution of conserved amino acids and motifs in the transmembrane helix regions in these families. PMID:12372142
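
    The GG4 and GG7 notation in the abstract above can be illustrated with a minimal Python sketch. It assumes that GGn means a glycine at position i and another at position i + n (so GxxxG is spacing 4 and GxxxxxxG is spacing 7). The function name `count_spaced_pair` and the example segment are hypothetical and are not taken from the paper, whose actual counting procedure is not described here.

```python
def count_spaced_pair(seq: str, residue: str = "G", spacing: int = 4) -> int:
    """Count positions i where seq[i] and seq[i + spacing] are both `residue`.

    Overlapping occurrences are counted, e.g. GxxxGxxxG contains two GG4 pairs.
    """
    seq = seq.upper()
    return sum(
        1
        for i in range(len(seq) - spacing)
        if seq[i] == residue and seq[i + spacing] == residue
    )


# Hypothetical helix-like segment, invented for illustration only.
helix = "LIGAVLGSALGVAIGLLAG"
gg4 = count_spaced_pair(helix, "G", 4)  # GxxxG pairs
gg7 = count_spaced_pair(helix, "G", 7)  # GxxxxxxG pairs
print(gg4, gg7)
```

    A positional scan is used rather than a regular expression because overlapping pairs would otherwise need a lookahead pattern to be counted.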

  13. Network motifs modulate druggability of cellular targets

    PubMed Central

    Wu, Fan; Ma, Cong; Tan, Cheemeng

    2016-01-01

    Druggability refers to the capacity of a cellular target to be modulated by a small-molecule drug. To date, druggability is mainly studied by focusing on direct binding interactions between a drug and its target. However, druggability is impacted by cellular networks connected to a drug target. Here, we use computational approaches to reveal basic principles of network motifs that modulate druggability. Through quantitative analysis, we find that inhibiting self-positive feedback loop is a more robust and effective treatment strategy than inhibiting other regulations, and adding direct regulations to a drug-target generally reduces its druggability. The findings are explained through analytical solution of the motifs. Furthermore, we find that a consensus topology of highly druggable motifs consists of a negative feedback loop without any positive feedback loops, and consensus motifs with low druggability have multiple positive direct regulations and positive feedback loops. Based on the discovered principles, we predict potential genetic targets in Escherichia coli that have either high or low druggability based on their network context. Our work establishes the foundation toward identifying and predicting druggable targets based on their network topology. PMID:27824147

  14. Motifs and structural blocks retrieval by GHT

    NASA Astrophysics Data System (ADS)

    Cantoni, Virginio; Ferone, Alessio; Petrosino, Alfredo; Polat, Ozlem

    2014-06-01

    The structure of a protein gives more insight into the protein function than its amino acid sequence. Protein structure analysis and comparison are important for understanding the evolutionary relationships among proteins, predicting protein functions, and predicting protein folding. Proteins are formed by two basic regular 3D structural patterns, called Secondary Structures (SSs): helices and sheets. A structural motif is a compact 3D protein block referring to a small specific combination of secondary structural elements, which appears in a variety of molecules. In this paper we compare a few approaches for motif retrieval based on the Generalized Hough Transform (GHT). A primary technique is to adopt the single SS as the structural primitive; alternatives are to adopt an SS pair as the primitive structural element, or an SS triplet, and so on up to an entire motif. The richer the primitive, the higher the time for pre-analysis and search, and the simpler the inspection process on the parameter space for analyzing the peaks. Performance comparisons, in terms of precision and computation time, are presented here considering the retrieval of motifs composed of three to five SSs for more than 15 million searches. The approach can be easily applied to the retrieval of larger blocks, up to protein domains, or even entire proteins.

  15. DNA motif elucidation using belief propagation.

    PubMed

    Wong, Ka-Chun; Chan, Tak-Ming; Peng, Chengbin; Li, Yue; Zhang, Zhaolei

    2013-09-01

    Protein-binding microarray (PBM) is a high-throughput platform that can measure the DNA-binding preference of a protein in a comprehensive and unbiased manner. A typical PBM experiment can measure binding signal intensities of a protein to all the possible DNA k-mers (k=8∼10); such comprehensive binding affinity data usually need to be reduced and represented as motif models before they can be further analyzed and applied. Since proteins can often bind to DNA in multiple modes, one of the major challenges is to decompose the comprehensive affinity data into multimodal motif representations. Here, we describe a new algorithm that uses Hidden Markov Models (HMMs) and can derive precise and multimodal motifs using belief propagations. We describe an HMM-based approach using belief propagations (kmerHMM), which accepts and preprocesses PBM probe raw data into median-binding intensities of individual k-mers. The k-mers are ranked and aligned for training an HMM as the underlying motif representation. Multiple motifs are then extracted from the HMM using belief propagations. Comparisons of kmerHMM with other leading methods on several data sets demonstrated its effectiveness and uniqueness. In particular, it achieved the best performance on more than half of the data sets. In addition, the multiple binding modes derived by kmerHMM are biologically meaningful and will be useful in interpreting other genome-wide data such as those generated from ChIP-seq. The executables and source code are available at the authors' websites: e.g. http://www.cs.toronto.edu/~wkc/kmerHMM.

  16. A survey of motif finding Web tools for detecting binding site motifs in ChIP-Seq data

    PubMed Central

    2014-01-01

    ChIP-Seq (chromatin immunoprecipitation sequencing) offers an advantage for motif finding, because ChIP-Seq experiments narrow the search down to binding site locations. Recent motif finding tools facilitate motif detection by providing user-friendly Web interfaces. In this work, we reviewed nine motif finding Web tools that are capable of detecting binding site motifs in ChIP-Seq data. We showed that each motif finding Web tool has its own advantages for detecting motifs that other tools may not discover. We recommend that users apply multiple motif finding Web tools that implement different algorithms to obtain significant motifs, overlapping similar motifs, and non-overlapping motifs. Finally, we provide suggestions for the future development of motif finding Web tools that better assist researchers in finding motifs in ChIP-Seq data. Reviewers: This article was reviewed by Prof. Sandor Pongor, Dr. Yuriy Gusev, and Dr. Shyam Prabhakar (nominated by Prof. Limsoon Wong). PMID:24555784

  17. CombiMotif: A new algorithm for network motifs discovery in protein-protein interaction networks

    NASA Astrophysics Data System (ADS)

    Luo, Jiawei; Li, Guanghui; Song, Dan; Liang, Cheng

    2014-12-01

    Discovering motifs in protein-protein interaction networks is becoming a major challenge in computational biology, since the distribution of the number of network motifs can reveal significant systemic differences among species. However, this task can be computationally expensive because of the involvement of graph isomorphism detection. In this paper, we present a new algorithm (CombiMotif) that incorporates combinatorial techniques to count non-induced occurrences of subgraph topologies in the form of trees. The efficiency of our algorithm is demonstrated by comparing the obtained results with the current state-of-the-art subgraph counting algorithms. We also show major differences between unicellular and multicellular organisms. The datasets and source code of CombiMotif are freely available upon request.

  18. Composites

    NASA Astrophysics Data System (ADS)

    Taylor, John G.

    The Composites market is arguably the most challenging and profitable market for phenolic resins aside from electronics. The variety of products and processes encountered creates the challenges, and the demand for high performance in critical operations brings value. Phenolic composite materials are rendered into a wide range of components to supply a diverse and fragmented commercial base that includes customers in aerospace (Space Shuttle), aircraft (interiors and brakes), mass transit (interiors), defense (blast protection), marine, mine ducting, off-shore (ducts and grating) and infrastructure (architectural) to name a few. For example, phenolic resin is a critical adhesive in the manufacture of honeycomb sandwich panels. Various solvent and water based resins are described along with resin characteristics and the role of metal ions for enhanced thermal stability of the resin used to coat the honeycomb. Featured new developments include pultrusion of phenolic grating, success in RTM/VARTM fabricated parts, new ballistic developments for military vehicles and high char yield carbon-carbon composites along with many others. Additionally, global regional market resin volumes and sales are presented and compared with other thermosetting resin systems.

  19. Strength of Bolted Joints in Laminated Composites

    DTIC Science & Technology

    1984-03-01

    considered (σ13 = σ23 = σ33 = 0). Under these conditions, in the absence of body forces, the condition of force equilibrium can be expressed as E18 ] (1) Bo2...Conference), ASTM STP 617, 1977, pp 229-242. 14. I.M. Daniel, R.E. Rowlands, and J.B. Whiteside, "Effects of Material and Stacking Sequence on the...Whitney, "Uniaxial Failure of Composite Laminates Containing Stress Concentrations," Fracture Mechanics of Composites, ASTM STP 593, 1975, pp. 117-142

  20. Pressure-dependent formation of i-motif and G-quadruplex DNA structures.

    PubMed

    Takahashi, S; Sugimoto, N

    2015-12-14

    Pressure is an important physical stimulus that can influence the fate of cells by causing structural changes in biomolecules such as DNA. We investigated the effect of high pressure on the folding of duplex, DNA i-motif, and G-quadruplex (G4) structures; the non-canonical structures may be modulators of expression of genes involved in cancer progression. The i-motif structure was stabilized by high pressure, whereas the G4 structure was destabilized. The melting temperature of an intramolecular i-motif formed by 5'-dCGG(CCT)10CGG-3' increased from 38.8 °C at atmospheric pressure to 61.5 °C at 400 MPa. This effect was also observed in the presence of 40 wt% ethylene glycol, a crowding agent. In the presence of 40 wt% ethylene glycol, the G4 structure was less destabilized than in the absence of the crowding agent. P-T stability diagrams of duplex DNA with a telomeric sequence indicated that the duplex is more stable than G4 and i-motif structures under low pressure, but the i-motif dominates the structural composition under high pressure. Under crowding conditions, the P-T diagrams indicated that the duplex does not form under high pressure, and i-motif and G4 structures dominate. Our findings imply that temperature regulates the formation of the duplex structure, whereas pressure triggers the formation of non-canonical DNA structures like i-motif and G4. These results suggest that pressure impacts the function of nucleic acids by stabilizing non-canonical structures; this may be relevant to deep sea organisms and during evolution under prebiotic conditions.

  1. Overlapping ETS and CRE Motifs ((G/C)CGGAAGTGACGTCA) preferentially bound by GABPα and CREB proteins.

    PubMed

    Chatterjee, Raghunath; Zhao, Jianfei; He, Ximiao; Shlyakhtenko, Andrey; Mann, Ishminder; Waterfall, Joshua J; Meltzer, Paul; Sathyanarayana, B K; FitzGerald, Peter C; Vinson, Charles

    2012-10-01

    Previously, we identified 8-bps long DNA sequences (8-mers) that localize in human proximal promoters and grouped them into known transcription factor binding sites (TFBS). We now examine split 8-mers consisting of two 4-mers separated by 1-bp to 30-bps (X(4)-N(1-30)-X(4)) to identify pairs of TFBS that localize in proximal promoters at a precise distance. These include two overlapping TFBS: the ETS⇔ETS motif ((C/G)CCGGAAGCGGAA) and the ETS⇔CRE motif ((C/G)CGGAAGTGACGTCAC). The nucleotides in bold are part of both TFBS. Molecular modeling shows that the ETS⇔CRE motif can be bound simultaneously by both the ETS and the B-ZIP domains without protein-protein clashes. The electrophoretic mobility shift assay (EMSA) shows that the ETS protein GABPα and the B-ZIP protein CREB preferentially bind to the ETS⇔CRE motif only when the two TFBS overlap precisely. In contrast, the ETS domain of ETV5 and CREB interfere with each other for binding the ETS⇔CRE. The 11-mer (CGGAAGTGACG), the conserved part of the ETS⇔CRE motif, occurs 226 times in the human genome and 83% are in known regulatory regions. In vivo GABPα and CREB ChIP-seq peaks identified the ETS⇔CRE as the most enriched motif occurring in promoters of genes involved in mRNA processing, cellular catabolic processes, and stress response, suggesting that a specific class of genes is regulated by this composite motif.
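
    The split 8-mer pattern X(4)-N(1-30)-X(4) described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function name `split_kmer_counts` and its parameters are hypothetical, only one strand is scanned, and no promoter localization or statistics are attempted. It simply counts every pair of 4-mers separated by a gap of 1 to 30 bases in a single sequence.

```python
from collections import Counter


def split_kmer_counts(seq, half=4, min_gap=1, max_gap=30):
    """Count (left, gap, right) triples: two `half`-mers separated by `gap` bases."""
    seq = seq.upper()
    n = len(seq)
    counts = Counter()
    for i in range(n - half + 1):
        left = seq[i:i + half]
        for gap in range(min_gap, max_gap + 1):
            j = i + half + gap  # start of the right-hand half
            if j + half > n:
                break  # larger gaps would run past the end of the sequence
            counts[(left, gap, seq[j:j + half])] += 1
    return counts


# The conserved 11-mer quoted in the abstract, used here as a toy input.
counts = split_kmer_counts("CGGAAGTGACG")
print(counts[("CGGA", 3, "GACG")])
```

    In this toy input the 11-mer decomposes into the 4-mers CGGA and GACG separated by three bases (AGT), which is one instance of the X(4)-N(3)-X(4) form.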

  2. Overlapping ETS and CRE Motifs ((G/C)CGGAAGTGACGTCA) Preferentially Bound by GABPα and CREB Proteins

    PubMed Central

    Chatterjee, Raghunath; Zhao, Jianfei; He, Ximiao; Shlyakhtenko, Andrey; Mann, Ishminder; Waterfall, Joshua J.; Meltzer, Paul; Sathyanarayana, B. K.; FitzGerald, Peter C.; Vinson, Charles

    2012-01-01

    Previously, we identified 8-bps long DNA sequences (8-mers) that localize in human proximal promoters and grouped them into known transcription factor binding sites (TFBS). We now examine split 8-mers consisting of two 4-mers separated by 1-bp to 30-bps (X(4)-N(1-30)-X(4)) to identify pairs of TFBS that localize in proximal promoters at a precise distance. These include two overlapping TFBS: the ETS⇔ETS motif ((C/G)CCGGAAGCGGAA) and the ETS⇔CRE motif ((C/G)CGGAAGTGACGTCAC). The nucleotides in bold are part of both TFBS. Molecular modeling shows that the ETS⇔CRE motif can be bound simultaneously by both the ETS and the B-ZIP domains without protein-protein clashes. The electrophoretic mobility shift assay (EMSA) shows that the ETS protein GABPα and the B-ZIP protein CREB preferentially bind to the ETS⇔CRE motif only when the two TFBS overlap precisely. In contrast, the ETS domain of ETV5 and CREB interfere with each other for binding the ETS⇔CRE. The 11-mer (CGGAAGTGACG), the conserved part of the ETS⇔CRE motif, occurs 226 times in the human genome and 83% are in known regulatory regions. In vivo GABPα and CREB ChIP-seq peaks identified the ETS⇔CRE as the most enriched motif occurring in promoters of genes involved in mRNA processing, cellular catabolic processes, and stress response, suggesting that a specific class of genes is regulated by this composite motif. PMID:23050235

  3. A Discriminative Approach for Unsupervised Clustering of DNA Sequence Motifs

    PubMed Central

    Stegmaier, Philip; Kel, Alexander; Wingender, Edgar; Borlak, Jürgen

    2013-01-01

    Algorithmic comparison of DNA sequence motifs is a problem in bioinformatics that has received increased attention during the last years. Its main applications concern characterization of potentially novel motifs and clustering of a motif collection in order to remove redundancy. Despite growing interest in motif clustering, the question which motif clusters to aim at has so far not been systematically addressed. Here we analyzed motif similarities in a comprehensive set of vertebrate transcription factor classes. For this we developed enhanced similarity scores by inclusion of the information coverage (IC) criterion, which evaluates the fraction of information an alignment covers in aligned motifs. A network-based method enabled us to identify motif clusters with high correspondence to DNA-binding domain phylogenies and prior experimental findings. Based on this analysis we derived a set of motif families representing distinct binding specificities. These motif families were used to train a classifier which was further integrated into a novel algorithm for unsupervised motif clustering. Application of the new algorithm demonstrated its superiority to previously published methods and its ability to reproduce entrained motif families. As a result, our work proposes a probabilistic approach to decide whether two motifs represent common or distinct binding specificities. PMID:23555204

  4. Regulatory motif finding by logic regression.

    PubMed

    Keles, Sündüz; van der Laan, Mark J; Vulpe, Chris

    2004-11-01

    Multiple transcription factors coordinately control transcriptional regulation of genes in eukaryotes. Although many computational methods consider the identification of individual transcription factor binding sites (TFBSs), very few focus on the interactions between these sites. We consider finding TFBSs and their context specific interactions using microarray gene expression data. We devise a hybrid approach called LogicMotif composed of a TFBS identification method combined with the new regression methodology logic regression. LogicMotif has two steps: First, potential binding sites are identified from transcription control regions of genes of interest. Various available methods can be used in this step when the genes of interest can be divided into groups such as up- and downregulated. For this step, we also develop a simple univariate regression and extension method MFURE to extract candidate TFBSs from a large number of genes in the availability of microarray gene expression data. MFURE provides an alternative method for this step when partitioning of the genes into disjoint groups is not preferred. This first step aims to identify individual sites within gene groups of interest or sites that are correlated with the gene expression outcome. In the second step, logic regression is used to build a predictive model of outcome of interest (either gene expression or up- and down-regulation) using these potential sites. This 2-fold approach creates a rich diverse set of potential binding sites in the first step and builds regression or classification models in the second step using logic regression that is particularly good at identifying complex interactions. LogicMotif is applied to two publicly available datasets. A genome-wide gene expression data set of Saccharomyces cerevisiae is used for validation. The regression models obtained are interpretable and the biological implications are in agreement with the known results. This analysis suggests that LogicMotif

  5. Discovery and validation of information theory-based transcription factor and cofactor binding site motifs.

    PubMed

    Lu, Ruipeng; Mucaki, Eliseos J; Rogan, Peter K

    2016-11-28

    Data from ChIP-seq experiments can derive the genome-wide binding specificities of transcription factors (TFs) and other regulatory proteins. We analyzed 765 ENCODE ChIP-seq peak datasets of 207 human TFs with a novel motif discovery pipeline based on recursive, thresholded entropy minimization. This approach, while obviating the need to compensate for skewed nucleotide composition, distinguishes true binding motifs from noise, quantifies the strengths of individual binding sites based on computed affinity and detects adjacent cofactor binding sites that coordinate with the targets of primary, immunoprecipitated TFs. We obtained contiguous and bipartite information theory-based position weight matrices (iPWMs) for 93 sequence-specific TFs, discovered 23 cofactor motifs for 127 TFs and revealed six high-confidence novel motifs. The reliability and accuracy of these iPWMs were determined via four independent validation methods, including the detection of experimentally proven binding sites, explanation of effects of characterized SNPs, comparison with previously published motifs and statistical analyses. We also predict previously unreported TF coregulatory interactions (e.g. TF complexes). These iPWMs constitute a powerful tool for predicting the effects of sequence variants in known binding sites, performing mutation analysis on regulatory SNPs and predicting previously unrecognized binding sites and target genes.

  6. A Review of Functional Motifs Utilized by Viruses

    PubMed Central

    Sobhy, Haitham

    2016-01-01

    Short linear motifs (SLiM) are short peptides that facilitate protein function and protein-protein interactions. Viruses utilize these motifs to enter into the host, interact with cellular proteins, or egress from host cells. Studying functional motifs may help to predict protein characteristics, interactions, or the putative cellular role of a protein. In virology, it may reveal aspects of the virus tropism and help find antiviral therapeutics. This review highlights the recent understanding of functional motifs utilized by viruses. Special attention was paid to the function of proteins harboring these motifs, and viruses encoding these proteins. The review highlights motifs involved in (i) immune response and post-translational modifications (e.g., ubiquitylation, SUMOylation or ISGylation); (ii) virus-host cell interactions, including virus attachment, entry, fusion, egress and nuclear trafficking; (iii) virulence and antiviral activities; (iv) virion structure; and (v) low-complexity regions (LCRs) or motifs enriched with residues (Xaa-rich motifs). PMID:28248213

  7. Composites

    NASA Astrophysics Data System (ADS)

    Chmielewski, M.; Nosewicz, S.; Pietrzak, K.; Rojek, J.; Strojny-Nędza, A.; Mackiewicz, S.; Dutkiewicz, J.

    2014-11-01

    It is commonly known that the properties of sintered materials are strongly related to technological conditions of the densification process. This paper shows the sintering behavior of a NiAl-Al2O3 composite, and its individual components sintered separately. Each kind of material was processed via the powder metallurgy route (hot pressing). The progress of sintering at different stages of the process was tested. Changes in the microstructure were examined using scanning and transmission electron microscopy. The metal-ceramic interface was clean and no additional phases were detected. Correlation between the microstructure, density, and mechanical properties of the sintered materials was analyzed. The values of elastic constants of NiAl/Al2O3 were close to intermetallic ones due to the volume content of the NiAl phase particularly at low densities, where small alumina particles had no impact on the composite's stiffness. The influence of the external pressure of 30 MPa seemed crucial for obtaining satisfactory stiffness for three kinds of the studied materials which were characterized by a highly dense microstructure with a low number of isolated spherical pores.

  8. Sequential motif profile of natural visibility graphs

    NASA Astrophysics Data System (ADS)

    Iacovacci, Jacopo; Lacasa, Lucas

    2016-11-01

    The concept of sequential visibility graph motifs—subgraphs appearing with characteristic frequencies in the visibility graphs associated to time series—has been advanced recently along with a theoretical framework to compute analytically the motif profiles associated to horizontal visibility graphs (HVGs). Here we develop a theory to compute the profile of sequential visibility graph motifs in the context of natural visibility graphs (VGs). This theory gives exact results for deterministic aperiodic processes with a smooth invariant density or stochastic processes that fulfill the Markov property and have a continuous marginal distribution. The framework also allows for a linear time numerical estimation in the case of empirical time series. A comparison between the HVG and the VG case (including evaluation of their robustness for short series polluted with measurement noise) is also presented.

  9. Chiral Alkyl Halides: Underexplored Motifs in Medicine

    PubMed Central

    Gál, Bálint; Bucher, Cyril; Burns, Noah Z.

    2016-01-01

    While alkyl halides are valuable intermediates in synthetic organic chemistry, their use as bioactive motifs in drug discovery and medicinal chemistry is rare in comparison. This is likely attributable to the common misconception that these compounds are merely non-specific alkylators in biological systems. A number of chlorinated compounds in the pharmaceutical and food industries, as well as a growing number of halogenated marine natural products showing unique bioactivity, illustrate the role that chiral alkyl halides can play in drug discovery. Through a series of case studies, we demonstrate in this review that these motifs can indeed be stable under physiological conditions, and that halogenation can enhance bioactivity through both steric and electronic effects. Our hope is that, by placing such compounds in the minds of the chemical community, they may gain more traction in drug discovery and inspire more synthetic chemists to develop methods for selective halogenation. PMID:27827902

  10. On the Kernelization Complexity of Colorful Motifs

    NASA Astrophysics Data System (ADS)

    Ambalath, Abhimanyu M.; Balasundaram, Radheshyam; Rao H., Chintan; Koppula, Venkata; Misra, Neeldhara; Philip, Geevarghese; Ramanujan, M. S.

    The Colorful Motif problem asks if, given a vertex-colored graph G, there exists a subset S of vertices of G such that the graph induced by G on S is connected and contains every color in the graph exactly once. The problem is motivated by applications in computational biology and is also well-studied from the theoretical point of view. In particular, it is known to be NP-complete even on trees of maximum degree three [Fellows et al., ICALP 2007]. In their pioneering paper that introduced the color-coding technique, Alon et al. [STOC 1995] show, inter alia, that the problem is FPT on general graphs. More recently, Cygan et al. [WG 2010] showed that Colorful Motif is NP-complete on comb graphs, a special subclass of the set of trees of maximum degree three. They also showed that the problem is not likely to admit polynomial kernels on forests.
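
    The problem statement above can be made concrete with a brute-force Python sketch. It is only an illustration of the definition: it tries every way of picking one vertex per color and checks whether the picked set induces a connected subgraph, so it takes exponential time and is neither the color-coding technique nor any kernelization discussed in the paper. The function names and the adjacency-dictionary input format are assumptions made for this example.

```python
from itertools import product


def is_connected(vertices, adj):
    """Check that `vertices` induce a connected subgraph of the graph `adj`."""
    vertices = set(vertices)
    start = next(iter(vertices))
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in adj.get(u, ()):
            if v in vertices and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == vertices


def colorful_motif(adj, color):
    """Return a connected vertex set using every color exactly once, or None."""
    if not color:
        return None
    by_color = {}
    for v, c in color.items():
        by_color.setdefault(c, []).append(v)
    # One vertex per color class; exponential in the number of colors.
    for choice in product(*by_color.values()):
        if is_connected(choice, adj):
            return set(choice)
    return None


# Toy path graph a-b-c-d with an undirected adjacency dictionary.
adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
color = {"a": "red", "b": "blue", "c": "red", "d": "green"}
print(colorful_motif(adj, color))
```

    On this toy path the only valid answer is the set containing b, c and d, since choosing a as the red vertex leaves d disconnected from the rest.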

  11. Anticipated synchronization in neuronal network motifs

    NASA Astrophysics Data System (ADS)

    Matias, F. S.; Gollo, L. L.; Carelli, P. V.; Copelli, M.; Mirasso, C. R.

    2013-01-01

    Two identical dynamical systems coupled unidirectionally (in a so-called master-slave configuration) exhibit anticipated synchronization (AS) if the one which receives the coupling (the slave) also receives a negative delayed self-feedback. In oscillatory neuronal systems AS is characterized by a phase-locking with negative time delay τ between the spikes of the master and of the slave (slave fires before the master), while in the usual delayed synchronization (DS) regime τ is positive (slave fires after the master). A 3-neuron motif in which the slave self-feedback is replaced by a feedback loop mediated by an interneuron can exhibit both AS and DS regimes. Here we show that AS is robust in the presence of noise in a 3 Hodgkin-Huxley type neuronal motif. We also show that AS is stable for large values of τ in a chain of connected slaves-interneurons.

  12. Functional Motifs in Biochemical Reaction Networks

    PubMed Central

    Tyson, John J.; Novák, Béla

    2013-01-01

    The signal-response characteristics of a living cell are determined by complex networks of interacting genes, proteins, and metabolites. Understanding how cells respond to specific challenges, how these responses are contravened in diseased cells, and how to intervene pharmacologically in the decision-making processes of cells requires an accurate theory of the information-processing capabilities of macromolecular regulatory networks. Adopting an engineer’s approach to control systems, we ask whether realistic cellular control networks can be decomposed into simple regulatory motifs that carry out specific functions in a cell. We show that such functional motifs exist and review the experimental evidence that they control cellular responses as expected. PMID:20055671

  13. A Basic Set of Homeostatic Controller Motifs

    PubMed Central

    Drengstig, T.; Jolma, I.W.; Ni, X.Y.; Thorsen, K.; Xu, X.M.; Ruoff, P.

    2012-01-01

    Adaptation and homeostasis are essential properties of all living systems. However, our knowledge about the reaction kinetic mechanisms leading to robust homeostatic behavior in the presence of environmental perturbations is still poor. Here, we describe, and provide physiological examples of, a set of two-component controller motifs that show robust homeostasis. This basic set of controller motifs, which can be considered as complete, divides into two operational work modes, termed as inflow and outflow control. We show how controller combinations within a cell can integrate uptake and metabolization of a homeostatic controlled species and how pathways can be activated and lead to the formation of alternative products, as observed, for example, in the change of fermentation products by microorganisms when the supply of the carbon source is altered. The antagonistic character of hormonal control systems can be understood by a combination of inflow and outflow controllers. PMID:23199928

  14. Analyzing network reliability using structural motifs.

    PubMed

    Khorramzadeh, Yasamin; Youssef, Mina; Eubank, Stephen; Mowlaei, Shahir

    2015-04-01

    This paper uses the reliability polynomial, introduced by Moore and Shannon in 1956, to analyze the effect of network structure on diffusive dynamics such as the spread of infectious disease. We exhibit a representation for the reliability polynomial in terms of what we call structural motifs that is well suited for reasoning about the effect of a network's structural properties on diffusion across the network. We illustrate by deriving several general results relating graph structure to dynamical phenomena.

  15. Motif mining based on network space compression.

    PubMed

    Zhang, Qiang; Xu, Yuan

    2015-01-01

    A network motif is a recurring subnetwork within a network, and it takes on certain functions in practical biological macromolecule applications. Previous algorithms have focused on the computational efficiency of network motif detection, but some problems in storage space and searching time manifested during earlier studies. The considerable computational and spatial complexity also presents a significant challenge. In this paper, we provide a new approach for motif mining based on compressing the searching space. According to the characteristic of the parity nodes, we cut down the searching space and storage space in real graphs and random graphs, thereby reducing the computational cost of verifying the isomorphism of sub-graphs. We obtain a new network with smaller size after removing parity nodes and the "repeated edges" connected with the parity nodes. Random graph structure and sub-graph searching are based on the Back Tracking Method; all sub-graphs can be searched for by adding edges progressively. Experimental results show that this algorithm has higher speed and better stability than its alternatives.

  16. Dynamic motifs in socio-economic networks

    NASA Astrophysics Data System (ADS)

    Zhang, Xin; Shao, Shuai; Stanley, H. Eugene; Havlin, Shlomo

    2014-12-01

    Socio-economic networks are of central importance in economic life. We develop a method of identifying and studying motifs in socio-economic networks by focusing on “dynamic motifs,” i.e., evolutionary connection patterns that, because of “node acquaintances” in the network, occur much more frequently than random patterns. We examine two evolving bipartite networks: i) the world-wide commercial ship chartering market and ii) the ship build-to-order market. We find similar dynamic motifs in both bipartite networks, even though they describe different economic activities. We also find that “influence” and “persistence” are strong factors in the interaction behavior of organizations. When two companies are doing business with the same customer, it is highly probable that another customer who currently has a business relationship with only one of these two companies will become a customer of the second in the future. This is the effect of influence. Persistence means that companies with close business ties to customers tend to maintain their relationships over a long period of time.

  17. HeliCis: a DNA motif discovery tool for colocalized motif pairs with periodic spacing.

    PubMed

    Larsson, Erik; Lindahl, Per; Mostad, Petter

    2007-10-28

    Correct temporal and spatial gene expression during metazoan development relies on combinatorial interactions between different transcription factors. As a consequence, cis-regulatory elements often colocalize in clusters termed cis-regulatory modules. These may have requirements on organizational features such as spacing, order and helical phasing (periodic spacing) between binding sites. Due to the turning of the DNA helix, a small modification of the distance between a pair of sites may sometimes drastically disrupt function, while insertion of a full helical turn of DNA (10-11 bp) between cis elements may cause functionality to be restored. Recently, de novo motif discovery methods which incorporate organizational properties such as colocalization and order preferences have been developed, but there are no tools which incorporate periodic spacing into the model. We have developed a web based motif discovery tool, HeliCis, which features a flexible model which allows de novo detection of motifs with periodic spacing. Depending on the parameter settings it may also be used for discovering colocalized motifs without periodicity or motifs separated by a fixed gap of known or unknown length. We show on simulated data that it can efficiently capture the synergistic effects of colocalization and periodic spacing to improve detection of weak DNA motifs. It provides a simple to use web interface which interactively visualizes the current settings and thereby makes it easy to understand the parameters and the model structure. HeliCis provides simple and efficient de novo discovery of colocalized DNA motif pairs, with or without periodic spacing. Our evaluations show that it can detect weak periodic patterns which are not easily discovered using a sequential approach, i.e. first finding the binding sites and second analyzing the properties of their pairwise distances.

  18. A Novel Method for Dynamic Short-Beam Shear Testing of 3D Woven Composites

    DTIC Science & Technology

    2011-08-11

    Yip MC, Lin JL (1998) Effects of low-energy impact on the fatigue behavior of carbon/epoxy composites. Composites Science and Technology 58(1):1–8 5...Composite Materials 42(20):2111–2122 12. Davis DC, Whelan BD (2012) An experimental study of interlaminar shear fracture toughness of a nanotube ...delamination toughness of stitched graphite/epoxy textile composites. Composites Science and Technology 57(7):729–737 15. Chen L, Ifju PG, Sankar BV (2001) A

  19. Occurrence probability of structured motifs in random sequences.

    PubMed

    Robin, S; Daudin, J-J; Richard, H; Sagot, M-F; Schbath, S

    2002-01-01

    The problem of extracting from a set of nucleic acid sequences motifs which may have biological function is more and more important. In this paper, we are interested in particular motifs that may be implicated in the transcription process. These motifs, called structured motifs, are composed of two ordered parts separated by a variable distance and allowing for substitutions. In order to assess their statistical significance, we propose approximations of the probability of occurrences of such a structured motif in a given sequence. An application of our method to evaluate candidate promoters in E. coli and B. subtilis is presented. Simulations show the goodness of the approximations.
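
    The definition of a structured motif given above (two ordered parts, a variable distance between them, substitutions allowed) can be illustrated with a short sketch. This is not the authors' probability approximation; it only enumerates occurrences by brute force. The function names, the per-part substitution limit `max_subs`, and the example boxes TTGACA and TATAAT are assumptions chosen for illustration.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_structured_motif(seq, left, right, d_min, d_max, max_subs):
    """Return (start, gap) pairs where `left` is followed by `right`.

    `gap` is the number of bases between the two parts and must lie in
    [d_min, d_max]; each part may differ from its pattern by at most
    `max_subs` substitutions (a hypothetical, per-part limit).
    """
    hits = []
    for i in range(len(seq) - len(left) + 1):
        if hamming(seq[i:i + len(left)], left) > max_subs:
            continue
        for gap in range(d_min, d_max + 1):
            j = i + len(left) + gap
            if j + len(right) > len(seq):
                break  # larger gaps would run past the end of the sequence
            if hamming(seq[j:j + len(right)], right) <= max_subs:
                hits.append((i, gap))
    return hits


# Toy sequence: two boxes separated by 17 bases.
seq = "AA" + "TTGACA" + "G" * 17 + "TATAAT" + "CC"
print(find_structured_motif(seq, "TTGACA", "TATAAT", 16, 18, max_subs=0))
```

    Counting such hits over many random sequences would give an empirical occurrence frequency against which an analytical approximation could be compared.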

  20. No tradeoff between versatility and robustness in gene circuit motifs

    NASA Astrophysics Data System (ADS)

    Payne, Joshua L.

    2016-05-01

    Circuit motifs are small directed subgraphs that appear in real-world networks significantly more often than in randomized networks. In the Boolean model of gene circuits, most motifs are realized by multiple circuit genotypes. Each of a motif's constituent circuit genotypes may have one or more functions, which are embodied in the expression patterns the circuit forms in response to specific initial conditions. Recent enumeration of a space of nearly 17 million three-gene circuit genotypes revealed that all circuit motifs have more than one function, with the number of functions per motif ranging from 12 to nearly 30,000. This indicates that some motifs are more functionally versatile than others. However, the individual circuit genotypes that constitute each motif are less robust to mutation if they have many functions, hinting that functionally versatile motifs may be less robust to mutation than motifs with few functions. Here, I explore the relationship between versatility and robustness in circuit motifs, demonstrating that functionally versatile motifs are robust to mutation despite the inherent tradeoff between versatility and robustness at the level of an individual circuit genotype.

  1. CLIMP: Clustering Motifs via Maximal Cliques with Parallel Computing Design

    PubMed Central

    Chen, Yong

    2016-01-01

    A set of conserved binding sites recognized by a transcription factor is called a motif, which can be found by many applications of comparative genomics for identifying over-represented segments. Moreover, when numerous putative motifs are predicted from a collection of genome-wide data, their similarity data can be represented as a large graph, where these motifs are connected to one another. However, an efficient clustering algorithm is desired for clustering the motifs that belong to the same groups and separating the motifs that belong to different groups, or even deleting an amount of spurious ones. In this work, a new motif clustering algorithm, CLIMP, is proposed by using maximal cliques and sped up by parallelizing its program. When a synthetic motif dataset from the database JASPAR, a set of putative motifs from a phylogenetic foot-printing dataset, and a set of putative motifs from a ChIP dataset are used to compare the performances of CLIMP and two other high-performance algorithms, the results demonstrate that CLIMP mostly outperforms the two algorithms on the three datasets for motif clustering, so that it can be a useful complement of the clustering procedures in some genome-wide motif prediction pipelines. CLIMP is available at http://sqzhang.cn/climp.html. PMID:27487245
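
    As a rough illustration of clustering via maximal cliques, the sketch below enumerates the maximal cliques of a small similarity graph with the basic Bron-Kerbosch recursion. It is not the CLIMP implementation and has no parallelism; the adjacency-dictionary input and the motif names are hypothetical.

```python
def maximal_cliques(adj):
    """Enumerate maximal cliques of an undirected graph.

    `adj` maps each node to the set of its neighbours. This is the basic
    Bron-Kerbosch recursion without pivoting, suitable for small graphs.
    """
    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidates to extend it, x: already explored
        if not p and not x:
            cliques.append(r)
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(frozenset(), frozenset(adj), frozenset())
    return cliques


# Hypothetical similarity graph: an edge means two motifs are similar.
similar = {
    "m1": {"m2", "m3"},
    "m2": {"m1", "m3"},
    "m3": {"m1", "m2", "m4"},
    "m4": {"m3"},
}
for clique in maximal_cliques(similar):
    print(sorted(clique))
```

    Each maximal clique is a group of mutually similar motifs; a motif such as `m3` can belong to more than one clique, so a real pipeline would still need a rule for resolving overlaps.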

  2. RNA structural motif recognition based on least-squares distance.

    PubMed

    Shen, Ying; Wong, Hau-San; Zhang, Shaohong; Zhang, Lin

    2013-09-01

    RNA structural motifs are recurrent structural elements occurring in RNA molecules. RNA structural motif recognition aims to find RNA substructures that are similar to a query motif, and it is important for RNA structure analysis and RNA function prediction. In view of this, we propose a new method known as RNA Structural Motif Recognition based on Least-Squares distance (LS-RSMR) to effectively recognize RNA structural motifs. A test set consisting of five types of RNA structural motifs occurring in Escherichia coli ribosomal RNA is compiled by us. Experiments are conducted for recognizing these five types of motifs. The experimental results fully reveal the superiority of the proposed LS-RSMR compared with four other state-of-the-art methods.

  3. MProfiler: A Profile-Based Method for DNA Motif Discovery

    NASA Astrophysics Data System (ADS)

    Altarawy, Doaa; Ismail, Mohamed A.; Ghanem, Sahar M.

    Motif finding is one of the most important tasks in gene regulation, which is essential in understanding biological cell functions. Based on recent studies, the performance of current motif finders is not satisfactory. A number of ensemble methods have been proposed to enhance the accuracy of the results. The overall performance of existing ensemble methods is better than that of stand-alone motif finders. A recent ensemble method, MotifVoter, significantly outperforms all existing stand-alone and ensemble methods. In this paper, we propose a method, MProfiler, to increase the accuracy of MotifVoter without increasing the run time by introducing an idea called center profiling. Our experiments show improvement in the quality of generated clusters over MotifVoter in both accuracy and cluster compactness. Using 56 datasets, the accuracy of the final results using our method achieves 80% improvement in correlation coefficient nCC, and 93% improvement in performance coefficient nPC over MotifVoter.

  4. Chaotic motif sampler: detecting motifs from biological sequences by using chaotic neurodynamics

    NASA Astrophysics Data System (ADS)

    Matsuura, Takafumi; Ikeguchi, Tohru

    The identification of a region in biological sequences, known as the motif extraction problem (MEP), is a problem addressed in bioinformatics. However, the MEP is an NP-hard problem. Therefore, it is almost impossible to obtain an optimal solution within a reasonable time frame. To find near-optimal solutions for NP-hard combinatorial optimization problems such as traveling salesman problems, quadratic assignment problems, and vehicle routing problems, chaotic search, which is one of the deterministic approaches, has been proposed and exhibits better performance than stochastic approaches. In this paper, we propose a new alignment method that employs chaotic dynamics to solve MEPs. It is called the Chaotic Motif Sampler. We show that the performance of the Chaotic Motif Sampler is considerably better than that of conventional methods such as the Gibbs Site Sampler and the Neighborhood Optimization for Multiple Alignment Discovery.

  5. The RNA 3D Motif Atlas: Computational methods for extraction, organization and evaluation of RNA motifs.

    PubMed

    Parlea, Lorena G; Sweeney, Blake A; Hosseini-Asanjan, Maryam; Zirbel, Craig L; Leontis, Neocles B

    2016-07-01

    RNA 3D motifs occupy places in structured RNA molecules that correspond to the hairpin, internal and multi-helix junction "loops" of their secondary structure representations. As many as 40% of the nucleotides of an RNA molecule can belong to these structural elements, which are distinct from the regular double helical regions formed by contiguous AU, GC, and GU Watson-Crick basepairs. With the large number of atomic- or near atomic-resolution 3D structures appearing in a steady stream in the PDB/NDB structure databases, the automated identification, extraction, comparison, clustering and visualization of these structural elements presents an opportunity to enhance RNA science. Three broad applications are: (1) identification of modular, autonomous structural units for RNA nanotechnology, nanobiology and synthetic biology applications; (2) bioinformatic analysis to improve RNA 3D structure prediction from sequence; and (3) creation of searchable databases for exploring the binding specificities, structural flexibility, and dynamics of these RNA elements. In this contribution, we review methods developed for computational extraction of hairpin and internal loop motifs from a non-redundant set of high-quality RNA 3D structures. We provide a statistical summary of the extracted hairpin and internal loop motifs in the most recent version of the RNA 3D Motif Atlas. We also explore the reliability and accuracy of the extraction process by examining its performance in clustering recurrent motifs from homologous ribosomal RNA (rRNA) structures. We conclude with a summary of remaining challenges, especially with regard to extraction of multi-helix junction motifs. Copyright © 2016 Elsevier Inc. All rights reserved.

  6. Bases of motifs for generating repeated patterns with wild cards.

    PubMed

    Pisanti, Nadia; Crochemore, Maxime; Grossi, Roberto; Sagot, Marie-France

    2005-01-01

    Motif inference represents one of the most important areas of research in computational biology, and one of its oldest ones. Despite this, the problem remains very much open in the sense that no existing definition is fully satisfying, either in formal terms, or in relation to the biological questions that involve finding such motifs. Two main types of motifs have been considered in the literature: matrices (of letter frequency per position in the motif) and patterns. There is no conclusive evidence in favor of either, and recent work has attempted to integrate the two types into a single model. In this paper, we address the formal issue in relation to motifs as patterns. This is essential to get at a better understanding of motifs in general. In particular, we consider a promising idea that was recently proposed, which attempted to avoid the combinatorial explosion in the number of motifs by means of a generator set for the motifs. Instead of exhibiting a complete list of motifs satisfying some input constraints, what is produced is a basis of such motifs from which all the other ones can be generated. We study the computational cost of determining such a basis of repeated motifs with wild cards in a sequence. We give new upper and lower bounds on such a cost, introducing a notion of basis that is provably contained in (and, thus, smaller than) previously defined ones. Our basis can be computed in less time and space, and is still able to generate the same set of motifs. We also prove that the number of motifs in all bases defined so far grows exponentially with the quorum, that is, with the minimal number of times a motif must appear in a sequence, something unnoticed in previous work. We show that there is no hope to efficiently compute such bases unless the quorum is fixed.
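
    The two notions this abstract relies on, a pattern with wild cards and the quorum, can be made concrete with a small sketch. It only checks a given pattern against a sequence and does not compute a basis; the wildcard symbol `.` and the function names are assumptions.

```python
def occurrences(pattern: str, seq: str, wildcard: str = ".") -> list:
    """Start positions where `pattern` matches `seq`.

    A wildcard position matches any single letter; every other position
    must match exactly.
    """
    m = len(pattern)
    return [
        i
        for i in range(len(seq) - m + 1)
        if all(p == wildcard or p == c for p, c in zip(pattern, seq[i:i + m]))
    ]


def satisfies_quorum(pattern: str, seq: str, quorum: int) -> bool:
    """True if the pattern occurs at least `quorum` times in the sequence."""
    return len(occurrences(pattern, seq)) >= quorum


print(occurrences("A.C", "ABCAXCAC"))          # positions of A?C
print(satisfies_quorum("A.C", "ABCAXCAC", 2))  # quorum of 2
```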

  7. Multilayer motif analysis of brain networks

    NASA Astrophysics Data System (ADS)

    Battiston, Federico; Nicosia, Vincenzo; Chavez, Mario; Latora, Vito

    2017-04-01

    In the last decade, network science has shed new light both on the structural (anatomical) and on the functional (correlations in the activity) connectivity among the different areas of the human brain. The analysis of brain networks has made it possible to detect the central areas of a neural system and to identify its building blocks by looking at overabundant small subgraphs, known as motifs. However, network analysis of the brain has so far mainly focused on anatomical and functional networks as separate entities. The recently developed mathematical framework of multi-layer networks allows us to perform an analysis of the human brain where the structural and functional layers are considered together. In this work, we describe how to classify the subgraphs of a multiplex network, and we extend the motif analysis to networks with an arbitrary number of layers. We then extract multi-layer motifs in brain networks of healthy subjects by considering networks with two layers, anatomical and functional, respectively, obtained from diffusion and functional magnetic resonance imaging. Results indicate that subgraphs in which the presence of a physical connection between brain areas (links at the structural layer) coexists with a non-trivial positive correlation in their activities are statistically overabundant. Finally, we investigate the existence of a reinforcement mechanism between the two layers by looking at how the probability to find a link in one layer depends on the intensity of the connection in the other one. Showing that functional connectivity is non-trivially constrained by the underlying anatomical network, our work contributes to a better understanding of the interplay between the structure and function in the human brain.

  8. MINER: software for phylogenetic motif identification.

    PubMed

    La, David; Livesay, Dennis R

    2005-07-01

    MINER is web-based software for phylogenetic motif (PM) identification. PMs are sequence regions (fragments) that conserve the overall familial phylogeny. PMs have been shown to correspond to a wide variety of catalytic regions, substrate-binding sites and protein interfaces, making them ideal functional site predictions. The MINER output provides an intuitive interface for interactive PM sequence analysis and structural visualization. The web implementation of MINER is freely available at http://www.pmap.csupomona.edu/MINER/. Source code is available to the academic community on request.

  9. A designed DNA binding motif that recognizes extended sites and spans two adjacent major grooves†

    PubMed Central

    Rodríguez, Jéssica; Mosquera, Jesús; García-Fandiño, Rebeca; Vázquez, M. Eugenio; Mascareñas, José L.

    2016-01-01

    We report the rational design of a DNA-binding peptide construct composed of the DNA-contacting regions of two transcription factors (GCN4 and GAGA) linked through an AT-hook DNA anchor. The resulting chimera, which represents a new, non-natural DNA binding motif, binds with high affinity and selectivity to a long composite sequence of 13 base pairs (TCAT-AATT-GAGAG). PMID:27252825

  10. Transcription factor motif quality assessment requires systematic comparative analysis

    PubMed Central

    Kibet, Caleb Kipkurui; Machanick, Philip

    2016-01-01

    Transcription factor (TF) binding site prediction remains a challenge in gene regulatory research due to degeneracy and potential variability in binding sites in the genome. Dozens of algorithms designed to learn binding models (motifs) have generated many motifs available in research papers with a subset making it to databases like JASPAR, UniPROBE and Transfac. The presence of many versions of motifs from the various databases for a single TF and the lack of a standardized assessment technique makes it difficult for biologists to make an appropriate choice of binding model and for algorithm developers to benchmark, test and improve on their models. In this study, we review and evaluate the approaches in use, highlight differences and demonstrate the difficulty of defining a standardized motif assessment approach. We review scoring functions, motif length, test data and the type of performance metrics used in prior studies as some of the factors that influence the outcome of a motif assessment. We show that the scoring functions and statistics used in motif assessment influence ranking of motifs in a TF-specific manner. We also show that TF binding specificity can vary by source of genomic binding data. We also demonstrate that information content of a motif is not in isolation a measure of motif quality but is influenced by TF binding behaviour. We conclude that there is a need for an easy-to-use tool that presents all available evidence for a comparative analysis. PMID:27092243
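
    For readers unfamiliar with the term, the information content of a motif is commonly computed per position as the relative entropy of the base frequencies against a background. The sketch below assumes a uniform background of 0.25 per base and a motif given as a list of frequency columns in A, C, G, T order; both are illustrative assumptions, not the scoring used in the study.

```python
import math


def column_information(freqs, background=0.25):
    """Information content (bits) of one motif position.

    `freqs` holds the frequencies of A, C, G, T at that position; zero
    frequencies contribute nothing to the sum.
    """
    return sum(p * math.log2(p / background) for p in freqs if p > 0)


def motif_information(columns):
    """Total information content of a motif: the sum over its positions."""
    return sum(column_information(col) for col in columns)


# Hypothetical 3-position motif: fully conserved, two-way split, uninformative.
motif = [
    [1.0, 0.0, 0.0, 0.0],
    [0.5, 0.5, 0.0, 0.0],
    [0.25, 0.25, 0.25, 0.25],
]
print([column_information(col) for col in motif])
print(motif_information(motif))
```

    With a uniform background a fully conserved position carries 2 bits and a uniform position carries 0 bits, which is why a high total says how specific a model is but not, on its own, how well it predicts binding.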

  11. Cross-disciplinary detection and analysis of network motifs.

    PubMed

    Tran, Ngoc Tam L; DeLuccia, Luke; McDonald, Aidan F; Huang, Chun-Hsi

    2015-01-01

    The detection of network motifs has recently become an important part of network analysis across all disciplines. In this work, we detected and analyzed network motifs from undirected and directed networks of several different disciplines, including biological network, social network, ecological network, as well as other networks such as airlines, power grid, and co-purchase of political books networks. Our analysis revealed that undirected networks are similar at the basic three and four nodes, while the analysis of directed networks revealed the distinction between networks of different disciplines. The study showed that larger motifs contained the three-node motif as a subgraph. Topological analysis revealed that similar networks have similar small motifs, but as the motif size increases, differences arise. Pearson correlation coefficient showed strong positive relationship between some undirected networks but inverse relationship between some directed networks. The study suggests that the three-node motif is a building block of larger motifs. It also suggests that undirected networks share similar low-level structures. Moreover, similar networks share similar small motifs, but larger motifs define the unique structure of individuals. Pearson correlation coefficient suggests that protein structure networks, dolphin social network, and co-authorships in network science belong to a superfamily. In addition, yeast protein-protein interaction network, primary school contact network, Zachary's karate club network, and co-purchase of political books network can be classified into a superfamily.

  12. Identification of polymorphic motifs using probabilistic search algorithms

    PubMed Central

    Basu, Analabha; Chaudhuri, Probal; Majumder, Partha P.

    2005-01-01

    The problem of identifying motifs comprising nucleotides at a set of polymorphic DNA sites, not necessarily contiguous, arises in many human genetic problems. However, when the sites are not contiguous, no efficient algorithm exists for polymorphic motif identification. A search based on complete enumeration is computationally inefficient. We have developed probabilistic search algorithms to discover motifs of known or unknown lengths. We have developed statistical tests of significance for assessing a motif discovery, and a statistical criterion for simultaneously estimating motif length and discovering it. We have tested these algorithms on various synthetic data sets and have shown that they are very efficient, in the sense that the “true” motifs can be detected in the vast majority of replications and in a small number of iterations. Additionally, we have applied them to some real data sets and have shown that they are able to identify known motifs. In certain applications, it is pertinent to find motifs that contain contrasting nucleotides at the sites included in the motif (e.g., motifs identified in case-control association studies). For this, we have suggested appropriate modifications. Using simulations, we have discovered that the success rate of identification of the correct motif is high in case-control studies except when relative risks are small. Our analyses of evolutionary data sets resulted in the identification of some motifs that appear to have important implications on human evolutionary inference. These algorithms can easily be implemented to discover motifs from multilocus genotype data by simple numerical recoding of genotypes. PMID:15632091

  13. Motif-directed redesign of enzyme specificity.

    PubMed

    Borgo, Benjamin; Havranek, James J

    2014-03-01

    Computational protein design relies on several approximations, including the use of fixed backbones and rotamers, to reduce protein design to a computationally tractable problem. However, allowing backbone and off-rotamer flexibility leads to more accurate designs and greater conformational diversity. Exhaustive sampling of this additional conformational space is challenging, and often impossible. Here, we report a computational method that utilizes a preselected library of native interactions to direct backbone flexibility to accommodate placement of these functional contacts. Using these native interaction modules, termed motifs, improves the likelihood that the interaction can be realized, provided that suitable backbone perturbations can be identified. Furthermore, it allows a directed search of the conformational space, reducing the sampling needed to find low energy conformations. We implemented the motif-based design algorithm in Rosetta, and tested the efficacy of this method by redesigning the substrate specificity of methionine aminopeptidase. In summary, native enzymes have evolved to catalyze a wide range of chemical reactions with extraordinary specificity. Computational enzyme design seeks to generate novel chemical activities by altering the target substrates of these existing enzymes. We have implemented a novel approach to redesign the specificity of an enzyme and demonstrated its effectiveness on a model system.

  14. Promoter Motifs in NCLDVs: An Evolutionary Perspective

    PubMed Central

    Oliveira, Graziele Pereira; Andrade, Ana Cláudia dos Santos Pereira; Rodrigues, Rodrigo Araújo Lima; Arantes, Thalita Souza; Boratto, Paulo Victor Miranda; Silva, Ludmila Karen dos Santos; Dornas, Fábio Pio; Trindade, Giliane de Souza; Drumond, Betânia Paiva; La Scola, Bernard; Kroon, Erna Geessien; Abrahão, Jônatas Santos

    2017-01-01

    For many years, gene expression in the three cellular domains has been studied in an attempt to discover sequences associated with the regulation of the transcription process. Some specific transcriptional features were described in viruses, although few studies have been devoted to understanding the evolutionary aspects related to the spread of promoter motifs through related viral families. The discovery of giant viruses and the proposition of the new viral order Megavirales that comprise a monophyletic group, named nucleo-cytoplasmic large DNA viruses (NCLDV), raised new questions in the field. Some putative promoter sequences have already been described for some NCLDV members, bringing new insights into the evolutionary history of these complex microorganisms. In this review, we summarize the main aspects of the transcription regulation process in the three domains of life, followed by a systematic description of what is currently known about promoter regions in several NCLDVs. We also discuss how the analysis of the promoter sequences could bring new ideas about the giant viruses’ evolution. Finally, considering a possible common ancestor for the NCLDV group, we discussed possible promoters’ evolutionary scenarios and propose the term “MEGA-box” to designate an ancestor promoter motif (‘TATATAAAATTGA’) that could be evolved gradually by nucleotides’ gain and loss and point mutations. PMID:28117683

  15. Promoter Motifs in NCLDVs: An Evolutionary Perspective.

    PubMed

    Oliveira, Graziele Pereira; Andrade, Ana Cláudia Dos Santos Pereira; Rodrigues, Rodrigo Araújo Lima; Arantes, Thalita Souza; Boratto, Paulo Victor Miranda; Silva, Ludmila Karen Dos Santos; Dornas, Fábio Pio; Trindade, Giliane de Souza; Drumond, Betânia Paiva; La Scola, Bernard; Kroon, Erna Geessien; Abrahão, Jônatas Santos

    2017-01-20

    For many years, gene expression in the three cellular domains has been studied in an attempt to discover sequences associated with the regulation of the transcription process. Some specific transcriptional features were described in viruses, although few studies have been devoted to understanding the evolutionary aspects related to the spread of promoter motifs through related viral families. The discovery of giant viruses and the proposition of the new viral order Megavirales that comprise a monophyletic group, named nucleo-cytoplasmic large DNA viruses (NCLDV), raised new questions in the field. Some putative promoter sequences have already been described for some NCLDV members, bringing new insights into the evolutionary history of these complex microorganisms. In this review, we summarize the main aspects of the transcription regulation process in the three domains of life, followed by a systematic description of what is currently known about promoter regions in several NCLDVs. We also discuss how the analysis of the promoter sequences could bring new ideas about the giant viruses' evolution. Finally, considering a possible common ancestor for the NCLDV group, we discussed possible promoters' evolutionary scenarios and propose the term "MEGA-box" to designate an ancestor promoter motif ('TATATAAAATTGA') that could be evolved gradually by nucleotides' gain and loss and point mutations.

  16. An Affinity Propagation-Based DNA Motif Discovery Algorithm.

    PubMed

    Sun, Chunxiao; Huo, Hongwei; Yu, Qiang; Guo, Haitao; Sun, Zhigang

    2015-01-01

    The planted (l, d) motif search (PMS) is one of the fundamental problems in bioinformatics, which plays an important role in locating transcription factor binding sites (TFBSs) in DNA sequences. Nowadays, identifying weak motifs and reducing the effect of local optimum are still important but challenging tasks for motif discovery. To solve the tasks, we propose a new algorithm, APMotif, which first applies the Affinity Propagation (AP) clustering in DNA sequences to produce informative and good candidate motifs and then employs Expectation Maximization (EM) refinement to obtain the optimal motifs from the candidate motifs. Experimental results both on simulated data sets and real biological data sets show that APMotif usually outperforms four other widely used algorithms in terms of high prediction accuracy.
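
    The planted (l, d) problem itself can be stated in a few lines of code: find every string of length l that occurs in each input sequence with at most d mismatches. The sketch below does this by exhaustive enumeration, which grows as 4^l and is only practical for very small l. It is not APMotif, which uses Affinity Propagation and EM refinement instead; the function names and the toy sequences are assumptions.

```python
from itertools import product


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def occurs_within(motif: str, seq: str, d: int) -> bool:
    """True if some window of `seq` is within d mismatches of `motif`."""
    l = len(motif)
    return any(hamming(motif, seq[i:i + l]) <= d for i in range(len(seq) - l + 1))


def planted_motif_search(seqs, l, d, alphabet="ACGT"):
    """All length-l strings occurring in every sequence with <= d mismatches."""
    candidates = ("".join(c) for c in product(alphabet, repeat=l))
    return [m for m in candidates if all(occurs_within(m, s, d) for s in seqs)]


# Toy input: ACGT planted with at most one substitution per sequence.
seqs = ["TTACGTTT", "GGACCTGG", "CCAGGTCC"]
print("ACGT" in planted_motif_search(seqs, l=4, d=1))
```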

  17. Probabilistic models for semisupervised discriminative motif discovery in DNA sequences.

    PubMed

    Kim, Jong Kyoung; Choi, Seungjin

    2011-01-01

    Methods for discriminative motif discovery in DNA sequences identify transcription factor binding sites (TFBSs), searching only for patterns that differentiate two sets (positive and negative sets) of sequences. On one hand, discriminative methods increase the sensitivity and specificity of motif discovery, compared to generative models. On the other hand, generative models can easily exploit unlabeled sequences to better detect functional motifs when labeled training samples are limited. In this paper, we develop a hybrid generative/discriminative model which enables us to make use of unlabeled sequences in the framework of discriminative motif discovery, leading to semisupervised discriminative motif discovery. Numerical experiments on yeast ChIP-chip data for discovering DNA motifs demonstrate that the best performance is obtained with a model between the purely generative and the purely discriminative extremes, and that semisupervised learning improves performance when labeled sequences are limited.

  18. An Affinity Propagation-Based DNA Motif Discovery Algorithm

    PubMed Central

    Sun, Chunxiao; Huo, Hongwei; Yu, Qiang; Guo, Haitao; Sun, Zhigang

    2015-01-01

    The planted (l, d) motif search (PMS) is one of the fundamental problems in bioinformatics, which plays an important role in locating transcription factor binding sites (TFBSs) in DNA sequences. Nowadays, identifying weak motifs and reducing the effect of local optimum are still important but challenging tasks for motif discovery. To solve the tasks, we propose a new algorithm, APMotif, which first applies the Affinity Propagation (AP) clustering in DNA sequences to produce informative and good candidate motifs and then employs Expectation Maximization (EM) refinement to obtain the optimal motifs from the candidate motifs. Experimental results both on simulated data sets and real biological data sets show that APMotif usually outperforms four other widely used algorithms in terms of high prediction accuracy. PMID:26347887

  19. Network Motifs: Simple Building Blocks of Complex Networks

    NASA Astrophysics Data System (ADS)

    Milo, R.; Shen-Orr, S.; Itzkovitz, S.; Kashtan, N.; Chklovskii, D.; Alon, U.

    2002-10-01

    Complex networks are studied across many fields of science. To uncover their structural design principles, we defined "network motifs," patterns of interconnections occurring in complex networks at numbers that are significantly higher than those in randomized networks. We found such motifs in networks from biochemistry, neurobiology, ecology, and engineering. The motifs shared by ecological food webs were distinct from the motifs shared by the genetic networks of Escherichia coli and Saccharomyces cerevisiae or from those found in the World Wide Web. Similar motifs were found in networks that perform information processing, even though they describe elements as different as biomolecules within a cell and synaptic connections between neurons in Caenorhabditis elegans. Motifs may thus define universal classes of networks. This approach may uncover the basic building blocks of most networks.
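
    The idea of comparing a subgraph count with counts in randomized networks can be sketched as follows. This is a minimal illustration, assuming the feed-forward loop as the pattern of interest, degree-preserving edge swaps as the randomization, and a z-score as the comparison; the function names, parameters and toy network are hypothetical and are not taken from the paper.

```python
import random
from statistics import mean, pstdev

Edge = tuple[str, str]


def count_ffl(edges: set[Edge]) -> int:
    """Count feed-forward loops: ordered triples with a->b, b->c and a->c."""
    succ: dict[str, set[str]] = {}
    for u, v in edges:
        succ.setdefault(u, set()).add(v)
    total = 0
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        for c in succ.get(b, ()):
            if c != a and c != b and (a, c) in edges:
                total += 1
    return total


def randomize(edges: set[Edge], swaps: int, rng: random.Random) -> set[Edge]:
    """Attempt `swaps` edge swaps that keep every in- and out-degree fixed."""
    edge_list = list(edges)
    edge_set = set(edges)
    for _ in range(swaps):
        i, j = rng.sample(range(len(edge_list)), 2)
        (a, b), (c, d) = edge_list[i], edge_list[j]
        # Skip swaps that would create a self-loop or a duplicate edge.
        if a == d or c == b or (a, d) in edge_set or (c, b) in edge_set:
            continue
        edge_set -= {(a, b), (c, d)}
        edge_set |= {(a, d), (c, b)}
        edge_list[i], edge_list[j] = (a, d), (c, b)
    return edge_set


def ffl_z_score(edges: set[Edge], n_random: int = 100, seed: int = 0) -> float:
    """Z-score of the real count against counts in randomized networks."""
    rng = random.Random(seed)
    null = [count_ffl(randomize(edges, 10 * len(edges), rng))
            for _ in range(n_random)]
    sd = pstdev(null)
    return (count_ffl(edges) - mean(null)) / sd if sd > 0 else float("nan")


# Hypothetical toy network, not data from the paper.
toy = {("x", "y"), ("y", "z"), ("x", "z"), ("z", "w"), ("w", "x"), ("y", "w")}
print(count_ffl(toy), ffl_z_score(toy))
```

    A network this small gives no meaningful statistic; the sketch only shows the shape of the comparison.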

  20. Detecting DNA regulatory motifs by incorporating positional trends in information content

    SciTech Connect

    Kechris, Katherina J.; van Zwet, Erik; Bickel, Peter J.; Eisen, Michael B.

    2004-05-04

    On the basis of the observation that conserved positions in transcription factor binding sites are often clustered together, we propose a simple extension to the model-based motif discovery methods. We assign position-specific prior distributions to the frequency parameters of the model, penalizing deviations from a specified conservation profile. Examples with both simulated and real data show that this extension helps discover motifs as the data become noisier or when there is a competing false motif.

  1. Discriminative motif analysis of high-throughput dataset

    PubMed Central

    Yao, Zizhen; MacQuarrie, Kyle L.; Fong, Abraham P.; Tapscott, Stephen J.; Ruzzo, Walter L.; Gentleman, Robert C.

    2014-01-01

    Motivation: High-throughput ChIP-seq studies typically identify thousands of peaks for a single transcription factor (TF). It is common for traditional motif discovery tools to predict motifs that are statistically significant against a naïve background distribution but are of questionable biological relevance. Results: We describe a simple yet effective algorithm for discovering differential motifs between two sequence datasets that is effective in eliminating systematic biases and scalable to large datasets. Tested on 207 ENCODE ChIP-seq datasets, our method identifies correct motifs in 78% of the datasets with known motifs, demonstrating improvement in both accuracy and efficiency compared with DREME, another state-of-the-art discriminative motif discovery tool. More interestingly, on the remaining more challenging datasets, we identify common technical or biological factors that compromise the motif search results and use advanced features of our tool to control for these factors. We also present case studies demonstrating the ability of our method to detect single base pair differences in DNA specificity of two similar TFs. Lastly, we demonstrate discovery of key TF motifs involved in tissue specification by examination of high-throughput DNase accessibility data. Availability: The motifRG package is publicly available via the Bioconductor repository. Contact: yzizhen@fhcrc.org Supplementary information: Supplementary data are available at Bioinformatics online. PMID:24162561

  2. Gibbs motif sampling: detection of bacterial outer membrane protein repeats.

    PubMed Central

    Neuwald, A. F.; Liu, J. S.; Lawrence, C. E.

    1995-01-01

    The detection and alignment of locally conserved regions (motifs) in multiple sequences can provide insight into protein structure, function, and evolution. A new Gibbs sampling algorithm is described that detects motif-encoding regions in sequences and optimally partitions them into distinct motif models; this is illustrated using a set of immunoglobulin fold proteins. When applied to sequences sharing a single motif, the sampler can be used to classify motif regions into related submodels, as is illustrated using helix-turn-helix DNA-binding proteins. Other statistically based procedures are described for searching a database for sequences matching motifs found by the sampler. When applied to a set of 32 very distantly related bacterial integral outer membrane proteins, the sampler revealed that they share a subtle, repetitive motif. Although BLAST (Altschul SF et al., 1990, J Mol Biol 215:403-410) fails to detect significant pairwise similarity between any of the sequences, the repeats present in these outer membrane proteins, taken as a whole, are highly significant (based on a generally applicable statistical test for motifs described here). Analysis of bacterial porins with known trimeric beta-barrel structure and related proteins reveals a similar repetitive motif corresponding to alternating membrane-spanning beta-strands. These beta-strands occur on the membrane interface (as opposed to the trimeric interface) of the beta-barrel. The broad conservation and structural location of these repeats suggests that they play important functional roles. PMID:8520488

  3. Using random forest algorithm to predict β-hairpin motifs.

    PubMed

    Jia, Shao-Chun; Hu, Xiu-Zhen

    2011-06-01

    A novel method is presented for predicting β-hairpin motifs in protein sequences. The method applies the Random Forest algorithm to multi-characteristic parameters, which include position-specific amino acid composition, position-specific hydropathy composition, predicted secondary structure information and the value of the auto-correlation function. Firstly, the method is trained and tested on a set of 8,291 β-hairpin motifs and 6,865 non-β-hairpin motifs. The overall accuracy and Matthews correlation coefficient are 82.2% and 0.64 using 5-fold cross-validation, and 81.7% and 0.63 using the independent test. Secondly, the method is also tested on a set of 4,884 β-hairpin motifs and 4,310 non-β-hairpin motifs used in previous studies. The overall accuracy and Matthews correlation coefficient are 80.9% and 0.61 for 5-fold cross-validation, and 80.6% and 0.60 for the independent test. These results are better than those of the previous studies. Thirdly, with the 4,884 β-hairpin motifs and 4,310 non-β-hairpin motifs as the training set and the 8,291 β-hairpin motifs and 6,865 non-β-hairpin motifs as the independent testing set, the overall accuracy and Matthews correlation coefficient are 81.5% and 0.63.
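
    The two metrics reported in this abstract follow from a binary confusion matrix. The sketch below uses the standard definitions of overall accuracy and the Matthews correlation coefficient; the counts in the example are hypothetical, since the abstract gives only set sizes and final scores, not the confusion matrices.

```python
from math import sqrt


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a binary confusion matrix.

    Returns 0.0 when a row or column of the matrix is empty, which is a
    common convention for the otherwise undefined 0/0 case.
    """
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical counts for illustration only (not results from the paper).
tp, tn, fp, fn = 80, 70, 30, 20
print(f"accuracy = {accuracy(tp, tn, fp, fn):.3f}")
print(f"MCC      = {mcc(tp, tn, fp, fn):.3f}")
```

    Unlike accuracy, the coefficient stays near zero for a classifier that ignores the minority class, which is one reason both numbers are often reported together.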

  4. An RNA motif that binds ATP

    NASA Technical Reports Server (NTRS)

    Sassanfar, M.; Szostak, J. W.

    1993-01-01

    RNAs that contain specific high-affinity binding sites for small molecule ligands immobilized on a solid support are present at a frequency of roughly one in 10^10 to 10^11 in pools of random sequence RNA molecules. Here we describe a new in vitro selection procedure designed to ensure the isolation of RNAs that bind the ligand of interest in solution as well as on a solid support. We have used this method to isolate a remarkably small RNA motif that binds ATP, a substrate in numerous biological reactions and the universal biological high-energy intermediate. The selected ATP-binding RNAs contain a consensus sequence, embedded in a common secondary structure. The binding properties of ATP analogues and modified RNAs show that the binding interaction is characterized by a large number of close contacts between the ATP and RNA, and by a change in the conformation of the RNA.

  5. Complex lasso: new entangled motifs in proteins

    NASA Astrophysics Data System (ADS)

    Niemyska, Wanda; Dabrowski-Tumanski, Pawel; Kadlof, Michal; Haglund, Ellinor; Sułkowski, Piotr; Sulkowska, Joanna I.

    2016-11-01

    We identify new entangled motifs in proteins that we call complex lassos. Lassos arise in proteins with disulfide bridges (or in proteins with amide linkages), when termini of a protein backbone pierce through an auxiliary surface of minimal area, spanned on a covalent loop. We find that as many as 18% of all proteins with disulfide bridges in a non-redundant subset of PDB form complex lassos, and classify them into six distinct geometric classes, one of which resembles supercoiling known from DNA. Based on biological classification of proteins we find that lassos are much more common in viruses, plants and fungi than in other kingdoms of life. We also discuss how changes in the oxidation/reduction potential may affect the function of proteins with lassos. Lassos and associated surfaces of minimal area provide new and interesting geometric characteristics, with many potential applications, not only of proteins but also of other biomolecules.

  6. An RNA motif that binds ATP

    NASA Technical Reports Server (NTRS)

    Sassanfar, M.; Szostak, J. W.

    1993-01-01

    RNAs that contain specific high-affinity binding sites for small molecule ligands immobilized on a solid support are present at a frequency of roughly one in 10^10 to 10^11 in pools of random sequence RNA molecules. Here we describe a new in vitro selection procedure designed to ensure the isolation of RNAs that bind the ligand of interest in solution as well as on a solid support. We have used this method to isolate a remarkably small RNA motif that binds ATP, a substrate in numerous biological reactions and the universal biological high-energy intermediate. The selected ATP-binding RNAs contain a consensus sequence, embedded in a common secondary structure. The binding properties of ATP analogues and modified RNAs show that the binding interaction is characterized by a large number of close contacts between the ATP and RNA, and by a change in the conformation of the RNA.

  7. Complex lasso: new entangled motifs in proteins

    PubMed Central

    Niemyska, Wanda; Dabrowski-Tumanski, Pawel; Kadlof, Michal; Haglund, Ellinor; Sułkowski, Piotr; Sulkowska, Joanna I.

    2016-01-01

    We identify new entangled motifs in proteins that we call complex lassos. Lassos arise in proteins with disulfide bridges (or in proteins with amide linkages), when termini of a protein backbone pierce through an auxiliary surface of minimal area, spanned on a covalent loop. We find that as many as 18% of all proteins with disulfide bridges in a non-redundant subset of PDB form complex lassos, and classify them into six distinct geometric classes, one of which resembles supercoiling known from DNA. Based on biological classification of proteins we find that lassos are much more common in viruses, plants and fungi than in other kingdoms of life. We also discuss how changes in the oxidation/reduction potential may affect the function of proteins with lassos. Lassos and associated surfaces of minimal area provide new and interesting geometric characteristics, with many potential applications, not only of proteins but also of other biomolecules. PMID:27874096

  8. Protein structural motifs in prediction and design.

    PubMed

    Mackenzie, Craig O; Grigoryan, Gevorg

    2017-06-01

    The Protein Data Bank (PDB) has been an integral resource for shaping our fundamental understanding of protein structure and for the advancement of such applications as protein design and structure prediction. Over the years, information from the PDB has been used to generate models ranging from specific structural mechanisms to general statistical potentials. With accumulating structural data, it has become possible to mine for more complete and complex structural observations, deducing more accurate generalizations. Motif libraries, which capture recurring structural features along with their sequence preferences, have exposed modularity in the structural universe and found successful application in various problems of structural biology. Here we summarize recent achievements in this arena, focusing on subdomain level structural patterns and their applications to protein design and structure prediction, and suggest promising future directions as the structural database continues to grow. Copyright © 2017 Elsevier Ltd. All rights reserved.

  9. Motif-role-fingerprints: the building-blocks of motifs, clustering-coefficients and transitivities in directed networks.

    PubMed

    McDonnell, Mark D; Yaveroğlu, Ömer Nebil; Schmerl, Brett A; Iannella, Nicolangelo; Ward, Lawrence M

    2014-01-01

    Complex networks are frequently characterized by metrics for which particular subgraphs are counted. One statistic from this category, which we refer to as motif-role fingerprints, differs from global subgraph counts in that the number of subgraphs in which each node participates is counted. As with global subgraph counts, it can be important to distinguish between motif-role fingerprints that are 'structural' (induced subgraphs) and 'functional' (partial subgraphs). Here we show mathematically that a vector of all functional motif-role fingerprints can readily be obtained from an arbitrary directed adjacency matrix, and then converted to structural motif-role fingerprints by multiplying that vector by a specific invertible conversion matrix. This result demonstrates that a unique structural motif-role fingerprint exists for any given functional motif-role fingerprint. We demonstrate a similar result for the cases of functional and structural motif-fingerprints without node roles, and global subgraph counts that form the basis of standard motif analysis. We also explicitly highlight that motif-role fingerprints are elemental to several popular metrics for quantifying the subgraph structure of directed complex networks, including motif distributions, directed clustering coefficient, and transitivity. The relationships between each of these metrics and motif-role fingerprints also suggest new subtypes of directed clustering coefficients and transitivities. Our results have potential utility in analyzing directed synaptic networks constructed from neuronal connectome data, such as in terms of centrality. Other potential applications include anomaly detection in networks, identification of similar networks and identification of similar nodes within networks. Matlab code for calculating all stated metrics following calculation of functional motif-role fingerprints is provided as S1 Matlab File.

  10. Motif-Role-Fingerprints: The Building-Blocks of Motifs, Clustering-Coefficients and Transitivities in Directed Networks

    PubMed Central

    McDonnell, Mark D.; Yaveroğlu, Ömer Nebil; Schmerl, Brett A.; Iannella, Nicolangelo; Ward, Lawrence M.

    2014-01-01

    Complex networks are frequently characterized by metrics for which particular subgraphs are counted. One statistic from this category, which we refer to as motif-role fingerprints, differs from global subgraph counts in that the number of subgraphs in which each node participates is counted. As with global subgraph counts, it can be important to distinguish between motif-role fingerprints that are ‘structural’ (induced subgraphs) and ‘functional’ (partial subgraphs). Here we show mathematically that a vector of all functional motif-role fingerprints can readily be obtained from an arbitrary directed adjacency matrix, and then converted to structural motif-role fingerprints by multiplying that vector by a specific invertible conversion matrix. This result demonstrates that a unique structural motif-role fingerprint exists for any given functional motif-role fingerprint. We demonstrate a similar result for the cases of functional and structural motif-fingerprints without node roles, and global subgraph counts that form the basis of standard motif analysis. We also explicitly highlight that motif-role fingerprints are elemental to several popular metrics for quantifying the subgraph structure of directed complex networks, including motif distributions, directed clustering coefficient, and transitivity. The relationships between each of these metrics and motif-role fingerprints also suggest new subtypes of directed clustering coefficients and transitivities. Our results have potential utility in analyzing directed synaptic networks constructed from neuronal connectome data, such as in terms of centrality. Other potential applications include anomaly detection in networks, identification of similar networks and identification of similar nodes within networks. Matlab code for calculating all stated metrics following calculation of functional motif-role fingerprints is provided as S1 Matlab File. PMID:25486535
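
    The distinction this abstract draws between functional (partial) and structural (induced) counts, and the per-node, per-role counting, can be shown on a single motif. The sketch below enumerates feed-forward loops directly and tallies each node's participation as source, middle or target. It is a toy enumeration under assumed role names, not the adjacency-matrix and conversion-matrix method described in the paper, and the example graph is hypothetical.

```python
from itertools import permutations

Edge = tuple[str, str]


def ffl_role_counts(nodes: list[str], edges: set[Edge],
                    induced: bool) -> dict[str, dict[str, int]]:
    """Per-node counts of feed-forward-loop participation, split by role.

    induced=False counts functional (partial) subgraphs: the three motif
    edges must exist and extra edges among the three nodes are allowed.
    induced=True counts structural (induced) subgraphs: no extra edge
    among the three nodes is allowed.
    """
    counts = {n: {"source": 0, "middle": 0, "target": 0} for n in nodes}
    for a, b, c in permutations(nodes, 3):
        if not {(a, b), (b, c), (a, c)} <= edges:
            continue
        if induced and ({(b, a), (c, b), (c, a)} & edges):
            continue  # an extra edge makes the induced subgraph a different motif
        counts[a]["source"] += 1
        counts[b]["middle"] += 1
        counts[c]["target"] += 1
    return counts


# Hypothetical toy graph: one feed-forward loop plus a reciprocal edge z->y.
nodes = ["x", "y", "z"]
edges = {("x", "y"), ("y", "z"), ("x", "z"), ("z", "y")}
print(ffl_role_counts(nodes, edges, induced=False))
print(ffl_role_counts(nodes, edges, induced=True))
```

    The cubic enumeration is only practical for small graphs, which is part of the appeal of a matrix formulation.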

  11. The limits of de novo DNA motif discovery.

    PubMed

    Simcha, David; Price, Nathan D; Geman, Donald

    2012-01-01

    A major challenge in molecular biology is reverse-engineering the cis-regulatory logic that plays a major role in the control of gene expression. This program includes searching through DNA sequences to identify "motifs" that serve as the binding sites for transcription factors or, more generally, are predictive of gene expression across cellular conditions. Several approaches have been proposed for de novo motif discovery, that is, searching sequences without prior knowledge of binding sites or nucleotide patterns. However, unbiased validation is not straightforward. We consider two approaches to unbiased validation of discovered motifs: testing the statistical significance of a motif using a DNA "background" sequence model to represent the null hypothesis and measuring performance in predicting membership in gene clusters. We demonstrate that the background models typically used are "too null," resulting in overly optimistic assessments of significance, and argue that performance in predicting TF binding or expression patterns from DNA motifs should be assessed by held-out data, as in predictive learning. Applying this criterion to common motif discovery methods resulted in universally poor performance, although there is a marked improvement when motifs are statistically significant against real background sequences. Moreover, on synthetic data where "ground truth" is known, discriminative performance of all algorithms is far below the theoretical upper bound, with pronounced "over-fitting" in training. A key conclusion from this work is that the failure of de novo discovery approaches to accurately identify motifs is basically due to statistical intractability resulting from the fixed size of co-regulated gene clusters, and thus such failures do not necessarily provide evidence that unfound motifs are not active biologically. Consequently, the use of prior knowledge to enhance motif discovery is not just advantageous but necessary. An implementation of the LR and ALR

  12. EXTREME: an online EM algorithm for motif discovery

    PubMed Central

    Quang, Daniel; Xie, Xiaohui

    2014-01-01

    Motivation: Identifying regulatory elements is a fundamental problem in the field of gene transcription. Motif discovery—the task of identifying the sequence preference of transcription factor proteins, which bind to these elements—is an important step in this challenge. MEME is a popular motif discovery algorithm. Unfortunately, MEME’s running time scales poorly with the size of the dataset. Experiments such as ChIP-Seq and DNase-Seq are providing a rich amount of information on the binding preference of transcription factors. MEME cannot discover motifs in data from these experiments in a practical amount of time without a compromising strategy such as discarding a majority of the sequences. Results: We present EXTREME, a motif discovery algorithm designed to find DNA-binding motifs in ChIP-Seq and DNase-Seq data. Unlike MEME, which uses the expectation-maximization algorithm for motif discovery, EXTREME uses the online expectation-maximization algorithm to discover motifs. EXTREME can discover motifs in large datasets in a practical amount of time without discarding any sequences. Using EXTREME on ChIP-Seq and DNase-Seq data, we discover many motifs, including some novel and infrequent motifs that can only be discovered by using the entire dataset. Conservation analysis of one of these novel infrequent motifs confirms that it is evolutionarily conserved and possibly functional. Availability and implementation: All source code is available at the Github repository http://github.com/uci-cbcl/EXTREME. Contact: xhx@ics.uci.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:24532725

  13. Mapping the distribution of packing topologies within protein interiors shows predominant preference for specific packing motifs

    PubMed Central

    2011-01-01

    Background: Mapping protein primary sequences to their three-dimensional folds, referred to as the 'second genetic code', remains an unsolved scientific problem. A crucial part of the problem concerns the geometrical specificity in side chain association leading to densely packed protein cores, a hallmark of correctly folded native structures. Thus, any model of packing within proteins should constitute an indispensable component of protein folding and design. Results: In this study an attempt has been made to find, characterize and classify recurring patterns in the packing of side chain atoms within a protein which sustains its native fold. The interaction of side chain atoms within the protein core has been represented as a contact network based on the surface complementarity and overlap between associating side chain surfaces. Some network topologies definitely appear to be preferred and they have been termed 'packing motifs', analogous to super secondary structures in proteins. Study of the distribution of these motifs reveals the ubiquitous presence of typical smaller graphs, which appear to get linked or coalesce to give larger graphs, reminiscent of the nucleation-condensation model in protein folding. One such frequently occurring motif, also envisaged as the unit of clustering, the three residue clique, was invariably found in regions of dense packing. Finally, topological measures based on surface contact networks appeared to be effective in discriminating sequences native to a specific fold amongst a set of decoys. Conclusions: Out of innumerable topological possibilities, only a finite number of specific packing motifs are actually realized in proteins. This small number of motifs could serve as a basis set in the construction of larger networks. Of these, the triplet clique exhibits distinct preference both in terms of composition and geometry. PMID:21605466

  14. Mapping the distribution of packing topologies within protein interiors shows predominant preference for specific packing motifs.

    PubMed

    Basu, Sankar; Bhattacharyya, Dhananjay; Banerjee, Rahul

    2011-05-24

    Mapping protein primary sequences to their three dimensional folds referred to as the 'second genetic code' remains an unsolved scientific problem. A crucial part of the problem concerns the geometrical specificity in side chain association leading to densely packed protein cores, a hallmark of correctly folded native structures. Thus, any model of packing within proteins should constitute an indispensable component of protein folding and design. In this study an attempt has been made to find, characterize and classify recurring patterns in the packing of side chain atoms within a protein which sustains its native fold. The interaction of side chain atoms within the protein core has been represented as a contact network based on the surface complementarity and overlap between associating side chain surfaces. Some network topologies definitely appear to be preferred and they have been termed 'packing motifs', analogous to super secondary structures in proteins. Study of the distribution of these motifs reveals the ubiquitous presence of typical smaller graphs, which appear to get linked or coalesce to give larger graphs, reminiscent of the nucleation-condensation model in protein folding. One such frequently occurring motif, also envisaged as the unit of clustering, the three residue clique was invariably found in regions of dense packing. Finally, topological measures based on surface contact networks appeared to be effective in discriminating sequences native to a specific fold amongst a set of decoys. Out of innumerable topological possibilities, only a finite number of specific packing motifs are actually realized in proteins. This small number of motifs could serve as a basis set in the construction of larger networks. Of these, the triplet clique exhibits distinct preference both in terms of composition and geometry.

  15. Mitochondrial and Y chromosome haplotype motifs as diagnostic markers of Jewish ancestry: a reconsideration.

    PubMed

    Tofanelli, Sergio; Taglioli, Luca; Bertoncini, Stefania; Francalacci, Paolo; Klyosov, Anatole; Pagani, Luca

    2014-01-01

    Several authors have proposed haplotype motifs based on site variants at the mitochondrial genome (mtDNA) and the non-recombining portion of the Y chromosome (NRY) to trace the genealogies of Jewish people. Here, we analyze their main approaches and test the feasibility of adopting motifs as ancestry markers through construction of a large database of mtDNA and NRY haplotypes from public genetic genealogical repositories. We verified the reliability of Jewish ancestry prediction based on the Cohen and Levite Modal Haplotypes in their "classical" 6 STR marker format or in the "extended" 12 STR format, as well as four founder mtDNA lineages (HVS-I segments) accounting for about 40% of the current population of Ashkenazi Jews. For this purpose we compared haplotype composition in individuals of self-reported Jewish ancestry with the rest of European, African or Middle Eastern samples, to test for non-random association of ethno-geographic groups and haplotypes. Overall, NRY and mtDNA based motifs, previously reported to differentiate between groups, were found to be more represented in Jewish compared to non-Jewish groups. However, this seems to stem from common ancestors of Jewish lineages being rather recent with respect to ancestors of non-Jewish lineages with the same "haplotype signatures." Moreover, the polyphyly of haplotypes which contain the proposed motifs and the misuse of constant mutation rates heavily affected previous attempts to correctly date the origin of common ancestries. Accordingly, our results stress the limitations of using the above haplotype motifs as reliable Jewish ancestry predictors and show their inadequacy for forensic or genealogical purposes.

  16. The Effect of Orthology and Coregulation on Detecting Regulatory Motifs

    PubMed Central

    Storms, Valerie; Claeys, Marleen; Sanchez, Aminael; De Moor, Bart; Verstuyf, Annemieke; Marchal, Kathleen

    2010-01-01

    Background: Computational de novo discovery of transcription factor binding sites is still a challenging problem. The growing number of sequenced genomes allows integrating orthology evidence with coregulation information when searching for motifs. Moreover, the more advanced motif detection algorithms explicitly model the phylogenetic relatedness between the orthologous input sequences and thus should be well adapted towards using orthologous information. In this study, we evaluated the conditions under which complementing coregulation with orthologous information improves motif detection for the class of probabilistic motif detection algorithms with an explicit evolutionary model. Methodology: We designed datasets (real and synthetic) covering different degrees of coregulation and orthologous information to test how well Phylogibbs and Phylogenetic sampler, as representatives of the motif detection algorithms with an evolutionary model, performed as compared to MEME, a more classical motif detection algorithm that treats orthologs independently. Results and Conclusions: Under certain conditions detecting motifs in the combined coregulation-orthology space is indeed more efficient than using each space separately, but this is not always the case. Moreover, the difference in success rate between the advanced algorithms and MEME is still marginal. The success rate of motif detection depends on the complex interplay between the added information and the specificities of the applied algorithms. Insights in this relation provide information useful to both developers and users. All benchmark datasets are available at http://homes.esat.kuleuven.be/~kmarchal/Supplementary_Storms_Valerie_PlosONE. PMID:20140085

  17. The effect of orthology and coregulation on detecting regulatory motifs.

    PubMed

    Storms, Valerie; Claeys, Marleen; Sanchez, Aminael; De Moor, Bart; Verstuyf, Annemieke; Marchal, Kathleen

    2010-02-03

    Computational de novo discovery of transcription factor binding sites is still a challenging problem. The growing number of sequenced genomes allows integrating orthology evidence with coregulation information when searching for motifs. Moreover, the more advanced motif detection algorithms explicitly model the phylogenetic relatedness between the orthologous input sequences and thus should be well adapted towards using orthologous information. In this study, we evaluated the conditions under which complementing coregulation with orthologous information improves motif detection for the class of probabilistic motif detection algorithms with an explicit evolutionary model. We designed datasets (real and synthetic) covering different degrees of coregulation and orthologous information to test how well Phylogibbs and Phylogenetic sampler, as representatives of the motif detection algorithms with an evolutionary model, performed as compared to MEME, a more classical motif detection algorithm that treats orthologs independently. Under certain conditions detecting motifs in the combined coregulation-orthology space is indeed more efficient than using each space separately, but this is not always the case. Moreover, the difference in success rate between the advanced algorithms and MEME is still marginal. The success rate of motif detection depends on the complex interplay between the added information and the specificities of the applied algorithms. Insights in this relation provide information useful to both developers and users. All benchmark datasets are available at http://homes.esat.kuleuven.be/~kmarchal/Supplementary_Storms_Valerie_PlosONE.

  18. Discriminative Motif Finding for Predicting Protein Subcellular Localization

    PubMed Central

    Lin, Tien-ho; Murphy, Robert F.; Bar-Joseph, Ziv

    2010-01-01

    Many methods have been described to predict the subcellular location of proteins from sequence information. However, most of these methods either rely on global sequence properties or use a set of known protein targeting motifs to predict protein localization. Here we develop and test a novel method that identifies potential targeting motifs using a discriminative approach based on hidden Markov models (discriminative HMMs). These models search for motifs that are present in a compartment but absent in other, nearby, compartments by utilizing a hierarchical structure that mimics the protein sorting mechanism. We show that both discriminative motif finding and the hierarchical structure improve localization prediction on a benchmark dataset of yeast proteins. The motifs identified can be mapped to known targeting motifs and they are more conserved than the average protein sequence. Using our motif-based predictions we can identify potential annotation errors in public databases for the location of some of the proteins. A software implementation and the dataset described in this paper are available from http://murphylab.web.cmu.edu/software/2009_TCBB_motif/ PMID:21233524

  19. DETAIL VIEW, MAIN ENTRANCE GATES, SHOWING A WINGED HOURGLASS MOTIF, ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    DETAIL VIEW, MAIN ENTRANCE GATES, SHOWING A WINGED HOURGLASS MOTIF, WHICH REFERS TO THE QUICK PASSAGE OF TIME AND THE SHORTNESS OF HUMAN LIFE. USE OF THIS MOTIF WAS A CARRYOVER FROM THE MCARTHUR GATES. - Woodlands Cemetery, 4000 Woodlands Avenue, Philadelphia, Philadelphia County, PA

  20. Identifying novel sequence variants of RNA 3D motifs

    PubMed Central

    Zirbel, Craig L.; Roll, James; Sweeney, Blake A.; Petrov, Anton I.; Pirrung, Meg; Leontis, Neocles B.

    2015-01-01

    Predicting RNA 3D structure from sequence is a major challenge in biophysics. An important sub-goal is accurately identifying recurrent 3D motifs from RNA internal and hairpin loop sequences extracted from secondary structure (2D) diagrams. We have developed and validated new probabilistic models for 3D motif sequences based on hybrid Stochastic Context-Free Grammars and Markov Random Fields (SCFG/MRF). The SCFG/MRF models are constructed using atomic-resolution RNA 3D structures. To parameterize each model, we use all instances of each motif found in the RNA 3D Motif Atlas and annotations of pairwise nucleotide interactions generated by the FR3D software. Isostericity relations between non-Watson–Crick basepairs are used in scoring sequence variants. SCFG techniques model nested pairs and insertions, while MRF ideas handle crossing interactions and base triples. We use test sets of randomly-generated sequences to set acceptance and rejection thresholds for each motif group and thus control the false positive rate. Validation was carried out by comparing results for four motif groups to RMDetect. The software developed for sequence scoring (JAR3D) is structured to automatically incorporate new motifs as they accumulate in the RNA 3D Motif Atlas when new structures are solved and is available free for download. PMID:26130723

  1. Exclusion of RNA strands from a purine motif triple helix.

    PubMed Central

    Semerad, C L; Maher, L J

    1994-01-01

    Research concerning oligonucleotide-directed triple helix formation has mainly focused on the binding of DNA oligonucleotides to duplex DNA. The participation of RNA strands in triple helices is also of interest. For the pyrimidine motif (pyrimidine.purine.pyrimidine triplets), systematic substitution of RNA for DNA in one, two, or all three triplex strands has previously been reported. For the purine motif (purine.purine.pyrimidine triplets), studies have shown only that RNA cannot bind to duplex DNA. To extend this result, we created a DNA triple helix in the purine motif and systematically replaced one, two, or all three strands with RNA. In dramatic contrast to the general accommodation of RNA strands in the pyrimidine triple helix motif, a stable triplex forms in the purine motif only when all three of the substituent strands are DNA. The lack of triplex formation among any of the other seven possible strand combinations involving RNA suggests that: (i) duplex structures containing RNA cannot be targeted by DNA oligonucleotides in the purine motif; (ii) RNA strands cannot be employed to recognize duplex DNA in the purine motif; and (iii) RNA tertiary structures are likely to contain only isolated base triplets in the purine motif. PMID:7529405

  2. MADMX: a strategy for maximal dense motif extraction.

    PubMed

    Grossi, Roberto; Pietracaprina, Andrea; Pisanti, Nadia; Pucci, Geppino; Upfal, Eli; Vandin, Fabio

    2011-04-01

    We develop, analyze, and experiment with a new tool, called MADMX, which extracts frequent motifs from biological sequences. We introduce the notion of density to single out the "significant" motifs. The density is a simple and flexible measure for bounding the number of don't cares in a motif, defined as the fraction of solid (i.e., different from don't care) characters in the motif. A maximal dense motif has density above a certain threshold, and any further specialization of a don't care symbol in it or any extension of its boundaries decreases its number of occurrences in the input sequence. By extracting only maximal dense motifs, MADMX reduces the output size and improves performance, while enhancing the quality of the discoveries. The efficiency of our approach relies on a newly defined combining operation, dubbed fusion, which allows for the construction of maximal dense motifs in a bottom-up fashion, while avoiding the generation of nonmaximal ones. We provide experimental evidence of the efficiency and the quality of the motifs returned by MADMX.
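    The density measure described in this abstract can be illustrated with a minimal sketch. This is not the MADMX implementation: the function names, the use of "." as the don't-care symbol, and the choice of a non-strict comparison against the threshold are all assumptions made for illustration.

```python
def density(motif: str, dont_care: str = ".") -> float:
    """Return the fraction of solid (non-don't-care) characters in a motif.

    The don't-care symbol "." is an illustrative assumption.
    """
    if not motif:
        raise ValueError("motif must be non-empty")
    solid = sum(1 for ch in motif if ch != dont_care)
    return solid / len(motif)


def is_dense(motif: str, threshold: float, dont_care: str = ".") -> bool:
    """Check whether a motif meets a density threshold.

    Whether the original tool compares strictly or non-strictly is not
    stated in the abstract; a non-strict comparison is assumed here.
    """
    return density(motif, dont_care) >= threshold
```

    For example, under these assumptions the motif "A.C.G" has three solid characters out of five, so it passes a threshold of 0.5 but fails a threshold of 0.75.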

  3. Crossover among structural motifs in Pd-Au nanoalloys.

    PubMed

    Zhu, Beien; Guesmi, Hazar; Creuze, Jérôme; Legrand, Bernard; Mottet, Christine

    2015-11-14

    The crossovers among the most abundant structural motifs (icosahedra, decahedra and truncated octahedra) of Pd-Au nanoalloys have been determined theoretically in a size range between 2 and 7 nm and for three compositions equivalent to Pd3Au, PdAu and PdAu3. The chemical ordering and segregation optimisation are performed via Monte Carlo simulations using semi-empirical tight-binding potentials fitted to ab initio calculations. The chemical configurations are then quenched via molecular dynamic simulations in order to compare their energy and characterize the equilibrium structures as a function of the cluster size. For the smaller sizes (of around 300 atoms and fewer) the structures are also optimized at the electronic level within ab initio calculations in order to validate the semi-empirical potential. The predictions of the crossover sizes for the nanoalloys cannot be simply extrapolated from the crossover of the pure nanoparticles but imply stress release phenomena related to the size misfit between the two metals. Indeed, alloying extends the range of stability of the icosahedron beyond that of the pure systems and the energy differences between decahedra and truncated octahedra become asymptotic, around the sizes of 5-6 nm. Nevertheless, such equilibrium results should be modulated regarding kinetic considerations or possible gas adsorption under experimental conditions.

  4. Automated discovery of active motifs in multiple RNA secondary structures

    SciTech Connect

    Wang, J.T.L.; Chang, Chia-Yo; Shapiro, B.A.

    1996-12-31

    In this paper we present a method for discovering approximately common motifs (also known as active motifs) in multiple RNA secondary structures. The secondary structures can be represented as ordered trees (i.e., the order among siblings matters). Motifs in these trees are connected subgraphs that can differ in both substitutions and deletions/insertions. The proposed method consists of two steps: (1) find candidate motifs in a small sample of the secondary structures; (2) search all of the secondary structures to determine how frequently these motifs occur (within the allowed approximation) in the secondary structures. To reduce the running time, we develop two optimization heuristics based on sampling and pattern matching techniques. Experimental results obtained by running these algorithms on both generated data and RNA secondary structures show the good performance of the algorithms. To demonstrate the utility of our algorithms, we discuss their applications to conducting the phylogenetic study of RNA sequences obtained from GenBank.

  5. Dynamic motifs of strategies in prisoner's dilemma games

    NASA Astrophysics Data System (ADS)

    Kim, Young Jin; Roh, Myungkyoon; Jeong, Seon-Young; Son, Seung-Woo

    2014-12-01

    We investigate the win-lose relations between strategies of iterated prisoner's dilemma games by using a directed network concept to display the replicator dynamics results. In the giant strongly-connected component of the win/lose network, we find win-lose circulations similar to rock-paper-scissors and analyze the fixed point and its stability. Applying the network motif concept, we introduce dynamic motifs, which describe the population dynamics relations among the three strategies. Through exact enumeration, we find 22 dynamic motifs and display their phase portraits. Visualization using directed networks and motif analysis is a useful method to make complex dynamic behavior simple in order to understand it more intuitively. Dynamic motifs can be building blocks for dynamic behavior among strategies when they are applied to other types of games.
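    As a rough illustration of the replicator dynamics mentioned in this abstract, the sketch below evaluates the standard replicator equation, dx_i/dt = x_i((Ax)_i - x·Ax), for a three-strategy game. The rock-paper-scissors payoff matrix is a hypothetical stand-in chosen for its cyclic win-lose structure; it is not the payoff structure used by the authors.

```python
def replicator_rhs(x, payoff):
    """Right-hand side of the replicator equation for population shares x.

    x: list of strategy frequencies (non-negative, summing to 1).
    payoff: square matrix, payoff[i][j] = payoff to strategy i against j.
    """
    n = len(x)
    # Fitness of each strategy against the current population mix.
    fitness = [sum(payoff[i][j] * x[j] for j in range(n)) for i in range(n)]
    # Population-average fitness.
    mean_fitness = sum(xi * fi for xi, fi in zip(x, fitness))
    return [xi * (fi - mean_fitness) for xi, fi in zip(x, fitness)]


# Hypothetical cyclic payoff matrix (rock-paper-scissors), for illustration only.
RPS = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]
```

    With this matrix, the uniform mix (1/3, 1/3, 1/3) gives a right-hand side of zero in every component, which is what makes it a fixed point; the components also sum to zero for any point on the simplex, so the total population share is conserved.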

  6. Structural motifs, mixing, and segregation effects in 38-atom binary clusters

    NASA Astrophysics Data System (ADS)

    Paz-Borbón, Lauro Oliver; Johnston, Roy L.; Barcaro, Giovanni; Fortunelli, Alessandro

    2008-04-01

    Thirty-eight-atom binary clusters composed of elements from groups 10 and 11 of the Periodic Table mixing a second-row with a third-row transition metal (TM) (i.e., clusters composed of the four pairs: Pd-Pt, Ag-Au, Pd-Au, and Ag-Pt) are studied through a combined empirical-potential (EP)/density functional (DF) method. A "system comparison" approach is adopted in order to analyze a wide diversity of structural motifs, and the energy competition among different structural motifs is studied at the DF level for these systems, mainly focusing on the composition 24-14 (the first number refers to the second-row TM atom) but also considering selected motifs with compositions 19-19 (of interest for investigating surface segregation effects) and 32-6 (also 14-24 and 6-32 for the Pd-Au pair). The results confirm the EP predictions about the stability of crystalline structures at this size for the Au-Pd pair but with decahedral or mixed fivefold-symmetric/close-packed structures in close competition with fcc motifs for the Ag-Au or Ag-Pt and Pd-Pt pairs, respectively. Overall, the EP description is found to be reasonably accurate for the Pd-Pt and Au-Pd pairs, whereas it is less reliable for the Ag-Au and Ag-Pt pairs due to electronic structure (charge transfer or directionality) effects. The driving force to core-shell chemical ordering is put on a quantitative basis, and surface segregation of the most cohesive element into the core is confirmed, with the exception of the Ag-Au pair for which charge transfer effects favor the segregation of Au to the surface of the clusters.

  7. Classification and assessment tools for structural motif discovery algorithms.

    PubMed

    Badr, Ghada; Al-Turaiki, Isra; Mathkour, Hassan

    2013-01-01

    Motif discovery is the problem of finding recurring patterns in biological data. Patterns can be sequential, mainly when discovered in DNA sequences. They can also be structural (e.g. when discovering RNA motifs). Finding common structural patterns helps to gain a better understanding of the mechanism of action (e.g. post-transcriptional regulation). Unlike DNA motifs, which are sequentially conserved, RNA motifs exhibit conservation in structure, which may be common even if the sequences are different. Over the past few years, hundreds of algorithms have been developed to solve the sequential motif discovery problem, while less work has been done for the structural case. In this paper, we survey, classify, and compare different algorithms that solve the structural motif discovery problem, where the underlying sequences may be different. We highlight their strengths and weaknesses. We start by proposing a benchmark dataset and a measurement tool that can be used to evaluate different motif discovery approaches. Then, we proceed by proposing our experimental setup. Finally, results are obtained using the proposed benchmark to compare available tools. To the best of our knowledge, this is the first attempt to compare tools solely designed for structural motif discovery. Results show that the accuracy of discovered motifs is relatively low. The results also suggest a complementary behavior among tools where some tools perform well on simple structures, while other tools are better for complex structures. We have classified and evaluated the performance of available structural motif discovery tools. In addition, we have proposed a benchmark dataset with tools that can be used to evaluate newly developed tools.

  8. Automatic annotation of protein motif function with Gene Ontology terms

    PubMed Central

    Lu, Xinghua; Zhai, Chengxiang; Gopalakrishnan, Vanathi; Buchanan, Bruce G

    2004-01-01

    Background Conserved protein sequence motifs are short stretches of amino acid sequence patterns that potentially encode the function of proteins. Several sequence pattern searching algorithms and programs exist for identifying candidate protein motifs at the whole genome level. However, a much needed and important task is to determine the functions of the newly identified protein motifs. The Gene Ontology (GO) project is an endeavor to annotate the function of genes or protein sequences with terms from a dynamic, controlled vocabulary and these annotations serve well as a knowledge base. Results This paper presents methods to mine the GO knowledge base and use the association between the GO terms assigned to a sequence and the motifs matched by the same sequence as evidence for predicting the functions of novel protein motifs automatically. The task of assigning GO terms to protein motifs is viewed as both a binary classification and information retrieval problem, where PROSITE motifs are used as samples for model training and functional prediction. The mutual information of a motif and a GO term association is found to be a very useful feature. We take advantage of the known motifs to train a logistic regression classifier, which allows us to combine mutual information with other frequency-based features and obtain a probability of correct association. The trained logistic regression model has intuitively meaningful and logically plausible parameter values, and performs very well empirically according to our evaluation criteria. Conclusions In this research, different methods for automatic annotation of protein motifs have been investigated. Empirical results demonstrated that the methods have a great potential for detecting and augmenting information about the functions of newly discovered candidate protein motifs. PMID:15345032
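    The mutual-information feature mentioned in this abstract can be sketched from a 2x2 table of sequence counts. The abstract does not give the authors' exact estimator, so the code below uses the textbook definition of mutual information between two binary variables (sequence matches the motif, sequence carries the GO term); the function name and argument layout are hypothetical.

```python
import math


def mutual_information(n11: int, n10: int, n01: int, n00: int) -> float:
    """Mutual information (in bits) between two binary variables.

    n11: sequences that match the motif and carry the GO term
    n10: sequences that match the motif but lack the GO term
    n01: sequences that lack the motif but carry the GO term
    n00: sequences with neither
    """
    total = n11 + n10 + n01 + n00
    if total == 0:
        raise ValueError("at least one sequence is required")
    joint = {(1, 1): n11, (1, 0): n10, (0, 1): n01, (0, 0): n00}
    # Marginal probabilities of motif match and GO term assignment.
    p_motif = {1: (n11 + n10) / total, 0: (n01 + n00) / total}
    p_term = {1: (n11 + n01) / total, 0: (n10 + n00) / total}
    mi = 0.0
    for (m, t), count in joint.items():
        if count == 0:
            continue  # a zero cell contributes nothing to the sum
        p = count / total
        mi += p * math.log2(p / (p_motif[m] * p_term[t]))
    return mi
```

    Under this definition, a motif and a GO term that co-occur no more often than chance score 0 bits, while a motif that perfectly predicts a term carried by half of the sequences scores 1 bit.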

  9. De Novo Regulatory Motif Discovery Identifies Significant Motifs in Promoters of Five Classes of Plant Dehydrin Genes

    PubMed Central

    Zolotarov, Yevgen; Strömvik, Martina

    2015-01-01

    Plants accumulate dehydrins in response to osmotic stresses. Dehydrins are divided into five different classes, which are thought to be regulated in different manners. To better understand differences in transcriptional regulation of the five dehydrin classes, de novo motif discovery was performed on 350 dehydrin promoter sequences from a total of 51 plant genomes. Overrepresented motifs were identified in the promoters of five dehydrin classes. The Kn dehydrin promoters contain motifs linked with meristem specific expression, as well as motifs linked with cold/dehydration and abscisic acid response. KS dehydrin promoters contain a motif with a GATA core. SKn and YnSKn dehydrin promoters contain motifs that match elements connected with cold/dehydration, abscisic acid and light response. YnKn dehydrin promoters contain motifs that match abscisic acid and light response elements, but not cold/dehydration response elements. Conserved promoter motifs are present in the dehydrin classes and across different plant lineages, indicating that dehydrin gene regulation is likely also conserved. PMID:26114291

  10. Single promoters as regulatory network motifs

    NASA Astrophysics Data System (ADS)

    Zopf, Christopher; Maheshri, Narendra

    2012-02-01

    At eukaryotic promoters, chromatin can influence the relationship between a gene's expression and transcription factor (TF) activity. This additional complexity might allow single promoters to exhibit dynamical behavior commonly attributed to regulatory motifs involving multiple genes. We investigate the role of promoter chromatin architecture in the kinetics of gene activation using a previously described set of promoter variants based on the phosphate-regulated PHO5 promoter in S. cerevisiae. Accurate quantitative measurement of transcription activation kinetics is facilitated by a controllable and observable TF input to a promoter of interest leading to an observable expression output in single cells. We find the particular architecture of these promoters can result in a significant delay in activation, filtering of noisy TF signals, and a memory of previous activation -- dynamical behaviors reminiscent of a feed-forward loop but only requiring a single promoter. We suggest this is a consequence of chromatin transactions at the promoter, likely passing through a long-lived "primed" state between its inactive and competent states. Finally, we show our experimental setup can be generalized as a "gene oscilloscope" to probe the kinetics of heterologous promoter architectures.

  11. Tripartite motif 32 prevents pathological cardiac hypertrophy.

    PubMed

    Chen, Lijuan; Huang, Jia; Ji, Yanxiao; Zhang, Xiaojing; Wang, Pixiao; Deng, Keqiong; Jiang, Xi; Ma, Genshan; Li, Hongliang

    2016-05-01

    TRIM32 (tripartite motif 32) is widely accepted to be an E3 ligase that interacts with and eventually ubiquitylates multiple substrates. TRIM32 mutants have been associated with LGMD-2H (limb girdle muscular dystrophy 2H). However, whether TRIM32 is involved in cardiac hypertrophy induced by biomechanical stresses and neurohumoral mediators remains unclear. We generated mice and isolated NRCMs (neonatal rat cardiomyocytes) that overexpressed or were deficient in TRIM32 to investigate the effect of TRIM32 on AB (aortic banding) or AngII (angiotensin II)-mediated cardiac hypertrophy. Echocardiography and both pathological and molecular analyses were used to determine the extent of cardiac hypertrophy and subsequent fibrosis. Our results showed that overexpression of TRIM32 in the heart significantly alleviated the hypertrophic response induced by pressure overload, whereas TRIM32 deficiency dramatically aggravated pathological cardiac remodelling. Similar results were also found in cultured NRCMs incubated with AngII. Mechanistically, the present study suggests that TRIM32 exerts cardioprotective action by interruption of Akt- but not MAPK (mitogen-dependent protein kinase)-dependent signalling pathways. Additionally, inactivation of Akt by LY294002 offset the exacerbated hypertrophic response induced by AB in TRIM32-deficient mice. In conclusion, the present study indicates that TRIM32 plays a protective role in AB-induced pathological cardiac remodelling by blocking Akt-dependent signalling. Therefore TRIM32 could be a novel therapeutic target for the prevention of cardiac hypertrophy and heart failure. © 2016 The Author(s).

  12. Construction of validated, non-redundant composite protein sequence databases.

    PubMed

    Bleasby, A J; Wootton, J C

    1990-01-01

    A strategy has been developed for the construction of a validated, comprehensive composite protein sequence database. Entries are amalgamated from primary source data bases by a largely automated set of processes in which redundant and trivially different entries are eliminated. A modular approach has been adopted to allow scientific judgement to be used at each stage of database processing and amalgamation. Source databases are assigned a priority depending on the quality of sequence validation and commenting. Rejection of entries from the lower priority database, in each pairwise comparison of databases, is carried out according to optionally defined redundancy criteria based on sequence segment mismatches. Efficient algorithms for this methodology are embodied in the COMPO software system. COMPO has been applied for over 2 years in construction and regular updating of the OWL composite protein sequence database from the source databases NBRF-PIR, SWISS-PROT, a GenBank translation retrieved from the feature tables, NBRF-NEW, NEWAT86, PSD-KYOTO and the sequences contained in the Brookhaven protein structure databank. OWL is part of the ISIS integrated data resource of protein sequence and structure [Akrigg et al. (1988) Nature, 335, 745-746]. The modular nature of the integration process greatly facilitates the frequent updating of OWL following releases of the source databases. The extent of redundancy in these sources is revealed by the comparison process. The advantages of a robust composite database for sequence similarity searching and information retrieval are discussed.

  13. The value of prior knowledge in discovering motifs with MEME

    SciTech Connect

    Bailey, T.L.; Elkan, C.

    1995-12-31

    MEME is a tool for discovering motifs in sets of protein or DNA sequences. This paper describes several extensions to MEME which increase its ability to find motifs in a totally unsupervised fashion, but which also allow it to benefit when prior knowledge is available. When no background knowledge is asserted, MEME obtains increased robustness from a method for determining motif widths automatically, and from probabilistic models that allow motifs to be absent in some input sequences. On the other hand, MEME can exploit prior knowledge about a motif being present in all input sequences, about the length of a motif and whether it is a palindrome, and (using Dirichlet mixtures) about expected patterns in individual motif positions. Extensive experiments are reported which support the claim that MEME benefits from, but does not require, background knowledge. The experiments use seven previously studied DNA and protein sequence families and 75 of the protein families documented in the Prosite database of sites and patterns, Release 11.1.

  14. Triadic motifs in the dependence networks of virtual societies.

    PubMed

    Xie, Wen-Jie; Li, Ming-Xia; Jiang, Zhi-Qiang; Zhou, Wei-Xing

    2014-06-10

    In friendship networks, individuals have different numbers of friends, and the closeness or intimacy between an individual and her friends is heterogeneous. Using a statistical filtering method to identify relationships about who depends on whom, we construct dependence networks (which are directed) from weighted friendship networks of avatars in more than two hundred virtual societies of a massively multiplayer online role-playing game (MMORPG). We investigate the evolution of triadic motifs in dependence networks. Several metrics show that the virtual societies evolved through a transient stage in the first two to three weeks and reached a relatively stable stage. We find that the unidirectional loop motif (M9) is underrepresented and does not appear, open motifs are also underrepresented, while other close motifs are overrepresented. We also find that, for most motifs, the overall level difference of the three avatars in the same motif is significantly lower than average, whereas the sum of ranks is only slightly larger than average. Our findings show that avatars' social status plays an important role in the formation of triadic motifs.
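    To make the unidirectional loop motif concrete, the sketch below counts node triples in a directed graph whose only edges form a one-way cycle (a to b, b to c, c to a, with no reciprocal links). This is an illustrative brute-force count, not the authors' method; the function name and the edge-list input format are assumptions.

```python
from itertools import combinations


def count_unidirectional_loops(edges):
    """Count node triples whose induced subgraph is exactly a one-way 3-cycle.

    edges: iterable of (source, target) pairs describing a directed graph.
    """
    edge_set = set(edges)
    nodes = list({node for edge in edge_set for node in edge})
    count = 0
    for a, b, c in combinations(nodes, 3):
        triple = (a, b, c)
        # All directed edges present among the three nodes.
        present = {
            (u, v)
            for u in triple
            for v in triple
            if u != v and (u, v) in edge_set
        }
        clockwise = {(a, b), (b, c), (c, a)}
        counter_clockwise = {(a, c), (c, b), (b, a)}
        if present == clockwise or present == counter_clockwise:
            count += 1
    return count
```

    The brute-force enumeration over all triples is cubic in the number of nodes, which is adequate for small examples; a triple with any reciprocal edge or a feed-forward pattern is not counted.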

  15. Triadic motifs in the dependence networks of virtual societies

    PubMed Central

    Xie, Wen-Jie; Li, Ming-Xia; Jiang, Zhi-Qiang; Zhou, Wei-Xing

    2014-01-01

    In friendship networks, individuals have different numbers of friends, and the closeness or intimacy between an individual and her friends is heterogeneous. Using a statistical filtering method to identify relationships about who depends on whom, we construct dependence networks (which are directed) from weighted friendship networks of avatars in more than two hundred virtual societies of a massively multiplayer online role-playing game (MMORPG). We investigate the evolution of triadic motifs in dependence networks. Several metrics show that the virtual societies evolved through a transient stage in the first two to three weeks and reached a relatively stable stage. We find that the unidirectional loop motif (M9) is underrepresented and does not appear, open motifs are also underrepresented, while other close motifs are overrepresented. We also find that, for most motifs, the overall level difference of the three avatars in the same motif is significantly lower than average, whereas the sum of ranks is only slightly larger than average. Our findings show that avatars' social status plays an important role in the formation of triadic motifs. PMID:24912755

  16. MEME: discovering and analyzing DNA and protein sequence motifs.

    PubMed

    Bailey, Timothy L; Williams, Nadya; Misleh, Chris; Li, Wilfred W

    2006-07-01

    MEME (Multiple EM for Motif Elicitation) is one of the most widely used tools for searching for novel 'signals' in sets of biological sequences. Applications include the discovery of new transcription factor binding sites and protein domains. MEME works by searching for repeated, ungapped sequence patterns that occur in the DNA or protein sequences provided by the user. Users can perform MEME searches via the web server hosted by the National Biomedical Computation Resource (http://meme.nbcr.net) and several mirror sites. Through the same web server, users can also access the Motif Alignment and Search Tool to search sequence databases for matches to motifs encoded in several popular formats. By clicking on buttons in the MEME output, users can compare the motifs discovered in their input sequences with databases of known motifs, search sequence databases for matches to the motifs and display the motifs in various formats. This article describes the freely accessible web server and its architecture, and discusses ways to use MEME effectively to find new sequence patterns in biological sequences and analyze their significance.

  17. MEME: discovering and analyzing DNA and protein sequence motifs

    PubMed Central

    Bailey, Timothy L.; Williams, Nadya; Misleh, Chris; Li, Wilfred W.

    2006-01-01

    MEME (Multiple EM for Motif Elicitation) is one of the most widely used tools for searching for novel ‘signals’ in sets of biological sequences. Applications include the discovery of new transcription factor binding sites and protein domains. MEME works by searching for repeated, ungapped sequence patterns that occur in the DNA or protein sequences provided by the user. Users can perform MEME searches via the web server hosted by the National Biomedical Computation Resource and several mirror sites. Through the same web server, users can also access the Motif Alignment and Search Tool to search sequence databases for matches to motifs encoded in several popular formats. By clicking on buttons in the MEME output, users can compare the motifs discovered in their input sequences with databases of known motifs, search sequence databases for matches to the motifs and display the motifs in various formats. This article describes the freely accessible web server and its architecture, and discusses ways to use MEME effectively to find new sequence patterns in biological sequences and analyze their significance. PMID:16845028

  18. Profile-based short linear protein motif discovery

    PubMed Central

    2012-01-01

    Background Short linear protein motifs are attracting increasing attention as functionally independent sites, typically 3–10 amino acids in length, that are enriched in disordered regions of proteins. Multiple methods have recently been proposed to discover over-represented motifs within a set of proteins based on simple regular expressions. Here, we extend these approaches to profile-based methods, which provide a richer motif representation. Results The profile motif discovery method MEME performed relatively poorly for motifs in disordered regions of proteins. However, when we applied evolutionary weighting to account for redundancy amongst homologous proteins, and masked out poorly conserved regions of disordered proteins, the performance of MEME was equivalent to that of regular expression methods. Nevertheless, the two approaches returned different subsets of motifs within both a benchmark dataset and a more realistic discovery dataset. Conclusions Profile-based motif discovery methods complement regular expression based methods. Whilst profile-based methods are computationally more intensive, they are likely to discover motifs currently overlooked by regular expression methods. PMID:22607209

  19. RNA motif search with data-driven element ordering.

    PubMed

    Rampášek, Ladislav; Jimenez, Randi M; Lupták, Andrej; Vinař, Tomáš; Brejová, Broňa

    2016-05-18

    In this paper, we study the problem of RNA motif search in long genomic sequences. This approach uses a combination of sequence and structure constraints to uncover new distant homologs of known functional RNAs. The problem is NP-hard and is traditionally solved by backtracking algorithms. We have designed a new algorithm for RNA motif search and implemented a new motif search tool RNArobo. The tool enhances the RNAbob descriptor language, allowing insertions in helices, which enables better characterization of ribozymes and aptamers. A typical RNA motif consists of multiple elements and the running time of the algorithm is highly dependent on their ordering. By approaching the element ordering problem in a principled way, we demonstrate more than 100-fold speedup of the search for complex motifs compared to previously published tools. We have developed a new method for RNA motif search that allows for a significant speedup of the search of complex motifs that include pseudoknots. Such speed improvements are crucial at a time when the rate of DNA sequencing outpaces growth in computing. RNArobo is available at http://compbio.fmph.uniba.sk/rnarobo .

  20. The TRTGn motif stabilizes the transcription initiation open complex.

    PubMed

    Voskuil, Martin I; Chambliss, Glenn H

    2002-09-20

    The effect on transcription initiation by the extended -10 motif (5'-TRTG(n)-3'), positioned upstream of the -10 region, was investigated using a series of base substitution mutations in the alpha-amylase promoter (amyP). The extended -10 motif, previously referred to as the -16 region, is found frequently in Gram-positive bacterial promoters and several extended -10 promoters from Escherichia coli. The inhibitory effects of the non-productive promoter site (amyP2), which overlaps the upstream region of amyP, were eliminated by mutagenesis of the -35 region and the TRTG motif of amyP2. Removal by mutagenesis of the competitive effects of amyP2 resulted in a reduced dependence of amyP on the TRTG motif. In the absence of the second promoter, mutations in the TRTG motif of amyP destabilized the open complex and prevented the maintenance of open complexes at low temperatures. The open complex half-life was up to 26-fold shorter in the mutant TRTG motif promoters than in the wild-type promoter. We demonstrate that the amyP TRTG motif dramatically stabilizes the open complex intermediate during transcription initiation. Even though the open complex is less stable in the mutant promoters, the region of melted DNA is the same in the wild-type and mutant promoters. However, upon addition of the first three nucleotides, which trap RNAP (RNA polymerase) in a stable initiating complex, the melted DNA region contracts at the 5'-end in a TRTG motif promoter mutant but not at the wild-type promoter, indicating that the motif contributes to maintaining DNA-strand separation.

  1. MEME-ChIP: motif analysis of large DNA datasets.

    PubMed

    Machanick, Philip; Bailey, Timothy L

    2011-06-15

    Advances in high-throughput sequencing have resulted in rapid growth in large, high-quality datasets including those arising from transcription factor (TF) ChIP-seq experiments. While there are many existing tools for discovering TF binding site motifs in such datasets, most web-based tools cannot directly process such large datasets. The MEME-ChIP web service is designed to analyze ChIP-seq 'peak regions'--short genomic regions surrounding declared ChIP-seq 'peaks'. Given a set of genomic regions, it performs (i) ab initio motif discovery, (ii) motif enrichment analysis, (iii) motif visualization, (iv) binding affinity analysis and (v) motif identification. It runs two complementary motif discovery algorithms on the input data--MEME and DREME--and uses the motifs they discover in subsequent visualization, binding affinity and identification steps. MEME-ChIP also performs motif enrichment analysis using the AME algorithm, which can detect very low levels of enrichment of binding sites for TFs with known DNA-binding motifs. Importantly, unlike with the MEME web service, there is no restriction on the size or number of uploaded sequences, allowing very large ChIP-seq datasets to be analyzed. The analyses performed by MEME-ChIP provide the user with a varied view of the binding and regulatory activity of the ChIP-ed TF, as well as the possible involvement of other DNA-binding TFs. MEME-ChIP is available as part of the MEME Suite at http://meme.nbcr.net.

  2. MEME-ChIP: motif analysis of large DNA datasets

    PubMed Central

    Machanick, Philip; Bailey, Timothy L.

    2011-01-01

    Motivation: Advances in high-throughput sequencing have resulted in rapid growth in large, high-quality datasets including those arising from transcription factor (TF) ChIP-seq experiments. While there are many existing tools for discovering TF binding site motifs in such datasets, most web-based tools cannot directly process such large datasets. Results: The MEME-ChIP web service is designed to analyze ChIP-seq ‘peak regions’—short genomic regions surrounding declared ChIP-seq ‘peaks’. Given a set of genomic regions, it performs (i) ab initio motif discovery, (ii) motif enrichment analysis, (iii) motif visualization, (iv) binding affinity analysis and (v) motif identification. It runs two complementary motif discovery algorithms on the input data—MEME and DREME—and uses the motifs they discover in subsequent visualization, binding affinity and identification steps. MEME-ChIP also performs motif enrichment analysis using the AME algorithm, which can detect very low levels of enrichment of binding sites for TFs with known DNA-binding motifs. Importantly, unlike with the MEME web service, there is no restriction on the size or number of uploaded sequences, allowing very large ChIP-seq datasets to be analyzed. The analyses performed by MEME-ChIP provide the user with a varied view of the binding and regulatory activity of the ChIP-ed TF, as well as the possible involvement of other DNA-binding TFs. Availability: MEME-ChIP is available as part of the MEME Suite at http://meme.nbcr.net. Contact: t.bailey@uq.edu.au Supplementary information: Supplementary data are available at Bioinformatics online. PMID:21486936

  3. Automated motif extraction and classification in RNA tertiary structures

    PubMed Central

    Djelloul, Mahassine; Denise, Alain

    2008-01-01

    We used a novel graph-based approach to extract RNA tertiary motifs. We cataloged them all and clustered them using an innovative graph similarity measure. We applied our method to three widely studied structures: Haloarcula marismortui 50S (H.m 50S), Escherichia coli 50S (E. coli 50S), and Thermus thermophilus 16S (T.th 16S) RNAs. We identified 10 known motifs without any prior knowledge of their shapes or positions. We additionally identified four putative new motifs. PMID:18957493

  4. Coherent feedforward transcriptional regulatory motifs enhance drug resistance

    NASA Astrophysics Data System (ADS)

    Charlebois, Daniel A.; Balázsi, Gábor; Kærn, Mads

    2014-05-01

    Fluctuations in gene expression give identical cells access to a spectrum of phenotypes that can serve as a transient, nongenetic basis for natural selection by temporarily increasing drug resistance. In this study, we demonstrate using mathematical modeling and simulation that certain gene regulatory network motifs, specifically coherent feedforward loop motifs, can facilitate the development of nongenetic resistance by increasing cell-to-cell variability and the time scale at which beneficial phenotypic states can be maintained. Our results highlight how regulatory network motifs enabling transient, nongenetic inheritance play an important role in defining reproductive fitness in adverse environments and provide a selective advantage subject to evolutionary pressure.

  5. Targeting functional motifs of a protein family.

    PubMed

    Bhadola, Pradeep; Deo, Nivedita

    2016-10-01

    The structural organization of a protein family is investigated by devising a method based on the random matrix theory (RMT), which uses the physicochemical properties of the amino acids with multiple sequence alignment. A graphical method to represent protein sequences using physicochemical properties is devised that gives a fast, easy, and informative way of comparing the evolutionary distances between protein sequences. A correlation matrix associated with each property is calculated, where the noise reduction and information filtering are done using RMT involving an ensemble of Wishart matrices. The analysis of the eigenvalue statistics of the correlation matrix for the β-lactamase family shows the universal features as observed in the Gaussian orthogonal ensemble (GOE). The property-based approach captures the short- as well as the long-range correlation (approximately following GOE) between the eigenvalues, whereas the previous approach (treating amino acids as characters) gives the usual short-range correlations, while the long-range correlations are the same as that of an uncorrelated series. The distribution of the eigenvector components for the eigenvalues outside the bulk (RMT bound) deviates significantly from RMT observations and contains important information about the system. The information content of each eigenvector of the correlation matrix is quantified by introducing an entropic estimate, which shows that for the β-lactamase family the smallest eigenvectors (low eigenmodes) are highly localized as well as informative. These small eigenvectors, when processed, give clusters involving positions that have well-defined biological and structural importance matching with experiments. The approach is crucial for the recognition of structural motifs as shown in β-lactamase (and other families) and selectively identifies the important positions for targets to deactivate (activate) the enzymatic actions.

  6. Targeting functional motifs of a protein family

    NASA Astrophysics Data System (ADS)

    Bhadola, Pradeep; Deo, Nivedita

    2016-10-01

    The structural organization of a protein family is investigated by devising a method based on the random matrix theory (RMT), which uses the physicochemical properties of the amino acids with multiple sequence alignment. A graphical method to represent protein sequences using physicochemical properties is devised that gives a fast, easy, and informative way of comparing the evolutionary distances between protein sequences. A correlation matrix associated with each property is calculated, where the noise reduction and information filtering are done using RMT involving an ensemble of Wishart matrices. The analysis of the eigenvalue statistics of the correlation matrix for the β-lactamase family shows the universal features as observed in the Gaussian orthogonal ensemble (GOE). The property-based approach captures the short- as well as the long-range correlation (approximately following GOE) between the eigenvalues, whereas the previous approach (treating amino acids as characters) gives the usual short-range correlations, while the long-range correlations are the same as that of an uncorrelated series. The distribution of the eigenvector components for the eigenvalues outside the bulk (RMT bound) deviates significantly from RMT observations and contains important information about the system. The information content of each eigenvector of the correlation matrix is quantified by introducing an entropic estimate, which shows that for the β-lactamase family the smallest eigenvectors (low eigenmodes) are highly localized as well as informative. These small eigenvectors, when processed, give clusters involving positions that have well-defined biological and structural importance matching with experiments. The approach is crucial for the recognition of structural motifs as shown in β-lactamase (and other families) and selectively identifies the important positions for targets to deactivate (activate) the enzymatic actions.

  7. Thermodynamic Features of Structural Motifs Formed by β-L-RNA

    PubMed Central

    Szabat, Marta; Gudanis, Dorota; Kotkowiak, Weronika; Gdaniec, Zofia; Kierzek, Ryszard; Pasternak, Anna

    2016-01-01

    This is the first report to provide comprehensive thermodynamic and structural data concerning duplex, hairpin, quadruplex and i-motif structures in β-L-RNA series. Herein we confirm that, within the limits of experimental error, the thermodynamic stability of enantiomeric structural motifs is the same as that of naturally occurring D-RNA counterparts. In addition, formation of D-RNA/L-RNA heterochiral duplexes is also observed; however, their thermodynamic stability is significantly reduced in reference to homochiral D-RNA duplexes. The presence of three locked nucleic acid (LNA) residues within the D-RNA strand diminishes the negative effect of the enantiomeric, complementary L-RNA strand in the formation of heterochiral RNA duplexes. Similar behavior is also observed for heterochiral LNA-2′-O-methyl-D-RNA/L-RNA duplexes. The formation of heterochiral duplexes was confirmed by 1H NMR spectroscopy. The CD curves of homochiral L-RNA structural motifs are always reversed, whereas CD curves of heterochiral duplexes present individual features dependent on the composition of chiral strands. PMID:26908023

  8. Fast-Find: a novel computational approach to analyzing combinatorial motifs.

    PubMed

    Hamady, Micah; Peden, Erin; Knight, Rob; Singh, Ravinder

    2006-01-04

    Many vital biological processes, including transcription and splicing, require a combination of short, degenerate sequence patterns, or motifs, adjacent to defined sequence features. Although these motifs occur frequently by chance, they only have biological meaning within a specific context. Identifying transcripts that contain meaningful combinations of patterns is thus an important problem, which existing tools address poorly. Here we present a new approach, Fast-FIND (Fast-Fully Indexed Nucleotide Database), that uses a relational database to support rapid indexed searches for arbitrary combinations of patterns defined either by sequence or composition. Fast-FIND is easy to implement, takes less than a second to search the entire Drosophila genome sequence for arbitrary patterns adjacent to sites of alternative polyadenylation, and is sufficiently fast to allow sensitivity analysis on the patterns. We have applied this approach to identify transcripts that contain combinations of sequence motifs for RNA-binding proteins that may regulate alternative polyadenylation. Fast-FIND provides an efficient way to identify transcripts that are potentially regulated via alternative polyadenylation. We have used it to generate hypotheses about interactions between specific polyadenylation factors, which we will test experimentally.

  9. Identify Beta-Hairpin Motifs with Quadratic Discriminant Algorithm Based on the Chemical Shifts.

    PubMed

    YongE, Feng; GaoShan, Kou

    2015-01-01

    Successful prediction of the beta-hairpin motif will be helpful for understanding fold recognition. Some algorithms have been proposed for the prediction of beta-hairpin motifs. However, the parameters used by these methods were primarily based on the amino acid sequences. Here, we propose a novel model for predicting beta-hairpin structure based on the chemical shift. Firstly, we analyzed the statistical distribution of chemical shifts of six nuclei in non-beta-hairpin and beta-hairpin motifs. Secondly, we used these chemical shifts as features combined with three algorithms to predict beta-hairpin structure. Finally, we achieved the best prediction, namely a sensitivity of 92% and a specificity of 94% with a Matthews correlation coefficient of 0.85, using the quadratic discriminant analysis algorithm, which is clearly superior to the same method for the prediction of beta-hairpin structure from 20 amino acid compositions in three-fold cross-validation. Our findings show that the chemical shift is an effective parameter for beta-hairpin prediction, suggesting that quadratic discriminant analysis is a powerful algorithm for the prediction of beta-hairpins.
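    The three figures of merit quoted in this abstract (sensitivity, specificity, and the correlation coefficient) all derive from the four cells of a binary confusion matrix. The Python sketch below shows the standard definitions. The counts used in the example are hypothetical, chosen only to give 92% sensitivity and 94% specificity on balanced classes; they are not the study's data, and the class balance in the study is not stated here.

```python
import math


def sensitivity(tp, fn):
    """Fraction of true beta-hairpins that are predicted as beta-hairpins."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of non-beta-hairpins that are predicted as non-beta-hairpins."""
    return tn / (tn + fp)


def mcc(tp, tn, fp, fn):
    """Correlation coefficient between predicted and true binary labels."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical confusion-matrix counts (assumption: 100 items per class).
tp, fn, tn, fp = 92, 8, 94, 6
print(sensitivity(tp, fn), specificity(tn, fp), round(mcc(tp, tn, fp, fn), 3))
```

    Because the correlation coefficient depends on class balance, the same sensitivity and specificity can yield somewhat different values on an unbalanced data set.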

  10. A million peptide motifs for the molecular biologist.

    PubMed

    Tompa, Peter; Davey, Norman E; Gibson, Toby J; Babu, M Madan

    2014-07-17

    A molecular description of functional modules in the cell is the focus of many high-throughput studies in the postgenomic era. A large portion of biomolecular interactions in virtually all cellular processes is mediated by compact interaction modules, referred to as peptide motifs. Such motifs are typically less than ten residues in length, occur within intrinsically disordered regions, and are recognized and/or posttranslationally modified by structured domains of the interacting partner. In this review, we suggest that there might be over a million instances of peptide motifs in the human proteome. While this staggering number suggests that peptide motifs are numerous and the most understudied functional module in the cell, it also holds great opportunities for new discoveries. Copyright © 2014 Elsevier Inc. All rights reserved.

  11. SOUTH AND EAST SIDES; HENSCHIEN APPLIED HIS ORNAMENTAL MOTIF EVEN ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    SOUTH AND EAST SIDES; HENSCHIEN APPLIED HIS ORNAMENTAL MOTIF EVEN TO THIS TINY STRUCTURE; NOTE PRAIRIE-STYLE CAPITALS ON PILASTERS - Rath Packing Company, Salt Bunker, Sycamore Street between Elm & Eighteenth Streets, Waterloo, Black Hawk County, IA

  12. Ca2+-binding Motif of βγ-Crystallins*

    PubMed Central

    Srivastava, Shanti Swaroop; Mishra, Amita; Krishnan, Bal; Sharma, Yogendra

    2014-01-01

    βγ-Crystallin-type double clamp (N/D)(N/D)XX(S/T)S motif is an established but sparsely investigated motif for Ca2+ binding. A βγ-crystallin domain is formed of two Greek key motifs, accommodating two Ca2+-binding sites. βγ-Crystallins make a separate class of Ca2+-binding proteins (CaBP), apparently a major group of CaBP in bacteria. Paralleling the diversity in βγ-crystallin domains, these motifs also show great diversity, both in structure and in function. Although the expression of some of them has been associated with stress, virulence, and adhesion, the functional implications of Ca2+ binding to βγ-crystallins in mediating biological processes are yet to be elucidated. PMID:24567326
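    The (N/D)(N/D)XX(S/T)S pattern maps directly onto a regular expression, with X standing for any residue. The Python sketch below scans a one-letter protein sequence for it. The function name and the test peptide are hypothetical, and a sequence match alone does not establish Ca2+ binding.

```python
import re

# (N/D)(N/D)XX(S/T)S: N or D, N or D, any two residues, S or T, then S.
# The lookahead reports overlapping candidates too.
DOUBLE_CLAMP = re.compile(r"(?=([ND][ND][A-Z]{2}[ST]S))")


def find_double_clamp(protein):
    """Return (0-based start, matched hexapeptide) for each candidate site."""
    return [(m.start(), m.group(1)) for m in DOUBLE_CLAMP.finditer(protein.upper())]


# Hypothetical peptide, invented for illustration only.
print(find_double_clamp("MKNDAKTSGG"))  # [(2, 'NDAKTS')]
```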

  13. 5. Interior of showroom and offices. Note ship motifs in ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    5. Interior of showroom and offices. Note ship motifs in balcony and pilot house. Restored boats include a 1955 Standard (forward) and 1953 Clipper (background). - Barbour Boat Works, Tryon Palace Drive, New Bern, Craven County, NC

  14. DETAIL OF CORNICE MOULDING WITH RAM'S HEAD MOTIF. EIGHT SHADES ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    DETAIL OF CORNICE MOULDING WITH RAM'S HEAD MOTIF. EIGHT SHADES OF GOLD LEAF AND BURNISHED GOLD LEAF WERE USED FOR THE INTERIOR FINISHES. - Anaconda Historic District, Washoe Theater, 305 Main Street, Anaconda, Deer Lodge County, MT

  15. 10. DETAIL OF CORNICE MOULDING WITH RAM'S HEAD MOTIF. EIGHT ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    10. DETAIL OF CORNICE MOULDING WITH RAM'S HEAD MOTIF. EIGHT SHADES OF GOLD LEAF AND BURNISHED GOLD LEAF WERE USED FOR THE INTERIOR FINISHES - Anaconda Historic District, Washoe Theater, 305 Main Street, Anaconda, Deer Lodge County, MT

  16. Detecting Statistically Significant Communities of Triangle Motifs in Undirected Networks

    DTIC Science & Technology

    2015-03-16

    Perry et al. [6] by developing a statistical framework that supports the detection of triangle motif-based clusters in complex networks. The specific...triangle motif-based clustering. 2. Developed an algorithm for clustering undirected networks, where the triangle configuration was used as the basis...for forming clusters. 3. Developed a C++ implementation of the proposed clustering framework. 15. SUBJECT TERMS EOARD, Operations Research, Networks

  17. Functional characterization of motif sequences under purifying selection.

    PubMed

    Chen, De-Hua; Chang, Andrew Ying-Fei; Liao, Ben-Yang; Yeang, Chen-Hsiang

    2013-02-01

    Diverse life forms are driven by the evolution of gene regulatory programs including changes in regulator proteins and cis-regulatory elements. Alterations of cis-regulatory elements are likely to dominate the evolution of the gene regulatory networks, as they are subjected to smaller selective constraints compared with proteins and hence may evolve quickly to adapt to the environment. Prior studies on cis-regulatory element evolution focus primarily on sequence substitutions of known transcription factor-binding motifs. However, evolutionary models for the dynamics of motif occurrence are relatively rare, and comprehensive characterization of the evolution of all possible motif sequences has not been pursued. In the present study, we propose an algorithm to estimate the strength of purifying selection of a motif sequence based on an evolutionary model capturing the birth and death of motif occurrences on promoters. We term this measure the 'evolutionary retention coefficient', as it is related to yet distinct from the canonical definition of selection coefficient in population genetics. Using this algorithm, we estimate and report the evolutionary retention coefficients of all possible 10-nucleotide sequences from the aligned promoter sequences of 27 748 orthologous gene families in 34 mammalian species. Intriguingly, the evolutionary retention coefficients of motifs are intimately associated with their functional relevance. Top-ranking motifs (sorted by evolutionary retention coefficients) are significantly enriched with transcription factor-binding sequences according to the curated knowledge from the TRANSFAC database and the ChIP-seq data generated from the ENCODE Consortium. Moreover, genes harbouring high-scoring motifs on their promoters retain significantly coherent expression profiles, and those genes are over-represented in the functional classes involved in gene regulation. The validation results reveal the dependencies between natural selection and

  18. Three-Dimensional DNA Nanostructures Assembled from DNA Star Motifs.

    PubMed

    Tian, Cheng; Zhang, Chuan

    2017-01-01

    Tile-based DNA self-assembly is a promising method in DNA nanotechnology and has produced a wide range of nanostructures by using a small set of unique DNA strands. The DNA star motif, one type of DNA tile, has been employed to assemble a variety of symmetric one-, two-, and three-dimensional (1D, 2D, and 3D) DNA nanostructures. Herein, we describe the design principles, assembly methods, and characterization methods of 3D DNA nanostructures assembled from the DNA star motifs.

  19. Specific regulatory motifs predict glucocorticoid responsiveness of hippocampal gene expression.

    PubMed

    Datson, N A; Polman, J A E; de Jonge, R T; van Boheemen, P T M; van Maanen, E M T; Welten, J; McEwen, B S; Meiland, H C; Meijer, O C

    2011-10-01

    The glucocorticoid receptor (GR) is a ubiquitously expressed ligand-activated transcription factor that mediates effects of cortisol in relation to adaptation to stress. In the brain, GR affects the hippocampus to modulate memory processes through direct binding to glucocorticoid response elements (GREs) in the DNA. However, its effects are to a high degree cell specific, and its target genes in different cell types as well as the mechanisms conferring this specificity are largely unknown. To gain insight into hippocampal GR signaling, we characterized which GREs GR binds in the rat hippocampus. Using a position-specific scoring matrix, we identified evolutionarily conserved putative GREs from a microarray-based set of hippocampal target genes. Using chromatin immunoprecipitation, we were able to confirm GR binding to 15 out of a selection of 32 predicted sites (47%). The majority of these 15 GREs are previously undescribed and thus represent novel GREs that bind GR and therefore may be functional in the rat hippocampus. GRE nucleotide composition was not predictive for binding of GR to a GRE. A search for conserved flanking sequences that may predict GR-GRE interaction resulted in the identification of GC-box associated motifs, such as Myc-associated zinc finger protein 1, within 2 kb of GREs with GR binding in the hippocampus. This enrichment was not present around nonbinding GRE sequences nor around proven GR-binding sites from a mesenchymal stem-like cell dataset that we analyzed. GC-binding transcription factors therefore may be unique partners for DNA-bound GR and may in part explain cell-specific transcriptional regulation by glucocorticoids in the context of the hippocampus.
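    For readers unfamiliar with position-specific scoring matrices, the Python sketch below shows the general technique: a log-odds matrix is built from aligned example sites and then slid along a sequence to score each window. Everything in it is an assumption made for illustration: the four-base toy sites are not GREs, the uniform background and pseudocount are arbitrary choices, and the abstract does not describe the authors' actual matrix or thresholds.

```python
import math


def build_pssm(sites, pseudocount=1.0, background=0.25):
    """Build a log2-odds matrix (one dict per position) from aligned sites."""
    total = len(sites) + 4 * pseudocount
    pssm = []
    for i in range(len(sites[0])):
        column = [site[i] for site in sites]
        pssm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pssm


def scan(seq, pssm):
    """Score every window of seq; assumes seq contains only A, C, G, T."""
    width = len(pssm)
    return [
        (start, sum(pssm[j][seq[start + j]] for j in range(width)))
        for start in range(len(seq) - width + 1)
    ]


def best_hit(seq, pssm):
    """Return the (start, score) pair with the highest score."""
    return max(scan(seq, pssm), key=lambda hit: hit[1])


# Toy aligned sites, invented for illustration (not real GRE sequences).
toy_pssm = build_pssm(["ACAT", "ACAT", "ACGT", "TCAT"])
print(best_hit("GGACATGG", toy_pssm)[0])  # 2
```

    In practice a score cutoff and a conservation filter would be applied on top of the raw window scores before calling a site a putative binding element.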

  20. Discovering Multidimensional Motifs in Physiological Signals for Personalized Healthcare.

    PubMed

    Balasubramanian, Arvind; Wang, Jun; Prabhakaran, Balakrishnan

    2016-08-01

    Personalized diagnosis and therapy require monitoring patient activity using various body sensors. Sensor data generated during personalized exercises or tasks may be too specific or inadequate to be evaluated using supervised methods such as classification. We propose multidimensional motif (MDM) discovery as a means for patient activity monitoring, since such motifs can capture repeating patterns across multiple dimensions of the data, and can serve as conformance indicators. Previous studies pertaining to mining MDMs have proposed approaches that lack the capability of concurrently processing multiple dimensions, thus limiting their utility in online scenarios. In this paper, we propose an efficient real-time approach to MDM discovery in body-sensor-generated time series data for monitoring performance of patients during therapy. We present two alternative models for MDMs based on motif co-occurrences and temporal ordering among motifs across multiple dimensions, with detailed formulation of the concepts proposed. The proposed method uses an efficient hashing-based record to enable speedy update and retrieval of motif sets, and identification of MDMs. Performance evaluation using synthetic and real body sensor data in unsupervised motif discovery tasks shows that the approach is effective for (a) concurrent processing of multidimensional time series information suitable for real-time applications, (b) finding unknown naturally occurring patterns with minimal delay, and

  1. Finding specific RNA motifs: Function in a zeptomole world?

    PubMed Central

    KNIGHT, ROB; YARUS, MICHAEL

    2003-01-01

    We have developed a new method for estimating the abundance of any modular (piecewise) RNA motif within a longer random region. We have used this method to estimate the size of the active motifs available to modern SELEX experiments (picomoles of unique sequences) and to a plausible RNA World (zeptomoles of unique sequences: 1 zmole = 602 sequences). Unexpectedly, activities such as specific isoleucine binding are almost certainly present in zeptomoles of molecules, and even ribozymes such as self-cleavage motifs may appear (depending on assumptions about the minimal structures). The number of specified nucleotides is not the only important determinant of a motif’s rarity: The number of modules into which it is divided, and the details of this division, are also crucial. We propose three maxims for easily isolated motifs: the Maxim of Minimization, the Maxim of Multiplicity, and the Maxim of the Median. These maxims together state that selected motifs should be small and composed of as many separate, equally sized modules as possible. For evenly divided motifs with four modules, the largest accessible activity in picomole scale (1–1000 pmole) pools of length 100 is about 34 nucleotides; while for zeptomole scale (1–1000 zmole) pools it is about 20 specific nucleotides (50% probability of occurrence). This latter figure includes some ribozymes and aptamers. Consequently, an RNA metabolism apparently could have begun with only zeptomoles of RNA molecules. PMID:12554865
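    The conversion "1 zmole = 602 sequences" quoted above follows from multiplying the amount in moles by the Avogadro constant. The short Python sketch below reproduces that arithmetic for the two pool scales mentioned in the abstract. The function and dictionary names are hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI value)

# SI prefix factors for the two pool scales discussed in the abstract.
PREFIX = {"pico": 1e-12, "zepto": 1e-21}


def molecules(amount, prefix):
    """Number of molecules in `amount` prefixed moles, e.g. 1 zeptomole."""
    return amount * PREFIX[prefix] * AVOGADRO


print(round(molecules(1, "zepto")))   # 602
print(f"{molecules(1, 'pico'):.3e}")  # 6.022e+11
```

    A picomole pool therefore holds roughly a billion times more molecules than a zeptomole pool, which is why the largest accessible motif shrinks from about 34 to about 20 specified nucleotides between the two scales.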

  2. PISMA: A Visual Representation of Motif Distribution in DNA Sequences

    PubMed Central

    Alcántara-Silva, Rogelio; Alvarado-Hermida, Moisés; Díaz-Contreras, Gibrán; Sánchez-Barrios, Martha; Carrera, Samantha; Galván, Silvia Carolina

    2017-01-01

    Background: Because the graphical presentation and analysis of motif distribution can provide insights for experimental hypotheses, PISMA aims at identifying motifs on DNA sequences, counting them, and showing them graphically. The motif length ranges from 2 to 10 bases, and the DNA sequences range up to 10 kb. The motif distribution is shown as a bar-code-like scheme, as a gene-map-like scheme, and as a transcript scheme. Results: We obtained graphical schemes of the CpG site distribution from 91 human papillomavirus genomes. Also, we present 2 analyses: one of DNA motifs associated with either methylation-resistant or methylation-sensitive CpG islands and another analysis of motifs associated with exosome RNA secretion. Availability and Implementation: PISMA is developed in Java; it is executable in any type of hardware and in diverse operating systems. PISMA is freely available to noncommercial users. The English version and the User Manual are provided in Supplementary Files 1 and 2, and a Spanish version is available at www.biomedicas.unam.mx/wp-content/software/pisma.zip and www.biomedicas.unam.mx/wp-content/pdf/manual/pisma.pdf. PMID:28469418
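    To make the idea of counting a short motif and drawing its distribution concrete, the Python sketch below locates every occurrence of a motif (for example the CpG dinucleotide, written CG) and renders a one-line text "bar code". It is an independent illustration, not PISMA's Java implementation. The function names are hypothetical, and the 2 to 10 base length check simply mirrors the range stated in the abstract.

```python
def motif_positions(seq, motif):
    """Return 0-based start positions of motif in seq, overlaps included."""
    if not 2 <= len(motif) <= 10:
        raise ValueError("motif length must be between 2 and 10 bases")
    seq, motif = seq.upper(), motif.upper()
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)  # step by 1 to keep overlaps
    return positions


def barcode(seq, motif):
    """Mark each motif start with '|' and every other position with '.'."""
    marks = ["."] * len(seq)
    for pos in motif_positions(seq, motif):
        marks[pos] = "|"
    return "".join(marks)


# Hypothetical sequence, invented for illustration only.
print(motif_positions("ACGTCGCGAA", "CG"))  # [1, 4, 6]
print(barcode("ACGTCGCGAA", "CG"))          # .|..|.|...
```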

  3. The distribution of RNA motifs in natural sequences.

    PubMed

    Bourdeau, V; Ferbeyre, G; Pageau, M; Paquin, B; Cedergren, R

    1999-11-15

    Functional analysis of genome sequences has largely ignored RNA genes and their structures. We introduce here the notion of 'ribonomics' to describe the search for the distribution of and eventually the determination of the physiological roles of these RNA structures found in the sequence databases. The utility of this approach is illustrated here by the identification in the GenBank database of RNA motifs having known binding or chemical activity. The frequency of these motifs indicates that most have originated from evolutionary drift and are selectively neutral. On the other hand, their distribution among species and their location within genes suggest that the destiny of these motifs may be more elaborate. For example, the hammerhead motif has a skewed organismal presence, is phylogenetically stable and recent work on a schistosome version confirms its in vivo biological activity. The under-representation of the valine-binding motif and the Rev-binding element in GenBank hints at a detrimental effect on cell growth or viability. Data on the presence and the location of these motifs may provide critical guidance in the design of experiments directed towards the understanding and the manipulation of RNA complexes and activities in vivo.

  4. Selection of peptide entry motifs by bacterial surface display.

    PubMed Central

    Taschner, Sabine; Meinke, Andreas; von Gabain, Alexander; Boyd, Aoife P

    2002-01-01

    Surface display technologies have been established previously to select peptides and polypeptides that interact with purified immobilized ligands. In the present study, we designed and implemented a surface display-based technique to identify novel peptide motifs that mediate entry into eukaryotic cells. An Escherichia coli library expressing surface-displayed peptides was combined with eukaryotic cells and the gentamicin protection assay was performed to select recombinant E. coli, which were internalized into eukaryotic cells by virtue of the displayed peptides. To establish the proof of principle of this approach, the fibronectin-binding motifs of the fibronectin-binding protein A of Staphylococcus aureus were inserted into the E. coli FhuA protein. Surface expression of the fusion proteins was demonstrated by functional assays and by FACS analysis. The fibronectin-binding motifs were shown to mediate entry of the bacteria into non-phagocytic eukaryotic cells and brought about the preferential selection of these bacteria over E. coli expressing parental FhuA, with an enrichment of 100000-fold. Four entry sequences were selected and identified using an S. aureus library of peptides displayed in the FhuA protein on the surface of E. coli. These sequences included novel entry motifs as well as integrin-binding Arg-Gly-Asp (RGD) motifs and promoted a high degree of bacterial entry. Bacterial surface display is thus a powerful tool to effectively select and identify entry peptide motifs. PMID:12144529

  5. cWINNOWER Algorithm for Finding Fuzzy DNA Motifs

    NASA Technical Reports Server (NTRS)

    Liang, Shoudan

    2003-01-01

    The cWINNOWER algorithm detects fuzzy motifs in DNA sequences rich in protein-binding signals. A signal is defined as any short nucleotide pattern having up to d mutations differing from a motif of length l. The algorithm finds such motifs if multiple mutated copies of the motif (i.e., the signals) are present in the DNA sequence in sufficient abundance. The cWINNOWER algorithm substantially improves the sensitivity of the winnower method of Pevzner and Sze by imposing a consensus constraint, enabling it to detect much weaker signals. We studied the minimum number of detectable motifs qc as a function of sequence length N for random sequences. We found that qc increases linearly with N for a fast version of the algorithm based on counting three-member sub-cliques. Imposing consensus constraints reduces qc, by a factor of three in this case, which makes the algorithm dramatically more sensitive. Our most sensitive algorithm, which counts four-member sub-cliques, needs a minimum of only 13 signals to detect motifs in a sequence of length N = 12000 for (l,d) = (15,4).
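    The definition of a signal used above, a pattern of length l differing from the motif at no more than d positions, can be checked with a Hamming-distance scan. The Python sketch below does only that, for a motif that is already known. It is not the cWINNOWER algorithm itself, which must discover an unknown motif and relies on clique counting and a consensus constraint. The function names and the example sequence are hypothetical.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_signals(seq, motif, d):
    """Return 0-based starts of windows within d mismatches of motif."""
    length = len(motif)
    return [
        start
        for start in range(len(seq) - length + 1)
        if hamming(seq[start:start + length], motif) <= d
    ]


# Hypothetical sequence and motif with (l, d) = (6, 1), for illustration only.
print(find_signals("TTACGTACGGACCTACTT", "ACGTAC", 1))  # [2, 6, 10]
```

    With larger d, more windows of a random sequence qualify by chance, which is one way to see why weak signals are hard to separate from background.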

  6. Transcriptional Network Growing Models Using Motif-Based Preferential Attachment

    PubMed Central

    Abdelzaher, Ahmed F.; Al-Musawi, Ahmad F.; Ghosh, Preetam; Mayo, Michael L.; Perkins, Edward J.

    2015-01-01

    Understanding relationships between architectural properties of gene-regulatory networks (GRNs) has been one of the major goals in systems biology and bioinformatics, as it can provide insights into, e.g., disease dynamics and drug development. Such GRNs are characterized by their scale-free degree distributions and existence of network motifs – i.e., small-node subgraphs that occur more abundantly in GRNs than expected from chance alone. Because these transcriptional modules represent “building blocks” of complex networks and exhibit a wide range of functional and dynamical properties, they may contribute to the remarkable robustness and dynamical stability associated with the whole of GRNs. Here, we developed network-construction models to better understand this relationship, which produce randomized GRNs by using transcriptional motifs as the fundamental growth unit in contrast to other methods that construct similar networks on a node-by-node basis. Because this model produces networks with a prescribed lower bound on the number of choice transcriptional motifs (e.g., downlinks, feed-forward loops), its fidelity to the motif distributions observed in model organisms represents an improvement over existing methods, which we validated by contrasting their resultant motif and degree distributions against existing network-growth models and data from the model organism of the bacterium Escherichia coli. These models may therefore serve as novel testbeds for further elucidating relationships between the topology of transcriptional motifs and network-wide dynamical properties. PMID:26528473
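    A feed-forward loop, one of the transcriptional motifs named above, is a three-node subgraph in which a regulator X targets Y and both X and Y target Z. The Python sketch below counts such subgraphs in a directed edge list. It only illustrates motif counting and is not the authors' network-growth model. The function name and the toy network are hypothetical.

```python
from collections import defaultdict


def count_feed_forward_loops(edges):
    """Count ordered triples (x, y, z) with edges x->y, y->z and x->z."""
    successors = defaultdict(set)
    for source, target in edges:
        if source != target:  # ignore self-regulation
            successors[source].add(target)

    count = 0
    for x in list(successors):
        for y in successors[x]:
            # .get avoids creating new keys while iterating
            for z in successors.get(y, ()):
                if z != x and z in successors[x]:
                    count += 1
    return count


# Hypothetical toy network: a->b, b->c, a->c form one loop; c->d adds none.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
print(count_feed_forward_loops(toy_edges))  # 1
```

    Deciding whether a motif is over-represented would additionally require comparing this count with counts from randomized networks, which the sketch does not attempt.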

  7. PISMA: A Visual Representation of Motif Distribution in DNA Sequences.

    PubMed

    Alcántara-Silva, Rogelio; Alvarado-Hermida, Moisés; Díaz-Contreras, Gibrán; Sánchez-Barrios, Martha; Carrera, Samantha; Galván, Silvia Carolina

    2017-01-01

    Because the graphical presentation and analysis of motif distribution can provide insights for experimental hypotheses, PISMA aims at identifying motifs on DNA sequences, counting them, and showing them graphically. The motif length ranges from 2 to 10 bases, and the DNA sequences range up to 10 kb. The motif distribution is shown as a bar-code-like scheme, as a gene-map-like scheme, and as a transcript scheme. We obtained graphical schemes of the CpG site distribution from 91 human papillomavirus genomes. Also, we present 2 analyses: one of DNA motifs associated with either methylation-resistant or methylation-sensitive CpG islands and another analysis of motifs associated with exosome RNA secretion. PISMA is developed in Java; it is executable in any type of hardware and in diverse operating systems. PISMA is freely available to noncommercial users. The English version and the User Manual are provided in Supplementary Files 1 and 2, and a Spanish version is available at www.biomedicas.unam.mx/wp-content/software/pisma.zip and www.biomedicas.unam.mx/wp-content/pdf/manual/pisma.pdf.

  8. Transcriptional Network Growing Models Using Motif-Based Preferential Attachment.

    PubMed

    Abdelzaher, Ahmed F; Al-Musawi, Ahmad F; Ghosh, Preetam; Mayo, Michael L; Perkins, Edward J

    2015-01-01

    Understanding relationships between architectural properties of gene-regulatory networks (GRNs) has been one of the major goals in systems biology and bioinformatics, as it can provide insights into, e.g., disease dynamics and drug development. Such GRNs are characterized by their scale-free degree distributions and existence of network motifs - i.e., small-node subgraphs that occur more abundantly in GRNs than expected from chance alone. Because these transcriptional modules represent "building blocks" of complex networks and exhibit a wide range of functional and dynamical properties, they may contribute to the remarkable robustness and dynamical stability associated with the whole of GRNs. Here, we developed network-construction models to better understand this relationship, which produce randomized GRNs by using transcriptional motifs as the fundamental growth unit in contrast to other methods that construct similar networks on a node-by-node basis. Because this model produces networks with a prescribed lower bound on the number of choice transcriptional motifs (e.g., downlinks, feed-forward loops), its fidelity to the motif distributions observed in model organisms represents an improvement over existing methods, which we validated by contrasting their resultant motif and degree distributions against existing network-growth models and data from the model organism of the bacterium Escherichia coli. These models may therefore serve as novel testbeds for further elucidating relationships between the topology of transcriptional motifs and network-wide dynamical properties.
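
    The idea of growing a network motif by motif, rather than node by node, can be sketched as follows. This is an illustrative toy and not the authors' model: the attachment rule (a degree-proportional choice of one existing node that becomes the source of a new feed-forward loop) and the names `grow_ffl_network` and `count_ffls` are assumptions made for the example.

```python
import random
from itertools import permutations


def grow_ffl_network(n_motifs: int, seed: int = 0) -> set[tuple[int, int]]:
    """Grow a directed graph by attaching one feed-forward loop (FFL) per step."""
    rng = random.Random(seed)
    edges = {(0, 1), (0, 2), (1, 2)}   # seed motif: 0->1, 0->2, 1->2
    endpoints = [0, 1, 0, 2, 1, 2]     # each node listed once per incident edge
    next_node = 3
    for _ in range(n_motifs - 1):
        # Picking from the endpoint list selects a node with probability
        # proportional to its degree (preferential attachment).
        hub = rng.choice(endpoints)
        a, b = next_node, next_node + 1
        next_node += 2
        new_edges = [(hub, a), (hub, b), (a, b)]  # a new FFL rooted at `hub`
        edges.update(new_edges)
        for u, v in new_edges:
            endpoints += [u, v]
    return edges


def count_ffls(edges: set[tuple[int, int]]) -> int:
    """Count ordered triples (x, y, z) with edges x->y, x->z and y->z."""
    nodes = {n for e in edges for n in e}
    return sum(
        1
        for x, y, z in permutations(nodes, 3)
        if (x, y) in edges and (x, z) in edges and (y, z) in edges
    )


if __name__ == "__main__":
    net = grow_ffl_network(10)
    print(len(net), "edges;", count_ffls(net), "feed-forward loops")
```

    Because every growth step adds a complete feed-forward loop, the number of such loops is bounded below by the number of steps, which mirrors the prescribed lower bound mentioned in the abstract.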

  9. Motif difficulty (MD): a predictive measure of problem difficulty for evolutionary algorithms using network motifs.

    PubMed

    Liu, Jing; Abbass, Hussein A; Green, David G; Zhong, Weicai

    2012-01-01

    One of the major challenges in the field of evolutionary algorithms (EAs) is to characterise which kinds of problems are easy and which are not. Researchers have been attracted to predict the behaviour of EAs in different domains. We introduce fitness landscape networks (FLNs) that are formed using operators satisfying specific conditions and define a new predictive measure that we call motif difficulty (MD) for comparison-based EAs. Because it is impractical to exhaustively search the whole network, we propose a sampling technique for calculating an approximate MD measure. Extensive experiments on binary search spaces are conducted to show both the advantages and limitations of MD. Multidimensional knapsack problems (MKPs) are also used to validate the performance of approximate MD on FLNs with different topologies. The effect of two representations, namely binary and permutation, on the difficulty of MKPs is analysed.

  10. A comprehensive analysis of the La-motif protein superfamily

    PubMed Central

    Bousquet-Antonelli, Cécile; Deragon, Jean-Marc

    2009-01-01

    The extremely well-conserved La motif (LAM), in synergy with the immediately following RNA recognition motif (RRM), allows direct binding of the (genuine) La autoantigen to RNA polymerase III primary transcripts. This motif is not only found on La homologs, but also on La-related proteins (LARPs) of unrelated function. LARPs are widely found amongst eukaryotes and, although poorly characterized, appear to be RNA-binding proteins fulfilling crucial cellular functions. We searched the fully sequenced genomes of 83 eukaryotic species scattered along the tree of life for the presence of LAM-containing proteins. We observed that these proteins are absent from archaea and present in all eukaryotes (except protists from the Plasmodium genus), strongly suggesting that the LAM is an ancestral motif that emerged early after the archaea-eukarya radiation. A complete evolutionary and structural analysis of these proteins resulted in their classification into five families: the genuine La homologs and four LARP families. Unexpectedly, in each family a conserved domain representing either a classical RRM or an RRM-like motif immediately follows the LAM of most proteins. An evolutionary analysis of the LAM-RRM/RRM-L regions shows that these motifs co-evolved and should be used as a single entity to define the functional region of interaction of LARPs with their substrates. We also found two extremely well conserved motifs, named LSA and DM15, shared by LARP6 and LARP1 family members, respectively. We suggest that members of the same family are functional homologs and/or share a common molecular mode of action on different RNA baits. PMID:19299548

  11. A comprehensive analysis of the La-motif protein superfamily.

    PubMed

    Bousquet-Antonelli, Cécile; Deragon, Jean-Marc

    2009-05-01

    The extremely well-conserved La motif (LAM), in synergy with the immediately following RNA recognition motif (RRM), allows direct binding of the (genuine) La autoantigen to RNA polymerase III primary transcripts. This motif is not only found on La homologs, but also on La-related proteins (LARPs) of unrelated function. LARPs are widely found amongst eukaryotes and, although poorly characterized, appear to be RNA-binding proteins fulfilling crucial cellular functions. We searched the fully sequenced genomes of 83 eukaryotic species scattered along the tree of life for the presence of LAM-containing proteins. We observed that these proteins are absent from archaea and present in all eukaryotes (except protists from the Plasmodium genus), strongly suggesting that the LAM is an ancestral motif that emerged early after the archaea-eukarya radiation. A complete evolutionary and structural analysis of these proteins resulted in their classification into five families: the genuine La homologs and four LARP families. Unexpectedly, in each family a conserved domain representing either a classical RRM or an RRM-like motif immediately follows the LAM of most proteins. An evolutionary analysis of the LAM-RRM/RRM-L regions shows that these motifs co-evolved and should be used as a single entity to define the functional region of interaction of LARPs with their substrates. We also found two extremely well conserved motifs, named LSA and DM15, shared by LARP6 and LARP1 family members, respectively. We suggest that members of the same family are functional homologs and/or share a common molecular mode of action on different RNA baits.

  12. Mechanisms of Zero-Lag Synchronization in Cortical Motifs

    PubMed Central

    Gollo, Leonardo L.; Mirasso, Claudio; Sporns, Olaf; Breakspear, Michael

    2014-01-01

    Zero-lag synchronization between distant cortical areas has been observed in a diversity of experimental data sets and between many different regions of the brain. Several computational mechanisms have been proposed to account for such isochronous synchronization in the presence of long conduction delays: Of these, the phenomenon of “dynamical relaying” – a mechanism that relies on a specific network motif – has proven to be the most robust with respect to parameter mismatch and system noise. Surprisingly, despite a contrary belief in the community, the common driving motif is an unreliable means of establishing zero-lag synchrony. Although dynamical relaying has been validated in empirical and computational studies, the deeper dynamical mechanisms and comparison to dynamics on other motifs is lacking. By systematically comparing synchronization on a variety of small motifs, we establish that the presence of a single reciprocally connected pair – a “resonance pair” – plays a crucial role in disambiguating those motifs that foster zero-lag synchrony in the presence of conduction delays (such as dynamical relaying) from those that do not (such as the common driving triad). Remarkably, minor structural changes to the common driving motif that incorporate a reciprocal pair recover robust zero-lag synchrony. The findings are observed in computational models of spiking neurons, populations of spiking neurons and neural mass models, and arise whether the oscillatory systems are periodic, chaotic, noise-free or driven by stochastic inputs. The influence of the resonance pair is also robust to parameter mismatch and asymmetrical time delays amongst the elements of the motif. We call this manner of facilitating zero-lag synchrony resonance-induced synchronization, outline the conditions for its occurrence, and propose that it may be a general mechanism to promote zero-lag synchrony in the brain. PMID:24763382

  13. NetMODE: Network Motif Detection without Nauty

    PubMed Central

    Wang, Haidong; Deng, Hualiang; Liu, Xiaoguang; Wang, Gang

    2012-01-01

    A motif in a network is a connected graph that occurs significantly more frequently as an induced subgraph than would be expected in a similar randomized network. By virtue of being atypical, it is thought that motifs might play a more important role than arbitrary subgraphs. Recently, a flurry of advances in the study of network motifs has created demand for faster computational means for identifying motifs in increasingly larger networks. Motif detection is typically performed by enumerating subgraphs in an input network and in an ensemble of comparison networks; this poses a significant computational problem. Classifying the subgraphs encountered, for instance, is typically performed using a graph canonical labeling package, such as Nauty, and will typically be called billions of times. In this article, we describe an implementation of a network motif detection package, which we call NetMODE. NetMODE can only perform motif detection for -node subgraphs when , but does so without the use of Nauty. To avoid using Nauty, NetMODE has an initial pretreatment phase, where -node graph data is stored in memory (). For we take a novel approach, which relates to the Reconstruction Conjecture for directed graphs. We find that NetMODE can perform up to around times faster than its predecessors when and up to around times faster when (the exact improvement varies considerably). NetMODE also (a) includes a method for generating comparison graphs uniformly at random, (b) can interface with external packages (e.g. R), and (c) can utilize multi-core architectures. NetMODE is available from netmode.sf.net. PMID:23272055
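
    The subgraph enumeration and classification step can be made concrete with a small sketch. It does not reproduce NetMODE's method: subgraph classes are identified here by a brute-force canonical form (the smallest adjacency bit-tuple over all node orderings), which is only practical for very small subgraphs, and the function names are hypothetical. The comparison against an ensemble of randomized networks is omitted.

```python
from collections import Counter
from itertools import combinations, permutations


def canonical_form(nodes, edges):
    """Smallest adjacency bit-tuple over all orderings of `nodes` (brute force)."""
    best = None
    for order in permutations(nodes):
        bits = tuple(int((u, v) in edges) for u in order for v in order if u != v)
        if best is None or bits < best:
            best = bits
    return best


def is_weakly_connected(nodes, edges):
    """True if `nodes` form one component when edge directions are ignored."""
    nodes = list(nodes)
    seen, stack = {nodes[0]}, [nodes[0]]
    while stack:
        u = stack.pop()
        for v in nodes:
            if v not in seen and ((u, v) in edges or (v, u) in edges):
                seen.add(v)
                stack.append(v)
    return len(seen) == len(nodes)


def subgraph_census(edges, k=3):
    """Count connected induced k-node subgraphs, grouped by isomorphism class."""
    all_nodes = sorted({n for e in edges for n in e})
    census = Counter()
    for nodes in combinations(all_nodes, k):
        if is_weakly_connected(nodes, edges):
            census[canonical_form(nodes, edges)] += 1
    return census


if __name__ == "__main__":
    # Two feed-forward loops joined by one edge (made-up example graph).
    g = {(1, 2), (1, 3), (2, 3), (4, 5), (4, 6), (5, 6), (3, 4)}
    for form, count in subgraph_census(g).items():
        print(form, count)
```

    Two subgraphs receive the same canonical form exactly when they are isomorphic, so the census keys play the role that canonical labels from a package such as Nauty would play in a conventional pipeline.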

  14. Phyloproteomic Analysis of 11780 Six-Residue-Long Motifs Occurrences

    PubMed Central

    Lobanov, M. Yu.

    2015-01-01

    How is it possible to find good traits for phylogenetic reconstructions? Here, we present a new phyloproteomic criterion: the occurrence of simple motifs that can be imprints of evolutionary history. We studied the occurrences of 11780 six-residue-long motifs consisting of two randomly located amino acids in 97 eukaryotic and 25 bacterial proteomes. For all eukaryotic proteomes, with the exception of the Amoebozoa, Stramenopiles, and Diplomonadida kingdoms, the proteins containing motifs from the first group (one of the two amino acids occurs once at the terminal position) made up about 20%; for motifs from the second group (one of the two amino acids occurs once within the pattern) and the third group (the two amino acids occur randomly), the shares were 30% and 50%, respectively. For bacterial proteomes, the corresponding shares were 10%, 27%, and 63%. Matrices of correlation coefficients were calculated between the numbers of proteins in which a motif from the set of 11780 motifs appears at least once in 9 kingdoms and 5 phyla of bacteria. Among the correlation coefficients for eukaryotic proteomes, the correlation between the animal and fungi kingdoms (0.62) is higher than that between fungi and plants (0.54). Our study supports the view that animals and fungi are sibling kingdoms. Comparing the frequencies of six-residue-long motifs in different proteomes makes it possible to infer phylogenetic relationships from the similarities between these frequencies: the Diplomonadida kingdom is closer to Bacteria than to Eukaryota; Stramenopiles and Amoebozoa are closer to each other than to other kingdoms of Eukaryota. PMID:26114101

  15. An Integrated Procedure for the Structural Design of a Composite Rotor-Hydrofoil of a Water Current Turbine (WCT)

    NASA Astrophysics Data System (ADS)

    Oller Aramayo, S. A.; Nallim, L. G.; Oller, S.

    2013-12-01

    This paper presents an integrated structural design optimization of a composite rotor-hydrofoil of a water current turbine by means of the finite element method (FEM), using a Serial/Parallel mixing theory (Rastellini et al. Comput. Struct. 86:879-896, 2008, Martinez et al., 2007, Martinez and Oller Arch. Comput. Methods. 16(4):357-397, 2009, Martinez et al. Compos. Part B Eng. 42(2011):134-144, 2010) coupled with a fluid-dynamic formulation and multi-objective optimization algorithm (Gen and Cheng 1997, Lee et al. Compos. Struct. 99:181-192, 2013, Lee et al. Compos. Struct. 94(3):1087-1096, 2012). The composite hydrofoil of the turbine rotor has been designed using reinforced laminate composites, optimizing the carbon fiber orientation to obtain the maximum strength and lower rotational inertia. These results have also been compared with a steel hydrofoil, highlighting the different performance of the two structures. The mechanical and geometrical parameters involved in the design of this fiber-reinforced composite material are the fiber orientation, number of layers, stacking sequence and laminate thickness. Water pressure in the rotor of the turbine is obtained from a coupled fluid-dynamic simulation (CFD), details of which can be found in Oller et al. (2012). The main purpose of this paper is to achieve a very low inertia rotor that minimizes the start-stop effect, because it is applied in an axial water flow turbine currently being designed by the authors, in which it is important to take maximum advantage of the kinetic energy. The FEM simulation codes are engineered by CIMNE (International Center for Numerical Method in Engineering, Barcelona, Spain): COMPack for the solids problem application, KRATOS for the fluid dynamic application and RMOP for the structural optimization. To validate the procedure presented here, many turbine rotors made of composite materials are analyzed and three of them are compared with the steel one.

  16. Finding sequence motifs in prokaryotic genomes--a brief practical guide for a microbiologist.

    PubMed

    Mrázek, Jan

    2009-09-01

    Finding significant nucleotide sequence motifs in prokaryotic genomes can be divided into three types of tasks: (1) supervised motif finding, where a sample of motif sequences is used to find other similar sequences in genomes; (2) unsupervised motif finding, which typically relates to the task of finding regulatory motifs and protein binding sites and (3) exploratory motif finding, which aims to identify potential functionally significant sequence motifs as those that are unusual in some statistical sense. This article provides a conceptual overview for each type of task, a brief description of basic algorithms used in their solution, and a review of selected relevant software available online.

  17. Interconnected Network Motifs Control Podocyte Morphology and Kidney Function

    PubMed Central

    Azeloglu, Evren U.; Hardy, Simon V.; Eungdamrong, Narat John; Chen, Yibang; Jayaraman, Gomathi; Chuang, Peter Y.; Fang, Wei; Xiong, Huabao; Neves, Susana R.; Jain, Mohit R.; Li, Hong; Ma’ayan, Avi; Gordon, Ronald E.; He, John Cijiang; Iyengar, Ravi

    2014-01-01

    Podocytes are kidney cells with specialized morphology that is required for glomerular filtration. Diseases, such as diabetes, or drug exposure that causes disruption of the podocyte foot process morphology results in kidney pathophysiology. Proteomic analysis of glomeruli isolated from rats with puromycin-induced kidney disease and control rats indicated that protein kinase A (PKA), which is activated by adenosine 3′,5′-monophosphate (cAMP), is a key regulator of podocyte morphology and function. In podocytes, cAMP signaling activates cAMP response element–binding protein (CREB) to enhance expression of the gene encoding a differentiation marker, synaptopodin, a protein that associates with actin and promotes its bundling. We constructed and experimentally verified a β-adrenergic receptor–driven network with multiple feedback and feedforward motifs that controls CREB activity. To determine how the motifs interacted to regulate gene expression, we mapped multicompartment dynamical models, including information about protein subcellular localization, onto the network topology using Petri net formalisms. These computational analyses indicated that the juxtaposition of multiple feedback and feedforward motifs enabled the prolonged CREB activation necessary for synaptopodin expression and actin bundling. Drug-induced modulation of these motifs in diseased rats led to recovery of normal morphology and physiological function in vivo. Thus, analysis of regulatory motifs using network dynamics can provide insights into pathophysiology that enable predictions for drug intervention strategies to treat kidney disease. PMID:24497609

  18. cWINNOWER algorithm for finding fuzzy DNA motifs

    NASA Technical Reports Server (NTRS)

    Liang, S.; Samanta, M. P.; Biegel, B. A.

    2004-01-01

    The cWINNOWER algorithm detects fuzzy motifs in DNA sequences rich in protein-binding signals. A signal is defined as any short nucleotide pattern having up to d mutations differing from a motif of length l. The algorithm finds such motifs if a clique consisting of a sufficiently large number of mutated copies of the motif (i.e., the signals) is present in the DNA sequence. The cWINNOWER algorithm substantially improves the sensitivity of the winnower method of Pevzner and Sze by imposing a consensus constraint, enabling it to detect much weaker signals. We studied the minimum detectable clique size qc as a function of sequence length N for random sequences. We found that qc increases linearly with N for a fast version of the algorithm based on counting three-member sub-cliques. Imposing consensus constraints reduces qc by a factor of three in this case, which makes the algorithm dramatically more sensitive. Our most sensitive algorithm, which counts four-member sub-cliques, needs a minimum of only 13 signals to detect motifs in a sequence of length N = 12,000 for (l, d) = (15, 4). Copyright Imperial College Press.
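
    The definition of a signal given in this abstract (a pattern of length l that differs from the motif in at most d positions) can be written down directly. The sketch below only illustrates that definition and the pairwise test that follows from it; it is not the cWINNOWER algorithm, it assumes the motif is already known, and the function names and example sequence are hypothetical.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def find_signals(sequence: str, motif: str, d: int) -> list[tuple[int, str]]:
    """All windows of len(motif) that differ from `motif` in at most d positions."""
    l = len(motif)
    return [
        (i, sequence[i:i + l])
        for i in range(len(sequence) - l + 1)
        if hamming(sequence[i:i + l], motif) <= d
    ]


def may_share_motif(a: str, b: str, d: int) -> bool:
    """Two signals of the same motif can differ in at most 2*d positions
    (triangle inequality), so this is a necessary condition, not a proof."""
    return hamming(a, b) <= 2 * d


if __name__ == "__main__":
    seq = "GGACGTAGGACCTAGG"                 # made-up sequence with two signals
    print(find_signals(seq, "ACGTA", d=1))   # [(2, 'ACGTA'), (9, 'ACCTA')]
```

    When the motif is unknown, only pairwise tests such as `may_share_motif` are available, which is why clique-based methods search for groups of windows that are all mutually compatible.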

  19. Binding properties of SUMO-interacting motifs (SIMs) in yeast.

    PubMed

    Jardin, Christophe; Horn, Anselm H C; Sticht, Heinrich

    2015-03-01

    Small ubiquitin-like modifier (SUMO) conjugation and interaction play an essential role in many cellular processes. A large number of yeast proteins is known to interact non-covalently with SUMO via short SUMO-interacting motifs (SIMs), but the structural details of this interaction are still poorly characterized. In the present work, sequence analysis of a large dataset of 148 yeast SIMs revealed the existence of a hydrophobic core binding motif and a preference for acidic residues either within or adjacent to the core motif. Thus the sequence properties of yeast SIMs are highly similar to those described for human. Molecular dynamics simulations were performed to investigate the binding preferences for four representative SIM peptides differing in the number and distribution of acidic residues. Furthermore, the relative stability of two previously observed alternative binding orientations (parallel, antiparallel) was assessed. For all SIMs investigated, the antiparallel binding mode remained stable in the simulations and the SIMs were tightly bound via their hydrophobic core residues supplemented by polar interactions of the acidic residues. In contrast, the stability of the parallel binding mode is more dependent on the sequence features of the SIM, such as the number and position of acidic residues or the presence of additional adjacent interaction motifs. This information should be helpful to enhance the prediction of SIMs and their binding properties in different organisms to facilitate the reconstruction of the SUMO interactome.

  20. IQ-motif peptides as novel anti-microbial agents.

    PubMed

    McLean, Denise T F; Lundy, Fionnuala T; Timson, David J

    2013-04-01

    The IQ-motif is an amphipathic, often positively charged, α-helical, calmodulin binding sequence found in a number of eukaryote signalling, transport and cytoskeletal proteins. IQ-motifs share common biophysical characteristics with established, cationic α-helical antimicrobial peptides, such as the human cathelicidin LL-37. Therefore, we tested eight peptides encoding the sequences of IQ-motifs derived from the human cytoskeletal scaffolding proteins IQGAP2 and IQGAP3. Some of these peptides were able to inhibit the growth of Escherichia coli and Staphylococcus aureus with minimal inhibitory concentrations (MIC) comparable to LL-37. In addition, some IQ-motifs had activity against the fungus Candida albicans. This antimicrobial activity is combined with low haemolytic activity (comparable to, or lower than, that of LL-37). Those IQ-motifs with anti-microbial activity tended to be able to bind to lipopolysaccharide. Some of these were also able to permeabilise the cell membranes of both Gram positive and Gram negative bacteria. These results demonstrate that IQ-motifs are viable lead sequences for the identification and optimisation of novel anti-microbial peptides. Thus, further investigation of the anti-microbial properties of this diverse group of sequences is merited.

  1. cWINNOWER algorithm for finding fuzzy DNA motifs

    NASA Technical Reports Server (NTRS)

    Liang, S.; Samanta, M. P.; Biegel, B. A.

    2004-01-01

    The cWINNOWER algorithm detects fuzzy motifs in DNA sequences rich in protein-binding signals. A signal is defined as any short nucleotide pattern having up to d mutations differing from a motif of length l. The algorithm finds such motifs if a clique consisting of a sufficiently large number of mutated copies of the motif (i.e., the signals) is present in the DNA sequence. The cWINNOWER algorithm substantially improves the sensitivity of the winnower method of Pevzner and Sze by imposing a consensus constraint, enabling it to detect much weaker signals. We studied the minimum detectable clique size qc as a function of sequence length N for random sequences. We found that qc increases linearly with N for a fast version of the algorithm based on counting three-member sub-cliques. Imposing consensus constraints reduces qc by a factor of three in this case, which makes the algorithm dramatically more sensitive. Our most sensitive algorithm, which counts four-member sub-cliques, needs a minimum of only 13 signals to detect motifs in a sequence of length N = 12,000 for (l, d) = (15, 4). Copyright Imperial College Press.

  2. Fitting a mixture model by expectation maximization to discover motifs in biopolymers

    SciTech Connect

    Bailey, T.L.; Elkan, C.

    1994-12-31

    The algorithm described in this paper discovers one or more motifs in a collection of DNA or protein sequences by using the technique of expectation maximization to fit a two-component finite mixture model to the set of sequences. Multiple motifs are found by fitting a mixture model to the data, probabilistically erasing the occurrences of the motif thus found, and repeating the process to find successive motifs. The algorithm requires only a set of unaligned sequences and a number specifying the width of the motifs as input. It returns a model of each motif and a threshold which together can be used as a Bayes-optimal classifier for searching for occurrences of the motif in other databases. The algorithm estimates how many times each motif occurs in each sequence in the dataset and outputs an alignment of the occurrences of the motif. The algorithm is capable of discovering several different motifs with differing numbers of occurrences in a single dataset.
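
    The core idea of fitting a two-component mixture by expectation maximization can be sketched on DNA as follows. This is a simplified illustration, not the published algorithm: every window of the given width is treated as an independent sample, the starting point is a user-supplied seed word, the erasing of found motifs and the classification threshold are omitted, and the function name, pseudocount and initial values are assumptions made for the example.

```python
import math

ALPHABET = "ACGT"


def em_motif(sequences, width, seed, n_iter=50, pseudo=0.1):
    """Fit a motif/background mixture to all windows of length `width`."""
    wmers = [s[i:i + width] for s in sequences for i in range(len(s) - width + 1)]
    text = "".join(sequences)
    # Background starts at the overall letter frequencies (smoothed).
    bg = {c: (text.count(c) + pseudo) / (len(text) + 4 * pseudo) for c in ALPHABET}
    # Motif starts as a soft copy of the seed word (assumed starting point).
    theta = [{c: (0.7 if c == seed[j] else 0.1) for c in ALPHABET}
             for j in range(width)]
    lam = 0.1  # assumed initial fraction of windows that are motif occurrences
    z = []
    for _ in range(n_iter):
        # E-step: posterior probability that each window came from the motif.
        z = []
        for x in wmers:
            p_motif = lam * math.prod(theta[j][c] for j, c in enumerate(x))
            p_bg = (1 - lam) * math.prod(bg[c] for c in x)
            z.append(p_motif / (p_motif + p_bg))
        # M-step: re-estimate mixing weight, motif columns and background.
        lam = sum(z) / len(z)
        for j in range(width):
            counts = {c: pseudo for c in ALPHABET}
            for x, zi in zip(wmers, z):
                counts[x[j]] += zi
            total = sum(counts.values())
            theta[j] = {c: counts[c] / total for c in ALPHABET}
        counts = {c: pseudo for c in ALPHABET}
        for x, zi in zip(wmers, z):
            for c in x:
                counts[c] += 1 - zi
        total = sum(counts.values())
        bg = {c: counts[c] / total for c in ALPHABET}
    return theta, bg, lam, z


def consensus(theta):
    """Most probable letter in each motif column."""
    return "".join(max(col, key=col.get) for col in theta)


if __name__ == "__main__":
    # Made-up sequences with one planted word each.
    seqs = ["GGGGTATAATGGGG", "CCCCTATAATCCCC", "GCGCTATAATGCGC", "CGCGTATAATCGCG"]
    theta, bg, lam, z = em_motif(seqs, width=6, seed="TATAAT")
    print(consensus(theta), round(lam, 3))
```

    The returned per-window posteriors `z` are what a fuller implementation would use both to report occurrences and to down-weight ("erase") them before searching for the next motif.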

  3. Novel Structural and Functional Motifs in cellulose synthase (CesA) Genes of Bread Wheat (Triticum aestivum, L.)

    PubMed Central

    Kaur, Simerjeet; Dhugga, Kanwarpal S.; Gill, Kulvinder; Singh, Jaswinder

    2016-01-01

    Cellulose is the primary determinant of mechanical strength in plant tissues. Late-season lodging is inversely related to the amount of cellulose in a unit length of the stem. Wheat is the most widely grown of all the crops globally, yet information on its CesA gene family is limited. We have identified 22 CesA genes from bread wheat, which include homoeologs from each of the three genomes, and named them as TaCesAXA, TaCesAXB or TaCesAXD, where X denotes the gene number and the last suffix stands for the respective genome. Sequence analyses of the CESA proteins from wheat and their orthologs from barley, maize, rice, and several dicot species (Arabidopsis, beet, cotton, poplar, potato, rose gum and soybean) revealed motifs unique to monocots (Poales) or dicots. Novel structural motifs CQIC and SVICEXWFA were identified, which distinguished the CESAs involved in the formation of primary and secondary cell wall (PCW and SCW) in all the species. We also identified several new motifs specific to monocots or dicots. The conserved motifs identified in this study possibly play functional roles specific to PCW or SCW formation. The new insights from this study advance our knowledge about the structure, function and evolution of the CesA family in plants in general and wheat in particular. This information will be useful in improving culm strength to reduce lodging or alter wall composition to improve biofuel production. PMID:26771740

  4. Novel Structural and Functional Motifs in cellulose synthase (CesA) Genes of Bread Wheat (Triticum aestivum, L.).

    PubMed

    Kaur, Simerjeet; Dhugga, Kanwarpal S; Gill, Kulvinder; Singh, Jaswinder

    2016-01-01

    Cellulose is the primary determinant of mechanical strength in plant tissues. Late-season lodging is inversely related to the amount of cellulose in a unit length of the stem. Wheat is the most widely grown of all the crops globally, yet information on its CesA gene family is limited. We have identified 22 CesA genes from bread wheat, which include homoeologs from each of the three genomes, and named them as TaCesAXA, TaCesAXB or TaCesAXD, where X denotes the gene number and the last suffix stands for the respective genome. Sequence analyses of the CESA proteins from wheat and their orthologs from barley, maize, rice, and several dicot species (Arabidopsis, beet, cotton, poplar, potato, rose gum and soybean) revealed motifs unique to monocots (Poales) or dicots. Novel structural motifs CQIC and SVICEXWFA were identified, which distinguished the CESAs involved in the formation of primary and secondary cell wall (PCW and SCW) in all the species. We also identified several new motifs specific to monocots or dicots. The conserved motifs identified in this study possibly play functional roles specific to PCW or SCW formation. The new insights from this study advance our knowledge about the structure, function and evolution of the CesA family in plants in general and wheat in particular. This information will be useful in improving culm strength to reduce lodging or alter wall composition to improve biofuel production.

  5. Metal-Free Motifs for Solar Fuel Applications

    NASA Astrophysics Data System (ADS)

    Ilic, Stefan; Zoric, Marija R.; Kadel, Usha Pandey; Huang, Yunjing; Glusac, Ksenija D.

    2017-05-01

    Metal-free motifs, such as graphitic carbon nitride, conjugated polymers, and doped nanostructures, are emerging as a new class of Earth-abundant materials for solar fuel devices. Although these metal-free structures show great potential, detailed mechanistic understanding of their performance remains limited. Here, we review important experimental and theoretical findings relevant to the role of metal-free motifs as either photoelectrodes or electrocatalysts. First, the light-harvesting characteristics of metal-free photoelectrodes (band energetics, exciton binding energies, charge carrier mobilities and lifetimes) are discussed and contrasted with those in traditional inorganic semiconductors (such as Si). Second, the mechanistic insights into the electrocatalytic oxygen reduction and evolution reactions, hydrogen evolution reaction, and carbon dioxide reduction reaction by metal-free motifs are summarized, including experimental surface-sensitive spectroscopy findings, studies on small molecular models, and computational modeling of these chemical transformations.

  6. [Specific motifs in the genomes of the family Chlamydiaceae].

    PubMed

    Demkin, V V; Kirillova, N V

    2012-01-01

    Specific motifs in the genomes of the family Chlamydiaceae were discussed. The search for genetic markers for the identification and typing of bacteria is an urgent problem. Progress in sequencing technology has resulted in the compilation of a database of genomic nucleotide sequences of bacteria. This raised the problem of searching for and selecting genetic targets for identification and typing in bacterial genes based on comparative analysis of complete genomic sequences. The goal of this work was to carry out a comparative genetic analysis of different species of the family Chlamydiaceae. This analysis focused on the detection of specific motifs capable of serving as genetic markers of this family. The consensus domains were detected using Visual Basic for Applications software for MS Excel. Complete coincidence of segments 25 nucleotides long was used as the criterion for consensus domain selection. One complete genomic sequence for each of 8 bacterial species was taken for the experiment. The experimental sample did not contain the complete sequence of C. suis, because at the time of this research this species was absent from the GenBank database. Comparative analysis of the sequences of C. trachomatis and other representatives of the family Chlamydiaceae revealed 41 motifs common to the 8 Chlamydiaceae species tested in this work. The maximal number of consensus motifs was observed in genes of ribosomal RNA and tRNA. In addition to the rRNA and tRNA genes, consensus motifs were observed in 5 genes and 6 intergenic segments. The consensus motifs detected in this work in the genes CTL0299, CTL0800, dagA, and hctA can be regarded as identification domains of the family Chlamydiaceae.
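
    The selection criterion described here, complete coincidence of 25-nucleotide segments across all genomes, amounts to intersecting the sets of 25-mers of each genome. The sketch below shows that idea in Python rather than in the Visual Basic for Applications tooling named in the abstract; the function names are hypothetical, only one strand is examined, and the toy sequences use a shorter segment length so the result is easy to check by eye.

```python
def kmers(sequence: str, k: int) -> set[str]:
    """All distinct substrings of length k in `sequence`."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def shared_kmers(genomes: dict[str, str], k: int = 25) -> set[str]:
    """Segments of length k that occur, exactly, in every genome."""
    sets = [kmers(seq.upper(), k) for seq in genomes.values()]
    return set.intersection(*sets) if sets else set()


if __name__ == "__main__":
    toy = {                        # made-up sequences, not real genomes
        "species_a": "AAACGTACGTTT",
        "species_b": "GGACGTACGGG",
        "species_c": "CACGTACGA",
    }
    print(shared_kmers(toy, k=7))  # {'ACGTACG'}
```

    For genome-scale input the sets can become large; starting the intersection from the smallest genome is one straightforward way to keep memory use down.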

  7. Specific RNA self-assembly with minimal paranemic motifs.

    PubMed

    Afonin, Kirill A; Cieply, Dennis J; Leontis, Neocles B

    2008-01-09

    The paranemic crossover (PX) is a motif for assembling two nucleic acid molecules using Watson-Crick (WC) basepairing without unfolding preformed secondary structure in the individual molecules. Once formed, the paranemic assembly motif comprises adjacent parallel double helices that crossover at every possible point over the length of the motif. The interaction is reversible as it does not require denaturation of basepairs internal to each interacting molecular unit. Paranemic assembly has been demonstrated for DNA but not for RNA and only for motifs with four or more crossover points and lengths of five or more helical half-turns. Here we report the design of RNA molecules that paranemically assemble with the minimum number of two crossovers spanning the major groove to form paranemic motifs with a length of three half turns (3HT). Dissociation constants (Kd's) were measured for a series of molecules in which the number of basepairs between the crossover points was varied from five to eight basepairs. The paranemic 3HT complex with six basepairs (3HT_6M) was found to be the most stable with Kd = 1 × 10⁻⁸ M. The half-time for kinetic exchange of the 3HT_6M complex was determined to be approximately 100 min, from which we calculated association and dissociation rate constants ka = 5.11 × 10³ M⁻¹ s⁻¹ and kd = 5.11 × 10⁻⁵ s⁻¹. RNA paranemic assembly of 3HT and 5HT complexes is blocked by single-base substitutions that disrupt individual intermolecular Watson-Crick basepairs and is restored by compensatory substitutions that restore those basepairs. The 3HT motif appears suitable for specific, programmable, and reversible tecto-RNA self-assembly for constructing artificial RNA molecular machines.
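
    The reported constants can be cross-checked against each other with the standard relation for a simple two-state binding equilibrium, in which the equilibrium dissociation constant equals the dissociation rate constant divided by the association rate constant. The sketch assumes that this relation applies to the complex described; the function and variable names are illustrative.

```python
import math


def equilibrium_dissociation_constant(k_off: float, k_on: float) -> float:
    """K_D (in M) from k_off (s^-1) and k_on (M^-1 s^-1): K_D = k_off / k_on."""
    if k_on <= 0:
        raise ValueError("k_on must be positive")
    return k_off / k_on


if __name__ == "__main__":
    k_on = 5.11e3    # association rate constant from the abstract, M^-1 s^-1
    k_off = 5.11e-5  # dissociation rate constant from the abstract, s^-1
    k_d = equilibrium_dissociation_constant(k_off, k_on)
    print(f"K_D = {k_d:.2e} M")
    print(math.isclose(k_d, 1e-8, rel_tol=1e-9))
```

    The ratio of the two rate constants reproduces the reported equilibrium value of about 1 × 10⁻⁸ M, so the three numbers in the abstract are mutually consistent.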

  8. Selection against spurious promoter motifs correlates with translational efficiency across bacteria

    SciTech Connect

    Froula, Jeffrey L.; Francino, M. Pilar

    2007-05-01

    Because binding of RNAP to misplaced sites could compromise the efficiency of transcription, natural selection for the optimization of gene expression should regulate the distribution of DNA motifs capable of RNAP-binding across the genome. Here we analyze the distribution of the -10 promoter motifs that bind the σ70 subunit of RNAP in 42 bacterial genomes. We show that selection on these motifs operates across the genome, maintaining an over-representation of -10 motifs in regulatory sequences while eliminating them from the nonfunctional and, in most cases, from the protein coding regions. In some genomes, however, -10 sites are over-represented in the coding sequences; these sites could induce pauses effecting regulatory roles throughout the length of a transcriptional unit. For nonfunctional sequences, the extent of motif under-representation varies across genomes in a manner that broadly correlates with the number of tRNA genes, a good indicator of translational speed and growth rate. This suggests that minimizing the time invested in gene transcription is an important selective pressure against spurious binding. However, selection against spurious binding is detectable in the reduced genomes of host-restricted bacteria that grow at slow rates, indicating that components of efficiency other than speed may also be important. Minimizing the number of RNAP molecules per cell required for transcription, and the corresponding energetic expense, may be most relevant in slow growers. These results indicate that genome-level properties affecting the efficiency of transcription and translation can respond in an integrated manner to optimize gene expression. The detection of selection against promoter motifs in nonfunctional regions also implies that no sequence may evolve free of selective constraints, at least in the relatively small and unstructured genomes of bacteria.
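
    Over- or under-representation of a motif can be illustrated by comparing its observed count with the count expected from base composition alone. The Python sketch below assumes the canonical -10 hexamer TATAAT and an independent-bases null model on a single strand; the abstract does not state the authors' motif definition or statistical model, so both are assumptions, as are the function names and the toy sequence.

```python
from collections import Counter


def count_overlapping(seq, motif):
    """Count occurrences of motif in seq, allowing overlaps."""
    k = len(motif)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == motif)


def expected_count(seq, motif):
    """Expected count if bases were drawn independently at their observed frequencies."""
    freqs = Counter(seq)
    p = 1.0
    for base in motif:
        p *= freqs[base] / len(seq)
    windows = max(len(seq) - len(motif) + 1, 0)
    return windows * p


def representation_ratio(seq, motif="TATAAT"):
    """Observed/expected ratio: > 1 suggests over-representation, < 1 under-representation."""
    return count_overlapping(seq, motif) / expected_count(seq, motif)


toy_region = "GCTATAATGCGCTATAATGG"  # hypothetical sequence, not real genome data
print(count_overlapping(toy_region, "TATAAT"), representation_ratio(toy_region))
```

    A genome-scale analysis would also need to scan both strands and use a null model that accounts for local composition, which this sketch leaves out.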

  9. A Command Editor Tool for X and Motif

    DTIC Science & Technology

    1993-07-01

    X/Motif Design Document for Contract # DAAH01-93-C-R013 minimal implementation...Motif...X/Motif Design Document for Contract # DAAH01-93-C-R013 ...ing of modified system widgets, provides to the developer the full source...X/Motif Design Document for Contract # DAAH01-93-C-R013 user has just

  10. Using the Gibbs Motif Sampler for Phylogenetic Footprinting

    SciTech Connect

    Thompson, William; Conlan, Sean; McCue, Lee Ann; Lawrence, Charles

    2007-07-01

    The Gibbs Motif Sampler (Gibbs) (1) is a software package used to predict conserved elements in biopolymer sequences. While the software can be used to locate conserved motifs in protein sequences, its most common use is the prediction of transcription factor binding sites (TFBSs) in promoters upstream of gene sequences. We will describe approaches that use Gibbs to locate TFBSs in a collection of orthologous nucleotide sequences, i.e. phylogenetic footprinting. To illustrate this technique, we present examples that use Gibbs to detect binding sites for the transcription factor LexA in orthologous sequence data from representative species belonging to two different proteobacterial divisions.

  11. Characterizing regulatory path motifs in integrated networks using perturbational data

    PubMed Central

    2010-01-01

    We introduce Pathicular http://bioinformatics.psb.ugent.be/software/details/Pathicular, a Cytoscape plugin for studying the cellular response to perturbations of transcription factors by integrating perturbational expression data with transcriptional, protein-protein and phosphorylation networks. Pathicular searches for 'regulatory path motifs', short paths in the integrated physical networks which occur significantly more often than expected between transcription factors and their targets in the perturbational data. A case study in Saccharomyces cerevisiae identifies eight regulatory path motifs and demonstrates their biological significance. PMID:20230615

  12. Determination of Sectional Constancy of Organic Coal-Water Fuel Compositions

    NASA Astrophysics Data System (ADS)

    Dmitrienko, Margarita A.; Nyashina, Galina S.; Strizhak, Pavel A.

    2016-02-01

    To enable wide use of coal and oil processing waste in large- and small-scale power generation, the key parameter, the sectional constancy of promising organic coal-water fuels (OCWF), was studied. The compositions of OCWF from brown and bituminous coals, filter cakes, used motor, turbine and dielectric oils, water-oil emulsion and a special wetting agent (plasticizer) were investigated. Two modes of preparation were considered: with a homogenizer and with a cavitator. It was established that the constancy did not exceed 5-7 days for the OCWF compositions with brown coals, and 12-15 days for the compositions with bituminous coals and filter cakes. The injection of used oils into an OCWF composition led to an increase in the viscosity of the fuel compositions and in their sectional constancy.

  13. Discriminative motif discovery in DNA and protein sequences using the DEME algorithm.

    PubMed

    Redhead, Emma; Bailey, Timothy L

    2007-10-15

    Motif discovery aims to detect short, highly conserved patterns in a collection of unaligned DNA or protein sequences. Discriminative motif finding algorithms aim to increase the sensitivity and selectivity of motif discovery by utilizing a second set of sequences, and searching only for patterns that can differentiate the two sets of sequences. Potential applications of discriminative motif discovery include discovering transcription factor binding site motifs in ChIP-chip data and finding protein motifs involved in thermal stability using sets of orthologous proteins from thermophilic and mesophilic organisms. We describe DEME, a discriminative motif discovery algorithm for use with protein and DNA sequences. Input to DEME is two sets of sequences; a "positive" set and a "negative" set. DEME represents motifs using a probabilistic model, and uses a novel combination of global and local search to find the motif that optimally discriminates between the two sets of sequences. DEME is unique among discriminative motif finders in that it uses an informative Bayesian prior on protein motif columns, allowing it to incorporate prior knowledge of residue characteristics. We also introduce four, synthetic, discriminative motif discovery problems that are designed for evaluating discriminative motif finders in various biologically motivated contexts. We test DEME using these synthetic problems and on two biological problems: finding yeast transcription factor binding motifs in ChIP-chip data, and finding motifs that discriminate between groups of thermophilic and mesophilic orthologous proteins. Using artificial data, we show that DEME is more effective than a non-discriminative approach when there are "decoy" motifs or when a variant of the motif is present in the "negative" sequences. With real data, we show that DEME is as good, but not better than non-discriminative algorithms at discovering yeast transcription factor binding motifs. We also show that DEME can find

  14. A two-helix motif positions the active site of lysophosphatidic acid acyltransferase for catalysis within the membrane bilayer

    PubMed Central

    Robertson, Rosanna M.; Yao, Jiangwei; Gajewski, Stefan; Kumar, Gyanendra; Martin, Erik W.; Rock, Charles O.; White, Stephen W.

    2017-01-01

    Phosphatidic acid is the central intermediate in membrane phospholipid synthesis and is generated by two acyltransferases in a pathway conserved in all life forms. The second step in this pathway is catalyzed by 1-acyl-sn-glycero-3-phosphate acyltransferase, called PlsC in bacteria. The crystal structure of PlsC from Thermotoga maritima reveals an unusual hydrophobic/aromatic N-terminal two-helix motif linked to an acyltransferase αβ domain that contains the catalytic HX4D motif. PlsC dictates the acyl chain composition of the 2-position of phospholipids, and the acyl chain selectivity ‘ruler’ is an appropriately placed and closed hydrophobic tunnel. This was confirmed by site-directed mutagenesis and membrane composition analysis of Escherichia coli cells expressing the mutated proteins. MD simulations reveal that the two-helix motif represents a novel substructure that firmly anchors the protein to one leaflet of the membrane. This binding mode allows the PlsC active site to acylate lysophospholipids within the membrane bilayer using soluble acyl donors. PMID:28714993

  15. A two-helix motif positions the lysophosphatidic acid acyltransferase active site for catalysis within the membrane bilayer.

    PubMed

    Robertson, Rosanna M; Yao, Jiangwei; Gajewski, Stefan; Kumar, Gyanendra; Martin, Erik W; Rock, Charles O; White, Stephen W

    2017-08-01

    Phosphatidic acid (PA), the central intermediate in membrane phospholipid synthesis, is generated by two acyltransferases in a pathway conserved in all life forms. The second step in this pathway is catalyzed by 1-acyl-sn-glycerol-3-phosphate acyltransferase, called PlsC in bacteria. Here we present the crystal structure of PlsC from Thermotoga maritima, revealing an unusual hydrophobic/aromatic N-terminal two-helix motif linked to an acyltransferase αβ-domain that contains the catalytic HX4D motif. PlsC dictates the acyl chain composition of the 2-position of phospholipids, and the acyl chain selectivity 'ruler' is an appropriately placed and closed hydrophobic tunnel. We confirmed this by site-directed mutagenesis and membrane composition analysis of Escherichia coli cells that expressed mutant PlsC. Molecular dynamics (MD) simulations showed that the two-helix motif represents a novel substructure that firmly anchors the protein to one leaflet of the membrane. This binding mode allows the PlsC active site to acylate lysophospholipids within the membrane bilayer by using soluble acyl donors.

  16. Nephila clavipes Flagelliform silk-like GGX motifs contribute to extensibility and spacer motifs contribute to strength in synthetic spider silk fibers.

    PubMed

    Adrianos, Sherry L; Teulé, Florence; Hinman, Michael B; Jones, Justin A; Weber, Warner S; Yarger, Jeffery L; Lewis, Randolph V

    2013-06-10

    Flagelliform spider silk is the most extensible silk fiber produced by orb weaver spiders, though not as strong as the dragline silk of the spider. The motifs found in the core of the Nephila clavipes flagelliform Flag protein are GGX, spacer, and GPGGX. Flag does not contain the polyalanine motif known to provide the strength of dragline silk. To investigate the source of flagelliform fiber strength, four recombinant proteins were produced containing variations of the three core motifs of the Nephila clavipes flagelliform Flag protein that produces this type of fiber. The as-spun fibers were processed in 80% aqueous isopropanol using a standardized process for all four fiber types, which produced improved mechanical properties. Mechanical testing of the recombinant proteins determined that the GGX motif contributes extensibility and the spacer motif contributes strength to the recombinant fibers. Recombinant protein fibers containing the spacer motif were stronger than the proteins constructed without the spacer that contained only the GGX motif or the combination of the GGX and GPGGX motifs. The mechanical and structural X-ray diffraction analysis of the recombinant fibers provide data that suggests a functional role of the spacer motif that produces tensile strength, though the spacer motif is not clearly defined structurally. These results indicate that the spacer is likely a primary contributor of strength, with the GGX motif supplying mobility to the protein network of native N. clavipes flagelliform silk fibers.

  17. Nephila clavipes Flagelliform Silk-like GGX Motifs Contribute to Extensibility and Spacer Motifs Contribute to Strength in Synthetic Spider Silk Fibers

    PubMed Central

    Adrianos, Sherry L.; Teulé, Florence; Hinman, Michael B.; Jones, Justin A.; Weber, Warner S.; Yarger, Jeffery L.; Lewis, Randolph V.

    2013-01-01

    Flagelliform spider silk is the most extensible silk fiber produced by orb weaver spiders, though not as strong as the dragline silk of the spider. The motifs found in the core of the Nephila clavipes flagelliform Flag protein are: GGX, spacer, and GPGGX. Flag does not contain the polyalanine motif known to provide the strength of dragline silk. To investigate the source of flagelliform fiber strength, four recombinant proteins were produced containing variations of the three core motifs of the Nephila clavipes flagelliform Flag protein that produces this type of fiber. The as-spun fibers were processed in 80% aqueous isopropanol using a standardized process for all four fiber types, which produced improved mechanical properties. Mechanical testing of the recombinant proteins determined that the GGX motif contributes extensibility and the spacer motif contributes strength to the recombinant fibers. Recombinant protein fibers containing the spacer motif were stronger than the proteins constructed without the spacer that contained only the GGX motif or the combination of the GGX and GPGGX motifs. The mechanical and structural X-ray diffraction analysis of the recombinant fibers provide data that suggests a functional role of the spacer motif that produces tensile strength though the spacer motif is not clearly defined structurally. These results indicate that the spacer is likely a primary contributor of strength with the GGX motif supplying mobility to the protein network of native N. clavipes flagelliform silk fibers. PMID:23646825

  18. Replacement of the Hepatitis E Virus ORF3 Protein PxxP Motif with Heterologous Late Domain Motifs Affects Virus Release Via Interaction with TSG101

    PubMed Central

    Kenney, Scott P.; Wentworth, Jacquelyn; Heffron, Connie L.; Meng, Xiang-Jin

    2015-01-01

    The ORF3 protein of hepatitis E virus (HEV) contains a “PSAP” amino acid late domain motif, which allows for interaction with the endosomal sorting complexes required for transport (ESCRT) pathway aiding virion release. Late domain motifs are interchangeable with other viral late domain motifs in several enveloped viruses; however, it remains unknown whether HEV shares this functional interchangeability and what implications this might have on viral replication. In this study, by substituting heterologous late domain motifs (PPPY, YPDL, and PSAA) for the HEV ORF3 late domain (PSAP), we demonstrated that deviation from the PSAP motif reduces virus release as measured by viral RNA in culture media. Virus release could not be restored by insertion of a heterologous late domain motif or by supplying wild-type ORF3 in trans, suggesting that the HEV PSAP motif is required for viral exit, which cannot be bypassed by the use of alternative heterologous late domains. PMID:26457367

  19. Core signalling motif displaying multistability through multi-state enzymes

    PubMed Central

    Feng, Song; Sáez, Meritxell; Wiuf, Carsten; Feliu, Elisenda

    2016-01-01

    Bistability, and more generally multistability, is a key system dynamics feature enabling decision-making and memory in cells. Deciphering the molecular determinants of multistability is thus crucial for a better understanding of cellular pathways and their (re)engineering in synthetic biology. Here, we show that a key motif found predominantly in eukaryotic signalling systems, namely a futile signalling cycle, can display bistability when featuring a two-state kinase. We provide necessary and sufficient mathematical conditions on the kinetic parameters of this motif that guarantee the existence of multiple steady states. These conditions foster the intuition that bistability arises as a consequence of competition between the two states of the kinase. Extending from this result, we find that increasing the number of kinase states linearly translates into an increase in the number of steady states in the system. These findings reveal, to our knowledge, a new mechanism for the generation of bistability and multistability in cellular signalling systems. Further, the futile cycle featuring a two-state kinase is among the smallest bistable signalling motifs. We show that multi-state kinases and the described competition-based motif are part of several natural signalling systems and thereby could enable them to implement complex information processing through multistability. These results indicate that multi-state kinases in signalling systems are readily exploited by natural evolution and could equally be used by synthetic approaches for the generation of multistable information processing systems at the cellular level. PMID:27733693

  20. Conditional graphical models for protein structural motif recognition.

    PubMed

    Liu, Yan; Carbonell, Jaime; Gopalakrishnan, Vanathi; Weigele, Peter

    2009-05-01

    Determining protein structures is crucial to understanding the mechanisms of infection and designing drugs. However, the elucidation of protein folds by crystallographic experiments can be a bottleneck in the development process. In this article, we present a probabilistic graphical model framework, conditional graphical models, for predicting protein structural motifs. It represents the structure characteristics of a structural motif using a graph, where the nodes denote the secondary structure elements, and the edges indicate the side-chain interactions between the components either within one protein chain or between chains. Then the model defines the optimal segmentation of a protein sequence against the graph by maximizing its "conditional" probability so that it can take advantage of the discriminative training approach. Efficient approximate inference algorithms using the reversible jump Markov Chain Monte Carlo (MCMC) algorithm are developed to handle the resulting complex graphical models. We test our algorithm on four important structural motifs, and our method outperforms other state-of-the-art algorithms for motif recognition. We also hypothesize potential membership proteins of target folds from Swiss-Prot, which further supports the evolutionary hypothesis about viral folds.

  1. Learning Cellular Sorting Pathways Using Protein Interactions and Sequence Motifs

    PubMed Central

    Lin, Tien-Ho; Bar-Joseph, Ziv

    2011-01-01

    Proper subcellular localization is critical for proteins to perform their roles in cellular functions. Proteins are transported by different cellular sorting pathways, some of which take a protein through several intermediate locations until reaching its final destination. The pathway a protein is transported through is determined by carrier proteins that bind to specific sequence motifs. In this article, we present a new method that integrates protein interaction and sequence motif data to model how proteins are sorted through these sorting pathways. We use a hidden Markov model (HMM) to represent protein sorting pathways. The model is able to determine intermediate sorting states and to assign carrier proteins and motifs to the sorting pathways. In simulation studies, we show that the method can accurately recover an underlying sorting model. Using data for yeast, we show that our model leads to accurate prediction of subcellular localization. We also show that the pathways learned by our model recover many known sorting pathways and correctly assign proteins to the path they utilize. The learned model identified new pathways and their putative carriers and motifs and these may represent novel protein sorting mechanisms. Supplementary results and software implementation are available from http://murphylab.web.cmu.edu/software/2010_RECOMB_pathways/. PMID:21999284

  2. Chain motifs: the tails and handles of complex networks.

    PubMed

    Boas, Paulino R Villas; Rodrigues, Francisco A; Travieso, Gonzalo; Costa, Luciano da Fontoura

    2008-02-01

    A great part of the interest in complex networks has been motivated by the presence of structured, frequently nonuniform, connectivity. Because diverse connectivity patterns tend to result in distinct network dynamics, and also because they provide the means to identify and classify several types of complex network, it becomes important to obtain meaningful measurements of the local network topology. In addition to traditional features such as the node degree, clustering coefficient, and shortest path, motifs have been introduced in the literature in order to provide complementary descriptions of the network connectivity. The current work proposes a different type of motif, namely, chains of nodes, that is, sequences of connected nodes with degree 2. These chains have been subdivided into cords, tails, rings, and handles, depending on the type of their extremities (e.g., open or connected). A theoretical analysis of the density of such motifs in random and scale-free networks is described, and an algorithm for identifying these motifs in general networks is presented. The potential of considering chains for network characterization has been illustrated with respect to five categories of real-world networks including 16 cases. Several interesting findings were obtained, including the fact that several chains were observed in real-world networks, especially the world wide web, books, and the power grid. The possibility of chains resulting from incompletely sampled networks is also investigated.
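
    One way to picture the chain motifs described above is to group connected degree-2 nodes and then inspect what each chain is attached to. The Python sketch below is an illustration, not the algorithm from the paper. The classification rule is an assumed reading of "depending on the type of their extremities": a ring has no extremities, a cord ends in two degree-1 nodes, a tail in one degree-1 node and one higher-degree node, and a handle in two higher-degree nodes. The function name and the toy graph are hypothetical.

```python
def chain_motifs(adj):
    """Find maximal chains of degree-2 nodes in an undirected simple graph.

    adj maps each node to the set of its neighbours.
    Returns a list of (sorted chain nodes, kind) tuples.
    """
    deg2 = {v for v, nbrs in adj.items() if len(nbrs) == 2}
    seen, result = set(), []
    for start in sorted(deg2):
        if start in seen:
            continue
        chain, stack, extremities = [], [start], []
        seen.add(start)
        while stack:  # flood fill restricted to degree-2 nodes
            v = stack.pop()
            chain.append(v)
            for w in adj[v]:
                if w not in deg2:
                    extremities.append(w)  # node bounding the chain
                elif w not in seen:
                    seen.add(w)
                    stack.append(w)
        if not extremities:
            kind = "ring"
        else:
            # Assumed naming rule based on how many extremities are leaves.
            leaves = sum(1 for e in extremities if len(adj[e]) == 1)
            kind = {2: "cord", 1: "tail", 0: "handle"}[leaves]
        result.append((sorted(chain), kind))
    return result


# Toy graph: leaf 0, chain 1-2 leading to hub 3, and a loop 3-4-5-3.
toy_graph = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
print(chain_motifs(toy_graph))
```

    Grouping by connected components of the degree-2 subgraph keeps the search linear in the number of edges, which matters for networks the size of the web or power-grid examples mentioned above.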

  3. Insights into the motif preference of APOBEC3 enzymes.

    PubMed

    Ebrahimi, Diako; Alinejad-Rokny, Hamid; Davenport, Miles P

    2014-01-01

    We used a multivariate data analysis approach to identify motifs associated with HIV hypermutation by different APOBEC3 enzymes. The analysis showed that APOBEC3G targets G mainly within GG, TG, TGG, GGG, TGGG and also GGGT. The G nucleotides flanked by a C at the 3' end (in +1 and +2 positions) were indicated as disfavoured targets by APOBEC3G. The G nucleotides within GGGG were found to be targeted at a frequency much less than what is expected. We found that the infrequent G-to-A mutation within GGGG is not limited to the inaccessibility, to APOBEC3, of poly Gs in the central and 3' polypurine tracts (PPTs) which remain double stranded during the HIV reverse transcription. GGGG motifs outside the PPTs were also disfavoured. The motifs GGAG and GAGG were also found to be disfavoured targets for APOBEC3. The motif-dependent mutation of G within the HIV genome by members of the APOBEC3 family other than APOBEC3G was limited to GA→AA changes. The results did not show evidence of other types of context-dependent G-to-A changes in the HIV genome.
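
    The context dependence described above (for example GG versus GA targets) can be tabulated by recording, for every G-to-A change, the mutated G together with the base that follows it in the reference. The Python sketch below is a simple counting illustration, not the multivariate analysis used in the study; it assumes two pre-aligned, equal-length sequences with no gaps, and the function name and toy sequences are hypothetical.

```python
from collections import Counter


def g_to_a_contexts(reference, mutated):
    """Count G-to-A changes by dinucleotide context (mutated G plus its 3' neighbour)."""
    if len(reference) != len(mutated):
        raise ValueError("sequences must be aligned to the same length")
    counts = Counter()
    for i in range(len(reference) - 1):
        if reference[i] == "G" and mutated[i] == "A":
            counts[reference[i:i + 2]] += 1
    return counts


toy_reference = "TGGACGATGC"  # hypothetical sequences, not HIV data
toy_mutated = "TAGACAATGC"
print(g_to_a_contexts(toy_reference, toy_mutated))
```

    Extending the window to the -1 and +2 positions would give the trinucleotide and tetranucleotide contexts (TGG, GGGT and so on) listed in the abstract.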

  4. Insights into the Motif Preference of APOBEC3 Enzymes

    PubMed Central

    Ebrahimi, Diako; Alinejad-Rokny, Hamid; Davenport, Miles P.

    2014-01-01

    We used a multivariate data analysis approach to identify motifs associated with HIV hypermutation by different APOBEC3 enzymes. The analysis showed that APOBEC3G targets G mainly within GG, TG, TGG, GGG, TGGG and also GGGT. The G nucleotides flanked by a C at the 3′ end (in +1 and +2 positions) were indicated as disfavoured targets by APOBEC3G. The G nucleotides within GGGG were found to be targeted at a frequency much less than what is expected. We found that the infrequent G-to-A mutation within GGGG is not limited to the inaccessibility, to APOBEC3, of poly Gs in the central and 3′ polypurine tracts (PPTs) which remain double stranded during the HIV reverse transcription. GGGG motifs outside the PPTs were also disfavoured. The motifs GGAG and GAGG were also found to be disfavoured targets for APOBEC3. The motif-dependent mutation of G within the HIV genome by members of the APOBEC3 family other than APOBEC3G was limited to GA→AA changes. The results did not show evidence of other types of context-dependent G-to-A changes in the HIV genome. PMID:24498164

  5. 5. DETAIL VIEW OF THE EGYPTIAN MOTIF DECORATIVE ELEMENTS OF ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    5. DETAIL VIEW OF THE EGYPTIAN MOTIF DECORATIVE ELEMENTS OF BUILDING 1'S MAIN ENTRY TOWER (INCLUDING THE ENGAGED COLUMN CAPITALS, PILASTERS & CAPITALS, CORNICES, AND TERRA COTTA EAGLES); LOOKING SW FROM THE E WING ROOF. (Ryan) - Veterans Administration Medical Center, Building No. 1, Old State Route 13 West, Marion, Williamson County, IL

  6. Structural and functional analysis of the GABARAP interaction motif (GIM)

    DOE PAGES

    Rogov, Vladimir V.; Stolz, Alexandra; Ravichandran, Arvind C.; ...

    2017-06-27

    Through the canonical LC3 interaction motif (LIR), [W/F/Y]–X1–X2–[I/L/V], protein complexes are recruited to autophagosomes to perform their functions as either autophagy adaptors or receptors. How these adaptors/receptors selectively interact with either LC3 or GABARAP families remains unclear. Herein, we determine the range of selectivity of 30 known core LIR motifs towards individual LC3s and GABARAPs. From these, we define a GABARAP Interaction Motif (GIM) sequence ([W/F]–[V/I]–X2–V) that the adaptor protein PLEKHM1 tightly conforms to. Using biophysical and structural approaches, we show that the PLEKHM1–LIR is indeed 11–fold more specific for GABARAP than LC3B. Selective mutation of the X1 and X2 positions either completely abolished the interaction with all LC3 and GABARAPs or increased PLEKHM1–GIM selectivity 20–fold towards LC3B. Finally, we show that conversion of p62/SQSTM1, FUNDC1 and FIP200 LIRs into our newly defined GIM, by introducing two valine residues, enhances their interaction with endogenous GABARAP over LC3B. In conclusion, the identification of a GABARAP–specific interaction motif will aid the identification and characterization of the expanding array of autophagy receptor and adaptor proteins and their in vivo functions.
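
    The two sequence patterns quoted above translate directly into regular expressions, which makes it easy to see that every GIM match is also a core LIR match. The Python sketch below treats each X position as any single residue; the pattern names, the helper `find_motifs` and the toy peptide are assumptions for illustration, and a sequence match alone says nothing about actual binding.

```python
import re

# Lookaheads let overlapping matches be reported.
LIR = re.compile(r"(?=([WFY]..[ILV]))")   # [W/F/Y]-X1-X2-[I/L/V]
GIM = re.compile(r"(?=([WF][VI].V))")     # [W/F]-[V/I]-X2-V


def find_motifs(protein, pattern):
    """Return (0-based position, matched residues) for every match of pattern."""
    return [(m.start(), m.group(1)) for m in pattern.finditer(protein.upper())]


toy_peptide = "AAEDWVNVQGSFTEIDD"  # hypothetical peptide, not a real protein
print(find_motifs(toy_peptide, LIR))
print(find_motifs(toy_peptide, GIM))
```

    In the toy peptide the first site satisfies both patterns, while the second is a core LIR only because its X1 position is not valine or isoleucine.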

  7. Motifs in triadic random graphs based on Steiner triple systems

    NASA Astrophysics Data System (ADS)

    Winkler, Marco; Reichardt, Jörg

    2013-08-01

    Conventionally, pairwise relationships between nodes are considered to be the fundamental building blocks of complex networks. However, over the last decade, the overabundance of certain subnetwork patterns, i.e., the so-called motifs, has attracted much attention. It has been hypothesized that these motifs, instead of links, serve as the building blocks of network structures. Although the relation between a network's topology and the general properties of the system, such as its function, its robustness against perturbations, or its efficiency in spreading information, is the central theme of network science, there is still a lack of sound generative models needed for testing the functional role of subgraph motifs. Our work aims to overcome this limitation. We employ the framework of exponential random graph models (ERGMs) to define models based on triadic substructures. The fact that only a small portion of triads can actually be set independently poses a challenge for the formulation of such models. To overcome this obstacle, we use Steiner triple systems (STSs). These are partitions of sets of nodes into pair-disjoint triads, which thus can be specified independently. Combining the concepts of ERGMs and STSs, we suggest generative models capable of generating ensembles of networks with nontrivial triadic Z-score profiles. Further, we discover inevitable correlations between the abundance of triad patterns, which occur solely for statistical reasons and need to be taken into account when discussing the functional implications of motif statistics. Moreover, we calculate the degree distributions of our triadic random graphs analytically.
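
    The defining property of a Steiner triple system, that every pair of nodes lies in exactly one triad, can be checked directly. The Python sketch below builds the smallest nontrivial example, the seven-point system (Fano plane), from the cyclic construction {i, i+1, i+3} mod 7 and verifies the property. The function names are assumptions, and the sketch covers only the combinatorial object, not the ERGM construction from the paper.

```python
from itertools import combinations


def fano_triples():
    """Seven triads on nodes 0..6 generated by shifting {0, 1, 3} modulo 7."""
    return [tuple(sorted(((i % 7), (i + 1) % 7, (i + 3) % 7))) for i in range(7)]


def is_steiner_triple_system(n, triples):
    """True if every pair of the n nodes occurs in exactly one triple."""
    seen = set()
    for triple in triples:
        for pair in combinations(sorted(triple), 2):
            if pair in seen:
                return False  # two triads share a pair, so they are not pair-disjoint
            seen.add(pair)
    return len(seen) == n * (n - 1) // 2


triples = fano_triples()
print(triples, is_steiner_triple_system(7, triples))
```

    Because no two triads share a pair of nodes, the links inside each triad can be set without affecting any other triad, which is the independence the abstract relies on.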

  8. Forward and Back: Motifs of Inhibition in Olfactory Processing

    PubMed Central

    Bazhenov, Maxim; Stopfer, Mark

    2016-01-01

    The remarkable performance of the olfactory system in classifying and categorizing the complex olfactory environment is built upon several basic neural circuit motifs. These include forms of inhibition that may play comparable roles in widely divergent species. In this issue of Neuron, a new study by Stokes and Isaacson sheds light on how elementary types of inhibition dynamically interact. PMID:20696373

  9. DNA containing CpG motifs induces angiogenesis

    NASA Astrophysics Data System (ADS)

    Zheng, Mei; Klinman, Dennis M.; Gierynska, Malgorzata; Rouse, Barry T.

    2002-06-01

    New blood vessel formation in the cornea is an essential step in the pathogenesis of a blinding immunoinflammatory reaction caused by ocular infection with herpes simplex virus (HSV). By using a murine corneal micropocket assay, we found that HSV DNA (which contains a significant excess of potentially bioactive "CpG" motifs when compared with mammalian DNA) induces angiogenesis. Moreover, synthetic oligodeoxynucleotides containing CpG motifs attract inflammatory cells and stimulate the release of vascular endothelial growth factor (VEGF), which in turn triggers new blood vessel formation. In vitro, CpG DNA induces the J774A.1 murine macrophage cell line to produce VEGF. In vivo CpG-induced angiogenesis was blocked by the administration of anti-mVEGF Ab or the inclusion of "neutralizing" oligodeoxynucleotides that specifically oppose the stimulatory activity of CpG DNA. These findings establish that DNA containing bioactive CpG motifs induces angiogenesis, and suggest that CpG motifs in HSV DNA may contribute to the blinding lesions of stromal keratitis.

  10. Variable structure motifs for transcription factor binding sites.

    PubMed

    Reid, John E; Evans, Kenneth J; Dyer, Nigel; Wernisch, Lorenz; Ott, Sascha

    2010-01-14

    Classically, models of DNA-transcription factor binding sites (TFBSs) have been based on relatively few known instances and have treated them as sites of fixed length using position weight matrices (PWMs). Various extensions to this model have been proposed, most of which take account of dependencies between the bases in the binding sites. However, some transcription factors are known to exhibit some flexibility and bind to DNA in more than one possible physical configuration. In some cases this variation is known to affect the function of binding sites. With the increasing volume of ChIP-seq data available it is now possible to investigate models that incorporate this flexibility. Previous work on variable length models has been constrained by: a focus on specific zinc finger proteins in yeast using restrictive models; a reliance on hand-crafted models for just one transcription factor at a time; and a lack of evaluation on realistically sized data sets. We re-analysed binding sites from the TRANSFAC database and found motivating examples where our new variable length model provides a better fit. We analysed several ChIP-seq data sets with a novel motif search algorithm and compared the results to one of the best standard PWM finders and a recently developed alternative method for finding motifs of variable structure. All the methods performed comparably in held-out cross validation tests. Known motifs of variable structure were recovered for p53, Stat5a and Stat5b. In addition, our method recovered a novel generalised version of an existing PWM for Sp1 that allows for variable length binding. This motif improved classification performance. We have presented a new gapped PWM model for variable length DNA binding sites that is neither too restrictive nor over-parameterised. Our comparison with existing tools shows that on average it does not have better predictive accuracy than existing methods. However, it does provide more interpretable models of motifs of variable
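
    For reference, the classical fixed-length PWM model that this work extends can be sketched in a few lines of Python: build log-odds weights from known site instances, then score every window of a sequence. This is a generic illustration with a uniform background and a pseudocount of 1, both assumptions; it is not the gapped model proposed in the paper, and the binding sites shown are hypothetical.

```python
import math


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Log2-odds position weight matrix from equal-length site instances."""
    total = len(sites) + 4 * pseudocount
    pwm = []
    for pos in range(len(sites[0])):
        column = [site[pos] for site in sites]
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm


def best_hit(seq, pwm):
    """Return (score, position) of the highest-scoring window in seq."""
    k = len(pwm)
    return max(
        (sum(pwm[j][seq[i + j]] for j in range(k)), i)
        for i in range(len(seq) - k + 1)
    )


toy_sites = ["TGACGT", "TGACGT", "TGATGT", "TGACGC"]  # hypothetical instances
toy_pwm = build_pwm(toy_sites)
score, position = best_hit("CCATGACGTAAC", toy_pwm)
print(round(score, 2), position)
```

    A fixed-length matrix like this cannot represent a site whose two halves are separated by a variable number of bases, which is the limitation a gapped model addresses.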

  11. Motif-Based Classification of Time Series with Bayesian Networks and SVMs

    NASA Astrophysics Data System (ADS)

    Buza, Krisztian; Schmidt-Thieme, Lars

    Classification of time series is an important task with many challenging applications like brain wave (EEG) analysis, signature verification or speech recognition. In this paper we show how characteristic local patterns (motifs) can improve the classification accuracy. We introduce a new motif class, generalized semi-continuous motifs. To allow flexibility and noise robustness, these motifs may include gaps of various lengths, generic and more specific wildcards. We propose an efficient algorithm for mining generalized sequential motifs. In experiments on real medical data, we show how generalized semi-continuous motifs improve the accuracy of SVMs and Bayesian Networks for time series classification.

  12. Co-motif discovery identifies an Esrrb-Sox2-DNA ternary complex as a mediator of transcriptional differences between mouse embryonic and epiblast stem cells.

    PubMed

    Hutchins, Andrew Paul; Choo, Siew Hua; Mistri, Tapan Kumar; Rahmani, Mehran; Woon, Chow Thai; Ng, Calista Keow Leng; Jauch, Ralf; Robson, Paul

    2013-02-01

    Transcription factors (TF) often bind in heterodimeric complexes with each TF recognizing a specific neighboring cis element in the regulatory region of the genome. Comprehension of this DNA motif grammar is opaque, yet recent developments have allowed the interrogation of genome-wide TF binding sites. We reasoned that within this data novel motif grammars could be identified that controlled distinct biological programs. For this purpose, we developed a novel motif-discovery tool termed fexcom that systematically interrogates ChIP-seq data to discover spatially constrained TF-TF composite motifs occurring over short DNA distances. We applied this to the extensive ChIP-seq data available from mouse embryonic stem cells (ESCs). In addition to the well-known and most prevalent sox-oct motif, we also discovered a novel constrained spacer motif for Esrrb and Sox2 with a gap of between 2 and 8 bps that Esrrb and Sox2 cobind in a selective fashion. Through the use of knockdown experiments, we argue that the Esrrb-Sox2 complex is an arbiter of gene expression differences between ESCs and epiblast stem cells (EpiSC). A number of genes downregulated upon dual Esrrb/Sox2 knockdown (e.g., Klf4, Klf5, Jam2, Pecam1) are similarly downregulated in the ESC to EpiSC transition and contain the esrrb-sox motif. The prototypical Esrrb-Sox2 target gene, containing an esrrb-sox element conserved throughout eutherian and metatherian mammals, is Nr0b1. Through positive regulation of this transcriptional repressor, we argue the Esrrb-Sox2 complex promotes the ESC state through inhibition of the EpiSC transcriptional program and the same trio may also function to maintain trophoblast stem cells. Copyright © 2012 AlphaMed Press.
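
    The idea of a spatially constrained composite motif (two half-sites separated by a spacer of 2 to 8 bp) can be illustrated with a minimal sketch. This is not the fexcom algorithm, whose details the abstract does not give; the function name `find_spaced_pairs` and the half-site strings in the usage note are hypothetical placeholders, and only one strand and one orientation are scanned.

```python
from typing import List, Tuple


def find_spaced_pairs(
    sequence: str,
    left: str,
    right: str,
    min_gap: int = 2,
    max_gap: int = 8,
) -> List[Tuple[int, int]]:
    """Return (start_of_left, gap_length) for each left..right pair.

    A pair is reported when `right` begins between `min_gap` and
    `max_gap` bases after the end of an exact occurrence of `left`.
    Hypothetical helper for illustration; exact string matching only.
    """
    hits: List[Tuple[int, int]] = []
    start = sequence.find(left)
    while start != -1:
        after_left = start + len(left)
        for gap in range(min_gap, max_gap + 1):
            if sequence.startswith(right, after_left + gap):
                hits.append((start, gap))
        # Continue from the next base so overlapping occurrences are found.
        start = sequence.find(left, start + 1)
    return hits
```

    For example, `find_spaced_pairs("TTAAGGCCCTTGTAA", "AAGG", "TTGT")` reports one pair whose half-sites are separated by 3 bases. A practical tool would score half-sites with weight matrices rather than exact strings and would also scan the reverse complement.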

  13. Motivated Proteins: A web application for studying small three-dimensional protein motifs

    PubMed Central

    Leader, David P; Milner-White, E James

    2009-01-01

    Background Small loop-shaped motifs are common constituents of the three-dimensional structure of proteins. Typically they comprise between three and seven amino acid residues, and are defined by a combination of dihedral angles and hydrogen bonding partners. The most abundant of these are αβ-motifs, asx-motifs, asx-turns, β-bulges, β-bulge loops, β-turns, nests, niches, Schellmann loops, ST-motifs, ST-staples and ST-turns. We have constructed a database of such motifs from a range of high-quality protein structures and built a web application as a visual interface to this. Description The web application, Motivated Proteins, provides access to these 12 motifs (with 48 sub-categories) in a database of over 400 representative proteins. Queries can be made for specific categories or sub-categories of motif, motifs in the vicinity of ligands, motifs which include part of an enzyme active site, overlapping motifs, or motifs which include a particular amino acid sequence. Individual proteins can be specified, or, where appropriate, motifs for all proteins listed. The results of queries are presented in textual form as an (X)HTML table, and may be saved as parsable plain text or XML. Motifs can be viewed and manipulated either individually or in the context of the protein in the Jmol applet structural viewer. Cartoons of the motifs imposed on a linear representation of protein secondary structure are also provided. Summary information for the motifs is available, as are histograms of amino acid distribution, and graphs of dihedral angles at individual positions in the motifs. Conclusion Motivated Proteins is a publicly and freely accessible web application that enables protein scientists to study small three-dimensional motifs without requiring knowledge of either Structured Query Language or the underlying database schema. PMID:19210785

  14. A Bioinformatics Approach for Detecting Repetitive Nested Motifs using Pattern Matching

    PubMed Central

    Romero, José R.; Carballido, Jessica A.; Garbus, Ingrid; Echenique, Viviana C.; Ponzoni, Ignacio

    2016-01-01

    The identification of nested motifs in genomic sequences is a complex computational problem. The detection of these patterns is important to allow the discovery of transposable element (TE) insertions, incomplete reverse transcripts, deletions, and/or mutations. In this study, a de novo strategy for detecting patterns that represent nested motifs was designed based on exhaustive searches for pairs of motifs and combinatorial pattern analysis. These patterns can be grouped into three categories, motifs within other motifs, motifs flanked by other motifs, and motifs of large size. The methodology used in this study, applied to genomic sequences from the plant species Aegilops tauschii and Oryza sativa, revealed that it is possible to identify putative nested TEs by detecting these three types of patterns. The results were validated through BLAST alignments, which revealed the efficacy and usefulness of the new method, which is called Mamushka. PMID:27812277

  15. FPGA implementation of motifs-based neuronal network and synchronization analysis

    NASA Astrophysics Data System (ADS)

    Deng, Bin; Zhu, Zechen; Yang, Shuangming; Wei, Xile; Wang, Jiang; Yu, Haitao

    2016-06-01

    Motifs in complex networks play a crucial role in determining brain function. In this paper, 13 kinds of motifs are implemented on a Field Programmable Gate Array (FPGA) to investigate the relationship between network properties and motif properties. We use a discretization method and a pipelined architecture to construct various motifs with the Hindmarsh-Rose (HR) neuron as the node model. We also build a small-world network based on these motifs and conduct synchronization analysis of the motifs as well as the constructed network. We find that the synchronization properties of a motif determine those of the motif-based small-world network, which demonstrates the effectiveness of our proposed hardware simulation platform. By imitating some vital nuclei in the brain to generate normal discharges, our proposed FPGA-based artificial neuronal networks have the potential to replace injured nuclei and complete the brain function in the treatment of Parkinson's disease and epilepsy.

  16. A Bioinformatics Approach for Detecting Repetitive Nested Motifs using Pattern Matching.

    PubMed

    Romero, José R; Carballido, Jessica A; Garbus, Ingrid; Echenique, Viviana C; Ponzoni, Ignacio

    2016-01-01

    The identification of nested motifs in genomic sequences is a complex computational problem. The detection of these patterns is important to allow the discovery of transposable element (TE) insertions, incomplete reverse transcripts, deletions, and/or mutations. In this study, a de novo strategy for detecting patterns that represent nested motifs was designed based on exhaustive searches for pairs of motifs and combinatorial pattern analysis. These patterns can be grouped into three categories, motifs within other motifs, motifs flanked by other motifs, and motifs of large size. The methodology used in this study, applied to genomic sequences from the plant species Aegilops tauschii and Oryza sativa, revealed that it is possible to identify putative nested TEs by detecting these three types of patterns. The results were validated through BLAST alignments, which revealed the efficacy and usefulness of the new method, which is called Mamushka.

  17. Software tools for motif and pattern scanning: program descriptions including a universal sequence reading algorithm.

    PubMed

    Cockwell, K Y; Giles, I G

    1989-07-01

    Two programs, MOTIF and PATTERN, that scan sequences for matches to user-defined motifs and patterns of motifs based on identity and set membership are described. The programs use a simple and logical notation to define motifs, and may be used either interactively or by using command line parameters (suitable for batch processing). The two programs described also incorporate a simple, yet reliable, algorithm that automatically detects in which of six possible formats the sequence entry is written.
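
    Matching "based on identity and set membership" can be sketched as follows. This is a minimal illustration, not the MOTIF or PATTERN programs themselves; the function `scan_motif` and the example motif are assumptions introduced here. Each motif position is a set of allowed residues, and a one-element set amounts to an identity match.

```python
from typing import List, Sequence, Set


def scan_motif(sequence: str, motif: Sequence[Set[str]]) -> List[int]:
    """Return 0-based start positions where the motif matches.

    A window matches when every residue belongs to the allowed set
    at the corresponding motif position.
    """
    width = len(motif)
    hits: List[int] = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if all(residue in allowed for residue, allowed in zip(window, motif)):
            hits.append(start)
    return hits


# Hypothetical motif: G, then A or T, then any base, then C.
EXAMPLE_MOTIF = [{"G"}, {"A", "T"}, {"A", "C", "G", "T"}, {"C"}]
```

    With this motif, `scan_motif("GACCGTTC", EXAMPLE_MOTIF)` finds matches starting at positions 0 and 4.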

  18. Different gene regulation strategies revealed by analysis of binding motifs.

    PubMed

    Wunderlich, Zeba; Mirny, Leonid A

    2009-10-01

    Coordinated regulation of gene expression relies on transcription factors (TFs) binding to specific DNA sites. Our large-scale information-theoretical analysis of > 950 TF-binding motifs demonstrates that prokaryotes and eukaryotes use strikingly different strategies to target TFs to specific genome locations. Although bacterial TFs can recognize a specific DNA site in the genomic background, eukaryotic TFs exhibit widespread, nonfunctional binding and require clustering of sites to achieve specificity. We find support for this mechanism in a range of experimental studies and in our evolutionary analysis of DNA-binding domains. Our systematic characterization of binding motifs provides a quantitative assessment of the differences in transcription regulation in prokaryotes and eukaryotes.
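
    The abstract does not state which information measure was used. One common choice, assumed here, is the relative entropy of a motif's per-position base frequencies against a background distribution, summed over positions. The sketch below uses hypothetical names (`information_content`, `pwm`, `background`) and is not the authors' code.

```python
import math
from typing import Dict, List


def information_content(
    pwm: List[Dict[str, float]],
    background: Dict[str, float],
) -> float:
    """Sum over positions of sum_b p_b * log2(p_b / q_b), in bits.

    `pwm` holds one dict of base frequencies per motif position;
    `background` holds the genomic base frequencies q_b.
    Zero-frequency bases contribute nothing (limit of p*log p is 0).
    """
    total = 0.0
    for column in pwm:
        for base, p in column.items():
            if p > 0.0:
                total += p * math.log2(p / background[base])
    return total
```

    Against a uniform background, a position that always holds the same base contributes 2 bits and a position with uniform frequencies contributes 0 bits, so longer or more sharply defined motifs carry more information.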

  19. WildSpan: mining structured motifs from protein sequences

    PubMed Central

    2011-01-01

    Background Automatic extraction of motifs from biological sequences is an important research problem in study of molecular biology. For proteins, it is desired to discover sequence motifs containing a large number of wildcard symbols, as the residues associated with functional sites are usually largely separated in sequences. Discovering such patterns is time-consuming because abundant combinations exist when long gaps (a gap consists of one or more successive wildcards) are considered. Mining algorithms often employ constraints to narrow down the search space in order to increase efficiency. However, improper constraint models might degrade the sensitivity and specificity of the motifs discovered by computational methods. We previously proposed a new constraint model to handle large wildcard regions for discovering functional motifs of proteins. The patterns that satisfy the proposed constraint model are called W-patterns. A W-pattern is a structured motif that groups motif symbols into pattern blocks interleaved with large irregular gaps. Considering large gaps reflects the fact that functional residues are not always from a single region of protein sequences, and restricting motif symbols into clusters corresponds to the observation that short motifs are frequently present within protein families. To efficiently discover W-patterns for large-scale sequence annotation and function prediction, this paper first formally introduces the problem to solve and proposes an algorithm named WildSpan (sequential pattern mining across large wildcard regions) that incorporates several pruning strategies to largely reduce the mining cost. Results WildSpan is shown to efficiently find W-patterns containing conserved residues that are far separated in sequences. We conducted experiments with two mining strategies, protein-based and family-based mining, to evaluate the usefulness of W-patterns and performance of WildSpan. The protein-based mining mode of WildSpan is developed for

  20. WildSpan: mining structured motifs from protein sequences.

    PubMed

    Hsu, Chen-Ming; Chen, Chien-Yu; Liu, Baw-Jhiune

    2011-03-31

    Automatic extraction of motifs from biological sequences is an important research problem in study of molecular biology. For proteins, it is desired to discover sequence motifs containing a large number of wildcard symbols, as the residues associated with functional sites are usually largely separated in sequences. Discovering such patterns is time-consuming because abundant combinations exist when long gaps (a gap consists of one or more successive wildcards) are considered. Mining algorithms often employ constraints to narrow down the search space in order to increase efficiency. However, improper constraint models might degrade the sensitivity and specificity of the motifs discovered by computational methods. We previously proposed a new constraint model to handle large wildcard regions for discovering functional motifs of proteins. The patterns that satisfy the proposed constraint model are called W-patterns. A W-pattern is a structured motif that groups motif symbols into pattern blocks interleaved with large irregular gaps. Considering large gaps reflects the fact that functional residues are not always from a single region of protein sequences, and restricting motif symbols into clusters corresponds to the observation that short motifs are frequently present within protein families. To efficiently discover W-patterns for large-scale sequence annotation and function prediction, this paper first formally introduces the problem to solve and proposes an algorithm named WildSpan (sequential pattern mining across large wildcard regions) that incorporates several pruning strategies to largely reduce the mining cost. WildSpan is shown to efficiently find W-patterns containing conserved residues that are far separated in sequences. We conducted experiments with two mining strategies, protein-based and family-based mining, to evaluate the usefulness of W-patterns and performance of WildSpan. The protein-based mining mode of WildSpan is developed for discovering

  1. Finding sequence motifs in groups of functionally related proteins.

    PubMed

    Smith, H O; Annau, T M; Chandrasegaran, S

    1990-01-01

    We have developed a method for rapidly finding patterns of conserved amino acid residues (motifs) in groups of functionally related proteins. All 3-amino acid patterns in a group of proteins of the type aa1 d1 aa2 d2 aa3, where d1 and d2 are distances that can be varied in a range up to 24 residues, are accumulated into an array. Segments of the proteins containing those patterns that occur most frequently are aligned on each other by a scoring method that obtains an average relatedness value for all the amino acids in each column of the aligned sequence block based on the Dayhoff relatedness odds matrix. The automated method successfully finds and displays nearly all of the sequence motifs that have been previously reported to occur in 33 reverse transcriptases, 18 DNA integrases, and 30 DNA methyltransferases.
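
    The accumulation step for patterns of the form aa1 d1 aa2 d2 aa3 can be sketched as below. This is an illustration under stated assumptions, not the authors' implementation: a `Counter` stands in for the array, a distance is taken to be the difference between residue indices, and each pattern is counted at most once per protein. The function name `count_triplet_patterns` is hypothetical, and the later alignment and Dayhoff-based scoring steps are not shown.

```python
from collections import Counter
from typing import Iterable, Set, Tuple

Pattern = Tuple[str, int, str, int, str]


def count_triplet_patterns(
    proteins: Iterable[str],
    max_distance: int = 24,
) -> Counter:
    """Count, per pattern, the number of proteins that contain it.

    A pattern is (aa1, d1, aa2, d2, aa3), where d1 and d2 are the
    index offsets between consecutive pattern residues, each between
    1 and `max_distance`.
    """
    counts: Counter = Counter()
    for seq in proteins:
        seen: Set[Pattern] = set()
        n = len(seq)
        for i in range(n):
            for d1 in range(1, max_distance + 1):
                j = i + d1
                if j >= n:
                    break
                for d2 in range(1, max_distance + 1):
                    k = j + d2
                    if k >= n:
                        break
                    seen.add((seq[i], d1, seq[j], d2, seq[k]))
        # Each protein contributes at most one count per pattern.
        counts.update(seen)
    return counts
```

    Calling `count_triplet_patterns(seqs).most_common(10)` would then list the patterns shared by the most proteins, which are the candidates for the alignment step the abstract describes.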

  2. MEME-LaB: motif analysis in clusters.

    PubMed

    Brown, Paul; Baxter, Laura; Hickman, Richard; Beynon, Jim; Moore, Jonathan D; Ott, Sascha

    2013-07-01

    Genome-wide expression analysis can result in large numbers of clusters of co-expressed genes. Although there are tools for ab initio discovery of transcription factor-binding sites, most do not provide a quick and easy way to study large numbers of clusters. To address this, we introduce a web tool called MEME-LaB. The tool wraps MEME (an ab initio motif finder), providing an interface for users to input multiple gene clusters, retrieve promoter sequences, run motif finding and then easily browse and condense the results, facilitating better interpretation of the results from large-scale datasets. MEME-LaB is freely accessible at: http://wsbc.warwick.ac.uk/wsbcToolsWebpage/. Supplementary data are available at Bioinformatics online.

  3. MEME-LaB: motif analysis in clusters

    PubMed Central

    Brown, Paul; Baxter, Laura; Hickman, Richard; Beynon, Jim; Moore, Jonathan D.; Ott, Sascha

    2013-01-01

    Summary: Genome-wide expression analysis can result in large numbers of clusters of co-expressed genes. Although there are tools for ab initio discovery of transcription factor-binding sites, most do not provide a quick and easy way to study large numbers of clusters. To address this, we introduce a web tool called MEME-LaB. The tool wraps MEME (an ab initio motif finder), providing an interface for users to input multiple gene clusters, retrieve promoter sequences, run motif finding and then easily browse and condense the results, facilitating better interpretation of the results from large-scale datasets. Availability: MEME-LaB is freely accessible at: http://wsbc.warwick.ac.uk/wsbcToolsWebpage/. Contact: p.e.brown@warwick.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online. PMID:23681125

  4. Association of branched oligonucleotides into the i-motif.

    PubMed

    Robidoux, S; Klinck, R; Gehring, K; Damha, M J

    1997-12-01

    The unique architecture of branched oligonucleotides mimicking lariat RNA introns [Wallace and Edmonds, Proc. Natl. Acad. Sci. USA 80, 950-954 (1983)] was exploited to study compounds that associate as two parallel duplexes with intercalating C/C+ base pairs (i-motif DNA) [Gehring et al. Nature 363, 561-565 (1993)]. The formation of a branched cytosine tetrad was induced by joining the 5'-ends of a pair of pentadeoxycytidine strands with a branching riboadenosine (rA) linker. This arrangement causes the orientation of the dC strands to be parallel, and forces the formation of a C/C+ duplex that self-associates into i-DNA. Presence of the i-motif in this structure is supported by thermal denaturation, native gel electrophoresis, CD, and NMR spectroscopy.

  5. A new motif for inhibitors of geranylgeranyl diphosphate synthase.

    PubMed

    Foust, Benjamin J; Allen, Cheryl; Holstein, Sarah A; Wiemer, David F

    2016-08-15

    The enzyme geranylgeranyl diphosphate synthase (GGDPS) is believed to receive the substrate farnesyl diphosphate through one lipophilic channel and release the product geranylgeranyl diphosphate through another. Bisphosphonates with two isoprenoid chains positioned on the α-carbon have proven to be effective inhibitors of this enzyme. Now a new motif has been prepared with one isoprenoid chain on the α-carbon, a second included as a phosphonate ester, and the potential for a third at the α-carbon. The pivaloyloxymethyl prodrugs of several compounds based on this motif have been prepared and the resulting compounds have been tested for their ability to disrupt protein geranylgeranylation and induce cytotoxicity in myeloma cells. The initial biological studies reveal activity consistent with GGDPS inhibition, and demonstrate a structure-function relationship which is dependent on the nature of the alkyl group at the α-carbon. Copyright © 2016 Elsevier Ltd. All rights reserved.

  6. A Combinatorial Code for Splicing Silencing: UAGG and GGGG Motifs

    PubMed Central

    An, Ping; Burge, Christopher B

    2005-01-01

    Alternative pre-mRNA splicing is widely used to regulate gene expression by tuning the levels of tissue-specific mRNA isoforms. Few regulatory mechanisms are understood at the level of combinatorial control despite numerous sequences, distinct from splice sites, that have been shown to play roles in splicing enhancement or silencing. Here we use molecular approaches to identify a ternary combination of exonic UAGG and 5′-splice-site-proximal GGGG motifs that functions cooperatively to silence the brain-region-specific CI cassette exon (exon 19) of the glutamate NMDA R1 receptor (GRIN1) transcript. Disruption of three components of the motif pattern converted the CI cassette into a constitutive exon, while predominant skipping was conferred when the same components were introduced, de novo, into a heterologous constitutive exon. Predominant exon silencing was directed by the motif pattern in the presence of six competing exonic splicing enhancers, and this effect was retained after systematically repositioning the two exonic UAGGs within the CI cassette. In this system, hnRNP A1 was shown to mediate silencing while hnRNP H antagonized silencing. Genome-wide computational analysis combined with RT-PCR testing showed that a class of skipped human and mouse exons can be identified by searches that preserve the sequence and spatial configuration of the UAGG and GGGG motifs. This analysis suggests that the multi-component silencing code may play an important role in the tissue-specific regulation of the CI cassette exon, and that it may serve more generally as a molecular language to allow for intricate adjustments and the coordination of splicing patterns from different genes. PMID:15828859

  7. Graph animals, subgraph sampling, and motif search in large networks.

    PubMed

    Baskerville, Kim; Grassberger, Peter; Paczuski, Maya

    2007-09-01

    We generalize a sampling algorithm for lattice animals (connected clusters on a regular lattice) to a Monte Carlo algorithm for "graph animals," i.e., connected subgraphs in arbitrary networks. As with the algorithm in [N. Kashtan et al., Bioinformatics 20, 1746 (2004)], it provides a weighted sample, but the computation of the weights is much faster (linear in the size of subgraphs, instead of superexponential). This allows subgraphs with up to ten or more nodes to be sampled with very high statistics, from arbitrarily large networks. Using this together with a heuristic algorithm for rapidly classifying isomorphic graphs, we present results for two protein interaction networks obtained using the tandem affinity purification (TAP) method: one of Escherichia coli with 230 nodes and 695 links, and one for yeast (Saccharomyces cerevisiae) with roughly ten times more nodes and links. We find in both cases that most connected subgraphs are strong motifs (Z scores >10) or antimotifs (Z scores <-10) when the null model is the ensemble of networks with fixed degree sequence. Strong differences appear between the two networks, with dominant motifs in E. coli being (nearly) bipartite graphs and having many pairs of nodes that connect to the same neighbors, while dominant motifs in yeast tend towards completeness or contain large cliques. We also explore a number of methods that do not rely on measurements of Z scores or comparisons with null models. For instance, we discuss the influence of specific complexes like the 26S proteasome in yeast, where a small number of complexes dominate the k cores with large k and have a decisive effect on the strongest motifs with 6-8 nodes. We also present Zipf plots of counts versus rank. They show broad distributions that are not power laws, in contrast to the case when disconnected subgraphs are included.
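
    The motif and antimotif thresholds quoted above rest on a Z score that compares a subgraph's count in the real network with its counts in a null ensemble. A minimal sketch of that standard definition follows; the function name `motif_z_score` and its inputs are hypothetical, the use of the population standard deviation is a choice made here, and the sampling algorithm itself is not reproduced.

```python
import statistics
from typing import Sequence


def motif_z_score(observed: float, null_counts: Sequence[float]) -> float:
    """Return (observed - mean(null)) / std(null).

    `null_counts` are counts of the same subgraph in randomized
    networks, for example networks with the same degree sequence.
    """
    mean = statistics.mean(null_counts)
    std = statistics.pstdev(null_counts)
    if std == 0:
        raise ValueError("null counts have zero variance; Z score undefined")
    return (observed - mean) / std
```

    Under the thresholds in the abstract, a subgraph with a Z score above 10 would be called a strong motif and one below -10 a strong antimotif.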

  8. Biosynthesis of caffeine underlying the diversity of motif B' methyltransferase.

    PubMed

    Nakayama, Fumiyo; Mizuno, Kouichi; Kato, Misako

    2015-05-01

    Caffeine (1,3,7-trimethylxanthine) and theobromine (3,7-dimethylxanthine) are well-known purine alkaloids in Camellia, Coffea, Cola, Paullinia, Ilex, and Theobroma spp. The caffeine biosynthetic pathway depends on the substrate specificity of N-methyltransferases, which are members of the motif B' methyltransferase family. The caffeine biosynthetic pathways in purine alkaloid-containing plants might have evolved in parallel with one another, consistent with different catalytic properties of the enzymes involved in these pathways.

  9. Structural assessment of glycyl mutations in invariantly conserved motifs.

    PubMed

    Prakash, Tulika; Sandhu, Kuljeet Singh; Singh, Nitin Kumar; Bhasin, Yasha; Ramakrishnan, C; Brahmachari, Samir K

    2007-11-15

    Motifs that are evolutionarily conserved in proteins are crucial to their structure and function. In one of our earlier studies, we demonstrated that the conserved motifs occurring invariantly across several organisms could act as structural determinants of the proteins. We observed the abundance of glycyl residues in these invariantly conserved motifs. The role of glycyl residues in highly conserved motifs has not been studied extensively. Thus, it would be interesting to examine the structural perturbations induced by mutation in these conserved glycyl sites. In this work, we selected a representative set of invariant signature (IS) peptides for which both the PDB structure and mutation information was available. We thoroughly analyzed the conformational features of the glycyl sites and their local interactions with the surrounding residues. Using Ramachandran angles, we showed that the glycyl residues occurring in these IS peptides, which have undergone mutation, occurred more often in the L-disallowed as compared with the L-allowed region of the Ramachandran plot. Short range contacts around the mutation site were analyzed to study the steric effects. With the results obtained from our analysis, we hypothesize that any change of activity arising because of such mutations must be attributed to the long-range interaction(s) of the new residue if the glycyl residue in the IS peptide occurred in the L-allowed region of the Ramachandran plot. However, the mutation of those conserved glycyl residues that occurred in the L-disallowed region of the Ramachandran plot might lead to an altered activity of the protein as a result of an altered conformation of the backbone in the immediate vicinity of the glycyl residue, in addition to long range effects arising from the long side chains of the new residue. Thus, the loss of activity because of mutation in the conserved glycyl site might either relate to long range interactions or to local perturbations around the site

  10. Motif, the basics: an overview of the widget set

    SciTech Connect

    McClurg, F.R.

    1992-10-01

    The Motif library provides programmers with a rich set of tools for building a graphical user interface with a three-dimensional appearance and a consistent method of interaction for controlling a Unix application. This Xt-based, high-level library presents an "object-oriented" approach to program design for programmers and allows end-users the flexibility to modify attributes of the interface.

  11. Motif, the basics: an overview of the widget set

    SciTech Connect

    McClurg, F.R.

    1992-10-01

    The Motif library provides programmers with a rich set of tools for building a graphical user interface with a three-dimensional appearance and a consistent method of interaction for controlling a Unix application. This Xt-based, high-level library presents an "object-oriented" approach to program design for programmers and allows end-users the flexibility to modify attributes of the interface.

  12. A combinatorial code for splicing silencing: UAGG and GGGG motifs.

    PubMed

    Han, Kyoungha; Yeo, Gene; An, Ping; Burge, Christopher B; Grabowski, Paula J

    2005-05-01

    Alternative pre-mRNA splicing is widely used to regulate gene expression by tuning the levels of tissue-specific mRNA isoforms. Few regulatory mechanisms are understood at the level of combinatorial control despite numerous sequences, distinct from splice sites, that have been shown to play roles in splicing enhancement or silencing. Here we use molecular approaches to identify a ternary combination of exonic UAGG and 5'-splice-site-proximal GGGG motifs that functions cooperatively to silence the brain-region-specific CI cassette exon (exon 19) of the glutamate NMDA R1 receptor (GRIN1) transcript. Disruption of three components of the motif pattern converted the CI cassette into a constitutive exon, while predominant skipping was conferred when the same components were introduced, de novo, into a heterologous constitutive exon. Predominant exon silencing was directed by the motif pattern in the presence of six competing exonic splicing enhancers, and this effect was retained after systematically repositioning the two exonic UAGGs within the CI cassette. In this system, hnRNP A1 was shown to mediate silencing while hnRNP H antagonized silencing. Genome-wide computational analysis combined with RT-PCR testing showed that a class of skipped human and mouse exons can be identified by searches that preserve the sequence and spatial configuration of the UAGG and GGGG motifs. This analysis suggests that the multi-component silencing code may play an important role in the tissue-specific regulation of the CI cassette exon, and that it may serve more generally as a molecular language to allow for intricate adjustments and the coordination of splicing patterns from different genes.

  13. Exon Silencing by UAGG Motifs in Response to Neuronal Excitation

    PubMed Central

    An, Ping; Grabowski, Paula J

    2007-01-01

    Alternative pre-mRNA splicing plays fundamental roles in neurons by generating functional diversity in proteins associated with the communication and connectivity of the synapse. The CI cassette of the NMDA R1 receptor is one of a variety of exons that show an increase in exon skipping in response to cell excitation, but the molecular nature of this splicing responsiveness is not yet understood. Here we investigate the molecular basis for the induced changes in splicing of the CI cassette exon in primary rat cortical cultures in response to KCl-induced depolarization using an expression assay with a tight neuron-specific readout. In this system, exon silencing in response to neuronal excitation was mediated by multiple UAGG-type silencing motifs, and transfer of the motifs to a constitutive exon conferred a similar responsiveness by gain of function. Biochemical analysis of protein binding to UAGG motifs in extracts prepared from treated and mock-treated cortical cultures showed an increase in nuclear hnRNP A1-RNA binding activity in parallel with excitation. Evidence for the role of the NMDA receptor and calcium signaling in the induced splicing response was shown by the use of specific antagonists, as well as cell-permeable inhibitors of signaling pathways. Finally, a wider role for exon-skipping responsiveness is shown to involve additional exons with UAGG-related silencing motifs, and transcripts involved in synaptic functions. These results suggest that, at the post-transcriptional level, excitable exons such as the CI cassette may be involved in strategies by which neurons mount adaptive responses to hyperstimulation. PMID:17298175

  14. Graph animals, subgraph sampling, and motif search in large networks

    NASA Astrophysics Data System (ADS)

    Baskerville, Kim; Grassberger, Peter; Paczuski, Maya

    2007-09-01

    We generalize a sampling algorithm for lattice animals (connected clusters on a regular lattice) to a Monte Carlo algorithm for “graph animals,” i.e., connected subgraphs in arbitrary networks. As with the algorithm in [N. Kashtan et al., Bioinformatics 20, 1746 (2004)], it provides a weighted sample, but the computation of the weights is much faster (linear in the size of subgraphs, instead of superexponential). This allows subgraphs with up to ten or more nodes to be sampled with very high statistics, from arbitrarily large networks. Using this together with a heuristic algorithm for rapidly classifying isomorphic graphs, we present results for two protein interaction networks obtained using the tandem affinity purification (TAP) method: one of Escherichia coli with 230 nodes and 695 links, and one for yeast (Saccharomyces cerevisiae) with roughly ten times more nodes and links. We find in both cases that most connected subgraphs are strong motifs (Z scores >10) or antimotifs (Z scores <-10) when the null model is the ensemble of networks with fixed degree sequence. Strong differences appear between the two networks, with dominant motifs in E. coli being (nearly) bipartite graphs and having many pairs of nodes that connect to the same neighbors, while dominant motifs in yeast tend towards completeness or contain large cliques. We also explore a number of methods that do not rely on measurements of Z scores or comparisons with null models. For instance, we discuss the influence of specific complexes like the 26S proteasome in yeast, where a small number of complexes dominate the k cores with large k and have a decisive effect on the strongest motifs with 6-8 nodes. We also present Zipf plots of counts versus rank. They show broad distributions that are not power laws, in contrast to the case when disconnected subgraphs are included.

  15. Dissecting protein loops with a statistical scalpel suggests a functional implication of some structural motifs

    PubMed Central

    2011-01-01

    Background: One of the strategies for protein function annotation is to search for particular structural motifs that are known to be shared by proteins with a given function. Results: Here, we present a systematic extraction of structural motifs of seven residues from protein loops and we explore their correspondence with functional sites. Our approach is based on the structural alphabet HMM-SA (Hidden Markov Model - Structural Alphabet), which allows simplification of protein structures into uni-dimensional sequences, and advanced pattern statistics adapted to short sequences. Structural motifs of interest are selected by looking for structural motifs significantly over-represented in SCOP superfamilies in protein loops. We discovered two types of structural motifs significantly over-represented in SCOP superfamilies: (i) ubiquitous motifs, shared by several superfamilies and (ii) superfamily-specific motifs, over-represented in few superfamilies. A comparison of ubiquitous words with known small structural motifs shows that they include well-described motifs such as turn, niche or nest motifs. A comparison between superfamily-specific motifs and biological annotations of Swiss-Prot reveals that some of them actually correspond to functional sites involved in the binding sites of small ligands, such as ATP/GTP, NAD(P) and SAH/SAM. Conclusions: Our findings show that statistical over-representation in SCOP superfamilies is linked to functional features. The detection of over-represented motifs within structures simplified by HMM-SA is therefore a promising approach for prediction of functional sites and annotation of uncharacterized proteins. PMID:21689388

  16. State-Dependent Modulation of Slow Wave Motifs towards Awakening

    PubMed Central

    Shimaoka, Daisuke; Song, Chenchen; Knöpfel, Thomas

    2017-01-01

    Slow cortical waves that propagate across the cerebral cortex forming large-scale spatiotemporal propagation patterns are a hallmark of non-REM sleep and anesthesia, but also occur during resting wakefulness. To investigate how the spatiotemporal properties of slow waves change with the depth of anesthetic, we optically imaged population voltage transients generated by mouse layer 2/3 pyramidal neurons across one or two cortical hemispheres dorsally with a genetically encoded voltage indicator (GEVI). From deep barbiturate anesthesia to light barbiturate sedation, depolarizing wave events recruiting at least 50% of the imaged cortical area consistently appeared as a conserved repertoire of distinct wave motifs. Toward awakening, the incidence of individual motifs changed systematically (the motif propagating from visual to motor areas increased while that from somatosensory to visual areas decreased) and both local and global cortical dynamics accelerated. These findings highlight that functional endogenous interactions between distant cortical areas are not only constrained by anatomical connectivity, but can also be modulated by the brain state. PMID:28484371

  17. State-Dependent Modulation of Slow Wave Motifs towards Awakening.

    PubMed

    Shimaoka, Daisuke; Song, Chenchen; Knöpfel, Thomas

    2017-01-01

    Slow cortical waves that propagate across the cerebral cortex forming large-scale spatiotemporal propagation patterns are a hallmark of non-REM sleep and anesthesia, but also occur during resting wakefulness. To investigate how the spatiotemporal properties of slow waves change with the depth of anesthetic, we optically imaged population voltage transients generated by mouse layer 2/3 pyramidal neurons across one or two cortical hemispheres dorsally with a genetically encoded voltage indicator (GEVI). From deep barbiturate anesthesia to light barbiturate sedation, depolarizing wave events recruiting at least 50% of the imaged cortical area consistently appeared as a conserved repertoire of distinct wave motifs. Toward awakening, the incidence of individual motifs changed systematically (the motif propagating from visual to motor areas increased while that from somatosensory to visual areas decreased) and both local and global cortical dynamics accelerated. These findings highlight that functional endogenous interactions between distant cortical areas are not only constrained by anatomical connectivity, but can also be modulated by the brain state.

  18. A +1 ribosomal frameshifting motif prevalent among plant amalgaviruses.

    PubMed

    Nibert, Max L; Pyle, Jesse D; Firth, Andrew E

    2016-11-01

    Sequence accessions attributable to novel plant amalgaviruses have been found in the Transcriptome Shotgun Assembly database. Sixteen accessions, derived from 12 different plant species, appear to encompass the complete protein-coding regions of the proposed amalgaviruses, which would substantially expand the size of genus Amalgavirus from 4 current species. Other findings include evidence for UUU_CGN as a +1 ribosomal frameshifting motif prevalent among plant amalgaviruses; for a variant version of this motif found thus far in only two amalgaviruses from solanaceous plants; for a region of α-helical coiled coil propensity conserved in a central region of the ORF1 translation product of plant amalgaviruses; and for conserved sequences in a C-terminal region of the ORF2 translation product (RNA-dependent RNA polymerase) of plant amalgaviruses, seemingly beyond the region of conserved polymerase motifs. These results additionally illustrate the value of mining the TSA database and others for novel viral sequences for comparative analyses. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.
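
    The UUU_CGN pattern named in this abstract can be located with a simple in-frame scan. The following is a minimal sketch, assuming that the underscore marks a codon boundary in the upstream reading frame and that N stands for any nucleotide; the function name `find_uuu_cgn` and the `frame` parameter are hypothetical and are not taken from the paper.

```python
def find_uuu_cgn(seq, frame=0):
    """Return 0-based start positions of in-frame UUU_CGN sites.

    seq   : RNA or DNA string (T is treated as U).
    frame : offset (0, 1 or 2) of the reading frame to scan; an assumption
            of this sketch, since the abstract does not describe a scan.
    """
    rna = seq.upper().replace("T", "U")
    hits = []
    # Step codon by codon; a hit needs a full UUU codon followed by CGN.
    for i in range(frame, len(rna) - 5, 3):
        if rna[i:i + 3] == "UUU" and rna[i + 3:i + 5] == "CG":
            hits.append(i)
    return hits


if __name__ == "__main__":
    print(find_uuu_cgn("AUGUUUCGAGGG"))
```

    Scanning a single frame keeps out-of-frame occurrences of the same six nucleotides from being reported; all three frames can be checked by calling the function with `frame` set to 0, 1 and 2.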

  19. QuateXelero: An Accelerated Exact Network Motif Detection Algorithm

    PubMed Central

    Khakabimamaghani, Sahand; Sharafuddin, Iman; Dichter, Norbert; Koch, Ina; Masoudi-Nejad, Ali

    2013-01-01

    Finding motifs in biological, social, technological, and other types of networks has become a widespread method to gain more knowledge about these networks’ structure and function. However, this task is very computationally demanding, because it is closely tied to graph isomorphism, a problem in NP that is not yet known to be in P or to be NP-complete. Accordingly, this research aims to reduce the number of calls to the NAUTY isomorphism detection method, which is the most time-consuming step in many existing algorithms. The work provides an extremely fast motif detection algorithm called QuateXelero, which has a Quaternary Tree data structure at its heart. The proposed algorithm is based on the well-known ESU (FANMOD) motif detection algorithm. The results of experiments on some standard model networks confirm the overall superiority of the proposed algorithm, namely QuateXelero, compared with two of the fastest existing algorithms, G-Tries and Kavosh. QuateXelero is especially fast at constructing the central data structure of the algorithm from scratch based on the input network. PMID:23874498
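
    The abstract names ESU as the enumeration scheme that QuateXelero builds on. As a rough illustration of that base step only (not of QuateXelero's quaternary tree, and with no isomorphism classification), a minimal Python sketch might look as follows. It assumes an undirected graph given as a dict of neighbour sets with orderable node labels; the function name `esu` is hypothetical.

```python
def esu(adj, k):
    """Enumerate every connected induced k-node subgraph exactly once.

    adj : dict mapping node -> set of neighbouring nodes (undirected).
    k   : subgraph size.
    Returns a list of frozensets of nodes.
    """
    found = []

    def extend(sub, ext, root):
        if len(sub) == k:
            found.append(frozenset(sub))
            return
        ext = set(ext)
        while ext:
            w = ext.pop()
            # Nodes already in the subgraph or adjacent to it.
            covered = set(sub)
            for s in sub:
                covered |= adj[s]
            # Exclusive neighbours of w with a label above the root.
            fresh = {u for u in adj[w] if u > root and u not in covered}
            extend(sub | {w}, ext | fresh, root)

    for v in adj:
        extend({v}, {u for u in adj[v] if u > v}, v)
    return found


if __name__ == "__main__":
    # Triangle 0-1-2 with a pendant node 3 attached to node 2.
    graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
    for nodes in esu(graph, 3):
        print(sorted(nodes))
```

    Restricting extensions to labels above the root and to exclusive neighbours is what guarantees that each node set is produced once, so no duplicate filtering is needed afterwards.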

  20. Network motifs come in sets: correlations in the randomization process.

    PubMed

    Ginoza, Reid; Mugler, Andrew

    2010-07-01

    The identification of motifs--subgraphs that appear significantly more often in a particular network than in an ensemble of randomized networks--has become a ubiquitous method for uncovering potentially important subunits within networks drawn from a wide variety of fields. We find that the most common algorithms used to generate the ensemble from the real network change subgraph counts in a highly correlated manner, such that one subgraph's status as a motif may not be independent from the statuses of the other subgraphs. We demonstrate this effect for the problem of three- and four-node motif identification in the transcriptional regulatory networks of E. coli and S. cerevisiae in which randomized networks are generated via an edge-swapping algorithm. We find strong correlations among subgraph counts; for three-node subgraphs these correlations are easily interpreted, and we present an information-theoretic tool that may be used to identify correlations among subgraphs of any size. Our results suggest that single-feature statistics such as Z scores that implicitly assume independence among subgraph counts constitute an insufficient summary of the network.
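
    The edge-swapping null model and the Z score mentioned in this abstract can be sketched as below. This is an illustration under simplifying assumptions, not the authors' procedure: the graph is treated as undirected and simple (the paper studies directed regulatory networks), only triangles are counted, and the choice of ten attempted swaps per edge is arbitrary. All function names are hypothetical.

```python
import random
from statistics import mean, pstdev


def count_triangles(edges):
    """Count triangles in a simple undirected graph given as (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # Each triangle is seen once from each of its three edges.
    return sum(len(adj[u] & adj[v]) for u, v in edges) // 3


def edge_swap(edges, n_swaps, rng):
    """Degree-preserving randomization: (a,b),(c,d) -> (a,d),(c,b)."""
    edges = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edges}
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        if rng.random() < 0.5:
            c, d = d, c  # pick one of the two possible rewirings
        if len({a, b, c, d}) < 4:
            continue  # would create a self-loop
        new1, new2 = frozenset((a, d)), frozenset((c, b))
        if new1 in present or new2 in present:
            continue  # would create a duplicate edge
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {new1, new2}
        edges[i], edges[j] = (a, d), (c, b)
    return edges


def triangle_z_score(edges, n_random=200, seed=0):
    """Return (observed count, null mean, Z score) for triangles."""
    rng = random.Random(seed)
    observed = count_triangles(edges)
    counts = [count_triangles(edge_swap(edges, 10 * len(edges), rng))
              for _ in range(n_random)]
    mu, sd = mean(counts), pstdev(counts)
    z = (observed - mu) / sd if sd > 0 else float("nan")
    return observed, mu, z
```

    Because every swap removes and adds edges that are shared by several subgraphs at once, counts of different subgraph types move together across the randomized ensemble, which is the correlation the abstract warns about when Z scores are read one subgraph at a time.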

  1. The bioactive acidic serine- and aspartate-rich motif peptide.

    PubMed

    Minamizaki, Tomoko; Yoshiko, Yuji

    2015-01-01

    The organic component of the bone matrix comprises 40% dry weight of bone. The organic component is mostly composed of type I collagen and small amounts of non-collagenous proteins (NCPs) (10-15% of the total bone protein content). The small integrin-binding ligand N-linked glycoprotein (SIBLING) family, a group of NCPs, is considered to play a key role in bone mineralization. The SIBLING family of proteins shares common structural features, including the arginine-glycine-aspartic acid (RGD) motif and the acidic serine- and aspartic acid-rich motif (ASARM). Clinical manifestations of gene mutations and/or genetically modified mice indicate that SIBLINGs play diverse roles in bone and extraskeletal tissues. ASARM peptides might not be primarily responsible for the functional diversity of SIBLINGs, but this motif is suggested to be a key domain of SIBLINGs. However, the exact function of ASARM peptides is poorly understood. In this article, we discuss the considerable progress made in understanding the role of ASARM as a bioactive peptide.

  2. MAR characteristic motifs mediate episomal vector in CHO cells.

    PubMed

    Lin, Yan; Li, Zhaoxi; Wang, Tianyun; Wang, Xiaoyin; Wang, Li; Dong, Weihua; Jing, Changqin; Yang, Xianjun

    2015-04-01

    An ideal gene therapy vector should enable persistent transgene expression without limitations in safety and reproducibility. Recent insight into the ability of chromosomal matrix attachment regions (MARs) to mediate episomal maintenance of genetic elements allowed the development of a circular episomal vector. Although a MAR-mediated engineered vector has been developed, little is known about which motifs of MAR confer this function during interaction with the host genome. Here, we report an artificially synthesized DNA fragment containing only characteristic motif sequences that served as an alternative to the human beta-interferon matrix attachment region sequence. The potential of the vector to mediate gene transfer in CHO cells was investigated. The short synthetic MAR motifs were found to mediate episomal vector at a low copy number for many generations without integration into the host genome. Higher transgene expression was maintained for at least 4 months. In addition, MAR was maintained episomally and conferred sustained EGFP expression even in nonselective CHO cells. All the results demonstrated that a MAR characteristic sequence-based vector can function as a stable episome in CHO cells, supporting long-term and effective transgene expression. Copyright © 2015 Elsevier B.V. All rights reserved.

  3. A novel swarm intelligence algorithm for finding DNA motifs.

    PubMed

    Lei, Chengwei; Ruan, Jianhua

    2009-01-01

    Discovering DNA motifs from co-expressed or co-regulated genes is an important step towards deciphering complex gene regulatory networks and understanding gene functions. Despite significant improvement in the last decade, it still remains one of the most challenging problems in computational molecular biology. In this work, we propose a novel motif finding algorithm that finds consensus patterns using a population-based stochastic optimisation technique called Particle Swarm Optimisation (PSO), which has been shown to be effective in optimising difficult multidimensional problems in continuous domains. We propose to use a word dissimilarity graph to remap the neighborhood structure of the solution space of DNA motifs, and propose a modification of the naive PSO algorithm to accommodate discrete variables. In order to improve efficiency, we also propose several strategies for escaping from local optima and for automatically determining the termination criteria. Experimental results on simulated challenge problems show that our method is both more efficient and more accurate than several existing algorithms. Applications to several sets of real promoter sequences also show that our approach is able to detect known transcription factor binding sites, and outperforms two of the most popular existing algorithms.

  4. TOPDOM: database of conservatively located domains and motifs in proteins.

    PubMed

    Varga, Julia; Dobson, László; Tusnády, Gábor E

    2016-09-01

    The TOPDOM database, originally created as a collection of domains and motifs located consistently on the same side of the membranes in α-helical transmembrane proteins, has been updated and extended to include consistently localized domains and motifs in globular proteins as well. By taking advantage of the recently developed CCTOP algorithm to determine the type of a protein and predict topology in case of transmembrane proteins, and by applying a thorough search for domains and motifs as well as utilizing the most up-to-date version of all source databases, we managed to reach a 6-fold increase in the size of the whole database and a 2-fold increase in the number of transmembrane proteins. The TOPDOM database is available at http://topdom.enzim.hu. The webpage utilizes the common Apache, PHP5 and MySQL software to provide the user interface for accessing and searching the database. The database itself is generated on a high performance computer. Contact: tusnady.gabor@ttk.mta.hu. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  5. STEME: efficient EM to find motifs in large data sets.

    PubMed

    Reid, John E; Wernisch, Lorenz

    2011-10-01

    MEME and many other popular motif finders use the expectation-maximization (EM) algorithm to optimize their parameters. Unfortunately, the running time of EM is linear in the length of the input sequences. This can prohibit its application to data sets of the size commonly generated by high-throughput biological techniques. A suffix tree is a data structure that can efficiently index a set of sequences. We describe an algorithm, Suffix Tree EM for Motif Elicitation (STEME), that approximates EM using suffix trees. To the best of our knowledge, this is the first application of suffix trees to EM. We provide an analysis of the expected running time of the algorithm and demonstrate that STEME runs an order of magnitude more quickly than the implementation of EM used by MEME. We give theoretical bounds for the quality of the approximation and show that, in practice, the approximation has a negligible effect on the outcome. We provide an open source implementation of the algorithm that we hope will be used to speed up existing and future motif search algorithms.

  6. Caveats in modeling a common motif in genetic circuits

    NASA Astrophysics Data System (ADS)

    Labavić, Darka; Nagel, Hannes; Janke, Wolfhard; Meyer-Ortmanns, Hildegard

    2013-06-01

    From a coarse-grained perspective, the motif of a self-activating species, activating a second species that acts as its own repressor, is widely found in biological systems, in particular in genetic systems with inherent oscillatory behavior. Here we consider a specific realization of this motif as a genetic circuit, termed the bistable frustrated unit, in which genes are described as directly producing proteins. Upon an improved resolution in time, we focus on the effect that inherent time scales on the underlying scale can have on the bifurcation patterns on a coarser scale. Time scales are set by the binding and unbinding rates of the transcription factors to the promoter regions of the genes. Depending on the ratio of these rates to the decay times of both proteins, the appropriate averaging procedure for obtaining a coarse-grained description changes and leads to sets of deterministic equations, which considerably differ in their bifurcation structure. In particular, the desired intermediate range of regular limit cycles fades away when the binding rates of genes are not fast as compared to the decay time of the proteins. Our analysis illustrates that the common topology of the widely found motif alone does not imply universal features in the dynamics.

  7. Maximum likelihood density modification by pattern recognition of structural motifs

    DOEpatents

    Terwilliger, Thomas C.

    2004-04-13

    An electron density for a crystallographic structure having protein regions and solvent regions is improved by maximizing the log likelihood of a set of structure factors $\{F_h\}$ using a local log-likelihood function: $\ln\left[\,p(\rho(x)\mid\mathrm{PROT})\,p_{\mathrm{PROT}}(x) + p(\rho(x)\mid\mathrm{SOLV})\,p_{\mathrm{SOLV}}(x) + p(\rho(x)\mid H)\,p_{H}(x)\,\right]$, where $p_{\mathrm{PROT}}(x)$ is the probability that $x$ is in the protein region, $p(\rho(x)\mid\mathrm{PROT})$ is the conditional probability for $\rho(x)$ given that $x$ is in the protein region, and $p_{\mathrm{SOLV}}(x)$ and $p(\rho(x)\mid\mathrm{SOLV})$ are the corresponding quantities for the solvent region; $p_{H}(x)$ refers to the probability that there is a structural motif at a known location, with a known orientation, in the vicinity of the point $x$; and $p(\rho(x)\mid H)$ is the probability distribution for electron density at this point given that the structural motif actually is present. One appropriate structural motif is a helical structure within the crystallographic structure.

  8. Event Networks and the Identification of Crime Pattern Motifs

    PubMed Central

    2015-01-01

    In this paper we demonstrate the use of network analysis to characterise patterns of clustering in spatio-temporal events. Such clustering is of both theoretical and practical importance in the study of crime, and forms the basis for a number of preventative strategies. However, existing analytical methods show only that clustering is present in data, while offering little insight into the nature of the patterns present. Here, we show how the classification of pairs of events as close in space and time can be used to define a network, thereby generalising previous approaches. The application of graph-theoretic techniques to these networks can then offer significantly deeper insight into the structure of the data than previously possible. In particular, we focus on the identification of network motifs, which have clear interpretation in terms of spatio-temporal behaviour. Statistical analysis is complicated by the nature of the underlying data, and we provide a method by which appropriate randomised graphs can be generated. Two datasets are used as case studies: maritime piracy at the global scale, and residential burglary in an urban area. In both cases, the same significant 3-vertex motif is found; this result suggests that incidents tend to occur not just in pairs, but in fact in larger groups within a restricted spatio-temporal domain. In the 4-vertex case, different motifs are found to be significant in each case, suggesting that this technique is capable of discriminating between clustering patterns at a finer granularity than previously possible. PMID:26605544
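
    The construction described in this abstract, linking pairs of events that are close in both space and time and then inspecting small subgraphs, can be sketched as follows. This is a minimal illustration that assumes planar coordinates, fixed distance and time thresholds, and an undirected network; the thresholds, the function names and the sample events are hypothetical and do not come from the paper's datasets.

```python
from itertools import combinations
from math import hypot


def event_network(events, max_dist, max_dt):
    """Link every pair of events within max_dist in space and max_dt in time.

    events : list of (x, y, t) tuples.
    Returns a set of (i, j) index pairs with i < j.
    """
    edges = set()
    for (i, (x1, y1, t1)), (j, (x2, y2, t2)) in combinations(enumerate(events), 2):
        if hypot(x1 - x2, y1 - y2) <= max_dist and abs(t1 - t2) <= max_dt:
            edges.add((i, j))
    return edges


def three_vertex_counts(n, edges):
    """Count the two connected 3-vertex subgraphs: triangles and open paths."""
    adj = {i: set() for i in range(n)}
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)
    triangles = sum(len(adj[i] & adj[j]) for i, j in edges) // 3
    # Every pair of neighbours of a node forms a two-edge path through it;
    # a triangle contributes three such paths, which are subtracted.
    wedges = sum(len(nb) * (len(nb) - 1) // 2 for nb in adj.values())
    return {"triangle": triangles, "path": wedges - 3 * triangles}


if __name__ == "__main__":
    events = [(0, 0, 0), (1, 0, 1), (0, 1, 2), (10, 10, 1), (0.5, 0.5, 50)]
    net = event_network(events, max_dist=2, max_dt=5)
    print(sorted(net), three_vertex_counts(len(events), net))
```

    Whether a count is significant still depends on comparison with suitably randomised graphs, which this sketch does not attempt.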

  9. STEME: efficient EM to find motifs in large data sets

    PubMed Central

    Reid, John E.; Wernisch, Lorenz

    2011-01-01

    MEME and many other popular motif finders use the expectation–maximization (EM) algorithm to optimize their parameters. Unfortunately, the running time of EM is linear in the length of the input sequences. This can prohibit its application to data sets of the size commonly generated by high-throughput biological techniques. A suffix tree is a data structure that can efficiently index a set of sequences. We describe an algorithm, Suffix Tree EM for Motif Elicitation (STEME), that approximates EM using suffix trees. To the best of our knowledge, this is the first application of suffix trees to EM. We provide an analysis of the expected running time of the algorithm and demonstrate that STEME runs an order of magnitude more quickly than the implementation of EM used by MEME. We give theoretical bounds for the quality of the approximation and show that, in practice, the approximation has a negligible effect on the outcome. We provide an open source implementation of the algorithm that we hope will be used to speed up existing and future motif search algorithms. PMID:21785132

  10. Computing expectation values for RNA motifs using discrete convolutions

    PubMed Central

    Lambert, André; Legendre, Matthieu; Fontaine, Jean-Fred; Gautheret, Daniel

    2005-01-01

    Background: Computational biologists use Expectation values (E-values) to estimate the number of solutions that can be expected by chance during a database scan. Here we focus on computing Expectation values for RNA motifs defined by single-strand and helix lod-score profiles with variable helix spans. Such E-values cannot be computed assuming a normal score distribution and their estimation previously required lengthy simulations. Results: We introduce discrete convolutions as an accurate and fast means to estimate score distributions of lod-score profiles. This method provides excellent score estimations for all single-strand or helical elements tested and also applies to the combination of elements into larger, complex, motifs. Further, the estimated distributions remain accurate even when pseudocounts are introduced into the lod-score profiles. Estimated score distributions are then easily converted into E-values. Conclusion: A good agreement was observed between computed E-values and simulations for a number of complete RNA motifs. This method is now implemented into the ERPIN software, but it can be applied as well to any search procedure based on ungapped profiles with statistically independent columns. PMID:15892887
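
    For the simplest case the abstract mentions, an ungapped profile with statistically independent columns, the score distribution is the convolution of the per-column score distributions. The sketch below illustrates that idea only; it is not ERPIN's implementation and does not handle helices or variable spans. It assumes scores already rounded to integers, a fixed background composition, and an E-value taken as tail probability times the number of scanned positions. All names are hypothetical.

```python
from collections import defaultdict


def column_distribution(column, background):
    """Distribution of the score contributed by one profile column.

    column     : dict base -> integer score.
    background : dict base -> probability of that base.
    """
    dist = defaultdict(float)
    for base, score in column.items():
        dist[score] += background[base]
    return dict(dist)


def convolve(d1, d2):
    """Distribution of the sum of two independent discrete scores."""
    out = defaultdict(float)
    for s1, p1 in d1.items():
        for s2, p2 in d2.items():
            out[s1 + s2] += p1 * p2
    return dict(out)


def score_distribution(profile, background):
    """Score distribution of a whole profile, one convolution per column."""
    dist = {0: 1.0}
    for column in profile:
        dist = convolve(dist, column_distribution(column, background))
    return dist


def e_value(profile, background, threshold, n_positions):
    """Expected number of chance hits scoring at least `threshold`."""
    dist = score_distribution(profile, background)
    tail = sum(p for s, p in dist.items() if s >= threshold)
    return tail * n_positions


if __name__ == "__main__":
    bg = {"A": 0.25, "C": 0.25, "G": 0.25, "U": 0.25}
    col = {"A": 2, "C": -1, "G": -1, "U": -1}
    print(score_distribution([col, col], bg))
    print(e_value([col, col], bg, threshold=4, n_positions=1600))
```

    Working with integer scores keeps the number of distinct score values small, so the cost grows with the score range rather than with the number of possible sequences.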

  11. Computing expectation values for RNA motifs using discrete convolutions.

    PubMed

    Lambert, André; Legendre, Matthieu; Fontaine, Jean-Fred; Gautheret, Daniel

    2005-05-13

    Computational biologists use Expectation values (E-values) to estimate the number of solutions that can be expected by chance during a database scan. Here we focus on computing Expectation values for RNA motifs defined by single-strand and helix lod-score profiles with variable helix spans. Such E-values cannot be computed assuming a normal score distribution and their estimation previously required lengthy simulations. We introduce discrete convolutions as an accurate and fast means to estimate score distributions of lod-score profiles. This method provides excellent score estimations for all single-strand or helical elements tested and also applies to the combination of elements into larger, complex, motifs. Further, the estimated distributions remain accurate even when pseudocounts are introduced into the lod-score profiles. Estimated score distributions are then easily converted into E-values. A good agreement was observed between computed E-values and simulations for a number of complete RNA motifs. This method is now implemented into the ERPIN software, but it can be applied as well to any search procedure based on ungapped profiles with statistically independent columns.

  12. G.U base pairing motifs in ribosomal RNA.

    PubMed

    Gautheret, D; Konings, D; Gutell, R R

    1995-10-01

    An increasing number of recognition mechanisms in RNA are found to involve G.U base pairs. In order to detect new functional sites of this type, we exhaustively analyzed the sequence alignments and secondary structures of eubacterial and chloroplast 16S and 23S rRNA, seeking positions with high levels of G.U pairs. Approximately 120 such sites were identified and classified according to their secondary structure and sequence environment. Overall biases in the distribution of G.U pairs are consistent with previously proposed structural rules: the side of the wobble pair that is subject to a loss of stacking is preferentially exposed to a secondary structure loop, where stacking is not as essential as in helical regions. However, multiple sites violate these rules and display highly conserved G.U pairs in orientations that could cause severe stacking problems. In addition, three motifs displaying a conserved G.U pair in a specific sequence/structure environment occur at an unusually high frequency. These motifs, of which two had not been reported before, involve sequences 5'UG3' 3'GA5' and 5'UG3' 3'GU5', as well as G.U pairs flanked by a bulge loop 3' of U. The possible structures and functions of these recurrent motifs are discussed.

  13. G.U base pairing motifs in ribosomal RNA.

    PubMed Central

    Gautheret, D; Konings, D; Gutell, R R

    1995-01-01

    An increasing number of recognition mechanisms in RNA are found to involve G.U base pairs. In order to detect new functional sites of this type, we exhaustively analyzed the sequence alignments and secondary structures of eubacterial and chloroplast 16S and 23S rRNA, seeking positions with high levels of G.U pairs. Approximately 120 such sites were identified and classified according to their secondary structure and sequence environment. Overall biases in the distribution of G.U pairs are consistent with previously proposed structural rules: the side of the wobble pair that is subject to a loss of stacking is preferentially exposed to a secondary structure loop, where stacking is not as essential as in helical regions. However, multiple sites violate these rules and display highly conserved G.U pairs in orientations that could cause severe stacking problems. In addition, three motifs displaying a conserved G.U pair in a specific sequence/structure environment occur at an unusually high frequency. These motifs, of which two had not been reported before, involve sequences 5'UG3' 3'GA5' and 5'UG3' 3'GU5', as well as G.U pairs flanked by a bulge loop 3' of U. The possible structures and functions of these recurrent motifs are discussed. PMID:7493326

  14. An update on cell surface proteins containing extensin-motifs.

    PubMed

    Borassi, Cecilia; Sede, Ana R; Mecchia, Martin A; Salgado Salter, Juan D; Marzol, Eliana; Muschietti, Jorge P; Estevez, Jose M

    2016-01-01

    In recent years it has become clear that there are several molecular links that interconnect the plant cell surface continuum, which is highly important in many biological processes such as plant growth, development, and interaction with the environment. The plant cell surface continuum can be defined as the space that contains and interlinks the cell wall, plasma membrane and cytoskeleton compartments. In this review, we provide an updated view of cell surface proteins that include modular domains with an extensin (EXT)-motif followed by a cytoplasmic kinase-like domain, known as PERKs (for proline-rich extensin-like receptor kinases); with an EXT-motif and an actin binding domain, known as formins; and with extracellular hybrid-EXTs. We focus our attention on the EXT-motifs with the short sequence Ser-Pro(3-5), which is found in several different protein contexts within the same extracellular space, highlighting a putative conserved structural and functional role. A closer understanding of the dynamic regulation of plant cell surface continuum and its relationship with the downstream signalling cascade is a crucial forthcoming challenge.
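
    The short EXT-motif Ser-Pro(3-5) can be expressed as a pattern over one-letter amino acid codes. The following is a minimal sketch, assuming the motif means one serine followed by three to five prolines; the function name `find_ext_motifs` and the example sequence are hypothetical.

```python
import re

# One Ser followed by 3 to 5 Pro, in one-letter amino acid code.
EXT_MOTIF = re.compile(r"SP{3,5}")


def find_ext_motifs(protein):
    """Return (start, matched_text) for each non-overlapping Ser-Pro(3-5) hit."""
    return [(m.start(), m.group()) for m in EXT_MOTIF.finditer(protein.upper())]


if __name__ == "__main__":
    seq = "AA" + "S" + "P" * 4 + "K" + "S" + "P" * 2 + "AA" + "S" + "P" * 6 + "T"
    print(find_ext_motifs(seq))
```

    The quantifier is greedy and capped at five, so a serine followed by a longer proline run is reported with its first five prolines only.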

  15. Motif structure and cooperation in real-world complex networks

    NASA Astrophysics Data System (ADS)

    Salehi, Mostafa; Rabiee, Hamid R.; Jalili, Mahdi

    2010-12-01

    Networks of dynamical nodes serve as generic models for real-world systems in many branches of science ranging from mathematics to physics, technology, sociology and biology. Collective behavior of agents interacting over complex networks is important in many applications. The cooperation between selfish individuals is one of the most interesting collective phenomena. In this paper we address the interplay between the motifs’ cooperation properties and their abundance in a number of real-world networks including yeast protein-protein interaction, human brain, protein structure, email communication, dolphins’ social interaction, Zachary karate club and Net-science coauthorship networks. First, the amount of cooperativity for all possible undirected subgraphs with three to six nodes is calculated. To this end, the evolutionary dynamics of the Prisoner’s Dilemma game is considered and the cooperativity of each subgraph is calculated as the percentage of cooperating agents at the end of the simulation time. Then, the three- to six-node motifs are extracted for each network. The significance of the abundance of a motif, represented by a Z-value, is obtained by comparison with properly randomized versions of the original network. We found that there is always a group of motifs showing a significant inverse correlation between their cooperativity and Z-value, i.e., the higher the Z-value, the lower the cooperativity. This suggests that networks composed of well-structured units do not have good cooperativity properties.

  16. Retroviruses integrate into a shared, non-palindromic DNA motif.

    PubMed

    Kirk, Paul D W; Huvet, Maxime; Melamed, Anat; Maertens, Goedele N; Bangham, Charles R M

    2016-11-14

    Many DNA-binding factors, such as transcription factors, form oligomeric complexes with structural symmetry that bind to palindromic DNA sequences(1). Palindromic consensus nucleotide sequences are also found at the genomic integration sites of retroviruses(2-6) and other transposable elements(7-9), and it has been suggested that this palindromic consensus arises as a consequence of the structural symmetry in the integrase complex(2,3). However, we show here that the palindromic consensus sequence is not present in individual integration sites of human T-cell lymphotropic virus type 1 (HTLV-1) and human immunodeficiency virus type 1 (HIV-1), but arises in the population average as a consequence of the existence of a non-palindromic nucleotide motif that occurs in approximately equal proportions on the plus strand and the minus strand of the host genome. We develop a generally applicable algorithm to sort the individual integration site sequences into plus-strand and minus-strand subpopulations, and use this to identify the integration site nucleotide motifs of five retroviruses of different genera: HTLV-1, HIV-1, murine leukaemia virus (MLV), avian sarcoma leucosis virus (ASLV) and prototype foamy virus (PFV). The results reveal a non-palindromic motif that is shared between these retroviruses.

  17. Automatic Network Fingerprinting through Single-Node Motifs

    PubMed Central

    Echtermeyer, Christoph; da Fontoura Costa, Luciano; Rodrigues, Francisco A.; Kaiser, Marcus

    2011-01-01

    Complex networks have been characterised by their specific connectivity patterns (network motifs), but their building blocks can also be identified and described by node-motifs—a combination of local network features. One technique to identify single node-motifs has been presented by Costa et al. (L. D. F. Costa, F. A. Rodrigues, C. C. Hilgetag, and M. Kaiser, Europhys. Lett., 87, 1, 2009). Here, we first suggest improvements to the method including how its parameters can be determined automatically. Such automatic routines make high-throughput studies of many networks feasible. Second, the new routines are validated in different network-series. Third, we provide an example of how the method can be used to analyse network time-series. In conclusion, we provide a robust method for systematically discovering and classifying characteristic nodes of a network. In contrast to classical motif analysis, our approach can identify individual components (here: nodes) that are specific to a network. Such special nodes, as hubs before, might be found to play critical roles in real-world networks. PMID:21297963

  18. Event Networks and the Identification of Crime Pattern Motifs.

    PubMed

    Davies, Toby; Marchione, Elio

    2015-01-01

    In this paper we demonstrate the use of network analysis to characterise patterns of clustering in spatio-temporal events. Such clustering is of both theoretical and practical importance in the study of crime, and forms the basis for a number of preventative strategies. However, existing analytical methods show only that clustering is present in data, while offering little insight into the nature of the patterns present. Here, we show how the classification of pairs of events as close in space and time can be used to define a network, thereby generalising previous approaches. The application of graph-theoretic techniques to these networks can then offer significantly deeper insight into the structure of the data than previously possible. In particular, we focus on the identification of network motifs, which have clear interpretation in terms of spatio-temporal behaviour. Statistical analysis is complicated by the nature of the underlying data, and we provide a method by which appropriate randomised graphs can be generated. Two datasets are used as case studies: maritime piracy at the global scale, and residential burglary in an urban area. In both cases, the same significant 3-vertex motif is found; this result suggests that incidents tend to occur not just in pairs, but in fact in larger groups within a restricted spatio-temporal domain. In the 4-vertex case, different motifs are found to be significant in each case, suggesting that this technique is capable of discriminating between clustering patterns at a finer granularity than previously possible.
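
    The network construction described above (linking pairs of events that are close in both space and time, then looking for small subgraphs) can be sketched as follows. This is a minimal illustration under stated assumptions: the thresholds `max_dist` and `max_dt`, the event coordinates, and the choice of triangles as the 3-vertex motif are hypothetical, and the randomised-graph significance test from the paper is not reproduced.

```python
from itertools import combinations
from math import hypot


def build_event_network(events, max_dist, max_dt):
    """Link every pair of events (x, y, t) that lies within max_dist in
    space and max_dt in time. Returns a set of index pairs (i, j), i < j."""
    edges = set()
    for i, j in combinations(range(len(events)), 2):
        xi, yi, ti = events[i]
        xj, yj, tj = events[j]
        if hypot(xi - xj, yi - yj) <= max_dist and abs(ti - tj) <= max_dt:
            edges.add((i, j))
    return edges


def count_triangles(n, edges):
    """Count fully connected 3-vertex subgraphs by brute force."""
    return sum(
        1
        for a, b, c in combinations(range(n), 3)
        if (a, b) in edges and (a, c) in edges and (b, c) in edges
    )


# Hypothetical events: three clustered incidents and one distant one.
events = [(0, 0, 0), (1, 0, 1), (0, 1, 2), (10, 10, 0)]
edges = build_event_network(events, max_dist=2.0, max_dt=3.0)
print(sorted(edges))                        # [(0, 1), (0, 2), (1, 2)]
print(count_triangles(len(events), edges))  # 1
```

    The brute-force triangle count is cubic in the number of events, so it is only suitable for small examples like this one.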

  19. Structure and ubiquitin binding of the ubiquitin-interacting motif

    SciTech Connect

    Fisher, R.; Wang, B.; Alam, S.; Higginson, D.; Robinson, H.; Sundquist, C.; Hill, C.

    2003-01-01

    Ubiquitylation is used to target proteins into a large number of different biological processes including proteasomal degradation, endocytosis, virus budding, and vacuolar protein sorting (Vps). Ubiquitylated proteins are typically recognized using one of several different conserved ubiquitin binding modules. Here, we report the crystal structure and ubiquitin binding properties of one such module, the ubiquitin-interacting motif (UIM). We found that UIM peptides from several proteins involved in endocytosis and vacuolar protein sorting including Hrs, Vps27p, Stam1, and Eps15 bound specifically, but with modest affinity (Kd = 0.1-1 mM), to free ubiquitin. Full affinity ubiquitin binding required the presence of conserved acidic patches at the N and C terminus of the UIM, as well as highly conserved central alanine and serine residues. NMR chemical shift perturbation mapping experiments demonstrated that all of these UIM peptides bind to the I44 surface of ubiquitin. The 1.45 Å resolution crystal structure of the second yeast Vps27p UIM (Vps27p-2) revealed that the ubiquitin-interacting motif forms an amphipathic helix. Although Vps27p-2 is monomeric in solution, the motif unexpectedly crystallized as an antiparallel four-helix bundle, and the potential biological implications of UIM oligomerization are therefore discussed.

  20. A combinatorial approach to the repertoire of RNA kissing motifs; towards multiplex detection by switching hairpin aptamers

    PubMed Central

    Durand, Guillaume; Dausse, Eric; Goux, Emma; Fiore, Emmanuelle; Peyrin, Eric; Ravelet, Corinne; Toulmé, Jean-Jacques

    2016-01-01

    Loop–loop (also known as kissing) interactions between RNA hairpins are involved in several mechanisms in both prokaryotes and eukaryotes such as the regulation of the plasmid copy number or the dimerization of retroviral genomes. The stability of kissing complexes relies on loop parameters (base composition, sequence and size) and base combination at the loop–loop helix - stem junctions. In order to identify kissing partners that could be used as regulatory elements or building blocks of RNA scaffolds, we analysed a pool of 5.2 × 10^6 RNA hairpins with randomized loops. We identified more than 50 pairs of kissing RNA hairpins. Two kissing motifs, 5′CCNY and 5′RYRY, generate highly stable complexes with KDs in the low nanomolar range. Such motifs were introduced in the apical loop of hairpin aptamers that switch between unfolded and folded state upon binding to their cognate target molecule, hence their name aptaswitch. The aptaswitch–ligand complex is specifically recognized by a second RNA hairpin named aptakiss through loop–loop interaction. Taking advantage of our kissing motif repertoire we engineered aptaswitch–aptakiss modules for purine derivatives, namely adenosine, GTP and theophylline and demonstrated that these molecules can be specifically and simultaneously detected by surface plasmon resonance or by fluorescence anisotropy. PMID:27067541

  1. A combinatorial approach to the repertoire of RNA kissing motifs; towards multiplex detection by switching hairpin aptamers.

    PubMed

    Durand, Guillaume; Dausse, Eric; Goux, Emma; Fiore, Emmanuelle; Peyrin, Eric; Ravelet, Corinne; Toulmé, Jean-Jacques

    2016-05-19

    Loop-loop (also known as kissing) interactions between RNA hairpins are involved in several mechanisms in both prokaryotes and eukaryotes such as the regulation of the plasmid copy number or the dimerization of retroviral genomes. The stability of kissing complexes relies on loop parameters (base composition, sequence and size) and base combination at the loop-loop helix - stem junctions. In order to identify kissing partners that could be used as regulatory elements or building blocks of RNA scaffolds, we analysed a pool of 5.2 × 10(6) RNA hairpins with randomized loops. We identified more than 50 pairs of kissing RNA hairpins. Two kissing motifs, 5'CCNY and 5'RYRY, generate highly stable complexes with KDs in the low nanomolar range. Such motifs were introduced in the apical loop of hairpin aptamers that switch between unfolded and folded state upon binding to their cognate target molecule, hence their name aptaswitch. The aptaswitch-ligand complex is specifically recognized by a second RNA hairpin named aptakiss through loop-loop interaction. Taking advantage of our kissing motif repertoire we engineered aptaswitch-aptakiss modules for purine derivatives, namely adenosine, GTP and theophylline and demonstrated that these molecules can be specifically and simultaneously detected by surface plasmon resonance or by fluorescence anisotropy. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  2. Genome-Wide Motif Statistics are Shaped by DNA Binding Proteins over Evolutionary Time Scales

    NASA Astrophysics Data System (ADS)

    Qian, Long; Kussell, Edo

    2016-10-01

    The composition of a genome with respect to all possible short DNA motifs impacts the ability of DNA binding proteins to locate and bind their target sites. Since nonfunctional DNA binding can be detrimental to cellular functions and ultimately to organismal fitness, organisms could benefit from reducing the number of nonfunctional DNA binding sites genome wide. Using in vitro measurements of binding affinities for a large collection of DNA binding proteins, in multiple species, we detect a significant global avoidance of weak binding sites in genomes. We demonstrate that the underlying evolutionary process leaves a distinct genomic hallmark in that similar words have correlated frequencies, a signal that we detect in all species across domains of life. We consider the possibility that natural selection against weak binding sites contributes to this process, and using an evolutionary model we show that the strength of selection needed to maintain global word compositions is on the order of point mutation rates. Likewise, we show that evolutionary mechanisms based on interference of protein-DNA binding with replication and mutational repair processes could yield similar results and operate with similar rates. On the basis of these modeling and bioinformatic results, we conclude that genome-wide word compositions have been molded by DNA binding proteins acting through tiny evolutionary steps over time scales spanning millions of generations.

  3. Tuning structural motifs and alloying of bulk immiscible Mo-Cu bimetallic nanoparticles by gas-phase synthesis

    NASA Astrophysics Data System (ADS)

    Krishnan, Gopi; Verheijen, Marcel A.; Ten Brink, Gert H.; Palasantzas, George; Kooi, Bart J.

    2013-05-01

    Nowadays bimetallic nanoparticles (NPs) have emerged as key materials for important modern applications in nanoplasmonics, catalysis, biodiagnostics, and nanomagnetics. Consequently the control of bimetallic structural motifs with specific shapes provides increasing functionality and selectivity for related applications. However, producing bimetallic NPs with well controlled structural motifs still remains a formidable challenge. Hence, we present here a general methodology for gas phase synthesis of bimetallic NPs with distinctively different structural motifs ranging at a single particle level from a fully mixed alloy to core-shell, to onion (multi-shell), and finally to a Janus/dumbbell, with the same overall particle composition. These concepts are illustrated for Mo-Cu NPs, where the precise control of the bimetallic NPs with various degrees of chemical ordering, including different shapes from spherical to cube, is achieved by tailoring the energy and thermal environment that the NPs experience during their production. The initial state of NP growth, either in the liquid or in the solid state phase, has important implications for the different structural motifs and shapes of synthesized NPs. Finally we demonstrate that we are able to tune the alloying regime, for the otherwise bulk immiscible Mo-Cu, by achieving an increase of the critical size, below which alloying occurs, closely up to an order of magnitude. It is discovered that the critical size of the NP alloy is not only affected by controlled tuning of the alloying temperature but also by the particle shape.

  4. Decreased RNA-binding motif 5 expression is associated with tumor progression in gastric cancer.

    PubMed

    Kobayashi, Takahiko; Ishida, Junich; Shimizu, Yuichi; Kawakami, Hiroshi; Suda, Goki; Muranaka, Tetsuhito; Komatsu, Yoshito; Asaka, Masahiro; Sakamoto, Naoya

    2017-03-01

    RNA-binding motif 5 is a putative tumor suppressor gene that modulates cell cycle arrest and apoptosis. We recently demonstrated that RNA-binding motif 5 inhibits cell growth through the p53 pathway. This study evaluated the clinical significance of RNA-binding motif 5 expression in gastric cancer and the effects of altered RNA-binding motif 5 expression on cancer biology in gastric cancer cells. RNA-binding motif 5 protein expression was evaluated by immunohistochemistry using the surgical specimens of 106 patients with gastric cancer. We analyzed the relationships of RNA-binding motif 5 expression with clinicopathological parameters and patient prognosis. We further explored the effects of RNA-binding motif 5 downregulation with short hairpin RNA on cell growth and p53 signaling in MKN45 gastric cancer cells. Immunohistochemistry revealed that RNA-binding motif 5 expression was decreased in 29 of 106 (27.4%) gastric cancer specimens. Decreased RNA-binding motif 5 expression was correlated with histological differentiation, depth of tumor infiltration, nodal metastasis, tumor-node-metastasis stage, and prognosis. RNA-binding motif 5 silencing enhanced gastric cancer cell proliferation and decreased p53 transcriptional activity in reporter gene assays. Conversely, restoration of RNA-binding motif 5 expression suppressed cell growth and recovered p53 transactivation in RNA-binding motif 5-silenced cells. Furthermore, RNA-binding motif 5 silencing reduced the messenger RNA and protein expression of the p53 target gene p21. Our results suggest that RNA-binding motif 5 downregulation is involved in gastric cancer progression and that RNA-binding motif 5 behaves as a tumor suppressor gene in gastric cancer.

  5. A Novel Bayesian DNA Motif Comparison Method for Clustering and Retrieval

    PubMed Central

    Margalit, Hanah; Friedman, Nir

    2008-01-01

    Characterizing the DNA-binding specificities of transcription factors is a key problem in computational biology that has been addressed by multiple algorithms. These usually take as input sequences that are putatively bound by the same factor and output one or more DNA motifs. A common practice is to apply several such algorithms simultaneously to improve coverage at the price of redundancy. In interpreting such results, two tasks are crucial: clustering of redundant motifs, and attributing the motifs to transcription factors by retrieval of similar motifs from previously characterized motif libraries. Both tasks inherently involve motif comparison. Here we present a novel method for comparing and merging motifs, based on Bayesian probabilistic principles. This method takes into account both the similarity in positional nucleotide distributions of the two motifs and their dissimilarity to the background distribution. We demonstrate the use of the new comparison method as a basis for motif clustering and retrieval procedures, and compare it to several commonly used alternatives. Our results show that the new method outperforms other available methods in accuracy and sensitivity. We incorporated the resulting motif clustering and retrieval procedures in a large-scale automated pipeline for analyzing DNA motifs. This pipeline integrates the results of various DNA motif discovery algorithms and automatically merges redundant motifs from multiple training sets into a coherent annotated library of motifs. Application of this pipeline to recent genome-wide transcription factor location data in S. cerevisiae successfully identified DNA motifs in a manner that is as good as semi-automated analysis reported in the literature. Moreover, we show how this analysis elucidates the mechanisms of condition-specific preferences of transcription factors. PMID:18463706

  6. Identification of cancer-related genes and motifs in the human gene regulatory network.

    PubMed

    Carson, Matthew B; Gu, Jianlei; Yu, Guangjun; Lu, Hui

    2015-08-01

    The authors investigated the regulatory network motifs and corresponding motif positions of cancer-related genes. First, they mapped disease-related genes to a transcription factor regulatory network. Next, they calculated statistically significant motifs and subsequently identified positions within these motifs that were enriched in cancer-related genes. Potential mechanisms of these motifs and positions are discussed. These results could be used to identify other disease- and cancer-related genes and could also suggest mechanisms for how these genes relate to co-occurring diseases.

  7. Identification of sequence motifs in oligonucleotides whose presence is correlated with antisense activity.

    PubMed

    Matveeva, O V; Tsodikov, A D; Giddings, M; Freier, S M; Wyatt, J R; Spiridonov, A N; Shabalina, S A; Gesteland, R F; Atkins, J F

    2000-08-01

    Design of antisense oligonucleotides targeting any mRNA can be much more efficient when several activity-enhancing motifs are included and activity-decreasing motifs are avoided. This conclusion was made after statistical analysis of data collected from >1000 experiments with phosphorothioate-modified oligonucleotides. Highly significant positive correlation between the presence of motifs CCAC, TCCC, ACTC, GCCA and CTCT in the oligonucleotide and its antisense efficiency was demonstrated. In addition, negative correlation was revealed for the motifs GGGG, ACTG, AAA and TAA. It was found that the likelihood of activity of an oligonucleotide against a desired mRNA target is sequence motif content dependent.
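
    As a small illustration of how the motif lists above could be applied when screening candidate oligonucleotides, the sketch below reports which of the listed motifs occur in a sequence. It is only a presence tally, not the statistical model used by the authors; the function name `motif_tally` and the example sequences are hypothetical.

```python
# Motifs taken from the abstract above.
ENHANCING = ("CCAC", "TCCC", "ACTC", "GCCA", "CTCT")
DECREASING = ("GGGG", "ACTG", "AAA", "TAA")


def motif_tally(oligo):
    """Return (enhancing, decreasing) motifs found in the oligonucleotide.

    This is a plain substring check; it does not weight motifs or
    estimate antisense activity.
    """
    seq = oligo.upper()
    found_enhancing = [m for m in ENHANCING if m in seq]
    found_decreasing = [m for m in DECREASING if m in seq]
    return found_enhancing, found_decreasing


# Hypothetical candidate sequences (assumptions for illustration only).
print(motif_tally("TCCCACTCTGCCA"))
print(motif_tally("GGGGTAAACTG"))
```

    A candidate such as the first example, which carries several activity-enhancing motifs and none of the activity-decreasing ones, would be the kind of design the abstract recommends favouring.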

  8. Identification of sequence motifs in oligonucleotides whose presence is correlated with antisense activity

    PubMed Central

    Matveeva, O. V.; Tsodikov, A. D.; Giddings, M.; Freier, S. M.; Wyatt, J. R.; Spiridonov, A. N.; Shabalina, S. A.; Gesteland, R. F.; Atkins, J. F.

    2000-01-01

    Design of antisense oligonucleotides targeting any mRNA can be much more efficient when several activity-enhancing motifs are included and activity-decreasing motifs are avoided. This conclusion was made after statistical analysis of data collected from >1000 experiments with phosphorothioate-modified oligonucleotides. Highly significant positive correlation between the presence of motifs CCAC, TCCC, ACTC, GCCA and CTCT in the oligonucleotide and its antisense efficiency was demonstrated. In addition, negative correlation was revealed for the motifs GGGG, ACTG, AAA and TAA. It was found that the likelihood of activity of an oligonucleotide against a desired mRNA target is sequence motif content dependent. PMID:10908347

  9. Motif-based analysis of large nucleotide data sets using MEME-ChIP.

    PubMed

    Ma, Wenxiu; Noble, William S; Bailey, Timothy L

    2014-01-01

    MEME-ChIP is a web-based tool for analyzing motifs in large DNA or RNA data sets. It can analyze peak regions identified by ChIP-seq, cross-linking sites identified by CLIP-seq and related assays, as well as sets of genomic regions selected using other criteria. MEME-ChIP performs de novo motif discovery, motif enrichment analysis, motif location analysis and motif clustering, providing a comprehensive picture of the DNA or RNA motifs that are enriched in the input sequences. MEME-ChIP performs two complementary types of de novo motif discovery: weight matrix-based discovery for high accuracy; and word-based discovery for high sensitivity. Motif enrichment analysis using DNA or RNA motifs from human, mouse, worm, fly and other model organisms provides even greater sensitivity. MEME-ChIP's interactive HTML output groups and aligns significant motifs to ease interpretation. This protocol takes less than 3 h, and it provides motif discovery approaches that are distinct and complementary to other online methods.

  10. Motif-based analysis of large nucleotide data sets using MEME-ChIP

    PubMed Central

    Ma, Wenxiu; Noble, William S; Bailey, Timothy L

    2014-01-01

    MEME-ChIP is a web-based tool for analyzing motifs in large DNA or RNA data sets. It can analyze peak regions identified by ChIP-seq, cross-linking sites identified by CLIP-seq and related assays, as well as sets of genomic regions selected using other criteria. MEME-ChIP performs de novo motif discovery, motif enrichment analysis, motif location analysis and motif clustering, providing a comprehensive picture of the DNA or RNA motifs that are enriched in the input sequences. MEME-ChIP performs two complementary types of de novo motif discovery: weight matrix–based discovery for high accuracy; and word-based discovery for high sensitivity. Motif enrichment analysis using DNA or RNA motifs from human, mouse, worm, fly and other model organisms provides even greater sensitivity. MEME-ChIP’s interactive HTML output groups and aligns significant motifs to ease interpretation. This protocol takes less than 3 h, and it provides motif discovery approaches that are distinct and complementary to other online methods. PMID:24853928

  11. Evolution of an insect-specific GROUCHO-interaction motif in the ENGRAILED selector protein

    PubMed Central

    Hittinger, Chris Todd; Carroll, Sean B.

    2008-01-01

    Animal morphology evolves through alterations in the genetic regulatory networks that control development. Regulatory connections are commonly added, subtracted, or modified via mutations in cis-regulatory elements, but several cases are also known where transcription factors have gained or lost activity-modulating peptide motifs. In order to better assess the role of novel transcription factor peptide motifs in evolution, we searched for synapomorphic motifs in the homeotic selectors of Drosophila melanogaster and related insects. Here, we describe an evolutionarily novel GROUCHO (GRO)-interaction motif in the ENGRAILED (EN) selector protein. This “ehIFRPF” motif is not homologous to the previously characterized “engrailed homology 1” (eh1) GRO-interaction motif of EN. This second motif is an insect-specific “WRPW”-type motif that has been maintained by purifying selection in at least the dipteran/lepidopteran lineage. We demonstrate that this motif contributes to in vivo repression of the wingless (wg) target gene and to interaction with GRO in vitro. The acquisition and conservation of this auxiliary peptide motif shows how the number and activity of short peptide motifs can evolve in transcription factors while existing regulatory functions are maintained. PMID:18803772

  12. Agonist and antagonist switch DNA motifs recognized by human androgen receptor in prostate cancer

    PubMed Central

    Chen, Zhong; Lan, Xun; Thomas-Ahner, Jennifer M; Wu, Dayong; Liu, Xiangtao; Ye, Zhenqing; Wang, Liguo; Sunkel, Benjamin; Grenade, Cassandra; Chen, Junsheng; Zynger, Debra L; Yan, Pearlly S; Huang, Jiaoti; Nephew, Kenneth P; Huang, Tim H-M; Lin, Shili; Clinton, Steven K; Li, Wei; Jin, Victor X; Wang, Qianben

    2015-01-01

    Human transcription factors recognize specific DNA sequence motifs to regulate transcription. It is unknown whether a single transcription factor is able to bind to distinctly different motifs on chromatin, and if so, what determines the usage of specific motifs. By using a motif-resolution chromatin immunoprecipitation-exonuclease (ChIP-exo) approach, we find that agonist-liganded human androgen receptor (AR) and antagonist-liganded AR bind to two distinctly different motifs, leading to distinct transcriptional outcomes in prostate cancer cells. Further analysis on clinical prostate tissues reveals that the binding of AR to these two distinct motifs is involved in prostate carcinogenesis. Together, these results suggest that unique ligands may switch DNA motifs recognized by ligand-dependent transcription factors in vivo. Our findings also provide a broad mechanistic foundation for understanding ligand-specific induction of gene expression profiles. PMID:25535248

  13. DNA nanotechnology based on i-motif structures.

    PubMed

    Dong, Yuanchen; Yang, Zhongqiang; Liu, Dongsheng

    2014-06-17

    CONSPECTUS: Most biological processes happen at the nanometer scale, and understanding the energy transformations and material transportation mechanisms within living organisms has proved challenging. To better understand the secrets of life, researchers have investigated artificial molecular motors and devices over the past decade because such systems can mimic certain biological processes. DNA nanotechnology based on i-motif structures is one system that has played an important role in these investigations. In this Account, we summarize recent advances in functional DNA nanotechnology based on i-motif structures. The i-motif is a DNA quadruplex that occurs as four stretches of cytosine repeat sequences form C·CH(+) base pairs, and their stabilization requires slightly acidic conditions. This unique property has produced the first DNA molecular motor driven by pH changes. The motor is reliable, and studies show that it is capable of millisecond running speeds, comparable to the speed of natural protein motors. With careful design, the output of these types of motors was combined to drive micrometer-sized cantilevers to bend. Using established DNA nanostructure assembly and functionalization methods, researchers can easily integrate the motor within other DNA assembled structures and functional units, producing DNA molecular devices with new functions such as suprahydrophobic/suprahydrophilic smart surfaces that switch, intelligent nanopores triggered by pH changes, molecular logic gates, and DNA nanosprings. Recently, researchers have produced motors driven by light and electricity, which have allowed DNA motors to be integrated within silicon-based nanodevices. Moreover, some devices based on i-motif structures have proven useful for investigating processes within living cells. The pH-responsiveness of the i-motif structure also provides a way to control the stepwise assembly of DNA nanostructures. In addition, because of the stability of the i-motif, this

  14. Motif discovery with data mining in 3D protein structure databases: discovery, validation and prediction of the U-shape zinc binding ("Huf-Zinc") motif.

    PubMed

    Maurer-Stroh, Sebastian; Gao, He; Han, Hao; Baeten, Lies; Schymkowitz, Joost; Rousseau, Frederic; Zhang, Louxin; Eisenhaber, Frank

    2013-02-01

    Data mining in protein databases, derivatives from more fundamental protein 3D structure and sequence databases, has considerable unearthed potential for the discovery of sequence motif--structural motif--function relationships as the finding of the U-shape (Huf-Zinc) motif, originally a small student's project, exemplifies. The metal ion zinc is critically involved in universal biological processes, ranging from protein-DNA complexes and transcription regulation to enzymatic catalysis and metabolic pathways. Proteins have evolved a series of motifs to specifically recognize and bind zinc ions. Many of these, so called zinc fingers, are structurally independent globular domains with discontinuous binding motifs made up of residues mostly far apart in sequence. Through a systematic approach starting from the BRIX structure fragment database, we discovered that there exists another predictable subset of zinc-binding motifs that not only have a conserved continuous sequence pattern but also share a characteristic local conformation, despite being included in totally different overall folds. While this does not allow general prediction of all Zn binding motifs, a HMM-based web server, Huf-Zinc, is available for prediction of these novel, as well as conventional, zinc finger motifs in protein sequences. The Huf-Zinc webserver can be freely accessed through this URL (http://mendel.bii.a-star.edu.sg/METHODS/hufzinc/).

  15. The Assembly Motif of a Bacterial Small Multidrug Resistance Protein

    PubMed Central

    Poulsen, Bradley E.; Rath, Arianna; Deber, Charles M.

    2009-01-01

    Multidrug transporters such as the small multidrug resistance (SMR) family of bacterial integral membrane proteins are capable of conferring clinically significant resistance to a variety of common therapeutics. As antiporter proteins of ∼100 amino acids, SMRs must self-assemble into homo-oligomeric structures for efflux of drug molecules. Oligomerization centered at transmembrane helix four (TM4) has been implicated in SMR assembly, but the full complement of residues required to mediate its self-interaction remains to be characterized. Here, we use Hsmr, the 110-residue SMR family member of the archaebacterium Halobacterium salinarum, to determine the TM4 residue motif required to mediate drug resistance and SMR self-association. Twelve single point mutants that scan the central portion of the TM4 helix (residues 85–104) were constructed and were tested for their ability to confer resistance to the cytotoxic compound ethidium bromide. Six residues were found to be individually essential for drug resistance activity (Gly90, Leu91, Leu93, Ile94, Gly97, and Val98), defining a minimum activity motif of 90GLXLIXXGV98 within TM4. When the propensity of these mutants to dimerize on SDS-PAGE was examined, replacements of all but Ile resulted in ∼2-fold reduction of dimerization versus the wild-type antiporter. Our work defines a minimum activity motif of 90GLXLIXXGV98 within TM4 and suggests that this sequence mediates TM4-based SMR dimerization along a single helix surface, stabilized by a small residue heptad repeat sequence. These TM4-TM4 interactions likely constitute the highest affinity locus for disruption of SMR function by directly targeting its self-assembly mechanism. PMID:19224913

  16. CENTDIST: discovery of co-associated factors by motif distribution

    PubMed Central

    Zhang, Zhizhuo; Chang, Cheng Wei; Goh, Wan Ling; Sung, Wing-Kin; Cheung, Edwin

    2011-01-01

    Transcription factors (TFs) do not function alone but work together with other TFs (called co-TFs) in a combinatorial fashion to precisely control the transcription of target genes. Mining co-TFs is thus important to understand the mechanism of transcriptional regulation. Although existing methods can identify co-TFs, their accuracy depends heavily on the chosen background model and other parameters such as the enrichment window size and the PWM score cut-off. In this study, we have developed a novel web-based co-motif scanning program called CENTDIST (http://compbio.ddns.comp.nus.edu.sg/~chipseq/centdist/). In comparison to current co-motif scanning programs, CENTDIST does not require the input of any user-specific parameters and background information. Instead, CENTDIST automatically determines the best set of parameters and ranks co-TF motifs based on their distribution around ChIP-seq peaks. We tested CENTDIST on 14 ChIP-seq data sets and found CENTDIST is more accurate than existing methods. In particular, we applied CENTDIST on an Androgen Receptor (AR) ChIP-seq data set from a prostate cancer cell line and correctly predicted all known co-TFs (eight TFs) of AR in the top 20 hits as well as discovering AP4 as a novel co-TF of AR (which was missed by existing methods). Taken together, CENTDIST, which exploits the imbalanced nature of co-TF binding, is a user-friendly, parameter-less and powerful predictive web-based program for understanding the mechanism of transcriptional co-regulation. PMID:21602269

  17. Identification of imine reductase-specific sequence motifs.

    PubMed

    Fademrecht, Silvia; Scheller, Philipp N; Nestl, Bettina M; Hauer, Bernhard; Pleiss, Jürgen

    2016-05-01

    Chiral amines are valuable building blocks for the production of a variety of pharmaceuticals, agrochemicals and other specialty chemicals. Only recently, imine reductases (IREDs) were discovered which catalyze the stereoselective reduction of imines to chiral amines. Although several IREDs were biochemically characterized in the last few years, knowledge of the reaction mechanism and the molecular basis of substrate specificity and stereoselectivity is limited. To gain further insights into the sequence-function relationships, the Imine Reductase Engineering Database (www.IRED.BioCatNet.de) was established and a systematic analysis of 530 putative IREDs was performed. A standard numbering scheme based on R-IRED-Sk was introduced to facilitate the identification and communication of structurally equivalent positions in different proteins. A conservation analysis revealed a highly conserved cofactor binding region and a predominantly hydrophobic substrate binding cleft. Two IRED-specific motifs were identified, the cofactor binding motif GLGxMGx(5)[ATS]x(4)Gx(4)[VIL]WNR[TS]x(2)[KR] and the active site motif Gx[DE]x[GDA]x[APS]x(3){K}x[ASL]x[LMVIAG]. Our results indicate a preference toward NADPH for all IREDs and explain why, despite their sequence similarity to β-hydroxyacid dehydrogenases (β-HADs), no conversion of β-hydroxyacids has been observed. Superfamily-specific conservations were investigated to explore the molecular basis of their stereopreference. Based on our analysis and previous experimental results on IRED mutants, an exclusive role of standard position 187 for stereoselectivity is excluded. Alternatively, two standard positions 139 and 194 were identified which are superfamily-specifically conserved and differ in R- and S-selective enzymes.
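
    The two motifs quoted above are written in a PROSITE-like notation. As a hedged sketch of how such a pattern could be scanned against protein sequences, the code below converts it to a Python regular expression, assuming the usual reading of the notation: `x` is any residue, `x(n)` is n residues, `[...]` lists allowed residues and `{...}` lists excluded residues. The function name and the test sequence are hypothetical, and this is not the tooling used for the database analysis.

```python
import re


def prosite_like_to_regex(pattern):
    """Translate a PROSITE-like motif string into a regex string.

    Whitespace in the pattern is ignored, so spacing left over from
    text extraction does not change the result.
    """
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "x":                      # any residue
            out.append(".")
            i += 1
        elif ch == "(":                    # repeat count for previous element
            j = pattern.index(")", i)
            out.append("{" + pattern[i + 1:j].strip() + "}")
            i = j + 1
        elif ch == "[":                    # allowed residues
            j = pattern.index("]", i)
            out.append(pattern[i:j + 1])
            i = j + 1
        elif ch == "{":                    # excluded residues
            j = pattern.index("}", i)
            out.append("[^" + pattern[i + 1:j] + "]")
            i = j + 1
        elif ch.isspace():
            i += 1
        else:                              # literal residue
            out.append(ch)
            i += 1
    return "".join(out)


cofactor = prosite_like_to_regex("GLGxMGx(5)[ATS]x(4)Gx(4)[VIL]WNR[TS]x(2)[KR]")
active = prosite_like_to_regex("Gx[DE]x[GDA]x[APS]x(3){K}x[ASL]x[LMVIAG]")
print(cofactor)  # GLG.MG.{5}[ATS].{4}G.{4}[VIL]WNR[TS].{2}[KR]
print(active)    # G.[DE].[GDA].[APS].{3}[^K].[ASL].[LMVIAG]

# Hypothetical sequence built to satisfy the cofactor motif.
toy = "MK" + "GLGAMGAAAAAAAAAAGAAAAVWNRTAAK" + "LL"
print(bool(re.search(cofactor, toy)))  # True
```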

  18. Nucleic Acid i-Motif Structures in Analytical Chemistry.

    PubMed

    Alba, Joan Josep; Sadurní, Anna; Gargallo, Raimundo

    2016-09-02

    Under the appropriate experimental conditions of pH and temperature, cytosine-rich segments in DNA or RNA sequences may produce a characteristic folded structure known as an i-motif. Besides its potential role in vivo, which is still under investigation, this structure has attracted increasing interest in other fields due to its sharp, fast and reversible pH-driven conformational changes. This "on/off" switch at molecular level is being used in nanotechnology and analytical chemistry to develop nanomachines and sensors, respectively. This paper presents a review of the latest applications of this structure in the field of chemical analysis.

  19. Evolving DNA motifs to predict GeneChip probe performance

    PubMed Central

    Langdon, WB; Harrison, AP

    2009-01-01

    Background Affymetrix High Density Oligonucleotide Arrays (HDONA) simultaneously measure expression of thousands of genes using millions of probes. We use correlations between measurements for the same gene across 6685 human tissue samples from NCBI's GEO database to indicate the quality of individual HG-U133A probes. Low correlation indicates a poor probe. Results Regular expressions can be automatically created from a Backus-Naur form (BNF) context-free grammar using strongly typed genetic programming. Conclusion The automatically produced motif is better at predicting poor DNA sequences than an existing human generated RE, suggesting runs of Cytosine and Guanine and mixtures should all be avoided. PMID:19298675

  20. Detection of common motifs in RNA secondary structures.

    PubMed Central

    Margalit, H; Shapiro, B A; Oppenheim, A B; Maizel, J V

    1989-01-01

    We describe a novel computerized system for comparison of RNA secondary structures and demonstrate its use for experimental studies. The system is able to screen a very large number of structures, to cluster similar structures and to detect specific structural motifs. In particular, the system is useful for detecting mutations with specific structural effects among all possible point mutations, and for predicting compensatory mutations that will restore the wild type structure. The algorithms are independent of the folding rules that are used to generate the secondary structures. PMID:2473442

  1. Recurring sequence-structure motifs in (βα)8-barrel proteins and experimental optimization of a chimeric protein designed based on such motifs.

    PubMed

    Wang, Jichao; Zhang, Tongchuan; Liu, Ruicun; Song, Meilin; Wang, Juncheng; Hong, Jiong; Chen, Quan; Liu, Haiyan

    2017-02-01

    An interesting way of generating novel artificial proteins is to combine sequence motifs from natural proteins, mimicking the evolutionary path suggested by natural proteins comprising recurring motifs. We analyzed the βα and αβ modules of TIM barrel proteins by structure alignment-based sequence clustering. A number of preferred motifs were identified. A chimeric TIM was designed by using recurring elements as mutually compatible interfaces. The foldability of the designed TIM protein was then significantly improved by six rounds of directed evolution. The melting temperature has been improved by more than 20°C. A variety of characteristics suggested that the resulting protein is well-folded. Our analysis provided a library of peptide motifs that is potentially useful for different protein engineering studies. The protein engineering strategy of using recurring motifs as interfaces to connect partial natural proteins may be applied to other protein folds. Copyright © 2016 Elsevier B.V. All rights reserved.

  2. CTGC motifs within the HIV core promoter specify Tat-responsive pre-initiation complexes

    PubMed Central

    2012-01-01

    Background HIV latency is an obstacle for the eradication of HIV from infected individuals. Stable post-integration latency is controlled principally at the level of transcription. The HIV trans-activating protein, Tat, plays a key function in enhancing HIV transcriptional elongation. The HIV core promoter is specifically required for Tat-mediated trans-activation of HIV transcription. In addition, the HIV core promoter has been shown to be a potential anti-HIV drug target. Despite the pivotal role of the HIV core promoter in the control of HIV gene expression, the molecular mechanisms that couple Tat function specifically to the HIV core promoter remain unknown. Results Using electrophoretic mobility shift assays (EMSAs), the TATA box and adjacent sequences of HIV essential for Tat trans-activation were shown to form specific complexes with nuclear extracts from peripheral blood mononuclear cells, as well as from HeLa cells. These complexes, termed pre-initiation complexes of HIV (PICH), were distinct in composition and DNA binding specificity from those of prototypical eukaryotic TATA box regions such as Adenovirus major late promoter (AdMLP) or the hsp70 promoter. PICH contained basal transcription factors including TATA-binding protein and TFIIA. A mutational analysis revealed that CTGC motifs flanking the HIV TATA box are required for Tat trans-activation in living cells and correct PICH formation in vitro. The binding of known core promoter binding proteins AP-4 and USF-1 was found to be dispensable for Tat function. TAR RNA prevented stable binding of PICH-2, a complex that contains the general transcription factor TFIIA, to the HIV core promoter. The impact of TAR on PICH-2 specifically required its bulge sequence that is also known to interact with Tat. Conclusion Our data reveal that CTGC DNA motifs flanking the HIV TATA box are required for correct formation of specific pre-initiation complexes in vitro and that these motifs are also required for Tat

  3. Sequence Bundles: a novel method for visualising, discovering and exploring sequence motifs

    PubMed Central

    2014-01-01

    Background We introduce Sequence Bundles--a novel data visualisation method for representing multiple sequence alignments (MSAs). We identify and address key limitations of the existing bioinformatics data visualisation methods (i.e. the Sequence Logo) by enabling Sequence Bundles to give salient visual expression to sequence motifs and other data features, which would otherwise remain hidden. Methods For the development of Sequence Bundles we employed research-led information design methodologies. Sequences are encoded as uninterrupted, semi-opaque lines plotted on a 2-dimensional reconfigurable grid. Each line represents a single sequence. The thickness and opacity of the stack at each residue in each position indicates the level of conservation and the lines' curved paths expose patterns in correlation and functionality. Several MSAs can be visualised in a composite image. The Sequence Bundles method is designed to favour a tangible, continuous and intuitive display of information. Results We have developed a software demonstration application for generating a Sequence Bundles visualisation of MSAs provided for the BioVis 2013 redesign contest. A subsequent exploration of the visualised line patterns allowed for the discovery of a number of interesting features in the dataset. Reported features include the extreme conservation of sequences displaying a specific residue and bifurcations of the consensus sequence. Conclusions Sequence Bundles is a novel method for visualisation of MSAs and the discovery of sequence motifs. It can aid in generating new insight and hypothesis making. Sequence Bundles is well disposed for future implementation as an interactive visual analytics software, which can complement existing visualisation tools. PMID:25237395

  4. Intragenic motifs regulate the transcriptional complexity of Pkhd1/PKHD1

    PubMed Central

    Boddu, Ravindra; Yang, Chaozhe; O’Connor, Amber K.; Hendrickson, Robert Curtis; Boone, Braden; Cui, Xiangqin; Garcia-Gonzalez, Miguel; Igarashi, Peter; Onuchic, Luiz F.; Germino, Gregory G.

    2014-01-01

    Autosomal recessive polycystic kidney disease (ARPKD) results from mutations in the human PKHD1 gene. Both this gene, and its mouse ortholog, Pkhd1, are primarily expressed in renal and biliary ductal structures. The mouse protein product, fibrocystin/polyductin complex (FPC), is a 445-kDa protein encoded by a 67-exon transcript that spans >500 kb of genomic DNA. In the current study, we observed multiple alternatively spliced Pkhd1 transcripts that varied in size and exon composition in embryonic mouse kidney, liver, and placenta samples, as well as among adult mouse pancreas, brain, heart, lung, testes, liver, and kidney. Using reverse transcription PCR and RNASeq, we identified 22 novel Pkhd1 kidney transcripts with unique exon junctions. Various mechanisms of alternative splicing were observed, including exon skipping, use of alternate acceptor/donor splice sites, and inclusion of novel exons. Bioinformatic analyses identified, and exon-trapping minigene experiments validated, consensus binding sites for serine/arginine-rich proteins that modulate alternative splicing. Using site-directed mutagenesis, we examined the functional importance of selected splice enhancers. In addition, we demonstrated that many of the novel transcripts were polysome bound, thus likely translated. Finally, we determined that the human PKHD1 R760H missense variant alters a splice enhancer motif that disrupts exon splicing in vitro and is predicted to truncate the protein. Taken together, these data provide evidence of the complex transcriptional regulation of Pkhd1/PKHD1 and identified motifs that regulate its splicing. Our studies indicate that Pkhd1/PKHD1 transcription is modulated, in part by intragenic factors, suggesting that aberrant PKHD1 splicing represents an unappreciated pathogenic mechanism in ARPKD. PMID:24984783

  5. Fast and Accurate Discovery of Degenerate Linear Motifs in Protein Sequences

    PubMed Central

    Levy, Emmanuel D.; Michnick, Stephen W.

    2014-01-01

    Linear motifs mediate a wide variety of cellular functions, which makes their characterization in protein sequences crucial to understanding cellular systems. However, the short length and degenerate nature of linear motifs make their discovery a difficult problem. Here, we introduce MotifHound, an algorithm particularly suited for the discovery of small and degenerate linear motifs. MotifHound performs an exact and exhaustive enumeration of all motifs present in proteins of interest, including all of their degenerate forms, and scores the overrepresentation of each motif based on its occurrence in proteins of interest relative to a background (e.g., proteome) using the hypergeometric distribution. To assess MotifHound, we benchmarked it together with state-of-the-art algorithms. The benchmark consists of 11,880 sets of proteins from S. cerevisiae; in each set, we artificially spiked-in one motif varying in terms of three key parameters, (i) number of occurrences, (ii) length and (iii) the number of degenerate or “wildcard” positions. The benchmark enabled the evaluation of the impact of these three properties on the performance of the different algorithms. The results showed that MotifHound and SLiMFinder were the most accurate in detecting degenerate linear motifs. Interestingly, MotifHound was 15 to 20 times faster at comparable accuracy and performed best in the discovery of highly degenerate motifs. We complemented the benchmark by an analysis of proteins experimentally shown to bind the FUS1 SH3 domain from S. cerevisiae. Using the full-length protein partners as sole information, MotifHound recapitulated most experimentally determined motifs binding to the FUS1 SH3 domain. Moreover, these motifs exhibited properties typical of SH3 binding peptides, e.g., high intrinsic disorder and evolutionary conservation, despite the fact that none of these properties were used as prior information. MotifHound is available (http://michnick.bcm.umontreal.ca or http
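
    The abstract states that over-representation is scored with the hypergeometric distribution. A minimal sketch of such a test is given below, assuming the score is the upper-tail probability of seeing at least the observed number of motif-containing proteins in the set of interest. The function name, argument names and the example counts are illustrative assumptions, not MotifHound's actual interface or data.

```python
from math import comb


def hypergeom_tail(sample_hits, population_size, population_hits, sample_size):
    """P(X >= sample_hits) when drawing sample_size proteins without
    replacement from a population in which population_hits carry the motif."""
    upper = min(population_hits, sample_size)
    total = comb(population_size, sample_size)
    favourable = sum(
        comb(population_hits, i)
        * comb(population_size - population_hits, sample_size - i)
        for i in range(sample_hits, upper + 1)
    )
    return favourable / total


# Hypothetical numbers: a background of 6000 proteins, 120 of which contain
# the motif, and a set of 40 proteins of interest with 8 motif carriers.
p_value = hypergeom_tail(8, 6000, 120, 40)
print(p_value)
```

    The exact integer arithmetic keeps the sketch free of dependencies; for large-scale enumeration of motifs a log-space or library implementation would probably be preferable.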

  6. Automated protein motif generation in the structure-based protein function prediction tool ProMOL.

    PubMed

    Osipovitch, Mikhail; Lambrecht, Mitchell; Baker, Cameron; Madha, Shariq; Mills, Jeffrey L; Craig, Paul A; Bernstein, Herbert J

    2015-12-01

    ProMOL, a plugin for the PyMOL molecular graphics system, is a structure-based protein function prediction tool. ProMOL includes a set of routines for building motif templates that are used for screening query structures for enzyme active sites. Previously, each motif template was generated manually and required supervision in the optimization of parameters for sensitivity and selectivity. We developed an algorithm and workflow for the automation of motif building and testing routines in ProMOL. The algorithm uses a set of empirically derived parameters for optimization and requires little user intervention. The automated motif generation algorithm was first tested in a performance comparison with a set of manually generated motifs based on identical active sites from the same 112 PDB entries. The two sets of motifs were equally effective in identifying alignments with homologs and in rejecting alignments with unrelated structures. A second set of 296 active site motifs were generated automatically, based on Catalytic Site Atlas entries with literature citations, as an expansion of the library of existing manually generated motif templates. The new motif templates exhibited comparable performance to the existing ones in terms of hit rates against native structures, homologs with the same EC and Pfam designations, and randomly selected unrelated structures with a different EC designation at the first EC digit, as well as in terms of RMSD values obtained from local structural alignments of motifs and query structures. This research is supported by NIH grant GM078077.

  7. HIGEDA: a hierarchical gene-set genetics based algorithm for finding subtle motifs in biological sequences.

    PubMed

    Le, Thanh; Altman, Tom; Gardiner, Katheleen

    2010-02-01

    Identification of motifs in biological sequences is a challenging problem because such motifs are often short, degenerate, and may contain gaps. Most algorithms that have been developed for motif-finding use the expectation-maximization (EM) algorithm iteratively. Although EM algorithms can converge quickly, they depend strongly on initialization parameters and can converge to local sub-optimal solutions. In addition, they cannot generate gapped motifs. The effectiveness of EM algorithms in motif finding can be improved by incorporating methods that choose different sets of initial parameters to enable escape from local optima, and that allow gapped alignments within motif models. We have developed HIGEDA, an algorithm that uses the hierarchical gene-set genetic algorithm (HGA) with EM to initiate and search for the best parameters for the motif model. In addition, HIGEDA can identify gapped motifs using a position weight matrix and dynamic programming to generate an optimal gapped alignment of the motif model with sequences from the dataset. We show that HIGEDA outperforms MEME and other motif-finding algorithms on both DNA and protein sequences. Source code and test datasets are available for download at http://ouray.cudenver.edu/~tnle/, implemented in C++ and supported on Linux and MS Windows.

  8. SLiMDisc: short, linear motif discovery, correcting for common evolutionary descent

    PubMed Central

    Davey, Norman E.; Shields, Denis C.; Edwards, Richard J.

    2006-01-01

    Many important interactions of proteins are facilitated by short, linear motifs (SLiMs) within a protein's primary sequence. Our aim was to establish robust methods for discovering putative functional motifs. The strongest evidence for such motifs is obtained when the same motifs occur in unrelated proteins, evolving by convergence. In practice, searches for such motifs are often swamped by motifs shared in related proteins that are identical by descent. Predictions of motifs among sets of biologically related proteins, including those both with and without detectable similarity, were made using the TEIRESIAS algorithm. The number of motif occurrences arising through common evolutionary descent was normalized based on treatment of BLAST local alignments. Motifs were ranked according to a score derived from the product of the normalized number of occurrences and the information content. The method was shown to significantly outperform methods that do not discount evolutionary relatedness, when applied to known SLiMs from a subset of the eukaryotic linear motif (ELM) database. An implementation of Multiple Spanning Tree weighting outperformed two other weighting schemes, in a variety of settings. PMID:16855291

  9. Systematic discovery and characterization of regulatory motifs in ENCODE TF binding experiments

    PubMed Central

    Kheradpour, Pouya; Kellis, Manolis

    2014-01-01

    Recent advances in technology have led to a dramatic increase in the number of available transcription factor ChIP-seq and ChIP-chip data sets. Understanding the motif content of these data sets is an important step in understanding the underlying mechanisms of regulation. Here we provide a systematic motif analysis for 427 human ChIP-seq data sets using motifs curated from the literature and also discovered de novo using five established motif discovery tools. We use a systematic pipeline for calculating motif enrichment in each data set, providing a principled way for choosing between motif variants found in the literature and for flagging potentially problematic data sets. Our analysis confirms the known specificity of 41 of the 56 analyzed factor groups and reveals motifs of potential cofactors. We also use cell type-specific binding to find factors active in specific conditions. The resource we provide is accessible both for browsing a small number of factors and for performing large-scale systematic analyses. We provide motif matrices, instances and enrichments in each of the ENCODE data sets. The motifs discovered here have been used in parallel studies to validate the specificity of antibodies, understand cooperativity between data sets and measure the variation of motif binding across individuals and species. PMID:24335146

  10. Small yet effective: the ethylene responsive element binding factor-associated amphiphilic repression (EAR) motif.

    PubMed

    Kagale, Sateesh; Rozwadowski, Kevin

    2010-06-01

    The Ethylene-responsive element binding factor-associated Amphiphilic Repression (EAR) motif is a small yet distinct regulatory motif that is conserved in many plant transcriptional regulator (TR) proteins associated with diverse biological functions. We have previously established a list of high-confidence Arabidopsis EAR repressors, the EAR repressome, comprising 219 TRs belonging to 21 different TR families. This class of proteins and the sequence context of the EAR motif exhibited a high degree of conservation across evolutionarily diverse plant species. Our comprehensive genome-wide analysis enabled refining EAR motifs as comprising either LxLxL or DLNxxP. Comparing the representation of these sequence signatures in TRs to that of other repressor motifs we show that the EAR motif is the one most frequently represented, detected in 10 to 25% of the TRs from diverse plant species. The mechanisms involved in regulation of EAR motif function and the cellular fates of EAR repressors are currently not well understood. Our earlier analysis had implicated amino acid residues flanking the EAR motifs in regulation of their functionality. Here, we present additional evidence supporting possible regulation of EAR motif function by phosphorylation of integral or adjacent Ser and/or Thr residues. Additionally, we discuss potential novel roles of EAR motifs in plant-pathogen interaction and processes other than transcriptional repression.
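
    The two sequence signatures named in this abstract, LxLxL and DLNxxP, can be located in a protein sequence with a simple pattern scan. The sketch below assumes that x stands for any single residue; the function name and the example sequence are hypothetical. Lookahead groups are used so that overlapping LxLxL occurrences are all reported.

```python
import re

# Lookahead with a capture group reports overlapping matches too.
EAR_PATTERNS = {
    "LxLxL": re.compile(r"(?=(L.L.L))"),
    "DLNxxP": re.compile(r"(?=(DLN..P))"),
}


def find_ear_motifs(sequence):
    """Return (signature, 0-based start, matched residues), sorted by start."""
    hits = []
    for name, regex in EAR_PATTERNS.items():
        for match in regex.finditer(sequence):
            hits.append((name, match.start(), match.group(1)))
    return sorted(hits, key=lambda hit: hit[1])


# Hypothetical sequence fragment, for illustration only.
for hit in find_ear_motifs("MALDLNLELAAADLNGSPQ"):
    print(hit)
# ('LxLxL', 2, 'LDLNL')
# ('LxLxL', 4, 'LNLEL')
# ('DLNxxP', 12, 'DLNGSP')
```

    A match found this way is only a candidate; the abstract notes that flanking residues and phosphorylation may affect whether an EAR motif is functional.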

  11. Recursive Alterations of the Relationship between Simple Membrane Geometry and Insertion of Amphiphilic Motifs

    PubMed Central

    Madsen, Kenneth Lindegaard; Herlo, Rasmus

    2017-01-01

    The shape and composition of a membrane directly regulate the localization, activity, and signaling properties of membrane associated proteins. Proteins that both sense and generate membrane curvature, e.g., through amphiphilic insertion motifs, potentially engage in recursive binding dynamics, where the recruitment of the protein itself changes the properties of the membrane substrate. Simple geometric models of membrane curvature interactions already provide prediction tools for experimental observations; however, these models treat curvature sensing and generation as separate phenomena. Here, we outline a model that applies both geometric and basic thermodynamic considerations. This model allows us to predict the consequences of recursive properties in such interaction schemes and thereby integrate the membrane as a dynamic substrate. We use this combined model to hypothesize the origin and properties of tubular carrier systems observed in cells. Furthermore, we pinpoint the coupling to a membrane reservoir as a factor that influences the membrane curvature sensing and generation properties of local curvatures in the cell in line with classic determinants such as lipid composition and membrane geometry. PMID:28208740

  12. Computational definition of sequence motifs governing constitutive exon splicing.

    PubMed

    Zhang, Xiang H-F; Chasin, Lawrence A

    2004-06-01

    We have searched for sequence motifs that contribute to the recognition of human pre-mRNA splice sites by comparing the frequency of 8-mers in internal noncoding exons versus unspliced pseudo exons and 5' untranslated regions (5' UTRs) of transcripts of intronless genes. This type of comparison avoids the isolation of sequences that are distinguished by their protein-coding information. We classified sequence families comprising 2069 putative exonic enhancers and 974 putative exonic silencers. Representatives of each class functioned as enhancers or silencers when inserted into a test exon and assayed in transfected mammalian cells. As a class, the enhancer sequences were more prevalent and the silencer elements less prevalent in all exons compared with introns. A survey of 58 reported exonic splicing mutations showed good agreement between the splicing phenotype and the effect of the mutation on the motifs defined here. The large number of effective sequences implied by these results suggests that sequences that influence splicing may be very abundant in pre-mRNA.
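
    The comparison of 8-mer frequencies between two sequence sets described in this abstract can be illustrated with a small k-mer counting sketch. The log2 ratio with a pseudocount used here is an assumed scoring choice made for illustration; the abstract does not state which statistic the authors applied. The function names and toy sequences are hypothetical.

```python
from collections import Counter
from math import log2


def kmer_counts(sequences, k):
    """Count every overlapping k-mer across a collection of sequences."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def log2_enrichment(foreground, background, k=8, pseudocount=1.0):
    """log2 ratio of k-mer frequencies, foreground over background.

    The pseudocount keeps the ratio finite for k-mers seen in one set only.
    """
    fg = kmer_counts(foreground, k)
    bg = kmer_counts(background, k)
    kmers = set(fg) | set(bg)
    fg_total = sum(fg.values()) + pseudocount * len(kmers)
    bg_total = sum(bg.values()) + pseudocount * len(kmers)
    return {
        kmer: log2(((fg[kmer] + pseudocount) / fg_total)
                   / ((bg[kmer] + pseudocount) / bg_total))
        for kmer in kmers
    }


# Toy sequences with k=2 so the result is easy to check by hand.
scores = log2_enrichment(["AAAA"], ["AAAT"], k=2)
for kmer in sorted(scores):
    print(kmer, round(scores[kmer], 3))
```

    Positive scores mark k-mers that are more frequent in the first set (candidate enhancers in the setting of the abstract) and negative scores mark those more frequent in the second set (candidate silencers).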

  13. Synchronization patterns: from network motifs to hierarchical networks

    NASA Astrophysics Data System (ADS)

    Krishnagopal, Sanjukta; Lehnert, Judith; Poel, Winnie; Zakharova, Anna; Schöll, Eckehard

    2017-03-01

    We investigate complex synchronization patterns such as cluster synchronization and partial amplitude death in networks of coupled Stuart-Landau oscillators with fractal connectivities. The study of fractal or self-similar topology is motivated by the network of neurons in the brain. This fractal property is well represented in hierarchical networks, for which we present three different models. In addition, we introduce an analytical eigensolution method and provide a comprehensive picture of the interplay of network topology and the corresponding network dynamics, thus allowing us to predict the dynamics of arbitrarily large hierarchical networks simply by analysing small network motifs. We also show that oscillation death can be induced in these networks, even if the coupling is symmetric, contrary to previous understanding of oscillation death. Our results show that there is a direct correlation between topology and dynamics: hierarchical networks exhibit the corresponding hierarchical dynamics. This helps bridge the gap between mesoscale motifs and macroscopic networks. This article is part of the themed issue 'Horizons of cybernetical physics'.

  14. The helix bundle: A reversible lipid binding motif

    PubMed Central

    Narayanaswami, Vasanthy; Kiss, Robert S.; Weers, Paul M.M.

    2009-01-01

    Apolipoproteins are the protein components of lipoproteins that have the innate ability to interconvert between a lipid-free and a lipid-bound form in a facile manner, a remarkable property conferred by the helix bundle motif. Composed of a series of four or five amphipathic α-helices that fold to form a helix bundle, this motif allows the en face orientation of the hydrophobic faces of the α-helices in the protein interior in the lipid-free state. A conformational switch then permits helix-helix interactions to be substituted by helix-lipid interactions upon lipid binding. This review compares the apolipoprotein high resolution structures and the factors that trigger this switch in insect apolipophorin III and the mammalian apolipoproteins, apolipoprotein E and apolipoprotein A-I, pointing out the commonalities and key differences in the mode of lipid interaction. Further insights into the lipid bound conformation of apolipoproteins are required to fully understand their functional role under physiological conditions. PMID:19770066

  15. Automatic classification of ceramic sherds with relief motifs

    NASA Astrophysics Data System (ADS)

    Debroutelle, Teddy; Treuillet, Sylvie; Chetouani, Aladine; Exbrayat, Matthieu; Martin, Lionel; Jesset, Sebastien

    2017-03-01

    A large corpus of ceramic sherds dating from the High Middle Ages has been extracted in Saran (France). The sherds have an engraved frieze made by the potter with a carved wooden wheel. These relief patterns can be used to date the sherds in order to study the diffusion of ceramic production. The aim of the ARCADIA project was to develop an automatic classification of this archaeological heritage. The sherds were scanned using a three-dimensional (3-D) laser scanner. After projecting the 3-D point cloud onto a depth map, the local variance highlighted the shallow relief patterns. The saliency region focused on the motif was extracted by a density-based spatial clustering of FAST points. An adaptive thresholding was then applied to the depth to obtain a binary pattern close to manual sampling. The five most representative types of motif were classified by training an SVM model with a pyramid histogram of visual words descriptor. Compared with other state-of-the-art methods, the proposed approach succeeded in classifying up to 84% of the binary patterns on a dataset of 377 scanned sherds. The automatic method is extremely time-saving compared to manual stamping.

  16. Identifying DNA Binding Motifs by Combining Data from Different Sources

    SciTech Connect

    Mao, Linyong; Resat, Haluk; Nagib Callaos; Katsuhisa Horimoto; Jake Chen; Amy Sze Chan

    2004-07-19

    A transcription factor regulates the expression of its target genes by binding to their operator regions. It functions by affecting the interactions between RNA polymerases and the gene's promoter. Many transcription factors bind to their targets by recognizing a specific DNA sequence pattern, which is referred to as a consensus sequence or a motif. Because it can remove possible biases, combining biological data from different sources can be expected to improve the quality of the information extracted from the biological data. We analyzed the microarray gene expression data and the organism's genome sequence jointly to determine the transcription factor recognition sequences with more accuracy. Utilizing such a data integration approach, we have investigated the regulation of the photosynthesis genes of the purple non-sulphur photosynthetic bacterium Rhodobacter sphaeroides. The photosynthesis genes in this organism are tightly regulated as a function of environmental growth conditions by three major regulatory systems, PrrB/PrrA, AppA/PpsR and FnrL. In this study, we have detected a previously undefined PrrA consensus sequence, improved the previously known DNA-binding motif of PpsR, and confirmed the consensus sequence of the global regulator FnrL.

  17. Applying Side-chain Flexibility in Motifs for Protein Docking

    PubMed Central

    Liu, Hui; Lin, Feng; Yang, Jian-Li; Wang, Hong-Rui; Liu, Xiu-Ling

    2015-01-01

    Conventional rigid docking algorithms have been unsatisfactory in their computational results, largely due to the fact that protein structures are flexible in live environments. In response, we propose to introduce side-chain flexibility in protein motifs into docking. First, the Morse theory is applied to curvature labeling and surface region growing, for segmentation of the protein surface into smaller patches. Then, the protein is described by an ensemble of conformations that incorporate the flexibility of interface side chains and are sampled using rotamers. Next, a 3D rotation invariant shape descriptor is proposed to deal with the flexible motifs and surface patches; thus, pairwise complementarity matching is needed only between the convex patches of ligand and the concave patches of receptor. The iterative closest point (ICP) algorithm is implemented for geometric alignment of the two 3D protein surface patches. Compared with the fast Fourier transform-based global geometric matching algorithm and other methods, our FlexDock system generates far fewer false-positive docking results, which benefits identification of the complementary candidates. Our computational experiments show the advantages of the proposed flexible docking algorithm over its counterparts. PMID:26508871

  18. Prevalent RNA recognition motif duplication in the human genome.

    PubMed

    Tsai, Yihsuan S; Gomez, Shawn M; Wang, Zefeng

    2014-05-01

    The sequence-specific recognition of RNA by proteins is mediated through various RNA binding domains, with the RNA recognition motif (RRM) being the most frequent and present in >50% of RNA-binding proteins (RBPs). Many RBPs contain multiple RRMs, and it is unclear how each RRM contributes to the binding specificity of the entire protein. We found that RRMs within the same RBP (i.e., sibling RRMs) tend to have significantly higher similarity than expected by chance. Sibling RRM pairs from RBPs shared by multiple species tend to have lower similarity than those found only in a single species, suggesting that multiple RRMs within the same protein might arise from domain duplication followed by divergence through random mutations. This finding is exemplified by a recent RRM domain duplication in DAZ proteins and an ancient duplication in PABP proteins. Additionally, we found that different similarities between sibling RRMs are associated with distinct functions of an RBP and that the RBPs tend to contain repetitive sequences with low complexity. Taken together, this study suggests that the number of RBPs with multiple RRMs has expanded in mammals and that the multiple sibling RRMs may recognize similar target motifs in a cooperative manner.

  19. Phosphotyrosine Substrate Sequence Motifs for Dual Specificity Phosphatases

    PubMed Central

    Zhao, Bryan M.; Keasey, Sarah L.; Tropea, Joseph E.; Lountos, George T.; Dyas, Beverly K.; Cherry, Scott; Raran-Kurussi, Sreejith; Waugh, David S.; Ulrich, Robert G.

    2015-01-01

    Protein tyrosine phosphatases dephosphorylate tyrosine residues of proteins, whereas dual specificity phosphatases (DUSPs) are a subgroup of protein tyrosine phosphatases that dephosphorylate not only Tyr(P) residues but also the Ser(P) and Thr(P) residues of proteins. The DUSPs are linked to the regulation of many cellular functions and signaling pathways. Though many cellular targets of DUSPs are known, the relationship between catalytic activity and substrate specificity is poorly defined. We investigated the interactions of peptide substrates with select DUSPs of four types: MAP kinases (DUSP1 and DUSP7), atypical (DUSP3, DUSP14, DUSP22 and DUSP27), viral (variola VH1), and Cdc25 (A-C). Phosphatase recognition sites were experimentally determined by measuring dephosphorylation of 6,218 microarrayed Tyr(P) peptides representing confirmed and theoretical phosphorylation motifs from the cellular proteome. A broad continuum of dephosphorylation was observed across the microarrayed peptide substrates for all phosphatases, suggesting a complex relationship between substrate sequence recognition and optimal activity. Further analysis of peptide dephosphorylation by hierarchical clustering indicated that DUSPs could be organized by substrate sequence motifs, and peptide-specificities by phylogenetic relationships among the catalytic domains. The most highly dephosphorylated peptides represented proteins from 29 cell-signaling pathways, greatly expanding the list of potential targets of DUSPs. These newly identified DUSP substrates will be important for examining structure-activity relationships with physiologically relevant targets. PMID:26302245

  20. Discovering interacting domains and motifs in protein-protein interactions.

    PubMed

    Hugo, Willy; Sung, Wing-Kin; Ng, See-Kiong

    2013-01-01

    Many important biological processes, such as the signaling pathways, require protein-protein interactions (PPIs) that are designed for fast response to stimuli. These interactions are usually transient, easily formed, and disrupted, yet specific. Many of these transient interactions involve the binding of a protein domain to a short stretch (3-10) of amino acid residues, which can be characterized by a sequence pattern, i.e., a short linear motif (SLiM). We call these interacting domains and motifs domain-SLiM interactions. Existing methods have focused on discovering SLiMs in the interacting proteins' sequence data. With the recent increase in protein structures, we have a new opportunity to detect SLiMs directly from the proteins' 3D structures instead of their linear sequences. In this chapter, we describe a computational method called SLiMDIet to directly detect SLiMs on domain interfaces extracted from 3D structures of PPIs. SLiMDIet comprises two steps: (1) interaction interfaces belonging to the same domain are extracted and grouped together using structural clustering and (2) the extracted interaction interfaces in each cluster are structurally aligned to extract the corresponding SLiM. Using SLiMDIet, de novo SLiMs interacting with protein domains can be computationally detected from structurally clustered domain-SLiM interactions for PFAM domains which have available 3D structures in the PDB database.

  1. A simple motif for protein recognition in DNA secondary structures.

    PubMed

    Landt, Stephen G; Ramirez, Alejandro; Daugherty, Matthew D; Frankel, Alan D

    2005-09-02

    DNA in a single-stranded form (ssDNA) exists transiently within the cell and comprises the telomeres of linear chromosomes and the genomes of some DNA viruses. As with RNA, in the single-stranded state, some DNA sequences are able to fold into complex secondary and tertiary structures that may be recognized by proteins and participate in gene regulation. To better understand how such DNA elements might fold and interact with proteins, and to compare recognition features to those of a structured RNA, we used in vitro selection to identify ssDNAs that bind an RNA-binding peptide from the HIV Rev protein with high affinity and specificity. The large majority of selected binders contain a non-Watson-Crick G.T base-pair and an adjacent C:G base-pair and both are essential for binding. This GT motif can be presented in different DNA contexts, including a nearly perfect duplex and a branched three-helix structure, and appears to be recognized in large part by arginine residues separated by one turn of an alpha-helix. Interestingly, a very similar GT motif is necessary also for protein binding and function of a well-characterized model ssDNA regulatory element from the proenkephalin promoter.

  2. Functional implications of local DNA structures in regulatory motifs.

    PubMed

    Xiang, Qian

    2013-01-01

    The three-dimensional structure of DNA has been proposed to be a major determinant of functional interactions between transcription factors (TFs) and DNA. Here, we use the hydroxyl radical cleavage pattern as a measure of local DNA structure. We compared the conservation between DNA sequence and structure in terms of information content and attempted to assess the functional implications of DNA structures in regulatory motifs. We used statistical methods to evaluate the structural divergence of substituting a single position within a binding site and applied them to a collection of putative regulatory motifs. The following are our major observations: (i) we observed more information in the structural alignment than in the corresponding sequence alignment for most of the transcription factors; (ii) for each TF, the majority of positions have more information in the structural alignment than in the sequence alignment; (iii) we further defined a DNA structural divergence score (SD score) for each wild-type and mutant pair that is distinguished by a single-base mutation. The SD score for benign mutations is significantly lower than that of switch mutations. This indicates that structural conservation is also important for a TF binding site (TFBS) to be functional and that DNA structure provides previously unappreciated information that TFs use to achieve binding specificity.
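
    The per-position information content mentioned in this abstract can be illustrated with a minimal sketch for a sequence alignment of binding sites. The function names, the four-letter alphabet, and the absence of any small-sample correction are assumptions made for illustration; the abstract does not say how information was computed for the structural (hydroxyl radical cleavage) alignment, so that part is not shown.

```python
import math
from collections import Counter

def column_information(column, alphabet_size=4):
    """Information content (bits) of one alignment column: log2(alphabet) - entropy."""
    counts = Counter(column)
    total = len(column)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return math.log2(alphabet_size) - entropy

def alignment_information(sites):
    """Per-position information content for equal-length aligned sites."""
    return [column_information(col) for col in zip(*sites)]

# Hypothetical aligned binding sites, not taken from the study.
print(alignment_information(["ACGT", "ACGA", "TCGT", "ACGT"]))
```

    A fully conserved column scores 2 bits and a uniformly mixed column scores 0 bits, so positions can be ranked by conservation.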

  3. Ultrasensitive response motifs: basic amplifiers in molecular signalling networks

    PubMed Central

    Zhang, Qiang; Bhattacharya, Sudin; Andersen, Melvin E.

    2013-01-01

    Multi-component signal transduction pathways and gene regulatory circuits underpin integrated cellular responses to perturbations. A recurring set of network motifs serve as the basic building blocks of these molecular signalling networks. This review focuses on ultrasensitive response motifs (URMs) that amplify small percentage changes in the input signal into larger percentage changes in the output response. URMs generally possess a sigmoid input–output relationship that is steeper than the Michaelis–Menten type of response and is often approximated by the Hill function. Six types of URMs can be commonly found in intracellular molecular networks and each has a distinct kinetic mechanism for signal amplification. These URMs are: (i) positive cooperative binding, (ii) homo-multimerization, (iii) multistep signalling, (iv) molecular titration, (v) zero-order covalent modification cycle and (vi) positive feedback. Multiple URMs can be combined to generate highly switch-like responses. Serving as basic signal amplifiers, these URMs are essential for molecular circuits to produce complex nonlinear dynamics, including multistability, robust adaptation and oscillation. These dynamic properties are in turn responsible for higher-level cellular behaviours, such as cell fate determination, homeostasis and biological rhythm. PMID:23615029
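
    The contrast between a Michaelis-Menten response and a steeper Hill-type response can be made concrete with a short sketch. The function names and the parameter values (k = 1, n = 4) are illustrative assumptions, not values from the review.

```python
import math

def michaelis_menten(x, k=1.0):
    """Hyperbolic response: fraction of maximal output at input x."""
    return x / (k + x)

def hill(x, k=1.0, n=4.0):
    """Hill function; n > 1 gives a sigmoid, ultrasensitive response."""
    return x**n / (k**n + x**n)

def fold_change_10_to_90(n):
    """Fold change in input needed to move a Hill response from 10% to 90%."""
    return 81.0 ** (1.0 / n)

def local_gain(f, x, eps=1e-6):
    """Percentage change in output per percentage change in input (d ln f / d ln x)."""
    return (math.log(f(x * (1 + eps))) - math.log(f(x))) / math.log(1 + eps)

print(fold_change_10_to_90(1), fold_change_10_to_90(4))
print(local_gain(michaelis_menten, 1.0), local_gain(hill, 1.0))
```

    With n = 1 an 81-fold change in input is needed to go from 10% to 90% of the maximal response, whereas n = 4 needs only a 3-fold change. At x = k the local gain is 0.5 for the Michaelis-Menten curve and n/2 = 2 for the Hill curve, so only the latter amplifies percentage changes.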

  4. Network motifs come in sets: Correlations in the randomization process

    NASA Astrophysics Data System (ADS)

    Ginoza, Reid; Mugler, Andrew

    2010-07-01

    The identification of motifs—subgraphs that appear significantly more often in a particular network than in an ensemble of randomized networks—has become a ubiquitous method for uncovering potentially important subunits within networks drawn from a wide variety of fields. We find that the most common algorithms used to generate the ensemble from the real network change subgraph counts in a highly correlated manner, such that one subgraph’s status as a motif may not be independent from the statuses of the other subgraphs. We demonstrate this effect for the problem of three- and four-node motif identification in the transcriptional regulatory networks of E. coli and S. cerevisiae in which randomized networks are generated via an edge-swapping algorithm. We find strong correlations among subgraph counts; for three-node subgraphs these correlations are easily interpreted, and we present an information-theoretic tool that may be used to identify correlations among subgraphs of any size. Our results suggest that single-feature statistics such as Z scores that implicitly assume independence among subgraph counts constitute an insufficient summary of the network.
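
    The edge-swapping randomization and the Z score discussed in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: the network is a set of directed edges without self-loops, only the feed-forward loop is counted, and all function names and parameters are hypothetical rather than taken from the paper.

```python
import random
import statistics

def count_ffl(edges):
    """Count feed-forward loops a->b, b->c, a->c in a directed edge set."""
    succ = {}
    for a, b in edges:
        succ.setdefault(a, set()).add(b)
    return sum(1 for a, b in edges
               for c in succ.get(b, ())
               if c != a and c in succ.get(a, ()))

def edge_swap(edges, n_swaps, rng):
    """Degree-preserving randomization: swap the targets of random edge pairs."""
    edge_list = list(edges)
    edge_set = set(edge_list)
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edge_list)), 2)
        (a, b), (c, d) = edge_list[i], edge_list[j]
        if a == d or c == b or (a, d) in edge_set or (c, b) in edge_set:
            continue  # reject swaps that would create self-loops or duplicate edges
        edge_set -= {(a, b), (c, d)}
        edge_set |= {(a, d), (c, b)}
        edge_list[i], edge_list[j] = (a, d), (c, b)
    return edge_set

def motif_z_score(edges, n_random=200, swaps_per_edge=10, seed=0):
    """Z score of the real FFL count against an edge-swapped ensemble."""
    rng = random.Random(seed)
    real = count_ffl(edges)
    counts = [count_ffl(edge_swap(edges, swaps_per_edge * len(edges), rng))
              for _ in range(n_random)]
    mean, sd = statistics.mean(counts), statistics.pstdev(counts)
    return (real - mean) / sd if sd > 0 else float("nan")
```

    Because every swap preserves each node's in- and out-degree, the counts of different subgraphs in the ensemble move together, which is the correlation the abstract warns a single Z score ignores.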

  5. Synchronization patterns: from network motifs to hierarchical networks.

    PubMed

    Krishnagopal, Sanjukta; Lehnert, Judith; Poel, Winnie; Zakharova, Anna; Schöll, Eckehard

    2017-03-06

    We investigate complex synchronization patterns such as cluster synchronization and partial amplitude death in networks of coupled Stuart-Landau oscillators with fractal connectivities. The study of fractal or self-similar topology is motivated by the network of neurons in the brain. This fractal property is well represented in hierarchical networks, for which we present three different models. In addition, we introduce an analytical eigensolution method and provide a comprehensive picture of the interplay of network topology and the corresponding network dynamics, thus allowing us to predict the dynamics of arbitrarily large hierarchical networks simply by analysing small network motifs. We also show that oscillation death can be induced in these networks, even if the coupling is symmetric, contrary to previous understanding of oscillation death. Our results show that there is a direct correlation between topology and dynamics: hierarchical networks exhibit the corresponding hierarchical dynamics. This helps bridge the gap between mesoscale motifs and macroscopic networks. This article is part of the themed issue 'Horizons of cybernetical physics'.

  6. Universal structure motifs in biominerals: a lesson from nature for the efficient design of bioinspired functional materials.

    PubMed

    Harris, Joe; Böhm, Corinna F; Wolf, Stephan E

    2017-08-06

    Biominerals are typically indispensable structures for their host organism in which they serve varying functions, such as mechanical support and protection, mineral storage, detoxification site, or as a sensor or optical guide. In this perspective article, we highlight the occurrence of both structural diversity and uniformity within these biogenic ceramics. For the first time, we demonstrate that the universality-diversity paradigm, which was initially introduced for proteins by Buehler et al. (Cranford & Buehler 2012 Biomateriomics; Cranford et al. 2013 Adv. Mater. 25, 802-824 (doi:10.1002/adma.201202553); Ackbarow & Buehler 2008 J. Comput. Theor. Nanosci. 5, 1193-1204 (doi:10.1166/jctn.2008.001); Buehler & Yung 2009 Nat. Mater. 8, 175-188 (doi:10.1038/nmat2387)), is also valid in the realm of biomineralization. A nanogranular composite structure is shared by most biominerals which rests on a common, non-classical crystal growth mechanism. The nanogranular composite structure affects various properties of the macroscale biogenic ceramic, a phenomenon we attribute to emergence. Emergence, in turn, is typical for hierarchically organized materials. This is a clear call to renew comparative studies of even distantly related biomineralizing organisms to identify further universal design motifs and their associated emergent properties. Such universal motifs with emergent macro-scale properties may represent an unparalleled toolbox for the efficient design of bioinspired functional materials.

  7. Structural complexity of Dengue virus untranslated regions: cis-acting RNA motifs and pseudoknot interactions modulating functionality of the viral genome

    PubMed Central

    Sztuba-Solinska, Joanna; Teramoto, Tadahisa; Rausch, Jason W.; Shapiro, Bruce A.; Padmanabhan, Radhakrishnan; Le Grice, Stuart F. J.

    2013-01-01

    The Dengue virus (DENV) genome contains multiple cis-acting elements required for translation and replication. Previous studies indicated that a 719-nt subgenomic minigenome (DENV-MINI) is an efficient template for translation and (−) strand RNA synthesis in vitro. We performed a detailed structural analysis of DENV-MINI RNA, combining chemical acylation techniques, Pb2+ ion-induced hydrolysis and site-directed mutagenesis. Our results highlight protein-independent 5′–3′ terminal interactions involving hybridization between recognized cis-acting motifs. Probing analyses identified tandem dumbbell structures (DBs) within the 3′ terminus spaced by single-stranded regions, internal loops and hairpins with embedded GNRA-like motifs. Analysis of conserved motifs and top loops (TLs) of these dumbbells, and their proposed interactions with downstream pseudoknot (PK) regions, predicted an H-type pseudoknot involving TL1 of the 5′ DB and the complementary region, PK2. As disrupting the TL1/PK2 interaction, via ‘flipping’ mutations of PK2, previously attenuated DENV replication, this pseudoknot may participate in regulation of RNA synthesis. Computer modeling implied that this motif might function as an autonomous structural/regulatory element. In addition, our studies targeting elements of the 3′ DB and its complementary region PK1 indicated that communication between 5′–3′ terminal regions strongly depends on the structure and sequence composition of the 5′ cyclization region. PMID:23531545

  8. Methods and compositions for targeting macromolecules into the nucleus

    DOEpatents

    Chook, Yuh Min

    2013-06-25

    The present invention includes compositions, methods and kits for directing an agent across the nuclear membrane of a cell. The present invention includes a Karyopherin beta2 translocation motif in a polypeptide having a slightly positively charged region or a slightly hydrophobic region and one or more R/K/H-X(2-5)-P-Y motifs. The polypeptide targets the agent into the cell nucleus.
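
    The R/K/H, 2 to 5 variable residues, P, Y pattern named in this abstract can be searched for with a regular expression. This is a hedged sketch: it reads X as any residue (an assumption), covers only the sequence pattern and not the charged or hydrophobic region the abstract also requires, and the names PY_MOTIF and find_py_motifs are hypothetical.

```python
import re

# One of R, K or H, then 2 to 5 residues of any kind, then P and Y.
# The lookahead allows motifs that start at overlapping positions to be reported.
PY_MOTIF = re.compile(r"(?=([RKH].{2,5}PY))")

def find_py_motifs(seq):
    """Return (start index, matched text) for each motif start in a protein sequence."""
    return [(m.start(), m.group(1)) for m in PY_MOTIF.finditer(seq.upper())]

# Hypothetical sequence used only to exercise the pattern.
print(find_py_motifs("AARGGSPYAA"))
```

    A hit from this pattern alone is only a candidate; the surrounding sequence context described in the abstract would still have to be checked.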

  9. Variable motif utilization in homeotic selector (Hox)-cofactor complex formation controls specificity.

    PubMed

    Lelli, Katherine M; Noro, Barbara; Mann, Richard S

    2011-12-27

    Homeotic selector (Hox) proteins often bind DNA cooperatively with cofactors such as Extradenticle (Exd) and Homothorax (Hth) to achieve functional specificity in vivo. Previous studies identified the Hox YPWM motif as an important Exd interaction motif. Using a comparative approach, we characterize the contribution of this and additional conserved sequence motifs to the regulation of specific target genes for three Drosophila Hox proteins. We find that Sex combs reduced (Scr) uses a simple interaction mechanism, where a single tryptophan-containing motif is necessary for Exd-dependent DNA-binding and in vivo functions. Abdominal-A (AbdA) is more complex, using multiple conserved motifs in a context-dependent manner. Lastly, Ultrabithorax (Ubx) is the most flexible, in that it uses multiple conserved motifs that function in parallel to regulate target genes in vivo. We propose that using different binding mechanisms with the same cofactor may be one strategy to achieve functional specificity in vivo.

  10. Exhaustive Search for Over-represented DNA Sequence Motifs with CisFinder

    PubMed Central

    Sharov, Alexei A.; Ko, Minoru S.H.

    2009-01-01

    We present CisFinder software, which generates a comprehensive list of motifs enriched in a set of DNA sequences and describes them with position frequency matrices (PFMs). A new algorithm was designed to estimate PFMs directly from counts of n-mer words with and without gaps; then PFMs are extended over gaps and flanking regions and clustered to generate non-redundant sets of motifs. The algorithm successfully identified binding motifs for 12 transcription factors (TFs) in embryonic stem cells based on published chromatin immunoprecipitation sequencing data. Furthermore, CisFinder successfully identified alternative binding motifs of TFs (e.g. POU5F1, ESRRB, and CTCF) and motifs for known and unknown co-factors of genes associated with the pluripotent state of ES cells. CisFinder also showed robust performance in the identification of motifs that were only slightly enriched in a set of DNA sequences. PMID:19740934
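
    A position frequency matrix (PFM) of the kind CisFinder reports can be illustrated with a minimal sketch that counts bases per position in a set of aligned sites. This is not CisFinder's algorithm, which estimates PFMs from n-mer word counts; the function names and the example sites are assumptions for illustration.

```python
def position_frequency_matrix(sites, alphabet="ACGT"):
    """Count each base at each position of equal-length aligned sites."""
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    pfm = {base: [0] * length for base in alphabet}
    for site in sites:
        for i, base in enumerate(site.upper()):
            pfm[base][i] += 1
    return pfm

def consensus(pfm):
    """Most frequent base at each position (ties go to the first base in the alphabet)."""
    length = len(next(iter(pfm.values())))
    return "".join(max(pfm, key=lambda b: pfm[b][i]) for i in range(length))

# Hypothetical aligned sites, not data from the paper.
pfm = position_frequency_matrix(["ACGT", "ACGA", "TCGT"])
print(pfm, consensus(pfm))
```

    Dividing each column by the number of sites would turn the counts into per-position frequencies.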

  11. A type of nucleotide motif that distinguishes tobamovirus species more efficiently than nucleotide signatures.

    PubMed

    Gibbs, A J; Armstrong, J S; Gibbs, M J

    2004-10-01

    The complete genomic sequences of forty-eight tobamoviruses were classified and found to form at least twelve species clusters. Individual species were not conveniently defined by 'nucleotide signatures' (i.e. strings of one or more nucleotides unique to a taxon) as these were scattered sparsely throughout the genomes and were mostly single nucleotides. By contrast all the species were concisely and uniquely distinguished by short nucleotide motifs consisting of conserved genus-specific sites intercalated with variable sites that provided species-specific combinations of nucleotides (nucleotide combination motifs; NC-motifs). We describe the procedure for finding NC-motifs in a convenient and phylogenetically conserved region of the tobamovirus RNA polymerase gene, the '4404-50 motif'. NC-motifs have been found in other sets of homologous sequences, and are convenient for use in published taxonomic descriptions.

  12. Combinatorial motif analysis of regulatory gene expression in Mafb deficient macrophages

    PubMed Central

    2011-01-01

    Background: Deficiency of the transcription factor MafB, which is normally expressed in macrophages, can underlie cellular dysfunction associated with a range of autoimmune diseases and arteriosclerosis. MafB has important roles in cell differentiation and regulation of target gene expression; however, the mechanisms of this regulation and the identities of other transcription factors with which MafB interacts remain uncertain. Bioinformatics methods provide a valuable approach for elucidating the nature of these interactions with transcriptional regulatory elements from a large number of DNA sequences. In particular, identification of patterns of co-occurrence of regulatory cis-elements (motifs) offers a robust approach. Results: Here, the directional relationships among several functional motifs were evaluated using the Log-linear Graphical Model (LGM) after extraction and search for evolutionarily conserved motifs. This analysis highlighted GATA-1 motifs and 5’AT-rich half Maf recognition elements (MAREs) in promoter regions of 18 genes that were down-regulated in Mafb deficient macrophages. GATA-1 motifs and MafB motifs could regulate expression of these genes in a negative and a positive manner, respectively. The validity of this conclusion was tested with data from a luciferase assay that used a C1qa promoter construct carrying both the GATA-1 motifs and MAREs. GATA-1 was found to inhibit the activity of the C1qa promoter with the GATA-1 motifs and MafB motifs. Conclusions: These observations suggest that both the GATA-1 motifs and MafB motifs are important for lineage specific expression of C1qa. In addition, these findings show that analysis of combinations of evolutionarily conserved motifs can be successfully used to identify patterns of gene regulation. PMID:22784578

  13. An integrative and applicable phylogenetic footprinting framework for cis-regulatory motifs identification in prokaryotic genomes.

    PubMed

    Liu, Bingqiang; Zhang, Hanyuan; Zhou, Chuan; Li, Guojun; Fennell, Anne; Wang, Guanghui; Kang, Yu; Liu, Qi; Ma, Qin

    2016-08-09

    Phylogenetic footprinting is an important computational technique for identifying cis-regulatory motifs in orthologous regulatory regions from multiple genomes, as motifs tend to evolve slower than their surrounding non-functional sequences. Its application, however, has several difficulties for optimizing the selection of orthologous data and reducing the false positives in motif prediction. Here we present an integrative phylogenetic footprinting framework for accurate motif predictions in prokaryotic genomes (MP(3)). The framework includes a new orthologous data preparation procedure, an additional promoter scoring and pruning method and an integration of six existing motif finding algorithms as basic motif search engines. Specifically, we collected orthologous genes from available prokaryotic genomes and built the orthologous regulatory regions based on sequence similarity of promoter regions. This procedure made full use of the large-scale genomic data and taxonomy information and filtered out the promoters with limited contribution to produce a high quality orthologous promoter set. The promoter scoring and pruning is implemented through motif voting by a set of complementary predicting tools that mine as many motif candidates as possible and simultaneously eliminate the effect of random noise. We have applied the framework to the Escherichia coli K12 genome and evaluated the prediction performance through comparison with seven existing programs. This evaluation was systematically carried out at the nucleotide and binding site level, and the results showed that MP(3) consistently outperformed other popular motif finding tools. We have integrated MP(3) into our motif identification and analysis server DMINDA, allowing users to efficiently identify and analyze motifs in 2,072 completely sequenced prokaryotic genomes. The performance evaluation indicated that MP(3) is effective for predicting regulatory motifs in prokaryotic genomes. Its application may enhance

  14. Mining bridge and brick motifs from complex biological networks for functionally and statistically significant discovery.

    PubMed

    Cheng, Chia-Ying; Huang, Chung-Yuan; Sun, Chuen-Tsai

    2008-02-01

    A major task for postgenomic systems biology researchers is to systematically catalogue molecules and their interactions within living cells. Advancements in complex-network theory are being made toward uncovering organizing principles that govern cell formation and evolution, but we lack understanding of how molecules and their interactions determine how complex systems function. Molecular bridge motifs include isolated motifs that neither interact nor overlap with others, whereas brick motifs act as network foundations that play a central role in defining global topological organization. To emphasize their structural organizing and evolutionary characteristics, we define bridge motifs as consisting of weak links only and brick motifs as consisting of strong links only, then propose a method for performing two tasks simultaneously, which are as follows: 1) detecting global statistical features and local connection structures in biological networks and 2) locating functionally and statistically significant network motifs. To further understand the role of biological networks in system contexts, we examine functional and topological differences between bridge and brick motifs for predicting biological network behaviors and functions. After observing brick motif similarities between E. coli and S. cerevisiae, we note that bridge motifs differentiate C. elegans from Drosophila and sea urchin in three types of networks. Similarities (differences) in bridge and brick motifs imply similar (different) key circuit elements in the three organisms. We suggest that motif-content analyses can provide researchers with global and local data for real biological networks and assist in the search for either isolated or functionally and topologically overlapping motifs when investigating and comparing biological system functions and behaviors.

  15. Spontaneous Membrane Translocating Peptides: The Role of Leucine-Arginine Consensus Motifs.

    PubMed

    Fuselier, Taylor; Wimley, William C

    2017-08-22

    We previously used an orthogonal high-throughput screen to select peptides that spontaneously cross synthetic lipid bilayers without bilayer disruption. Many of the 12-residue spontaneous membrane translocating peptides (SMTPs) selected from the library contained a 5-residue consensus motif, LRLLR in positions 5-9. We hypothesized that the conserved motif could be a necessary and sufficient minimal motif for translocation. To test this and to explore the mechanism of spontaneous membrane translocation, we synthesized seven arginine placement variants of LRLLRWC and compared their membrane partitioning, translocation, and perturbation to one of the parent SMTPs, called "TP2". Several motif variant peptides translocate into synthetic vesicles with rates that are similar to TP2. However, the peptide containing the selected motif, LRLLRWC, was not the fastest; sequence context is also important for translocation efficiency. Although none of these peptides permeabilize bilayers, the motif peptides translocate faster at higher peptide to lipid ratios, suggesting that bilayer perturbation and/or cooperative interactions are important for their translocation. On the other hand, TP2 translocates slower as its concentration is increased, suggesting that TP2 translocates as a monomer and is inhibited by lateral interactions in the membrane. TP2 and the LRLLR motif peptide induce lipid translocation, suggesting that lipids chaperone them across the bilayer. The other motif peptides do not induce lipid flip-flop, suggesting an alternate mechanism. Concatenated motifs translocate slower than the motifs alone. Variants of TP2 with shorter and longer arginine side-chain analogs translocate slower than TP2. In summary, these results suggest that multiple patterns of leucine and arginine can support spontaneous membrane translocation, and that sequence context is important for the contribution of the motifs. Because motifs do not make simple, additive contributions to spontaneous

  16. The value of position-specific priors in motif discovery using MEME

    PubMed Central

    2010-01-01

    Background: Position-specific priors have been shown to be a flexible and elegant way to extend the power of Gibbs sampler-based motif discovery algorithms. Information of many types, including sequence conservation, nucleosome positioning, and negative examples, can be converted into a prior over the location of motif sites, which then guides the sequence motif discovery algorithm. This approach has been shown to confer many of the benefits of conservation-based and discriminative motif discovery approaches on Gibbs sampler-based motif discovery methods, but has not previously been studied with methods based on expectation maximization (EM). Results: We extend the popular EM-based MEME algorithm to utilize position-specific priors and demonstrate their effectiveness for discovering transcription factor (TF) motifs in yeast and mouse DNA sequences. Utilizing a discriminative, conservation-based prior dramatically improves MEME's ability to discover motifs in 156 yeast TF ChIP-chip datasets, more than doubling the number of datasets where it finds the correct motif. On these datasets, MEME using the prior has a higher success rate than eight other conservation-based motif discovery approaches. We also show that the same type of prior improves the accuracy of motifs discovered by MEME in mouse TF ChIP-seq data, and that the motifs tend to be of slightly higher quality than those found by a Gibbs sampling algorithm using the same prior. Conclusions: We conclude that using position-specific priors can substantially increase the power of EM-based motif discovery algorithms such as MEME. PMID:20380693

  17. The value of position-specific priors in motif discovery using MEME.

    PubMed

    Bailey, Timothy L; Bodén, Mikael; Whitington, Tom; Machanick, Philip

    2010-04-09

    Position-specific priors have been shown to be a flexible and elegant way to extend the power of Gibbs sampler-based motif discovery algorithms. Information of many types, including sequence conservation, nucleosome positioning, and negative examples, can be converted into a prior over the location of motif sites, which then guides the sequence motif discovery algorithm. This approach has been shown to confer many of the benefits of conservation-based and discriminative motif discovery approaches on Gibbs sampler-based motif discovery methods, but has not previously been studied with methods based on expectation maximization (EM). We extend the popular EM-based MEME algorithm to utilize position-specific priors and demonstrate their effectiveness for discovering transcription factor (TF) motifs in yeast and mouse DNA sequences. Utilizing a discriminative, conservation-based prior dramatically improves MEME's ability to discover motifs in 156 yeast TF ChIP-chip datasets, more than doubling the number of datasets where it finds the correct motif. On these datasets, MEME using the prior has a higher success rate than eight other conservation-based motif discovery approaches. We also show that the same type of prior improves the accuracy of motifs discovered by MEME in mouse TF ChIP-seq data, and that the motifs tend to be of slightly higher quality than those found by a Gibbs sampling algorithm using the same prior. We conclude that using position-specific priors can substantially increase the power of EM-based motif discovery algorithms such as MEME.
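
    How a prior over site locations can guide motif discovery may be sketched with a generic Bayes-rule step: the posterior probability that a site starts at each position is proportional to the prior at that position times how well the motif model fits there. This is an illustrative assumption about the general idea, not MEME's implementation; the function name and the numbers are hypothetical.

```python
def site_posterior(likelihood_ratios, prior):
    """Posterior over site start positions: prior times motif/background likelihood ratio, normalized."""
    if len(likelihood_ratios) != len(prior):
        raise ValueError("need one prior value per candidate position")
    weights = [lr * p for lr, p in zip(likelihood_ratios, prior)]
    total = sum(weights)
    if total <= 0:
        raise ValueError("no position has positive weight")
    return [w / total for w in weights]

# Hypothetical likelihood ratios for three candidate positions in one sequence.
ratios = [2.0, 1.0, 1.0]
print(site_posterior(ratios, [1 / 3, 1 / 3, 1 / 3]))  # uniform prior
print(site_posterior(ratios, [0.1, 0.8, 0.1]))        # prior favouring position 1
```

    With a uniform prior the best-scoring position wins, while an informative prior, for example one derived from conservation, can shift the posterior towards a position the sequence model alone would rank lower.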

  18. Transcription factor and microRNA-regulated network motifs for cancer and signal transduction networks.

    PubMed

    Hsieh, Wen-Tsong; Tzeng, Ke-Rung; Ciou, Jin-Shuei; Tsai, Jeffrey Jp; Kurubanjerdjit, Nilubon; Huang, Chien-Hung; Ng, Ka-Lok

    2015-01-01

    Molecular networks are the basis of biological processes. Such networks can be decomposed into smaller modules, also known as network motifs. These motifs show interesting dynamical behaviors, in which co-operativity effects between the motif components play a critical role in human diseases. We have developed a motif-searching algorithm, which is able to identify common motif types from the cancer networks and signal transduction networks (STNs). Some of the network motifs are interconnected and can be merged together to form more complex structures, the so-called coupled motif structures (CMS). These structures exhibit mixed dynamical behavior, which may lead biological organisms to perform specific functions. In this study, we integrate transcription factors (TFs), microRNAs (miRNAs), miRNA targets and network motifs information to build the cancer-related TF-miRNA-motif networks (TMMN). This allows us to examine the role of network motifs in cancer formation at different levels of regulation, i.e. transcription initiation (TF → miRNA), gene-gene interaction (CMS), and post-transcriptional regulation (miRNA → target genes). Among the cancer networks and STNs we considered, it is found that there is a substantial amount of crosstalk through motif interconnections, in particular, the crosstalk between prostate cancer network and PI3K-Akt STN. To validate the role of network motifs in cancer formation, several examples are presented which demonstrated the effectiveness of the present approach. A web-based platform has been set up which can be accessed at: http://ppi.bioinfo.asia.edu.tw/pathway/. It is very likely that our results can supply very specific CMS missing information for certain cancer types; it is an indispensable tool for cancer biology research.

  19. Transcription factor and microRNA-regulated network motifs for cancer and signal transduction networks

    PubMed Central

    2015-01-01

    Background: Molecular networks are the basis of biological processes. Such networks can be decomposed into smaller modules, also known as network motifs. These motifs show interesting dynamical behaviors, in which co-operativity effects between the motif components play a critical role in human diseases. We have developed a motif-searching algorithm, which is able to identify common motif types from the cancer networks and signal transduction networks (STNs). Some of the network motifs are interconnected and can be merged together to form more complex structures, the so-called coupled motif structures (CMS). These structures exhibit mixed dynamical behavior, which may lead biological organisms to perform specific functions. Results: In this study, we integrate transcription factors (TFs), microRNAs (miRNAs), miRNA targets and network motifs information to build the cancer-related TF-miRNA-motif networks (TMMN). This allows us to examine the role of network motifs in cancer formation at different levels of regulation, i.e. transcription initiation (TF → miRNA), gene-gene interaction (CMS), and post-transcriptional regulation (miRNA → target genes). Among the cancer networks and STNs we considered, it is found that there is a substantial amount of crosstalk through motif interconnections, in particular, the crosstalk between prostate cancer network and PI3K-Akt STN. Conclusions: To validate the role of network motifs in cancer formation, several examples are presented which demonstrated the effectiveness of the present approach. A web-based platform has been set up which can be accessed at: http://ppi.bioinfo.asia.edu.tw/pathway/. It is very likely that our results can supply very specific CMS missing information for certain cancer types; it is an indispensable tool for cancer biology research. PMID:25707690

  20. Superconducting composite materials (Materiaux composites supraconducteurs)

    NASA Astrophysics Data System (ADS)

    Kerjouan, Philippe; Boterel, Florence; Lostec, Jean; Bertot, Jean-Paul; Haussonne, Jean-Marie

    1991-11-01

    The new superconducting materials with a high critical current are of great importance in both electronic components and electrotechnical devices. The deposition of such materials by thick-film technology is expected to develop further in the years to come. We therefore set out to produce such thick films, screen-printed on alumina and composed mainly of YBa2Cu3O{7-δ}. We first produced a glass/YBa2Cu3O{7-δ} composite material, by analogy with classical screen-printed inks, in which the glass ensures bonding with the substrate. We prepared different materials using different classes of glass. These materials showed a superconducting transition close to that of pure YBa2Cu3O{7-δ}. We made a slurry from the most significant composite materials and binders, and screen-printed it on an alumina substrate, with or without a preliminary diffusion barrier coating. After firing, we studied the adhesion of the thick films, the alumina/glass/composite material interfaces, and their superconducting properties. The new high-critical-temperature superconducting materials potentially have an important role to play in electronics and electrical engineering. In particular, the deposition of superconducting oxides on various types of substrates is a technology that is set to develop. We therefore undertook a study aimed at producing screen-printed conductors on alumina composed essentially of YBa2Cu3O{7-δ}. We first sought to produce a glass/YBa2Cu3O{7-δ} composite, by analogy with the principle used for screen-printed conductive layers, in which the glass provides a physico-chemical bond with the substrate. A preliminary study produced various bulk composite materials using different families of glasses. These bulk materials, in the form of bars of

  1. D-MATRIX: a web tool for constructing weight matrix of conserved DNA motifs.

    PubMed

    Sen, Naresh; Mishra, Manoj; Khan, Feroz; Meena, Abha; Sharma, Ashok

    2009-07-27

    Despite considerable efforts to date, DNA motif prediction in whole genomes remains a challenge for researchers. Current genome-wide motif prediction tools require either a direct pattern sequence (for a single motif) or a weight matrix (for multiple motifs). Although motif pattern databases and tools for genome-level prediction exist, there is no tool for weight matrix construction. Considering this, we developed the D-MATRIX tool, which predicts different types of weight matrix based on a user-defined set of aligned motif sequences and a motif width. To retrieve known motif sequences, users can access commonly used databases such as TFD, RegulonDB, DBTBS and Transfac. The D-MATRIX program uses a simple statistical approach for weight matrix construction, and the matrix can be converted into different file formats according to user requirements. It makes it possible to identify conserved motifs in co-regulated genes or in a whole genome. As an example, we successfully constructed the weight matrix of the LexA transcription factor binding site with the help of known sos-box cis-regulatory elements in the Deinococcus radiodurans genome. The algorithm is implemented in C-Sharp and wrapped in ASP.Net to maintain a user-friendly web interface. The D-MATRIX tool is accessible through the CIMAP domain network. http://203.190.147.116/dmatrix/

  2. D-MATRIX: A web tool for constructing weight matrix of conserved DNA motifs

    PubMed Central

    Sen, Naresh; Mishra, Manoj; Khan, Feroz; Meena, Abha; Sharma, Ashok

    2009-01-01

    Despite considerable efforts to date, DNA motif prediction in whole genomes remains a challenge for researchers. Current genome-wide motif prediction tools require either a direct pattern sequence (for a single motif) or a weight matrix (for multiple motifs). Although motif pattern databases and tools for genome-level prediction exist, there is no tool for weight matrix construction. Considering this, we developed the D-MATRIX tool, which predicts different types of weight matrix based on a user-defined set of aligned motif sequences and a motif width. To retrieve known motif sequences, users can access commonly used databases such as TFD, RegulonDB, DBTBS and Transfac. The D-MATRIX program uses a simple statistical approach for weight matrix construction, and the matrix can be converted into different file formats according to user requirements. It makes it possible to identify conserved motifs in co-regulated genes or in a whole genome. As an example, we successfully constructed the weight matrix of the LexA transcription factor binding site with the help of known sos-box cis-regulatory elements in the Deinococcus radiodurans genome. The algorithm is implemented in C-Sharp and wrapped in ASP.Net to maintain a user-friendly web interface. The D-MATRIX tool is accessible through the CIMAP domain network. Availability: http://203.190.147.116/dmatrix/ PMID:19759861
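
    The D-MATRIX abstracts describe weight matrix construction from a set of aligned motif sequences only as "a simple statistical approach". As an illustration of what such a construction can look like, the sketch below builds a log-odds position weight matrix with a pseudocount and a uniform background. The formula, the parameter names and the example sites are assumptions for illustration; they are not taken from the D-MATRIX tool.

```python
import math


def position_weight_matrix(aligned_sites, pseudocount=1.0, background=0.25):
    """Build a log2-odds weight matrix from equal-width aligned DNA sites."""
    alphabet = "ACGT"
    width = len(aligned_sites[0])
    if any(len(site) != width for site in aligned_sites):
        raise ValueError("all aligned sites must have the same width")
    pwm = []
    for pos in range(width):
        column = [site[pos].upper() for site in aligned_sites]
        total = len(column) + pseudocount * len(alphabet)
        # Smoothed frequency of each base, divided by its background frequency.
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in alphabet
        })
    return pwm


def score(pwm, site):
    """Sum the per-position weights of a candidate site."""
    return sum(column[base] for column, base in zip(pwm, site.upper()))


# Hypothetical aligned binding sites of width 4.
sites = ["ACGT", "ACGA", "ACGT", "TCGT"]
pwm = position_weight_matrix(sites)
print(round(score(pwm, "ACGT"), 3), round(score(pwm, "TTTT"), 3))
```

    A site that resembles the alignment scores higher than one that does not, which is the property a genome-wide scan relies on.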

  3. WordSpy: identifying transcription factor binding motifs by building a dictionary and learning a grammar.

    PubMed

    Wang, Guandong; Yu, Taotao; Zhang, Weixiong

    2005-07-01

    Transcription factor (TF) binding sites or motifs (TFBMs) are functional cis-regulatory DNA sequences that play an essential role in gene transcriptional regulation. Although many experimental and computational methods have been developed, finding TFBMs remains a challenging problem. We propose and develop a novel dictionary based motif finding algorithm, which we call WordSpy. One significant feature of WordSpy is the combination of a word counting method and a statistical model which consists of a dictionary of motifs and a grammar specifying their usage. The algorithm is suitable for genome-wide motif finding; it is capable of discovering hundreds of motifs from a large set of promoters in a single run. We further enhance WordSpy by applying gene expression information to separate true TFBMs from spurious ones, and by incorporating negative sequences to identify discriminative motifs. In addition, we also use randomly selected promoters from the genome to evaluate the significance of the discovered motifs. The output from WordSpy consists of an ordered list of putative motifs and a set of regulatory sequences with motif binding sites highlighted. The web server of WordSpy is available at http://cic.cs.wustl.edu/wordspy.

  4. WordSpy: identifying transcription factor binding motifs by building a dictionary and learning a grammar

    PubMed Central

    Wang, Guandong; Yu, Taotao; Zhang, Weixiong

    2005-01-01

    Transcription factor (TF) binding sites or motifs (TFBMs) are functional cis-regulatory DNA sequences that play an essential role in gene transcriptional regulation. Although many experimental and computational methods have been developed, finding TFBMs remains a challenging problem. We propose and develop a novel dictionary based motif finding algorithm, which we call WordSpy. One significant feature of WordSpy is the combination of a word counting method and a statistical model which consists of a dictionary of motifs and a grammar specifying their usage. The algorithm is suitable for genome-wide motif finding; it is capable of discovering hundreds of motifs from a large set of promoters in a single run. We further enhance WordSpy by applying gene expression information to separate true TFBMs from spurious ones, and by incorporating negative sequences to identify discriminative motifs. In addition, we also use randomly selected promoters from the genome to evaluate the significance of the discovered motifs. The output from WordSpy consists of an ordered list of putative motifs and a set of regulatory sequences with motif binding sites highlighted. The web server of WordSpy is available at http://cic.cs.wustl.edu/wordspy. PMID:15980501

  5. RNAMotifScanX: a graph alignment approach for RNA structural motif identification.

    PubMed

    Zhong, Cuncong; Zhang, Shaojie

    2015-03-01

    RNA structural motifs are recurrent three-dimensional (3D) components found in the RNA architecture. These RNA structural motifs play important structural or functional roles and usually exhibit highly conserved 3D geometries and base-interaction patterns. Analysis of the RNA 3D structures and elucidation of their molecular functions heavily rely on efficient and accurate identification of these motifs. However, efficient RNA structural motif search tools are lacking due to the high complexity of these motifs. In this work, we present RNAMotifScanX, a motif search tool based on a base-interaction graph alignment algorithm. This novel algorithm enables automatic identification of both partially and fully matched motif instances. RNAMotifScanX considers noncanonical base-pairing interactions, base-stacking interactions, and sequence conservation of the motifs, which leads to significantly improved sensitivity and specificity as compared with other state-of-the-art search tools. RNAMotifScanX also adopts a carefully designed branch-and-bound technique, which enables ultra-fast search of large kink-turn motifs against a 23S rRNA. The software package RNAMotifScanX is implemented using GNU C++, and is freely available from http://genome.ucf.edu/RNAMotifScanX. © 2015 Zhong and Zhang; Published by Cold Spring Harbor Laboratory Press for the RNA Society.

  6. Analysis of Genomic Sequence Motifs for Deciphering Transcription Factor Binding and Transcriptional Regulation in Eukaryotic Cells

    PubMed Central

    Boeva, Valentina

    2016-01-01

    Eukaryotic genomes contain a variety of structured patterns: repetitive elements, binding sites of DNA and RNA associated proteins, splice sites, and so on. Often, these structured patterns can be formalized as motifs and described using a proper mathematical model such as position weight matrix and IUPAC consensus. Two key tasks are typically carried out for motifs in the context of the analysis of genomic sequences. These are: identification in a set of DNA regions of over-represented motifs from a particular motif database, and de novo discovery of over-represented motifs. Here we describe existing methodology to perform these two tasks for motifs characterizing transcription factor binding. When applied to the output of ChIP-seq and ChIP-exo experiments, or to promoter regions of co-modulated genes, motif analysis techniques allow for the prediction of transcription factor binding events and enable identification of transcriptional regulators and co-regulators. The usefulness of motif analysis is further exemplified in this review by how motif discovery improves peak calling in ChIP-seq and ChIP-exo experiments and, when coupled with information on gene expression, allows insights into physical mechanisms of transcriptional modulation. PMID:26941778
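
    The abstract above names the IUPAC consensus as one way to formalize a motif. As a minimal sketch of how such a consensus can be scanned against a sequence, the code below expands the standard IUPAC nucleotide codes into a regular expression and reports overlapping matches on the forward strand only. The function names, the consensus string and the test sequence are illustrative assumptions and do not come from the review.

```python
import re

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def iupac_to_regex(consensus):
    """Translate an IUPAC consensus string into a regular expression."""
    return "".join(IUPAC[ch] for ch in consensus.upper())


def find_sites(consensus, sequence):
    """Return (start, matched_text) for every forward-strand match."""
    # The lookahead lets overlapping occurrences be reported as well.
    pattern = re.compile("(?=(" + iupac_to_regex(consensus) + "))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


# Illustrative consensus and sequence (hypothetical input).
hits = find_sites("TGASTCA", "ccTGACTCAggTGAGTCAtt")
print(hits)
```

    A fuller scan would also search the reverse complement; that step is omitted here to keep the sketch short.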

  7. Combining intrinsic disorder prediction and augmented training of hidden Markov models improves discriminative motif discovery

    NASA Astrophysics Data System (ADS)

    Song, Tao; Bu, Xiaoting; Gu, Hong

    2015-08-01

    Identifying short linear motifs (SLiMs) usually suffers from lack of sufficient sequences. SLiMs with the same functional site class are typically characterized by similar motif patterns, which makes them hard to distinguish by generative motif discovery methods. A discriminative method based on maximal mutual information estimation (MMIE) of hidden Markov models (HMMs) is proposed. It masks ordered regions to improve signal to noise ratio and augments the training set to diminish the impact of the lack of sequences. Experimental results on a dataset selected from the Eukaryotic Linear Motif (ELM) resource show that the proposed method is effective and practical.

  8. Study on online community user motif using web usage mining

    NASA Astrophysics Data System (ADS)

    Alphy, Meera; Sharma, Ajay

    2016-04-01

    Web usage mining is an application of data mining that is used to extract useful information from the online community. On Thursday, 6 August 2015, the World Wide Web contained at least 4.73 billion pages according to the Indexed Web and at least 228.52 million pages according to the Dutch Indexed Web. It is difficult to retrieve the data one needs from these billions of web pages, which is why web usage mining is important. Personalizing the search engine helps web users identify the most used data easily. It reduces time consumption and enables automatic site search and automatic restoration of useful sites. This study reviews the techniques, from the oldest to the latest, used in pattern discovery and analysis in web usage mining from 1996 to 2015. Analyzing user motifs helps in the improvement of business, e-commerce, personalisation and websites.

  9. Sequential dynamics in the motif of excitatory coupled elements

    NASA Astrophysics Data System (ADS)

    Korotkov, Alexander G.; Kazakov, Alexey O.; Osipov, Grigory V.

    2015-11-01

    In this article a new model of motif (small ensemble) of neuron-like elements is proposed. It is built with the use of the generalized Lotka-Volterra model with excitatory couplings. The main motivation for this work comes from the problems of neuroscience where excitatory couplings are proved to be the predominant type of interaction between neurons of the brain. In this paper it is shown that there are two modes depending on the type of coupling between the elements: the mode with a stable heteroclinic cycle and the mode with a stable limit cycle. Our second goal is to examine the chaotic dynamics of the generalized three-dimensional Lotka-Volterra model.

  10. Evaluation of the pharmacophoric motif of the caged Garcinia xanthones

    PubMed Central

    Chantarasriwong, Oraphin; Cho, Woo Cheal; Batova, Ayse; Chavasiri, Warinthorn; Moore, Curtis; Rheingold, Arnold L.; Theodorakis, Emmanuel A.

    2010-01-01

    The combination of unique structure and potent bioactivity exhibited by several family members of the caged Garcinia xanthones, led us to evaluate their pharmacophore. We have developed a Pd(0)-catalyzed method for the reverse prenylation of catechols that, together with a Claisen/Diels–Alder reaction cascade, provides rapid and efficient access to various caged analogues. Evaluation of the growth inhibitory activity of these compounds leads to the conclusion that the intact ABC ring system containing the C-ring caged structure is essential to the bioactivity. Studies with cluvenone (7) also showed that these compounds induce apoptosis and exhibit significant cytotoxicity in multidrug-resistant leukemia cells. As such, the caged Garcinia xanthone motif represents a new and potent pharmacophore. PMID:19907779

  11. cisExpress: motif detection in DNA sequences.

    PubMed

    Triska, Martin; Grocutt, David; Southern, James; Murphy, Denis J; Tatarinova, Tatiana

    2013-09-01

    One of the major challenges for contemporary bioinformatics is the analysis and accurate annotation of genomic datasets to enable extraction of useful information about the functional role of DNA sequences. This article describes a novel genome-wide statistical approach to the detection of specific DNA sequence motifs based on similarities between the promoters of similarly expressed genes. This new tool, cisExpress, is especially designed for use with large datasets, such as those generated by publicly accessible whole genome and transcriptome projects. cisExpress uses a task farming algorithm to exploit all available computational cores within a shared memory node. We demonstrate the robust nature and validity of the proposed method. It is applicable for use with a wide range of genomic databases for any species of interest. cisExpress is available at www.cisexpress.org.

  12. cisExpress: motif detection in DNA sequences

    PubMed Central

    Triska, Martin; Grocutt, David; Southern, James; Murphy, Denis J.; Tatarinova, Tatiana

    2013-01-01

    Motivation: One of the major challenges for contemporary bioinformatics is the analysis and accurate annotation of genomic datasets to enable extraction of useful information about the functional role of DNA sequences. This article describes a novel genome-wide statistical approach to the detection of specific DNA sequence motifs based on similarities between the promoters of similarly expressed genes. This new tool, cisExpress, is especially designed for use with large datasets, such as those generated by publicly accessible whole genome and transcriptome projects. cisExpress uses a task farming algorithm to exploit all available computational cores within a shared memory node. We demonstrate the robust nature and validity of the proposed method. It is applicable for use with a wide range of genomic databases for any species of interest. Availability: cisExpress is available at www.cisexpress.org. Contact: tatiana.tatarinova@usc.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:23793750

  13. Bacteria-mimicking nanoparticle surface functionalization with targeting motifs

    NASA Astrophysics Data System (ADS)

    Lai, Mei-Hsiu; Clay, Nicholas E.; Kim, Dong Hyun; Kong, Hyunjoon

    2015-04-01

    In recent years, surface modification of nanocarriers with targeting motifs has been explored to modulate delivery of various diagnostic, sensing and therapeutic molecular cargo to desired sites of interest in in vitro bioengineering platforms and in vivo pathologic tissue. However, most surface functionalization approaches are often plagued by complex chemical modifications and effortful purifications. To resolve such challenges, this study demonstrates a unique method to immobilize antibodies that can act as targeting motifs on the surfaces of nanocarriers, inspired by a process that bacteria use for immobilization of the host's antibodies. We hypothesized that alkylated Staphylococcus aureus protein A (SpA) would self-assemble with micelles and subsequently induce stable coupling of antibodies to the micelles. We examined this hypothesis by using poly(2-hydroxyethyl-co-octadecyl aspartamide) (PHEA-g-C18) as a model polymer to form micelles. The self-assembly between the micelles and alkylated SpA became more thermodynamically favorable by increasing the degree of substitution of octadecyl chains to PHEA-g-C18, due to a positive entropy change. Lastly, the mixing of SpA-PA-coupled micelles with antibodies resulted in the coating of micelles with antibodies, as confirmed with a fluorescence resonance energy transfer (FRET) assay. The micelles coated with antibodies to VCAM-1 or integrin αv displayed a higher binding affinity to substrates coated with VCAM-1 and integrin αvβ3, respectively, than other controls, as evaluated with surface plasmon resonance (SPR) spectroscopy and a circulation-simulating flow chamber. We envisage that this bacteria-inspired protein immobilization approach will be useful to improve the quality of targeted delivery of nanoparticles, and can be extended to modify the surface of a wide array of nanocarriers.

  14. Sulfur-induced structural motifs on copper and gold surfaces

    SciTech Connect

    Walen, Holly

    2016-01-01

    The interaction of sulfur with copper and gold surfaces plays a fundamental role in important phenomena that include coarsening of surface nanostructures, and self-assembly of alkanethiols. Here, we identify and analyze unique sulfur-induced structural motifs observed on the low-index surfaces of these two metals. We seek out these structures in an effort to better understand the fundamental interactions between these metals and sulfur that lends to the stability and favorability of metal-sulfur complexes vs. chemisorbed atomic sulfur. The experimental observations presented here—made under identical conditions—together with extensive DFT analyses, allow comparisons and insights into factors that favor the existence of metal-sulfur complexes, vs. chemisorbed atomic sulfur, on metal terraces. We believe this data will be instrumental in better understanding the complex phenomena occurring between the surfaces of coinage metals and sulfur.

  15. RNA Sociology: Group Behavioral Motifs of RNA Consortia

    PubMed Central

    Witzany, Guenther

    2014-01-01

    RNA sociology investigates the behavioral motifs of RNA consortia from the social science perspective. Besides the self-folding of RNAs into single stem loop structures, group building of such stem loops results in a variety of essential agents that are highly active in regulatory processes in cellular and non-cellular life. RNA stem loop self-folding and group building do not depend solely on sequence syntax; more important are their contextual (functional) needs. Also, evolutionary processes seem to occur through RNA stem loop consortia that may act as a complement. This means the whole entity functions only if all participating parts are coordinated, although the complementary building parts originally evolved for different functions. If complementary groups, such as rRNAs and tRNAs, are placed together in selective pressure contexts, new evolutionary features may emerge. Evolution initiated by competent agents in natural genome editing clearly contrasts with statistical error replication narratives. PMID:25426799

  16. Junctions between i-motif tetramers in supramolecular structures

    PubMed Central

    Guittet, Eric; Renciuk, Daniel; Leroy, Jean-Louis

    2012-01-01

    The symmetry of i-motif tetramers gives to cytidine-rich oligonucleotides the capacity to associate into supramolecular structures (sms). In order to determine how the tetramers are linked together in such structures, we have measured by gel filtration chromatography and NMR the formation and dissociation kinetics of sms built by oligonucleotides containing two short C stretches separated by a non-cytidine-base. We show that a stretch of only two cytidines either at the 3′- or 5′-end is long enough to link the tetramers into sms. The analysis of the properties of sms formed by oligonucleotides differing by the length of the oligo-C stretches, the sequence orientation and the nature of the non-C base provides a model of the junction connecting the tetramers in sms. PMID:22362739

  17. [Scanning electron microscopic study of so-called carvable composite filling materials after over one-year functional period].

    PubMed

    Triadan, H

    1979-03-01

    28 class-5 and 16 class-1 fillings were made from the composite material "Epoxydent" on a Macaca speciosa monkey and examined with the electron microscope after a 15-month functional period. The marginal spaces were found to be statistically significantly larger than in the comparable composites Adaptic, Concise, Compo-Cap and Cosmic. The spaces were frequently located not on the filling margin but inside, within the filling material. This is attributed to the "carving" technique during the gel phase of setting. The surface shows abrasions and porosities with loss of particles, sometimes fractures and discolored margins with secondary caries. It is not recommended to replace metal fillings with so-called carvable composites.

  18. Multivalent dendrimer vectors with DNA intercalation motifs for gene delivery.

    PubMed

    Wong, Pamela T; Tang, Kenny; Coulter, Alexa; Tang, Shengzhuang; Baker, James R; Choi, Seok Ki

    2014-11-10

    Poly(amido amine) (PAMAM) dendrimers constitute an important class of nonviral, cationic vectors in gene delivery. Here we report on a new concept for dendrimer vector design based on the incorporation of dual binding motifs: DNA intercalation, and receptor recognition for targeted delivery. We prepared a series of dendrimer conjugates derived from a fifth generation (G5) PAMAM dendrimer, each conjugated with multiple folate (FA) or riboflavin (RF) ligands for cell receptor targeting, and with 3,8-diamino-6-phenylphenanthridinium ("DAPP")-derived ligands for anchoring a DNA payload. Polyplexes of each dendrimer with calf thymus dsDNA were made and characterized by surface plasmon resonance (SPR) spectroscopy, dynamic light scattering (DLS) and zeta potential measurement. These studies provided evidence supporting polyplex formation based on the observation of tight DNA-dendrimer adhesion, and changes in particle size and surface charge upon coincubation. Further SPR studies to investigate the adhesion of the polyplex to a model surface immobilized with folate binding protein (FBP), demonstrated that the DNA payload has only a minimal effect on the receptor binding activity of the polyplex: KD = 0.22 nM for G5(FA)(DAPP) versus 0.98 nM for its polyplex. Finally, we performed in vitro transfection assays to determine the efficiency of conjugate mediated delivery of a luciferase-encoding plasmid into the KB cancer cell line and showed that RF-conjugated dendrimers were 1 to 2 orders of magnitude more effective in enhancing luciferase gene transfection than a plasmid only control. In summary, this study serves as a proof of concept for DNA-ligand intercalation as a motif in the design of multivalent dendrimer vectors for targeted gene delivery.

  19. Polyproline and triple helix motifs in host-pathogen recognition.

    PubMed

    Berisio, Rita; Vitagliano, Luigi

    2012-12-01

    Secondary structure elements often mediate protein-protein interactions. Despite their low abundance in folded proteins, polyproline II (PPII) and its variant, the triple helix, are frequently involved in protein-protein interactions, likely due to their peculiar propensity to be solvent-exposed. We here review the role of PPII and the triple helix in mediating host-pathogen interactions, with a particular emphasis on the structural aspects of these processes. After a brief description of the basic structural features of these elements, examples of host-pathogen interactions involving these motifs are illustrated. Literature data suggest that the role played by the PPII motif in these processes is twofold. Indeed, PPII regions may directly mediate interactions between proteins of the host and the pathogen. Alternatively, PPII may act as structural spacers needed for the correct positioning of the elements needed for adhesion and infectivity. Recent investigations have highlighted that the collagen triple helix is also a common target for bacterial adhesins. Although structural data on complexes between adhesins and collagen models are rather limited, experimental and theoretical studies have unveiled some interesting clues to the recognition process. Interestingly, very recent data show that not only is the triple helix used by pathogens as a target in the host-pathogen interaction but it may also act as a bait in these processes, since bacterial proteins containing triple helix regions have been shown to interact with host proteins. As both PPII and the triple helix expose several main chain non-satisfied hydrogen bond acceptors and donors, both elements are highly solvated. The preservation of the solvation state of both PPII and the triple helix upon protein-protein interaction is an emerging aspect that is thoroughly discussed here.

  20. Polyproline and Triple Helix Motifs in Host-Pathogen Recognition

    PubMed Central

    Berisio, Rita; Vitagliano, Luigi

    2012-01-01

    Secondary structure elements often mediate protein-protein interactions. Despite their low abundance in folded proteins, polyproline II (PPII) and its variant, the triple helix, are frequently involved in protein-protein interactions, likely due to their peculiar propensity to be solvent-exposed. We here review the role of PPII and the triple helix in mediating host-pathogen interactions, with a particular emphasis on the structural aspects of these processes. After a brief description of the basic structural features of these elements, examples of host-pathogen interactions involving these motifs are illustrated. Literature data suggest that the role played by the PPII motif in these processes is twofold. Indeed, PPII regions may directly mediate interactions between proteins of the host and the pathogen. Alternatively, PPII may act as structural spacers needed for the correct positioning of the elements needed for adhesion and infectivity. Recent investigations have highlighted that the collagen triple helix is also a common target for bacterial adhesins. Although structural data on complexes between adhesins and collagen models are rather limited, experimental and theoretical studies have unveiled some interesting clues to the recognition process. Interestingly, very recent data show that not only is the triple helix used by pathogens as a target in the host-pathogen interaction but it may also act as a bait in these processes, since bacterial proteins containing triple helix regions have been shown to interact with host proteins. As both PPII and the triple helix expose several main chain non-satisfied hydrogen bond acceptors and donors, both elements are highly solvated. The preservation of the solvation state of both PPII and the triple helix upon protein-protein interaction is an emerging aspect that is thoroughly discussed here. PMID:23305370

  1. Folding motifs induced and stabilized by distinct cystine frameworks.

    PubMed

    Tamaoki, H; Miura, R; Kusunoki, M; Kyogoku, Y; Kobayashi, Y; Moroder, L

    1998-08-01

    Bioactive peptides of different sources and biological functionalities, like endothelins, sarafotoxins, bee and scorpion venom toxins, contain a consensus cystine framework, Cys-(X)1-Cys/Cys-(X)3-Cys, which has been found to induce and stabilize a homologous folding motif named the cystine-stabilized alpha-helix (CSH). This is composed of an alpha-helical segment spanning the Cys-(X)3-Cys sequence portion that is crosslinked by two disulfide bridges to the sequence portion Cys-(X)1-Cys, itself folded in an extended beta-strand type structure. Search for sequence homologies of peptides and proteins in the SWISS-PROT and PDB data banks provided additional multiple examples of this type of cystine framework in serine proteinase inhibitors, in insect and plant defense proteins, as well as in members of the growth factor family with the cystine-knot. A comparative analysis of the known 3D-structures of these peptides and proteins confirmed that the presence of this peculiar cystine framework leads in all cases to a high degree of local structural homology that consists of the CSH motif, except for the cystine-knot, of the superfamily of the growth factors. In this case the cyclic structure formed by the parallel cysteine connectivities of Cys-(X)1-Cys/Cys-(X)3-Cys framework is penetrated by a third disulfide bond with formation of a concatenated knot, and the two disulfide-bridged peptide chains Cys-(X)1-Cys and Cys-(X)3-Cys are located in beta-strands. Conversely, peptides and proteins containing Cys-(X)m-Cys/Cys-(X)n-Cys cystine frameworks that differ from m/n = 1/3 were found to fold only sporadically into local alpha-helical structures.
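
    The consensus framework Cys-(X)1-Cys/Cys-(X)3-Cys described above is a spacing pattern, so a first-pass sequence screen can be written as a regular expression. The sketch below checks whether a peptide contains a C-X-C segment and a C-X-X-X-C segment in either order. It tests only cysteine spacing, not disulfide connectivity or fold, and the function name and toy sequences are assumptions for illustration.

```python
import re

# C-X-C followed later by C-X-X-X-C, and the reverse arrangement.
CXC_THEN_CX3C = re.compile(r"C.C.*?C...C")
CX3C_THEN_CXC = re.compile(r"C...C.*?C.C")


def has_cxc_cx3c_framework(sequence):
    """True if the sequence has non-overlapping C-X-C and C-X3-C segments."""
    seq = sequence.upper()
    return bool(CXC_THEN_CX3C.search(seq) or CX3C_THEN_CXC.search(seq))


# Hypothetical toy peptides, not real protein sequences.
print(has_cxc_cx3c_framework("ACGCAAAAACDEFCA"))  # has CGC and CDEFC
print(has_cxc_cx3c_framework("ACAACAAAACAACA"))   # cysteines wrongly spaced
```

    Hits from such a screen would still need structural confirmation, since the abstract reports that the cystine-knot growth factors share the framework but not the CSH fold.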

  2. Improved K-means clustering algorithm for exploring local protein sequence motifs representing common structural property.

    PubMed

    Zhong, Wei; Altun, Gulsah; Harrison, Robert; Tai, Phang C; Pan, Yi

    2005-09-01

    Information about local protein sequence motifs is very important to the analysis of biologically significant conserved regions of protein sequences. These conserved regions can potentially determine the diverse conformation and activities of proteins. In this work, recurring sequence motifs of proteins are explored with an improved K-means clustering algorithm on a new dataset. The structural similarity of these recurring sequence clusters to produce sequence motifs is studied in order to evaluate the relationship between sequence motifs and their structures. To the best of our knowledge, the dataset used by our research is the most updated dataset among similar studies for sequence motifs. A new greedy initialization method for the K-means algorithm is proposed to improve traditional K-means clustering techniques. The new initialization method tries to choose suitable initial points, which are well separated and have the potential to form high-quality clusters. Our experiments indicate that the improved K-means algorithm satisfactorily increases the percentage of sequence segments belonging to clusters with high structural similarity. Careful comparison of sequence motifs obtained by the improved and traditional algorithms also suggests that the improved K-means clustering algorithm may discover some relatively weak and subtle sequence motifs, which are undetectable by the traditional K-means algorithms. Many biochemical tests reported in the literature show that these sequence motifs are biologically meaningful. Experimental results also indicate that the improved K-means algorithm generates more detailed sequence motifs representing common structures than previous research. Furthermore, these motifs are universally conserved sequence patterns across protein families, overcoming some weak points of other popular sequence motifs. The satisfactory result of the experiment suggests that this new K-means algorithm may be applied to other areas of bioinformatics
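
    The abstract above says the new greedy initialization picks well-separated initial points but gives no further detail. As a hedged illustration of that idea, the sketch below seeds K-means with a farthest-first traversal, a different and simpler technique than whatever the authors implemented, and then runs standard Lloyd iterations. The function names and the 2-D toy points are assumptions; the paper clusters protein sequence segments, not points in the plane.

```python
import math


def farthest_first_init(points, k):
    """Greedily pick k centers, each as far as possible from those chosen."""
    centers = [points[0]]
    while len(centers) < k:
        nxt = max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        centers.append(nxt)
    return centers


def kmeans(points, k, iterations=20):
    """Lloyd's algorithm seeded with well-separated initial centers."""
    centers = farthest_first_init(points, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        # Recompute each center as the mean of its cluster (keep it if empty).
        centers = [
            tuple(sum(coord) / len(cl) for coord in zip(*cl)) if cl else centers[i]
            for i, cl in enumerate(clusters)
        ]
    return centers, clusters


# Toy data: three obvious groups of two points each.
points = [(0, 0), (0, 1), (10, 10), (10, 11), (20, 0), (21, 0)]
centers, clusters = kmeans(points, 3)
print(sorted(centers))
```

    Because the seeds start in different groups, the iterations converge to one center per group here; random seeding can instead place two seeds in the same group.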

  3. Prediction of virus-host protein-protein interactions mediated by short linear motifs.

    PubMed

    Becerra, Andrés; Bucheli, Victor A; Moreno, Pedro A

    2017-03-09

    Short linear motifs in host organism proteins can be mimicked by viruses to create protein-protein interactions that disable or control metabolic pathways. Given that viral linear motif instances of host motif regular expressions can be found by chance, it is necessary to develop methods for filtering functional linear motifs. We conduct a systematic comparison of linear motif filtering methods to develop a computational approach for predicting motif-mediated protein-protein interactions between human and the human immunodeficiency virus 1 (HIV-1). We implemented three filtering methods to obtain linear motif sets: 1) conserved in viral proteins (C), 2) located in disordered regions (D) and 3) rare or scarce in a set of randomized viral sequences (R). The sets C, D and R are combined by union and intersection. The resulting sets are compared by the number of protein-protein interactions correctly inferred with them, with experimental validation. The comparison is done with HIV-1 sequences and interactions from the National Institute of Allergy and Infectious Diseases (NIAID). The number of correctly inferred interactions allows the interactions to be ranked by the sets used to deduce them: D∪R and C. The ordering of the sets is descending on the probability of capturing functional interactions. With respect to HIV-1, the sets C∪R, D∪R, C∪D∪R infer all known interactions between HIV-1 and human proteins mediated by linear motifs. We found that the majority of conserved linear motifs in the virus are located in disordered regions. We have developed a method for predicting protein-protein interactions mediated by linear motifs between HIV-1 and human proteins. The method uses only protein sequences as inputs. We can extend the software developed to any other eukaryotic virus and host in order to find and rank candidate interactions. In future work we will use it to explore possible viral attack mechanisms based on linear motif mimicry.
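
    The filtering idea above (motif instances found by regular expression, then filter sets combined by union and intersection and compared against known interactions) can be illustrated roughly as follows. This is a hedged sketch, not the authors' software: `motif_hits`, `rank_combinations`, and the toy identifiers are all hypothetical.

```python
import re

def motif_hits(pattern, proteins):
    """Return {(protein_id, start)} for each non-overlapping regex match."""
    rx = re.compile(pattern)
    return {(pid, m.start())
            for pid, seq in proteins.items()
            for m in rx.finditer(seq)}

def rank_combinations(C, D, R, known):
    """Rank unions/intersections of the filter sets by how many
    known (validated) items each one recovers."""
    combos = {
        "C": C, "D": D, "R": R,
        "C∪R": C | R, "D∪R": D | R,
        "C∪D∪R": C | D | R, "C∩D∩R": C & D & R,
    }
    recovered = {name: len(s & known) for name, s in combos.items()}
    return sorted(recovered.items(), key=lambda kv: kv[1], reverse=True)
```

    For instance, `motif_hits("P..P", {"v1": "AAPLQPGG"})` would report a hit starting at index 2 of the toy sequence; the pattern here is purely illustrative.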

  4. Multiple Weak Linear Motifs Enhance Recruitment and Processivity in SPOP-Mediated Substrate Ubiquitination.

    PubMed

    Pierce, Wendy K; Grace, Christy R; Lee, Jihun; Nourse, Amanda; Marzahn, Melissa R; Watson, Edmond R; High, Anthony A; Peng, Junmin; Schulman, Brenda A; Mittag, Tanja

    2016-03-27

    Primary sequence motifs, with millimolar affinities for binding partners, are abundant in disordered protein regions. In multivalent interactions, such weak linear motifs can cooperate to recruit binding partners via avidity effects. If linear motifs recruit modifying enzymes, optimal placement of weak motifs may regulate access to modification sites. Weak motifs may thus exert physiological relevance stronger than that suggested by their affinities, but molecular mechanisms of their function are still poorly understood. Herein, we use the N-terminal disordered region of the Hedgehog transcriptional regulator Gli3 (Gli3(1-90)) to determine the role of weak motifs encoded in its primary sequence for the recruitment of its ubiquitin ligase CRL3(SPOP) and the subsequent effect on ubiquitination efficiency. The substrate adaptor SPOP binds linear motifs through its MATH (meprin and TRAF homology) domain and forms higher-order oligomers through its oligomerization domains, rendering SPOP multivalent for its substrates. Gli3 has multiple weak SPOP binding motifs. We map three such motifs in Gli3(1-90), the weakest of which has a millimolar dissociation constant. Multivalency of ligase and substrate for each other facilitates enhanced ligase recruitment and stimulates Gli3(1-90) ubiquitination in in vitro ubiquitination assays. We speculate that the weak motifs enable processivity through avidity effects and by providing steric access to lysine residues that are otherwise not prioritized for polyubiquitination. Weak motifs may generally be employed in multivalent systems to act as gatekeepers regulating post-translational modification. Copyright © 2015 The Authors. Published by Elsevier Ltd. All rights reserved.

  5. Assembly of supramolecular DNA complexes containing both G-quadruplexes and i-motifs by enhancing the G-repeat-bearing capacity of i-motifs

    PubMed Central

    Cao, Yanwei; Gao, Shang; Yan, Yuting; Bruist, Michael F.; Wang, Bing; Guo, Xinhua

    2017-01-01

    The single-step assembly of supramolecular complexes containing both i-motifs and G-quadruplexes (G4s) is demonstrated. This can be achieved because the formation of four-stranded i-motifs appears to be little affected by certain terminal residues: a five-cytosine tetrameric i-motif can bear ten-base flanking residues. However, things become complex when different lengths of guanine-repeats are added at the 3′ or 5′ ends of the cytosine-repeats. Here, a series of oligomers d(XGiXC5X) and d(XC5XGiX) (X = A, T or none; i < 5) are designed to study the impact of G-repeats on the formation of tetrameric i-motifs. Our data demonstrate that tetramolecular i-motif structure can tolerate specific flanking G-repeats. Assemblies of these oligonucleotides are polymorphic, but may be controlled by solution pH and counter ion species. Importantly, we find that the sequences d(TGiAC5) can form the tetrameric i-motif in large quantities. This leads to the design of two oligonucleotides d(TG4AC7) and d(TGBrGGBrGAC7) that self-assemble to form quadruplex supramolecules under certain conditions. d(TG4AC7) forms supramolecules under acidic conditions in the presence of K+ that are mainly V-shaped or ring-like containing parallel G4s and antiparallel i-motifs. d(TGBrGGBrGAC7) forms long linear quadruplex wires under acidic conditions in the presence of Na+ that consist of both antiparallel G4s and i-motifs. PMID:27899568

  6. Redefining Escherichia coli σ70 Promoter Elements: −15 Motif as a Complement of the −10 Motif

    PubMed Central

    Djordjevic, Marko

    2011-01-01

    Classical elements of σ70 bacterial promoters include the −35 element (−35TTGACA−30), the −10 element (−12TATAAT−7), and the extended −10 element (−15TG−14). Although the −35 element, the extended −10 element, and the upstream-most base in the −10 element (−12T) interact with σ70 in double-stranded DNA (dsDNA) form, the downstream bases in the −10 motif (−11ATAAT−7) are responsible for σ70-single-stranded DNA (ssDNA) interactions. In order to directly reflect this correspondence, an extension of the extended −10 element to a so-called −15 element (−15TGnT−12) has been recently proposed. I investigated here the sequence specificity of the proposed −15 element and its relationship to other promoter elements. I found a previously undetected significant conservation of −13G and a high degeneracy at −15T. I therefore defined the −15 element as a degenerate motif, which, together with the conserved stretch of sequence between −15 and −12, allows treating this element analogously to −35 and −10 elements. Furthermore, the strength of the −15 element inversely correlates with the strengths of the −35 element and −10 element, whereas no such complementation between other promoter elements was found. Despite the direct involvement of −15 element in σ70-dsDNA interactions, I found a significantly stronger tendency of this element to complement weak −10 elements that are involved in σ70-ssDNA interactions. This finding is in contrast to the established view, according to which the −15 element provides a sufficient number of σ70-dsDNA interactions, and suggests that the main parameter determining a functional promoter is the overall promoter strength. PMID:21908667
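
    To make the position conventions above concrete, the sketch below extracts the −35, −15 and −10 elements from a promoter string and counts agreement with each consensus. It assumes the index of the +1 transcription start site is known and that there is no position 0; the helper names `element` and `matches` are hypothetical, and counting matching positions is only a crude proxy for the element strengths analyzed in the paper.

```python
def element(seq, tss, start, end):
    """Slice positions start..end (negative, inclusive) relative to the
    +1 site at index `tss`; position -1 is index tss - 1."""
    return seq[tss + start: tss + end + 1]

def matches(consensus, site):
    """Count positions agreeing with the consensus; 'n' matches any base."""
    return sum(c == "n" or c == s for c, s in zip(consensus, site))

# Consensus elements and their coordinates as quoted in the abstract.
ELEMENTS = {
    "-35": ("TTGACA", -35, -30),
    "-15": ("TGnT", -15, -12),
    "-10": ("TATAAT", -12, -7),
}

def score_promoter(seq, tss):
    """Return {element name: number of consensus matches}."""
    return {name: matches(cons, element(seq, tss, a, b))
            for name, (cons, a, b) in ELEMENTS.items()}
```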

  7. An Examination of Four Key Motifs Found in High Fantasy for Children.

    ERIC Educational Resources Information Center

    Cohen, John Arthur

    The purpose of this study was to come to a greater understanding of contemporary high fantasy for children by analyzing in depth the nature and functions of four key motifs of this sub-genre of fantasy. These motifs are created worlds, time displacement, quest, and combat between good and evil. The 47 books chosen for analysis were recommended in…

  8. Mid deck in-orbit crew portrait taken in flag motif shirts.

    NASA Technical Reports Server (NTRS)

    1992-01-01

    Mid deck in-orbit portraits of the entire crew taken in flag motif shirts. Flag motif jersey portraits: front row, left to right, Mission Specialists Kathy Thornton and Rick Hieb; middle row, left to right, Mission Specialists Pierre Thuot and Tom Akers; back row, left to right, Mission Commander Dan Brandenstein, Mission Pilot Kevin Chilton and Mission Specialist Bruce Melnick.

  9. Bayesian multiple-instance motif discovery with BAMBI: inference of recombinase and transcription factor binding sites

    PubMed Central

    Jajamovich, Guido H.; Wang, Xiaodong; Arkin, Adam P.; Samoilov, Michael S.

    2011-01-01

    Finding conserved motifs in genomic sequences represents one of the essential bioinformatic problems. However, achieving high discovery performance without imposing substantial auxiliary constraints on possible motif features remains a key algorithmic challenge. This work describes BAMBI, a sequential Monte Carlo motif-identification algorithm based on a position weight matrix model that does not require additional constraints and is able to estimate such motif properties as length, logo, number of instances and their locations solely on the basis of primary nucleotide sequence data. Furthermore, should biologically meaningful information about motif attributes be available, BAMBI takes advantage of this knowledge to further refine the discovery results. In practical applications, we show that the proposed approach can be used to find sites of such diverse DNA-binding molecules as the cAMP receptor protein (CRP) and Din-family site-specific serine recombinases. Results obtained by BAMBI in these and other settings demonstrate better statistical performance than any of the four widely-used profile-based motif discovery methods: MEME, BioProspector with BioOptimizer, SeSiMCMC and Motif Sampler as measured by the nucleotide-level correlation coefficient. Additionally, in the case of Din-family recombinase target site discovery, the BAMBI-inferred motif is found to be the only one functionally accurate from the underlying biochemical mechanism standpoint. C++ and Matlab code is available at http://www.ee.columbia.edu/~guido/BAMBI or http://genomics.lbl.gov/BAMBI/. PMID:21948794
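
    For readers unfamiliar with the position weight matrix model mentioned above, here is a minimal sketch of building a log-odds matrix from aligned sites and scanning a sequence with it. It is not BAMBI and does not perform sequential Monte Carlo inference; the function names, the pseudocount, and the uniform background of 0.25 are illustrative assumptions.

```python
import math
from collections import Counter

def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Log-odds position weight matrix from equal-length aligned DNA sites."""
    total = len(sites) + 4 * pseudocount
    pwm = []
    for i in range(len(sites[0])):
        counts = Counter(site[i] for site in sites)
        pwm.append({b: math.log2((counts[b] + pseudocount) / total / background)
                    for b in "ACGT"})
    return pwm

def best_hit(pwm, seq):
    """Return (score, start) of the highest-scoring window in seq.
    Assumes seq contains only A, C, G and T."""
    w = len(pwm)
    return max((sum(col[b] for col, b in zip(pwm, seq[i:i + w])), i)
               for i in range(len(seq) - w + 1))
```

    With the toy sites `["TGTGA", "TGTGA", "TGTGA", "TTTGA"]`, scanning `"CCCCTGTGACCCC"` places the best window at index 4, where the sequence matches the most frequent base at every position.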

  10. Identification of protein motifs using conserved amino acid properties and partitioning techniques

    SciTech Connect

    Wu, T.D.; Brutlag, D.L.

    1995-12-31

    Analyzing a set of protein sequences involves a fundamental relationship between the coherency of the set and the specificity of the motif that describes it. Motifs may be obscured by training sets that contain incoherent sequences, in part due to protein subclasses, contamination, or errors. We develop an algorithm for motif identification that systematically explores possible patterns of coherency within a set of protein sequences. Our algorithm constructs alternative partitions of the training set data, where one subset of each partition is presumed to contain coherent data and is used for forming a motif. The motif is represented by multiple overlapping amino acid groups based on evolutionary, biochemical, or physical properties. We demonstrate our method on a training set of reverse transcriptases that contains subclasses, sequence errors, misalignments, and contaminating sequences. Despite these complications, our program identifies a novel motif for the subclass of retroviral and retrovirus-related reverse transcriptases. This motif has a much higher specificity than previously reported motifs and suggests the importance of conserved hydrophilic and hydrophobic residues in the structure of reverse transcriptases.

  11. Evolutionary dynamics of a conserved sequence motif in the ribosomal genes of the ciliate Paramecium.

    PubMed

    Catania, Francesco; Lynch, Michael

    2010-05-04

    In protozoa, the identification of preserved motifs by comparative genomics is often impeded by difficulties in generating reliable alignments for non-coding sequences. Moreover, the evolutionary dynamics of regulatory elements in 3' untranslated regions (both in protozoa and metazoa) remains a virtually unexplored issue. By screening Paramecium tetraurelia's 3' untranslated regions for 8-mers that were previously found to be preserved in mammalian 3' UTRs, we detect and characterize a motif that is distinctly conserved in the ribosomal genes of this ciliate. The motif appears to be conserved across Paramecium aurelia species but is absent from the ribosomal genes of four additional non-Paramecium species surveyed, including another ciliate, Tetrahymena thermophila. Motif-free ribosomal genes retain fewer paralogs in the genome and appear to be lost more rapidly relative to motif-containing genes. Features associated with the discovered preserved motif are consistent with this 8-mer playing a role in post-transcriptional regulation. Our observations 1) shed light on the evolution of a putative regulatory motif across large phylogenetic distances; 2) are expected to facilitate the understanding of the modulation of ribosomal gene expression in Paramecium; and 3) reveal a largely unexplored--and presumably not restricted to Paramecium--association between the presence/absence of a DNA motif and the evolutionary fate of its host genes.
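
    The 8-mer screen described above can be pictured with a short sketch that reports, for each candidate k-mer, the fraction of sequences containing it at least once. This is an illustration under simple assumptions (exact string matching, one strand, hypothetical function names), not the authors' pipeline.

```python
from collections import Counter

def kmer_presence(seqs, k=8):
    """For each k-mer, count how many sequences contain it at least once."""
    counts = Counter()
    for seq in seqs:
        # A set ensures each k-mer is counted once per sequence.
        counts.update({seq[i:i + k] for i in range(len(seq) - k + 1)})
    return counts

def screen(seqs, candidates, k=8):
    """Fraction of sequences that contain each candidate k-mer."""
    counts = kmer_presence(seqs, k)
    return {kmer: counts[kmer] / len(seqs) for kmer in candidates}
```

    Comparing these fractions between a gene class of interest (for example ribosomal genes) and all other genes would be one simple way to see whether a k-mer is distinctly associated with that class.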

  12. Evolutionary dynamics of a conserved sequence motif in the ribosomal genes of the ciliate Paramecium

    PubMed Central

    2010-01-01

    Background: In protozoa, the identification of preserved motifs by comparative genomics is often impeded by difficulties in generating reliable alignments for non-coding sequences. Moreover, the evolutionary dynamics of regulatory elements in 3' untranslated regions (both in protozoa and metazoa) remains a virtually unexplored issue. Results: By screening Paramecium tetraurelia's 3' untranslated regions for 8-mers that were previously found to be preserved in mammalian 3' UTRs, we detect and characterize a motif that is distinctly conserved in the ribosomal genes of this ciliate. The motif appears to be conserved across Paramecium aurelia species but is absent from the ribosomal genes of four additional non-Paramecium species surveyed, including another ciliate, Tetrahymena thermophila. Motif-free ribosomal genes retain fewer paralogs in the genome and appear to be lost more rapidly relative to motif-containing genes. Features associated with the discovered preserved motif are consistent with this 8-mer playing a role in post-transcriptional regulation. Conclusions: Our observations 1) shed light on the evolution of a putative regulatory motif across large phylogenetic distances; 2) are expected to facilitate the understanding of the modulation of ribosomal gene expression in Paramecium; and 3) reveal a largely unexplored--and presumably not restricted to Paramecium--association between the presence/absence of a DNA motif and the evolutionary fate of its host genes. PMID:20441586

  13. Computational generation and screening of RNA motifs in large nucleotide sequence pools

    PubMed Central

    Kim, Namhee; Izzo, Joseph A.; Elmetwaly, Shereef; Gan, Hin Hark; Schlick, Tamar

    2010-01-01

    Although identification of active motifs in large random sequence pools is central to RNA in vitro selection, no systematic computational equivalent of this process has yet been developed. We develop a computational approach that combines target pool generation, motif scanning and motif screening using secondary structure analysis for applications to 10^12–10^14-sequence pools; large pool sizes are made possible using program redesign and supercomputing resources. We use the new protocol to search for aptamer and ribozyme motifs in pools up to experimental pool size (10^14 sequences). We show that motif scanning, structure matching and flanking sequence analysis, respectively, reduce the initial sequence pool by 6–8, 1–2 and 1 orders of magnitude, consistent with the rare occurrence of active motifs in random pools. The final yields match the theoretical yields from probability theory for simple motifs and overestimate experimental yields, which constitute lower bounds, for aptamers because screening analyses beyond secondary structure information are not considered systematically. We also show that designed pools using our nucleotide transition probability matrices can produce higher yields for RNA ligase motifs than random pools. Our methods for generating, analyzing and designing large pools can help improve RNA design via simulation of aspects of in vitro selection. PMID:20448026
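
    The orders-of-magnitude bookkeeping above can be checked with a few lines of arithmetic. The sketch below also includes a textbook expectation for how often one fixed contiguous motif occurs in a uniformly random pool; that formula is an assumption chosen for illustration and is not necessarily the probability model used in the paper. Function names are hypothetical.

```python
def remaining_orders(pool_exponent, reductions):
    """Exponent (base 10) of the pool size left after successive filters,
    each given as a number of orders of magnitude removed."""
    return pool_exponent - sum(reductions)

def expected_occurrences(pool_size, seq_len, motif_len):
    """Expected count of one fixed contiguous motif in a pool of uniformly
    random sequences (linearity of expectation over all windows)."""
    windows = seq_len - motif_len + 1
    return pool_size * windows * 0.25 ** motif_len

# Starting from a 10^14 pool with the reductions quoted in the abstract:
low = remaining_orders(14, [8, 2, 1])   # strongest filtering
high = remaining_orders(14, [6, 1, 1])  # weakest filtering
```

    Under these figures, roughly 10^3 to 10^6 candidate sequences would survive all three steps of a 10^14-sequence pool.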

  14. Population genomics and transcriptional consequences of regulatory motif variation in globally diverse Saccharomyces cerevisiae strains.

    PubMed

    Connelly, Caitlin F; Skelly, Daniel A; Dunham, Maitreya J; Akey, Joshua M

    2013-07-01

    Noncoding genetic variation is known to significantly influence gene expression levels in a growing number of specific cases; however, the patterns of genome-wide noncoding variation present within populations, the evolutionary forces acting on noncoding variants, and the relative effects of regulatory polymorphisms on transcript abundance are not well characterized. Here, we address these questions by analyzing patterns of regulatory variation in motifs for 177 DNA binding proteins in 37 strains of Saccharomyces cerevisiae. We found considerable polymorphism in regulatory motifs across S. cerevisiae strains (mean π = 0.005) as well as diversity in regulatory motifs (mean 0.91 motif differences per regulatory region). Population genetics analyses reveal that motifs are under purifying selection, and there is considerable heterogeneity in the magnitude of selection across different motifs. Finally, we obtained RNA-Seq data in 22 strains and identified 49 polymorphic DNA sequence motifs in 30 distinct genes that are significantly associated with transcriptional differences between strains. In 22 of these genes, there was a single polymorphic motif associated with expression in the upstream region. Our results provide comprehensive insights into the evolutionary trajectory of regulatory variation in yeast and the characteristics of a compendium of regulatory alleles.
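
    The statistic π quoted above is nucleotide diversity, conventionally the average number of pairwise differences per site among aligned sequences. A minimal sketch of that textbook definition follows; it assumes gap-free, equal-length alignments, and the function name is hypothetical rather than taken from the study.

```python
from itertools import combinations

def nucleotide_diversity(aligned):
    """Mean pairwise differences per site among equal-length sequences."""
    length = len(aligned[0])
    pairs = list(combinations(aligned, 2))
    diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return diffs / (len(pairs) * length)
```

    For three toy sequences `ACGT`, `ACGT` and `ACGA`, two of the three pairs differ at one of four sites, giving π = 2 / 12.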

  15. Redemptive Rhetoric: The Continuity Motif in the Rhetoric of Right to Life.

    ERIC Educational Resources Information Center

    Solomon, Martha

    1980-01-01

    Traces the use of the "continuity" motif in the Right to Life movement's rhetoric and its influence on the depiction of the abortion controversy. Analyzes how the motif functions rhetorically to aid the movement in defining its activities and involvement. (PD)

  16. Neutral red as a specific light-up fluorescent probe for i-motif DNA.

    PubMed

    Xu, Lijun; Wang, Jine; Sun, Na; Liu, Min; Cao, Yi; Wang, Zhili; Pei, Renjun

    2016-12-06

    We report a specific light-up fluorescent probe for i-motif DNA for the first time. Compared with the previously reported probes, neutral red could selectively interact with an i-motif and show a significant increase in its fluorescence. This feature makes it advantageous for designing label-free fluorescent sensing systems.

  17. Noncoding RNA danger motifs bridge innate and adaptive immunity and are potent adjuvants for vaccination

    PubMed Central

    Wang, Lilin; Smith, Dan; Bot, Simona; Dellamary, Luis; Bloom, Amy; Bot, Adrian

    2002-01-01

    The adaptive immune response is triggered by recognition of T and B cell epitopes and is influenced by “danger” motifs that act via innate immune receptors. This study shows that motifs associated with noncoding RNA are essential features in the immune response reminiscent of viral infection, mediating rapid induction of proinflammatory chemokine expression, recruitment and activation of antigen-presenting cells, modulation of regulatory cytokines, subsequent differentiation of Th1 cells, isotype switching, and stimulation of cross-priming. The heterogeneity of RNA-associated motifs results in differential binding to cellular receptors, and specifically impacts the immune profile. Naturally occurring double-stranded RNA (dsRNA) triggered activation of dendritic cells and enhancement of specific immunity, similar to selected synthetic dsRNA motifs. Based on the ability of specific RNA motifs to block tolerance induction and effectively organize the immune defense during viral infection, we conclude that such RNA species are potent danger motifs. We also demonstrate the feasibility of using selected RNA motifs as adjuvants in the context of novel aerosol carriers for optimizing the immune response to subunit vaccines. In conclusion, RNA-associated motifs produced during viral infection bridge the early response with the late adaptive phase, regulating the activation and differentiation of antigen-specific B and T cells, in addition to a short-term impact on innate immunity. PMID:12393853

  18. Construction of DNA logic gates utilizing a H+/Ag+ induced i-motif structure.

    PubMed

    Shi, Yunhua; Sun, Hongxia; Xiang, Junfeng; Chen, Hongbo; Yang, Qianfan; Guan, Aijiao; Li, Qian; Yu, Lijia; Tang, Yalin

    2014-12-18

    A simple technology to construct diverse DNA logic gates (OR and INHIBIT) has been designed utilizing a H(+) and/or Ag(+) induced i-motif structure. The logic gates are easily controlled and also show a real time response towards inputs. The research provides a new insight for designing DNA logic gates using an i-motif DNA structure.

  19. Wayward Warriors: The Viking Motif in Swedish and English Children's Literature

    ERIC Educational Resources Information Center

    Sundmark, Björn

    2014-01-01

    In this article the Viking motif in children's literature is explored--from its roots in (adult) nationalist and antiquarian discourse, over pedagogical and historical texts for children, to the eventual diversification (or dissolution) of the motif into different genres and forms. The focus is on Swedish Viking narratives, but points of…

  20. The PXDLS linear motif regulates circadian rhythmicity through protein–protein interactions

    PubMed Central

    Shalev, Moran; Aviram, Rona; Adamovich, Yaarit; Kraut-Cohen, Judith; Shamia, Tal; Ben-Dor, Shifra; Golik, Marina; Asher, Gad

    2014-01-01

    The circadian core clock circuitry relies on interlocked transcription-translation feedback loops that largely count on multiple protein interactions. The molecular mechanisms implicated in the assembly of these protein complexes are relatively unknown. Our bioinformatics analysis of short linear motifs, implicated in protein interactions, reveals an enrichment of the Pro-X-Asp-Leu-Ser (PXDLS) motif within circadian transcripts. We show that the PXDLS motif can bind to BMAL1/CLOCK and disrupt circadian oscillations in a cell-autonomous manner. Remarkably, the motif is evolutionary conserved in the core clock protein REV-ERBα, and additional proteins implicated in the clock's function (NRIP1, CBP). In this conjuncture, we uncover a novel cross talk between the two principal core clock feedback loops and show that BMAL/CLOCK and REV-ERBα interact and that the PXDLS motif of REV-ERBα participates in their binding. Furthermore, we demonstrate that the PXDLS motifs of NRIP1 and CBP are involved in circadian rhythmicity. Our findings suggest that the PXDLS motif plays an important role in circadian rhythmicity through regulation of protein interactions within the clock circuitry and that short linear motifs can be employed to modulate circadian oscillations. PMID:25260595
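
    The kind of enrichment analysis described above can be sketched by treating PXDLS as a pattern in which X is any residue and comparing how often it occurs in a set of interest versus a background set. The regular expression `P.DLS` and the function names are illustrative assumptions, and no statistical test is included.

```python
import re

# 'X' in PXDLS stands for any residue, written as '.' in the regex.
PXDLS = re.compile(r"P.DLS")

def fraction_with_motif(seqs):
    """Fraction of protein sequences containing at least one PXDLS match."""
    return sum(bool(PXDLS.search(s)) for s in seqs) / len(seqs)

def fold_enrichment(foreground, background):
    """Ratio of motif-containing fractions (foreground over background).
    Raises ZeroDivisionError if no background sequence has the motif."""
    return fraction_with_motif(foreground) / fraction_with_motif(background)
```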

  2. World Color Survey color naming reveals universal motifs and their within-language diversity

    PubMed Central

    Lindsey, Delwin T.; Brown, Angela M.

    2009-01-01

    We analyzed the color terms in the World Color Survey (WCS) (www.icsi.berkeley.edu/wcs/), a large color-naming database obtained from informants of mostly unwritten languages spoken in preindustrialized cultures that have had limited contact with modern, industrialized society. The color naming idiolects of 2,367 WCS informants fall into three to six “motifs,” where each motif is a different color-naming system based on a subset of a universal glossary of 11 color terms. These motifs are universal in that they occur worldwide, with some individual variation, in completely unrelated languages. Strikingly, these few motifs are distributed across the WCS informants in such a way that multiple motifs occur in most languages. Thus, the culture a speaker comes from does not completely determine how he or she will use color terms. An analysis of the modern patterns of motif usage in the WCS languages, based on the assumption that they reflect historical patterns of color term evolution, suggests that color lexicons have changed over time in a complex but orderly way. The worldwide distribution of the motifs and the cooccurrence of multiple motifs within languages suggest that universal processes control the naming of colors. PMID:19901327

  3. Base motif recognition and design of DNA templates for fluorescent silver clusters by machine learning.

    PubMed

    Copp, Stacy M; Bogdanov, Petko; Debord, Mark; Singh, Ambuj; Gwinn, Elisabeth

    2014-09-03

    Discriminative base motifs within DNA templates for fluorescent silver clusters are identified using methods that combine large experimental data sets with machine learning tools for pattern recognition. Combining the discovery of certain multibase motifs important for determining fluorescence brightness with a generative algorithm, the probability of selecting DNA templates that stabilize fluorescent silver clusters is increased by a factor of >3.

  4. Physical-chemical property based sequence motifs and methods regarding same

    DOEpatents

    Braun, Werner; Mathura, Venkatarajan S.; Schein, Catherine H.

    2008-09-09

    A data analysis system, program, and/or method, e.g., a data mining/data exploration method, using physical-chemical property motifs. For example, a sequence database may be searched for identifying segments thereof having physical-chemical properties similar to the physical-chemical property motifs.
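
    As a rough picture of searching sequences by physical-chemical property rather than by residue identity, the sketch below converts a query segment into a hydropathy profile and finds the most similar window in a target sequence. Using the single Kyte-Doolittle hydropathy scale and Euclidean distance is an assumption made for illustration; the patent's own property descriptors and scoring are not described in the abstract.

```python
# Kyte-Doolittle hydropathy values, used here as a stand-in property scale.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def profile(seq):
    """Map a protein segment to its per-residue property values."""
    return [KD[a] for a in seq]

def most_similar_window(query, target):
    """Return (distance, start) of the target window whose property
    profile is closest (Euclidean) to that of the query segment."""
    q = profile(query)
    w = len(q)
    best = None
    for i in range(len(target) - w + 1):
        p = profile(target[i:i + w])
        dist = sum((x - y) ** 2 for x, y in zip(q, p)) ** 0.5
        if best is None or dist < best[0]:
            best = (dist, i)
    return best
```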

  5. Population Genomics and Transcriptional Consequences of Regulatory Motif Variation in Globally Diverse Saccharomyces cerevisiae Strains

    PubMed Central

    Connelly, Caitlin F.; Skelly, Daniel A.; Dunham, Maitreya J.; Akey, Joshua M.

    2013-01-01

    Noncoding genetic variation is known to significantly influence gene expression levels in a growing number of specific cases; however, the patterns of genome-wide noncoding variation present within populations, the evolutionary forces acting on noncoding variants, and the relative effects of regulatory polymorphisms on transcript abundance are not well characterized. Here, we address these questions by analyzing patterns of regulatory variation in motifs for 177 DNA binding proteins in 37 strains of Saccharomyces cerevisiae. We found considerable polymorphism in regulatory motifs across S. cerevisiae strains (mean π = 0.005) as well as diversity in regulatory motifs (mean 0.91 motif differences per regulatory region). Population genetics analyses reveal that motifs are under purifying selection, and there is considerable heterogeneity in the magnitude of selection across different motifs. Finally, we obtained RNA-Seq data in 22 strains and identified 49 polymorphic DNA sequence motifs in 30 distinct genes that are significantly associated with transcriptional differences between strains. In 22 of these genes, there was a single polymorphic motif associated with expression in the upstream region. Our results provide comprehensive insights into the evolutionary trajectory of regulatory variation in yeast and the characteristics of a compendium of regulatory alleles. PMID:23619145

  6. Genome-wide conserved consensus transcription factor binding motifs are hyper-methylated

    PubMed Central

    2010-01-01

    Background: DNA methylation can regulate gene expression by modulating the interaction between DNA and proteins or protein complexes. Conserved consensus motifs exist across the human genome ("predicted transcription factor binding sites": "predicted TFBS") but the large majority of these are proven by chromatin immunoprecipitation and high throughput sequencing (ChIP-seq) not to be biological transcription factor binding sites ("empirical TFBS"). We hypothesize that DNA methylation at conserved consensus motifs prevents promiscuous or disorderly transcription factor binding. Results: Using genome-wide methylation maps of the human heart and sperm, we found that all conserved consensus motifs as well as the subset of those that reside outside CpG islands have an aggregate profile of hyper-methylation. In contrast, empirical TFBS with conserved consensus motifs have a profile of hypo-methylation. 40% of empirical TFBS with conserved consensus motifs resided in CpG islands whereas only 7% of all conserved consensus motifs were in CpG islands. Finally, we further identified a minority subset of TF whose profiles are either hypo-methylated or neutral at their respective conserved consensus motifs, implying that these TF may be responsible for establishing or maintaining an un-methylated DNA state, or that their binding is not regulated by DNA methylation. Conclusions: Our analysis supports the hypothesis that at least for a subset of TF, empirical binding to conserved consensus motifs genome-wide may be controlled by DNA methylation. PMID:20875111

  7. Stabilization of i-motif structures by 2′-β-fluorination of DNA

    PubMed Central

    Assi, Hala Abou; Harkness, Robert W.; Martin-Pintado, Nerea; Wilds, Christopher J.; Campos-Olivas, Ramón; Mittermaier, Anthony K.; González, Carlos; Damha, Masad J.

    2016-01-01

    i-Motifs are four-stranded DNA structures consisting of two parallel DNA duplexes held together by hemi-protonated and intercalated cytosine base pairs (C:CH+). They have attracted considerable research interest for their potential role in gene regulation and their use as pH responsive switches and building blocks in macromolecular assemblies. At neutral and basic pH values, the cytosine bases deprotonate and the structure unfolds into single strands. To avoid this limitation and expand the range of environmental conditions supporting i-motif folding, we replaced the sugar in DNA by 2-deoxy-2-fluoroarabinose. We demonstrate that such a modification significantly stabilizes i-motif formation over a wide pH range, including pH 7. Nuclear magnetic resonance experiments reveal that 2-deoxy-2-fluoroarabinose adopts a C2′-endo conformation, instead of the C3′-endo conformation usually found in unmodified i-motifs. Nevertheless, this substitution does not alter the overall i-motif structure. This conformational change, together with the changes in charge distribution in the sugar caused by the electronegative fluorine atoms, leads to a number of favorable sequential and inter-strand electrostatic interactions. The availability of folded i-motifs at neutral pH will aid investigations into the biological function of i-motifs in vitro, and will expand i-motif applications in nanotechnology. PMID:27166371

  8. The EDLL motif: a potent plant transcriptional activation domain from AP2/ERF transcription factors.

    PubMed

    Tiwari, Shiv B; Belachew, Alemu; Ma, Siu Fong; Young, Melinda; Ade, Jules; Shen, Yu; Marion, Colleen M; Holtan, Hans E; Bailey, Adina; Stone, Jeffrey K; Edwards, Leslie; Wallace, Andreah D; Canales, Roger D; Adam, Luc; Ratcliffe, Oliver J; Repetti, Peter P

    2012-06-01

    In plants, the ERF/EREBP family of transcriptional regulators plays a key role in adaptation to various biotic and abiotic stresses. These proteins contain a conserved AP2 DNA-binding domain and several uncharacterized motifs. Here, we describe a short motif, termed 'EDLL', that is present in AtERF98/TDR1 and other clade members from the same AP2 sub-family. We show that the EDLL motif, which has a unique arrangement of acidic amino acids and hydrophobic leucines, functions as a strong activation domain. The motif is transferable to other proteins, and is active at both proximal and distal positions of target promoters. As such, the EDLL motif is able to partly overcome the repression conferred by the AtHB2 transcription factor, which contains an ERF-associated amphiphilic repression (EAR) motif. We further examined the activation potential of EDLL by analysis of the regulation of flowering time by NF-Y (nuclear factor Y) proteins. Genetic evidence indicates that NF-Y protein complexes potentiate the action of CONSTANS in regulation of flowering in Arabidopsis; we show that the transcriptional activation function of CONSTANS can be substituted by direct fusion of the EDLL activation motif to NF-YB subunits. The EDLL motif represents a potent plant activation domain that can be used as a tool to confer transcriptional activation potential to heterologous DNA-binding proteins.

  9. Identification of a novel mono-leucine basolateral sorting motif within the cytoplasmic domain of amphiregulin

    PubMed Central

    Gephart, Jonathan D.; Singh, Bhuminder; Higginbotham, James N.; Franklin, Jeffrey L.; Gonzalez, Alfonso; Fölsch, Heike; Coffey, Robert J.

    2011-01-01

    Epithelial cells establish apical and basolateral (BL) membranes with distinct protein and lipid compositions. To achieve this spatial asymmetry, the cell utilizes a variety of mechanisms for differential sorting, delivery and retention of cell surface proteins. The EGF receptor (EGFR) and its ligand, amphiregulin (AREG), are transmembrane proteins delivered to the BL membrane in polarized epithelial cells. Herein, we show that the cytoplasmic domain of AREG contains dominant BL sorting information; replacement of the cytoplasmic domain of apically targeted NGFR with the cytoplasmic domain of AREG redirects the chimera to the BL surface. Using sequential truncations and site-directed mutagenesis of the AREG cytoplasmic domain, we identify a novel BL sorting motif consisting of a single leucine C-terminal to an acidic cluster (EEXXXL). In AP-1B-deficient cells, newly synthesized AREG is initially delivered to the BL surface, as in AP-1B-expressing cells. However, in these AP-1B-deficient cells, recycling of AREG back to the BL surface is compromised, leading to its appearance at the apical surface. These results show that recycling, but not delivery, of AREG to the BL surface is AP-1B-dependent. PMID:21917092

  10. Seed storage protein gene promoters contain conserved DNA motifs in Brassicaceae, Fabaceae and Poaceae

    PubMed Central

    Fauteux, François; Strömvik, Martina V

    2009-01-01

    Background: Accurate computational identification of cis-regulatory motifs is difficult, particularly in eukaryotic promoters, which typically contain multiple short and degenerate DNA sequences bound by several interacting factors. Enrichment in combinations of rare motifs in the promoter sequence of functionally or evolutionarily related genes among several species is an indicator of conserved transcriptional regulatory mechanisms. This provides a basis for the computational identification of cis-regulatory motifs. Results: We have used a discriminative seeding DNA motif discovery algorithm for an in-depth analysis of 54 seed storage protein (SSP) gene promoters from three plant families, namely Brassicaceae (mustards), Fabaceae (legumes) and Poaceae (grasses) using backgrounds based on complete sets of promoters from a representative species in each family, namely Arabidopsis (Arabidopsis thaliana (L.) Heynh.), soybean (Glycine max (L.) Merr.) and rice (Oryza sativa L.) respectively. We have identified three conserved motifs (two RY-like and one ACGT-like) in Brassicaceae and Fabaceae SSP gene promoters that are similar to experimentally characterized seed-specific cis-regulatory elements. Fabaceae SSP gene promoter sequences are also enriched in a novel, seed-specific E2Fb-like motif. Conserved motifs identified in Poaceae SSP gene promoters include a GCN4-like motif, two prolamin-box-like motifs and an Skn-1-like motif. Evidence of the presence of a variant of the TATA-box is found in the SSP gene promoters from the three plant families. Motifs discovered in SSP gene promoters were used to score whole-genome sets of promoters from Arabidopsis, soybean and rice. The highest-scoring promoters are associated with genes coding for different subunits or precursors of seed storage proteins. Conclusion: Seed storage protein gene promoter motifs are conserved in diverse species, and different plant families are characterized by a distinct combination of conserved motifs.

  11. Mental imagery boosts music compositional creativity

    PubMed Central

    Lim, Stephen Wee Hun

    2017-01-01

    We empirically investigated the effect of mental imagery on young children’s music compositional creativity. Children aged 5 to 8 years participated in two music composition sessions. In the control session, participants based their composition on a motif that they had created using a sequence of letter names. In the mental imagery session, participants were given a picture of an animal and instructed to imagine the animal’s sounds and movements, before incorporating what they had imagined into their composition. Six expert judges independently rated all music compositions on creativity based on subjective criteria (consensual assessment). Reliability analyses indicated that the expert judges demonstrated a high level of agreement in their ratings. The mental imagery compositions received significantly higher creativity ratings by the expert judges than did the control compositions. These results provide evidence for the effectiveness of mental imagery in enhancing young children’s music compositional creativity. PMID:28296965

  12. Mental imagery boosts music compositional creativity.

    PubMed

    Wong, Sarah Shi Hui; Lim, Stephen Wee Hun

    2017-01-01

    We empirically investigated the effect of mental imagery on young children's music compositional creativity. Children aged 5 to 8 years participated in two music composition sessions. In the control session, participants based their composition on a motif that they had created using a sequence of letter names. In the mental imagery session, participants were given a picture of an animal and instructed to imagine the animal's sounds and movements, before incorporating what they had imagined into their composition. Six expert judges independently rated all music compositions on creativity based on subjective criteria (consensual assessment). Reliability analyses indicated that the expert judges demonstrated a high level of agreement in their ratings. The mental imagery compositions received significantly higher creativity ratings by the expert judges than did the control compositions. These results provide evidence for the effectiveness of mental imagery in enhancing young children's music compositional creativity.

  13. Methods for Identifying Ligands that Target Nucleic Acid Molecules and Nucleic Acid Structural Motifs

    NASA Technical Reports Server (NTRS)

    Disney, Matthew D. (Inventor); Childs-Disney, Jessica L. (Inventor)

    2017-01-01

    Disclosed are methods for identifying a nucleic acid (e.g., RNA, DNA, etc.) motif which interacts with a ligand. The method includes providing a plurality of ligands immobilized on a support, wherein each particular ligand is immobilized at a discrete location on the support; contacting the plurality of immobilized ligands with a nucleic acid motif library under conditions effective for one or more members of the nucleic acid motif library to bind with the immobilized ligands; and identifying members of the nucleic acid motif library that are bound to a particular immobilized ligand. Also disclosed are methods for selecting, from a plurality of candidate ligands, one or more ligands that have increased likelihood of binding to a nucleic acid molecule comprising a particular nucleic acid motif, as well as methods for identifying a nucleic acid which interacts with a ligand.

  14. Growing scale-free networks with tunable distributions of triad motifs

    NASA Astrophysics Data System (ADS)

    Li, Shuguang; Yuan, Jianping; Shi, Yong; Zagal, Juan Cristóbal

    2015-06-01

    Network motifs are local structural patterns and elementary functional units of complex networks in the real world, which can have significant impacts on the global behavior of these systems. Many models are able to reproduce complex networks mimicking a series of global features of real systems; however, the local features such as motifs in real networks have not been well represented. We propose a model to grow scale-free networks with tunable motif distributions through a combined operation of preferential attachment and triad motif seeding steps. Numerical experiments show that the constructed networks have adjustable distributions of the local triad motifs, while preserving the global features of power-law distributions of node degree, short average path lengths of nodes, and highly clustered structures.

  15. The dimerization motif of the glycophorin A transmembrane segment in membranes: importance of glycine residues.

    PubMed

    Brosig, B; Langosch, D

    1998-04-01

    The glycophorin A transmembrane segment homo-dimerizes to a right-handed pair of alpha-helices. Here, we identified the amino acid motif mediating this interaction within a natural membrane environment. Critical residues were grafted onto two different hydrophobic host sequences in a stepwise manner and self-assembly of the hybrid sequences was determined with the ToxR transcription activator system. Our results show that the motif LIxxGxxxGxxxT elicits a level of self-association equivalent to that of the original glycophorin A transmembrane segment. This motif is very similar to the one previously established in detergent solution. Interestingly, the central GxxxG motif by itself already induced strong self-assembly of host sequences and the three-residue spacing between both glycines proved to be optimal for the interaction. The GxxxG element thus appears to be the most crucial part of the interaction motif.
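
    The two patterns named in this abstract, GxxxG and LIxxGxxxGxxxT, can be scanned for with ordinary regular expressions. The following is a minimal sketch, assuming that "x" stands for any single residue; the function names and the test sequences are hypothetical and are not taken from the paper.

```python
import re

# Assumption: "x" in the abstract's notation means "any one amino acid".
# The lookahead lets overlapping GxxxG hits (e.g. GxxxGxxxG) all be reported.
GXXXG = re.compile(r"(?=G.{3}G)")
FULL_MOTIF = re.compile(r"LI.{2}G.{3}G.{3}T")  # LIxxGxxxGxxxT


def find_gxxxg(seq):
    """Return the 0-based start position of every GxxxG occurrence in seq."""
    return [m.start() for m in GXXXG.finditer(seq)]


def has_full_motif(seq):
    """Return True if seq contains the longer LIxxGxxxGxxxT pattern."""
    return FULL_MOTIF.search(seq) is not None


if __name__ == "__main__":
    toy = "AALIAAGAAAGAAATAA"  # hypothetical sequence, for illustration only
    print(find_gxxxg(toy), has_full_motif(toy))
```

    A plain regex only reports where the pattern occurs; it says nothing about whether a given segment actually self-associates in a membrane.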

  16. The dimerization motif of the glycophorin A transmembrane segment in membranes: importance of glycine residues.

    PubMed Central

    Brosig, B.; Langosch, D.

    1998-01-01

    The glycophorin A transmembrane segment homo-dimerizes to a right-handed pair of alpha-helices. Here, we identified the amino acid motif mediating this interaction within a natural membrane environment. Critical residues were grafted onto two different hydrophobic host sequences in a stepwise manner and self-assembly of the hybrid sequences was determined with the ToxR transcription activator system. Our results show that the motif LIxxGxxxGxxxT elicits a level of self-association equivalent to that of the original glycophorin A transmembrane segment. This motif is very similar to the one previously established in detergent solution. Interestingly, the central GxxxG motif by itself already induced strong self-assembly of host sequences and the three-residue spacing between both glycines proved to be optimal for the interaction. The GxxxG element thus appears to be the most crucial part of the interaction motif. PMID:9568912

  17. Stochastic and coherence resonance in feed-forward-loop neuronal network motifs

    NASA Astrophysics Data System (ADS)

    Guo, Daqing; Li, Chunguang

    2009-05-01

    The relationships between noise and complex dynamic behaviors of neuronal ensembles are key questions in computational neuroscience, particularly in understanding some basic signal transmission mechanisms of the brain. Here we systematically investigate both the stochastic resonance (SR) and coherence resonance (CR) in the triple-neuron feed-forward-loop (FFL) network motifs by computational modeling. We use the Izhikevich neuron model as well as the chemical coupling to build the FFL motifs, and consider all possible motif types. The simulation results demonstrate that these motifs can exploit noise to enrich their dynamic performance. With a proper choice of noise intensities, both the SR and CR can be exhibited in many types of the FFLs. On the other hand, our results also indicate that the coupling strength serves as a control parameter, which has great impacts on the stochastic dynamics of the FFL motifs. Additionally, biological implications of presented results in the field of neuroscience are outlined.

  18. A generalized profile syntax for biomolecular sequence motifs and its function in automatic sequence interpretation

    SciTech Connect

    Bucher, P.; Bairoch, A.

    1994-12-31

    A general syntax for expressing biomolecular sequence motifs is described, which will be used in future releases of the PROSITE data bank and in a similar collection of nucleic acid sequence motifs currently under development. The central part of the syntax is a regular structure which can be viewed as a generalization of the profiles introduced by Gribskov and coworkers. Accessory features implement specific motif search strategies and provide information helpful for the interpretation of predicted matches. Two contrasting examples, representing E. coli promoters and SH3 domains respectively, are shown to demonstrate the versatility of the syntax, and its compatibility with diverse motif search methods. It is argued that a comprehensive machine-readable motif collection based on the new syntax, in conjunction with a standard search program, can serve as a general-purpose sequence interpretation and function prediction tool.

  19. Mitoxantrone and Analogues Bind and Stabilize i-Motif Forming DNA Sequences

    NASA Astrophysics Data System (ADS)

    Wright, Elisé P.; Day, Henry A.; Ibrahim, Ali M.; Kumar, Jeethendra; Boswell, Leo J. E.; Huguin, Camille; Stevenson, Clare E. M.; Pors, Klaus; Waller, Zoë A. E.

    2016-12-01

    There are hundreds of ligands which can interact with G-quadruplex DNA, yet very few which target i-motif. To understand the dynamics between these structures and how they can be affected by intervention with small molecule ligands, more i-motif binding compounds are required. Herein we describe how the drug mitoxantrone can bind, induce folding of and stabilise i-motif forming DNA sequences, even at physiological pH. Additionally, mitoxantrone was found to bind i-motif forming sequences preferentially over double helical DNA. We also describe the stabilisation properties of analogues of mitoxantrone. This offers a new family of ligands with potential for use in experiments into the structure and function of i-motif forming DNA sequences.

  20. Active motif finder - a bio-tool based on mutational structures in DNA sequences

    PubMed Central

    Udayakumar, Mani; Shanmuga-priya, Palaniyandi; Hemavathi, Kamalakannan; Seenivasagam, Rengasamy

    2011-01-01

    Active Motif Finder (AMF) is a novel algorithmic tool, designed based on mutations in DNA sequences. Tools available at present for finding motifs are based on matching a given motif in the query sequence. AMF describes a new algorithm that identifies the occurrences of patterns which possess all kinds of mutations like insertion, deletion and mismatch. The algorithm is mainly based on the Alignment Score Matrix (ASM) computation by comparing input motif with full length sequence. Much of the effort in bioinformatics is directed to identify these motifs in the sequences of newly discovered genes. The proposed bio-tool serves as an open resource for analysis and is useful for studying polymorphisms in DNA sequences. AMF can be searched via a user-friendly interface. This tool is intended to serve the scientific community working in the areas of chemical and structural biology, and is freely available to all users, at http://www.sastra.edu/scbt/amf/. PMID:23554723
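
    The abstract does not spell out how the Alignment Score Matrix is computed, so the sketch below is not AMF's algorithm. It is a standard dynamic-programming approximate matcher (often attributed to Sellers) that illustrates the general idea of finding motif occurrences that tolerate insertions, deletions and mismatches. The function name and the unit edit costs are assumptions made for illustration.

```python
def approximate_matches(pattern, text, max_edits):
    """Return (end, edits) pairs: text[:end] ends an occurrence of pattern
    that needs at most max_edits insertions, deletions or mismatches."""
    m = len(pattern)
    # prev[i] = fewest edits aligning pattern[:i] to a substring of the text
    # ending at the previous text position. With no text read yet, matching
    # pattern[:i] costs i deletions.
    prev = list(range(m + 1))
    hits = []
    for end, ch in enumerate(text, start=1):
        cur = [0] * (m + 1)  # cur[0] = 0: an occurrence may start anywhere
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == ch else 1
            cur[i] = min(prev[i - 1] + cost,  # match or mismatch
                         prev[i] + 1,         # extra base in the text
                         cur[i - 1] + 1)      # base missing from the text
        if cur[m] <= max_edits:
            hits.append((end, cur[m]))
        prev = cur
    return hits


if __name__ == "__main__":
    print(approximate_matches("ACGT", "TTACGTTT", 0))
```

    The matcher reports end positions only; recovering the start of each occurrence would require keeping the full matrix and tracing back through it.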

  1. Mitoxantrone and Analogues Bind and Stabilize i-Motif Forming DNA Sequences

    PubMed Central

    Wright, Elisé P.; Day, Henry A.; Ibrahim, Ali M.; Kumar, Jeethendra; Boswell, Leo J. E.; Huguin, Camille; Stevenson, Clare E. M.; Pors, Klaus; Waller, Zoë A. E.

    2016-01-01

    There are hundreds of ligands which can interact with G-quadruplex DNA, yet very few which target i-motif. To understand the dynamics between these structures and how they can be affected by intervention with small molecule ligands, more i-motif binding compounds are required. Herein we describe how the drug mitoxantrone can bind, induce folding of and stabilise i-motif forming DNA sequences, even at physiological pH. Additionally, mitoxantrone was found to bind i-motif forming sequences preferentially over double helical DNA. We also describe the stabilisation properties of analogues of mitoxantrone. This offers a new family of ligands with potential for use in experiments into the structure and function of i-motif forming DNA sequences. PMID:28004744

  2. Combinatorial Information Theoretical Measurement of the Semantic Significance of Semantic Graph Motifs

    SciTech Connect

    Joslyn, Cliff A.; al-Saffar, Sinan; Haglin, David J.; Holder, Larry

    2011-06-14

    Given an arbitrary semantic graph data set, perhaps one lacking in explicit ontological information, we wish to first identify its significant semantic structures, and then measure the extent of their significance. Casting a semantic graph dataset as an edge-labeled, directed graph, this task can be built on the ability to mine frequent labeled subgraphs in edge-labeled, directed graphs. We begin by considering the fundamentals of the enumerative combinatorics of subgraph motif structures in edge-labeled directed graphs. We identify its frequent labeled, directed subgraph motif patterns, and measure the significance of the resulting motifs by the information gain relative to the expected value of the motif based on the empirical frequency distribution of the link types which compose them, assuming independence. We illustrate the method on a small test graph, and discuss results obtained for small linear motifs (link type bigrams and trigrams) in a larger graph structure.
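
    One plausible reading of "information gain relative to the expected value ... assuming independence" for link-type bigrams is the log-ratio of the observed bigram frequency to the product of the empirical link-type frequencies. The sketch below implements that reading only. The abstract does not give the authors' exact formula, and the function name and the toy labels are hypothetical.

```python
import math
from collections import Counter


def bigram_information_gain(bigrams):
    """bigrams: list of (link_type_1, link_type_2) pairs observed in a graph.

    Returns {bigram: log2(observed / expected)}, where expected is the
    product of the empirical link-type frequencies (independence assumption).
    """
    n = len(bigrams)
    pair_counts = Counter(bigrams)
    # Empirical link-type distribution, pooled over both bigram positions.
    label_counts = Counter(label for pair in bigrams for label in pair)
    total_labels = 2 * n
    gains = {}
    for (a, b), count in pair_counts.items():
        observed = count / n
        expected = (label_counts[a] / total_labels) * (label_counts[b] / total_labels)
        gains[(a, b)] = math.log2(observed / expected)
    return gains


if __name__ == "__main__":
    toy = [("cites", "authoredBy")] * 3 + [("authoredBy", "cites")]
    print(bigram_information_gain(toy))
```

    A gain near zero means the bigram occurs about as often as its link types alone would predict; a large positive gain marks a combination that is over-represented.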

  3. Motif-based construction of a functional map for mammalian olfactory receptors.

    PubMed

    Liu, Agatha H; Zhang, Xinmin; Stolovitzky, Gustavo A; Califano, Andrea; Firestein, Stuart J

    2003-05-01

    We applied an automatic and unsupervised system to a nearly complete database of mammalian odor receptor genes. The generated motifs and gene classification were subjected to extensive and systematic downstream analysis to obtain biological insights. Two major results from this analysis were: (1) a map of sequence motifs that may correlate with function and (2) the corresponding receptor classes in which members of each class are likely to share specific functions. We have discovered motifs that have been implicated in structural integrity and posttranslational modification, as well as motifs very likely to be directly involved in ligand binding. We further propose a combinatorial molecular hypothesis, based on unique combinations of the observed motifs, that provides a foundation for understanding the generation of a large number of ligand binding sites.

  4. Isolation, cloning and characterisation of motifs containing (GA/TC)n repeats isolated from vetch, Vicia bithynica.

    PubMed

    Sakowicz, Tomasz; Bowater, Richard; Parniewski, Paweł

    2004-01-01

    Microsatellites are widely distributed in plant genomes and comprise unstable regions that undergo mutational changes at rates much greater than that observed for non-repetitive sequences. They demonstrate intrinsic genetic instability, manifested as frequent length changes due to insertions or deletions of repeat units. Detailed analysis of 1600 clones containing genomic sequences of Vicia bithynica revealed the presence of microsatellite repeats in its genome. Based on the screening of a partial DNA library of plasmids, 13 clones harbouring (GA/TC)n tracts of various lengths of repeated motif were identified for further analysis of their internal sequence organization. Sequence analyses revealed the precise length, number of repeats, interruptions within tracts, as well as sequence composition flanking the repeat motifs. Representative plasmids containing different lengths of (GA/TC)n embedded in their original flanking sequence were used to investigate the genetic stability of the repeats. In the study presented herein, we employed a well characterised and tractable bacterial genetic system. Recultivations of Escherichia coli harbouring plasmids containing (GA/TC)n inserts demonstrated that the genetic instability of (GA/TC)n microsatellites depends highly on their length (number of repeats). These observations are in agreement with similar studies performed on repetitive sequences from humans and other organisms.
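
    Locating (GA/TC)n tracts and counting their repeat units can be illustrated with a short regular-expression scan. This is a minimal sketch, not the procedure used in the study: the function name, the minimum-repeat threshold and the example sequence are assumptions, and interrupted tracts are reported as separate hits.

```python
import re


def find_ga_tc_tracts(seq, min_repeats=3):
    """Return (start, end, repeats) for each uninterrupted (GA)n or (TC)n tract.

    start and end are 0-based, end exclusive; min_repeats is an arbitrary
    illustrative threshold, not a value from the abstract.
    """
    pattern = re.compile(r"(?:GA){%d,}|(?:TC){%d,}" % (min_repeats, min_repeats))
    return [(m.start(), m.end(), (m.end() - m.start()) // 2)
            for m in pattern.finditer(seq.upper())]


if __name__ == "__main__":
    toy = "AA" + "GA" * 4 + "TT" + "TC" * 3 + "AA"  # hypothetical sequence
    print(find_ga_tc_tracts(toy))
```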

  5. Copper-binding tripeptide motif increases potency of the antimicrobial peptide Anoplin via Reactive Oxygen Species generation.

    PubMed

    Libardo, M Daben J; Nagella, Sai; Lugo, Andrea; Pierce, Scott; Angeles-Boza, Alfredo M

    2015-01-02

    Antimicrobial peptides (AMPs) are broad spectrum antimicrobial agents that act through diverse mechanisms; this characteristic makes them suitable starting points for development of novel classes of antibiotics. We have previously reported the increase in activity of AMPs upon addition of the Amino Terminal Copper and Nickel (ATCUN) Binding Unit. Herein we synthesized the membrane active peptide Anoplin and two ATCUN-Anoplin derivatives and show that the increase in activity is indeed due to the ROS formation by the Cu(II)-ATCUN complex. We found that the ATCUN-Anoplin peptides were up to four times more potent compared to Anoplin alone against standard test bacteria. We studied membrane disruption and cellular localization and found that addition of the ATCUN motif did not lead to a difference in these properties. When helical content was calculated, we observed that ATCUN-Anoplin had a lower helical composition. We found that ATCUN-Anoplin peptides are able to oxidatively damage lipids in the bacterial membrane and that their activity trails the rate at which ROS is formed by the Cu(II)-ATCUN complexes alone. This study shows that addition of a metal binding tripeptide motif is a simple strategy to increase potency of AMPs by conferring a secondary action.

  6. An improved poly(A) motifs recognition method based on decision level fusion.

    PubMed

    Zhang, Shanxin; Han, Jiuqiang; Liu, Jun; Zheng, Jiguang; Liu, Ruiling

    2015-02-01

    Polyadenylation is the process of addition of poly(A) tail to mRNA 3' ends. Identification of motifs controlling polyadenylation plays an essential role in improving genome annotation accuracy and better understanding of the mechanisms governing gene regulation. The bioinformatics methods used for poly(A) motifs recognition have demonstrated that information extracted from sequences surrounding the candidate motifs can effectively differentiate true motifs from false ones. However, these methods depend on either domain features or string kernels. To date, methods combining information from different sources have not been reported. Here, we proposed an improved poly(A) motifs recognition method by combining different sources based on decision level fusion. First of all, two novel prediction methods were proposed based on support vector machine (SVM): one method is achieved by using the domain-specific features and principal component analysis (PCA) method to eliminate the redundancy (PCA-SVM); the other method is based on Oligo string kernel (Oligo-SVM). Then we proposed a novel machine-learning method for poly(A) motif prediction by marrying four poly(A) motifs recognition methods, including two state-of-the-art methods (Random Forest (RF) and HMM-SVM), and two novel proposed methods (PCA-SVM and Oligo-SVM). A decision level information fusion method was employed to combine the decision values of different classifiers by applying the DS evidence theory. We evaluated our method on a comprehensive poly(A) dataset that consists of 14,740 samples on 12 variants of poly(A) motifs and 2750 samples containing none of these motifs. Our method has achieved accuracy up to 86.13%. Compared with the four classifiers, our evidence theory based method reduces the average error rate by about 30%, 27%, 26% and 16%, respectively. The experimental results suggest that the proposed method is more effective for poly(A) motif recognition.
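
    The decision-level fusion step relies on DS (Dempster-Shafer) evidence theory. The sketch below shows Dempster's rule of combination for two sources over a two-hypothesis frame. How the paper converts classifier decision values into mass functions is not stated in the abstract, so the masses used here are made-up illustrative numbers and the function name is hypothetical.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of hypotheses to its mass; the masses
    of one function are assumed to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            combined[common] = combined.get(common, 0.0) + wa * wb
        else:
            conflict += wa * wb  # mass assigned to contradictory evidence
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict and cannot be combined")
    # Renormalise so the remaining (non-conflicting) mass sums to 1.
    return {k: v / (1.0 - conflict) for k, v in combined.items()}


if __name__ == "__main__":
    TRUE = frozenset({"motif"})
    FALSE = frozenset({"non-motif"})
    EITHER = TRUE | FALSE  # mass left on "don't know"
    clf_a = {TRUE: 0.6, FALSE: 0.1, EITHER: 0.3}  # illustrative numbers only
    clf_b = {TRUE: 0.5, FALSE: 0.2, EITHER: 0.3}
    print(dempster_combine(clf_a, clf_b))
```

    Because the rule is associative, more than two classifiers can be fused by folding the function over the list of mass functions.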

  7. MOCCS: Clarifying DNA-binding motif ambiguity using ChIP-Seq data.

    PubMed

    Ozaki, Haruka; Iwasaki, Wataru

    2016-08-01

    As a key mechanism of gene regulation, transcription factors (TFs) bind to DNA by recognizing specific short sequence patterns that are called DNA-binding motifs. A single TF can accept ambiguity within its DNA-binding motifs, which comprise both canonical (typical) and non-canonical motifs. Clarification of such DNA-binding motif ambiguity is crucial for revealing gene regulatory networks and evaluating mutations in cis-regulatory elements. Although chromatin immunoprecipitation sequencing (ChIP-seq) now provides abundant data on the genomic sequences to which a given TF binds, existing motif discovery methods are unable to directly answer whether a given TF can bind to a specific DNA-binding motif. Here, we report a method for clarifying the DNA-binding motif ambiguity, MOCCS. Given ChIP-Seq data of any TF, MOCCS comprehensively analyzes and describes every k-mer to which that TF binds. Analysis of simulated datasets revealed that MOCCS is applicable to various ChIP-Seq datasets, requiring only a few minutes per dataset. Application to the ENCODE ChIP-Seq datasets proved that MOCCS directly evaluates whether a given TF binds to each DNA-binding motif, even if known position weight matrix models do not provide sufficient information on DNA-binding motif ambiguity. Furthermore, users are not required to provide numerous parameters or background genomic sequence models that are typically unavailable. MOCCS is implemented in Perl and R and is freely available via https://github.com/yuifu/moccs. By complementing existing motif-discovery software, MOCCS will contribute to the basic understanding of how the genome controls diverse cellular processes via DNA-protein interactions. Copyright © 2016 Elsevier Ltd. All rights reserved.

  8. Bioinformatics study of cancer-related mutations within p53 phosphorylation site motifs.

    PubMed

    Ji, Xiaona; Huang, Qiang; Yu, Long; Nussinov, Ruth; Ma, Buyong

    2014-07-29

    p53 protein has about thirty phosphorylation sites located at the N- and C-termini and in the core domain. The phosphorylation sites are relatively less mutated than other residues in p53. To understand why and how p53 phosphorylation sites are rarely mutated in human cancer, using a bioinformatics approach, we examined the phosphorylation site and its nearby flanking residues, focusing on the consensus phosphorylation motif pattern, amino-acid correlations within the phosphorylation motifs, the propensity of structural disorder of the phosphorylation motifs, and cancer mutations observed within the phosphorylation motifs. Many p53 phosphorylation sites are targets for several kinases. The phosphorylation sites match 17 consensus sequence motifs out of the 29 classified. In addition to proline, which is common in kinase specificity-determining sites, we found high propensity of acidic residues to be adjacent to phosphorylation sites. Analysis of human cancer mutations in the phosphorylation motifs revealed that motifs with adjacent acidic residues generally have fewer mutations, in contrast to phosphorylation sites near proline residues. p53 phosphorylation motifs are mostly disordered. However, human cancer mutations within phosphorylation motifs tend to decrease the disorder propensity. Our results suggest that combination of acidic residues Asp and Glu with phosphorylation sites provides charge redundancy which may safeguard against loss-of-function mutations, and that the natively disordered nature of p53 phosphorylation motifs may help reduce mutational damage. Our results further suggest that engineering acidic amino acids adjacent to potential phosphorylation sites could be a p53 gene therapy strategy.

  9. A dinucleotide motif in oligonucleotides shows potent immunomodulatory activity and overrides species-specific recognition observed with CpG motif.

    PubMed

    Kandimalla, Ekambar R; Bhagat, Lakshmi; Zhu, Fu-Gang; Yu, Dong; Cong, Yan-Ping; Wang, Daqing; Tang, Jimmy X; Tang, Jin-Yan; Knetter, Cathrine F; Lien, Egil; Agrawal, Sudhir

    2003-11-25

    Bacterial and synthetic DNAs containing CpG dinucleotides in specific sequence contexts activate the vertebrate immune system through Toll-like receptor 9 (TLR9). In the present study, we used a synthetic nucleoside with a bicyclic heterobase [1-(2'-deoxy-beta-d-ribofuranosyl)-2-oxo-7-deaza-8-methyl-purine; R] to replace the C in CpG, resulting in an RpG dinucleotide. The RpG dinucleotide was incorporated in mouse- and human-specific motifs in oligodeoxynucleotides (oligos) and 3'-3-linked oligos, referred to as immunomers. Oligos containing the RpG motif induced cytokine secretion in mouse spleen-cell cultures. Immunomers containing RpG dinucleotides showed activity in transfected-HEK293 cells stably expressing mouse TLR9, suggesting direct involvement of TLR9 in the recognition of RpG motif. In J774 macrophages, RpG motifs activated NF-kappa B and mitogen-activated protein kinase pathways. Immunomers containing the RpG dinucleotide induced high levels of IL-12 and IFN-gamma, but lower IL-6 in time- and concentration-dependent fashion in mouse spleen-cell cultures costimulated with IL-2. Importantly, immunomers containing GTRGTT and GARGTT motifs were recognized to a similar extent by both mouse and human immune systems. Additionally, both mouse- and human-specific RpG immunomers potently stimulated proliferation of peripheral blood mononuclear cells obtained from diverse vertebrate species, including monkey, pig, horse, sheep, goat, rat, and chicken. An immunomer containing GTRGTT motif prevented conalbumin-induced and ragweed allergen-induced allergic inflammation in mice. We show that a synthetic bicyclic nucleotide is recognized in the C position of a CpG dinucleotide by immune cells from diverse vertebrate species without bias for flanking sequences, suggesting a divergent nucleotide motif recognition pattern of TLR9.

  10. A dinucleotide motif in oligonucleotides shows potent immunomodulatory activity and overrides species-specific recognition observed with CpG motif

    PubMed Central

    Kandimalla, Ekambar R.; Bhagat, Lakshmi; Zhu, Fu-Gang; Yu, Dong; Cong, Yan-Ping; Wang, Daqing; Tang, Jimmy X.; Tang, Jin-Yan; Knetter, Cathrine F.; Lien, Egil; Agrawal, Sudhir

    2003-01-01

    Bacterial and synthetic DNAs containing CpG dinucleotides in specific sequence contexts activate the vertebrate immune system through Toll-like receptor 9 (TLR9). In the present study, we used a synthetic nucleoside with a bicyclic heterobase [1-(2′-deoxy-β-d-ribofuranosyl)-2-oxo-7-deaza-8-methyl-purine; R] to replace the C in CpG, resulting in an RpG dinucleotide. The RpG dinucleotide was incorporated in mouse- and human-specific motifs in oligodeoxynucleotides (oligos) and 3′-3′-linked oligos, referred to as immunomers. Oligos containing the RpG motif induced cytokine secretion in mouse spleen-cell cultures. Immunomers containing RpG dinucleotides showed activity in transfected-HEK293 cells stably expressing mouse TLR9, suggesting direct involvement of TLR9 in the recognition of the RpG motif. In J774 macrophages, RpG motifs activated NF-κB and mitogen-activated protein kinase pathways. Immunomers containing the RpG dinucleotide induced high levels of IL-12 and IFN-γ, but lower IL-6 in a time- and concentration-dependent fashion in mouse spleen-cell cultures costimulated with IL-2. Importantly, immunomers containing GTRGTT and GARGTT motifs were recognized to a similar extent by both mouse and human immune systems. Additionally, both mouse- and human-specific RpG immunomers potently stimulated proliferation of peripheral blood mononuclear cells obtained from diverse vertebrate species, including monkey, pig, horse, sheep, goat, rat, and chicken. An immunomer containing the GTRGTT motif prevented conalbumin-induced and ragweed allergen-induced allergic inflammation in mice. We show that a synthetic bicyclic nucleotide is recognized in the C position of a CpG dinucleotide by immune cells from diverse vertebrate species without bias for flanking sequences, suggesting a divergent nucleotide motif recognition pattern of TLR9. PMID:14610275

  11. One motif to bind them: A small-XXX-small motif affects transmembrane domain 1 oligomerization, function, localization, and cross-talk between two yeast GPCRs.

    PubMed

    Lock, Antonia; Forfar, Rachel; Weston, Cathryn; Bowsher, Leo; Upton, Graham J G; Reynolds, Christopher A; Ladds, Graham; Dixon, Ann M

    2014-12-01

    G protein-coupled receptors (GPCRs) are the largest family of cell-surface receptors in mammals and facilitate a range of physiological responses triggered by a variety of ligands. GPCRs were thought to function as monomers; however, it is now accepted that GPCR homo- and hetero-oligomers also exist and influence receptor properties. The Schizosaccharomyces pombe GPCR Mam2 is a pheromone-sensing receptor involved in mating and has previously been shown to form oligomers in vivo. The first transmembrane domain (TMD) of Mam2 contains a small-XXX-small motif, overrepresented in membrane proteins and well known for promoting helix-helix interactions. An ortholog of Mam2 in Saccharomyces cerevisiae, Ste2, contains an analogous small-XXX-small motif which has been shown to contribute to receptor homo-oligomerization, localization and function. Here we have used experimental and computational techniques to characterize the role of the small-XXX-small motif in function and assembly of Mam2 for the first time. We find that disruption of the motif via mutagenesis leads to reduction of Mam2 TMD1 homo-oligomerization and pheromone-responsive cellular signaling of the full-length protein. It also impairs correct targeting to the plasma membrane. Mutation of the analogous motif in Ste2 yielded similar results, suggesting a conserved mechanism for assembly. Using co-expression of the two fungal receptors in conjunction with computational models, we demonstrate a functional change in G protein specificity and propose that this is brought about through hetero-dimeric interactions of Mam2 with Ste2 via the complementary small-XXX-small motifs. This highlights the potential of these motifs to affect a range of properties that can be investigated in other GPCRs. Copyright © 2014. Published by Elsevier B.V.
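
As a rough illustration of what a small-XXX-small motif looks like at the sequence level, the sketch below scans a helix for two small residues separated by any three residues. The residue set (Gly, Ala, Ser), the function name `find_small_xxx_small`, and the example helix are assumptions made for illustration; they are not taken from the study, and other definitions of "small" also admit Thr or Cys.

```python
import re

# Assumption: "small" means Gly, Ala or Ser.
SMALL = "GAS"

# A lookahead is used so that overlapping motifs are all reported.
PATTERN = re.compile(rf"(?=([{SMALL}].{{3}}[{SMALL}]))")


def find_small_xxx_small(seq):
    """Return (0-based start, matched 5-mer) for every small-XXX-small hit."""
    return [(m.start(), m.group(1)) for m in PATTERN.finditer(seq.upper())]


# Hypothetical helix, not a real Mam2 or Ste2 sequence.
example_helix = "LLGVIAGLLVAIL"
hits = find_small_xxx_small(example_helix)
for start, fivemer in hits:
    print(start, fivemer)
```

A mutagenesis design like the one described would then target the first and fifth positions of each reported hit.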

  12. The k-junction motif in RNA structure

    PubMed Central

    Wang, Jia; Daldrop, Peter; Huang, Lin; Lilley, David M. J.

    2014-01-01

    The k-junction is a structural motif in RNA comprising a three-way helical junction based upon kink turn (k-turn) architecture. A computer program written to examine relative helical orientation identified the three-way junction of the Arabidopsis TPP riboswitch as an elaborated k-turn. The Escherichia coli TPP riboswitch contains a related k-junction, and analysis of >11 000 sequences shows that the structure is common to these riboswitches. The k-junction exhibits all the key features of an N1-class k-turn, including the standard cross-strand hydrogen bonds. The third helix of the junction is coaxially aligned with the C (canonical) helix, while the k-turn loop forms the turn into the NC (non-canonical) helix. Analysis of ligand binding by ITC and global folding by gel electrophoresis demonstrates the importance of the k-turn nucleotides. Clearly the basic elements of k-turn structure are structurally well suited to generate a three-way helical junction, retaining all the key features and interactions of the k-turn. PMID:24531930

  13. Information processing by simple molecular motifs and susceptibility to noise.

    PubMed

    Mc Mahon, Siobhan S; Lenive, Oleg; Filippi, Sarah; Stumpf, Michael P H

    2015-09-06

    Biological organisms rely on their ability to sense and respond appropriately to their environment. The molecular mechanisms that facilitate these essential processes are however subject to a range of random effects and stochastic processes, which jointly affect the reliability of information transmission between receptors and, for example, the physiological downstream response. Information is mathematically defined in terms of the entropy; and the extent of information flowing across an information channel or signalling system is typically measured by the 'mutual information', or the reduction in the uncertainty about the output once the input signal is known. Here, we quantify how extrinsic and intrinsic noise affects the transmission of simple signals along simple motifs of molecular interaction networks. Even for very simple systems, the effects of the different sources of variability alone and in combination can give rise to bewildering complexity. In particular, extrinsic variability is apt to generate 'apparent' information that can, in extreme cases, mask the actual information that for a single system would flow between the different molecular components making up cellular signalling pathways. We show how this artificial inflation in apparent information arises and how the effects of different types of noise alone and in combination can be understood.
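
The definition of mutual information given above, the reduction in uncertainty about the output once the input is known, can be written as I(X;Y) = H(Y) - H(Y|X). The sketch below evaluates the equivalent sum over a discrete joint distribution. The function name and the three 2x2 example tables are illustrative assumptions, not data or code from the paper.

```python
import math


def mutual_information(joint):
    """Mutual information I(X;Y) in bits from a joint probability table.

    joint[i][j] is P(X = i, Y = j); the entries are assumed to sum to 1.
    """
    px = [sum(row) for row in joint]          # marginal of the input
    py = [sum(col) for col in zip(*joint)]    # marginal of the output
    mi = 0.0
    for i, row in enumerate(joint):
        for j, pxy in enumerate(row):
            if pxy > 0:                        # 0 * log(0) is taken as 0
                mi += pxy * math.log2(pxy / (px[i] * py[j]))
    return mi


# Hypothetical binary signalling channels.
noiseless = [[0.5, 0.0], [0.0, 0.5]]      # output always equals input
noisy = [[0.4, 0.1], [0.1, 0.4]]          # output flipped 20% of the time
independent = [[0.25, 0.25], [0.25, 0.25]]  # output ignores input

for name, table in [("noiseless", noiseless), ("noisy", noisy),
                    ("independent", independent)]:
    print(name, mutual_information(table))
```

The noiseless channel carries one full bit, the independent one carries none, and the noisy channel falls in between.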

  14. Predicting perfect adaptation motifs in reaction kinetic networks.

    PubMed

    Drengstig, Tormod; Ueda, Hiroki R; Ruoff, Peter

    2008-12-25

    Adaptation and compensation mechanisms are important to keep organisms fit in a changing environment. "Perfect adaptation" describes an organism's response to an external stepwise perturbation by resetting some of its variables precisely to their original preperturbation values. Examples of perfect adaptation are found in bacterial chemotaxis, photoreceptor responses, or MAP kinase activities. Two concepts have evolved for how perfect adaptation may be understood. In one approach, so-called "robust perfect adaptation", the adaptation is a network property (due to integral feedback control), which is independent of rate constant values. In the other approach, which we have termed "nonrobust perfect adaptation", a fine-tuning of rate constant values is needed to show perfect adaptation. Although integral feedback describes robust perfect adaptation in general terms, it does not directly show where in a network perfect adaptation may be observed. Using control theoretic methods, we are able to predict robust perfect adaptation sites within reaction kinetic networks and show that a prerequisite for robust perfect adaptation is that the network is open and irreversible. We applied the method on various reaction schemes and found that new (robust) perfect adaptation motifs emerge when considering suggested models of bacterial and eukaryotic chemotaxis.
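
To make the idea of robust perfect adaptation through integral feedback concrete, the sketch below integrates a hypothetical two-variable scheme with a forward Euler step. The scheme, the rate constants `k1` to `k4`, and the function name are assumptions chosen for illustration and are not one of the reaction networks analysed by the authors. The variable A is formed at rate k1 and removed at rate k2*E*A, while E is formed at rate k3*A and removed at a constant (zero-order) rate k4, so E integrates the deviation of A from the set point k4/k3.

```python
def simulate(k1_before=1.0, k1_after=2.0, t_step=20.0, t_end=60.0, dt=0.001):
    """Euler integration of a hypothetical integral-feedback scheme.

    dA/dt = k1 - k2 * E * A      (A is the adapting variable)
    dE/dt = k3 * A - k4          (E integrates the error A - k4/k3)
    The inflow k1 is stepped from k1_before to k1_after at t_step.
    """
    k2, k3, k4 = 1.0, 1.0, 1.0
    a = k4 / k3                          # set point of A
    e = k1_before * k3 / (k2 * k4)       # matching steady state of E
    trace = []
    for n in range(int(round(t_end / dt))):
        t = n * dt
        k1 = k1_before if t < t_step else k1_after
        da = k1 - k2 * e * a
        de = k3 * a - k4
        a += dt * da
        e += dt * de
        trace.append((t, a))
    return trace


trace = simulate()
peak = max(a for _, a in trace)
final = trace[-1][1]
print("peak A after the step:", peak)
print("final A:", final)
```

In this sketch A rises transiently after the step and then returns to k4/k3 whatever the new value of k1, which is the behaviour the abstract calls robust perfect adaptation.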

  15. Plasticity of the RNA Kink Turn Structural Motif

    SciTech Connect

    Antonioli, A.; Cochrane, J; Lipchock, S; Strobel, S

    2010-01-01

    The kink turn (K-turn) is an RNA structural motif found in many biologically significant RNAs. While most examples of the K-turn have a similar fold, the crystal structure of the Azoarcus group I intron revealed a novel RNA conformation, a reverse kink turn bent in the direction opposite that of a consensus K-turn. The reverse K-turn is bent toward the major grooves rather than the minor grooves of the flanking helices, yet the sequence differs from the K-turn consensus by only a single nucleotide. Here we demonstrate that the reverse bend direction is not solely defined by internal sequence elements, but is instead affected by structural elements external to the K-turn. It bends toward the major groove under the direction of a tetraloop-tetraloop receptor. The ability of one sequence to form two distinct structures demonstrates the inherent plasticity of the K-turn sequence. Such plasticity suggests that the K-turn is not a primary element in RNA folding, but instead is shaped by other structural elements within the RNA or ribonucleoprotein assembly.

  16. Atomic Force Microscopy of Arrays of Asymmetrical DNA Motifs

    SciTech Connect

    Sherman, W.B.; Mudalige, T.K.

    2012-03-21

    DNA can easily be assembled into wide and relatively flat nanostructures that lend themselves to study via Atomic Force Microscopy (AFM). It is often important to know which side of an assembly the AFM is imaging. This is particularly crucial for characterizing nanomachines, where the movement must be measured relative to fiducial features visible to the AFM. We have developed a cheap and simple technique for building DNA arrays with distinguishable sides, a technique requiring 10 or fewer strands - dozens or hundreds of strands fewer than used for these purposes previously. Our approach involves constructing arrays out of DNA tiles that have low apparent symmetry when imaged via AFM. We have surveyed the effects of varying degrees of motif asymmetry in AFM micrographs. Even at resolutions where the individual tiles cannot be resolved (either because of sub-optimal tip quality, or very gentle tapping by the AFM tip) the larger scale features of the arrays have predictable structures that allow the determination of which side of the array is facing up. We have used this information to verify that DNA hairpins attached to either the up- or down-facing side of an array on mica can be detected in AFM height scans. We have also characterized differences in appearance between hairpins attached to different sides of the arrays.

  17. Sulfur-induced structural motifs on copper and gold surfaces

    NASA Astrophysics Data System (ADS)

    Walen, Holly

    The interaction of sulfur with copper and gold surfaces plays a fundamental role in important phenomena that include coarsening of surface nanostructures, and self-assembly of alkanethiols. Here, we identify and analyze unique sulfur-induced structural motifs observed on the low-index surfaces of these two metals. We seek out these structures in an effort to better understand the fundamental interactions between these metals and sulfur that lend to the stability and favorability of metal-sulfur complexes vs. chemisorbed atomic sulfur. We choose very specific conditions: very low temperature (5 K), and very low sulfur coverage (≤ 0.1 monolayer). In this region of temperature-coverage space, which has not been examined previously for these adsorbate-metal systems, the effects of individual interactions between metals and sulfur are most apparent and can be assessed extensively with the aid of theory and modeling. Furthermore, at this temperature diffusion is minimal and relatively mobile species can be isolated, and at low coverage the structures observed are not consumed by an extended reconstruction. The primary experimental technique is scanning tunneling microscopy (STM). The experimental observations presented here, made under identical conditions, together with extensive DFT analyses, allow comparisons and insights into factors that favor the existence of metal-sulfur complexes, vs. chemisorbed atomic sulfur, on metal terraces. We believe this data will be instrumental in better understanding the complex phenomena occurring between the surfaces of coinage metals and sulfur.

  18. Eukaryotic Penelope-Like Retroelements Encode Hammerhead Ribozyme Motifs

    PubMed Central

    Cervera, Amelia; De la Peña, Marcos

    2014-01-01

    Small self-cleaving RNAs, such as the paradigmatic Hammerhead ribozyme (HHR), have been recently found widespread in DNA genomes across all kingdoms of life. In this work, we found that new HHR variants are preserved in the ancient family of Penelope-like elements (PLEs), a group of eukaryotic retrotransposons regarded as exceptional for encoding telomerase-like retrotranscriptases and spliceosomal introns. Our bioinformatic analysis revealed not only the presence of minimalist HHRs in the two flanking repeats of PLEs but also their massive and widespread occurrence in metazoan genomes. The architecture of these ribozymes indicates that they may work as dimers, although their low self-cleavage activity in vitro suggests the requirement of other factors in vivo. In plants, however, PLEs show canonical HHRs, whereas fungi and protist PLEs encode ribozyme variants with a stable active conformation as monomers. Overall, our data confirm the connection of self-cleaving RNAs with eukaryotic retroelements and unveil these motifs as a significant fraction of the encoded information in eukaryotic genomes. PMID:25135949

  19. Regulation of GPCR Anterograde Trafficking by Molecular Chaperones and Motifs.

    PubMed

    Young, Brent; Wertman, Jaime; Dupré, Denis J

    2015-01-01

    G protein-coupled receptors (GPCRs) make up a superfamily of integral membrane proteins that respond to a wide variety of extracellular stimuli, giving them an important role in cell function and survival. They have also proven to be valuable targets in the fight against various diseases. As such, GPCR signal regulation has received considerable attention over the last few decades. With the amplitude of signaling being determined in large part by receptor density at the plasma membrane, several endogenous mechanisms for modulating GPCR expression at the cell surface have come to light. It has been shown that cell surface expression is determined by both exocytic and endocytic processes. However, the body of knowledge surrounding GPCR trafficking from the endoplasmic reticulum to the plasma membrane, commonly known as anterograde trafficking, has considerable room for growth. We focus here on the current paradigms of anterograde GPCR trafficking. We will discuss the regulatory role of both the general and "nonclassical private" chaperone systems in GPCR trafficking as well as conserved motifs that serve as modulators of GPCR export from the endoplasmic reticulum and Golgi apparatus. Together, these topics summarize some of the known mechanisms by which the cell regulates anterograde GPCR trafficking. © 2015 Elsevier Inc. All rights reserved.

  20. Crammed signaling motifs in the T-cell receptor.

    PubMed

    Borroto, Aldo; Abia, David; Alarcón, Balbino

    2014-09-01

    Although the T cell antigen receptor (TCR) has long been known to contain multiple signaling subunits (CD3γ, CD3δ, CD3ɛ and CD3ζ), their role in signal transduction is still not well understood. The presence of at least one immunoreceptor tyrosine-based activation motif (ITAM) in each CD3 subunit has led to the idea that the multiplication of such elements essentially serves to amplify signals. However, the evolutionary conservation of non-ITAM sequences suggests that each CD3 subunit is likely to have specific non-redundant roles at some stage of development or in mature T cell function. The CD3ɛ subunit is paradigmatic because in a relatively short cytoplasmic sequence (∼55 amino acids) it contains several docking sites for proteins involved in intracellular trafficking and signaling, proteins whose relevance in T cell activation is slowly starting to be revealed. In this review we will summarize our current knowledge on the signaling effectors that bind directly to the TCR and we will propose a hierarchy in their response to TCR triggering.

  1. Identification of single C motif-1/lymphotactin receptor XCR1.

    PubMed

    Yoshida, T; Imai, T; Kakizaki, M; Nishimura, M; Takagi, S; Yoshie, O

    1998-06-26

    Single C motif-1 (SCM-1)/lymphotactin is a member of the chemokine superfamily, but retains only the 2nd and 4th of the four cysteine residues conserved in other chemokines. In humans, there are two highly homologous SCM-1 genes encoding SCM-1alpha and SCM-1beta with two amino acid substitutions. To identify a specific receptor for SCM-1 proteins, we produced recombinant SCM-1alpha and SCM-1beta by the baculovirus expression system and tested them on murine L1.2 cells stably expressing eight known chemokine receptors and three orphan receptors. Both proteins specifically induced migration in cells expressing an orphan receptor, GPR5. The migration was chemotactic and suppressed by pertussis toxin, indicating coupling to a Galpha type of G protein. Both proteins also induced intracellular calcium mobilization in GPR5-expressing L1.2 cells with efficient mutual cross desensitization. SCM-1alpha bound specifically to GPR5-expressing L1.2 cells with a Kd of 10 nM. By Northern blot analysis, GPR5 mRNA of about 5 kilobases was detected strongly in placenta and weakly in spleen and thymus among various human tissues. Identification of a specific receptor for SCM-1 would facilitate our investigation on its biological function. Following the set rule for the chemokine receptor nomenclature, we propose to designate GPR5 as XCR1 from XC chemokine receptor-1.

  2. Network motifs that stabilize the hybrid epithelial/mesenchymal phenotype

    NASA Astrophysics Data System (ADS)

    Jolly, Mohit Kumar; Jia, Dongya; Tripathi, Satyendra; Hanash, Samir; Mani, Sendurai; Ben-Jacob, Eshel; Levine, Herbert

    Epithelial to Mesenchymal Transition (EMT) and its reverse - MET - are hallmarks of cancer metastasis. While transitioning between E and M phenotypes, cells can also attain a hybrid epithelial/mesenchymal (E/M) phenotype that enables collective cell migration as a cluster of Circulating Tumor Cells (CTCs). These clusters can form 50 times more tumors than individually migrating CTCs, underlining their importance in metastasis. However, this hybrid E/M phenotype has been hypothesized to be only a transient one that is attained en route to EMT. Here, via mathematical modeling, we identify certain 'phenotypic stability factors' that couple with the core three-way decision-making circuit (miR-200/ZEB) and can maintain or stabilize the hybrid E/M phenotype. Further, we show experimentally that this phenotype can be maintained stably at a single-cell level, and knockdown of these factors impairs collective cell migration. We also show that these factors enable the association of hybrid E/M with high stemness or tumor-initiating potential. Finally, based on these factors, we deduce specific network motifs that can maintain the E/M phenotype. Our framework can be used to elucidate the effect of other players in regulating cellular plasticity during metastasis. This work was supported by NSF PHY-1427654 (Center for Theoretical Biological Physics) and the CPRIT Scholar in Cancer Research of the State of Texas at Rice University.

  3. Information processing by simple molecular motifs and susceptibility to noise

    PubMed Central

    Mc Mahon, Siobhan S.; Lenive, Oleg; Filippi, Sarah; Stumpf, Michael P. H.

    2015-01-01

    Biological organisms rely on their ability to sense and respond appropriately to their environment. The molecular mechanisms that facilitate these essential processes are however subject to a range of random effects and stochastic processes, which jointly affect the reliability of information transmission between receptors and, for example, the physiological downstream response. Information is mathematically defined in terms of the entropy; and the extent of information flowing across an information channel or signalling system is typically measured by the ‘mutual information’, or the reduction in the uncertainty about the output once the input signal is known. Here, we quantify how extrinsic and intrinsic noise affects the transmission of simple signals along simple motifs of molecular interaction networks. Even for very simple systems, the effects of the different sources of variability alone and in combination can give rise to bewildering complexity. In particular, extrinsic variability is apt to generate ‘apparent’ information that can, in extreme cases, mask the actual information that for a single system would flow between the different molecular components making up cellular signalling pathways. We show how this artificial inflation in apparent information arises and how the effects of different types of noise alone and in combination can be understood. PMID:26333812

  4. Motif mimetic of epsin perturbs tumor growth and metastasis

    PubMed Central

    Dong, Yunzhou; Wu, Hao; Rahman, H.N. Ashiqur; Liu, Yanjun; Pasula, Satish; Tessneer, Kandice L.; Cai, Xiaofeng; Liu, Xiaolei; Chang, Baojun; McManus, John; Hahn, Scott; Dong, Jiali; Brophy, Megan L.; Yu, Lili; Song, Kai; Silasi-Mansat, Robert; Saunders, Debra; Njoku, Charity; Song, Hoogeun; Mehta-D’Souza, Padmaja; Towner, Rheal; Lupu, Florea; McEver, Rodger P.; Xia, Lijun; Boerboom, Derek; Srinivasan, R. Sathish; Chen, Hong

    2015-01-01

    Tumor angiogenesis is critical for cancer progression. In multiple murine models, endothelium-specific epsin deficiency abrogates tumor progression by shifting the balance of VEGFR2 signaling toward uncontrolled tumor angiogenesis, resulting in dysfunctional tumor vasculature. Here, we designed a tumor endothelium–targeting chimeric peptide (UPI) for the purpose of inhibiting endogenous tumor endothelial epsins by competitively binding activated VEGFR2. We determined that the UPI peptide specifically targets tumor endothelial VEGFR2 through an unconventional binding mechanism that is driven by unique residues present only in the epsin ubiquitin–interacting motif (UIM) and the VEGFR2 kinase domain. In murine models of neoangiogenesis, UPI peptide increased VEGF-driven angiogenesis and neovascularization but spared quiescent vascular beds. Further, in tumor-bearing mice, UPI peptide markedly impaired functional tumor angiogenesis, tumor growth, and metastasis, resulting in a notable increase in survival. Coadministration of UPI peptide with cytotoxic chemotherapeutics further sustained tumor inhibition. Equipped with localized tumor endothelium–specific targeting, our UPI peptide provides potential for an effective and alternative cancer therapy. PMID:26571402

  5. Evidence for a gamma-turn motif in antifreeze glycopeptides.

    PubMed Central

    Drewes, J A; Rowlen, K L

    1993-01-01

    Knowledge of the secondary structure of antifreeze peptides (AFPs) and glycopeptides (AFGPs) is crucial to understanding the mechanism by which these molecules inhibit ice crystal growth. A polyproline type II helix is perhaps the most widely accepted conformation for active AFGPs; however, random coil and alpha-helix conformations have also been proposed. In this report we present vibrational spectroscopic evidence that the conformation of AFGPs in solution is not random, not alpha-helical, and not polyproline type II. Comparison of AFGP amide vibrational frequencies with those observed and calculated for beta and gamma-turns in other peptides strongly suggests that AFGPs contain substantial turn structure. Computer-generated molecular models were utilized to compare gamma-turn, beta-turn, and polyproline II structures. The gamma-turn motif is consistent with observed amide frequencies and results in a molecule with planar symmetry with respect to the disaccharides. This intriguing conformation may provide new insight into the unusual properties of AFGPs. PMID:8241413

  6. Peptide motif analysis predicts alphaviruses as triggers for rheumatoid arthritis.

    PubMed

    Hogeboom, Charissa

    2015-12-01

    Rheumatoid arthritis (RA) develops in response to both genetic and environmental factors. The strongest genetic determinant is HLA-DR, where polymorphisms within the P4 and P6 binding pockets confer elevated risk. However, low disease concordance across monozygotic twin pairs underscores the importance of an environmental factor, probably infectious. The goal of this investigation was to predict the microorganism most likely to interact with HLA-DR to trigger RA under the molecular mimicry hypothesis. A set of 185 structural proteins from viruses or intracellular bacteria was scanned for regions of sequence homology with a collagen peptide that binds preferentially to DR4; candidates were then evaluated against a motif required for T cell cross-reactivity. The plausibility of the predicted agent was evaluated by comparison of microbial prevalence patterns to epidemiological characteristics of RA. Peptides from alphavirus capsid proteins provided the closest fit. Variations in the P6 position suggest that the HLA binding preference may vary by species, with Ross River virus, Chikungunya virus, and Mayaro virus peptides binding preferentially to DR4, and peptides from Sindbis/Ockelbo virus showing stronger affinity to DR1. The predicted HLA preference is supported by epidemiological studies of post-infection chronic arthralgia. Parallels between the cytokine profiles of RA and chronic alphavirus infection are discussed.

  7. Ab initio coordination chemistry for nickel chelation motifs.

    PubMed

    Sudan, R Jesu Jaya; Kumari, J Lesitha Jeeva; Sudandiradoss, C

    2015-01-01

    Chelation therapy is one of the most appreciated methods in the treatment of metal-induced disease predisposition. Coordination chemistry provides a way to understand metal association in biological structures. In this work we have implemented coordination chemistry to study nickel coordination due to its high impact in industrial usage and thereby health consequences. This paper reports the analysis of nickel coordination from a large dataset of nickel-bound structures and sequences. Coordination patterns predicted from the structures are reported in terms of donors, chelate length, coordination number, chelate geometry, structural fold and architecture. The analysis revealed histidine as the most favored residue in nickel coordination. The most common chelates identified were histidine-based, namely HHH, HDH, HEH and HH spaced at specific intervals. Though a maximum coordination number of 8 was observed, the presence of a single protein donor was noted to be mandatory in nickel coordination. The coordination pattern did not reveal any specific fold; nevertheless, we report preferable residue spacing for specific structural architecture. In contrast, the analysis of nickel binding proteins from bacterial and archaeal species revealed no common coordination patterns. Nickel binding sequence motifs were noted to be organism specific and protein class specific. As a result we identified about 13 signatures derived from 13 classes of nickel binding proteins. The specifications on nickel coordination presented in this paper will prove beneficial for developing better chelation strategies.

  8. Mixture-based peptide libraries for identifying protease cleavage motifs.

    PubMed

    Turk, Benjamin E

    2009-01-01

    All proteases and peptidases are to some extent sequence-specific, in that one or more residues are preferred at particular positions surrounding the cleavage site in substrates. I describe here a general protocol for determining protease cleavage site preferences using mixture-based peptide libraries. Initially a completely random, amino-terminally capped peptide mixture is digested with the protease of interest, and the cleavage products are analyzed by automated Edman sequencing. The distribution of amino acids found in each sequencing cycle indicates which residues are preferred by the protease at positions downstream of the cleavage site. On the basis of these results, a second peptide library is designed that is partially degenerate and partially fixed sequence. Edman sequencing analysis of the cleavage products of this peptide mixture provides preferences amino-terminal to the scissile bond. As necessary, the process is reiterated until the full cleavage motif of the protease is known. Cleavage specificity data obtained with this method have been used to generate specific and efficient peptide substrates, to design potent and specific inhibitors, and to identify novel protease substrates.

  9. Eukaryotic penelope-like retroelements encode hammerhead ribozyme motifs.

    PubMed

    Cervera, Amelia; De la Peña, Marcos

    2014-11-01

    Small self-cleaving RNAs, such as the paradigmatic Hammerhead ribozyme (HHR), have been recently found widespread in DNA genomes across all kingdoms of life. In this work, we found that new HHR variants are preserved in the ancient family of Penelope-like elements (PLEs), a group of eukaryotic retrotransposons regarded as exceptional for encoding telomerase-like retrotranscriptases and spliceosomal introns. Our bioinformatic analysis revealed not only the presence of minimalist HHRs in the two flanking repeats of PLEs but also their massive and widespread occurrence in metazoan genomes. The architecture of these ribozymes indicates that they may work as dimers, although their low self-cleavage activity in vitro suggests the requirement of other factors in vivo. In plants, however, PLEs show canonical HHRs, whereas fungi and protist PLEs encode ribozyme variants with a stable active conformation as monomers. Overall, our data confirm the connection of self-cleaving RNAs with eukaryotic retroelements and unveil these motifs as a significant fraction of the encoded information in eukaryotic genomes. © The Author 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  10. Ab Initio Coordination Chemistry for Nickel Chelation Motifs

    PubMed Central

    Jesu Jaya Sudan, R.; Lesitha Jeeva Kumari, J.; Sudandiradoss, C.

    2015-01-01

    Chelation therapy is one of the most appreciated methods in the treatment of metal-induced disease predisposition. Coordination chemistry provides a way to understand metal association in biological structures. In this work we have implemented coordination chemistry to study nickel coordination due to its high impact in industrial usage and thereby health consequences. This paper reports the analysis of nickel coordination from a large dataset of nickel-bound structures and sequences. Coordination patterns predicted from the structures are reported in terms of donors, chelate length, coordination number, chelate geometry, structural fold and architecture. The analysis revealed histidine as the most favored residue in nickel coordination. The most common chelates identified were histidine-based, namely HHH, HDH, HEH and HH spaced at specific intervals. Though a maximum coordination number of 8 was observed, the presence of a single protein donor was noted to be mandatory in nickel coordination. The coordination pattern did not reveal any specific fold; nevertheless, we report preferable residue spacing for specific structural architecture. In contrast, the analysis of nickel binding proteins from bacterial and archaeal species revealed no common coordination patterns. Nickel binding sequence motifs were noted to be organism specific and protein class specific. As a result we identified about 13 signatures derived from 13 classes of nickel binding proteins. The specifications on nickel coordination presented in this paper will prove beneficial for developing better chelation strategies. PMID:25985439

  11. LibME-automatic extraction of 3D ligand-binding motifs for mechanistic analysis of protein-ligand recognition.

    PubMed

    He, Wei; Liang, Zhi; Teng, MaiKun; Niu, LiWen

    2016-12-01

    Identifying conserved binding motifs is an efficient way to study protein-ligand recognition. Most 3D binding motifs only contain information from the protein side, and so motifs that combine information from both protein and ligand sides are desired. Here, we propose an algorithm called LibME (Ligand-binding Motif Extractor), which automatically extracts 3D binding motifs composed of the target ligand and surrounding conserved residues. We show that the motifs extracted by LibME for ATP and its analogs are highly similar to well-known motifs reported by previous studies. The superiority of our method in handling flexible ligands was also demonstrated using isocitric acid as an example. Finally, we show that these motifs, together with their visual exhibition, permit better investigation and understanding of the protein-ligand recognition process.

  12. Peptide-based identification of functional motifs and their binding partners.

    PubMed

    Shelton, Martin N; Huang, Ming Bo; Ali, Syed; Johnson, Kateena; Roth, William; Powell, Michael; Bond, Vincent

    2013-06-30

    Specific short peptides derived from motifs found in full-length proteins, in our case HIV-1 Nef, not only retain their biological function, but can also competitively inhibit the function of the full-length protein. A set of 20 Nef scanning peptides, 20 amino acids in length with each overlapping 10 amino acids of its neighbor, was used to identify motifs in Nef responsible for its induction of apoptosis. Peptides containing these apoptotic motifs induced apoptosis at levels comparable to the full-length Nef protein. A second peptide, derived from the Secretion Modification Region (SMR) of Nef, retained the ability to interact with cellular proteins involved in Nef's secretion in exosomes (exNef). This SMRwt peptide was used as the "bait" protein in co-immunoprecipitation experiments to isolate cellular proteins that bind specifically to Nef's SMR motif. Protein transfection and antibody inhibition were used to physically disrupt the interaction between Nef and mortalin, one of the isolated SMR-binding proteins, and the effect was measured with a fluorescence-based exNef secretion assay. The SMRwt peptide's ability to outcompete full-length Nef for cellular proteins that bind the SMR motif makes it the first inhibitor of exNef secretion. Thus, by employing the techniques described here, which utilize the unique properties of specific short peptides derived from motifs found in full-length proteins, one may accelerate the identification of functional motifs in proteins and the development of peptide-based inhibitors of pathogenic functions.
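
    The scanning-peptide layout described above (windows of 20 residues, each overlapping its neighbor by 10) can be sketched as a simple tiling routine. This is only an illustration: the function name `scanning_peptides` and the placeholder sequence are assumptions, and the placeholder is not the real Nef sequence.

```python
def scanning_peptides(sequence, length=20, overlap=10):
    """Tile `sequence` with fixed-length windows that overlap their neighbors.

    Returns a list of (1-based start position, peptide) pairs. The last
    peptide may be shorter than `length` if the sequence does not tile evenly.
    """
    if not 0 <= overlap < length:
        raise ValueError("overlap must satisfy 0 <= overlap < length")
    step = length - overlap
    peptides = []
    for start in range(0, len(sequence), step):
        peptides.append((start + 1, sequence[start:start + length]))
        if start + length >= len(sequence):
            break  # the window already reaches the end of the sequence
    return peptides


# Hypothetical 210-residue placeholder (NOT the real Nef sequence),
# chosen so that the tiling yields 20 full-length windows.
placeholder = "ACDEFGHIKLMNPQRSTVWY" * 10 + "ACDEFGHIKL"
peptides = scanning_peptides(placeholder)
for start, peptide in peptides[:3]:
    print(start, peptide)
```

    With a step of 10 residues, each peptide shares its last 10 residues with the first 10 of the next one, so any motif of up to 11 residues is fully contained in at least one peptide.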

  13. Mixotrophy and intraguild predation - dynamic consequences of shifts between food web motifs

    NASA Astrophysics Data System (ADS)

    Karnatak, Rajat; Wollrab, Sabine

    2017-06-01

    Mixotrophy is ubiquitous in microbial communities of aquatic systems with many flagellates being able to use autotroph as well as heterotroph pathways for energy acquisition. The usage of one over the other pathway is associated with resource availability and the coupling of alternative pathways has strong implications for system stability. We investigated the impact of dominance of different energy pathways related to relative resource availability on system dynamics in the setting of a tritrophic food web motif. This motif consists of a mixotroph feeding on a purely autotroph species while competing for a shared resource. In addition, the autotroph can use an additional exclusive food source. By changing the relative abundance of shared vs. exclusive food source, we shift the food web motif from an intraguild predation motif to a food chain motif. We analyzed the dependence of system dynamics on absolute and relative resource availability. In general, the system exhibits a transition from stable to oscillatory dynamics with increasing nutrient availability. However, this transition occurs at a much lower nutrient level for the food chain in comparison to the intraguild predation motif. A similar transition is also observed with variations in the relative abundance of food sources for a range of nutrient levels. We expect this shift in food web motifs to occur frequently in microbial communities and therefore the results from our study are highly relevant for natural systems.

  14. Trend Motif: A Graph Mining Approach for Analysis of Dynamic Complex Networks

    SciTech Connect

    Jin, R; McCallen, S; Almaas, E

    2007-05-28

    Complex networks have been used successfully in scientific disciplines ranging from sociology to microbiology to describe systems of interacting units. Until recently, studies of complex networks have mainly focused on their network topology. However, in many real world applications, the edges and vertices have associated attributes that are frequently represented as vertex or edge weights. Furthermore, these weights are often not static, instead changing with time and forming a time series. Hence, to fully understand the dynamics of the complex network, we have to consider both network topology and related time series data. In this work, we propose a motif mining approach to identify trend motifs for such purposes. Simply stated, a trend motif describes a recurring subgraph where each of its vertices or edges displays similar dynamics over a user-defined period. Given this, each trend motif occurrence can help reveal significant events in a complex system; frequent trend motifs may aid in uncovering dynamic rules of change for the system, and the distribution of trend motifs may characterize the global dynamics of the system. Here, we have developed efficient mining algorithms to extract trend motifs. Our experimental validation using three disparate empirical datasets, ranging from the stock market and world trade to a protein interaction network, has demonstrated the efficiency and effectiveness of our approach.

  15. MODA: an efficient algorithm for network motif discovery in biological networks.

    PubMed

    Omidi, Saeed; Schreiber, Falk; Masoudi-Nejad, Ali

    2009-10-01

    In recent years, interest has been growing in the study of complex networks. Since Erdös and Rényi (1960) proposed their random graph model about 50 years ago, many researchers have investigated and shaped this field. Many indicators have been proposed to assess the global features of networks. Recently, an active research area has developed in studying local features named motifs as the building blocks of networks. Unfortunately, network motif discovery is a computationally hard problem and finding rather large motifs (larger than 8 nodes) by means of current algorithms is impractical as it demands too much computational effort. In this paper, we present a new algorithm (MODA) that incorporates techniques such as a pattern growth approach for extracting larger motifs efficiently. We have tested our algorithm and found it able to identify larger motifs with more than 8 nodes more efficiently than most of the current state-of-the-art motif discovery algorithms. While most of the algorithms rely on induced subgraphs as motifs of the networks, MODA is able to extract both induced and non-induced subgraphs simultaneously. The MODA source code is freely available at: http://LBB.ut.ac.ir/Download/LBBsoft/MODA/

  16. Recurrent motifs as resonant attractor states in the narrative field: a testable model of archetype.

    PubMed

    Goodwyn, Erik

    2013-06-01

    At the most basic level, archetypes represented Jung's attempt to explain the phenomenon of recurrent myths and folktale motifs (Jung 1956, 1959, para. 99). But the archetype remains controversial as an explanation of recurrent motifs, as the existence of recurrent motifs does not prove that archetypes exist. Thus, the challenge for contemporary archetype theory is not merely to demonstrate that recurrent motifs exist, since that is not disputed, but to demonstrate that archetypes exist and cause recurrent motifs. The present paper proposes a new model which is unlike others in that it postulates how the archetype creates resonant motifs. This model necessarily clarifies and adapts some of Jung's seminal ideas on archetype in order to provide a working framework grounded in contemporary practice and methodologies. For the first time, a model of archetype is proposed that can be validated on empirical, rather than theoretical grounds. This is achieved by linking the archetype to the hard data of recurrent motifs rather than academic trends in other fields.

  17. Fission yeast hotspot sequence motifs are also active in budding yeast.

    PubMed

    Steiner, Walter W; Steiner, Estelle M

    2012-01-01

    In most organisms, including humans, meiotic recombination occurs preferentially at a limited number of sites in the genome known as hotspots. There has been substantial progress recently in elucidating the factors determining the location of meiotic recombination hotspots, and it is becoming clear that simple sequence motifs play a significant role. In S. pombe, there are at least five unique sequence motifs that have been shown to produce hotspots of recombination, and it is likely that there are more. In S. cerevisiae, simple sequence motifs have also been shown to produce hotspots or show significant correlations with hotspots. Some of the hotspot motifs in both yeasts are known or suspected to bind transcription factors (TFs), which are required for the activity of those hotspots. Here we show that four of the five hotspot motifs identified in S. pombe also create hotspots in the distantly related budding yeast S. cerevisiae. For one of these hotspots, M26 (also called CRE), we identify TFs, Cst6 and Sko1, that activate and inhibit the hotspot, respectively. In addition, two of the hotspot motifs show significant correlations with naturally occurring hotspots. The conservation of these hotspots between the distantly related fission and budding yeasts suggests that these sequence motifs, and others yet to be discovered, may function widely as hotspots in many diverse organisms.

  18. CircularLogo: A lightweight web application to visualize intra-motif dependencies.

    PubMed

    Ye, Zhenqing; Ma, Tao; Kalmbach, Michael T; Dasari, Surendra; Kocher, Jean-Pierre A; Wang, Liguo

    2017-05-22

    The sequence logo has been widely used to represent DNA or RNA motifs for more than three decades. Despite its intelligibility and intuitiveness, the traditional sequence logo is unable to display the intra-motif dependencies and therefore is insufficient to fully characterize nucleotide motifs. Many methods have been developed to quantify the intra-motif dependencies, but fewer tools are available for visualization. We developed CircularLogo, a web-based interactive application, which is able to not only visualize the position-specific nucleotide consensus and diversity but also display the intra-motif dependencies. Applying CircularLogo to HNF6 binding sites and tRNA sequences demonstrated its ability to show intra-motif dependencies and intuitively reveal biomolecular structure. CircularLogo is implemented in JavaScript and Python based on the Django web framework. The program's source code and user's manual are freely available at http://circularlogo.sourceforge.net . CircularLogo web server can be accessed from http://bioinformaticstools.mayo.edu/circularlogo/index.html . CircularLogo is an innovative web application that is specifically designed to visualize and interactively explore intra-motif dependencies.

  19. Analysis of repetitive amino acid motifs reveals the essential features of spider dragline silk proteins.

    PubMed

    Malay, Ali D; Arakawa, Kazuharu; Numata, Keiji

    2017-01-01

    The extraordinary mechanical properties of spider dragline silk are dependent on the highly repetitive sequences of the component proteins, major ampullate spidroin 1 and 2 (MaSp1 and MaSp2). MaSp sequences are dominated by repetitive modules composed of short amino acid motifs; however, the patterns of motif conservation through evolution and their relevance to silk characteristics are not well understood. We performed a systematic analysis of MaSp sequences encompassing infraorder Araneomorphae based on the conservation of explicitly defined motifs, with the aim of elucidating the essential elements of MaSp1 and MaSp2. The results show that the GGY motif is nearly ubiquitous in the two types of MaSp, while MaSp2 is invariably associated with GP and di-glutamine (QQ) motifs. Further analysis revealed an extended MaSp2 consensus sequence in family Araneidae, with implications for the classification of the archetypal spidroins ADF3 and ADF4. Additionally, the analysis of RNA-seq data showed the expression of a set of distinct MaSp-like variants in genus Tetragnatha. Finally, an apparent association was uncovered between web architecture and the abundance of GP, QQ, and GGY motifs in MaSp2, which suggests a co-expansion of these motifs in response to the evolution of spiders' prey capture strategy.
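
    Counting explicitly defined short motifs such as GGY, GP and QQ, as described above, can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: the function name `count_motifs` and the toy sequence are assumptions, and overlapping occurrences are counted (so QQQ contains two QQ).

```python
import re


def count_motifs(sequence, motifs=("GGY", "GP", "QQ")):
    """Count occurrences of each motif in `sequence`, including overlaps.

    A zero-width lookahead is used so that overlapping matches are counted.
    """
    return {
        motif: len(re.findall(f"(?={re.escape(motif)})", sequence))
        for motif in motifs
    }


# Hypothetical toy repeat, not a real spidroin sequence.
toy = "GGYGPGQQGPGGYQQQ"
counts = count_motifs(toy)
print(counts)
```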

  20. SVM2Motif—Reconstructing Overlapping DNA Sequence Motifs by Mimicking an SVM Predictor

    PubMed Central

    Vidovic, Marina M. -C.; Görnitz, Nico; Müller, Klaus-Robert; Rätsch, Gunnar; Kloft, Marius

    2015-01-01

    Identifying discriminative motifs underlying the functionality and evolution of organisms is a major challenge in computational biology. Machine learning approaches such as support vector machines (SVMs) achieve state-of-the-art performances in genomic discrimination tasks, but—due to its black-box character—motifs underlying its decision function are largely unknown. As a remedy, positional oligomer importance matrices (POIMs) allow us to visualize the significance of position-specific subsequences. Although being a major step towards the explanation of trained SVM models, they suffer from the fact that their size grows exponentially in the length of the motif, which renders their manual inspection feasible only for comparably small motif sizes, typically k ≤ 5. In this work, we extend the work on positional oligomer importance matrices, by presenting a new machine-learning methodology, entitled motifPOIM, to extract the truly relevant motifs—regardless of their length and complexity—underlying the predictions of a trained SVM model. Our framework thereby considers the motifs as free parameters in a probabilistic model, a task which can be phrased as a non-convex optimization problem. The exponential dependence of the POIM size on the oligomer length poses a major numerical challenge, which we address by an efficient optimization framework that allows us to find possibly overlapping motifs consisting of up to hundreds of nucleotides. We demonstrate the efficacy of our approach on a synthetic data set as well as a real-world human splice site data set. PMID:26690911

  1. A novel motif in telomerase reverse transcriptase regulates telomere repeat addition rate and processivity

    PubMed Central

    Xie, Mingyi; Podlevsky, Joshua D.; Qi, Xiaodong; Bley, Christopher J.; Chen, Julian J.-L.

    2010-01-01

    Telomerase is a specialized reverse transcriptase that adds telomeric DNA repeats onto chromosome termini. Here, we characterize a new telomerase-specific motif, called motif 3, in the catalytic domain of telomerase reverse transcriptase, that is crucial for telomerase function and evolutionally conserved between vertebrates and ciliates. Comprehensive mutagenesis of motif 3 identified mutations that remarkably increase the rate or alter the processivity of telomere repeat addition. Notably, the rate and processivity of repeat addition are affected independently by separate motif 3 mutations. The processive telomerase action relies upon a template translocation mechanism whereby the RNA template and the telomeric DNA strand separate and realign between each repeat synthesis. By analyzing the mutant telomerases reconstituted in vitro and in cells, we show that the hyperactive mutants exhibit higher repeat addition rates and faster enzyme turnovers, suggesting higher rates of strand-separation during template translocation. In addition, the strong correlation between the processivity of the motif 3 mutants and their ability to use an 8 nt DNA primer, suggests that motif 3 facilitates realignment between the telomeric DNA and the template RNA following strand-separation. These findings support motif 3 as a key determinant for telomerase activity and processivity. PMID:20044353

  2. Coagulase and Efb of Staphylococcus aureus Have a Common Fibrinogen Binding Motif

    PubMed Central

    Ko, Ya-Ping; Kang, Mingsong; Ganesh, Vannakambadi K.; Ravirajan, Dharmanand; Li, Bin

    2016-01-01

    ABSTRACT Coagulase (Coa) and Efb, secreted Staphylococcus aureus proteins, are important virulence factors in staphylococcal infections. Coa interacts with fibrinogen (Fg) and induces the formation of fibrin(ogen) clots through activation of prothrombin. Efb attracts Fg to the bacterial surface and forms a shield to protect the bacteria from phagocytic clearance. This communication describes the use of an array of synthetic peptides to identify variants of a linear Fg binding motif present in Coa and Efb which are responsible for the Fg binding activities of these proteins. This motif represents the first Fg binding motif identified for any microbial protein. We initially located the Fg binding sites to Coa’s C-terminal disordered segment containing tandem repeats by using recombinant fragments of Coa in enzyme-linked immunosorbent assay-type binding experiments. Sequence analyses revealed that this Coa region contained shorter segments with sequences similar to the Fg binding segments in Efb. An alanine scanning approach allowed us to identify the residues in Coa and Efb that are critical for Fg binding and to define the Fg binding motifs in the two proteins. In these motifs, the residues required for Fg binding are largely conserved, and they therefore constitute variants of a common Fg binding motif which binds to Fg with high affinity. Defining a specific motif also allowed us to identify a functional Fg binding register for the Coa repeats that is different from the repeat unit previously proposed. PMID:26733070

  3. Repression domains of class II ERF transcriptional repressors share an essential motif for active repression.

    PubMed

    Ohta, M; Matsui, K; Hiratsu, K; Shinshi, H; Ohme-Takagi, M

    2001-08-01

    We reported previously that three ERF transcription factors, tobacco ERF3 (NtERF3) and Arabidopsis AtERF3 and AtERF4, which are categorized as class II ERFs, are active repressors of transcription. To clarify the roles of these repressors in transcriptional regulation in plants, we attempted to identify the functional domains of the ERF repressor that mediates the repression of transcription. Analysis of the results of a series of deletions revealed that the C-terminal 35 amino acids of NtERF3 are sufficient to confer the capacity for repression of transcription on a heterologous DNA binding domain. This repression domain suppressed the intermolecular activities of other transcriptional activators. In addition, fusion of this repression domain to the VP16 activation domain completely inhibited the transactivation function of VP16. Comparison of amino acid sequences of class II ERF repressors revealed the conservation of the sequence motif (L)/(F)DLN(L)/(F)(x)P. This motif was essential for repression because mutations within the motif eliminated the capacity for repression. We designated this motif the ERF-associated amphiphilic repression (EAR) motif, and we identified this motif in a number of zinc-finger proteins from wheat, Arabidopsis, and petunia plants. These zinc finger proteins functioned as repressors, and their repression domains were identified as regions that contained an EAR motif.
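
    The consensus (L)/(F)DLN(L)/(F)(x)P quoted above can be read as a pattern with two L-or-F positions and one unconstrained position. A minimal sketch of scanning a protein sequence for it, assuming that reading; the regular expression, the function name `find_ear_motifs` and the toy sequences are illustrative assumptions, not taken from the paper:

```python
import re

# Assumed reading of (L)/(F)DLN(L)/(F)(x)P:
#   [LF] D L N [LF] <any residue> P
EAR_PATTERN = re.compile(r"[LF]DLN[LF].P")


def find_ear_motifs(sequence):
    """Return (1-based start position, matched segment) for each hit."""
    return [(m.start() + 1, m.group()) for m in EAR_PATTERN.finditer(sequence)]


# Hypothetical toy sequences, not real ERF proteins.
wild_type = "MSSLDLNLAPAA"
mutant = "MSSLALNLAPAA"  # D -> A substitution inside the motif

print(find_ear_motifs(wild_type))
print(find_ear_motifs(mutant))
```

    A single substitution inside the conserved DLN core removes the match, which mirrors the kind of motif mutation the abstract describes.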

  4. Identification of a putative nuclear export signal motif in human NANOG homeobox domain

    SciTech Connect

    Park, Sung-Won; Do, Hyun-Jin; Huh, Sun-Hyung; Sung, Boreum; Uhm, Sang-Jun; Song, Hyuk; Kim, Nam-Hyung; Kim, Jae-Hwan

    2012-05-11

    Highlights: ► We found the putative nuclear export signal motif within human NANOG homeodomain. ► Leucine-rich residues are important for human NANOG homeodomain nuclear export. ► CRM1-specific inhibitor LMB blocked the potent human NANOG NES-mediated nuclear export. -- Abstract: NANOG is a homeobox-containing transcription factor that plays an important role in pluripotent stem cells and tumorigenic cells. To understand how nuclear localization of human NANOG is regulated, the NANOG sequence was examined and a leucine-rich nuclear export signal (NES) motif (¹²⁵MQELSNILNL¹³⁴) was found in the homeodomain (HD). To functionally validate the putative NES motif, deletion and site-directed mutants were fused to an EGFP expression vector and transfected into COS-7 cells, and the localization of the proteins was examined. While hNANOG HD exclusively localized to the nucleus, a mutant with both NLSs deleted and only the putative NES motif contained (hNANOG HD-ΔNLSs) was predominantly cytoplasmic, as observed by nucleo/cytoplasmic fractionation and Western blot analysis as well as confocal microscopy. Furthermore, site-directed mutagenesis of the putative NES motif in a partial hNANOG HD only containing either one of the two NLS motifs led to localization in the nucleus, suggesting that the NES motif may play a functional role in nuclear export. Furthermore, CRM1-specific nuclear export inhibitor LMB blocked the hNANOG potent NES-mediated export, suggesting that the leucine-rich motif may function in CRM1-mediated nuclear export of hNANOG. Collectively, a NES motif is present in the hNANOG HD and may be functionally involved in CRM1-mediated nuclear export pathway.

  5. Dispom: a discriminative de-novo motif discovery tool based on the jstacs library.

    PubMed

    Grau, Jan; Keilwagen, Jens; Gohr, André; Paponov, Ivan A; Posch, Stefan; Seifert, Michael; Strickert, Marc; Grosse, Ivo

    2013-02-01

    DNA-binding proteins are a main component of gene regulation as they activate or repress gene expression by binding to specific binding sites in target regions of genomic DNA. However, de-novo discovery of these binding sites in target regions obtained by wet-lab experiments is a challenging problem in computational biology, which has not yet been solved satisfactorily. Here, we present a detailed description and analysis of the de-novo motif discovery tool Dispom, which has been developed for finding binding sites of DNA-binding proteins that are differentially abundant in a set of target regions compared to a set of control regions. Two additional features of Dispom are its capability of modeling positional preferences of binding sites and adjusting the length of the motif in the learning process. Dispom yields an increased prediction accuracy compared to existing tools for de-novo motif discovery, suggesting that the combination of searching for differentially abundant motifs, inferring their positional distributions, and adjusting the motif lengths is beneficial for de-novo motif discovery. When applying Dispom to promoters of auxin-responsive genes and those of ABI3 target genes from Arabidopsis thaliana, we identify relevant binding motifs with pronounced positional distributions. These results suggest that learning motifs, their positional distributions, and their lengths by a discriminative learning principle may aid motif discovery from ChIP-chip and gene expression data. We make Dispom freely available as part of Jstacs, an open-source Java library that is tailored to statistical sequence analysis. To facilitate extensions of Dispom, we describe its implementation using Jstacs in this manuscript. In addition, we provide a stand-alone application of Dispom at http://www.jstacs.de/index.php/Dispom for instant use.

  6. Regulatory motifs are present in the ITS1 of some flatworm species.

    PubMed

    Van Herwerden, Lynne; Caley, M Julian; Blair, David

    2003-04-15

    Particular sequence motifs can act as transcription regulators. Because the total regulatory effects of such motifs can be related to their abundance, their presence might be expected at locations within the genome where sequences are repeated. Multiple repeats that vary in number among individuals occur within the ribosomal first internal transcribed spacer (ITS1) in some species in three trematode genera: Paragonimus, Schistosoma and Dolichosaccus. In all of these genera we found in ITS1, sequences identical to known enhancer motifs. We also searched for, and identified, known regulatory motifs in published ITS1 sequences of other parasitic flatworms including Echinostoma spp. (Trematoda) and Echinococcus spp. (Cestoda) which lack multiple repeats in ITS1. We present three lines of evidence that this widespread occurrence of such motifs within the ITS1 of parasitic flatworms may indicate a functional role in regulating tissue- or stage-specific transcription of ribosomal genes. First, these motifs are identical to ones whose functional roles have been established using in vitro assays of transcriptional rates. Second, in all 18 species investigated here, between one and three different regulatory motifs were identified. In 14 of these 18 species, the probability that at least one of these motifs occurred because of the random assortment of bases within the regions investigated was 10% or less. In 12 of these 14 species, the probability was 5% or less. Third, the evolutionary divergence of flatworm species investigated is quite ancient. Therefore, the interspecific distribution of motifs observed here, in a rapidly evolving region such as ITS1, is unlikely to be attributable solely to shared evolutionary histories. These results, therefore, suggest a broader functional role for the ITS1 than previously thought.
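
    The chance-occurrence argument above can be illustrated with a rough back-of-the-envelope calculation. The sketch below is not the authors' method: it assumes equal base frequencies and treats start positions as independent, and the function name and example numbers are hypothetical.

```python
def chance_of_motif(motif_length, region_length, base_probability=0.25):
    """Approximate P(at least one exact occurrence of a fixed motif)
    in a random sequence of `region_length` bases.

    Assumptions (simplifications, labeled as such): every base has the
    same probability, and the start positions are treated as independent.
    """
    positions = region_length - motif_length + 1
    if positions <= 0:
        return 0.0  # the motif cannot fit in the region
    p_single = base_probability ** motif_length
    return 1.0 - (1.0 - p_single) ** positions


# Hypothetical example: a 6-base motif in a 600-base region.
p = chance_of_motif(6, 600)
print(round(p, 3))
```

    Under these assumptions, shorter regions or longer motifs push the probability of a chance match down, which is the intuition behind treating low probabilities as evidence against purely random occurrence.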

  7. Minimal motif peptide structure of metzincin clan zinc peptidases in micelles.

    PubMed

    Onoda, Akira; Suzuki, Takako; Ishizuka, Hiroaki; Sugiyama, Rumiko; Ariyasu, Shinya; Yamamura, Takeshi

    2009-12-01

    It is well known that the functions of metalloproteins generally originate from their metal-binding motifs. However, the intrinsic nature of individual motifs remains unknown, particularly the details about metal-binding effects on the folding of motifs; the converse is also unknown, although there is no doubt that the motif is the core of the reactivity for each metalloprotein. In this study, we focused our attention on the zinc-binding motif of the metzincin clan family, HEXXHXXGXXH; this family contains the general zinc-binding sequence His-Glu-Xaa-Xaa-His (HEXXH) and the extended GXXH region. We adopted the motif sequence of stromelysin-1 and investigated the folding properties of the Trp-labeled peptides WAHEIAHSLGLFHA (STR-W1), AWHEIAHSLGLFHA (STR-W2), AHEIAHSLGWFHA (STR-W11), and AHEIAHSLGLFHWA (STR-W14) in the presence and absence of zinc ions in hydrophobic micellar environments by circular dichroism (CD) measurements. We assessed successful incorporation of these zinc peptides into micelles using quenching of Trp fluorescence. Results of CD studies indicated that two of the Trp-incorporated peptides, STR-W1 and STR-W14, exhibited helical folding in the hydrophobic region of cetyltrimethylammonium chloride micelle. The NMR structural analysis of the apo STR-W14 revealed that the conformation in the C-terminus GXXH region significantly differed between the apo state in the micelle and the reported Zn-bound state of stromelysin-1 in crystal structures. The structural analyses of the qualitative Zn-binding properties of this motif peptide provide an interesting Zn-binding mechanism: the minimum consensus motif in the metzincin clan, a basic zinc-binding motif with an extended GXXH region, has the potential to serve as a preorganized Zn binding scaffold in a hydrophobic environment.
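
    The HEXXHXXGXXH consensus can be checked directly against the four Trp-labeled peptides listed above. A minimal sketch, treating each X as any single residue; the regular expression and the function name `motif_start` are illustrative assumptions:

```python
import re

# X is read as "any single residue".
METZINCIN = re.compile(r"HE..H..G..H")

# Peptide sequences as given in the abstract.
peptides = {
    "STR-W1": "WAHEIAHSLGLFHA",
    "STR-W2": "AWHEIAHSLGLFHA",
    "STR-W11": "AHEIAHSLGWFHA",
    "STR-W14": "AHEIAHSLGLFHWA",
}


def motif_start(sequence):
    """Return the 1-based start of the first HEXXHXXGXXH match, or None."""
    match = METZINCIN.search(sequence)
    return match.start() + 1 if match else None


starts = {name: motif_start(seq) for name, seq in peptides.items()}
print(starts)
```

    All four peptides retain the full consensus; the Trp label sits either outside the motif or at one of the unconstrained X positions (as in STR-W11).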

  8. P-value-based regulatory motif discovery using positional weight matrices.

    PubMed

    Hartmann, Holger; Guthöhrlein, Eckhart W; Siebert, Matthias; Luehr, Sebastian; Söding, Johannes

    2013-01-01

    To analyze gene regulatory networks, the sequence-dependent DNA/RNA binding affinities of proteins and noncoding RNAs are crucial. Often, these are deduced from sets of sequences enriched in factor binding sites. Two classes of computational approaches exist. The first describes binding motifs by sequence patterns and searches for the patterns with the highest statistical significance of enrichment. The second class uses the more powerful position weight matrices (PWMs). Instead of maximizing the statistical significance of enrichment, methods in this class maximize a likelihood. Here we present XXmotif (eXhaustive evaluation of matriX motifs), the first PWM-based motif discovery method that can optimize PWMs by directly minimizing their P-values of enrichment. Optimization requires computing millions of enrichment P-values for thousands of PWMs. For a given PWM, the enrichment P-value is calculated efficiently from the match P-values of all possible motif placements in the input sequences using order statistics. The approach can naturally combine P-values for motif enrichment, conservation, and localization. On ChIP-chip/seq, miRNA knock-down, and coexpression data sets from yeast and metazoans, XXmotif outperformed state-of-the-art tools, both in numbers of correctly identified motifs and in the quality of PWMs. In segmentation modules of D. melanogaster, we detect the known key regulators and several new motifs. In human core promoters, XXmotif reports most previously described and eight novel motifs sharply peaked around the transcription start site, among them an Initiator motif similar to the fly and yeast versions. XXmotif's sensitivity, reliability, and usability will help to leverage the quickly accumulating wealth of functional genomics data.

  9. Characterization of the tandem CWCH2 sequence motif: a hallmark of inter-zinc finger interactions

    PubMed Central

    2010-01-01

    Background The C2H2 zinc finger (ZF) domain is widely conserved among eukaryotic proteins. In Zic/Gli/Zap1 C2H2 ZF proteins, the two N-terminal ZFs form a single structural unit by sharing a hydrophobic core. This structural unit defines a new motif comprised of two tryptophan side chains at the center of the hydrophobic core. Because each tryptophan residue is located between the two cysteine residues of the C2H2 motif, we have named this structure the tandem CWCH2 (tCWCH2) motif. Results Here, we characterized 587 tCWCH2-containing genes using data derived from public databases. We categorized genes into 11 classes including Zic/Gli/Glis, Arid2/Rsc9, PacC, Mizf, Aebp2, Zap1/ZafA, Fungl, Zfp106, Twincl, Clr1, and Fungl-4ZF, based on sequence similarity, domain organization, and functional similarities. tCWCH2 motifs are mostly found in organisms belonging to the Opisthokonta (metazoa, fungi, and choanoflagellates) and Amoebozoa (amoeba, Dictyostelium discoideum). By comparison, the C2H2 ZF motif is distributed widely among the eukaryotes. The structure and organization of the tCWCH2 motif, its phylogenetic distribution, and molecular phylogenetic analysis suggest that prototypical tCWCH2 genes existed in the Opisthokonta ancestor. Within-group or between-group comparisons of the tCWCH2 amino acid sequence identified three additional sequence features (site-specific amino acid frequencies, longer linker sequence between two C2H2 ZFs, and frequent extra-sequences within C2H2 ZF motifs). Conclusion These features suggest that the tCWCH2 motif is a specialized motif involved in inter-zinc finger interactions. PMID:20167128

  10. A greedy strategy for finding motifs from yes-no examples.

    PubMed

    Tateishi, E; Miyano, S

    1996-01-01

    We define a motif as an expression Z1.Z2...Zn with sets Z1, Z2,..., Zn of strings in a specified family omega called the type. This notion can capture most of the motifs in PROSITE as well as regular pattern languages. A greedy strategy is developed for finding such motifs with ambiguity just from positive and negative examples by exploiting a probabilistic argument. This paper concentrates on describing the idea of the greedy algorithm and its underlying theory. Experimental results on splicing sites and E. coli promoters are also presented.
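
    As a rough illustration of the motif notion Z1.Z2...Zn, the sketch below treats a motif as a list of string sets and extends it greedily from yes-no examples. The matching rule (a string matches if it contains z1z2...zn with each zi taken from Zi), the score (matched positives minus matched negatives), and the append-only greedy loop are all assumptions made for illustration; the abstract does not give the paper's actual algorithm or its probabilistic argument.

```python
def matches(motif, text):
    """True if text contains z1 + z2 + ... + zn with each zi drawn from motif[i].

    Assumed matching rule; an empty motif matches every string.
    """
    def match_at(pos, idx):
        if idx == len(motif):
            return True
        return any(
            text.startswith(z, pos) and match_at(pos + len(z), idx + 1)
            for z in motif[idx]
        )

    return any(match_at(start, 0) for start in range(len(text) + 1))


def score(motif, positives, negatives):
    """Assumed objective: matched positive examples minus matched negatives."""
    hits = sum(matches(motif, s) for s in positives)
    false_hits = sum(matches(motif, s) for s in negatives)
    return hits - false_hits


def greedy_motif(candidate_sets, positives, negatives, max_len=3):
    """Append the best-scoring candidate set until the score stops improving."""
    motif = []
    best = score(motif, positives, negatives)
    for _ in range(max_len):
        scored = [
            (score(motif + [z], positives, negatives), i)
            for i, z in enumerate(candidate_sets)
        ]
        top_score, top_index = max(scored)
        if top_score <= best:
            break
        motif.append(candidate_sets[top_index])
        best = top_score
    return motif, best


# Hypothetical toy data: the candidate sets play the role of the "type".
candidates = [frozenset({"A", "G"}), frozenset({"T"}), frozenset({"C"})]
positives = ["AT", "GT", "CGT"]
negatives = ["TA", "CC", "TT"]
motif, best = greedy_motif(candidates, positives, negatives)
# motif is [{A, G}, {T}] and best is 3: all positives match, no negatives do.
```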

  11. The coxsackievirus A9 RGD motif is not essential for virus viability.

    PubMed Central

    Hughes, P J; Horsnell, C; Hyypiä, T; Stanway, G

    1995-01-01

    An RGD (arginine-glycine-aspartic acid) motif in coxsackievirus A9 has been implicated in internalization through an interaction with the integrin alpha v beta 3. We have produced a number of virus mutants, lacking the motif, which have a small-plaque phenotype in LLC-Mk2 and A-Vero cells and are phenotypically normal in RD cells. Substitution of flanking amino acids also affected plaque size. The results suggest that interaction between the RGD motif and alpha v beta 3 is not critical for virus viability in the cell lines tested and therefore that alternative regions of the CAV-9 capsid are involved in internalization. PMID:7494317

  12. ELM 2016—data update and new functionality of the eukaryotic linear motif resource

    PubMed Central

    Dinkel, Holger; Van Roey, Kim; Michael, Sushama; Kumar, Manjeet; Uyar, Bora; Altenberg, Brigitte; Milchevskaya, Vladislava; Schneider, Melanie; Kühn, Helen; Behrendt, Annika; Dahl, Sophie Luise; Damerell, Victoria; Diebel, Sandra; Kalman, Sara; Klein, Steffen; Knudsen, Arne C.; Mäder, Christina; Merrill, Sabina; Staudt, Angelina; Thiel, Vera; Welti, Lukas; Davey, Norman E.; Diella, Francesca; Gibson, Toby J.

    2016-01-01

    The Eukaryotic Linear Motif (ELM) resource (http://elm.eu.org) is a manually curated database of short linear motifs (SLiMs). In this update, we present the latest additions to this resource, along with more improvements to the web interface. ELM 2016 contains more than 240 different motif classes with over 2700 experimentally validated instances, manually curated from more than 2400 scientific publications. In addition, more data have been made available as individually searchable pages and are downloadable in various formats. PMID:26615199

  13. Phosphopeptide interactions with BRCA1 BRCT domains: More than just a motif.

    PubMed

    Wu, Qian; Jubb, Harry; Blundell, Tom L

    2015-03-01

    BRCA1 BRCT domains function as phosphoprotein-binding modules for recognition of the phosphorylated protein-sequence motif pSXXF. While the motif interaction interface provides strong anchor points for binding, protein regions outside the motif have recently been found to be important for binding affinity. In this review, we compare the available structural data for BRCA1 BRCT domains in complex with phosphopeptides in order to gain a more complete understanding of the interaction between phosphopeptides and BRCA1-BRCT domains.
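
    A sequence motif such as pSXXF can be searched for with a simple pattern scan. The sketch below is only an illustration: phosphorylation state cannot be read from a plain amino-acid sequence, so it reports candidate S-X-X-F sites that would still need experimental confirmation. The function name and the example sequence are hypothetical.

```python
import re

# Serine, any two residues, phenylalanine. The lookahead lets overlapping
# candidate sites be reported as well.
SXXF_PATTERN = re.compile(r"(?=(S[A-Z]{2}F))")


def find_sxxf_sites(sequence):
    """Return (1-based position of the serine, matched tetrapeptide) pairs."""
    return [
        (match.start() + 1, match.group(1))
        for match in SXXF_PATTERN.finditer(sequence.upper())
    ]


sites = find_sxxf_sites("MASPTFSSAFK")
# sites == [(3, "SPTF"), (7, "SSAF")]
```

    As the abstract notes, regions outside the motif also contribute to binding affinity, so a pattern hit on its own says little about binding strength.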

  14. Prevalence of the EH1 Groucho interaction motif in the metazoan Fox family of transcriptional regulators

    PubMed Central

    Yaklichkin, Sergey; Vekker, Alexander; Stayrook, Steven; Lewis, Mitchell; Kessler, Daniel S

    2007-01-01

    Background: The Fox gene family comprises a large and functionally diverse group of forkhead-related transcriptional regulators, many of which are essential for metazoan embryogenesis and physiology. Defining conserved functional domains that mediate the transcriptional activity of Fox proteins will contribute to a comprehensive understanding of the biological function of Fox family genes. Results: Systematic analysis of 458 protein sequences of the metazoan Fox family was performed to identify the presence of the engrailed homology-1 motif (eh1), a motif known to mediate physical interaction with transcriptional corepressors of the TLE/Groucho family. Greater than 50% of Fox proteins contain sequences with high similarity to the eh1 motif, including ten of the nineteen Fox subclasses (A, B, C, D, E, G, H, I, L, and Q) and Fox proteins of early divergent species such as marine sponge. The eh1 motif is not detected in Fox proteins of the F, J, K, M, N, O, P, R and S subclasses, or in yeast Fox proteins. The eh1-like motifs are positioned C-terminal to the winged helix DNA-binding domain in all subclasses except for FoxG proteins, which have an N-terminal motif. Two similar eh1-like motifs are found in the zebrafish FoxQ1 and in FoxG proteins of sea urchin and amphioxus. The identification of eh1-like motifs by manual sequence alignment was validated by statistical analyses of the Swiss protein database, confirming a high frequency of occurrence of eh1-like sequences in Fox family proteins. Structural predictions suggest that the majority of identified eh1-like motifs are short α-helices, and wheel modeling revealed an amphipathicity that supports this secondary structure prediction. Conclusion: A search for eh1 Groucho interaction motifs in the Fox gene family has identified eh1-like sequences in greater than 50% of Fox proteins. The results predict a physical and functional interaction of TLE/Groucho corepressors with many members of the Fox family of transcriptional

  15. Organizational motifs for ground squirrel cone bipolar cells

    PubMed Central

    Light, Adam C.; Zhu, Yongling; Shi, Jun; Saszik, Shannon; Lindstrom, Sarah; Davidson, Laura; Li, Xiaoyu; Chiodo, Vince A.; Hauswirth, William W.; Li, Wei; DeVries, Steven H.

    2012-01-01

    In daylight vision, parallel processing starts at the cone synapse. Cone signals flow to On and Off bipolar cells, which are further divided into types according to morphology, immunocytochemistry, and function. The axons of the bipolar cell types stratify at different levels in the inner plexiform layer (IPL), and can interact with costratifying amacrine and ganglion cells. These interactions endow the ganglion cell types with unique functional properties. The wiring that underlies the interactions between bipolar, amacrine, and ganglion cells is poorly understood. It may be easier to elucidate this wiring if organizational rules can be established. We identify 13 types of cone bipolar cells in the ground squirrel, 11 of which contact contiguous cones with the possible exception of short-wavelength sensitive cones. Cells were identified by antibody labeling, tracer filling, and Golgi-like filling following transduction with an adeno-associated virus encoding for GFP. The 11 bipolar cell types displayed two organizational patterns. In the first pattern, 8-10 of the 11 types came in pairs with partially overlapping axonal stratification. Pairs shared morphological, immunocytochemical, and functional properties. The existence of similar pairs is a new motif that may have implications for how signals first diverge from a cone to bipolar cells, and then re-converge onto a costratifying ganglion cell. The second pattern is a mirror symmetric organization about the middle of the IPL involving at least 7 bipolar cell types. This anatomical symmetry may be associated with a functional symmetry in On and Off ganglion cell responses. PMID:22778006

  16. Peptoid nanosheets exhibit a new secondary-structure motif

    NASA Astrophysics Data System (ADS)

    Mannige, Ranjan V.; Haxton, Thomas K.; Proulx, Caroline; Robertson, Ellen J.; Battigelli, Alessia; Butterfoss, Glenn L.; Zuckermann, Ronald N.; Whitelam, Stephen

    2015-10-01

    A promising route to the synthesis of protein-mimetic materials that are capable of complex functions, such as molecular recognition and catalysis, is provided by sequence-defined peptoid polymers, structural relatives of biologically occurring polypeptides. Peptoids, which are relatively non-toxic and resistant to degradation, can fold into defined structures through a combination of sequence-dependent interactions. However, the range of possible structures that are accessible to peptoids and other biological mimetics is unknown, and our ability to design protein-like architectures from these polymer classes is limited. Here we use molecular-dynamics simulations, together with scattering and microscopy data, to determine the atomic-resolution structure of the recently discovered peptoid nanosheet, an ordered supramolecular assembly that extends macroscopically in only two dimensions. Our simulations show that nanosheets are structurally and dynamically heterogeneous, can be formed only from peptoids of certain lengths, and are potentially porous to water and ions. Moreover, their formation is enabled by the peptoids' adoption of a secondary structure that is not seen in the natural world. This structure, a zigzag pattern that we call a Σ ('sigma')-strand, results from the ability of adjacent backbone monomers to adopt opposed rotational states, thereby allowing the backbone to remain linear and untwisted. Linear backbones tiled in a brick-like way form an extended two-dimensional nanostructure, the Σ-sheet. The binary rotational-state motif of the Σ-strand is not seen in regular protein structures, which are usually built from one type of rotational state. We also show that the concept of building regular structures from multiple rotational states can be generalized beyond the peptoid nanosheet system.

  17. Elongated polyproline motifs facilitate enamel evolution through matrix subunit compaction.

    PubMed

    Jin, Tianquan; Ito, Yoshihiro; Luan, Xianghong; Dangaria, Smit; Walker, Cameron; Allen, Michael; Kulkarni, Ashok; Gibson, Carolyn; Braatz, Richard; Liao, Xiubei; Diekwisch, Thomas G H

    2009-12-01

    Vertebrate body designs rely on hydroxyapatite as the principal mineral component of relatively light-weight, articulated endoskeletons and sophisticated tooth-bearing jaws, facilitating rapid movement and efficient predation. Biological mineralization and skeletal growth are frequently accomplished through proteins containing polyproline repeat elements. Through their well-defined yet mobile and flexible structure, polyproline-rich proteins control mineral shape and contribute to many other biological functions including Alzheimer's amyloid aggregation and prolamine plant storage. In the present study we have hypothesized that polyproline repeat proteins exert their control over biological events such as mineral growth, plaque aggregation, or viscous adhesion by altering the length of their central repeat domain, resulting in dramatic changes in supramolecular assembly dimensions. In order to test our hypothesis, we have used the vertebrate mineralization protein amelogenin as an exemplar and determined the biological effect of the four-fold increased polyproline tandem repeat length in the amphibian/mammalian transition. To study the effect of polyproline repeat length on matrix assembly, protein structure, and apatite crystal growth, we have measured supramolecular assembly dimensions in various vertebrates using atomic force microscopy, tested the effect of protein assemblies on crystal growth by electron microscopy, generated a transgenic mouse model to examine the effect of an abbreviated polyproline sequence on crystal growth, and determined the structure of polyproline repeat elements using 3D NMR. Our study shows that an increase in PXX/PXQ tandem repeat motif length results in (i) a compaction of protein matrix subunit dimensions, (ii) reduced conformational variability, (iii) an increase in polyproline II helices, and (iv) promotion of apatite crystal length. Together, these findings establish a direct relationship between polyproline tandem repeat fragment

  18. Organizational motifs for ground squirrel cone bipolar cells.

    PubMed

    Light, Adam C; Zhu, Yongling; Shi, Jun; Saszik, Shannon; Lindstrom, Sarah; Davidson, Laura; Li, Xiaoyu; Chiodo, Vince A; Hauswirth, William W; Li, Wei; DeVries, Steven H

    2012-09-01

    In daylight vision, parallel processing starts at the cone synapse. Cone signals flow to On and Off bipolar cells, which are further divided into types according to morphology, immunocytochemistry, and function. The axons of the bipolar cell types stratify at different levels in the inner plexiform layer (IPL) and can interact with costratifying amacrine and ganglion cells. These interactions endow the ganglion cell types with unique functional properties. The wiring that underlies the interactions among bipolar, amacrine, and ganglion cells is poorly understood. It may be easier to elucidate this wiring if organizational rules can be established. We identify 13 types of cone bipolar cells in the ground squirrel, 11 of which contact contiguous cones, with the possible exception of short-wavelength-sensitive cones. Cells were identified by antibody labeling, tracer filling, and Golgi-like filling following transduction with an adeno-associated virus encoding for green fluorescent protein. The 11 bipolar cell types displayed two organizational patterns. In the first pattern, eight to 10 of the 11 types came in pairs with partially overlapping axonal stratification. Pairs shared morphological, immunocytochemical, and functional properties. The existence of similar pairs is a new motif that might have implications for how signals first diverge from a cone to bipolar cells and then reconverge onto a costratifying ganglion cell. The second pattern is a mirror symmetric organization about the middle of the IPL involving at least seven bipolar cell types. This anatomical symmetry may be associated with a functional symmetry in On and Off ganglion cell responses. Copyright © 2012 Wiley Periodicals, Inc.

  19. Elongated Polyproline Motifs Facilitate Enamel Evolution through Matrix Subunit Compaction

    PubMed Central

    Luan, Xianghong; Dangaria, Smit; Walker, Cameron; Allen, Michael; Kulkarni, Ashok; Gibson, Carolyn; Braatz, Richard; Liao, Xiubei; Diekwisch, Thomas G. H.

    2009-01-01

    Vertebrate body designs rely on hydroxyapatite as the principal mineral component of relatively light-weight, articulated endoskeletons and sophisticated tooth-bearing jaws, facilitating rapid movement and efficient predation. Biological mineralization and skeletal growth are frequently accomplished through proteins containing polyproline repeat elements. Through their well-defined yet mobile and flexible structure polyproline-rich proteins control mineral shape and contribute many other biological functions including Alzheimer's amyloid aggregation and prolamine plant storage. In the present study we have hypothesized that polyproline repeat proteins exert their control over biological events such as mineral growth, plaque aggregation, or viscous adhesion by altering the length of their central repeat domain, resulting in dramatic changes in supramolecular assembly dimensions. In order to test our hypothesis, we have used the vertebrate mineralization protein amelogenin as an exemplar and determined the biological effect of the four-fold increased polyproline tan