Efficient sequential and parallel algorithms for finding edit distance based motifs.
Pal, Soumitra; Xiao, Peng; Rajasekaran, Sanguthevar
2016-08-18
Motif search is an important step in extracting meaningful patterns from biological data. The general problem of motif search is intractable and there is a pressing need to develop efficient, exact and approximation algorithms to solve this problem. In this paper, we present several novel, exact, sequential and parallel algorithms for solving the (l,d) Edit-distance-based Motif Search (EMS) problem: given two integers l,d and n biological strings, find all strings of length l that appear in each input string with atmost d errors of types substitution, insertion and deletion. One popular technique to solve the problem is to explore for each input string the set of all possible l-mers that belong to the d-neighborhood of any substring of the input string and output those which are common for all input strings. We introduce a novel and provably efficient neighborhood exploration technique. We show that it is enough to consider the candidates in neighborhood which are at a distance exactly d. We compactly represent these candidate motifs using wildcard characters and efficiently explore them with very few repetitions. Our sequential algorithm uses a trie based data structure to efficiently store and sort the candidate motifs. Our parallel algorithm in a multi-core shared memory setting uses arrays for storing and a novel modification of radix-sort for sorting the candidate motifs. The algorithms for EMS are customarily evaluated on several challenging instances such as (8,1), (12,2), (16,3), (20,4), and so on. The best previously known algorithm, EMS1, is sequential and in estimated 3 days solves up to instance (16,3). Our sequential algorithms are more than 20 times faster on (16,3). On other hard instances such as (9,2), (11,3), (13,4), our algorithms are much faster. Our parallel algorithm has more than 600 % scaling performance while using 16 threads. Our algorithms have pushed up the state-of-the-art of EMS solvers and we believe that the techniques introduced in this paper are also applicable to other motif search problems such as Planted Motif Search (PMS) and Simple Motif Search (SMS).
Symmetry compression method for discovering network motifs.
Wang, Jianxin; Huang, Yuannan; Wu, Fang-Xiang; Pan, Yi
2012-01-01
Discovering network motifs could provide a significant insight into systems biology. Interestingly, many biological networks have been found to have a high degree of symmetry (automorphism), which is inherent in biological network topologies. The symmetry due to the large number of basic symmetric subgraphs (BSSs) causes a certain redundant calculation in discovering network motifs. Therefore, we compress all basic symmetric subgraphs before extracting compressed subgraphs and propose an efficient decompression algorithm to decompress all compressed subgraphs without loss of any information. In contrast to previous approaches, the novel Symmetry Compression method for Motif Detection, named as SCMD, eliminates most redundant calculations caused by widespread symmetry of biological networks. We use SCMD to improve three notable exact algorithms and two efficient sampling algorithms. Results of all exact algorithms with SCMD are the same as those of the original algorithms, since SCMD is a lossless method. The sampling results show that the use of SCMD almost does not affect the quality of sampling results. For highly symmetric networks, we find that SCMD used in both exact and sampling algorithms can help get a remarkable speedup. Furthermore, SCMD enables us to find larger motifs in biological networks with notable symmetry than previously possible.
Efficient exact motif discovery.
Marschall, Tobias; Rahmann, Sven
2009-06-15
The motif discovery problem consists of finding over-represented patterns in a collection of biosequences. It is one of the classical sequence analysis problems, but still has not been satisfactorily solved in an exact and efficient manner. This is partly due to the large number of possibilities of defining the motif search space and the notion of over-representation. Even for well-defined formalizations, the problem is frequently solved in an ad hoc manner with heuristics that do not guarantee to find the best motif. We show how to solve the motif discovery problem (almost) exactly on a practically relevant space of IUPAC generalized string patterns, using the p-value with respect to an i.i.d. model or a Markov model as the measure of over-representation. In particular, (i) we use a highly accurate compound Poisson approximation for the null distribution of the number of motif occurrences. We show how to compute the exact clump size distribution using a recently introduced device called probabilistic arithmetic automaton (PAA). (ii) We define two p-value scores for over-representation, the first one based on the total number of motif occurrences, the second one based on the number of sequences in a collection with at least one occurrence. (iii) We describe an algorithm to discover the optimal pattern with respect to either of the scores. The method exploits monotonicity properties of the compound Poisson approximation and is by orders of magnitude faster than exhaustive enumeration of IUPAC strings (11.8 h compared with an extrapolated runtime of 4.8 years). (iv) We justify the use of the proposed scores for motif discovery by showing our method to outperform other motif discovery algorithms (e.g. MEME, Weeder) on benchmark datasets. We also propose new motifs on Mycobacterium tuberculosis. The method has been implemented in Java. It can be obtained from http://ls11-www.cs.tu-dortmund.de/people/marschal/paa_md/.
Statistical tests to compare motif count exceptionalities
Robin, Stéphane; Schbath, Sophie; Vandewalle, Vincent
2007-01-01
Background Finding over- or under-represented motifs in biological sequences is now a common task in genomics. Thanks to p-value calculation for motif counts, exceptional motifs are identified and represent candidate functional motifs. The present work addresses the related question of comparing the exceptionality of one motif in two different sequences. Just comparing the motif count p-values in each sequence is indeed not sufficient to decide if this motif is significantly more exceptional in one sequence compared to the other one. A statistical test is required. Results We develop and analyze two statistical tests, an exact binomial one and an asymptotic likelihood ratio test, to decide whether the exceptionality of a given motif is equivalent or significantly different in two sequences of interest. For that purpose, motif occurrences are modeled by Poisson processes, with a special care for overlapping motifs. Both tests can take the sequence compositions into account. As an illustration, we compare the octamer exceptionalities in the Escherichia coli K-12 backbone versus variable strain-specific loops. Conclusion The exact binomial test is particularly adapted for small counts. For large counts, we advise to use the likelihood ratio test which is asymptotic but strongly correlated with the exact binomial test and very simple to use. PMID:17346349
Sequential visibility-graph motifs
NASA Astrophysics Data System (ADS)
Iacovacci, Jacopo; Lacasa, Lucas
2016-04-01
Visibility algorithms transform time series into graphs and encode dynamical information in their topology, paving the way for graph-theoretical time series analysis as well as building a bridge between nonlinear dynamics and network science. In this work we introduce and study the concept of sequential visibility-graph motifs, smaller substructures of n consecutive nodes that appear with characteristic frequencies. We develop a theory to compute in an exact way the motif profiles associated with general classes of deterministic and stochastic dynamics. We find that this simple property is indeed a highly informative and computationally efficient feature capable of distinguishing among different dynamics and robust against noise contamination. We finally confirm that it can be used in practice to perform unsupervised learning, by extracting motif profiles from experimental heart-rate series and being able, accordingly, to disentangle meditative from other relaxation states. Applications of this general theory include the automatic classification and description of physical, biological, and financial time series.
Direct AUC optimization of regulatory motifs.
Zhu, Lin; Zhang, Hong-Bo; Huang, De-Shuang
2017-07-15
The discovery of transcription factor binding site (TFBS) motifs is essential for untangling the complex mechanism of genetic variation under different developmental and environmental conditions. Among the huge amount of computational approaches for de novo identification of TFBS motifs, discriminative motif learning (DML) methods have been proven to be promising for harnessing the discovery power of accumulated huge amount of high-throughput binding data. However, they have to sacrifice accuracy for speed and could fail to fully utilize the information of the input sequences. We propose a novel algorithm called CDAUC for optimizing DML-learned motifs based on the area under the receiver-operating characteristic curve (AUC) criterion, which has been widely used in the literature to evaluate the significance of extracted motifs. We show that when the considered AUC loss function is optimized in a coordinate-wise manner, the cost function of each resultant sub-problem is a piece-wise constant function, whose optimal value can be found exactly and efficiently. Further, a key step of each iteration of CDAUC can be efficiently solved as a computational geometry problem. Experimental results on real world high-throughput datasets illustrate that CDAUC outperforms competing methods for refining DML motifs, while being one order of magnitude faster. Meanwhile, preliminary results also show that CDAUC may also be useful for improving the interpretability of convolutional kernels generated by the emerging deep learning approaches for predicting TF sequences specificities. CDAUC is available at: https://drive.google.com/drive/folders/0BxOW5MtIZbJjNFpCeHlBVWJHeW8 . dshuang@tongji.edu.cn. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
qPMS9: An Efficient Algorithm for Quorum Planted Motif Search
NASA Astrophysics Data System (ADS)
Nicolae, Marius; Rajasekaran, Sanguthevar
2015-01-01
Discovering patterns in biological sequences is a crucial problem. For example, the identification of patterns in DNA sequences has resulted in the determination of open reading frames, identification of gene promoter elements, intron/exon splicing sites, and SH RNAs, location of RNA degradation signals, identification of alternative splicing sites, etc. In protein sequences, patterns have led to domain identification, location of protease cleavage sites, identification of signal peptides, protein interactions, determination of protein degradation elements, identification of protein trafficking elements, discovery of short functional motifs, etc. In this paper we focus on the identification of an important class of patterns, namely, motifs. We study the (l, d) motif search problem or Planted Motif Search (PMS). PMS receives as input n strings and two integers l and d. It returns all sequences M of length l that occur in each input string, where each occurrence differs from M in at most d positions. Another formulation is quorum PMS (qPMS), where the motif appears in at least q% of the strings. We introduce qPMS9, a parallel exact qPMS algorithm that offers significant runtime improvements on DNA and protein datasets. qPMS9 solves the challenging DNA (l, d)-instances (28, 12) and (30, 13). The source code is available at https://code.google.com/p/qpms9/.
Counting of oligomers in sequences generated by markov chains for DNA motif discovery.
Shan, Gao; Zheng, Wei-Mou
2009-02-01
By means of the technique of the imbedded Markov chain, an efficient algorithm is proposed to exactly calculate first, second moments of word counts and the probability for a word to occur at least once in random texts generated by a Markov chain. A generating function is introduced directly from the imbedded Markov chain to derive asymptotic approximations for the problem. Two Z-scores, one based on the number of sequences with hits and the other on the total number of word hits in a set of sequences, are examined for discovery of motifs on a set of promoter sequences extracted from A. thaliana genome. Source code is available at http://www.itp.ac.cn/zheng/oligo.c.
Rapid motif compliance scoring with match weight sets.
Venezia, D; O'Hara, P J
1993-02-01
Most current implementations of motif matching in biological sequences have sacrificed the generality of weight matrix scoring for shorter runtimes. The program MOTIF incorporates a weight matrix and a rapid, backtracking tree-search algorithm to score motif compliance with greatly enhanced performance while placing no constraints on the motif. In addition, any positions within a motif can be marked as 'inviolate', thereby requiring an exact match. MOTIF allows a choice of regular expression formats and can use both motif and sequence libraries as either targets or queries. Nucleic acid sequences can optionally be translated by MOTIF in any frame(s) and used against peptide motifs.
CompariMotif: quick and easy comparisons of sequence motifs.
Edwards, Richard J; Davey, Norman E; Shields, Denis C
2008-05-15
CompariMotif is a novel tool for making motif-motif comparisons, identifying and describing similarities between regular expression motifs. CompariMotif can identify a number of different relationships between motifs, including exact matches, variants of degenerate motifs and complex overlapping motifs. Motif relationships are scored using shared information content, allowing the best matches to be easily identified in large comparisons. Many input and search options are available, enabling a list of motifs to be compared to itself (to identify recurring motifs) or to datasets of known motifs. CompariMotif can be run online at http://bioware.ucd.ie/ and is freely available for academic use as a set of open source Python modules under a GNU General Public License from http://bioinformatics.ucd.ie/shields/software/comparimotif/
He, Jieyue; Wang, Chunyan; Qiu, Kunpu; Zhong, Wei
2014-01-01
Motif mining has always been a hot research topic in bioinformatics. Most of current research on biological networks focuses on exact motif mining. However, due to the inevitable experimental error and noisy data, biological network data represented as the probability model could better reflect the authenticity and biological significance, therefore, it is more biological meaningful to discover probability motif in uncertain biological networks. One of the key steps in probability motif mining is frequent pattern discovery which is usually based on the possible world model having a relatively high computational complexity. In this paper, we present a novel method for detecting frequent probability patterns based on circuit simulation in the uncertain biological networks. First, the partition based efficient search is applied to the non-tree like subgraph mining where the probability of occurrence in random networks is small. Then, an algorithm of probability isomorphic based on circuit simulation is proposed. The probability isomorphic combines the analysis of circuit topology structure with related physical properties of voltage in order to evaluate the probability isomorphism between probability subgraphs. The circuit simulation based probability isomorphic can avoid using traditional possible world model. Finally, based on the algorithm of probability subgraph isomorphism, two-step hierarchical clustering method is used to cluster subgraphs, and discover frequent probability patterns from the clusters. The experiment results on data sets of the Protein-Protein Interaction (PPI) networks and the transcriptional regulatory networks of E. coli and S. cerevisiae show that the proposed method can efficiently discover the frequent probability subgraphs. The discovered subgraphs in our study contain all probability motifs reported in the experiments published in other related papers. The algorithm of probability graph isomorphism evaluation based on circuit simulation method excludes most of subgraphs which are not probability isomorphism and reduces the search space of the probability isomorphism subgraphs using the mismatch values in the node voltage set. It is an innovative way to find the frequent probability patterns, which can be efficiently applied to probability motif discovery problems in the further studies.
2014-01-01
Background Motif mining has always been a hot research topic in bioinformatics. Most of current research on biological networks focuses on exact motif mining. However, due to the inevitable experimental error and noisy data, biological network data represented as the probability model could better reflect the authenticity and biological significance, therefore, it is more biological meaningful to discover probability motif in uncertain biological networks. One of the key steps in probability motif mining is frequent pattern discovery which is usually based on the possible world model having a relatively high computational complexity. Methods In this paper, we present a novel method for detecting frequent probability patterns based on circuit simulation in the uncertain biological networks. First, the partition based efficient search is applied to the non-tree like subgraph mining where the probability of occurrence in random networks is small. Then, an algorithm of probability isomorphic based on circuit simulation is proposed. The probability isomorphic combines the analysis of circuit topology structure with related physical properties of voltage in order to evaluate the probability isomorphism between probability subgraphs. The circuit simulation based probability isomorphic can avoid using traditional possible world model. Finally, based on the algorithm of probability subgraph isomorphism, two-step hierarchical clustering method is used to cluster subgraphs, and discover frequent probability patterns from the clusters. Results The experiment results on data sets of the Protein-Protein Interaction (PPI) networks and the transcriptional regulatory networks of E. coli and S. cerevisiae show that the proposed method can efficiently discover the frequent probability subgraphs. The discovered subgraphs in our study contain all probability motifs reported in the experiments published in other related papers. Conclusions The algorithm of probability graph isomorphism evaluation based on circuit simulation method excludes most of subgraphs which are not probability isomorphism and reduces the search space of the probability isomorphism subgraphs using the mismatch values in the node voltage set. It is an innovative way to find the frequent probability patterns, which can be efficiently applied to probability motif discovery problems in the further studies. PMID:25350277
Mori, Hiroyuki; Sakashita, Sohei; Ito, Jun; Ishii, Eiji; Akiyama, Yoshinori
2018-02-23
VemP ( V ibrio protein e xport m onitoring p olypeptide) is a secretory protein comprising 159 amino acid residues, which functions as a secretion monitor in Vibrio and regulates expression of the downstream V.secDF2 genes. When VemP export is compromised, its translation specifically undergoes elongation arrest at the position where the Gln 156 codon of vemP encounters the P-site in the translating ribosome, resulting in up-regulation of V.SecDF2 production. Although our previous study suggests that many residues in a highly conserved C-terminal 20-residue region of VemP contribute to its elongation arrest, the exact role of each residue remains unclear. Here, we constructed a reporter system to easily and exactly monitor the in vivo arrest efficiency of VemP. Using this reporter system, we systematically performed a mutational analysis of the 20 residues (His 138 -Phe 157 ) to identify and characterize the arrest motif. Our results show that 15 residues in the conserved region participate in elongation arrest and that multiple interactions between important residues in VemP and in the interior of the exit tunnel contribute to the elongation arrest of VemP. The arrangement of these important residues induced by specific secondary structures in the ribosomal tunnel is critical for the arrest. Pro scanning analysis of the preceding segment (Met 120 -Phe 137 ) revealed a minor role of this region in the arrest. Considering these results, we conclude that the arrest motif in VemP is mainly composed of the highly conserved multiple residues in the C-terminal region. © 2018 by The American Society for Biochemistry and Molecular Biology, Inc.
Taillefumier, Thibaud; Touboul, Jonathan; Magnasco, Marcelo
2012-12-01
In vivo cortical recording reveals that indirectly driven neural assemblies can produce reliable and temporally precise spiking patterns in response to stereotyped stimulation. This suggests that despite being fundamentally noisy, the collective activity of neurons conveys information through temporal coding. Stochastic integrate-and-fire models delineate a natural theoretical framework to study the interplay of intrinsic neural noise and spike timing precision. However, there are inherent difficulties in simulating their networks' dynamics in silico with standard numerical discretization schemes. Indeed, the well-posedness of the evolution of such networks requires temporally ordering every neuronal interaction, whereas the order of interactions is highly sensitive to the random variability of spiking times. Here, we answer these issues for perfect stochastic integrate-and-fire neurons by designing an exact event-driven algorithm for the simulation of recurrent networks, with delayed Dirac-like interactions. In addition to being exact from the mathematical standpoint, our proposed method is highly efficient numerically. We envision that our algorithm is especially indicated for studying the emergence of polychronized motifs in networks evolving under spike-timing-dependent plasticity with intrinsic noise.
A flexible motif search technique based on generalized profiles.
Bucher, P; Karplus, K; Moeri, N; Hofmann, K
1996-03-01
A flexible motif search technique is presented which has two major components: (1) a generalized profile syntax serving as a motif definition language; and (2) a motif search method specifically adapted to the problem of finding multiple instances of a motif in the same sequence. The new profile structure, which is the core of the generalized profile syntax, combines the functions of a variety of motif descriptors implemented in other methods, including regular expression-like patterns, weight matrices, previously used profiles, and certain types of hidden Markov models (HMMs). The relationship between generalized profiles and other biomolecular motif descriptors is analyzed in detail, with special attention to HMMs. Generalized profiles are shown to be equivalent to a particular class of HMMs, and conversion procedures in both directions are given. The conversion procedures provide an interpretation for local alignment in the framework of stochastic models, allowing for clear, simple significance tests. A mathematical statement of the motif search problem defines the new method exactly without linking it to a specific algorithmic solution. Part of the definition includes a new definition of disjointness of alignments.
Biological network motif detection and evaluation
2011-01-01
Background Molecular level of biological data can be constructed into system level of data as biological networks. Network motifs are defined as over-represented small connected subgraphs in networks and they have been used for many biological applications. Since network motif discovery involves computationally challenging processes, previous algorithms have focused on computational efficiency. However, we believe that the biological quality of network motifs is also very important. Results We define biological network motifs as biologically significant subgraphs and traditional network motifs are differentiated as structural network motifs in this paper. We develop five algorithms, namely, EDGEGO-BNM, EDGEBETWEENNESS-BNM, NMF-BNM, NMFGO-BNM and VOLTAGE-BNM, for efficient detection of biological network motifs, and introduce several evaluation measures including motifs included in complex, motifs included in functional module and GO term clustering score in this paper. Experimental results show that EDGEGO-BNM and EDGEBETWEENNESS-BNM perform better than existing algorithms and all of our algorithms are applicable to find structural network motifs as well. Conclusion We provide new approaches to finding network motifs in biological networks. Our algorithms efficiently detect biological network motifs and further improve existing algorithms to find high quality structural network motifs, which would be impossible using existing algorithms. The performances of the algorithms are compared based on our new evaluation measures in biological contexts. We believe that our work gives some guidelines of network motifs research for the biological networks. PMID:22784624
NASA Astrophysics Data System (ADS)
Changlani, Hitesh; Kumar, Krishna; Kochkov, Dmitrii; Fradkin, Eduardo; Clark, Bryan
We report the existence of a quantum macroscopically degenerate ground state manifold on the nearest neighbor XXZ model on the kagome lattice at the point Jz /Jxy = - 1 / 2 . On many lattices with triangular motifs (including the kagome, sawtooth, icosidodecahedron and Shastry-Sutherland lattice for a certain choice of couplings) this Hamiltonian is found to be frustration-free with exact ground states which correspond to three-colorings of these lattices. Several results also generalize to the case of variable couplings and to other motifs (albeit with possibly more complex Hamiltonians). The degenerate manifold on the kagome lattice corresponds to a ''many-body flat band'' of interacting hard-core bosons; and for the one boson case our results also explain the well-known non-interacting flat band. On adding realistic perturbations, state selection in this manifold of quantum many-body states is discussed along with the implications for the phase diagram of the kagome lattice antiferromagnet. supported by DE-FG02-12ER46875, DMR 1408713, DE-FG02-08ER46544.
Exact calculation of distributions on integers, with application to sequence alignment.
Newberg, Lee A; Lawrence, Charles E
2009-01-01
Computational biology is replete with high-dimensional discrete prediction and inference problems. Dynamic programming recursions can be applied to several of the most important of these, including sequence alignment, RNA secondary-structure prediction, phylogenetic inference, and motif finding. In these problems, attention is frequently focused on some scalar quantity of interest, a score, such as an alignment score or the free energy of an RNA secondary structure. In many cases, score is naturally defined on integers, such as a count of the number of pairing differences between two sequence alignments, or else an integer score has been adopted for computational reasons, such as in the test of significance of motif scores. The probability distribution of the score under an appropriate probabilistic model is of interest, such as in tests of significance of motif scores, or in calculation of Bayesian confidence limits around an alignment. Here we present three algorithms for calculating the exact distribution of a score of this type; then, in the context of pairwise local sequence alignments, we apply the approach so as to find the alignment score distribution and Bayesian confidence limits.
USDA-ARS?s Scientific Manuscript database
G4-quadruplexes are reversible DNA structures that likely function in gene regulation, but exactly how they work is not known. G4 DNA can be predicted from sequence motifs such as the pattern G-G-G-N(1,7)-G-G-G-N(1,7)-G-G-G-N(1,7)-G-G-G-N(1,7). In the maize genome, G4 motifs were found to occupy ...
Visualizing frequent patterns in large multivariate time series
NASA Astrophysics Data System (ADS)
Hao, M.; Marwah, M.; Janetzko, H.; Sharma, R.; Keim, D. A.; Dayal, U.; Patnaik, D.; Ramakrishnan, N.
2011-01-01
The detection of previously unknown, frequently occurring patterns in time series, often called motifs, has been recognized as an important task. However, it is difficult to discover and visualize these motifs as their numbers increase, especially in large multivariate time series. To find frequent motifs, we use several temporal data mining and event encoding techniques to cluster and convert a multivariate time series to a sequence of events. Then we quantify the efficiency of the discovered motifs by linking them with a performance metric. To visualize frequent patterns in a large time series with potentially hundreds of nested motifs on a single display, we introduce three novel visual analytics methods: (1) motif layout, using colored rectangles for visualizing the occurrences and hierarchical relationships of motifs in a multivariate time series, (2) motif distortion, for enlarging or shrinking motifs as appropriate for easy analysis and (3) motif merging, to combine a number of identical adjacent motif instances without cluttering the display. Analysts can interactively optimize the degree of distortion and merging to get the best possible view. A specific motif (e.g., the most efficient or least efficient motif) can be quickly detected from a large time series for further investigation. We have applied these methods to two real-world data sets: data center cooling and oil well production. The results provide important new insights into the recurring patterns.
Matveeva, O. V.; Tsodikov, A. D.; Giddings, M.; Freier, S. M.; Wyatt, J. R.; Spiridonov, A. N.; Shabalina, S. A.; Gesteland, R. F.; Atkins, J. F.
2000-01-01
Design of antisense oligonucleotides targeting any mRNA can be much more efficient when several activity-enhancing motifs are included and activity-decreasing motifs are avoided. This conclusion was made after statistical analysis of data collected from >1000 experiments with phosphorothioate-modified oligonucleotides. Highly significant positive correlation between the presence of motifs CCAC, TCCC, ACTC, GCCA and CTCT in the oligonucleotide and its antisense efficiency was demonstrated. In addition, negative correlation was revealed for the motifs GGGG, ACTG, AAA and TAA. It was found that the likelihood of activity of an oligonucleotide against a desired mRNA target is sequence motif content dependent. PMID:10908347
STEME: A Robust, Accurate Motif Finder for Large Data Sets
Reid, John E.; Wernisch, Lorenz
2014-01-01
Motif finding is a difficult problem that has been studied for over 20 years. Some older popular motif finders are not suitable for analysis of the large data sets generated by next-generation sequencing. We recently published an efficient approximation (STEME) to the EM algorithm that is at the core of many motif finders such as MEME. This approximation allows the EM algorithm to be applied to large data sets. In this work we describe several efficient extensions to STEME that are based on the MEME algorithm. Together with the original STEME EM approximation, these extensions make STEME a fully-fledged motif finder with similar properties to MEME. We discuss the difficulty of objectively comparing motif finders. We show that STEME performs comparably to existing prominent discriminative motif finders, DREME and Trawler, on 13 sets of transcription factor binding data in mouse ES cells. We demonstrate the ability of STEME to find long degenerate motifs which these discriminative motif finders do not find. As part of our method, we extend an earlier method due to Nagarajan et al. for the efficient calculation of motif E-values. STEME's source code is available under an open source license and STEME is available via a web interface. PMID:24625410
Identification of sequence motifs significantly associated with antisense activity.
McQuisten, Kyle A; Peek, Andrew S
2007-06-07
Predicting the suppression activity of antisense oligonucleotide sequences is the main goal of the rational design of nucleic acids. To create an effective predictive model, it is important to know what properties of an oligonucleotide sequence associate significantly with antisense activity. Also, for the model to be efficient we must know what properties do not associate significantly and can be omitted from the model. This paper will discuss the results of a randomization procedure to find motifs that associate significantly with either high or low antisense suppression activity, analysis of their properties, as well as the results of support vector machine modelling using these significant motifs as features. We discovered 155 motifs that associate significantly with high antisense suppression activity and 202 motifs that associate significantly with low suppression activity. The motifs range in length from 2 to 5 bases, contain several motifs that have been previously discovered as associating highly with antisense activity, and have thermodynamic properties consistent with previous work associating thermodynamic properties of sequences with their antisense activity. Statistical analysis revealed no correlation between a motif's position within an antisense sequence and that sequences antisense activity. Also, many significant motifs existed as subwords of other significant motifs. Support vector regression experiments indicated that the feature set of significant motifs increased correlation compared to all possible motifs as well as several subsets of the significant motifs. The thermodynamic properties of the significantly associated motifs support existing data correlating the thermodynamic properties of the antisense oligonucleotide with antisense efficiency, reinforcing our hypothesis that antisense suppression is strongly associated with probe/target thermodynamics, as there are no enzymatic mediators to speed the process along like the RNA Induced Silencing Complex (RISC) in RNAi. The independence of motif position and antisense activity also allows us to bypass consideration of this feature in the modelling process, promoting model efficiency and reducing the chance of overfitting when predicting antisense activity. The increase in SVR correlation with significant features compared to nearest-neighbour features indicates that thermodynamics alone is likely not the only factor in determining antisense efficiency.
An Efficient Scheme for Crystal Structure Prediction Based on Structural Motifs
Zhu, Zizhong; Wu, Ping; Wu, Shunqing; ...
2017-05-15
An efficient scheme based on structural motifs is proposed for the crystal structure prediction of materials. The key advantage of the present method comes in two fold: first, the degrees of freedom of the system are greatly reduced, since each structural motif, regardless of its size, can always be described by a set of parameters (R, θ, φ) with five degrees of freedom; second, the motifs could always appear in the predicted structures when the energies of the structures are relatively low. Both features make the present scheme a very efficient method for predicting desired materials. The method has beenmore » applied to the case of LiFePO 4, an important cathode material for lithium-ion batteries. Numerous new structures of LiFePO 4 have been found, compared to those currently available, available, demonstrating the reliability of the present methodology and illustrating the promise of the concept of structural motifs.« less
An Efficient Scheme for Crystal Structure Prediction Based on Structural Motifs
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhu, Zizhong; Wu, Ping; Wu, Shunqing
An efficient scheme based on structural motifs is proposed for the crystal structure prediction of materials. The key advantage of the present method comes in two fold: first, the degrees of freedom of the system are greatly reduced, since each structural motif, regardless of its size, can always be described by a set of parameters (R, θ, φ) with five degrees of freedom; second, the motifs could always appear in the predicted structures when the energies of the structures are relatively low. Both features make the present scheme a very efficient method for predicting desired materials. The method has beenmore » applied to the case of LiFePO 4, an important cathode material for lithium-ion batteries. Numerous new structures of LiFePO 4 have been found, compared to those currently available, available, demonstrating the reliability of the present methodology and illustrating the promise of the concept of structural motifs.« less
The bioactive acidic serine- and aspartate-rich motif peptide.
Minamizaki, Tomoko; Yoshiko, Yuji
2015-01-01
The organic component of the bone matrix comprises 40% dry weight of bone. The organic component is mostly composed of type I collagen and small amounts of non-collagenous proteins (NCPs) (10-15% of the total bone protein content). The small integrin-binding ligand N-linked glycoprotein (SIBLING) family, a NCP, is considered to play a key role in bone mineralization. SIBLING family of proteins share common structural features and includes the arginine-glycine-aspartic acid (RGD) motif and acidic serine- and aspartic acid-rich motif (ASARM). Clinical manifestations of gene mutations and/or genetically modified mice indicate that SIBLINGs play diverse roles in bone and extraskeletal tissues. ASARM peptides might not be primary responsible for the functional diversity of SIBLINGs, but this motif is suggested to be a key domain of SIBLINGs. However, the exact function of ASARM peptides is poorly understood. In this article, we discuss the considerable progress made in understanding the role of ASARM as a bioactive peptide.
Using SCOPE to identify potential regulatory motifs in coregulated genes.
Martyanov, Viktor; Gross, Robert H
2011-05-31
SCOPE is an ensemble motif finder that uses three component algorithms in parallel to identify potential regulatory motifs by over-representation and motif position preference. Each component algorithm is optimized to find a different kind of motif. By taking the best of these three approaches, SCOPE performs better than any single algorithm, even in the presence of noisy data. In this article, we utilize a web version of SCOPE to examine genes that are involved in telomere maintenance. SCOPE has been incorporated into at least two other motif finding programs and has been used in other studies. The three algorithms that comprise SCOPE are BEAM, which finds non-degenerate motifs (ACCGGT), PRISM, which finds degenerate motifs (ASCGWT), and SPACER, which finds longer bipartite motifs (ACCnnnnnnnnGGT). These three algorithms have been optimized to find their corresponding type of motif. Together, they allow SCOPE to perform extremely well. Once a gene set has been analyzed and candidate motifs identified, SCOPE can look for other genes that contain the motif which, when added to the original set, will improve the motif score. This can occur through over-representation or motif position preference. Working with partial gene sets that have biologically verified transcription factor binding sites, SCOPE was able to identify most of the rest of the genes also regulated by the given transcription factor. Output from SCOPE shows candidate motifs, their significance, and other information both as a table and as a graphical motif map. FAQs and video tutorials are available at the SCOPE web site which also includes a "Sample Search" button that allows the user to perform a trial run. Scope has a very friendly user interface that enables novice users to access the algorithm's full power without having to become an expert in the bioinformatics of motif finding. As input, SCOPE can take a list of genes, or FASTA sequences. These can be entered in browser text fields, or read from a file. The output from SCOPE contains a list of all identified motifs with their scores, number of occurrences, fraction of genes containing the motif, and the algorithm used to identify the motif. For each motif, result details include a consensus representation of the motif, a sequence logo, a position weight matrix, and a list of instances for every motif occurrence (with exact positions and "strand" indicated). Results are returned in a browser window and also optionally by email. Previous papers describe the SCOPE algorithms in detail.
Wiese, Claudia; Hinz, John M; Tebbs, Robert S; Nham, Peter B; Urbin, Salustra S; Collins, David W; Thompson, Larry H; Schild, David
2006-01-01
In vertebrates, homologous recombinational repair (HRR) requires RAD51 and five RAD51 paralogs (XRCC2, XRCC3, RAD51B, RAD51C and RAD51D) that all contain conserved Walker A and B ATPase motifs. In human RAD51D we examined the requirement for these motifs in interactions with XRCC2 and RAD51C, and for survival of cells in response to DNA interstrand crosslinks (ICLs). Ectopic expression of wild-type human RAD51D or mutants having a non-functional A or B motif was used to test for complementation of a rad51d knockout hamster CHO cell line. Although A-motif mutants complement very efficiently, B-motif mutants do not. Consistent with these results, experiments using the yeast two- and three-hybrid systems show that the interactions between RAD51D and its XRCC2 and RAD51C partners also require a functional RAD51D B motif, but not motif A. Similarly, hamster Xrcc2 is unable to bind to the non-complementing human RAD51D B-motif mutants in co-immunoprecipitation assays. We conclude that a functional Walker B motif, but not A motif, is necessary for RAD51D's interactions with other paralogs and for efficient HRR. We present a model in which ATPase sites are formed in a bipartite manner between RAD51D and other RAD51 paralogs.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shaw, Debra J.; Morse, Robert; Todd, Adrian G.
The Ewing Sarcoma (EWS) protein is a ubiquitously expressed RNA processing factor that localises predominantly to the nucleus. However, the mechanism through which EWS enters the nucleus remains unclear, with differing reports identifying three separate import signals within the EWS protein. Here we have utilized a panel of truncated EWS proteins to clarify the reported nuclear localisation signals. We describe three C-terminal domains that are important for efficient EWS nuclear localization: (1) the third RGG-motif; (2) the last 10 amino acids (known as the PY-import motif); and (3) the zinc-finger motif. Although these three domains are involved in nuclear import,more » they are not independently capable of driving the efficient import of a GFP-moiety. However, collectively they form a complex tripartite signal that efficiently drives GFP-import into the nucleus. This study helps clarify the EWS import signal, and the identification of the involvement of both the RGG- and zinc-finger motifs has wide reaching implications.« less
Identification of high-efficiency 3'GG gRNA motifs in indexed FASTA files with ngg2.
Roberson, Elisha D O
CRISPR/Cas9 is emerging as one of the most-used methods of genome modification in organisms ranging from bacteria to human cells. However, the efficiency of editing varies tremendously site-to-site. A recent report identified a novel motif, called the 3'GG motif, which substantially increases the efficiency of editing at all sites tested in C. elegans . Furthermore, they highlighted that previously published gRNAs with high editing efficiency also had this motif. I designed a python command-line tool, ngg2, to identify 3'GG gRNA sites from indexed FASTA files. As a proof-of-concept, I screened for these motifs in six model genomes: Saccharomyces cerevisiae , Caenorhabditis elegans , Drosophila melanogaster , Danio rerio , Mus musculus , and Homo sapiens. I also scanned the genomes of pig ( Sus scrofa ) and African elephant ( Loxodonta africana ) to demonstrate the utility in non-model organisms. I identified more than 60 million single match 3'GG motifs in these genomes. Greater than 61% of all protein coding genes in the reference genomes had at least one unique 3'GG gRNA site overlapping an exon. In particular, more than 96% of mouse and 93% of human protein coding genes have at least one unique, overlapping 3'GG gRNA. These identified sites can be used as a starting point in gRNA selection, and the ngg2 tool provides an important ability to identify 3'GG editing sites in any species with an available genome sequence.
QuateXelero: An Accelerated Exact Network Motif Detection Algorithm
Khakabimamaghani, Sahand; Sharafuddin, Iman; Dichter, Norbert; Koch, Ina; Masoudi-Nejad, Ali
2013-01-01
Finding motifs in biological, social, technological, and other types of networks has become a widespread method to gain more knowledge about these networks’ structure and function. However, this task is very computationally demanding, because it is highly associated with the graph isomorphism which is an NP problem (not known to belong to P or NP-complete subsets yet). Accordingly, this research is endeavoring to decrease the need to call NAUTY isomorphism detection method, which is the most time-consuming step in many existing algorithms. The work provides an extremely fast motif detection algorithm called QuateXelero, which has a Quaternary Tree data structure in the heart. The proposed algorithm is based on the well-known ESU (FANMOD) motif detection algorithm. The results of experiments on some standard model networks approve the overal superiority of the proposed algorithm, namely QuateXelero, compared with two of the fastest existing algorithms, G-Tries and Kavosh. QuateXelero is especially fastest in constructing the central data structure of the algorithm from scratch based on the input network. PMID:23874498
Statistical significance of combinatorial regulations
Terada, Aika; Okada-Hatakeyama, Mariko; Tsuda, Koji; Sese, Jun
2013-01-01
More than three transcription factors often work together to enable cells to respond to various signals. The detection of combinatorial regulation by multiple transcription factors, however, is not only computationally nontrivial but also extremely unlikely because of multiple testing correction. The exponential growth in the number of tests forces us to set a strict limit on the maximum arity. Here, we propose an efficient branch-and-bound algorithm called the “limitless arity multiple-testing procedure” (LAMP) to count the exact number of testable combinations and calibrate the Bonferroni factor to the smallest possible value. LAMP lists significant combinations without any limit, whereas the family-wise error rate is rigorously controlled under the threshold. In the human breast cancer transcriptome, LAMP discovered statistically significant combinations of as many as eight binding motifs. This method may contribute to uncover pathways regulated in a coordinated fashion and find hidden associations in heterogeneous data. PMID:23882073
Memetic algorithms for de novo motif-finding in biomedical sequences.
Bi, Chengpeng
2012-09-01
The objectives of this study are to design and implement a new memetic algorithm for de novo motif discovery, which is then applied to detect important signals hidden in various biomedical molecular sequences. In this paper, memetic algorithms are developed and tested in de novo motif-finding problems. Several strategies in the algorithm design are employed that are to not only efficiently explore the multiple sequence local alignment space, but also effectively uncover the molecular signals. As a result, there are a number of key features in the implementation of the memetic motif-finding algorithm (MaMotif), including a chromosome replacement operator, a chromosome alteration-aware local search operator, a truncated local search strategy, and a stochastic operation of local search imposed on individual learning. To test the new algorithm, we compare MaMotif with a few of other similar algorithms using simulated and experimental data including genomic DNA, primary microRNA sequences (let-7 family), and transmembrane protein sequences. The new memetic motif-finding algorithm is successfully implemented in C++, and exhaustively tested with various simulated and real biological sequences. In the simulation, it shows that MaMotif is the most time-efficient algorithm compared with others, that is, it runs 2 times faster than the expectation maximization (EM) method and 16 times faster than the genetic algorithm-based EM hybrid. In both simulated and experimental testing, results show that the new algorithm is compared favorably or superior to other algorithms. Notably, MaMotif is able to successfully discover the transcription factors' binding sites in the chromatin immunoprecipitation followed by massively parallel sequencing (ChIP-Seq) data, correctly uncover the RNA splicing signals in gene expression, and precisely find the highly conserved helix motif in the transmembrane protein sequences, as well as rightly detect the palindromic segments in the primary microRNA sequences. The memetic motif-finding algorithm is effectively designed and implemented, and its applications demonstrate it is not only time-efficient, but also exhibits excellent performance while compared with other popular algorithms. Copyright © 2012 Elsevier B.V. All rights reserved.
Yu, Yun-Zhou; Ma, Yao; Xu, Wen-Hui; Wang, Shuang; Sun, Zhi-Wei
2015-08-01
DNA vaccines are generally weak stimulators of the immune system. Fortunately, their efficacy can be improved using a viral replicon vector or by the addition of immunostimulatory CpG motifs, although the design of these engineered DNA vectors requires optimization. Our results clearly suggest that multiple copies of three types of CpG motifs or combinations of various types of CpG motifs cloned into a viral replicon vector backbone with strong immunostimulatory activities on human PBMC are efficient adjuvants for these DNA vaccines to modulate and enhance protective immunity against anthrax, although modifications with these different CpG forms in vivo elicited inconsistent immune response profiles. Modification with more copies of CpG motifs elicited more potent adjuvant effects leading to the generation of enhanced immunity, which indicated a CpG motif dose-dependent enhancement of antigen-specific immune responses. Notably, the enhanced and/or synchronous adjuvant effects were observed in modification with combinations of two different types of CpG motifs, which provides not only a contribution to the knowledge base on the adjuvant activities of CpG motifs combinations but also implications for the rational design of optimal DNA vaccines with combinations of CpG motifs as "built-in" adjuvants. We describe an efficient strategy to design and optimize DNA vaccines by the addition of combined immunostimulatory CpG motifs in a viral replicon DNA plasmid to produce strong immune responses, which indicates that the CpG-modified viral replicon DNA plasmid may be desirable for use as vector of DNA vaccines.
Majidi, Asia; Nikkhah, Maryam; Sadeghian, Faranak; Hosseinkhani, Saman
2016-10-01
In last decades great efforts have been devoted to the study of development of recombinant peptide based vectors that consist of biological motifs with potential applications in gene therapy. Recombinant Biomimetic Chimeric Vectors (rBCVs) are biopolymeric nanocarriers that are designed to mimic viral features to overcome the cellular obstacles in gene transferring pathway into cell nucleus. In this research, we designed and genetically engineered three novel rBCVs with similar sequences that differed in motifs arrangement and motif abundance: MPG-2H1, 2TMPG-2H1 and 2RMPG-2H1. The MPG as a famous amphipathic cell penetrating peptide is the main segment of these constructs which was studied for the first time in association with truncated histone H1 DNA condensing motif. Through the performance of several physicochemical and biological assays, the rBCVs were remarkably examined regarding transfection efficiency. The main objective of this study is focused on the importance of motif design in transfection efficiency of rBCVs on one hand, and the assessment of correlation between structural features and functionality of motifs on the other hand. The results revealed that all three kinds of rBCVs/pDNA nanoparticles with average sizes of 200nm could overwhelm the cellular obstacles associated with gene transfer, and lead to efficient gene delivery. Furthermore, no significant toxicity was perceived and efficient endosome disruptive activity was obtained. It is noteworthy to say among three mentioned constructs 2RMPG-2H1 showed the highest transfection efficiency. Overall the peptide based vectors hold great promise as a nontoxic and effective gene carrier in vitro and in vivo, besides the rational design possibility as the most vital advantages over the other non-viral gene delivery vectors. Copyright © 2016 Elsevier B.V. All rights reserved.
Automated Recognition of RNA Structure Motifs by Their SHAPE Data Signatures.
Radecki, Pierce; Ledda, Mirko; Aviran, Sharon
2018-06-14
High-throughput structure profiling (SP) experiments that provide information at nucleotide resolution are revolutionizing our ability to study RNA structures. Of particular interest are RNA elements whose underlying structures are necessary for their biological functions. We previously introduced patteRNA , an algorithm for rapidly mining SP data for patterns characteristic of such motifs. This work provided a proof-of-concept for the detection of motifs and the capability of distinguishing structures displaying pronounced conformational changes. Here, we describe several improvements and automation routines to patteRNA . We then consider more elaborate biological situations starting with the comparison or integration of results from searches for distinct motifs and across datasets. To facilitate such analyses, we characterize patteRNA ’s outputs and describe a normalization framework that regularizes results. We then demonstrate that our algorithm successfully discerns between highly similar structural variants of the human immunodeficiency virus type 1 (HIV-1) Rev response element (RRE) and readily identifies its exact location in whole-genome structure profiles of HIV-1. This work highlights the breadth of information that can be gleaned from SP data and broadens the utility of data-driven methods as tools for the detection of novel RNA elements.
A Motif in the Clathrin Heavy Chain Required for the Hsc70/Auxilin Uncoating Reaction
Rapoport, Iris; Boll, Werner; Yu, Anan; Böcking, Till
2008-01-01
The 70-kDa heat-shock cognate protein (Hsc70) chaperone is an ATP-dependent “disassembly enzyme” for many subcellular structures, including clathrin-coated vesicles where it functions as an uncoating ATPase. Hsc70, and its cochaperone auxilin together catalyze coat disassembly. Like other members of the Hsp70 chaperone family, it is thought that ATP-bound Hsc70 recognizes the clathrin triskelion through an unfolded exposed hydrophobic segment. The best candidate is the unstructured C terminus (residues 1631–1675) of the heavy chain at the foot of the tripod below the hub, containing the sequence motif QLMLT, closely related to the sequence bound preferentially by the substrate groove of Hsc70 (Fotin et al., 2004b). To test this hypothesis, we generated in insect cells recombinant mammalian triskelions that in vitro form clathrin cages and clathrin/AP-2 coats exactly like those assembled from native clathrin. We show that coats assembled from recombinant clathrin are good substrates for ATP- and auxilin-dependent, Hsc70-catalyzed uncoating. Finally, we show that this uncoating reaction proceeds normally when the coats contain recombinant heavy chains truncated C-terminal to the QLMLT motif, but very inefficiently when the motif is absent. Thus, the QLMLT motif is required for Hsc-70–facilitated uncoating, consistent with the proposal that this sequence is a specific target of the chaperone. PMID:17978091
CombiMotif: A new algorithm for network motifs discovery in protein-protein interaction networks
NASA Astrophysics Data System (ADS)
Luo, Jiawei; Li, Guanghui; Song, Dan; Liang, Cheng
2014-12-01
Discovering motifs in protein-protein interaction networks is becoming a current major challenge in computational biology, since the distribution of the number of network motifs can reveal significant systemic differences among species. However, this task can be computationally expensive because of the involvement of graph isomorphic detection. In this paper, we present a new algorithm (CombiMotif) that incorporates combinatorial techniques to count non-induced occurrences of subgraph topologies in the form of trees. The efficiency of our algorithm is demonstrated by comparing the obtained results with the current state-of-the art subgraph counting algorithms. We also show major differences between unicellular and multicellular organisms. The datasets and source code of CombiMotif are freely available upon request.
Trend Motif: A Graph Mining Approach for Analysis of Dynamic Complex Networks
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jin, R; McCallen, S; Almaas, E
2007-05-28
Complex networks have been used successfully in scientific disciplines ranging from sociology to microbiology to describe systems of interacting units. Until recently, studies of complex networks have mainly focused on their network topology. However, in many real world applications, the edges and vertices have associated attributes that are frequently represented as vertex or edge weights. Furthermore, these weights are often not static, instead changing with time and forming a time series. Hence, to fully understand the dynamics of the complex network, we have to consider both network topology and related time series data. In this work, we propose a motifmore » mining approach to identify trend motifs for such purposes. Simply stated, a trend motif describes a recurring subgraph where each of its vertices or edges displays similar dynamics over a userdefined period. Given this, each trend motif occurrence can help reveal significant events in a complex system; frequent trend motifs may aid in uncovering dynamic rules of change for the system, and the distribution of trend motifs may characterize the global dynamics of the system. Here, we have developed efficient mining algorithms to extract trend motifs. Our experimental validation using three disparate empirical datasets, ranging from the stock market, world trade, to a protein interaction network, has demonstrated the efficiency and effectiveness of our approach.« less
Identification of multiple nuclear localization signals in murine Elf3, an ETS transcription factor.
Do, Hyun-Jin; Song, Hyuk; Yang, Heung-Mo; Kim, Dong-Ku; Kim, Nam-Hyung; Kim, Jin-Hoi; Cha, Kwang-Yul; Chung, Hyung-Min; Kim, Jae-Hwan
2006-03-20
We investigated nuclear localization signal (NLS) determinants within the AT-hook and ETS DNA-binding domains of murine Elf3 (mElf3), a member of the subfamily of epithelium-specific ETS transcription factors. Deletion mutants containing the AT-hook, ETS domain or both localized strictly in the nucleus, suggesting that these individual domains contain independent NLS motif(s). Within the AT-hook domain, four basic residues (244KRKR247) were critical for strong NLS activity, and two potent bipartite NLS motifs (236-252 and 249-267) were sufficient for nuclear import of mElf3, although less efficient than the full domain. In addition, one stretch of basic residues (318KKK320) within the ETS domain appears to be essential for mElf3 nuclear localization. Taken together, mElf3 contains multiple NLS motifs, which may function cooperatively to effect efficient nuclear transport.
info-gibbs: a motif discovery algorithm that directly optimizes information content during sampling.
Defrance, Matthieu; van Helden, Jacques
2009-10-15
Discovering cis-regulatory elements in genome sequence remains a challenging issue. Several methods rely on the optimization of some target scoring function. The information content (IC) or relative entropy of the motif has proven to be a good estimator of transcription factor DNA binding affinity. However, these information-based metrics are usually used as a posteriori statistics rather than during the motif search process itself. We introduce here info-gibbs, a Gibbs sampling algorithm that efficiently optimizes the IC or the log-likelihood ratio (LLR) of the motif while keeping computation time low. The method compares well with existing methods like MEME, BioProspector, Gibbs or GAME on both synthetic and biological datasets. Our study shows that motif discovery techniques can be enhanced by directly focusing the search on the motif IC or the motif LLR. http://rsat.ulb.ac.be/rsat/info-gibbs
Tarr, Sarah J; Cryar, Adam; Thalassinos, Konstantinos; Haldar, Kasturi; Osborne, Andrew R
2013-01-01
The malaria parasite exports proteins across its plasma membrane and a surrounding parasitophorous vacuole membrane, into its host erythrocyte. Most exported proteins contain a Host Targeting motif (HT motif) that targets them for export. In the parasite secretory pathway, the HT motif is cleaved by the protease plasmepsin V, but the role of the newly generated N-terminal sequence in protein export is unclear. Using a model protein that is cleaved by an exogenous viral protease, we show that the new N-terminal sequence, normally generated by plasmepsin V cleavage, is sufficient to target a protein for export, and that cleavage by plasmepsin V is not coupled directly to the transfer of a protein to the next component in the export pathway. Mutation of the fourth and fifth positions of the HT motif, as well as amino acids further downstream, block or affect the efficiency of protein export indicating that this region is necessary for efficient export. We also show that the fifth position of the HT motif is important for plasmepsin V cleavage. Our results indicate that plasmepsin V cleavage is required to generate a new N-terminal sequence that is necessary and sufficient to mediate protein export by the malaria parasite. PMID:23279267
Mackiewicz, Dorota; de Oliveira, Paulo Murilo Castro; Moss de Oliveira, Suzana; Cebrat, Stanisław
2013-01-01
Recombination is the main cause of genetic diversity. Thus, errors in this process can lead to chromosomal abnormalities. Recombination events are confined to narrow chromosome regions called hotspots in which characteristic DNA motifs are found. Genomic analyses have shown that both recombination hotspots and DNA motifs are distributed unevenly along human chromosomes and are much more frequent in the subtelomeric regions of chromosomes than in their central parts. Clusters of motifs roughly follow the distribution of recombination hotspots whereas single motifs show a negative correlation with the hotspot distribution. To model the phenomena related to recombination, we carried out computer Monte Carlo simulations of genome evolution. Computer simulations generated uneven distribution of hotspots with their domination in the subtelomeric regions of chromosomes. They also revealed that purifying selection eliminating defective alleles is strong enough to cause such hotspot distribution. After sufficiently long time of simulations, the structure of chromosomes reached a dynamic equilibrium, in which number and global distribution of both hotspots and defective alleles remained statistically unchanged, while their precise positions were shifted. This resembles the dynamic structure of human and chimpanzee genomes, where hotspots change their exact locations but the global distributions of recombination events are very similar.
Mackiewicz, Dorota; de Oliveira, Paulo Murilo Castro; Moss de Oliveira, Suzana; Cebrat, Stanisław
2013-01-01
Recombination is the main cause of genetic diversity. Thus, errors in this process can lead to chromosomal abnormalities. Recombination events are confined to narrow chromosome regions called hotspots in which characteristic DNA motifs are found. Genomic analyses have shown that both recombination hotspots and DNA motifs are distributed unevenly along human chromosomes and are much more frequent in the subtelomeric regions of chromosomes than in their central parts. Clusters of motifs roughly follow the distribution of recombination hotspots whereas single motifs show a negative correlation with the hotspot distribution. To model the phenomena related to recombination, we carried out computer Monte Carlo simulations of genome evolution. Computer simulations generated uneven distribution of hotspots with their domination in the subtelomeric regions of chromosomes. They also revealed that purifying selection eliminating defective alleles is strong enough to cause such hotspot distribution. After sufficiently long time of simulations, the structure of chromosomes reached a dynamic equilibrium, in which number and global distribution of both hotspots and defective alleles remained statistically unchanged, while their precise positions were shifted. This resembles the dynamic structure of human and chimpanzee genomes, where hotspots change their exact locations but the global distributions of recombination events are very similar. PMID:23776462
Chilton, Scott S; Falbel, Tanya G; Hromada, Susan; Burton, Briana M
2017-08-01
Genetic competence is a process in which cells are able to take up DNA from their environment, resulting in horizontal gene transfer, a major mechanism for generating diversity in bacteria. Many bacteria carry homologs of the central DNA uptake machinery that has been well characterized in Bacillus subtilis It has been postulated that the B. subtilis competence helicase ComFA belongs to the DEAD box family of helicases/translocases. Here, we made a series of mutants to analyze conserved amino acid motifs in several regions of B. subtilis ComFA. First, we confirmed that ComFA activity requires amino acid residues conserved among the DEAD box helicases, and second, we show that a zinc finger-like motif consisting of four cysteines is required for efficient transformation. Each cysteine in the motif is important, and mutation of at least two of the cysteines dramatically reduces transformation efficiency. Further, combining multiple cysteine mutations with the helicase mutations shows an additive phenotype. Our results suggest that the helicase and metal binding functions are two distinct activities important for ComFA function during transformation. IMPORTANCE ComFA is a highly conserved protein that has a role in DNA uptake during natural competence, a mechanism for horizontal gene transfer observed in many bacteria. Investigation of the details of the DNA uptake mechanism is important for understanding the ways in which bacteria gain new traits from their environment, such as drug resistance. To dissect the role of ComFA in the DNA uptake machinery, we introduced point mutations into several motifs in the protein sequence. We demonstrate that several amino acid motifs conserved among ComFA proteins are important for efficient transformation. This report is the first to demonstrate the functional requirement of an amino-terminal cysteine motif in ComFA. Copyright © 2017 American Society for Microbiology.
SCOPE: a web server for practical de novo motif discovery.
Carlson, Jonathan M; Chakravarty, Arijit; DeZiel, Charles E; Gross, Robert H
2007-07-01
SCOPE is a novel parameter-free method for the de novo identification of potential regulatory motifs in sets of coordinately regulated genes. The SCOPE algorithm combines the output of three component algorithms, each designed to identify a particular class of motifs. Using an ensemble learning approach, SCOPE identifies the best candidate motifs from its component algorithms. In tests on experimentally determined datasets, SCOPE identified motifs with a significantly higher level of accuracy than a number of other web-based motif finders run with their default parameters. Because SCOPE has no adjustable parameters, the web server has an intuitive interface, requiring only a set of gene names or FASTA sequences and a choice of species. The most significant motifs found by SCOPE are displayed graphically on the main results page with a table containing summary statistics for each motif. Detailed motif information, including the sequence logo, PWM, consensus sequence and specific matching sites can be viewed through a single click on a motif. SCOPE's efficient, parameter-free search strategy has enabled the development of a web server that is readily accessible to the practising biologist while providing results that compare favorably with those of other motif finders. The SCOPE web server is at
ProMotE: an efficient algorithm for counting independent motifs in uncertain network topologies.
Ren, Yuanfang; Sarkar, Aisharjya; Kahveci, Tamer
2018-06-26
Identifying motifs in biological networks is essential in uncovering key functions served by these networks. Finding non-overlapping motif instances is however a computationally challenging task. The fact that biological interactions are uncertain events further complicates the problem, as it makes the existence of an embedding of a given motif an uncertain event as well. In this paper, we develop a novel method, ProMotE (Probabilistic Motif Embedding), to count non-overlapping embeddings of a given motif in probabilistic networks. We utilize a polynomial model to capture the uncertainty. We develop three strategies to scale our algorithm to large networks. Our experiments demonstrate that our method scales to large networks in practical time with high accuracy where existing methods fail. Moreover, our experiments on cancer and degenerative disease networks show that our method helps in uncovering key functional characteristics of biological networks.
DIVERSITY in binding, regulation, and evolution revealed from high-throughput ChIP.
Mitra, Sneha; Biswas, Anushua; Narlikar, Leelavati
2018-04-01
Genome-wide in vivo protein-DNA interactions are routinely mapped using high-throughput chromatin immunoprecipitation (ChIP). ChIP-reported regions are typically investigated for enriched sequence-motifs, which are likely to model the DNA-binding specificity of the profiled protein and/or of co-occurring proteins. However, simple enrichment analyses can miss insights into the binding-activity of the protein. Note that ChIP reports regions making direct contact with the protein as well as those binding through intermediaries. For example, consider a ChIP experiment targeting protein X, which binds DNA at its cognate sites, but simultaneously interacts with four other proteins. Each of these proteins also binds to its own specific cognate sites along distant parts of the genome, a scenario consistent with the current view of transcriptional hubs and chromatin loops. Since ChIP will pull down all X-associated regions, the final reported data will be a union of five distinct sets of regions, each containing binding sites of one of the five proteins, respectively. Characterizing all five different motifs and the corresponding sets is important to interpret the ChIP experiment and ultimately, the role of X in regulation. We present diversity which attempts exactly this: it partitions the data so that each partition can be characterized with its own de novo motif. Diversity uses a Bayesian approach to identify the optimal number of motifs and the associated partitions, which together explain the entire dataset. This is in contrast to standard motif finders, which report motifs individually enriched in the data, but do not necessarily explain all reported regions. We show that the different motifs and associated regions identified by diversity give insights into the various complexes that may be forming along the chromatin, something that has so far not been attempted from ChIP data. Webserver at http://diversity.ncl.res.in/; standalone (Mac OS X/Linux) from https://github.com/NarlikarLab/DIVERSITY/releases/tag/v1.0.0.
Local Renyi entropic profiles of DNA sequences.
Vinga, Susana; Almeida, Jonas S
2007-10-16
In a recent report the authors presented a new measure of continuous entropy for DNA sequences, which allows the estimation of their randomness level. The definition therein explored was based on the Rényi entropy of probability density estimation (pdf) using the Parzen's window method and applied to Chaos Game Representation/Universal Sequence Maps (CGR/USM). Subsequent work proposed a fractal pdf kernel as a more exact solution for the iterated map representation. This report extends the concepts of continuous entropy by defining DNA sequence entropic profiles using the new pdf estimations to refine the density estimation of motifs. The new methodology enables two results. On the one hand it shows that the entropic profiles are directly related with the statistical significance of motifs, allowing the study of under and over-representation of segments. On the other hand, by spanning the parameters of the kernel function it is possible to extract important information about the scale of each conserved DNA region. The computational applications, developed in Matlab m-code, the corresponding binary executables and additional material and examples are made publicly available at http://kdbio.inesc-id.pt/~svinga/ep/. The ability to detect local conservation from a scale-independent representation of symbolic sequences is particularly relevant for biological applications where conserved motifs occur in multiple, overlapping scales, with significant future applications in the recognition of foreign genomic material and inference of motif structures.
Local Renyi entropic profiles of DNA sequences
Vinga, Susana; Almeida, Jonas S
2007-01-01
Background In a recent report the authors presented a new measure of continuous entropy for DNA sequences, which allows the estimation of their randomness level. The definition therein explored was based on the Rényi entropy of probability density estimation (pdf) using the Parzen's window method and applied to Chaos Game Representation/Universal Sequence Maps (CGR/USM). Subsequent work proposed a fractal pdf kernel as a more exact solution for the iterated map representation. This report extends the concepts of continuous entropy by defining DNA sequence entropic profiles using the new pdf estimations to refine the density estimation of motifs. Results The new methodology enables two results. On the one hand it shows that the entropic profiles are directly related with the statistical significance of motifs, allowing the study of under and over-representation of segments. On the other hand, by spanning the parameters of the kernel function it is possible to extract important information about the scale of each conserved DNA region. The computational applications, developed in Matlab m-code, the corresponding binary executables and additional material and examples are made publicly available at . Conclusion The ability to detect local conservation from a scale-independent representation of symbolic sequences is particularly relevant for biological applications where conserved motifs occur in multiple, overlapping scales, with significant future applications in the recognition of foreign genomic material and inference of motif structures. PMID:17939871
Continuation of research into software for space operations support, volume 1
NASA Technical Reports Server (NTRS)
Collier, Mark D.; Killough, Ronnie; Martin, Nancy L.
1990-01-01
A prototype workstation executive called the Hardware Independent Software Development Environment (HISDE) was developed. Software technologies relevant to workstation executives were researched and evaluated and HISDE was used as a test bed for prototyping efforts. New X Windows software concepts and technology were introduced into workstation executives and related applications. The four research efforts performed included: (1) Research into the usability and efficiency of Motif (an X Windows based graphic user interface) which consisted of converting the existing Athena widget based HISDE user interface to Motif demonstrating the usability of Motif and providing insight into the level of effort required to translate an application from widget to another; (2) Prototype a real time data display widget which consisted of research methods for and prototyping the selected method of displaying textual values in an efficient manner; (3) X Windows performance evaluation which consisted of a series of performance measurements which demonstrated the ability of low level X Windows to display textural information; (4) Convert the Display Manager to X Window/Motif which is the application used by NASA for data display during operational mode.
Chen, Jia-Rong; Cao, Yi-Ju; Zou, You-Quan; Tan, Fen; Fu, Liang; Zhu, Xiao-Yu; Xiao, Wen-Jing
2010-03-21
A series of thiourea-amine bifunctional catalysts have been developed by a rational combination of prolines with cinchona alkaloids, which are connected by a thiourea motif. The catalyst 3a, prepared from L-proline and cinchonidine, was found to be a highly efficient catalyst for the conjugate addition of ketones/aldehydes to a wide range of nitroalkenes (up to 98/2 dr and 96% ee). The privileged cinchonidine backbone and the thiourea motif are essential to the reaction activity and enantioselectivity.
Zhang, Lu; Xu, Jinhao; Ma, Jinbiao
2016-07-25
RNA-binding protein exerts important biological function by specifically recognizing RNA motif. SELEX (Systematic evolution of ligands by exponential enrichment), an in vitro selection method, can obtain consensus motif with high-affinity and specificity for many target molecules from DNA or RNA libraries. Here, we combined SELEX with next-generation sequencing to study the protein-RNA interaction in vitro. A pool of RNAs with 20 bp random sequences were transcribed by T7 promoter, and target protein was inserted into plasmid containing SBP-tag, which can be captured by streptavidin beads. Through only one cycle, the specific RNA motif can be obtained, which dramatically improved the selection efficiency. Using this method, we found that human hnRNP A1 RRMs domain (UP1 domain) bound RNA motifs containing AGG and AG sequences. The EMSA experiment indicated that hnRNP A1 RRMs could bind the obtained RNA motif. Taken together, this method provides a rapid and effective method to study the RNA binding specificity of proteins.
Zhang, Limeng; Zhang, Hua; Fan, Ziyao; Zhou, Xue; Yu, Liquan; Sun, Hunan; Wu, Zhijun; Yu, Yongzhong; Song, Baifen; Ma, Jinzhu; Tong, Chunyu; Zhu, Zhanbo; Cui, Yudong
2015-02-01
Streptococcus dysgalactiae (S. dysgalactiae) GapC protein is a protective antigen that induces partial immunity against S. dysgalactiae infection in animals. To identify the conserved B-cell epitope of S. dysgalactiae GapC, a mouse monoclonal antibody 1E11 (mAb1E11) against GapC was generated and used to screen a phage-displayed 12-mer random peptide library (Ph.D.-12). Eleven positive clones recognized by mAb1E11 were identified, most of which matched the consensus motif TGFFAKK. Sequence of the motif exactly matched amino acids 97-103 of the S. dysgalactiae GapC. In addition, the epitope (97)TGFFAKK(103) showed high homology among different streptococcus species. Site-directed mutagenic analysis further confirmed that residues G98, F99, F100 and K103 formed the core of (97)TGFFAKK(103), and this core motif was the minimal determinant of the B-cell epitope recognized by the mAb1E11. Collectively, the identification of conserved B-cell epitope within S. dysgalactiae GapC highlights the possibility of developing the epitope-based vaccine. Copyright © 2014 Elsevier Ltd. All rights reserved.
Efficient activation of transcription in yeast by the BPV1 E2 protein.
Stanway, C A; Sowden, M P; Wilson, L E; Kingsman, A J; Kingsman, S M
1989-01-01
The full-length gene product encoded by the E2 open reading frame (ORF) of bovine papillomavirus type 1 (BPV1) is a transcriptional transactivator. It is believed to mediate its effect on the BPV1 long control region (LCR) by binding to motifs with the consensus sequence ACCN6GGT. The minimal functional cis active site, called the E2 response element (E2RE), in mammalian cells comprises two copies of this motif. Here we have shown that E2 can function in Saccharomyces cerevisiae by placing an E2RE upstream of a synthetic yeast assay promoter which consists of a TATA motif and an mRNA initiation site, spaced correctly. This E2RE-minimal promoter is only transcriptionally active in the presence of E2 protein and the resulting mRNA is initiated at the authentic start site. This is the first report of a mammalian viral transactivator functioning in yeast. The level of activation by E2 via the E2RE was the same as observed with the highly efficient authentic PGK promoter where the upstream activation sequence is composed of three distinct elements. Furthermore a single E2 motif which is insufficient in mammalian cells as an activation site was as efficiently utilized in yeast as the E2RE (2 motifs). Previous studies have shown that mammalian cellular activators can function in yeast and our data now extend this to viral-specific activators. Our data indicate however that while the mechanism of transactivation is broadly conserved there may be significant differences at the detailed level. Images PMID:2539584
Selection of the simplest RNA that binds isoleucine
LOZUPONE, CATHERINE; CHANGAYIL, SHANKAR; MAJERFELD, IRENE; YARUS, MICHAEL
2003-01-01
We have identified the simplest RNA binding site for isoleucine using selection-amplification (SELEX), by shrinking the size of the randomized region until affinity selection is extinguished. Such a protocol can be useful because selection does not necessarily make the simplest active motif most prominent, as is often assumed. We find an isoleucine binding site that behaves exactly as predicted for the site that requires fewest nucleotides. This UAUU motif (16 highly conserved positions; 27 total), is also the most abundant site in successful selections on short random tracts. The UAUU site, now isolated independently at least 63 times, is a small asymmetric internal loop. Conserved loop sequences include isoleucine codon and anticodon triplets, whose nucleotides are required for amino acid binding. This reproducible association between isoleucine and its coding sequences supports the idea that the genetic code is, at least in part, a stereochemical residue of the most easily isolated RNA–amino acid binding structures. PMID:14561881
Jiang, Ya-Jun; Che, Mei-Xia; Yuan, Jin-Qiao; Xie, Yuan-Yuan; Yan, Xian-Zhong; Hu, Hong-Yu
2011-01-01
Huntington disease (HD) is an autosomal inherited disorder that causes the deterioration of brain cells. The polyglutamine (polyQ) expansion of huntingtin (Htt) is implicated in the pathogenesis of HD via interaction with an RNA splicing factor, Htt yeast two-hybrid protein A/forming-binding protein 11 (HYPA/FBP11). Besides the pathogenic polyQ expansion, Htt also contains a proline-rich region (PRR) located exactly in the C terminus to the polyQ tract. However, how the polyQ expansion influences the PRR-mediated protein interaction and how this abnormal interaction leads to the biological consequence remain elusive. Our NMR structural analysis indicates that the PRR motif of Htt cooperatively interacts with the tandem WW domains of HYPA through domain chaperoning effect of WW1 on WW2. The polyQ-expanded Htt sequesters HYPA to the cytosolic location and then significantly reduces the efficiency of pre-mRNA splicing. We propose that the toxic gain-of-function of the polyQ-expanded Htt that causes dysfunction of cellular RNA processing contributes to the pathogenesis of HD. PMID:21566141
Jiang, Ya-Jun; Che, Mei-Xia; Yuan, Jin-Qiao; Xie, Yuan-Yuan; Yan, Xian-Zhong; Hu, Hong-Yu
2011-07-15
Huntington disease (HD) is an autosomal inherited disorder that causes the deterioration of brain cells. The polyglutamine (polyQ) expansion of huntingtin (Htt) is implicated in the pathogenesis of HD via interaction with an RNA splicing factor, Htt yeast two-hybrid protein A/forming-binding protein 11 (HYPA/FBP11). Besides the pathogenic polyQ expansion, Htt also contains a proline-rich region (PRR) located exactly in the C terminus to the polyQ tract. However, how the polyQ expansion influences the PRR-mediated protein interaction and how this abnormal interaction leads to the biological consequence remain elusive. Our NMR structural analysis indicates that the PRR motif of Htt cooperatively interacts with the tandem WW domains of HYPA through domain chaperoning effect of WW1 on WW2. The polyQ-expanded Htt sequesters HYPA to the cytosolic location and then significantly reduces the efficiency of pre-mRNA splicing. We propose that the toxic gain-of-function of the polyQ-expanded Htt that causes dysfunction of cellular RNA processing contributes to the pathogenesis of HD.
ProGeRF: Proteome and Genome Repeat Finder Utilizing a Fast Parallel Hash Function
Moraes, Walas Jhony Lopes; Rodrigues, Thiago de Souza; Bartholomeu, Daniella Castanheira
2015-01-01
Repetitive element sequences are adjacent, repeating patterns, also called motifs, and can be of different lengths; repetitions can involve their exact or approximate copies. They have been widely used as molecular markers in population biology. Given the sizes of sequenced genomes, various bioinformatics tools have been developed for the extraction of repetitive elements from DNA sequences. However, currently available tools do not provide options for identifying repetitive elements in the genome or proteome, displaying a user-friendly web interface, and performing-exhaustive searches. ProGeRF is a web site for extracting repetitive regions from genome and proteome sequences. It was designed to be efficient, fast, and accurate and primarily user-friendly web tool allowing many ways to view and analyse the results. ProGeRF (Proteome and Genome Repeat Finder) is freely available as a stand-alone program, from which the users can download the source code, and as a web tool. It was developed using the hash table approach to extract perfect and imperfect repetitive regions in a (multi)FASTA file, while allowing a linear time complexity. PMID:25811026
Tlatli, Rym; Nozach, Hervé; Collet, Guillaume; Beau, Fabrice; Vera, Laura; Stura, Enrico; Dive, Vincent; Cuniasse, Philippe
2013-01-01
Artificial miniproteins that are able to target catalytic sites of matrix metalloproteinases (MMPs) were designed using a functional motif-grafting approach. The motif corresponded to the four N-terminal residues of TIMP-2, a broad-spectrum protein inhibitor of MMPs. Scaffolds that are able to reproduce the functional topology of this motif were obtained by exhaustive screening of the Protein Data Bank (PDB) using STAMPS software (search for three-dimensional atom motifs in protein structures). Ten artificial protein binders were produced. The designed proteins bind catalytic sites of MMPs with affinities ranging from 450 nm to 450 μm prior to optimization. The crystal structure of one artificial binder in complex with the catalytic domain of MMP-12 showed that the inter-molecular interactions established by the functional motif in the artificial binder corresponded to those found in the MMP-14-TIMP-2 complex, albeit with some differences in geometry. Molecular dynamics simulations of the ten binders in complex with MMP-14 suggested that these scaffolds may allow partial reproduction of native inter-molecular interactions, but differences in geometry and stability may contribute to the lower affinity of the artificial protein binders compared to the natural protein binder. Nevertheless, these results show that the in silico design method used provides sets of protein binders that target a specific binding site with a good rate of success. This approach may constitute the first step of an efficient hybrid computational/experimental approach to protein binder design. © 2012 The Authors Journal compilation © 2012 FEBS.
La Sala, Giuseppina; Riccardi, Laura; Gaspari, Roberto; Cavalli, Andrea; Hantschel, Oliver; De Vivo, Marco
2016-11-08
A number of structural factors modulate the activity of Abelson (Abl) tyrosine kinase, whose deregulation is often related to oncogenic processes. First, only the open conformation of the Abl kinase domain's activation loop (A-loop) favors ATP binding to the catalytic cleft. In this regard, the trans-autophosphorylation of the Y412 residue, which is located along the A-loop, favors the stability of the open conformation, in turn enhancing Abl activity. Another key factor for full Abl activity is the formation of active conformations of the catalytic DFG motif in the Abl kinase domain. Furthermore, binding of the SH2 domain to the N-lobe of the Abl kinase was recently demonstrated to have a long-range allosteric effect on the stabilization of the A-loop open state. Intriguingly, these distinct structural factors imply a complex signal transmission network for controlling the A-loop's flexibility and conformational preference for optimal Abl function. However, the exact dynamical features of this signal transmission network structure remain unclear. Here, we report on microsecond-long molecular dynamics coupled with enhanced sampling simulations of multiple Abl model systems, in the presence or absence of the SH2 domain and with the DFG motif flipped in two ways (in or out conformation). Through comparative analysis, our simulations augment the interpretation of the existing Abl experimental data, revealing a dynamical network of interactions that interconnect SH2 domain binding with A-loop plasticity and Y412 autophosphorylation in Abl. This signaling network engages the DFG motif and, importantly, other conserved structural elements of the kinase domain, namely, the EPK-ELK H-bond network and the HRD motif. Our results show that the signal propagation for modulating the A-loop spatial localization is highly dependent on the HRD motif conformation, which thus acts as the central hub of this (allosteric) signaling network controlling Abl activation and function.
The effect of orthology and coregulation on detecting regulatory motifs.
Storms, Valerie; Claeys, Marleen; Sanchez, Aminael; De Moor, Bart; Verstuyf, Annemieke; Marchal, Kathleen
2010-02-03
Computational de novo discovery of transcription factor binding sites is still a challenging problem. The growing number of sequenced genomes allows integrating orthology evidence with coregulation information when searching for motifs. Moreover, the more advanced motif detection algorithms explicitly model the phylogenetic relatedness between the orthologous input sequences and thus should be well adapted towards using orthologous information. In this study, we evaluated the conditions under which complementing coregulation with orthologous information improves motif detection for the class of probabilistic motif detection algorithms with an explicit evolutionary model. We designed datasets (real and synthetic) covering different degrees of coregulation and orthologous information to test how well Phylogibbs and Phylogenetic sampler, as representatives of the motif detection algorithms with evolutionary model performed as compared to MEME, a more classical motif detection algorithm that treats orthologs independently. Under certain conditions detecting motifs in the combined coregulation-orthology space is indeed more efficient than using each space separately, but this is not always the case. Moreover, the difference in success rate between the advanced algorithms and MEME is still marginal. The success rate of motif detection depends on the complex interplay between the added information and the specificities of the applied algorithms. Insights in this relation provide information useful to both developers and users. All benchmark datasets are available at http://homes.esat.kuleuven.be/~kmarchal/Supplementary_Storms_Valerie_PlosONE.
The Effect of Orthology and Coregulation on Detecting Regulatory Motifs
Storms, Valerie; Claeys, Marleen; Sanchez, Aminael; De Moor, Bart; Verstuyf, Annemieke; Marchal, Kathleen
2010-01-01
Background Computational de novo discovery of transcription factor binding sites is still a challenging problem. The growing number of sequenced genomes allows integrating orthology evidence with coregulation information when searching for motifs. Moreover, the more advanced motif detection algorithms explicitly model the phylogenetic relatedness between the orthologous input sequences and thus should be well adapted towards using orthologous information. In this study, we evaluated the conditions under which complementing coregulation with orthologous information improves motif detection for the class of probabilistic motif detection algorithms with an explicit evolutionary model. Methodology We designed datasets (real and synthetic) covering different degrees of coregulation and orthologous information to test how well Phylogibbs and Phylogenetic sampler, as representatives of the motif detection algorithms with evolutionary model performed as compared to MEME, a more classical motif detection algorithm that treats orthologs independently. Results and Conclusions Under certain conditions detecting motifs in the combined coregulation-orthology space is indeed more efficient than using each space separately, but this is not always the case. Moreover, the difference in success rate between the advanced algorithms and MEME is still marginal. The success rate of motif detection depends on the complex interplay between the added information and the specificities of the applied algorithms. Insights in this relation provide information useful to both developers and users. All benchmark datasets are available at http://homes.esat.kuleuven.be/~kmarchal/Supplementary_Storms_Valerie_PlosONE. PMID:20140085
Pisanti, Nadia; Soldano, Henry; Carpentier, Mathilde; Pothier, Joel
2009-12-01
The geometrical configurations of atoms in protein structures can be viewed as approximate relations among them. Then, finding similar common substructures within a set of protein structures belongs to a new class of problems that generalizes that of finding repeated motifs. The novelty lies in the addition of constraints on the motifs in terms of relations that must hold between pairs of positions of the motifs. We will hence denote them as relational motifs. For this class of problems, we present an algorithm that is a suitable extension of the KMR paradigm and, in particular, of the KMRC as it uses a degenerate alphabet. Our algorithm contains several improvements that become especially useful when-as it is required for relational motifs-the inference is made by partially overlapping shorter motifs, rather than concatenating them. The efficiency, correctness and completeness of the algorithm is ensured by several non-trivial properties that are proven in this paper. The algorithm has been applied in the important field of protein common 3D substructure searching. The methods implemented have been tested on several examples of protein families such as serine proteases, globins and cytochromes P450 additionally. The detected motifs have been compared to those found by multiple structural alignments methods.
Argo_CUDA: Exhaustive GPU based approach for motif discovery in large DNA datasets.
Vishnevsky, Oleg V; Bocharnikov, Andrey V; Kolchanov, Nikolay A
2018-02-01
The development of chromatin immunoprecipitation sequencing (ChIP-seq) technology has revolutionized the genetic analysis of the basic mechanisms underlying transcription regulation and led to accumulation of information about a huge amount of DNA sequences. There are a lot of web services which are currently available for de novo motif discovery in datasets containing information about DNA/protein binding. An enormous motif diversity makes their finding challenging. In order to avoid the difficulties, researchers use different stochastic approaches. Unfortunately, the efficiency of the motif discovery programs dramatically declines with the query set size increase. This leads to the fact that only a fraction of top "peak" ChIP-Seq segments can be analyzed or the area of analysis should be narrowed. Thus, the motif discovery in massive datasets remains a challenging issue. Argo_Compute Unified Device Architecture (CUDA) web service is designed to process the massive DNA data. It is a program for the detection of degenerate oligonucleotide motifs of fixed length written in 15-letter IUPAC code. Argo_CUDA is a full-exhaustive approach based on the high-performance GPU technologies. Compared with the existing motif discovery web services, Argo_CUDA shows good prediction quality on simulated sets. The analysis of ChIP-Seq sequences revealed the motifs which correspond to known transcription factor binding sites.
Damsel: A Data Model Storage Library for Exascale Science
DOE Office of Scientific and Technical Information (OSTI.GOV)
Koziol, Quincey
The goal of this project is to enable exascale computational science applications to interact conveniently and efficiently with storage through abstractions that match their data models. We will accomplish this through three major activities: (1) identifying major data model motifs in computational science applications and developing representative benchmarks; (2) developing a data model storage library, called Damsel, that supports these motifs, provides efficient storage data layouts, incorporates optimizations to enable exascale operation, and is tolerant to failures; and (3) productizing Damsel and working with computational scientists to encourage adoption of this library by the scientific community.
Zhang, ZhiZhuo; Chang, Cheng Wei; Hugo, Willy; Cheung, Edwin; Sung, Wing-Kin
2013-03-01
Although de novo motifs can be discovered through mining over-represented sequence patterns, this approach misses some real motifs and generates many false positives. To improve accuracy, one solution is to consider some additional binding features (i.e., position preference and sequence rank preference). This information is usually required from the user. This article presents a de novo motif discovery algorithm called SEME (sampling with expectation maximization for motif elicitation), which uses pure probabilistic mixture model to model the motif's binding features and uses expectation maximization (EM) algorithms to simultaneously learn the sequence motif, position, and sequence rank preferences without asking for any prior knowledge from the user. SEME is both efficient and accurate thanks to two important techniques: the variable motif length extension and importance sampling. Using 75 large-scale synthetic datasets, 32 metazoan compendium benchmark datasets, and 164 chromatin immunoprecipitation sequencing (ChIP-Seq) libraries, we demonstrated the superior performance of SEME over existing programs in finding transcription factor (TF) binding sites. SEME is further applied to a more difficult problem of finding the co-regulated TF (coTF) motifs in 15 ChIP-Seq libraries. It identified significantly more correct coTF motifs and, at the same time, predicted coTF motifs with better matching to the known motifs. Finally, we show that the learned position and sequence rank preferences of each coTF reveals potential interaction mechanisms between the primary TF and the coTF within these sites. Some of these findings were further validated by the ChIP-Seq experiments of the coTFs. The application is available online.
GPUmotif: An Ultra-Fast and Energy-Efficient Motif Analysis Program Using Graphics Processing Units
Zandevakili, Pooya; Hu, Ming; Qin, Zhaohui
2012-01-01
Computational detection of TF binding patterns has become an indispensable tool in functional genomics research. With the rapid advance of new sequencing technologies, large amounts of protein-DNA interaction data have been produced. Analyzing this data can provide substantial insight into the mechanisms of transcriptional regulation. However, the massive amount of sequence data presents daunting challenges. In our previous work, we have developed a novel algorithm called Hybrid Motif Sampler (HMS) that enables more scalable and accurate motif analysis. Despite much improvement, HMS is still time-consuming due to the requirement to calculate matching probabilities position-by-position. Using the NVIDIA CUDA toolkit, we developed a graphics processing unit (GPU)-accelerated motif analysis program named GPUmotif. We proposed a “fragmentation" technique to hide data transfer time between memories. Performance comparison studies showed that commonly-used model-based motif scan and de novo motif finding procedures such as HMS can be dramatically accelerated when running GPUmotif on NVIDIA graphics cards. As a result, energy consumption can also be greatly reduced when running motif analysis using GPUmotif. The GPUmotif program is freely available at http://sourceforge.net/projects/gpumotif/ PMID:22662128
GPUmotif: an ultra-fast and energy-efficient motif analysis program using graphics processing units.
Zandevakili, Pooya; Hu, Ming; Qin, Zhaohui
2012-01-01
Computational detection of TF binding patterns has become an indispensable tool in functional genomics research. With the rapid advance of new sequencing technologies, large amounts of protein-DNA interaction data have been produced. Analyzing this data can provide substantial insight into the mechanisms of transcriptional regulation. However, the massive amount of sequence data presents daunting challenges. In our previous work, we have developed a novel algorithm called Hybrid Motif Sampler (HMS) that enables more scalable and accurate motif analysis. Despite much improvement, HMS is still time-consuming due to the requirement to calculate matching probabilities position-by-position. Using the NVIDIA CUDA toolkit, we developed a graphics processing unit (GPU)-accelerated motif analysis program named GPUmotif. We proposed a "fragmentation" technique to hide data transfer time between memories. Performance comparison studies showed that commonly-used model-based motif scan and de novo motif finding procedures such as HMS can be dramatically accelerated when running GPUmotif on NVIDIA graphics cards. As a result, energy consumption can also be greatly reduced when running motif analysis using GPUmotif. The GPUmotif program is freely available at http://sourceforge.net/projects/gpumotif/
A novel swarm intelligence algorithm for finding DNA motifs.
Lei, Chengwei; Ruan, Jianhua
2009-01-01
Discovering DNA motifs from co-expressed or co-regulated genes is an important step towards deciphering complex gene regulatory networks and understanding gene functions. Despite significant improvement in the last decade, it still remains one of the most challenging problems in computational molecular biology. In this work, we propose a novel motif finding algorithm that finds consensus patterns using a population-based stochastic optimisation technique called Particle Swarm Optimisation (PSO), which has been shown to be effective in optimising difficult multidimensional problems in continuous domains. We propose to use a word dissimilarity graph to remap the neighborhood structure of the solution space of DNA motifs, and propose a modification of the naive PSO algorithm to accommodate discrete variables. In order to improve efficiency, we also propose several strategies for escaping from local optima and for automatically determining the termination criteria. Experimental results on simulated challenge problems show that our method is both more efficient and more accurate than several existing algorithms. Applications to several sets of real promoter sequences also show that our approach is able to detect known transcription factor binding sites, and outperforms two of the most popular existing algorithms.
Huang, Jin; Ying, Le; Yang, Xiaohai; Yang, Yanjing; Quan, Ke; Wang, He; Xie, Nuli; Ou, Min; Zhou, Qifeng; Wang, Kemin
2015-09-01
We designed a new ratiometric fluorescent nanoprobe for sensing pH values in living cells. Briefly, the nanoprobe consists of a gold nanoparticle (AuNP), short single-stranded oligonucleotides, and dual-fluorophore-labeled i-motif sequences. The short oligonucleotides are designed to bind with the i-motif sequences and immobilized on the AuNP surface via Au-S bond. At neutral pH, the dual fluorophores are separated, resulting in very low fluorescence resonance energy transfer (FRET) efficiency. At acidic pH, the i-motif strands fold into a quadruplex structure and leave the AuNP, bringing the dual fluorophores into close proximity, resulting in high FRET efficiency, which could be used as a signal for pH sensing. The nanoprobe possesses abilities of cellular transfection, enzymatic protection, fast response and quantitative pH detection. The in vitro and intracellular applications of the nanoprobe were demonstrated, which showed excellent response in the physiological pH range. Furthermore, our experimental results suggested that the nanoprobe showed excellent spatial and temporal resolution in living cells. We think that the ratiometric sensing strategy could potentially be applied to create a variety of new multicolor sensors for intracellular detection.
Woodman, Zenda L; Schwager, Sylva L U; Redelinghuys, Pierre; Carmona, Adriana K; Ehlers, Mario R W; Sturrock, Edward D
2005-08-01
sACE (somatic angiotensin-converting enzyme) consists of two homologous, N and C domains, whereas the testis isoenzyme [tACE (testis ACE)] consists of a single C domain. Both isoenzymes are shed from the cell surface by a sheddase activity, although sACE is shed much less efficiently than tACE. We hypothesize that the N domain of sACE plays a regulatory role, by occluding a recognition motif on the C domain required for ectodomain shedding and by influencing the catalytic efficiency. To test this, we constructed two mutants: CNdom-ACE and CCdom-ACE. CNdom-ACE was shed less efficiently than sACE, whereas CCdom-ACE was shed as efficiently as tACE. Notably, cleavage occurred both within the stalk and the interdomain bridge in both mutants, suggesting that a sheddase recognition motif resides within the C domain and is capable of directly cleaving at both positions. Analysis of the catalytic properties of the mutants and comparison with sACE and tACE revealed that the k(cat) for sACE and CNdom-ACE was less than or equal to the sum of the kcat values for tACE and the N-domain, suggesting negative co-operativity, whereas the kcat value for the CCdom-ACE suggested positive co-operativity between the two domains. Taken together, the results provide support for (i) the existence of a sheddase recognition motif in the C domain and (ii) molecular flexibility of the N and C domains in sACE, resulting in occlusion of the C-domain recognition motif by the N domain as well as close contact of the two domains during hydrolysis of peptide substrates.
Woodman, Zenda L.; Schwager, Sylva L. U.; Redelinghuys, Pierre; Carmona, Adriana K.; Ehlers, Mario R. W.; Sturrock, Edward D.
2005-01-01
sACE (somatic angiotensin-converting enzyme) consists of two homologous, N and C domains, whereas the testis isoenzyme [tACE (testis ACE)] consists of a single C domain. Both isoenzymes are shed from the cell surface by a sheddase activity, although sACE is shed much less efficiently than tACE. We hypothesize that the N domain of sACE plays a regulatory role, by occluding a recognition motif on the C domain required for ectodomain shedding and by influencing the catalytic efficiency. To test this, we constructed two mutants: CNdom-ACE and CCdom-ACE. CNdom-ACE was shed less efficiently than sACE, whereas CCdom-ACE was shed as efficiently as tACE. Notably, cleavage occurred both within the stalk and the interdomain bridge in both mutants, suggesting that a sheddase recognition motif resides within the C domain and is capable of directly cleaving at both positions. Analysis of the catalytic properties of the mutants and comparison with sACE and tACE revealed that the kcat for sACE and CNdom-ACE was less than or equal to the sum of the kcat values for tACE and the N-domain, suggesting negative co-operativity, whereas the kcat value for the CCdom-ACE suggested positive co-operativity between the two domains. Taken together, the results provide support for (i) the existence of a sheddase recognition motif in the C domain and (ii) molecular flexibility of the N and C domains in sACE, resulting in occlusion of the C-domain recognition motif by the N domain as well as close contact of the two domains during hydrolysis of peptide substrates. PMID:15813703
Havrila, Marek; Réblová, Kamila; Zirbel, Craig L.; Leontis, Neocles B.; Šponer, Jiří
2013-01-01
The Sarcin-Ricin RNA motif (SR motif) is one of the most prominent recurrent RNA building blocks that occurs in many different RNA contexts and folds autonomously, i.e., in a context-independent manner. In this study, we combined bioinformatics analysis with explicit-solvent molecular dynamics (MD) simulations to better understand the relation between the RNA sequence and the evolutionary patterns of SR motif. SHAPE probing experiment was also performed to confirm fidelity of MD simulations. We identified 57 instances of the SR motif in a non-redundant subset of the RNA X-ray structure database and analyzed their basepairing, base-phosphate, and backbone-backbone interactions. We extracted sequences aligned to these instances from large ribosomal RNA alignments to determine frequency of occurrence for different sequence variants. We then used a simple scoring scheme based on isostericity to suggest 10 sequence variants with highly variable expected degree of compatibility with the SR motif 3D structure. We carried out MD simulations of SR motifs with these base substitutions. Non isosteric base substitutions led to unstable structures, but so did isosteric substitutions which were unable to make key base-phosphate interactions. MD technique explains why some potentially isosteric SR motifs are not realized during evolution. We also found that inability to form stable cWW geometry is an important factor in case of the first base pair of the flexible region of the SR motif. Comparison of structural, bioinformatics, SHAPE probing and MD simulation data reveals that explicit solvent MD simulations neatly reflect viability of different sequence variants of the SR motif. Thus, MD simulations can efficiently complement bioinformatics tools in studies of conservation patterns of RNA motifs and provide atomistic insight into the role of their different signature interactions. PMID:24144333
Cellular automata simulation of topological effects on the dynamics of feed-forward motifs
Apte, Advait A; Cain, John W; Bonchev, Danail G; Fong, Stephen S
2008-01-01
Background Feed-forward motifs are important functional modules in biological and other complex networks. The functionality of feed-forward motifs and other network motifs is largely dictated by the connectivity of the individual network components. While studies on the dynamics of motifs and networks are usually devoted to the temporal or spatial description of processes, this study focuses on the relationship between the specific architecture and the overall rate of the processes of the feed-forward family of motifs, including double and triple feed-forward loops. The search for the most efficient network architecture could be of particular interest for regulatory or signaling pathways in biology, as well as in computational and communication systems. Results Feed-forward motif dynamics were studied using cellular automata and compared with differential equation modeling. The number of cellular automata iterations needed for a 100% conversion of a substrate into a target product was used as an inverse measure of the transformation rate. Several basic topological patterns were identified that order the specific feed-forward constructions according to the rate of dynamics they enable. At the same number of network nodes and constant other parameters, the bi-parallel and tri-parallel motifs provide higher network efficacy than single feed-forward motifs. Additionally, a topological property of isodynamicity was identified for feed-forward motifs where different network architectures resulted in the same overall rate of the target production. Conclusion It was shown for classes of structural motifs with feed-forward architecture that network topology affects the overall rate of a process in a quantitatively predictable manner. These fundamental results can be used as a basis for simulating larger networks as combinations of smaller network modules with implications on studying synthetic gene circuits, small regulatory systems, and eventually dynamic whole-cell models. PMID:18304325
Genetic Retargeting of Adenovirus: Novel Strategy Employing “Deknobbing” of the Fiber
Magnusson, Maria K.; Hong, Saw See; Boulanger, Pierre; Lindholm, Leif
2001-01-01
For efficient and versatile use of adenovirus (Ad) as an in vivo gene therapy vector, modulation of the viral tropism is highly desirable. In this study, a novel method to genetically alter the Ad fiber tropism is described. The knob and the last 15 shaft repeats of the fiber gene were deleted and replaced with an external trimerization motif and a new cell-binding ligand, in this case the integrin-binding motif RGD. The corresponding recombinant fiber retained the basic biological functions of the natural fiber, i.e., trimerization, nuclear import, penton formation, and ligand binding. The recombinant fiber bound to integrins but failed to react with antiknob antibody. For virus production, the recombinant fiber gene was rescued into the Ad genome at the exact position of the wild-type (WT) fiber to make use of the native regulation of fiber expression. The recombinant virus Ad5/FibR7-RGD yielded plaques on 293 cells, but the spread through the monolayer was two to three times delayed compared to WT, and the ratio of infectious to physical particles was 20 times lower. Studies on virus tropism showed that Ad5/FibR7-RGD was able to infect cells which did not express the coxsackie-adenovirus receptor (CAR), but did express integrins. Ad5/FibR7-RGD virus infectivity was unchanged in the presence of antiknob antibody, which neutralized the WT virus. Ad5/FibR7-RGD virus showed an expanded tropism, which is useful when gene transfer to cells not expressing CAR is needed. The described method should also make possible the construction of Ad genetically retargeted via ligands other than RGD. PMID:11462000
Liu, Bingqiang; Zhang, Hanyuan; Zhou, Chuan; Li, Guojun; Fennell, Anne; Wang, Guanghui; Kang, Yu; Liu, Qi; Ma, Qin
2016-08-09
Phylogenetic footprinting is an important computational technique for identifying cis-regulatory motifs in orthologous regulatory regions from multiple genomes, as motifs tend to evolve slower than their surrounding non-functional sequences. Its application, however, has several difficulties for optimizing the selection of orthologous data and reducing the false positives in motif prediction. Here we present an integrative phylogenetic footprinting framework for accurate motif predictions in prokaryotic genomes (MP(3)). The framework includes a new orthologous data preparation procedure, an additional promoter scoring and pruning method and an integration of six existing motif finding algorithms as basic motif search engines. Specifically, we collected orthologous genes from available prokaryotic genomes and built the orthologous regulatory regions based on sequence similarity of promoter regions. This procedure made full use of the large-scale genomic data and taxonomy information and filtered out the promoters with limited contribution to produce a high quality orthologous promoter set. The promoter scoring and pruning is implemented through motif voting by a set of complementary predicting tools that mine as many motif candidates as possible and simultaneously eliminate the effect of random noise. We have applied the framework to Escherichia coli k12 genome and evaluated the prediction performance through comparison with seven existing programs. This evaluation was systematically carried out at the nucleotide and binding site level, and the results showed that MP(3) consistently outperformed other popular motif finding tools. We have integrated MP(3) into our motif identification and analysis server DMINDA, allowing users to efficiently identify and analyze motifs in 2,072 completely sequenced prokaryotic genomes. The performance evaluation indicated that MP(3) is effective for predicting regulatory motifs in prokaryotic genomes. Its application may enhance progress in elucidating transcription regulation mechanism, thus provide benefit to the genomic research community and prokaryotic genome researchers in particular.
Khurana, Simran; Chakraborty, Sharmistha; Zhao, Xuan; Liu, Yu; Guan, Dongyin; Lam, Minh; Huang, Wei; Yang, Sichun; Kao, Hung-Ying
2012-01-01
α-Actinins (ACTNs) are a family of proteins cross-linking actin filaments that maintain cytoskeletal organization and cell motility. Recently, it has also become clear that ACTN4 can function in the nucleus. In this report, we found that ACTN4 (full length) and its spliced isoform ACTN4 (Iso) possess an unusual LXXLL nuclear receptor interacting motif. Both ACTN4 (full length) and ACTN4 (Iso) potentiate basal transcription activity and directly interact with estrogen receptor α, although ACTN4 (Iso) binds ERα more strongly. We have also found that both ACTN4 (full length) and ACTN4 (Iso) interact with the ligand-independent and the ligand-dependent activation domains of estrogen receptor α. Although ACTN4 (Iso) interacts efficiently with transcriptional co-activators such as p300/CBP-associated factor (PCAF) and steroid receptor co-activator 1 (SRC-1), the full length ACTN4 protein either does not or does so weakly. More importantly, the flanking sequences of the LXXLL motif are important not only for interacting with nuclear receptors but also for the association with co-activators. Taken together, we have identified a novel extended LXXLL motif that is critical for interactions with both receptors and co-activators. This motif functions more efficiently in a spliced isoform of ACTN4 than it does in the full-length protein. PMID:22908231
Searching RNA motifs and their intermolecular contacts with constraint networks.
Thébault, P; de Givry, S; Schiex, T; Gaspin, C
2006-09-01
Searching RNA gene occurrences in genomic sequences is a task whose importance has been renewed by the recent discovery of numerous functional RNA, often interacting with other ligands. Even if several programs exist for RNA motif search, none exists that can represent and solve the problem of searching for occurrences of RNA motifs in interaction with other molecules. We present a constraint network formulation of this problem. RNA are represented as structured motifs that can occur on more than one sequence and which are related together by possible hybridization. The implemented tool MilPat is used to search for several sRNA families in genomic sequences. Results show that MilPat allows to efficiently search for interacting motifs in large genomic sequences and offers a simple and extensible framework to solve such problems. New and known sRNA are identified as H/ACA candidates in Methanocaldococcus jannaschii. http://carlit.toulouse.inra.fr/MilPaT/MilPat.pl.
Finding Hidden Location Patterns of Two Competitive Supermarkets in Thailand
NASA Astrophysics Data System (ADS)
Khumsri, Jinattaporn; Fujihara, Akihiro
There are two famous supermarkets in Thailand: Big C and Lotus. They are the highest competitive supermarkets whose hold the most market share by lots of promotions and also gather all convenience services including banking, restaurant, and others. In recent years, they gradually expand their stores and they take a similar strategy to determine where to locate a store. It is important for them to consider store allocation to obtain new customers efficiently. To consider this, we gather geographical locations of these supermarkets from Twitter using Twitter API. We gathered tweets having these supermarket names and geotags for seven months. To extract hidden location patterns from gathered data, we introduce location motif which is a directed subgraph whose edges are linked to every pair of the shortest-distance opponent node. We investigate every possible configuration of location motif when they have a small number of nodes and find that the configuration increases exponentially. We also visualize location motifs generated from gathered data on the map of Thailand and count the frequency of observed location motifs. As a result, we find that even if the possible location motifs exponentially increase as the number of nodes grows, limited location motifs can be observed. Using location motif, we successfully find an evidence of biased store allocation in reality.
Identification of a conserved B-cell epitope on the GapC protein of Streptococcus dysgalactiae.
Zhang, Limeng; Zhou, Xue; Fan, Ziyao; Tang, Wei; Chen, Liang; Dai, Jian; Wei, Yuhua; Zhang, Jianxin; Yang, Xuan; Yang, Xijing; Liu, Daolong; Yu, Liquan; Zhang, Hua; Wu, Zhijun; Yu, Yongzhong; Sun, Hunan; Cui, Yudong
2015-01-01
Streptococcus dysgalactiae (S. dysgalactia) GapC is a highly conserved surface dehydrogenase among the streptococcus spp., which is responsible for inducing protective antibody immune responses in animals. However, the B-cell epitope of S. dysgalactia GapC have not been well characterized. In this study, a monoclonal antibody 1F2 (mAb1F2) against S. dysgalactiae GapC was generated by the hybridoma technique and used to screen a phage-displayed 12-mer random peptide library (Ph.D.-12) for mapping the linear B-cell epitope. The mAb1F2 recognized phages displaying peptides with the consensus motif TRINDLT. Amino acid sequence of the motif exactly matched (30)TRINDLT(36) of the S. dysgalactia GapC. Subsequently, site-directed mutagenic analysis further demonstrated that residues R31, I32, N33, D34 and L35 formed the core of (30)TRINDLT(36), and this core motif was the minimal determinant of the B-cell epitope recognized by the mAb1F2. The epitope (30)TRINDLT(36) showed high homology among different streptococcus species. Overall, our findings characterized a conserved B-cell epitope, which will be useful for the further study of epitope-based vaccines. Copyright © 2015 Elsevier Ltd. All rights reserved.
Improving the Accuracy and Scalability of Discriminative Learning Methods for Markov Logic Networks
2011-05-01
9 2.2 Inductive Logic Programming and Aleph . . . . . . . . . . . . 10 2.3 MLNs and Alchemy ...positive examples. Aleph allows users to customize each of 10 these steps, and thereby supports a variety of specific algorithms. 2.3 MLNs and Alchemy An...tural motifs. By limiting the search to each unique motif, LSM is able to find good clauses in an efficient manner. Alchemy (Kok, Singla, Richardson
Seet, Bruce T; Berry, Donna M; Maltzman, Jonathan S; Shabason, Jacob; Raina, Monica; Koretzky, Gary A; McGlade, C Jane; Pawson, Tony
2007-02-07
The relationship between the binding affinity and specificity of modular interaction domains is potentially important in determining biological signaling responses. In signaling from the T-cell receptor (TCR), the Gads C-terminal SH3 domain binds a core RxxK sequence motif in the SLP-76 scaffold. We show that residues surrounding this motif are largely optimized for binding the Gads C-SH3 domain resulting in a high-affinity interaction (K(D)=8-20 nM) that is essential for efficient TCR signaling in Jurkat T cells, since Gads-mediated signaling declines with decreasing affinity. Furthermore, the SLP-76 RxxK motif has evolved a very high specificity for the Gads C-SH3 domain. However, TCR signaling in Jurkat cells is tolerant of potential SLP-76 crossreactivity, provided that very high-affinity binding to the Gads C-SH3 domain is maintained. These data provide a quantitative argument that the affinity of the Gads C-SH3 domain for SLP-76 is physiologically important and suggest that the integrity of TCR signaling in vivo is sustained both by strong selection of SLP-76 for the Gads C-SH3 domain and by a capacity to buffer intrinsic crossreactivity.
Lin, Yi-Chieh; Chen, Bing-Mae; Lu, Wei-Cheng; Su, Chien-I; Prijovich, Zeljko M.; Chung, Wen-Chuan; Wu, Pei-Yu; Chen, Kai-Chuan; Lee, I-Chiao; Juan, Ting-Yi; Roffler, Steve R.
2013-01-01
Membrane-tethered proteins (mammalian surface display) are increasingly being used for novel therapeutic and biotechnology applications. Maximizing surface expression of chimeric proteins on mammalian cells is important for these applications. We show that the cytoplasmic domain from the B7-1 antigen, a commonly used element for mammalian surface display, can enhance the intracellular transport and surface display of chimeric proteins in a Sar1 and Rab1 dependent fashion. However, mutational, alanine scanning and deletion analysis demonstrate the absence of linear ER export motifs in the B7 cytoplasmic domain. Rather, efficient intracellular transport correlated with the presence of predicted secondary structure in the cytoplasmic tail. Examination of the cytoplasmic domains of 984 human and 782 mouse type I transmembrane proteins revealed that many previously identified ER export motifs are rarely found in the cytoplasmic tail of type I transmembrane proteins. Our results suggest that efficient intracellular transport of B7 chimeric proteins is associated with the structure rather than to the presence of a linear ER export motif in the cytoplasmic tail, and indicate that short (less than ~ 10-20 amino acids) and unstructured cytoplasmic tails should be avoided to express high levels of chimeric proteins on mammalian cells. PMID:24073236
DNA polymerase preference determines PCR priming efficiency.
Pan, Wenjing; Byrne-Steele, Miranda; Wang, Chunlin; Lu, Stanley; Clemmons, Scott; Zahorchak, Robert J; Han, Jian
2014-01-30
Polymerase chain reaction (PCR) is one of the most important developments in modern biotechnology. However, PCR is known to introduce biases, especially during multiplex reactions. Recent studies have implicated the DNA polymerase as the primary source of bias, particularly initiation of polymerization on the template strand. In our study, amplification from a synthetic library containing a 12 nucleotide random portion was used to provide an in-depth characterization of DNA polymerase priming bias. The synthetic library was amplified with three commercially available DNA polymerases using an anchored primer with a random 3' hexamer end. After normalization, the next generation sequencing (NGS) results of the amplified libraries were directly compared to the unamplified synthetic library. Here, high throughput sequencing was used to systematically demonstrate and characterize DNA polymerase priming bias. We demonstrate that certain sequence motifs are preferred over others as primers where the six nucleotide sequences at the 3' end of the primer, as well as the sequences four base pairs downstream of the priming site, may influence priming efficiencies. DNA polymerases in the same family from two different commercial vendors prefer similar motifs, while another commercially available enzyme from a different DNA polymerase family prefers different motifs. Furthermore, the preferred priming motifs are GC-rich. The DNA polymerase preference for certain sequence motifs was verified by amplification from single-primer templates. We incorporated the observed DNA polymerase preference into a primer-design program that guides the placement of the primer to an optimal location on the template. DNA polymerase priming bias was characterized using a synthetic library amplification system and NGS. The characterization of DNA polymerase priming bias was then utilized to guide the primer-design process and demonstrate varying amplification efficiencies among three commercially available DNA polymerases. The results suggest that the interaction of the DNA polymerase with the primer:template junction during the initiation of DNA polymerization is very important in terms of overall amplification bias and has broader implications for both the primer design process and multiplex PCR.
β-hairpin-mediated nucleation of polyglutamine amyloid formation
Kar, Karunakar; Hoop, Cody L.; Drombosky, Kenneth W.; Baker, Matthew A.; Kodali, Ravindra; Arduini, Irene; van der Wel, Patrick C. A.; Horne, W. Seth; Wetzel, Ronald
2013-01-01
The conformational preferences of polyglutamine (polyQ) sequences are of major interest because of their central importance in the expanded CAG repeat diseases that include Huntington’s disease (HD). Here we explore the response of various biophysical parameters to the introduction of β-hairpin motifs within polyQ sequences. These motifs (trpzip, disulfide, D-Pro-Gly, Coulombic attraction, L-Pro-Gly) enhance formation rates and stabilities of amyloid fibrils with degrees of effectiveness well-correlated with their known abilities to enhance β-hairpin formation in other peptides. These changes led to decreases in the critical nucleus for amyloid formation from a value of n* = 4 for a simple, unbroken Q23 sequence to approximate unitary n* values for similar length polyQs containing β-hairpin motifs. At the same time, the morphologies, secondary structures, and bioactivities of the resulting fibrils were essentially unchanged from simple polyQ aggregates. In particular, the signature pattern of SSNMR 13C Gln resonances that appears to be unique to polyQ amyloid is replicated exactly in fibrils from a β-hairpin polyQ. Importantly, while β-hairpin motifs do produce enhancements in the equilibrium constant for nucleation in aggregation reactions, these Kn* values remain quite low (~ 10−10) and there is no evidence for significant embellishment of β-structure within the monomer ensemble. The results indicate an important role for β-turns in the nucleation mechanism and structure of polyQ amyloid and have implications for the nature of the toxic species in expanded CAG repeat diseases. PMID:23353826
Duda, Teresa; Bharill, Shashank; Wojtas, Ireneusz; Yadav, Prem; Gryczynski, Ignacy; Gryczynski, Zygmunt; Sharma, Rameshwar K.
2010-01-01
ANF-RGC$ membrane guanylate cyclase is the receptor for the hypotensive peptide hormones, atrial natriuretic factor (ANF) and type B natriuretic peptide (BNP). It is a single transmembrane spanning protein. Binding the hormone to the extracellular domain activates its intracellular catalytic domain. This results in accelerated production of cyclic GMP, a second messenger in controlling blood pressure, cardiac vasculature and fluid secretion. ATP is the obligatory transducer of the ANF signal. It works through its ATP regulated module, ARM, which is juxtaposed to the C-terminal side of the transmembrane domain. Upon interaction, ATP induces a cascade of temporal and spatial changes in the ARM, which, finally, result in activation of the catalytic module. Although the exact nature and the details of these changes are not known, some of these have been stereographed in the simulated three-dimensional model of the ARM and validated biochemically. Through comprehensive techniques ofsteady-state, time-resolved tryptophan fluorescence and Forster Resonance Energy Transfer (FRET), site-directed and deletion-mutagenesis, and reconstitution, the present study validates and explains themechanism of the model-based predicted transduction role of the ARM’s structural motif, 669WTAPELL675. This motif is critical in the ATP-dependent ANF signaling. Molecular modeling shows that ATP binding exposes the 669WTAPELL675 motif, the exposure, in turn, facilitates its interaction and activation of the catalytic module. These principles of the model have been experimentally validated. This knowledge brings us a step closer to our understanding of the mechanism by which the ATP-dependent spatial changes within the ARM cause ANF signaling of ANF-RGC. PMID:19137266
Correlated Mutation in the Evolution of Catalysis in Uracil DNA Glycosylase Superfamily
NASA Astrophysics Data System (ADS)
Xia, Bo; Liu, Yinling; Guevara, Jose; Li, Jing; Jilich, Celeste; Yang, Ye; Wang, Liangjiang; Dominy, Brian N.; Cao, Weiguo
2017-04-01
Enzymes in Uracil DNA glycosylase (UDG) superfamily are essential for the removal of uracil. Family 4 UDGa is a robust uracil DNA glycosylase that only acts on double-stranded and single-stranded uracil-containing DNA. Based on mutational, kinetic and modeling analyses, a catalytic mechanism involving leaving group stabilization by H155 in motif 2 and water coordination by N89 in motif 3 is proposed. Mutual Information analysis identifies a complexed correlated mutation network including a strong correlation in the EG doublet in motif 1 of family 4 UDGa and in the QD doublet in motif 1 of family 1 UNG. Conversion of EG doublet in family 4 Thermus thermophilus UDGa to QD doublet increases the catalytic efficiency by over one hundred-fold and seventeen-fold over the E41Q and G42D single mutation, respectively, rectifying the strong correlation in the doublet. Molecular dynamics simulations suggest that the correlated mutations in the doublet in motif 1 position the catalytic H155 in motif 2 to stabilize the leaving uracilate anion. The integrated approach has important implications in studying enzyme evolution and protein structure and function.
SVM2Motif—Reconstructing Overlapping DNA Sequence Motifs by Mimicking an SVM Predictor
Vidovic, Marina M. -C.; Görnitz, Nico; Müller, Klaus-Robert; Rätsch, Gunnar; Kloft, Marius
2015-01-01
Identifying discriminative motifs underlying the functionality and evolution of organisms is a major challenge in computational biology. Machine learning approaches such as support vector machines (SVMs) achieve state-of-the-art performances in genomic discrimination tasks, but—due to its black-box character—motifs underlying its decision function are largely unknown. As a remedy, positional oligomer importance matrices (POIMs) allow us to visualize the significance of position-specific subsequences. Although being a major step towards the explanation of trained SVM models, they suffer from the fact that their size grows exponentially in the length of the motif, which renders their manual inspection feasible only for comparably small motif sizes, typically k ≤ 5. In this work, we extend the work on positional oligomer importance matrices, by presenting a new machine-learning methodology, entitled motifPOIM, to extract the truly relevant motifs—regardless of their length and complexity—underlying the predictions of a trained SVM model. Our framework thereby considers the motifs as free parameters in a probabilistic model, a task which can be phrased as a non-convex optimization problem. The exponential dependence of the POIM size on the oligomer length poses a major numerical challenge, which we address by an efficient optimization framework that allows us to find possibly overlapping motifs consisting of up to hundreds of nucleotides. We demonstrate the efficacy of our approach on a synthetic data set as well as a real-world human splice site data set. PMID:26690911
Human telomeric DNA: G-quadruplex, i-motif and Watson–Crick double helix
Phan, Anh Tuân; Mergny, Jean-Louis
2002-01-01
Human telomeric DNA composed of (TTAGGG/CCCTAA)n repeats may form a classical Watson–Crick double helix. Each individual strand is also prone to quadruplex formation: the G-rich strand may adopt a G-quadruplex conformation involving G-quartets whereas the C-rich strand may fold into an i-motif based on intercalated C·C+ base pairs. Using an equimolar mixture of the telomeric oligonucleotides d[AGGG(TTAGGG)3] and d[(CCCTAA)3CCCT], we defined which structures existed and which would be the predominant species under a variety of experimental conditions. Under near-physiological conditions of pH, temperature and salt concentration, telomeric DNA was predominantly in a double-helix form. However, at lower pH values or higher temperatures, the G-quadruplex and/or the i-motif efficiently competed with the duplex. We also present kinetic and thermodynamic data for duplex association and for G-quadruplex/i-motif unfolding. PMID:12409451
NASA Technical Reports Server (NTRS)
Collier, Mark D.; Killough, Ronnie; Martin, Nancy L.
1990-01-01
NASA is currently using a set of applications called the Display Builder and Display Manager. They run on Concurrent systems and heavily depend on the Graphic Kernel System (GKS). At this time however, these two applications would more appropriately be developed in X Windows, in which a low X is used for all actual text and graphics display and a standard widget set (such as Motif) is used for the user interface. Use of the X Windows will increase performance, improve the user interface, enhance portability, and improve reliability. Prototype of X Window/Motif based Display Manager provides the following advantages over a GKS based application: improved performance by using a low level X Windows, display of graphic and text will be more efficient; improved user interface by using Motif; Improved portability by operating on both Concurrent and Sun workstations; and Improved reliability.
Peptide-binding motifs of two common equine class I MHC molecules in Thoroughbred horses.
Bergmann, Tobias; Lindvall, Mikaela; Moore, Erin; Moore, Eugene; Sidney, John; Miller, Donald; Tallmadge, Rebecca L; Myers, Paisley T; Malaker, Stacy A; Shabanowitz, Jeffrey; Osterrieder, Nikolaus; Peters, Bjoern; Hunt, Donald F; Antczak, Douglas F; Sette, Alessandro
2017-05-01
Quantitative peptide-binding motifs of MHC class I alleles provide a valuable tool to efficiently identify putative T cell epitopes. Detailed information on equine MHC class I alleles is still very limited, and to date, only a single equine MHC class I allele, Eqca-1*00101 (ELA-A3 haplotype), has been characterized. The present study extends the number of characterized ELA class I specificities in two additional haplotypes found commonly in the Thoroughbred breed. Accordingly, we here report quantitative binding motifs for the ELA-A2 allele Eqca-16*00101 and the ELA-A9 allele Eqca-1*00201. Utilizing analyses of endogenously bound and eluted ligands and the screening of positional scanning combinatorial libraries, detailed and quantitative peptide-binding motifs were derived for both alleles. Eqca-16*00101 preferentially binds peptides with aliphatic/hydrophobic residues in position 2 and at the C-terminus, and Eqca-1*00201 has a preference for peptides with arginine in position 2 and hydrophobic/aliphatic residues at the C-terminus. Interestingly, the Eqca-16*00101 motif resembles that of the human HLA A02-supertype, while the Eqca-1*00201 motif resembles that of the HLA B27-supertype and two macaque class I alleles. It is expected that the identified motifs will facilitate the selection of candidate epitopes for the study of immune responses in horses.
Cui, Yunxi; Kong, Deming; Ghimire, Chiran; Xu, Cuixia; Mao, Hanbin
2016-04-19
G-Quadruplex and i-motif are tetraplex structures that may form in opposite strands at the same location of a duplex DNA. Recent discoveries have indicated that the two tetraplex structures can have conflicting biological activities, which poses a challenge for cells to coordinate. Here, by performing innovative population analysis on mechanical unfolding profiles of tetraplex structures in double-stranded DNA, we found that formations of G-quadruplex and i-motif in the two complementary strands are mutually exclusive in a variety of DNA templates, which include human telomere and promoter fragments of hINS and hTERT genes. To explain this behavior, we placed G-quadruplex- and i-motif-hosting sequences in an offset fashion in the two complementary telomeric DNA strands. We found simultaneous formation of the G-quadruplex and i-motif in opposite strands, suggesting that mutual exclusivity between the two tetraplexes is controlled by steric hindrance. This conclusion was corroborated in the BCL-2 promoter sequence, in which simultaneous formation of two tetraplexes was observed due to possible offset arrangements between G-quadruplex and i-motif in opposite strands. The mutual exclusivity revealed here sets a molecular basis for cells to efficiently coordinate opposite biological activities of G-quadruplex and i-motif at the same dsDNA location.
Composition-dependent stability of the medium-range order responsible for metallic glass formation
Zhang, Feng; Ji, Min; Fang, Xiao-Wei; ...
2014-09-18
The competition between the characteristic medium-range order corresponding to amorphous alloys and that in ordered crystalline phases is central to phase selection and morphology evolution under various processing conditions. We examine the stability of a model glass system, Cu–Zr, by comparing the energetics of various medium-range structural motifs over a wide range of compositions using first-principles calculations. Furthermore, we focus specifically on motifs that represent possible building blocks for competing glassy and crystalline phases, and we employ a genetic algorithm to efficiently identify the energetically favored decorations of each motif for specific compositions. These results show that a Bergman-type motifmore » with crystallization-resisting icosahedral symmetry is energetically most favorable in the composition range 0.63 < xCu < 0.68, and is the underlying motif for one of the three optimal glass-forming ranges observed experimentally for this binary system (Li et al., 2008). This work establishes an energy-based methodology to evaluate specific medium-range structural motifs which compete with stable crystalline nuclei in deeply undercooled liquids.« less
Adamus, Tomasz; Konieczny, Paweł; Sekuła, Małgorzata; Sułkowski, Maciej; Majka, Marcin
2014-01-01
The main goal in gene therapy and biomedical research is an efficient transcription factors (TFs) delivery system. SNAIL, a zinc finger transcription factor, is strongly involved in tumor, what makes its signaling pathways an interesting research subject. The necessity of tracking activation of intracellular pathways has prompted fluorescent proteins usage as localization markers. Advanced molecular cloning techniques allow to generate fusion proteins from fluorescent markers and transcription factors. Depending on fusion strategy, the protein expression levels and nuclear transport ability are significantly different. The P2A self-cleavage motif through its cleavage ability allows two single proteins to be simultaneously expressed. The aim of this study was to compare two strategies for introducing a pair of genes using expression vector system. We have examined GFP and SNAI1 gene fusions by comprising common nucleotide polylinker (multiple cloning site) or P2A motif in between them, resulting in one fusion or two independent protein expressions respectively. In each case transgene expression levels and translation efficiency as well as nuclear localization of expressed protein have been analyzed. Our data showed that usage of P2A motif provides more effective nuclear transport of SNAIL transcription factor than conventional genes linker. At the same time the fluorescent marker spreads evenly in subcellular space.
Kinjo, Akira R; Nakamura, Haruki
2013-01-01
Protein functions are mediated by interactions between proteins and other molecules. One useful approach to analyze protein functions is to compare and classify the structures of interaction interfaces of proteins. Here, we describe the procedures for compiling a database of interface structures and efficiently comparing the interface structures. To do so requires a good understanding of the data structures of the Protein Data Bank (PDB). Therefore, we also provide a detailed account of the PDB exchange dictionary necessary for extracting data that are relevant for analyzing interaction interfaces and secondary structures. We identify recurring structural motifs by classifying similar interface structures, and we define a coarse-grained representation of supersecondary structures (SSS) which represents a sequence of two or three secondary structure elements including their relative orientations as a string of four to seven letters. By examining the correspondence between structural motifs and SSS strings, we show that no SSS string has particularly high propensity to be found interaction interfaces in general, indicating any SSS can be used as a binding interface. When individual structural motifs are examined, there are some SSS strings that have high propensity for particular groups of structural motifs. In addition, it is shown that while the SSS strings found in particular structural motifs for nonpolymer and protein interfaces are as abundant as in other structural motifs that belong to the same subunit, structural motifs for nucleic acid interfaces exhibit somewhat stronger preference for SSS strings. In regard to protein folds, many motif-specific SSS strings were found across many folds, suggesting that SSS may be a useful description to investigate the universality of ligand binding modes.
Ateka, Elijah; Alicai, Titus; Ndunguru, Joseph; Tairo, Fred; Sseruwagi, Peter; Kiarie, Samuel; Makori, Timothy; Kehoe, Monica A; Boykin, Laura M
2017-01-01
Cassava is the main staple food for over 800 million people globally. Its production in eastern Africa is being constrained by two devastating Ipomoviruses that cause cassava brown streak disease (CBSD); Cassava brown streak virus (CBSV) and Ugandan cassava brown streak virus (UCBSV), with up to 100% yield loss for smallholder farmers in the region. To date, vector studies have not resulted in reproducible and highly efficient transmission of CBSV and UCBSV. Most virus transmission studies have used Bemisia tabaci (whitefly), but a maximum of 41% U/CBSV transmission efficiency has been documented for this vector. With the advent of next generation sequencing, researchers are generating whole genome sequences for both CBSV and UCBSV from throughout eastern Africa. Our initial goal for this study was to characterize U/CBSV whole genomes from CBSD symptomatic cassava plants sampled in Kenya. We have generated 8 new whole genomes (3 CBSV and 5 UCBSV) from Kenya, and in the process of analyzing these genomes together with 26 previously published sequences, we uncovered the aphid transmission associated DAG motif within coat protein genes of all CBSV whole genomes at amino acid positions 52-54, but not in UCBSV. Upon further investigation, the DAG motif was also found at the same positions in two other Ipomoviruses: Squash vein yellowing virus (SqVYV), Coccinia mottle virus (CocMoV). Until this study, the highly-conserved DAG motif, which is associated with aphid transmission was only noticed once, in SqVYV but discounted as being of minimal importance. This study represents the first comprehensive look at Ipomovirus genomes to determine the extent of DAG motif presence and significance for vector relations. The presence of this motif suggests that aphids could potentially be a vector of CBSV, SqVYV and CocMov. Further transmission and ipomoviral protein evolutionary studies are needed to confirm this hypothesis.
2007-03-01
RTb motif mutants hTERT Senescence Apoptosis Long lag period [20,25] Ribozymes Hairpin hTR, hTERT Apoptosis Incomplete knockdown of target [26...O-(2-Methoxyethyl) oligomers. b Reverse transcriptase motif.the growth and viability of cancer cells (Table 1). Ribozymes and short-interfering RNA...recent studies indicate that complete knockdown is not essential for efficient and rapid apoptosis in reference to siRNA against hTR and ribozymes
Aranda-Orgillés, Beatriz; Rutschow, Désirée; Zeller, Raphael; Karagiannidis, Antonios I.; Köhler, Andrea; Chen, Changwei; Wilson, Timothy; Krause, Sven; Roepcke, Stefan; Lilley, David; Schneider, Rainer; Schweiger, Susann
2011-01-01
We have shown previously that the ubiquitin ligase MID1, mutations of which cause the midline malformation Opitz BBB/G syndrome (OS), serves as scaffold for a microtubule-associated protein complex that regulates protein phosphatase 2A (PP2A) activity in a ubiquitin-dependent manner. Here, we show that the MID1 protein complex associates with mRNAs via a purine-rich sequence motif called MIDAS (MID1 association sequence) and thereby increases stability and translational efficiency of these mRNAs. Strikingly, inclusion of multiple copies of the MIDAS motif into mammalian mRNAs increases production of the encoded proteins up to 20-fold. Mutated MID1, as found in OS patients, loses its influence on MIDAS-containing mRNAs, suggesting that the malformations in OS patients could be caused by failures in the regulation of cytoskeleton-bound protein translation. This is supported by the observation that the majority of mRNAs that carry MIDAS motifs is involved in developmental processes and/or energy homeostasis. Further analysis of one of the proteins encoded by a MIDAS-containing mRNA, namely PDPK-1 (3-phosphoinositide dependent protein kinase-1), which is an important regulator of mammalian target of rapamycin/PP2A signaling, showed that PDPK-1 protein synthesis is significantly reduced in cells from an OS patient compared with an age-matched control and can be rescued by functional MID1. Together, our data uncover a novel messenger ribonucleoprotein complex that regulates microtubule-associated protein translation. They suggest a novel mechanism underlying OS and point at an enormous potential of the MIDAS motif to increase the efficiency of biotechnological protein production in mammalian cells. PMID:21930711
Wu, Yongzhen; Zhu, Wei-Hong; Zakeeruddin, Shaik M; Grätzel, Michael
2015-05-13
The dye-sensitized solar cell (DSSC) is one of the most promising photovoltaic technologies with potential of low cost, light weight, and good flexibility. The practical application of DSSCs requires further improvement in power conversion efficiency and long-term stability. Recently, significant progress has been witnessed in DSSC research owing to the novel concept of the D-A-π-A motif for the molecular engineering of organic photosensitizers. New organic and porphyrin dyes based on the D-A-π-A motif can not only enhance photovoltaic performance, but also improve durability in DSSC applications. This Spotlight on Applications highlights recent advances in the D-A-π-A-based photosensitizers, specifically focusing on the mechanism of efficiency and stability enhancements. Also, we find insight into the additional acceptor as well as the trade-off of long wavelength response. The basic principles are involved in molecular engineering of efficient D-A-π-A sensitizers, providing a clear road map showing how to modulate the energy bands, rationally extending the response wavelength, and optimizing photovoltaic efficiency step by step.
RGAugury: a pipeline for genome-wide prediction of resistance gene analogs (RGAs) in plants.
Li, Pingchuan; Quan, Xiande; Jia, Gaofeng; Xiao, Jin; Cloutier, Sylvie; You, Frank M
2016-11-02
Resistance gene analogs (RGAs), such as NBS-encoding proteins, receptor-like protein kinases (RLKs) and receptor-like proteins (RLPs), are potential R-genes that contain specific conserved domains and motifs. Thus, RGAs can be predicted based on their conserved structural features using bioinformatics tools. Computer programs have been developed for the identification of individual domains and motifs from the protein sequences of RGAs but none offer a systematic assessment of the different types of RGAs. A user-friendly and efficient pipeline is needed for large-scale genome-wide RGA predictions of the growing number of sequenced plant genomes. An integrative pipeline, named RGAugury, was developed to automate RGA prediction. The pipeline first identifies RGA-related protein domains and motifs, namely nucleotide binding site (NB-ARC), leucine rich repeat (LRR), transmembrane (TM), serine/threonine and tyrosine kinase (STTK), lysin motif (LysM), coiled-coil (CC) and Toll/Interleukin-1 receptor (TIR). RGA candidates are identified and classified into four major families based on the presence of combinations of these RGA domains and motifs: NBS-encoding, TM-CC, and membrane associated RLP and RLK. All time-consuming analyses of the pipeline are paralleled to improve performance. The pipeline was evaluated using the well-annotated Arabidopsis genome. A total of 98.5, 85.2, and 100 % of the reported NBS-encoding genes, membrane associated RLPs and RLKs were validated, respectively. The pipeline was also successfully applied to predict RGAs for 50 sequenced plant genomes. A user-friendly web interface was implemented to ease command line operations, facilitate visualization and simplify result management for multiple datasets. RGAugury is an efficiently integrative bioinformatics tool for large scale genome-wide identification of RGAs. It is freely available at Bitbucket: https://bitbucket.org/yaanlpc/rgaugury .
Goldstrohm, Aaron C.; Albrecht, Todd R.; Suñé, Carles; Bedford, Mark T.; Garcia-Blanco, Mariano A.
2001-01-01
CA150 represses RNA polymerase II (RNAPII) transcription by inhibiting the elongation of transcripts. The FF repeat domains of CA150 bind directly to the phosphorylated carboxyl-terminal domain of the largest subunit of RNAPII. We determined that this interaction is required for efficient CA150-mediated repression of transcription from the α4-integrin promoter. Additional functional determinants, namely, the WW1 and WW2 domains of CA150, were also required for efficient repression. A protein that interacted directly with CA150 WW1 and WW2 was identified as the splicing-transcription factor SF1. Previous studies have demonstrated a role for SF1 in transcription repression, and we found that binding of the CA150 WW1 and WW2 domains to SF1 correlated exactly with the functional contribution of these domains for repression. The binding specificity of the CA150 WW domains was found to be unique in comparison to known classes of WW domains. Furthermore, the CA150 binding site, within the carboxyl-terminal half of SF1, contains a novel type of proline-rich motif that may be recognized by the CA150 WW1 and WW2 domains. These results support a model for the recruitment of CA150 to repress transcription elongation. In this model, CA150 binds to the phosphorylated CTD of elongating RNAPII and SF1 targets the nascent transcript. PMID:11604498
Goldstrohm, A C; Albrecht, T R; Suñé, C; Bedford, M T; Garcia-Blanco, M A
2001-11-01
CA150 represses RNA polymerase II (RNAPII) transcription by inhibiting the elongation of transcripts. The FF repeat domains of CA150 bind directly to the phosphorylated carboxyl-terminal domain of the largest subunit of RNAPII. We determined that this interaction is required for efficient CA150-mediated repression of transcription from the alpha(4)-integrin promoter. Additional functional determinants, namely, the WW1 and WW2 domains of CA150, were also required for efficient repression. A protein that interacted directly with CA150 WW1 and WW2 was identified as the splicing-transcription factor SF1. Previous studies have demonstrated a role for SF1 in transcription repression, and we found that binding of the CA150 WW1 and WW2 domains to SF1 correlated exactly with the functional contribution of these domains for repression. The binding specificity of the CA150 WW domains was found to be unique in comparison to known classes of WW domains. Furthermore, the CA150 binding site, within the carboxyl-terminal half of SF1, contains a novel type of proline-rich motif that may be recognized by the CA150 WW1 and WW2 domains. These results support a model for the recruitment of CA150 to repress transcription elongation. In this model, CA150 binds to the phosphorylated CTD of elongating RNAPII and SF1 targets the nascent transcript.
Crystal Structure Predictions Using Adaptive Genetic Algorithm and Motif Search methods
NASA Astrophysics Data System (ADS)
Ho, K. M.; Wang, C. Z.; Zhao, X.; Wu, S.; Lyu, X.; Zhu, Z.; Nguyen, M. C.; Umemoto, K.; Wentzcovitch, R. M. M.
2017-12-01
Material informatics is a new initiative which has attracted a lot of attention in recent scientific research. The basic strategy is to construct comprehensive data sets and use machine learning to solve a wide variety of problems in material design and discovery. In pursuit of this goal, a key element is the quality and completeness of the databases used. Recent advance in the development of crystal structure prediction algorithms has made it a complementary and more efficient approach to explore the structure/phase space in materials using computers. In this talk, we discuss the importance of the structural motifs and motif-networks in crystal structure predictions. Correspondingly, powerful methods are developed to improve the sampling of the low-energy structure landscape.
Accurate quantification of microRNA via single strand displacement reaction on DNA origami motif.
Zhu, Jie; Feng, Xiaolu; Lou, Jingyu; Li, Weidong; Li, Sheng; Zhu, Hongxin; Yang, Lun; Zhang, Aiping; He, Lin; Li, Can
2013-01-01
DNA origami is an emerging technology that assembles hundreds of staple strands and one single-strand DNA into certain nanopattern. It has been widely used in various fields including detection of biological molecules such as DNA, RNA and proteins. MicroRNAs (miRNAs) play important roles in post-transcriptional gene repression as well as many other biological processes such as cell growth and differentiation. Alterations of miRNAs' expression contribute to many human diseases. However, it is still a challenge to quantitatively detect miRNAs by origami technology. In this study, we developed a novel approach based on streptavidin and quantum dots binding complex (STV-QDs) labeled single strand displacement reaction on DNA origami to quantitatively detect the concentration of miRNAs. We illustrated a linear relationship between the concentration of an exemplary miRNA as miRNA-133 and the STV-QDs hybridization efficiency; the results demonstrated that it is an accurate nano-scale miRNA quantifier motif. In addition, both symmetrical rectangular motif and asymmetrical China-map motif were tested. With significant linearity in both motifs, our experiments suggested that DNA Origami motif with arbitrary shape can be utilized in this method. Since this DNA origami-based method we developed owns the unique advantages of simple, time-and-material-saving, potentially multi-targets testing in one motif and relatively accurate for certain impurity samples as counted directly by atomic force microscopy rather than fluorescence signal detection, it may be widely used in quantification of miRNAs.
Casimiro, Ana C; Vinga, Susana; Freitas, Ana T; Oliveira, Arlindo L
2008-02-07
Motif finding algorithms have developed in their ability to use computationally efficient methods to detect patterns in biological sequences. However the posterior classification of the output still suffers from some limitations, which makes it difficult to assess the biological significance of the motifs found. Previous work has highlighted the existence of positional bias of motifs in the DNA sequences, which might indicate not only that the pattern is important, but also provide hints of the positions where these patterns occur preferentially. We propose to integrate position uniformity tests and over-representation tests to improve the accuracy of the classification of motifs. Using artificial data, we have compared three different statistical tests (Chi-Square, Kolmogorov-Smirnov and a Chi-Square bootstrap) to assess whether a given motif occurs uniformly in the promoter region of a gene. Using the test that performed better in this dataset, we proceeded to study the positional distribution of several well known cis-regulatory elements, in the promoter sequences of different organisms (S. cerevisiae, H. sapiens, D. melanogaster, E. coli and several Dicotyledons plants). The results show that position conservation is relevant for the transcriptional machinery. We conclude that many biologically relevant motifs appear heterogeneously distributed in the promoter region of genes, and therefore, that non-uniformity is a good indicator of biological relevance and can be used to complement over-representation tests commonly used. In this article we present the results obtained for the S. cerevisiae data sets.
Piston-rotaxanes as molecular shock absorbers.
Sevick, E M; Williams, D R M
2010-04-20
We describe the thermomechanical response of a new molecular system that behaves as a shock absorber. The system consists of a rodlike rotaxane connected to a piston and tethered to a surface. The response of this system is dominated by the translational entropy of the rotaxane rings and can be calculated exactly. The force laws are contrasted with those for a rigid rod and a polymer. In some cases, the rotaxanes undergo a sudden transition to a tilted state when compressed. These piston-rotaxanes provide a potential motif for the design of a new class of materials with a novel thermomechanical response.
Advances in Significance Testing for Cluster Detection
NASA Astrophysics Data System (ADS)
Coleman, Deidra Andrea
Over the past two decades, much attention has been given to data driven project goals such as the Human Genome Project and the development of syndromic surveillance systems. A major component of these types of projects is analyzing the abundance of data. Detecting clusters within the data can be beneficial as it can lead to the identification of specified sequences of DNA nucleotides that are related to important biological functions or the locations of epidemics such as disease outbreaks or bioterrorism attacks. Cluster detection techniques require efficient and accurate hypothesis testing procedures. In this dissertation, we improve upon the hypothesis testing procedures for cluster detection by enhancing distributional theory and providing an alternative method for spatial cluster detection using syndromic surveillance data. In Chapter 2, we provide an efficient method to compute the exact distribution of the number and coverage of h-clumps of a collection of words. This method involves defining a Markov chain using a minimal deterministic automaton to reduce the number of states needed for computation. We allow words of the collection to contain other words of the collection making the method more general. We use our method to compute the distributions of the number and coverage of h-clumps in the Chi motif of H. influenza.. In Chapter 3, we provide an efficient algorithm to compute the exact distribution of multiple window discrete scan statistics for higher-order, multi-state Markovian sequences. This algorithm involves defining a Markov chain to efficiently keep track of probabilities needed to compute p-values of the statistic. We use our algorithm to identify cases where the available approximation does not perform well. We also use our algorithm to detect unusual clusters of made free throw shots by National Basketball Association players during the 2009-2010 regular season. In Chapter 4, we give a procedure to detect outbreaks using syndromic surveillance data while controlling the Bayesian False Discovery Rate (BFDR). The procedure entails choosing an appropriate Bayesian model that captures the spatial dependency inherent in epidemiological data and considers all days of interest, selecting a test statistic based on a chosen measure that provides the magnitude of the maximumal spatial cluster for each day, and identifying a cutoff value that controls the BFDR for rejecting the collective null hypothesis of no outbreak over a collection of days for a specified region.We use our procedure to analyze botulism-like syndrome data collected by the North Carolina Disease Event Tracking and Epidemiologic Collection Tool (NC DETECT).
Ivanova, Lyudmila; Buch, Anna; Döhner, Katinka; Pohlmann, Anja; Binz, Anne; Prank, Ute; Sandbaumhüter, Malte
2016-01-01
ABSTRACT Herpes simplex virus (HSV) replicates in the skin and mucous membranes, and initiates lytic or latent infections in sensory neurons. Assembly of progeny virions depends on the essential large tegument protein pUL36 of 3,164 amino acid residues that links the capsids to the tegument proteins pUL37 and VP16. Of the 32 tryptophans of HSV-1-pUL36, the tryptophan-acidic motifs 1766WD1767 and 1862WE1863 are conserved in all HSV-1 and HSV-2 isolates. Here, we characterized the role of these motifs in the HSV life cycle since the rare tryptophans often have unique roles in protein function due to their large hydrophobic surface. The infectivity of the mutants HSV-1(17+)Lox-pUL36-WD/AA-WE/AA and HSV-1(17+)Lox-CheVP26-pUL36-WD/AA-WE/AA, in which the capsid has been tagged with the fluorescent protein Cherry, was significantly reduced. Quantitative electron microscopy shows that there were a larger number of cytosolic capsids and fewer enveloped virions compared to their respective parental strains, indicating a severe impairment in secondary capsid envelopment. The capsids of the mutant viruses accumulated in the perinuclear region around the microtubule-organizing center and were not dispersed to the cell periphery but still acquired the inner tegument proteins pUL36 and pUL37. Furthermore, cytoplasmic capsids colocalized with tegument protein VP16 and, to some extent, with tegument protein VP22 but not with the envelope glycoprotein gD. These results indicate that the unique conserved tryptophan-acidic motifs in the central region of pUL36 are required for efficient targeting of progeny capsids to the membranes of secondary capsid envelopment and for efficient virion assembly. IMPORTANCE Herpesvirus infections give rise to severe animal and human diseases, especially in young, immunocompromised, and elderly individuals. The structural hallmark of herpesvirus virions is the tegument, which contains evolutionarily conserved proteins that are essential for several stages of the herpesvirus life cycle. Here we characterized two conserved tryptophan-acidic motifs in the central region of the large tegument protein pUL36 of herpes simplex virus. When we mutated these motifs, secondary envelopment of cytosolic capsids and the production of infectious particles were severely impaired. Our data suggest that pUL36 and its homologs in other herpesviruses, and in particular such tryptophan-acidic motifs, could provide attractive targets for the development of novel drugs to prevent herpesvirus assembly and spread. PMID:27009950
Batista, F R; Hernández, L; Fernández, J R; Arrieta, J; Menéndez, C; Gómez, R; Támbara, Y; Pons, T
1999-01-01
beta-Fructofuranosidases share a conserved aspartic acid-containing motif (Arg-Asp-Pro; RDP) which is absent from alpha-glucopyranosidases. The role of Asp-309 located in the RDP motif of levansucrase (EC 2.4.1.10) from Acetobacter diazotrophicus SRT4 was studied by site-directed mutagenesis. Substitution of Asp-309 by Asn did not affect enzyme secretion. The kcat of the mutant levansucrase was reduced 75-fold, but its Km was similar to that of the wild-type enzyme, indicating that Asp-309 plays a major role in catalysis. The two levansucrases showed optimal activity at pH 5.0 and yielded similar product profiles. Thus the mutation D309N affected the efficiency of sucrose hydrolysis, but not the enzyme specificity. Since the RDP motif is present in a conserved position in fructosyltransferases, invertases, levanases, inulinases and sucrose-6-phosphate hydrolases, it is likely to have a common functional role in beta-fructofuranosidases. PMID:9895294
Process-based network decomposition reveals backbone motif structure
Wang, Guanyu; Du, Chenghang; Chen, Hao; Simha, Rahul; Rong, Yongwu; Xiao, Yi; Zeng, Chen
2010-01-01
A central challenge in systems biology today is to understand the network of interactions among biomolecules and, especially, the organizing principles underlying such networks. Recent analysis of known networks has identified small motifs that occur ubiquitously, suggesting that larger networks might be constructed in the manner of electronic circuits by assembling groups of these smaller modules. Using a unique process-based approach to analyzing such networks, we show for two cell-cycle networks that each of these networks contains a giant backbone motif spanning all the network nodes that provides the main functional response. The backbone is in fact the smallest network capable of providing the desired functionality. Furthermore, the remaining edges in the network form smaller motifs whose role is to confer stability properties rather than provide function. The process-based approach used in the above analysis has additional benefits: It is scalable, analytic (resulting in a single analyzable expression that describes the behavior), and computationally efficient (all possible minimal networks for a biological process can be identified and enumerated). PMID:20498084
G4 motifs affect origin positioning and efficiency in two vertebrate replicators
Valton, Anne-Laure; Hassan-Zadeh, Vahideh; Lema, Ingrid; Boggetto, Nicole; Alberti, Patrizia; Saintomé, Carole; Riou, Jean-François; Prioleau, Marie-Noëlle
2014-01-01
DNA replication ensures the accurate duplication of the genome at each cell cycle. It begins at specific sites called replication origins. Genome-wide studies in vertebrates have recently identified a consensus G-rich motif potentially able to form G-quadruplexes (G4) in most replication origins. However, there is no experimental evidence to demonstrate that G4 are actually required for replication initiation. We show here, with two model origins, that G4 motifs are required for replication initiation. Two G4 motifs cooperate in one of our model origins. The other contains only one critical G4, and its orientation determines the precise position of the replication start site. Point mutations affecting the stability of this G4 in vitro also impair origin function. Finally, this G4 is not sufficient for origin activity and must cooperate with a 200-bp cis-regulatory element. In conclusion, our study strongly supports the predicted essential role of G4 in replication initiation. PMID:24521668
iFORM: Incorporating Find Occurrence of Regulatory Motifs.
Ren, Chao; Chen, Hebing; Yang, Bite; Liu, Feng; Ouyang, Zhangyi; Bo, Xiaochen; Shu, Wenjie
2016-01-01
Accurately identifying the binding sites of transcription factors (TFs) is crucial to understanding the mechanisms of transcriptional regulation and human disease. We present incorporating Find Occurrence of Regulatory Motifs (iFORM), an easy-to-use and efficient tool for scanning DNA sequences with TF motifs described as position weight matrices (PWMs). Both performance assessment with a receiver operating characteristic (ROC) curve and a correlation-based approach demonstrated that iFORM achieves higher accuracy and sensitivity by integrating five classical motif discovery programs using Fisher's combined probability test. We have used iFORM to provide accurate results on a variety of data in the ENCODE Project and the NIH Roadmap Epigenomics Project, and the tool has demonstrated its utility in further elucidating individual roles of functional elements. Both the source and binary codes for iFORM can be freely accessed at https://github.com/wenjiegroup/iFORM. The identified TF binding sites across human cell and tissue types using iFORM have been deposited in the Gene Expression Omnibus under the accession ID GSE53962.
Sun, Eric I; Leyn, Semen A; Kazanov, Marat D; Saier, Milton H; Novichkov, Pavel S; Rodionov, Dmitry A
2013-09-02
In silico comparative genomics approaches have been efficiently used for functional prediction and reconstruction of metabolic and regulatory networks. Riboswitches are metabolite-sensing structures often found in bacterial mRNA leaders controlling gene expression on transcriptional or translational levels.An increasing number of riboswitches and other cis-regulatory RNAs have been recently classified into numerous RNA families in the Rfam database. High conservation of these RNA motifs provides a unique advantage for their genomic identification and comparative analysis. A comparative genomics approach implemented in the RegPredict tool was used for reconstruction and functional annotation of regulons controlled by RNAs from 43 Rfam families in diverse taxonomic groups of Bacteria. The inferred regulons include ~5200 cis-regulatory RNAs and more than 12000 target genes in 255 microbial genomes. All predicted RNA-regulated genes were classified into specific and overall functional categories. Analysis of taxonomic distribution of these categories allowed us to establish major functional preferences for each analyzed cis-regulatory RNA motif family. Overall, most RNA motif regulons showed predictable functional content in accordance with their experimentally established effector ligands. Our results suggest that some RNA motifs (including thiamin pyrophosphate and cobalamin riboswitches that control the cofactor metabolism) are widespread and likely originated from the last common ancestor of all bacteria. However, many more analyzed RNA motifs are restricted to a narrow taxonomic group of bacteria and likely represent more recent evolutionary innovations. The reconstructed regulatory networks for major known RNA motifs substantially expand the existing knowledge of transcriptional regulation in bacteria. The inferred regulons can be used for genetic experiments, functional annotations of genes, metabolic reconstruction and evolutionary analysis. The obtained genome-wide collection of reference RNA motif regulons is available in the RegPrecise database (http://regprecise.lbl.gov/).
Rewiring yeast sugar transporter preference through modifying a conserved protein motif.
Young, Eric M; Tong, Alice; Bui, Hang; Spofford, Caitlin; Alper, Hal S
2014-01-07
Utilization of exogenous sugars found in lignocellulosic biomass hydrolysates, such as xylose, must be improved before yeast can serve as an efficient biofuel and biochemical production platform. In particular, the first step in this process, the molecular transport of xylose into the cell, can serve as a significant flux bottleneck and is highly inhibited by other sugars. Here we demonstrate that sugar transport preference and kinetics can be rewired through the programming of a sequence motif of the general form G-G/F-XXX-G found in the first transmembrane span. By evaluating 46 different heterologously expressed transporters, we find that this motif is conserved among functional transporters and highly enriched in transporters that confer growth on xylose. Through saturation mutagenesis and subsequent rational mutagenesis, four transporter mutants unable to confer growth on glucose but able to sustain growth on xylose were engineered. Specifically, Candida intermedia gxs1 Phe(38)Ile(39)Met(40), Scheffersomyces stipitis rgt2 Phe(38) and Met(40), and Saccharomyces cerevisiae hxt7 Ile(39)Met(40)Met(340) all exhibit this phenotype. In these cases, primary hexose transporters were rewired into xylose transporters. These xylose transporters nevertheless remained inhibited by glucose. Furthermore, in the course of identifying this motif, novel wild-type transporters with superior monosaccharide growth profiles were discovered, namely S. stipitis RGT2 and Debaryomyces hansenii 2D01474. These findings build toward the engineering of efficient pentose utilization in yeast and provide a blueprint for reprogramming transporter properties.
Kim, Hyun-Jun; Kwon, Hye-Rim; Bae, Chang-Dae; Park, Joobae; Hong, Kyung U
2010-05-15
During mitosis, regulation of protein structures and functions by phosphorylation plays critical roles in orchestrating a series of complex events essential for the cell division process. Tumor-associated microtubule-associated protein (TMAP), also known as cytoskeleton-associated protein 2 (CKAP2), is a novel player in spindle assembly and chromosome segregation. We have previously reported that TMAP is phosphorylated at multiple residues specifically during mitosis. However, the mechanisms and functional importance of phosphorylation at most of the sites identified are currently unknown. Here, we report that TMAP is a novel substrate of the Aurora B kinase. Ser627 of TMAP was specifically phosphorylated by Aurora B both in vitro and in vivo. Ser627 and neighboring conserved residues were strictly required for efficient phosphorylation of TMAP by Aurora B, as even minor amino acid substitutions of the phosphorylation motif significantly diminished the efficiency of the substrate phosphorylation. Nearly all mutations at the phosphorylation motif had dramatic effects on the subcellular localization of TMAP. Instead of being localized to the chromosome region during late mitosis, the mutants remained associated with microtubules and centrosomes throughout mitosis. However, the changes in the subcellular localization of these mutants could not be completely explained by the phosphorylation status on Ser627. Our findings suggest that the motif surrounding Ser627 ((625) RRSRRL (630)) is a critical part of a functionally important sequence motif which not only governs the kinase-substrate recognition, but also regulates the subcellular localization of TMAP during mitosis.
RNAfbinv: an interactive Java application for fragment-based design of RNA sequences.
Weinbrand, Lina; Avihoo, Assaf; Barash, Danny
2013-11-15
In RNA design problems, it is plausible to assume that the user would be interested in preserving a particular RNA secondary structure motif, or fragment, for biological reasons. The preservation could be in structure or sequence, or both. Thus, the inverse RNA folding problem could benefit from considering fragment constraints. We have developed a new interactive Java application called RNA fragment-based inverse that allows users to insert an RNA secondary structure in dot-bracket notation. It then performs sequence design that conforms to the shape of the input secondary structure, the specified thermodynamic stability, the specified mutational robustness and the user-selected fragment after shape decomposition. In this shape-based design approach, specific RNA structural motifs with known biological functions are strictly enforced, while others can possess more flexibility in their structure in favor of preserving physical attributes and additional constraints. RNAfbinv is freely available for download on the web at http://www.cs.bgu.ac.il/~RNAexinv/RNAfbinv. The site contains a help file with an explanation regarding the exact use.
Principles for circadian orchestration of metabolic pathways.
Thurley, Kevin; Herbst, Christopher; Wesener, Felix; Koller, Barbara; Wallach, Thomas; Maier, Bert; Kramer, Achim; Westermark, Pål O
2017-02-14
Circadian rhythms govern multiple aspects of animal metabolism. Transcriptome-, proteome- and metabolome-wide measurements have revealed widespread circadian rhythms in metabolism governed by a cellular genetic oscillator, the circadian core clock. However, it remains unclear if and under which conditions transcriptional rhythms cause rhythms in particular metabolites and metabolic fluxes. Here, we analyzed the circadian orchestration of metabolic pathways by direct measurement of enzyme activities, analysis of transcriptome data, and developing a theoretical method called circadian response analysis. Contrary to a common assumption, we found that pronounced rhythms in metabolic pathways are often favored by separation rather than alignment in the times of peak activity of key enzymes. This property holds true for a set of metabolic pathway motifs (e.g., linear chains and branching points) and also under the conditions of fast kinetics typical for metabolic reactions. By circadian response analysis of pathway motifs, we determined exact timing separation constraints on rhythmic enzyme activities that allow for substantial rhythms in pathway flux and metabolite concentrations. Direct measurements of circadian enzyme activities in mouse skeletal muscle confirmed that such timing separation occurs in vivo.
Principles for circadian orchestration of metabolic pathways
Thurley, Kevin; Herbst, Christopher; Wesener, Felix; Koller, Barbara; Wallach, Thomas; Maier, Bert; Kramer, Achim
2017-01-01
Circadian rhythms govern multiple aspects of animal metabolism. Transcriptome-, proteome- and metabolome-wide measurements have revealed widespread circadian rhythms in metabolism governed by a cellular genetic oscillator, the circadian core clock. However, it remains unclear if and under which conditions transcriptional rhythms cause rhythms in particular metabolites and metabolic fluxes. Here, we analyzed the circadian orchestration of metabolic pathways by direct measurement of enzyme activities, analysis of transcriptome data, and developing a theoretical method called circadian response analysis. Contrary to a common assumption, we found that pronounced rhythms in metabolic pathways are often favored by separation rather than alignment in the times of peak activity of key enzymes. This property holds true for a set of metabolic pathway motifs (e.g., linear chains and branching points) and also under the conditions of fast kinetics typical for metabolic reactions. By circadian response analysis of pathway motifs, we determined exact timing separation constraints on rhythmic enzyme activities that allow for substantial rhythms in pathway flux and metabolite concentrations. Direct measurements of circadian enzyme activities in mouse skeletal muscle confirmed that such timing separation occurs in vivo. PMID:28159888
Efficient budding of the tacaribe virus matrix protein z requires the nucleoprotein.
Groseth, Allison; Wolff, Svenja; Strecker, Thomas; Hoenen, Thomas; Becker, Stephan
2010-04-01
The Z protein has been shown for several arenaviruses to serve as the viral matrix protein. As such, Z provides the principal force for the budding of virus particles and is capable of forming virus-like particles (VLPs) when expressed alone. For most arenaviruses, this activity has been shown to be linked to the presence of proline-rich late-domain motifs in the C terminus; however, for the New World arenavirus Tacaribe virus (TCRV), no such motif exists within Z. It was recently demonstrated that while TCRV Z is still capable of functioning as a matrix protein to induce the formation of VLPs, neither its ASAP motif, which replaces a canonical PT/SAP motif in related viruses, nor its YxxL motif is involved in budding, leading to the suggestion that TCRV uses a novel budding mechanism. Here we show that in comparison to its closest relative, Junin virus (JUNV), TCRV Z buds only weakly when expressed in isolation. While this budding activity is independent of the ASAP or YxxL motif, it is significantly enhanced by coexpression with the nucleoprotein (NP), an effect not seen with JUNV Z. Interestingly, both the ASAP and YxxL motifs of Z appear to be critical for the recruitment of NP into VLPs, as well as for the enhancement of TCRV Z-mediated budding. While it is known that TCRV budding remains dependent on the endosomal sorting complex required for transport, our findings provide further evidence that TCRV uses a budding mechanism distinct from that of other known arenaviruses and suggest an essential role for NP in this process.
Accurate Quantification of microRNA via Single Strand Displacement Reaction on DNA Origami Motif
Lou, Jingyu; Li, Weidong; Li, Sheng; Zhu, Hongxin; Yang, Lun; Zhang, Aiping; He, Lin; Li, Can
2013-01-01
DNA origami is an emerging technology that assembles hundreds of staple strands and one single-strand DNA into certain nanopattern. It has been widely used in various fields including detection of biological molecules such as DNA, RNA and proteins. MicroRNAs (miRNAs) play important roles in post-transcriptional gene repression as well as many other biological processes such as cell growth and differentiation. Alterations of miRNAs' expression contribute to many human diseases. However, it is still a challenge to quantitatively detect miRNAs by origami technology. In this study, we developed a novel approach based on streptavidin and quantum dots binding complex (STV-QDs) labeled single strand displacement reaction on DNA origami to quantitatively detect the concentration of miRNAs. We illustrated a linear relationship between the concentration of an exemplary miRNA as miRNA-133 and the STV-QDs hybridization efficiency; the results demonstrated that it is an accurate nano-scale miRNA quantifier motif. In addition, both symmetrical rectangular motif and asymmetrical China-map motif were tested. With significant linearity in both motifs, our experiments suggested that DNA Origami motif with arbitrary shape can be utilized in this method. Since this DNA origami-based method we developed owns the unique advantages of simple, time-and-material-saving, potentially multi-targets testing in one motif and relatively accurate for certain impurity samples as counted directly by atomic force microscopy rather than fluorescence signal detection, it may be widely used in quantification of miRNAs. PMID:23990889
Gaur, Pallavi; Chaturvedi, Anoop
2017-07-22
The clustering pattern and motifs give immense information about any biological data. An application of machine learning algorithms for clustering and candidate motif detection in miRNAs derived from exosomes is depicted in this paper. Recent progress in the field of exosome research and more particularly regarding exosomal miRNAs has led much bioinformatic-based research to come into existence. The information on clustering pattern and candidate motifs in miRNAs of exosomal origin would help in analyzing existing, as well as newly discovered miRNAs within exosomes. Along with obtaining clustering pattern and candidate motifs in exosomal miRNAs, this work also elaborates the usefulness of the machine learning algorithms that can be efficiently used and executed on various programming languages/platforms. Data were clustered and sequence candidate motifs were detected successfully. The results were compared and validated with some available web tools such as 'BLASTN' and 'MEME suite'. The machine learning algorithms for aforementioned objectives were applied successfully. This work elaborated utility of machine learning algorithms and language platforms to achieve the tasks of clustering and candidate motif detection in exosomal miRNAs. With the information on mentioned objectives, deeper insight would be gained for analyses of newly discovered miRNAs in exosomes which are considered to be circulating biomarkers. In addition, the execution of machine learning algorithms on various language platforms gives more flexibility to users to try multiple iterations according to their requirements. This approach can be applied to other biological data-mining tasks as well.
Nonlinear Stimulated Raman Exact Passage by Resonance-Locked Inverse Engineering
NASA Astrophysics Data System (ADS)
Dorier, V.; Gevorgyan, M.; Ishkhanyan, A.; Leroy, C.; Jauslin, H. R.; Guérin, S.
2017-12-01
We derive an exact and robust stimulated Raman process for nonlinear quantum systems driven by pulsed external fields. The external fields are designed with closed-form expressions from the inverse engineering of a given efficient and stable dynamics. This technique allows one to induce a controlled population inversion which surpasses the usual nonlinear stimulated Raman adiabatic passage efficiency.
Exact lower and upper bounds on stationary moments in stochastic biochemical systems
NASA Astrophysics Data System (ADS)
Ghusinga, Khem Raj; Vargas-Garcia, Cesar A.; Lamperski, Andrew; Singh, Abhyudai
2017-08-01
In the stochastic description of biochemical reaction systems, the time evolution of statistical moments for species population counts is described by a linear dynamical system. However, except for some ideal cases (such as zero- and first-order reaction kinetics), the moment dynamics is underdetermined as lower-order moments depend upon higher-order moments. Here, we propose a novel method to find exact lower and upper bounds on stationary moments for a given arbitrary system of biochemical reactions. The method exploits the fact that statistical moments of any positive-valued random variable must satisfy some constraints that are compactly represented through the positive semidefiniteness of moment matrices. Our analysis shows that solving moment equations at steady state in conjunction with constraints on moment matrices provides exact lower and upper bounds on the moments. These results are illustrated by three different examples—the commonly used logistic growth model, stochastic gene expression with auto-regulation and an activator-repressor gene network motif. Interestingly, in all cases the accuracy of the bounds is shown to improve as moment equations are expanded to include higher-order moments. Our results provide avenues for development of approximation methods that provide explicit bounds on moments for nonlinear stochastic systems that are otherwise analytically intractable.
Analysis of osmotin, a PR protein as metabolic modulator in plants
Abdin, Malik Zainul; Kiran, Usha; Alam, Afshar
2011-01-01
Osmotin is an abundant cationic multifunctional protein discovered in cells of tobacco (Nicotiana tabacum L. var Wisconsin 38) adapted to an environment of low osmotic potential. Beside its role as osmoregulator, it provides plants protection from pathogens, hence also placed in the PRP family of proteins. The osmotin induced proline accumulation has been reported to confer tolerance against both biotic and abiotic stresses in plants including transgenic tomato and strawberry overexpressing osmotin gene. The exact mechanism of induction of proline by osmotin is however, not known till date. These observations have led us to hypothesize that osmotin could be regulating these plant responses through its involvement either as transcription factor, cell signal pathway modulator or both in plants. We have therefore, undertaken the present investigation to analyze the osmotin protein as transcription factor using bioinformatics tools. The results of available online DNA binding motif search programs revealed that osmotin does not contain DNAbinding motifs. The alignment results of osmotin protein with the protein sequence from DATF showed the homology in the range of 0-20%, suggesting that it might not contain a DNA binding motif. Further to find unique DNA-binding domain, the superimposition of osmotin 3D structure on modeled Arabidopsis transcription factors using Chimera also suggested absence of the same. However, evidence implicating osmotin in cell signaling were found during the study. With these results, we therefore, concluded that osmotin is not a transcription factor, but regulating plant responses to biotic and abiotic stresses through cell signaling. PMID:21383921
Analysis of osmotin, a PR protein as metabolic modulator in plants.
Abdin, Malik Zainul; Kiran, Usha; Alam, Afshar
2011-01-22
Osmotin is an abundant cationic multifunctional protein discovered in cells of tobacco (Nicotiana tabacum L. var Wisconsin 38) adapted to an environment of low osmotic potential. Beside its role as osmoregulator, it provides plants protection from pathogens, hence also placed in the PRP family of proteins. The osmotin induced proline accumulation has been reported to confer tolerance against both biotic and abiotic stresses in plants including transgenic tomato and strawberry overexpressing osmotin gene. The exact mechanism of induction of proline by osmotin is however, not known till date. These observations have led us to hypothesize that osmotin could be regulating these plant responses through its involvement either as transcription factor, cell signal pathway modulator or both in plants. We have therefore, undertaken the present investigation to analyze the osmotin protein as transcription factor using bioinformatics tools. The results of available online DNA binding motif search programs revealed that osmotin does not contain DNAbinding motifs. The alignment results of osmotin protein with the protein sequence from DATF showed the homology in the range of 0-20%, suggesting that it might not contain a DNA binding motif. Further to find unique DNA-binding domain, the superimposition of osmotin 3D structure on modeled Arabidopsis transcription factors using Chimera also suggested absence of the same. However, evidence implicating osmotin in cell signaling were found during the study. With these results, we therefore, concluded that osmotin is not a transcription factor, but regulating plant responses to biotic and abiotic stresses through cell signaling.
Study of coupled nonlinear partial differential equations for finding exact analytical solutions.
Khan, Kamruzzaman; Akbar, M Ali; Koppelaar, H
2015-07-01
Exact solutions of nonlinear partial differential equations (NPDEs) are obtained via the enhanced (G'/G)-expansion method. The method is subsequently applied to find exact solutions of the Drinfel'd-Sokolov-Wilson (DSW) equation and the (2+1)-dimensional Painlevé integrable Burgers (PIB) equation. The efficiency of this method for finding these exact solutions is demonstrated. The method is effective and applicable for many other NPDEs in mathematical physics.
ATtRACT-a database of RNA-binding proteins and associated motifs.
Giudice, Girolamo; Sánchez-Cabo, Fátima; Torroja, Carlos; Lara-Pezzi, Enrique
2016-01-01
RNA-binding proteins (RBPs) play a crucial role in key cellular processes, including RNA transport, splicing, polyadenylation and stability. Understanding the interaction between RBPs and RNA is key to improve our knowledge of RNA processing, localization and regulation in a global manner. Despite advances in recent years, a unified non-redundant resource that includes information on experimentally validated motifs, RBPs and integrated tools to exploit this information is lacking. Here, we developed a database named ATtRACT (available athttp://attract.cnic.es) that compiles information on 370 RBPs and 1583 RBP consensus binding motifs, 192 of which are not present in any other database. To populate ATtRACT we (i) extracted and hand-curated experimentally validated data from CISBP-RNA, SpliceAid-F, RBPDB databases, (ii) integrated and updated the unavailable ASD database and (iii) extracted information from Protein-RNA complexes present in Protein Data Bank database through computational analyses. ATtRACT provides also efficient algorithms to search a specific motif and scan one or more RNA sequences at a time. It also allows discoveringde novomotifs enriched in a set of related sequences and compare them with the motifs included in the database.Database URL:http:// attract. cnic. es. © The Author(s) 2016. Published by Oxford University Press.
Barendt, Pamela A.; Shah, Najaf A.; Barendt, Gregory A.; Kothari, Parth A.; Sarkar, Casim A.
2013-01-01
While the ribosome has evolved to function in complex intracellular environments, these contexts do not easily allow for the study of its inherent capabilities. We have used a synthetic, well-defined, Escherichia coli (E. coli)-based translation system in conjunction with ribosome display, a powerful in vitro selection method, to identify ribosome binding sites (RBSs) that can promote the efficient translation of messenger RNAs (mRNAs) with a leader length representative of natural E. coli mRNAs. In previous work, we used a longer leader sequence and unexpectedly recovered highly efficient cytosine-rich sequences with complementarity to the 16S ribosomal RNA (rRNA) and similarity to eukaryotic RBSs. In the current study, Shine-Dalgarno (SD) sequences were prevalent but non-SD sequences were also heavily enriched and were dominated by novel guanine- and uracil-rich motifs which showed statistically significant complementarity to the 16S rRNA. Additionally, only SD motifs exhibited position-dependent decreases in sequence entropy, indicating that non-SD motifs likely operate by increasing the local concentration of ribosomes in the vicinity of the start codon, rather than by a position-dependent mechanism. These results further support the putative generality of mRNA-rRNA complementarity in facilitating mRNA translation, but also suggest that context (e.g., leader length and composition) dictates the specific subset of possible RBSs that are used for efficient translation of a given transcript. PMID:23427812
Characterization of a new phage, termed ϕA318, which is specific for Vibrio alginolyticus.
Lin, Ying-Rong; Chiu, Chi-Wen; Chang, Feng-Yi; Lin, Chan-Shing
2012-05-01
Vibrio alginolyticus is an opportunistic pathogen of animals and humans; its related strains can also produce tetrodotoxin and hemolysins. A new phage, ϕA318, which lysed its host V. alginolyticus with high efficiency, was characterized. The burst size of ϕA318 in V. alginolyticus was 72 PFU/bacterium at an MOI of 1 at room temperature; the plaque size was as large as 5 mm in diameter. Electron microscopy (EM) of the phage particles revealed a 50- to 55-nm isomorphous icosahedral head with a 12-nm non-contractile tail, similar to the T7-like phages of the family Podoviridae. Phylogenetic analysis based on complete sequences of the DNA-directed RNA polymerase gene revealed that ϕA318 had 28-47% amino acid identity to enterobacteria phages T7 and SP6, and other Vibrio phages, and the phylogenetic distance suggested that ϕA318 could be classified as a new T7-like bacteriophage. Nevertheless, several motifs in the ϕA318 phage RNA polymerase were highly conserved, including DFRGR (T7-421 motif), DG (T7-537 motif), PSEKPQDIYGAVS (T7-563 motif), RSMTKKPVMTL PYGS (T7-627 motif), and HDS (T7-811 motif). Genetic analysis indicated that phage ϕA318 is not a thermostable direct hemolysin producer. The results suggest that the MOI should be higher than 0.1 to prevent the chance of hemolysin production by the bacteria before they are lysed by the phage.
Accurate and Efficient Approximation to the Optimized Effective Potential for Exchange
NASA Astrophysics Data System (ADS)
Ryabinkin, Ilya G.; Kananenka, Alexei A.; Staroverov, Viktor N.
2013-07-01
We devise an efficient practical method for computing the Kohn-Sham exchange-correlation potential corresponding to a Hartree-Fock electron density. This potential is almost indistinguishable from the exact-exchange optimized effective potential (OEP) and, when used as an approximation to the OEP, is vastly better than all existing models. Using our method one can obtain unambiguous, nearly exact OEPs for any reasonable finite one-electron basis set at the same low cost as the Krieger-Li-Iafrate and Becke-Johnson potentials. For all practical purposes, this solves the long-standing problem of black-box construction of OEPs in exact-exchange calculations.
Staufen1 dimerizes via a conserved motif and a degenerate dsRNA-binding domain to promote mRNA decay
Gleghorn, Michael L.; Gong, Chenguang; Kielkopf, Clara L.; Maquat, Lynne E.
2014-01-01
Staufen (STAU)1-mediated mRNA decay (SMD) degrades mammalian-cell mRNAs that bind the double-stranded (ds)RNA-binding protein STAU1 in their 3′-untranslated region. We report a new motif, which typifies STAU homologs from all vertebrate classes, that is responsible for human (h)STAU1 homodimerization. Our crystal structure and mutagenesis analyses reveal that this motif, now named the Staufen-swapping motif (SSM), and dsRNA-binding domain 5 (‘RBD’5) mediate protein dimerization: the two SSM α-helices of one molecule interact primarily through a hydrophobic patch with the two ‘RBD’5 α-helices of a second molecule. ‘RBD’5 adopts the canonical α-β-β-β-α fold of a functional RBD, but it lacks residues and features needed to bind duplex RNA. In cells, SSM-mediated hSTAU1 dimerization increases the efficiency of SMD by augmenting hSTAU1 binding to the ATP-dependent RNA helicase hUPF1. Dimerization regulates keratinocyte-mediated wound-healing and, undoubtedly, many other cellular processes. PMID:23524536
Study of coupled nonlinear partial differential equations for finding exact analytical solutions
Khan, Kamruzzaman; Akbar, M. Ali; Koppelaar, H.
2015-01-01
Exact solutions of nonlinear partial differential equations (NPDEs) are obtained via the enhanced (G′/G)-expansion method. The method is subsequently applied to find exact solutions of the Drinfel'd–Sokolov–Wilson (DSW) equation and the (2+1)-dimensional Painlevé integrable Burgers (PIB) equation. The efficiency of this method for finding these exact solutions is demonstrated. The method is effective and applicable for many other NPDEs in mathematical physics. PMID:26587256
Yu, Qiang; Wei, Dingbang; Huo, Hongwei
2018-06-18
Given a set of t n-length DNA sequences, q satisfying 0 < q ≤ 1, and l and d satisfying 0 ≤ d < l < n, the quorum planted motif search (qPMS) finds l-length strings that occur in at least qt input sequences with up to d mismatches and is mainly used to locate transcription factor binding sites in DNA sequences. Existing qPMS algorithms have been able to efficiently process small standard datasets (e.g., t = 20 and n = 600), but they are too time consuming to process large DNA datasets, such as ChIP-seq datasets that contain thousands of sequences or more. We analyze the effects of t and q on the time performance of qPMS algorithms and find that a large t or a small q causes a longer computation time. Based on this information, we improve the time performance of existing qPMS algorithms by selecting a sample sequence set D' with a small t and a large q from the large input dataset D and then executing qPMS algorithms on D'. A sample sequence selection algorithm named SamSelect is proposed. The experimental results on both simulated and real data show (1) that SamSelect can select D' efficiently and (2) that the qPMS algorithms executed on D' can find implanted or real motifs in a significantly shorter time than when executed on D. We improve the ability of existing qPMS algorithms to process large DNA datasets from the perspective of selecting high-quality sample sequence sets so that the qPMS algorithms can find motifs in a short time in the selected sample sequence set D', rather than take an unfeasibly long time to search the original sequence set D. Our motif discovery method is an approximate algorithm.
Papadopoulos, Dimitrios K.; Reséndez-Pérez, Diana; Cárdenas-Chávez, Diana L.; Villanueva-Segura, Karina; Canales-del-Castillo, Ricardo; Felix, Daniel A.; Fünfschilling, Raphael; Gehring, Walter J.
2011-01-01
Segmental identity along the anteroposterior axis of bilateral animals is specified by Hox genes. These genes encode transcription factors, harboring the conserved homeodomain and, generally, a YPWM motif, which binds Hox cofactors and increases Hox transcriptional specificity in vivo. Here we derive synthetic Drosophila Antennapedia genes, consisting only of the YPWM motif and homeodomain, and investigate their functional role throughout development. Synthetic peptides and full-length Antennapedia proteins cause head-to-thorax transformations in the embryo, as well as antenna-to-tarsus and eye-to-wing transformations in the adult, thus converting the entire head to a mesothorax. This conversion is achieved by repression of genes required for head and antennal development and ectopic activation of genes promoting thoracic and tarsal fates, respectively. Synthetic Antennapedia peptides bind DNA specifically and interact with Extradenticle and Bric-à-brac interacting protein 2 cofactors in vitro and ex vivo. Substitution of the YPWM motif by alanines abolishes Antennapedia homeotic function, whereas substitution of YPWM by the WRPW repressor motif, which binds the transcriptional corepressor Groucho, allows all proteins to act as repressors only. Finally, naturally occurring variations in the size of the linker between the homeodomain and YPWM motif enhance Antennapedia repressive or activating efficiency, emphasizing the importance of linker size, rather than sequence, for specificity. Our results clearly show that synthetic Antennapedia genes are functional in vivo and therefore provide powerful tools for synthetic biology. Moreover, the YPWM motif is necessary—whereas the entire N terminus of the protein is dispensable—for Antennapedia homeotic function, indicating its dual role in transcriptional activation and repression by recruiting either coactivators or corepressors. PMID:21712439
Entropic Profiler – detection of conservation in genomes using information theory
Fernandes, Francisco; Freitas, Ana T; Almeida, Jonas S; Vinga, Susana
2009-01-01
Background In the last decades, with the successive availability of whole genome sequences, many research efforts have been made to mathematically model DNA. Entropic Profiles (EP) were proposed recently as a new measure of continuous entropy of genome sequences. EP represent local information plots related to DNA randomness and are based on information theory and statistical concepts. They express the weighed relative abundance of motifs for each position in genomes. Their study is very relevant because under or over-representation segments are often associated with significant biological meaning. Findings The Entropic Profiler application here presented is a new tool designed to detect and extract under and over-represented DNA segments in genomes by using EP. It allows its computation in a very efficient way by recurring to improved algorithms and data structures, which include modified suffix trees. Available through a web interface and as downloadable source code, it allows to study positions and to search for motifs inside the whole sequence or within a specified range. DNA sequences can be entered from different sources, including FASTA files, pre-loaded examples or resuming a previously saved work. Besides the EP value plots, p-values and z-scores for each motif are also computed, along with the Chaos Game Representation of the sequence. Conclusion EP are directly related with the statistical significance of motifs and can be considered as a new method to extract and classify significant regions in genomes and estimate local scales in DNA. The present implementation establishes an efficient and useful tool for whole genome analysis. PMID:19416538
Deriving the exact nonadiabatic quantum propagator in the mapping variable representation.
Hele, Timothy J H; Ananth, Nandini
2016-12-22
We derive an exact quantum propagator for nonadiabatic dynamics in multi-state systems using the mapping variable representation, where classical-like Cartesian variables are used to represent both continuous nuclear degrees of freedom and discrete electronic states. The resulting Liouvillian is a Moyal series that, when suitably approximated, can allow for the use of classical dynamics to efficiently model large systems. We demonstrate that different truncations of the exact Liouvillian lead to existing approximate semiclassical and mixed quantum-classical methods and we derive an associated error term for each method. Furthermore, by combining the imaginary-time path-integral representation of the Boltzmann operator with the exact Liouvillian, we obtain an analytic expression for thermal quantum real-time correlation functions. These results provide a rigorous theoretical foundation for the development of accurate and efficient classical-like dynamics to compute observables such as electron transfer reaction rates in complex quantized systems.
De Novo Evolutionary Emergence of a Symmetrical Protein Is Shaped by Folding Constraints
Smock, Robert G.; Yadid, Itamar; Dym, Orly; Clarke, Jane; Tawfik, Dan S.
2016-01-01
Summary Molecular evolution has focused on the divergence of molecular functions, yet we know little about how structurally distinct protein folds emerge de novo. We characterized the evolutionary trajectories and selection forces underlying emergence of β-propeller proteins, a globular and symmetric fold group with diverse functions. The identification of short propeller-like motifs (<50 amino acids) in natural genomes indicated that they expanded via tandem duplications to form extant propellers. We phylogenetically reconstructed 47-residue ancestral motifs that form five-bladed lectin propellers via oligomeric assembly. We demonstrate a functional trajectory of tandem duplications of these motifs leading to monomeric lectins. Foldability, i.e., higher efficiency of folding, was the main parameter leading to improved functionality along the entire evolutionary trajectory. However, folding constraints changed along the trajectory: initially, conflicts between monomer folding and oligomer assembly dominated, whereas subsequently, upon tandem duplication, tradeoffs between monomer stability and foldability took precedence. PMID:26806127
Bistetrazine-cyanines as double-clicking fluorogenic two-point binder or crosslinker probes.
Kormos, Attila; Koehler, Christine; Fodor, Eszter; Rutkai, Zsófia; Martin, Maddison; Mező, Gábor; Lemke, Edward; Kele, Péter
2018-04-20
Fluorogenic probes are capable of minimizing background fluorescence of unreacted and non-specifically adsorbed reagents. The preceding years have brought substantial developments in the design and synthesis of bioorthogonally applicable fluorogenic systems mainly based on the quenching effects of azide and tetrazine moieties. The modulation power exerted by these bioorthogonal motifs typically becomes less efficient on more conjugated systems, i.e. on probes with red-shifted emission wavelength. In order to reach efficient quenching, i.e. fluorogenicity even in the red range of the spectrum, We present the synthesis, fluorogenic and conjugation characterization of bistetrazine-cyanine probes with emission maxima between 600-620 nm. The probes can bind to genetically altered proteins harboring an 11-amino acid peptide tag with two appending cyclooctyne motifs. Moreover, we also demonstrate the use of these bistetrazines as fluorogenic, covalent cross-linkers between monocyclooctynylated proteins. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Arginine methylation promotes translation repression activity of eIF4G-binding protein, Scd6.
Poornima, Gopalakrishna; Shah, Shanaya; Vignesh, Venkadasubramanian; Parker, Roy; Rajyaguru, Purusharth I
2016-11-02
Regulation of translation plays a critical role in determining mRNA fate. A new role was recently reported for a subset of RGG-motif proteins in repressing translation initiation by binding eIF4G1. However the signaling mechanism(s) that leads to spatial and temporal regulation of repression activity of RGG-motif proteins remains unknown. Here we report the role of arginine methylation in regulation of repression activity of Scd6, a conserved RGG-motif protein. We demonstrate that Scd6 gets arginine methylated at its RGG-motif and Hmt1 plays an important role in its methylation. We identify specific methylated arginine residues in the Scd6 RGG-motif in vivo We provide evidence that methylation augments Scd6 repression activity. Arginine methylation defective (AMD) mutant of Scd6 rescues the growth defect caused by overexpression of Scd6, a feature of translation repressors in general. Live-cell imaging of the AMD mutant revealed that it is defective in inducing formation of stress granules. Live-cell imaging and pull-down results indicate that it fails to bind eIF4G1 efficiently. Consistent with these results, a strain lacking Hmt1 is also defective in Scd6-eIF4G1 interaction. Our results establish that arginine methylation augments Scd6 repression activity by promoting eIF4G1-binding. We propose that arginine methylation of translation repressors with RGG-motif could be a general modulator of their repression activity. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
The Rho ADP-ribosylating C3 exoenzyme binds cells via an Arg-Gly-Asp motif.
Rohrbeck, Astrid; Höltje, Markus; Adolf, Andrej; Oms, Elisabeth; Hagemann, Sandra; Ahnert-Hilger, Gudrun; Just, Ingo
2017-10-27
The Rho ADP-ribosylating C3 exoenzyme (C3bot) is a bacterial protein toxin devoid of a cell-binding or -translocation domain. Nevertheless, C3 can efficiently enter intact cells, including neurons, but the mechanism of C3 binding and uptake is not yet understood. Previously, we identified the intermediate filament vimentin as an extracellular membranous interaction partner of C3. However, uptake of C3 into cells still occurs (although reduced) in the absence of vimentin, indicating involvement of an additional host cell receptor. C3 harbors an Arg-Gly-Asp (RGD) motif, which is the major integrin-binding site, present in a variety of integrin ligands. To check whether the RGD motif of C3 is involved in binding to cells, we performed a competition assay with C3 and RGD peptide or with a monoclonal antibody binding to β1-integrin subunit and binding assays in different cell lines, primary neurons, and synaptosomes with C3-RGD mutants. Here, we report that preincubation of cells with the GRGDNP peptide strongly reduced C3 binding to cells. Moreover, mutation of the RGD motif reduced C3 binding to intact cells and also to recombinant vimentin. Anti-integrin antibodies also lowered the C3 binding to cells. Our results indicate that the RGD motif of C3 is at least one essential C3 motif for binding to host cells and that integrin is an additional receptor for C3 besides vimentin. © 2017 by The American Society for Biochemistry and Molecular Biology, Inc.
Web server to identify similarity of amino acid motifs to compounds (SAAMCO).
Casey, Fergal P; Davey, Norman E; Baran, Ivan; Varekova, Radka Svobodova; Shields, Denis C
2008-07-01
Protein-protein interactions are fundamental in mediating biological processes including metabolism, cell growth, and signaling. To be able to selectively inhibit or induce protein activity or complex formation is a key feature in controlling disease. For those situations in which protein-protein interactions derive substantial affinity from short linear peptide sequences, or motifs, we can develop search algorithms for peptidomimetic compounds that resemble the short peptide's structure but are not compromised by poor pharmacological properties. SAAMCO is a Web service ( http://bioware.ucd.ie/ approximately saamco) that facilitates the screening of motifs with known structures against bioactive compound databases. It is built on an algorithm that defines compound similarity based on the presence of appropriate amino acid side chain fragments and a favorable Root Mean Squared Deviation (RMSD) between compound and motif structure. The methodology is efficient as the available compound databases are preprocessed and fast regular expression searches filter potential matches before time-intensive 3D superposition is performed. The required input information is minimal, and the compound databases have been selected to maximize the availability of information on biological activity. "Hits" are accompanied with a visualization window and links to source database entries. Motif matching can be defined on partial or full similarity which will increase or reduce respectively the number of potential mimetic compounds. The Web server provides the functionality for rapid screening of known or putative interaction motifs against prepared compound libraries using a novel search algorithm. The tabulated results can be analyzed by linking to appropriate databases and by visualization.
Baldeck, Nadège; Janel-Bintz, Régine; Wagner, Jérome; Tissier, Agnès; Fuchs, Robert P.; Burkovics, Peter; Haracska, Lajos; Despras, Emmanuelle; Bichara, Marc; Chatton, Bruno; Cordonnier, Agnès M.
2015-01-01
Switching between replicative and translesion synthesis (TLS) DNA polymerases are crucial events for the completion of genomic DNA synthesis when the replication machinery encounters lesions in the DNA template. In eukaryotes, the translesional DNA polymerase η (Polη) plays a central role for accurate bypass of cyclobutane pyrimidine dimers, the predominant DNA lesions induced by ultraviolet irradiation. Polη deficiency is responsible for a variant form of the Xeroderma pigmentosum (XPV) syndrome, characterized by a predisposition to skin cancer. Here, we show that the FF483–484 amino acids in the human Polη (designated F1 motif) are necessary for the interaction of this TLS polymerase with POLD2, the B subunit of the replicative DNA polymerase δ, both in vitro and in vivo. Mutating this motif impairs Polη function in the bypass of both an N-2-acetylaminofluorene adduct and a TT-CPD lesion in cellular extracts. By complementing XPV cells with different forms of Polη, we show that the F1 motif contributes to the progression of DNA synthesis and to the cell survival after UV irradiation. We propose that the integrity of the F1 motif of Polη, necessary for the Polη/POLD2 interaction, is required for the establishment of an efficient TLS complex. PMID:25662213
Yaffe, Yakey; Shepshelovitch, Jeanne; Nevo-Yassaf, Inbar; Yeheskel, Adva; Shmerling, Hedva; Kwiatek, Joanna M; Gaus, Katharina; Pasmanik-Chor, Metsada; Hirschberg, Koret
2012-08-01
Occludin (Ocln), a MARVEL-motif-containing protein, is found in all tight junctions. MARVEL motifs are comprised of four transmembrane helices associated with the localization to or formation of diverse membrane subdomains by interacting with the proximal lipid environment. The functions of the Ocln MARVEL motif are unknown. Bioinformatics sequence- and structure-based analyses demonstrated that the MARVEL domain of Ocln family proteins has distinct evolutionarily conserved sequence features that are consistent with its basolateral membrane localization. Live-cell microscopy, fluorescence resonance energy transfer (FRET) and bimolecular fluorescence complementation (BiFC) were used to analyze the intracellular distribution and self-association of fluorescent-protein-tagged full-length human Ocln or the Ocln MARVEL motif excluding the cytosolic C- and N-termini (amino acids 60-269, FP-MARVEL-Ocln). FP-MARVEL-Ocln efficiently arrived at the plasma membrane (PM) and was sorted to the basolateral PM in filter-grown polarized MDCK cells. A series of conserved aromatic amino acids within the MARVEL domain were found to be associated with Ocln dimerization using BiFC. FP-MARVEL-Ocln inhibited membrane pore growth during Triton-X-100-induced solubilization and was shown to increase the membrane-ordered state using Laurdan, a lipid dye. These data demonstrate that the Ocln MARVEL domain mediates self-association and correct sorting to the basolateral membrane.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fan, Yao; Tan, Kemin; Chhor, Gekleng
The EsxB protein from Bacillus anthracis belongs to the WXG100 family, a group of proteins secreted by a specialized secretion system. We have determined the crystal structures of recombinant EsxB and discovered that the small protein (~10 kDa), comprised of a helix-loop-helix (HLH) hairpin, is capable of associating into two different helical bundles. The two basic quaternary assemblies of EsxB are an antiparallel (AP) dimer and a rarely observed bisecting U (BU) dimer. This structural duality of EsxB is believed to originate from the heptad repeat sequence diversity of the first helix of its HLH hairpin, which allows for twomore » alternative helix packing. The flexibility of EsxB and the ability to form alternative helical bundles underscore the possibility that this protein can serve as an adaptor in secretion and can form hetero-oligomeric helix bundle(s) with other secreted members of the WXG100 family, such as EsxW. The highly conserved WXG motif is located within the loop of the HLH hairpin and is mostly buried within the helix bundle suggesting that its role is mainly structural. The exact functions of the motif, including a proposed role as a secretion signal, remain unknown.« less
Regulation of Polycystin-1 Function by Calmodulin Binding
Doerr, Nicholas; Wang, Yidi; Kipp, Kevin R.; Liu, Guangyi; Benza, Jesse J.; Pletnev, Vladimir; Pavlov, Tengis S.; Staruschenko, Alexander; Mohieldin, Ashraf M.; Takahashi, Maki; Nauli, Surya M.; Weimbs, Thomas
2016-01-01
Autosomal Dominant Polycystic Kidney Disease (ADPKD) is a common genetic disease that leads to progressive renal cyst growth and loss of renal function, and is caused by mutations in the genes encoding polycystin-1 (PC1) and polycystin-2 (PC2), respectively. The PC1/PC2 complex localizes to primary cilia and can act as a flow-dependent calcium channel in addition to numerous other signaling functions. The exact functions of the polycystins, their regulation and the purpose of the PC1/PC2 channel are still poorly understood. PC1 is an integral membrane protein with a large extracytoplasmic N-terminal domain and a short, ~200 amino acid C-terminal cytoplasmic tail. Most proteins that interact with PC1 have been found to bind via the cytoplasmic tail. Here we report that the PC1 tail has homology to the regulatory domain of myosin heavy chain including a conserved calmodulin-binding motif. This motif binds to CaM in a calcium-dependent manner. Disruption of the CaM-binding motif in PC1 does not affect PC2 binding, cilia targeting, or signaling via heterotrimeric G-proteins or STAT3. However, disruption of CaM binding inhibits the PC1/PC2 calcium channel activity and the flow-dependent calcium response in kidney epithelial cells. Furthermore, expression of CaM-binding mutant PC1 disrupts cellular energy metabolism. These results suggest that critical functions of PC1 are regulated by its ability to sense cytosolic calcium levels via binding to CaM. PMID:27560828
A conserved truncated isoform of the ATR-X syndrome protein lacking the SWI/SNF-homology domain.
Garrick, David; Samara, Vassiliki; McDowell, Tarra L; Smith, Andrew J H; Dobbie, Lorraine; Higgs, Douglas R; Gibbons, Richard J
2004-02-04
Mutations in the ATRX gene cause a severe X-linked mental retardation syndrome that is frequently associated with alpha thalassemia (ATR-X syndrome). The previously characterized ATRX protein (approximately 280 kDa) contains both a Plant homeodomain (PHD)-like zinc finger motif as well as an ATPase domain of the SNF2 family. These motifs suggest that ATRX may function as a regulator of gene expression, probably by exerting an effect on chromatin structure, although the exact cellular role of ATRX has not yet been fully elucidated. Here we characterize a truncated (approximately 200 kDa) isoform of ATRX (called here ATRXt) that has been highly conserved between mouse and human. In both species, ATRXt arises due to the failure to splice intron 11 from the primary transcript, and the use of a proximal intronic poly(A) signal. We show that the relative expression of the full length and ATRXt isoforms is subject to tissue-specific regulation. The ATRXt isoform contains the PHD-like domain but not the SWI/SNF-like motifs and is therefore unlikely to be functionally equivalent to the full length protein. We used indirect immunofluorescence to demonstrate that the full length and ATRXt isoforms are colocalized at blocks of pericentromeric heterochromatin but unlike full length ATRX, the truncated isoform does not associate with promyelocytic leukemia (PML) nuclear bodies. The high degree of conservation of ATRXt and the tight regulation of its expression relative to the full length protein suggest that this truncated isoform fulfills an important biological function.
ATM activation and its recruitment to damaged DNA require binding to the C terminus of Nbs1.
You, Zhongsheng; Chahwan, Charly; Bailis, Julie; Hunter, Tony; Russell, Paul
2005-07-01
ATM has a central role in controlling the cellular responses to DNA damage. It and other phosphoinositide 3-kinase-related kinases (PIKKs) have giant helical HEAT repeat domains in their amino-terminal regions. The functions of these domains in PIKKs are not well understood. ATM activation in response to DNA damage appears to be regulated by the Mre11-Rad50-Nbs1 (MRN) complex, although the exact functional relationship between the MRN complex and ATM is uncertain. Here we show that two pairs of HEAT repeats in fission yeast ATM (Tel1) interact with an FXF/Y motif at the C terminus of Nbs1. This interaction resembles nucleoporin FXFG motif binding to HEAT repeats in importin-beta. Budding yeast Nbs1 (Xrs2) appears to have two FXF/Y motifs that interact with Tel1 (ATM). In Xenopus egg extracts, the C terminus of Nbs1 recruits ATM to damaged DNA, where it is subsequently autophosphorylated. This interaction is essential for ATM activation. A C-terminal 147-amino-acid fragment of Nbs1 that has the Mre11- and ATM-binding domains can restore ATM activation in an Nbs1-depleted extract. We conclude that an interaction between specific HEAT repeats in ATM and the C-terminal FXF/Y domain of Nbs1 is essential for ATM activation. We propose that conformational changes in the MRN complex that occur upon binding to damaged DNA are transmitted through the FXF/Y-HEAT interface to activate ATM. This interaction also retains active ATM at sites of DNA damage.
Swellix: a computational tool to explore RNA conformational space.
Sloat, Nathan; Liu, Jui-Wen; Schroeder, Susan J
2017-11-21
The sequence of nucleotides in an RNA determines the possible base pairs for an RNA fold and thus also determines the overall shape and function of an RNA. The Swellix program presented here combines a helix abstraction with a combinatorial approach to the RNA folding problem in order to compute all possible non-pseudoknotted RNA structures for RNA sequences. The Swellix program builds on the Crumple program and can include experimental constraints on global RNA structures such as the minimum number and lengths of helices from crystallography, cryoelectron microscopy, or in vivo crosslinking and chemical probing methods. The conceptual advance in Swellix is to count helices and generate all possible combinations of helices rather than counting and combining base pairs. Swellix bundles similar helices and includes improvements in memory use and efficient parallelization. Biological applications of Swellix are demonstrated by computing the reduction in conformational space and entropy due to naturally modified nucleotides in tRNA sequences and by motif searches in Human Endogenous Retroviral (HERV) RNA sequences. The Swellix motif search reveals occurrences of protein and drug binding motifs in the HERV RNA ensemble that do not occur in minimum free energy or centroid predicted structures. Swellix presents significant improvements over Crumple in terms of efficiency and memory use. The efficient parallelization of Swellix enables the computation of sequences as long as 418 nucleotides with sufficient experimental constraints. Thus, Swellix provides a practical alternative to free energy minimization tools when multiple structures, kinetically determined structures, or complex RNA-RNA and RNA-protein interactions are present in an RNA folding problem.
Lohmann, Ingrid
2012-01-01
In multi-cellular organisms, spatiotemporal activity of cis-regulatory DNA elements depends on their occupancy by different transcription factors (TFs). In recent years, genome-wide ChIP-on-Chip, ChIP-Seq and DamID assays have been extensively used to unravel the combinatorial interaction of TFs with cis-regulatory modules (CRMs) in the genome. Even though genome-wide binding profiles are increasingly becoming available for different TFs, single TF binding profiles are in most cases not sufficient for dissecting complex regulatory networks. Thus, potent computational tools detecting statistically significant and biologically relevant TF-motif co-occurrences in genome-wide datasets are essential for analyzing context-dependent transcriptional regulation. We have developed COPS (Co-Occurrence Pattern Search), a new bioinformatics tool based on a combination of association rules and Markov chain models, which detects co-occurring TF binding sites (BSs) on genomic regions of interest. COPS scans DNA sequences for frequent motif patterns using a Frequent-Pattern tree based data mining approach, which allows efficient performance of the software with respect to both data structure and implementation speed, in particular when mining large datasets. Since transcriptional gene regulation very often relies on the formation of regulatory protein complexes mediated by closely adjoining TF binding sites on CRMs, COPS additionally detects preferred short distance between co-occurring TF motifs. The performance of our software with respect to biological significance was evaluated using three published datasets containing genomic regions that are independently bound by several TFs involved in a defined biological process. In sum, COPS is a fast, efficient and user-friendly tool mining statistically and biologically significant TFBS co-occurrences and therefore allows the identification of TFs that combinatorially regulate gene expression. PMID:23272209
Rewriting nature's assembly manual for a ssRNA virus.
Patel, Nikesh; Wroblewski, Emma; Leonov, German; Phillips, Simon E V; Tuma, Roman; Twarock, Reidun; Stockley, Peter G
2017-11-14
Satellite tobacco necrosis virus (STNV) is one of the smallest viruses known. Its genome encodes only its coat protein (CP) subunit, relying on the polymerase of its helper virus TNV for replication. The genome has been shown to contain a cryptic set of dispersed assembly signals in the form of stem-loops that each present a minimal CP-binding motif AXXA in the loops. The genomic fragment encompassing nucleotides 1-127 is predicted to contain five such packaging signals (PSs). We have used mutagenesis to determine the critical assembly features in this region. These include the CP-binding motif, the relative placement of PS stem-loops, their number, and their folding propensity. CP binding has an electrostatic contribution, but assembly nucleation is dominated by the recognition of the folded PSs in the RNA fragment. Mutation to remove all AXXA motifs in PSs throughout the genome yields an RNA that is unable to assemble efficiently. In contrast, when a synthetic 127-nt fragment encompassing improved PSs is swapped onto the RNA otherwise lacking CP recognition motifs, assembly is partially restored, although the virus-like particles created are incomplete, implying that PSs outside this region are required for correct assembly. Swapping this improved region into the wild-type STNV1 sequence results in a better assembly substrate than the viral RNA, producing complete capsids and outcompeting the wild-type genome in head-to-head competition. These data confirm details of the PS-mediated assembly mechanism for STNV and identify an efficient approach for production of stable virus-like particles encapsidating nonnative RNAs or other cargoes. Copyright © 2017 the Author(s). Published by PNAS.
Boisgerault, F; Khalil, I; Tieng, V; Connan, F; Tabary, T; Cohen, J H; Choppin, J; Charron, D; Toubert, A
1996-01-01
The peptide-binding motif of HLA-A29, the predisposing allele for birdshot retinopathy, was determined after acid-elution of endogenous peptides from purified HLA-A29 molecules. Individual and pooled HPLC fractions were sequenced by Edman degradation. Major anchor residues could be defined as glutamate at the second position of the peptide and as tyrosine at the carboxyl terminus. In vitro binding of polyglycine synthetic peptides to purified HLA-A29 molecules also revealed the need for an auxiliary anchor residue at the third position, preferably phenylalanine. By using this motif, we synthesized six peptides from the retinal soluble antigen, a candidate autoantigen in autoimmune uveoretinitis. Their in vitro binding was tested on HLA-A29 and also on HLA-B44 and HLA-B61, two alleles sharing close peptide-binding motifs. Two peptides derived from the carboxyl-terminal sequence of the human retinal soluble antigen bound efficiently to HLA-A29. This study could contribute to the prediction of T-cell epitopes from retinal autoantigens implicated in birdshot retinopathy. PMID:8622959
A One-Pot Synthesis of Dibenzofurans from 6-Diazo-2-cyclohexenones.
Zhao, Hua; Yang, Ke; Zheng, Hongyan; Ding, Ruichao; Yin, Fangjie; Wang, Ning; Li, Yun; Cheng, Bin; Wang, Huifei; Zhai, Hongbin
2015-12-04
A novel and efficient protocol for the rapid construction of dibenzofuran motifs from 6-diazo-2-cyclohexenone and ortho-haloiodobenzene has been developed. The process involves one-pot Pd-catalyzed cross-coupling/aromatization and Cu-catalyzed Ullmann coupling.
Designer self-assembling hydrogel scaffolds can impact skin cell proliferation and migration
Bradshaw, Michael; Ho, Diwei; Fear, Mark W.; Gelain, Fabrizio; Wood, Fiona M.; Iyer, K. Swaminathan
2014-01-01
There is a need to develop economical, efficient and widely available therapeutic approaches to enhance the rate of skin wound healing. The optimal outcome of wound healing is restoration to the pre-wound quality of health. In this study we investigate the cellular response to biological stimuli using functionalized nanofibers from the self-assembling peptide, RADA16. We demonstrate that adding different functional motifs to the RADA16 base peptide can influence the rate of proliferation and migration of keratinocytes and dermal fibroblasts. Relative to unmodified RADA16; the Collagen I motif significantly promotes cell migration, and reduces proliferation. PMID:25384420
Dolnik, Olga; Kolesnikova, Larissa; Welsch, Sonja; Strecker, Thomas; Schudt, Gordian; Becker, Stephan
2014-10-01
Endosomal sorting complex required for transport (ESCRT) machinery supports the efficient budding of Marburg virus (MARV) and many other enveloped viruses. Interaction between components of the ESCRT machinery and viral proteins is predominantly mediated by short tetrapeptide motifs, known as late domains. MARV contains late domain motifs in the matrix protein VP40 and in the genome-encapsidating nucleoprotein (NP). The PSAP late domain motif of NP recruits the ESCRT-I protein tumor susceptibility gene 101 (Tsg101). Here, we generated a recombinant MARV encoding NP with a mutated PSAP late domain (rMARV(PSAPmut)). rMARV(PSAPmut) was attenuated by up to one log compared with recombinant wild-type MARV (rMARV(wt)), formed smaller plaques and exhibited delayed virus release. Nucleocapsids in rMARV(PSAPmut)-infected cells were more densely packed inside viral inclusions and more abundant in the cytoplasm than in rMARV(wt)-infected cells. A similar phenotype was detected when MARV-infected cells were depleted of Tsg101. Live-cell imaging analyses revealed that Tsg101 accumulated in inclusions of rMARV(wt)-infected cells and was co-transported together with nucleocapsids. In contrast, rMARV(PSAPmut) nucleocapsids did not display co-localization with Tsg101, had significantly shorter transport trajectories, and migration close to the plasma membrane was severely impaired, resulting in reduced recruitment into filopodia, the major budding sites of MARV. We further show that the Tsg101 interacting protein IQGAP1, an actin cytoskeleton regulator, was recruited into inclusions and to individual nucleocapsids together with Tsg101. Moreover, IQGAP1 was detected in a contrail-like structure at the rear end of migrating nucleocapsids. Down regulation of IQGAP1 impaired release of MARV. These results indicate that the PSAP motif in NP, which enables binding to Tsg101, is important for the efficient actin-dependent transport of nucleocapsids to the sites of budding. Thus, the interaction between NP and Tsg101 supports several steps of MARV assembly before virus fission.
Netz, Daili J. A.; Pierik, Antonio J.; Stümpfig, Martin; Bill, Eckhard; Sharma, Anil K.; Pallesen, Leif J.; Walden, William E.; Lill, Roland
2012-01-01
The essential P-loop NTPases Cfd1 and Nbp35 of the cytosolic iron-sulfur (Fe-S) protein assembly machinery perform a scaffold function for Fe-S cluster synthesis. Both proteins contain a nucleotide binding motif of unknown function and a C-terminal motif with four conserved cysteine residues. The latter motif defines the Mrp/Nbp35 subclass of P-loop NTPases and is suspected to be involved in transient Fe-S cluster binding. To elucidate the function of these two motifs, we first created cysteine mutant proteins of Cfd1 and Nbp35 and investigated the consequences of these mutations by genetic, cell biological, biochemical, and spectroscopic approaches. The two central cysteine residues (CPXC) of the C-terminal motif were found to be crucial for cell viability, protein function, coordination of a labile [4Fe-4S] cluster, and Cfd1-Nbp35 hetero-tetramer formation. Surprisingly, the two proximal cysteine residues were dispensable for all these functions, despite their strict evolutionary conservation. Several lines of evidence suggest that the C-terminal CPXC motifs of Cfd1-Nbp35 coordinate a bridging [4Fe-4S] cluster. Upon mutation of the nucleotide binding motifs Fe-S clusters could no longer be assembled on these proteins unless wild-type copies of Cfd1 and Nbp35 were present in trans. This result indicated that Fe-S cluster loading on these scaffold proteins is a nucleotide-dependent step. We propose that the bridging coordination of the C-terminal Fe-S cluster may be ideal for its facile assembly, labile binding, and efficient transfer to target Fe-S apoproteins, a step facilitated by the cytosolic iron-sulfur (Fe-S) protein assembly proteins Nar1 and Cia1 in vivo. PMID:22362766
Salomone, Fabrizio; Cardarelli, Francesco; Di Luca, Mariagrazia; Boccardi, Claudia; Nifosì, Riccardo; Bardi, Giuseppe; Di Bari, Lorenzo; Serresi, Michela; Beltram, Fabio
2012-11-10
Efficient endocytosis into a wide range of target cells and low toxicity make the arginine-rich Tat peptide (Tat(11): YGRKKRRQRRR, residues 47-57 of HIV-1 Tat protein) an excellent transporter for delivery purposes. Unfortunately, molecules taken up by endocytosis undergo endosomal entrapment and possible metabolic degradation. Escape from the endosome is therefore actively researched. In this context, antimicrobial peptides (AMPs) provide viable templates for the design of new membrane-disruptive motifs. In particular the Cecropin-A and Melittin hybrids (CMs) are among the smallest and most effective peptides with membrane-perturbing abilities. Here we present a novel chimeric peptide in which the Tat(11) motif is fused to the CM(18) hybrid (KWKLFKKIGAVLKVLTTG, residues 1-7 of Cecropin-A and 2-12 of Melittin). When administered to cells, CM(18)-Tat(11) combines the two desired functionalities: efficient uptake and destabilization of endocytotic-vesicle membranes. We show that this chimeric peptide effectively increases cargo-molecule cytoplasm availability and allows the subsequent intracellular localization of diverse membrane-impermeable molecules (i.e. Tat(11)-EGFP fusion protein, calcein, dextrans, and plasmidic DNA) with no detectable cytotoxicity. The present results open the way to the rational engineering of "modular" cell-penetrating peptides (CPPs) that combine (i) efficient translocation from the extracellular milieu into vesicles and (ii) efficient release of molecules from vesicles into the cytoplasm. Copyright © 2012 Elsevier B.V. All rights reserved.
Hesselmann, Andreas; Görling, Andreas
2011-01-21
A recently introduced time-dependent exact-exchange (TDEXX) method, i.e., a response method based on time-dependent density-functional theory that treats the frequency-dependent exchange kernel exactly, is reformulated. In the reformulated version of the TDEXX method electronic excitation energies can be calculated by solving a linear generalized eigenvalue problem while in the original version of the TDEXX method a laborious frequency iteration is required in the calculation of each excitation energy. The lowest eigenvalues of the new TDEXX eigenvalue equation corresponding to the lowest excitation energies can be efficiently obtained by, e.g., a version of the Davidson algorithm appropriate for generalized eigenvalue problems. Alternatively, with the help of a series expansion of the new TDEXX eigenvalue equation, standard eigensolvers for large regular eigenvalue problems, e.g., the standard Davidson algorithm, can be used to efficiently calculate the lowest excitation energies. With the help of the series expansion as well, the relation between the TDEXX method and time-dependent Hartree-Fock is analyzed. Several ways to take into account correlation in addition to the exact treatment of exchange in the TDEXX method are discussed, e.g., a scaling of the Kohn-Sham eigenvalues, the inclusion of (semi)local approximate correlation potentials, or hybrids of the exact-exchange kernel with kernels within the adiabatic local density approximation. The lowest lying excitations of the molecules ethylene, acetaldehyde, and pyridine are considered as examples.
Enhancing the efficiency of sortase-mediated ligations through nickel-peptide complex formation.
David Row, R; Roark, Travis J; Philip, Marina C; Perkins, Lorena L; Antos, John M
2015-08-14
A modified sortase A recognition motif containing a masked Ni(2+)-binding peptide was employed to boost the efficiency of sortase-catalyzed ligation reactions. Deactivation of the Ni(2+)-binding peptide using a Ni(2+) additive improved reaction performance at low to equimolar ratios of the glycine amine nucleophile and sortase substrate. The success of this approach was demonstrated with both peptide and protein substrates.
Kuo, Ching-Ying; Li, Xu; Kong, Xiang-Qian; Luo, Cheng; Chang, Che-Chang; Chung, Yiyin; Shih, Hsiu-Ming; Li, Keqin Kathy; Ann, David K
2014-07-25
Krüppel-associated box domain-associated protein 1 (KAP1) is a universal transcriptional corepressor that undergoes multiple posttranslational modifications (PTMs), including SUMOylation and Ser-824 phosphorylation. However, the functional interplay of KAP1 PTMs in regulating KAP1 turnover during DNA damage response remains unclear. To decipher the role and cross-talk of multiple KAP1 PTMs, we show here that DNA double strand break-induced KAP1 Ser-824 phosphorylation promoted the recruitment of small ubiquitin-like modifier (SUMO)-targeted ubiquitin E3 ligase, ring finger protein 4 (RNF4), and subsequent RNF4-mediated, SUMO-dependent degradation. Besides the SUMO interacting motif (SIM), a previously unrecognized, but evolutionarily conserved, arginine-rich motif (ARM) in RNF4 acts as a novel recognition motif for selective target recruitment. Results from combined mutagenesis and computational modeling studies suggest that RNF4 utilizes concerted bimodular recognition, namely SIM for Lys-676 SUMOylation and ARM for Ser(P)-824 of simultaneously phosphorylated and SUMOylated KAP1 (Ser(P)-824-SUMO-KAP1). Furthermore, we proved that arginines 73 and 74 within the ARM of RNF4 are required for efficient recruitment to KAP1 or accelerated degradation of promyelocytic leukemia protein (PML) under stress. In parallel, results of bimolecular fluorescence complementation assays validated the role of the ARM in recognizing Ser(P)-824 in living cells. Taken together, we establish that the ARM is required for RNF4 to efficiently target Ser(P)-824-SUMO-KAP1, conferring ubiquitin Lys-48-mediated proteasomal degradation in the context of double strand breaks. The conservation of such a motif may possibly explain the requirement for timely substrate selectivity determination among a myriad of SUMOylated proteins under stress conditions. Thus, the ARM dynamically regulates the SIM-dependent recruitment of targets to RNF4, which could be critical to dynamically fine-tune the abundance of Ser(P)-824-SUMO-KAP1 and, potentially, other SUMOylated proteins during DNA damage response. © 2014 by The American Society for Biochemistry and Molecular Biology, Inc.
Kuo, Ching-Ying; Li, Xu; Kong, Xiang-Qian; Luo, Cheng; Chang, Che-Chang; Chung, Yiyin; Shih, Hsiu-Ming; Li, Keqin Kathy; Ann, David K.
2014-01-01
Krüppel-associated box domain-associated protein 1 (KAP1) is a universal transcriptional corepressor that undergoes multiple posttranslational modifications (PTMs), including SUMOylation and Ser-824 phosphorylation. However, the functional interplay of KAP1 PTMs in regulating KAP1 turnover during DNA damage response remains unclear. To decipher the role and cross-talk of multiple KAP1 PTMs, we show here that DNA double strand break-induced KAP1 Ser-824 phosphorylation promoted the recruitment of small ubiquitin-like modifier (SUMO)-targeted ubiquitin E3 ligase, ring finger protein 4 (RNF4), and subsequent RNF4-mediated, SUMO-dependent degradation. Besides the SUMO interacting motif (SIM), a previously unrecognized, but evolutionarily conserved, arginine-rich motif (ARM) in RNF4 acts as a novel recognition motif for selective target recruitment. Results from combined mutagenesis and computational modeling studies suggest that RNF4 utilizes concerted bimodular recognition, namely SIM for Lys-676 SUMOylation and ARM for Ser(P)-824 of simultaneously phosphorylated and SUMOylated KAP1 (Ser(P)-824-SUMO-KAP1). Furthermore, we proved that arginines 73 and 74 within the ARM of RNF4 are required for efficient recruitment to KAP1 or accelerated degradation of promyelocytic leukemia protein (PML) under stress. In parallel, results of bimolecular fluorescence complementation assays validated the role of the ARM in recognizing Ser(P)-824 in living cells. Taken together, we establish that the ARM is required for RNF4 to efficiently target Ser(P)-824-SUMO-KAP1, conferring ubiquitin Lys-48-mediated proteasomal degradation in the context of double strand breaks. The conservation of such a motif may possibly explain the requirement for timely substrate selectivity determination among a myriad of SUMOylated proteins under stress conditions. Thus, the ARM dynamically regulates the SIM-dependent recruitment of targets to RNF4, which could be critical to dynamically fine-tune the abundance of Ser(P)-824-SUMO-KAP1 and, potentially, other SUMOylated proteins during DNA damage response. PMID:24907272
Liu, Gary W; Livesay, Brynn R; Kacherovsky, Nataly A; Cieslewicz, Maryelise; Lutz, Emi; Waalkes, Adam; Jensen, Michael C; Salipante, Stephen J; Pun, Suzie H
2015-08-19
Peptide ligands are used to increase the specificity of drug carriers to their target cells and to facilitate intracellular delivery. One method to identify such peptide ligands, phage display, enables high-throughput screening of peptide libraries for ligands binding to therapeutic targets of interest. However, conventional methods for identifying target binders in a library by Sanger sequencing are low-throughput, labor-intensive, and provide a limited perspective (<0.01%) of the complete sequence space. Moreover, the small sample space can be dominated by nonspecific, preferentially amplifying "parasitic sequences" and plastic-binding sequences, which may lead to the identification of false positives or exclude the identification of target-binding sequences. To overcome these challenges, we employed next-generation Illumina sequencing to couple high-throughput screening and high-throughput sequencing, enabling more comprehensive access to the phage display library sequence space. In this work, we define the hallmarks of binding sequences in next-generation sequencing data, and develop a method that identifies several target-binding phage clones for murine, alternatively activated M2 macrophages with a high (100%) success rate: sequences and binding motifs were reproducibly present across biological replicates; binding motifs were identified across multiple unique sequences; and an unselected, amplified library accurately filtered out parasitic sequences. In addition, we validate the Multiple Em for Motif Elicitation tool as an efficient and principled means of discovering binding sequences.
Lee, Semin; Thebault, Philippe; Freschi, Luca; Beaufils, Sylvie; Blundell, Tom L.; Landry, Christian R.; Bolanos-Garcia, Victor M.; Elowe, Sabine
2012-01-01
Kinetochore targeting of the mitotic kinases Bub1, BubR1, and Mps1 has been implicated in efficient execution of their functions in the spindle checkpoint, the self-monitoring system of the eukaryotic cell cycle that ensures chromosome segregation occurs with high fidelity. In all three kinases, kinetochore docking is mediated by the N-terminal region of the protein. Deletions within this region result in checkpoint failure and chromosome segregation defects. Here, we use an interdisciplinary approach that includes biophysical, biochemical, cell biological, and bioinformatics methods to study the N-terminal region of human Mps1. We report the identification of a tandem repeat of the tetratricopeptide repeat (TPR) motif in the N-terminal kinetochore binding region of Mps1, with close homology to the tandem TPR motif of Bub1 and BubR1. Phylogenetic analysis indicates that TPR Mps1 was acquired after the split between deutorostomes and protostomes, as it is distinguishable in chordates and echinoderms. Overexpression of TPR Mps1 resulted in decreased efficiency of both chromosome alignment and mitotic arrest, likely through displacement of endogenous Mps1 from the kinetochore and decreased Mps1 catalytic activity. Taken together, our multidisciplinary strategy provides new insights into the evolution, structural organization, and function of Mps1 N-terminal region. PMID:22187426
Lee, Semin; Thebault, Philippe; Freschi, Luca; Beaufils, Sylvie; Blundell, Tom L; Landry, Christian R; Bolanos-Garcia, Victor M; Elowe, Sabine
2012-02-17
Kinetochore targeting of the mitotic kinases Bub1, BubR1, and Mps1 has been implicated in efficient execution of their functions in the spindle checkpoint, the self-monitoring system of the eukaryotic cell cycle that ensures chromosome segregation occurs with high fidelity. In all three kinases, kinetochore docking is mediated by the N-terminal region of the protein. Deletions within this region result in checkpoint failure and chromosome segregation defects. Here, we use an interdisciplinary approach that includes biophysical, biochemical, cell biological, and bioinformatics methods to study the N-terminal region of human Mps1. We report the identification of a tandem repeat of the tetratricopeptide repeat (TPR) motif in the N-terminal kinetochore binding region of Mps1, with close homology to the tandem TPR motif of Bub1 and BubR1. Phylogenetic analysis indicates that TPR Mps1 was acquired after the split between deutorostomes and protostomes, as it is distinguishable in chordates and echinoderms. Overexpression of TPR Mps1 resulted in decreased efficiency of both chromosome alignment and mitotic arrest, likely through displacement of endogenous Mps1 from the kinetochore and decreased Mps1 catalytic activity. Taken together, our multidisciplinary strategy provides new insights into the evolution, structural organization, and function of Mps1 N-terminal region.
New analytical exact solutions of time fractional KdV-KZK equation by Kudryashov methods
NASA Astrophysics Data System (ADS)
S Saha, Ray
2016-04-01
In this paper, new exact solutions of the time fractional KdV-Khokhlov-Zabolotskaya-Kuznetsov (KdV-KZK) equation are obtained by the classical Kudryashov method and modified Kudryashov method respectively. For this purpose, the modified Riemann-Liouville derivative is used to convert the nonlinear time fractional KdV-KZK equation into the nonlinear ordinary differential equation. In the present analysis, the classical Kudryashov method and modified Kudryashov method are both used successively to compute the analytical solutions of the time fractional KdV-KZK equation. As a result, new exact solutions involving the symmetrical Fibonacci function, hyperbolic function and exponential function are obtained for the first time. The methods under consideration are reliable and efficient, and can be used as an alternative to establish new exact solutions of different types of fractional differential equations arising from mathematical physics. The obtained results are exhibited graphically in order to demonstrate the efficiencies and applicabilities of these proposed methods of solving the nonlinear time fractional KdV-KZK equation.
Saranathan, Vinodkumar; Osuji, Chinedum O; Mochrie, Simon G J; Noh, Heeso; Narayanan, Suresh; Sandy, Alec; Dufresne, Eric R; Prum, Richard O
2010-06-29
Complex three-dimensional biophotonic nanostructures produce the vivid structural colors of many butterfly wing scales, but their exact nanoscale organization is uncertain. We used small angle X-ray scattering (SAXS) on single scales to characterize the 3D photonic nanostructures of five butterfly species from two families (Papilionidae, Lycaenidae). We identify these chitin and air nanostructures as single network gyroid (I4(1)32) photonic crystals. We describe their optical function from SAXS data and photonic band-gap modeling. Butterflies apparently grow these gyroid nanostructures by exploiting the self-organizing physical dynamics of biological lipid-bilayer membranes. These butterfly photonic nanostructures initially develop within scale cells as a core-shell double gyroid (Ia3d), as seen in block-copolymer systems, with a pentacontinuous volume comprised of extracellular space, cell plasma membrane, cellular cytoplasm, smooth endoplasmic reticulum (SER) membrane, and intra-SER lumen. This double gyroid nanostructure is subsequently transformed into a single gyroid network through the deposition of chitin in the extracellular space and the degeneration of the rest of the cell. The butterflies develop the thermodynamically favored double gyroid precursors as a route to the optically more efficient single gyroid nanostructures. Current approaches to photonic crystal engineering also aim to produce single gyroid motifs. The biologically derived photonic nanostructures characterized here may offer a convenient template for producing optical devices based on biomimicry or direct dielectric infiltration.
Saranathan, Vinodkumar; Osuji, Chinedum O.; Mochrie, Simon G. J.; Noh, Heeso; Narayanan, Suresh; Sandy, Alec; Dufresne, Eric R.; Prum, Richard O.
2010-01-01
Complex three-dimensional biophotonic nanostructures produce the vivid structural colors of many butterfly wing scales, but their exact nanoscale organization is uncertain. We used small angle X-ray scattering (SAXS) on single scales to characterize the 3D photonic nanostructures of five butterfly species from two families (Papilionidae, Lycaenidae). We identify these chitin and air nanostructures as single network gyroid (I4132) photonic crystals. We describe their optical function from SAXS data and photonic band-gap modeling. Butterflies apparently grow these gyroid nanostructures by exploiting the self-organizing physical dynamics of biological lipid-bilayer membranes. These butterfly photonic nanostructures initially develop within scale cells as a core-shell double gyroid (Ia3d), as seen in block-copolymer systems, with a pentacontinuous volume comprised of extracellular space, cell plasma membrane, cellular cytoplasm, smooth endoplasmic reticulum (SER) membrane, and intra-SER lumen. This double gyroid nanostructure is subsequently transformed into a single gyroid network through the deposition of chitin in the extracellular space and the degeneration of the rest of the cell. The butterflies develop the thermodynamically favored double gyroid precursors as a route to the optically more efficient single gyroid nanostructures. Current approaches to photonic crystal engineering also aim to produce single gyroid motifs. The biologically derived photonic nanostructures characterized here may offer a convenient template for producing optical devices based on biomimicry or direct dielectric infiltration. PMID:20547870
Sztuba-Solinska, Joanna; Teramoto, Tadahisa; Rausch, Jason W.; Shapiro, Bruce A.; Padmanabhan, Radhakrishnan; Le Grice, Stuart F. J.
2013-01-01
The Dengue virus (DENV) genome contains multiple cis-acting elements required for translation and replication. Previous studies indicated that a 719-nt subgenomic minigenome (DENV-MINI) is an efficient template for translation and (−) strand RNA synthesis in vitro. We performed a detailed structural analysis of DENV-MINI RNA, combining chemical acylation techniques, Pb2+ ion-induced hydrolysis and site-directed mutagenesis. Our results highlight protein-independent 5′–3′ terminal interactions involving hybridization between recognized cis-acting motifs. Probing analyses identified tandem dumbbell structures (DBs) within the 3′ terminus spaced by single-stranded regions, internal loops and hairpins with embedded GNRA-like motifs. Analysis of conserved motifs and top loops (TLs) of these dumbbells, and their proposed interactions with downstream pseudoknot (PK) regions, predicted an H-type pseudoknot involving TL1 of the 5′ DB and the complementary region, PK2. As disrupting the TL1/PK2 interaction, via ‘flipping’ mutations of PK2, previously attenuated DENV replication, this pseudoknot may participate in regulation of RNA synthesis. Computer modeling implied that this motif might function as autonomous structural/regulatory element. In addition, our studies targeting elements of the 3′ DB and its complementary region PK1 indicated that communication between 5′–3′ terminal regions strongly depends on structure and sequence composition of the 5′ cyclization region. PMID:23531545
Motifs in triadic random graphs based on Steiner triple systems
NASA Astrophysics Data System (ADS)
Winkler, Marco; Reichardt, Jörg
2013-08-01
Conventionally, pairwise relationships between nodes are considered to be the fundamental building blocks of complex networks. However, over the last decade, the overabundance of certain subnetwork patterns, i.e., the so-called motifs, has attracted much attention. It has been hypothesized that these motifs, instead of links, serve as the building blocks of network structures. Although the relation between a network's topology and the general properties of the system, such as its function, its robustness against perturbations, or its efficiency in spreading information, is the central theme of network science, there is still a lack of sound generative models needed for testing the functional role of subgraph motifs. Our work aims to overcome this limitation. We employ the framework of exponential random graph models (ERGMs) to define models based on triadic substructures. The fact that only a small portion of triads can actually be set independently poses a challenge for the formulation of such models. To overcome this obstacle, we use Steiner triple systems (STSs). These are partitions of sets of nodes into pair-disjoint triads, which thus can be specified independently. Combining the concepts of ERGMs and STSs, we suggest generative models capable of generating ensembles of networks with nontrivial triadic Z-score profiles. Further, we discover inevitable correlations between the abundance of triad patterns, which occur solely for statistical reasons and need to be taken into account when discussing the functional implications of motif statistics. Moreover, we calculate the degree distributions of our triadic random graphs analytically.
Elengoe, Asita; Hamdan, Salehhuddin
2017-12-01
In this study, we explored the possibility of determining the synergistic interactions between nucleotide-binding domain (NBD) of Homo sapiens heat-shock 70 kDa protein (Hsp70) and E1A 32 kDa of adenovirus serotype 5 motif (PNLVP) in the efficiency of killing of tumor cells in cancer treatment. At present, the protein interaction between NBD and PNLVP motif is still unknown, but believed to enhance the rate of virus replication in tumor cells. Three mutant models (E229V, H225P and D230C) were built and simulated, and their interactions with PNLVP motif were studied. The PNLVP motif showed the binding energy and intermolecular energy values with the novel E229V mutant at -7.32 and -11.2 kcal/mol. The E229V mutant had the highest number of hydrogen bonds (7). Based on the root mean square deviation, root mean square fluctuation, hydrogen bonds, salt bridge, secondary structure, surface-accessible solvent area, potential energy and distance matrices analyses, it was proved that the E229V had the strongest and most stable interaction with the PNLVP motif among all the four protein-ligand complex structures. The knowledge of this protein-ligand complex model would help in designing Hsp70 structure-based drug for cancer therapy.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fukumoto, Yasunori, E-mail: fukumoto@faculty.chiba-u.jp; Ikeuchi, Masayoshi; Nakayama, Yuji
ATR-dependent DNA damage checkpoint is the major DNA damage checkpoint against UV irradiation and DNA replication stress. The Rad17–RFC and Rad9–Rad1–Hus1 (9–1–1) complexes interact with each other to contribute to ATR signaling, however, the precise regulatory mechanism of the interaction has not been established. Here, we identified a conserved sequence motif, KYxxL, in the AAA+ domain of Rad17 protein, and demonstrated that this motif is essential for the interaction with the 9–1–1 complex. We also show that UV-induced Rad17 phosphorylation is increased in the Rad17 KYxxL mutants. These data indicate that the interaction with the 9–1–1 complex is not required formore » Rad17 protein to be an efficient substrate for the UV-induced phosphorylation. Our data also raise the possibility that the 9–1–1 complex plays a negative regulatory role in the Rad17 phosphorylation. We also show that the nucleotide-binding activity of Rad17 is required for its nuclear localization. - Highlights: • We have identified a conserved KYxxL motif in Rad17 protein. • The KYxxL motif is crucial for the interaction with the 9–1–1 complex. • The KYxxL motif is dispensable or inhibitory for UV-induced Rad17 phosphorylation. • Nucleotide binding of Rad17 is required for its nuclear localization.« less
Bello-Rivas, Juan M.; Elber, Ron
2015-01-01
A new theory and an exact computer algorithm for calculating kinetics and thermodynamic properties of a particle system are described. The algorithm avoids trapping in metastable states, which are typical challenges for Molecular Dynamics (MD) simulations on rough energy landscapes. It is based on the division of the full space into Voronoi cells. Prior knowledge or coarse sampling of space points provides the centers of the Voronoi cells. Short time trajectories are computed between the boundaries of the cells that we call milestones and are used to determine fluxes at the milestones. The flux function, an essential component of the new theory, provides a complete description of the statistical mechanics of the system at the resolution of the milestones. We illustrate the accuracy and efficiency of the exact Milestoning approach by comparing numerical results obtained on a model system using exact Milestoning with the results of long trajectories and with a solution of the corresponding Fokker-Planck equation. The theory uses an equation that resembles the approximate Milestoning method that was introduced in 2004 [A. K. Faradjian and R. Elber, J. Chem. Phys. 120(23), 10880-10889 (2004)]. However, the current formulation is exact and is still significantly more efficient than straightforward MD simulations on the system studied. PMID:25747056
Improved treatment of exact exchange in Quantum ESPRESSO
Barnes, Taylor A.; Kurth, Thorsten; Carrier, Pierre; ...
2017-01-18
Here, we present an algorithm and implementation for the parallel computation of exact exchange in Quantum ESPRESSO (QE) that exhibits greatly improved strong scaling. QE is an open-source software package for electronic structure calculations using plane wave density functional theory, and supports the use of local, semi-local, and hybrid DFT functionals. Wider application of hybrid functionals is desirable for the improved simulation of electronic band energy alignments and thermodynamic properties, but the computational complexity of evaluating the exact exchange potential limits the practical application of hybrid functionals to large systems and requires efficient implementations. We demonstrate that existing implementations ofmore » hybrid DFT that utilize a single data structure for both the local and exact exchange regions of the code are significantly limited in the degree of parallelization achievable. We present a band-pair parallelization approach, in which the calculation of exact exchange is parallelized and evaluated independently from the parallelization of the remainder of the calculation, with the wavefunction data being efficiently transformed on-the-fly into a form that is optimal for each part of the calculation. For a 64 water molecule supercell, our new algorithm reduces the overall time to solution by nearly an order of magnitude.« less
Versatile RNA tetra-U helix linking motif as a toolkit for nucleic acid nanotechnology.
Bui, My N; Brittany Johnson, M; Viard, Mathias; Satterwhite, Emily; Martins, Angelica N; Li, Zhihai; Marriott, Ian; Afonin, Kirill A; Khisamutdinov, Emil F
2017-04-01
RNA nanotechnology employs synthetically modified ribonucleic acid (RNA) to engineer highly stable nanostructures in one, two, and three dimensions for medical applications. Despite the tremendous advantages in RNA nanotechnology, unmodified RNA itself is fragile and prone to enzymatic degradation. In contrast to use traditionally modified RNA strands e.g. 2'-fluorine, 2'-amine, 2'-methyl, we studied the effect of RNA/DNA hybrid approach utilizing a computer-assisted RNA tetra-uracil (tetra-U) motif as a toolkit to address questions related to assembly efficiency, versatility, stability, and the production costs of hybrid RNA/DNA nanoparticles. The tetra-U RNA motif was implemented to construct four functional triangles using RNA, DNA and RNA/DNA mixtures, resulting in fine-tunable enzymatic and thermodynamic stabilities, immunostimulatory activity and RNAi capability. Moreover, the tetra-U toolkit has great potential in the fabrication of rectangular, pentagonal, and hexagonal NPs, representing the power of simplicity of RNA/DNA approach for RNA nanotechnology and nanomedicine community. Copyright © 2017 Elsevier Inc. All rights reserved.
Defrance, Matthieu; Janky, Rekin's; Sand, Olivier; van Helden, Jacques
2008-01-01
This protocol explains how to discover functional signals in genomic sequences by detecting over- or under-represented oligonucleotides (words) or spaced pairs thereof (dyads) with the Regulatory Sequence Analysis Tools (http://rsat.ulb.ac.be/rsat/). Two typical applications are presented: (i) predicting transcription factor-binding motifs in promoters of coregulated genes and (ii) discovering phylogenetic footprints in promoters of orthologous genes. The steps of this protocol include purging genomic sequences to discard redundant fragments, discovering over-represented patterns and assembling them to obtain degenerate motifs, scanning sequences and drawing feature maps. The main strength of the method is its statistical ground: the binomial significance provides an efficient control on the rate of false positives. In contrast with optimization-based pattern discovery algorithms, the method supports the detection of under- as well as over-represented motifs. Computation times vary from seconds (gene clusters) to minutes (whole genomes). The execution of the whole protocol should take approximately 1 h.
Wang, Yin; Li, Rudong; Zhou, Yuhua; Ling, Zongxin; Guo, Xiaokui; Xie, Lu; Liu, Lei
2016-01-01
Text data of 16S rRNA are informative for classifications of microbiota-associated diseases. However, the raw text data need to be systematically processed so that features for classification can be defined/extracted; moreover, the high-dimension feature spaces generated by the text data also pose an additional difficulty. Here we present a Phylogenetic Tree-Based Motif Finding algorithm (PMF) to analyze 16S rRNA text data. By integrating phylogenetic rules and other statistical indexes for classification, we can effectively reduce the dimension of the large feature spaces generated by the text datasets. Using the retrieved motifs in combination with common classification methods, we can discriminate different samples of both pneumonia and dental caries better than other existing methods. We extend the phylogenetic approaches to perform supervised learning on microbiota text data to discriminate the pathological states for pneumonia and dental caries. The results have shown that PMF may enhance the efficiency and reliability in analyzing high-dimension text data.
Alvarez, Bruno; Barra, Carolina; Nielsen, Morten; Andreatta, Massimo
2018-01-12
Recent advances in proteomics and mass-spectrometry have widely expanded the detectable peptide repertoire presented by major histocompatibility complex (MHC) molecules on the cell surface, collectively known as the immunopeptidome. Finely characterizing the immunopeptidome brings about important basic insights into the mechanisms of antigen presentation, but can also reveal promising targets for vaccine development and cancer immunotherapy. This report describes a number of practical and efficient approaches to analyze immunopeptidomics data, discussing the identification of meaningful sequence motifs in various scenarios and considering current limitations. Guidelines are provided for the filtering of false hits and contaminants, and to address the problem of motif deconvolution in cell lines expressing multiple MHC alleles, both for the MHC class I and class II systems. Finally, it is demonstrated how machine learning can be readily employed by non-expert users to generate accurate prediction models directly from mass-spectrometry eluted ligand data sets. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Exact Tests for the Rasch Model via Sequential Importance Sampling
ERIC Educational Resources Information Center
Chen, Yuguo; Small, Dylan
2005-01-01
Rasch proposed an exact conditional inference approach to testing his model but never implemented it because it involves the calculation of a complicated probability. This paper furthers Rasch's approach by (1) providing an efficient Monte Carlo methodology for accurately approximating the required probability and (2) illustrating the usefulness…
Constructing the Exact Significance Level for a Person-Fit Statistic.
ERIC Educational Resources Information Center
Liou, Michelle; Chang, Chih-Hsin
1992-01-01
An extension is proposed for the network algorithm introduced by C.R. Mehta and N.R. Patel to construct exact tail probabilities for testing the general hypothesis that item responses are distributed according to the Rasch model. A simulation study indicates the efficiency of the algorithm. (SLD)
Kirby, Thomas W.; Gassman, Natalie R.; Smith, Cassandra E.; ...
2015-08-25
We have characterized the nuclear localization signal (NLS) of XRCC1 structurally using X-ray crystallography and functionally using fluorescence imaging. Crystallography and binding studies confirm the bipartite nature of the XRCC1 NLS interaction with Importin α (Impα) in which the major and minor binding motifs are separated by >20 residues, and resolve previous inconsistent determinations. Binding studies of peptides corresponding to the bipartite NLS, as well as its major and minor binding motifs, to both wild-type and mutated forms of Impα reveal pronounced cooperative binding behavior that is generated by the proximity effect of the tethered major and minor motifs ofmore » the NLS. The cooperativity stems from the increased local concentration of the second motif near its cognate binding site that is a consequence of the stepwise binding behavior of the bipartite NLS. We predict that the stepwise dissociation of the NLS from Impα facilitates unloading by providing a partially complexed intermediate that is available for competitive binding by Nup50 or the Importin β binding domain. This behavior gives a basis for meeting the intrinsically conflicting high affinity and high flux requirements of an efficient nuclear transport system.« less
SMARTIV: combined sequence and structure de-novo motif discovery for in-vivo RNA binding data.
Polishchuk, Maya; Paz, Inbal; Yakhini, Zohar; Mandel-Gutfreund, Yael
2018-05-25
Gene expression regulation is highly dependent on binding of RNA-binding proteins (RBPs) to their RNA targets. Growing evidence supports the notion that both RNA primary sequence and its local secondary structure play a role in specific Protein-RNA recognition and binding. Despite the great advance in high-throughput experimental methods for identifying sequence targets of RBPs, predicting the specific sequence and structure binding preferences of RBPs remains a major challenge. We present a novel webserver, SMARTIV, designed for discovering and visualizing combined RNA sequence and structure motifs from high-throughput RNA-binding data, generated from in-vivo experiments. The uniqueness of SMARTIV is that it predicts motifs from enriched k-mers that combine information from ranked RNA sequences and their predicted secondary structure, obtained using various folding methods. Consequently, SMARTIV generates Position Weight Matrices (PWMs) in a combined sequence and structure alphabet with assigned P-values. SMARTIV concisely represents the sequence and structure motif content as a single graphical logo, which is informative and easy for visual perception. SMARTIV was examined extensively on a variety of high-throughput binding experiments for RBPs from different families, generated from different technologies, showing consistent and accurate results. Finally, SMARTIV is a user-friendly webserver, highly efficient in run-time and freely accessible via http://smartiv.technion.ac.il/.
Conservation of batik: Conseptual framework of design and process development
NASA Astrophysics Data System (ADS)
Syamwil, Rodia
2018-03-01
Development of Conservation Batik concept becomes critical due to the recessive of traditional batik as the intangible cultural heritage of humanity. The existence of printed batik, polluting process, and new stream design becomes the consequences of batik industry transformation to creative industry. Conservation Batik was proposed to answer all the threats to traditional batik, in the aspect of technique, process, and motif. However, creativities are also critical to meet consumer satisfaction. Research and development was conducted, start with the initial research in formulating the concept, and exploration of ideas to develop the designs of conservation motifs. In development steps, cyclical process to complete motif with high preferences, in the aspect of aesthetics, productivity, and efficiency. Data were collected through bibliography, documentation, observation, and interview, and analyzed in qualitative methods. The concept of Conservation Batik adopted from the principles of Universitas Negeri Semarang (UNNES) vision, as well as theoretical analyses, and expert judgment. Conservation Batik are assessed from three aspect, design, process, and consumer preferences. Conservation means the effort of safeguarding, promoting, maintaining, and preserving. Concervation Batik concept could be interpreted as batik with: (1) traditional values and authenticity; (2) the values of philosophycal meanings; (3) eco-friendly process with minimum waste; (4) conservation as idea resources of design; and (5) raising up of classic motifs.
NASA Astrophysics Data System (ADS)
Klosik, David F.; Bornholdt, Stefan; Hütt, Marc-Thorsten
2014-09-01
Following the work of Krumov et al. [Eur. Phys. J. B 84, 535 (2011), 10.1140/epjb/e2011-10746-5] we revisit the question whether the usage of large citation datasets allows for the quantitative assessment of social (by means of coauthorship of publications) influence on the progression of science. Applying a more comprehensive and well-curated dataset containing the publications in the journals of the American Physical Society during the whole 20th century we find that the measure chosen in the original study, a score based on small induced subgraphs, has to be used with caution, since the obtained results are highly sensitive to the exact implementation of the author disambiguation task.
EsxB, a secreted protein from Bacillus anthracis forms two distinct helical bundles
Fan, Yao; Tan, Kemin; Chhor, Gekleng; ...
2015-07-03
The EsxB protein from Bacillus anthracis belongs to the WXG100 family, a group of proteins secreted by a specialized secretion system. We have determined the crystal structures of recombinant EsxB and discovered that the small protein (~10 kDa), comprised of a helix-loop-helix (HLH) hairpin, is capable of associating into two different helical bundles. The two basic quaternary assemblies of EsxB are an antiparallel (AP) dimer and a rarely observed bisecting U (BU) dimer. This structural duality of EsxB is believed to originate from the heptad repeat sequence diversity of the first helix of its HLH hairpin, which allows for twomore » alternative helix packing. The flexibility of EsxB and the ability to form alternative helical bundles underscore the possibility that this protein can serve as an adaptor in secretion and can form hetero-oligomeric helix bundle(s) with other secreted members of the WXG100 family, such as EsxW. The highly conserved WXG motif is located within the loop of the HLH hairpin and is mostly buried within the helix bundle suggesting that its role is mainly structural. The exact functions of the motif, including a proposed role as a secretion signal, remain unknown.« less
NASA Astrophysics Data System (ADS)
Ghanbari, Behzad; Inc, Mustafa
2018-04-01
The present paper suggests a novel technique to acquire exact solutions of nonlinear partial differential equations. The main idea of the method is to generalize the exponential rational function method. In order to examine the ability of the method, we consider the resonant nonlinear Schrödinger equation (R-NLSE). Many variants of exact soliton solutions for the equation are derived by the proposed method. Physical interpretations of some obtained solutions is also included. One can easily conclude that the new proposed method is very efficient and finds the exact solutions of the equation in a relatively easy way.
NASA Astrophysics Data System (ADS)
Chen, Guangye; Chacon, Luis
2015-11-01
We discuss a new, conservative, fully implicit 2D3V Vlasov-Darwin particle-in-cell algorithm in curvilinear geometry for non-radiative, electromagnetic kinetic plasma simulations. Unlike standard explicit PIC schemes, fully implicit PIC algorithms are unconditionally stable and allow exact discrete energy and charge conservation. Here, we extend these algorithms to curvilinear geometry. The algorithm retains its exact conservation properties in curvilinear grids. The nonlinear iteration is effectively accelerated with a fluid preconditioner for weakly to modestly magnetized plasmas, which allows efficient use of large timesteps, O (√{mi/me}c/veT) larger than the explicit CFL. In this presentation, we will introduce the main algorithmic components of the approach, and demonstrate the accuracy and efficiency properties of the algorithm with various numerical experiments in 1D (slow shock) and 2D (island coalescense).
Sloma, Michael F.; Mathews, David H.
2016-01-01
RNA secondary structure prediction is widely used to analyze RNA sequences. In an RNA partition function calculation, free energy nearest neighbor parameters are used in a dynamic programming algorithm to estimate statistical properties of the secondary structure ensemble. Previously, partition functions have largely been used to estimate the probability that a given pair of nucleotides form a base pair, the conditional stacking probability, the accessibility to binding of a continuous stretch of nucleotides, or a representative sample of RNA structures. Here it is demonstrated that an RNA partition function can also be used to calculate the exact probability of formation of hairpin loops, internal loops, bulge loops, or multibranch loops at a given position. This calculation can also be used to estimate the probability of formation of specific helices. Benchmarking on a set of RNA sequences with known secondary structures indicated that loops that were calculated to be more probable were more likely to be present in the known structure than less probable loops. Furthermore, highly probable loops are more likely to be in the known structure than the set of loops predicted in the lowest free energy structures. PMID:27852924
An Exact Model-Based Method for Near-Field Sources Localization with Bistatic MIMO System.
Singh, Parth Raj; Wang, Yide; Chargé, Pascal
2017-03-30
In this paper, we propose an exact model-based method for near-field sources localization with a bistatic multiple input, multiple output (MIMO) radar system, and compare it with an approximated model-based method. The aim of this paper is to propose an efficient way to use the exact model of the received signals of near-field sources in order to eliminate the systematic error introduced by the use of approximated model in most existing near-field sources localization techniques. The proposed method uses parallel factor (PARAFAC) decomposition to deal with the exact model. Thanks to the exact model, the proposed method has better precision and resolution than the compared approximated model-based method. The simulation results show the performance of the proposed method.
Damsel: A Data Model Storage Library for Exascale Science
DOE Office of Scientific and Technical Information (OSTI.GOV)
Choudhary, Alok; Liao, Wei-keng
Computational science applications have been described as having one of seven motifs (the “seven dwarfs”), each having a particular pattern of computation and communication. From a storage and I/O perspective, these applications can also be grouped into a number of data model motifs describing the way data is organized and accessed during simulation, analysis, and visualization. Major storage data models developed in the 1990s, such as Network Common Data Format (netCDF) and Hierarchical Data Format (HDF) projects, created support for more complex data models. Development of both netCDF and HDF5 was influenced by multi-dimensional dataset storage requirements, but their accessmore » models and formats were designed with sequential storage in mind (e.g., a POSIX I/O model). Although these and other high-level I/O libraries have had a beneficial impact on large parallel applications, they do not always attain a high percentage of peak I/O performance due to fundamental design limitations, and they do not address the full range of current and future computational science data models. The goal of this project is to enable exascale computational science applications to interact conveniently and efficiently with storage through abstractions that match their data models. The project consists of three major activities: (1) identifying major data model motifs in computational science applications and developing representative benchmarks; (2) developing a data model storage library, called Damsel, that supports these motifs, provides efficient storage data layouts, incorporates optimizations to enable exascale operation, and is tolerant to failures; and (3) productizing Damsel and working with computational scientists to encourage adoption of this library by the scientific community. The product of this project, Damsel library, is openly available for download from http://cucis.ece.northwestern.edu/projects/DAMSEL. Several case studies and application programming interface reference are also available to assist new users to learn to use the library.« less
USDA-ARS?s Scientific Manuscript database
Advances in long-read, single molecule real-time sequencing technology and analysis software over the last two years has enabled the efficient production of closed bacterial genome sequences. However, consistent annotation of these genomes has lagged behind the ability to create them, while the avai...
Ramírez-Iglesias, José Rubén; Pérez-Gordones, María Carolina; Del Castillo, Jesús Rafael; Mijares, Alfredo; Benaim, Gustavo; Mendoza, Marta
2018-05-09
The plasma membrane Ca 2+ -ATPase (PMCA) from trypanosomatids lacks a classical calmodulin (CaM) binding domain, although CaM stimulated activities have been detected by biochemical assays. Recently we proposed that the Trypanosoma equiperdum CaM-sensitive PMCA (TePMCA) contains a potential 1-18 CaM-binding motif at the C-terminal region of the pump. In the present study, we evaluated the potential CaM-binding motifs using CaM from Trypanosoma cruzi and either the recombinant full length TePMCA C-terminal sequence (P14) or synthetic peptides comprising different regions of the C-terminal domain. We demonstrated that P14 and a synthetic peptide corresponding to residues 1037-1062 (which contains the predicted 1-18 binding motif) competed efficiently for binding to TcCaM, exhibiting similar IC 50 s of 200 nM. A stable complex of this peptide and TcCaM was formed in the presence of Ca 2+ , as determined by native-polyacrylamide gel electrophoresis. A predicted structure obtained by molecular docking showed an interaction of the 1-18 binding motif with the Ca 2+ /CaM complex. Moreover, when the peptide was incubated with CaM and Ca 2+ , a blue shift in the tryptophan fluorescence spectrum (from 350 to 329 nm) was observed. Substitutions at W 1039 and F 1056 , strongly decreased both CaM-peptide interaction and the complex assembly. Our results demonstrated the presence of a functional 1-18 motif at the TePMCA C-terminal domain. Furthermore, on the basis of spectrofluorometric assays and the resulting structure modeled by docking we propose that the L 1042 and W 1060 residues might also participate as anchors to form a 1-4-18-22 motif. Copyright © 2018 Elsevier B.V. All rights reserved.
Carral-Menoyo, Asier; Ortiz-de-Elguea, Verónica; Martinez-Nunes, Mikel; Sotomayor, Nuria; Lete, Esther
2017-01-01
Palladium-catalyzed dehydrogenative coupling is an efficient synthetic strategy for the construction of quinoline scaffolds, a privileged structure and prevalent motif in many natural and biologically active products, in particular in marine alkaloids. Thus, quinolines and 1,2-dihydroquinolines can be selectively obtained in moderate-to-good yields via intramolecular C–H alkenylation reactions, by choosing the reaction conditions. This methodology provides a direct method for the construction of this type of quinoline through an efficient and atom economical procedure, and constitutes significant advance over the existing procedures that require preactivated reaction partners. PMID:28867803
Federal Register 2010, 2011, 2012, 2013, 2014
2011-10-11
... operating systems to promote efficiency and streamline its operations. Approval of the elimination of these... of match mode because MBSD's system already attempts to find an exact match for trade input and, only if an exact match is not found, will the system revert to Net Position Match Mode. This change will...
Starodubova, E S; Kuzmenko, Y V; Latanova, A A; Preobrazhenskaya, O V; Karpov, V L
2017-01-01
The glycoprotein of rabies virus is the central antigen elicited the immune response to infection; therefore, the majority of developing anti-rabies vaccines are based on this protein. In order to increase the efficacy of DNA immunogen encoding rabies virus glycoprotein, the construction of chimeric protein with the CD63 domain has been proposed. The CD63 is a transmembrane protein localized on the cell surface and in lysosomes. The lysosome targeting motif GYEVM is located at its C-terminus. We used the domain that bears this motif (c-CD63) to generate chimeric glycoprotein in order to relocalize it into lysosomes. Here, it was shown that, in cells transfected with plasmid that encodes glycoprotein with c-CD63 motif at the C-terminus, the chimeric protein was predominantly observed in lysosomes and at the cell membrane where the unmodified glycoprotein is localized in the endoplasmic reticulum and at the cell surface. We suppose that current modification of the glycoprotein may improve the immunogenicity of anti-rabies DNA vaccines due to more efficient antibody production.
Microprocessor depends on hemin to recognize the apical loop of primary microRNA
Park, Joha; Dang, Thi Lieu; Choi, Yeon-Gil; Kim, V Narry
2018-01-01
Abstract Microprocessor, which consists of a ribonuclease III DROSHA and its cofactor DGCR8, initiates microRNA (miRNA) maturation by cleaving primary miRNA transcripts (pri-miRNAs). We recently demonstrated that the DGCR8 dimer recognizes the apical elements of pri-miRNAs, including the UGU motif, to accurately locate and orient Microprocessor on pri-miRNAs. However, the mechanism underlying the selective RNA binding remains unknown. In this study, we find that hemin, a ferric ion-containing porphyrin, enhances the specific interaction between the apical UGU motif and the DGCR8 dimer, allowing Microprocessor to achieve high efficiency and fidelity of pri-miRNA processing in vitro. Furthermore, by generating a DGCR8 mutant cell line and carrying out rescue experiments, we discover that hemin preferentially stimulates the expression of miRNAs possessing the UGU motif, thereby conferring differential regulation of miRNA maturation. Our findings reveal the molecular action mechanism of hemin in pri-miRNA processing and establish a novel function of hemin in inducing specific RNA-protein interaction. PMID:29750274
Microprocessor depends on hemin to recognize the apical loop of primary microRNA.
Nguyen, Tuan Anh; Park, Joha; Dang, Thi Lieu; Choi, Yeon-Gil; Kim, V Narry
2018-06-20
Microprocessor, which consists of a ribonuclease III DROSHA and its cofactor DGCR8, initiates microRNA (miRNA) maturation by cleaving primary miRNA transcripts (pri-miRNAs). We recently demonstrated that the DGCR8 dimer recognizes the apical elements of pri-miRNAs, including the UGU motif, to accurately locate and orient Microprocessor on pri-miRNAs. However, the mechanism underlying the selective RNA binding remains unknown. In this study, we find that hemin, a ferric ion-containing porphyrin, enhances the specific interaction between the apical UGU motif and the DGCR8 dimer, allowing Microprocessor to achieve high efficiency and fidelity of pri-miRNA processing in vitro. Furthermore, by generating a DGCR8 mutant cell line and carrying out rescue experiments, we discover that hemin preferentially stimulates the expression of miRNAs possessing the UGU motif, thereby conferring differential regulation of miRNA maturation. Our findings reveal the molecular action mechanism of hemin in pri-miRNA processing and establish a novel function of hemin in inducing specific RNA-protein interaction.
Simoni, Elena; Bergamini, Christian; Fato, Romana; Tarozzi, Andrea; Bains, Sandip; Motterlini, Roberto; Cavalli, Andrea; Bolognesi, Maria Laura; Minarini, Anna; Hrelia, Patrizia; Lenaz, Giorgio; Rosini, Michela; Melchiorre, Carlo
2010-10-14
Mitochondria-directed antioxidants 2-5 were designed by conjugating curcumin congeners with different polyamine motifs as vehicle tools. The conjugates emerged as efficient antioxidants in mitochondria and fibroblasts and also exerted a protecting role through heme oxygenase-1 activation. Notably, the insertion of a polyamine function into the curcumin-like moiety allowed an efficient intracellular uptake and mitochondria targeting. It also resulted in a significant decrease in the cytotoxicity effects. 2-5 are therefore promising molecules for neuroprotectant lead discovery.
Wurtmann, Elisabeth J.; Ratushny, Alexander V.; Pan, Min; Beer, Karlyn D.; Aitchison, John D.; Baliga, Nitin S.
2014-01-01
Summary It is known that environmental context influences the degree of regulation at the transcriptional and post-transcriptional levels. However, the principles governing the differential usage and interplay of regulation at these two levels are not clear. Here, we show that the integration of transcriptional and post-transcriptional regulatory mechanisms in a characteristic network motif drives efficient environment-dependent state transitions. Through phenotypic screening, systems analysis, and rigorous experimental validation, we discovered an RNase (VNG2099C) in Halobacterium salinarum that is transcriptionally co-regulated with genes of the aerobic physiologic state but acts on transcripts of the anaerobic state. Through modeling and experimentation we show that this arrangement generates an efficient state-transition switch, within which RNase-repression of a transcriptional positive autoregulation (RPAR) loop is critical for shutting down ATP-consuming active potassium uptake to reserve energy required for salinity adaptation under aerobic, high potassium, or dark conditions. Subsequently, we discovered that many Escherichia coli operons with energy-associated functions are also putatively controlled by RPAR indicating that this network motif may have evolved independently in phylogenetically distant organisms. Thus, our data suggest that interplay of transcriptional and post-transcriptional regulation in the RPAR motifis a generalized principle for efficient environment-dependent state transitions across prokaryotes. PMID:24612392
Approximations to the exact exchange potential: KLI versus semilocal
NASA Astrophysics Data System (ADS)
Tran, Fabien; Blaha, Peter; Betzinger, Markus; Blügel, Stefan
2016-10-01
In the search for an accurate and computationally efficient approximation to the exact exchange potential of Kohn-Sham density functional theory, we recently compared various semilocal exchange potentials to the exact one [F. Tran et al., Phys. Rev. B 91, 165121 (2015), 10.1103/PhysRevB.91.165121]. It was concluded that the Becke-Johnson (BJ) potential is a very good starting point, but requires the use of empirical parameters to obtain good agreement with the exact exchange potential. In this work, we extend the comparison by considering the Krieger-Li-Iafrate (KLI) approximation, which is a beyond-semilocal approximation. It is shown that overall the KLI- and BJ-based potentials are the most reliable approximations to the exact exchange potential, however, sizable differences, especially for the antiferromagnetic transition-metal oxides, can be obtained.
Barouch-Bentov, Rina; Neveu, Gregory; Xiao, Fei; Beer, Melanie; Bekerman, Elena; Schor, Stanford; Campbell, Joseph; Boonyaratanakornkit, Jim; Lindenbach, Brett; Lu, Albert; Jacob, Yves
2016-01-01
ABSTRACT Enveloped viruses commonly utilize late-domain motifs, sometimes cooperatively with ubiquitin, to hijack the endosomal sorting complex required for transport (ESCRT) machinery for budding at the plasma membrane. However, the mechanisms underlying budding of viruses lacking defined late-domain motifs and budding into intracellular compartments are poorly characterized. Here, we map a network of hepatitis C virus (HCV) protein interactions with the ESCRT machinery using a mammalian-cell-based protein interaction screen and reveal nine novel interactions. We identify HRS (hepatocyte growth factor-regulated tyrosine kinase substrate), an ESCRT-0 complex component, as an important entry point for HCV into the ESCRT pathway and validate its interactions with the HCV nonstructural (NS) proteins NS2 and NS5A in HCV-infected cells. Infectivity assays indicate that HRS is an important factor for efficient HCV assembly. Specifically, by integrating capsid oligomerization assays, biophysical analysis of intracellular viral particles by continuous gradient centrifugations, proteolytic digestion protection, and RNase digestion protection assays, we show that HCV co-opts HRS to mediate a late assembly step, namely, envelopment. In the absence of defined late-domain motifs, K63-linked polyubiquitinated lysine residues in the HCV NS2 protein bind the HRS ubiquitin-interacting motif to facilitate assembly. Finally, ESCRT-III and VPS/VTA1 components are also recruited by HCV proteins to mediate assembly. These data uncover involvement of ESCRT proteins in intracellular budding of a virus lacking defined late-domain motifs and a novel mechanism by which HCV gains entry into the ESCRT network, with potential implications for other viruses. PMID:27803188
Petitdemange, Caroline; Achour, Abla; Dispinseri, Stefania; Malet, Isabelle; Sennepin, Alexis; Ho Tsong Fang, Raphaël; Crouzet, Joël; Marcelin, Anne-Geneviève; Calvez, Vincent; Scarlatti, Gabriella; Debré, Patrice; Vieillard, Vincent
2013-09-01
The induction of neutralizing antibodies against conserved regions of the human immunodeficiency virus type 1 (HIV-1) envelope protein is a major goal of vaccine strategies. We previously identified 3S, a critical conserved motif of gp41 that induces the NKp44L ligand of an activating NK receptor. In vivo, anti-3S antibodies protect against the natural killer (NK) cell-mediated CD4 depletion that occurs without efficient viral neutralization. Specific substitutions within the 3S peptide motif were prepared by directed mutagenesis. Virus production was monitored by measuring the p24 production. Neutralization assays were performed with immune-purified antibodies from immunized mice and a cohort of HIV-infected patients. Expression of NKp44L on CD4(+) T cells and degranulation assay on activating NK cells were both performed by flow cytometry. Here, we show that specific substitutions in the 3S motif reduce viral infection without affecting gp41 production, while decreasing both its capacity to induce NKp44L expression on CD4(+) T cells and its sensitivity to autologous NK cells. Generation of antibodies in mice against the W614 specific position in the 3S motif elicited a capacity to neutralize cross-clade viruses, notable in its magnitude, breadth, and durability. Antibodies against this 3S variant were also detected in sera from some HIV-1-infected patients, demonstrating both neutralization activity and protection against CD4 depletion. These findings suggest that a specific substitution in a 3S-based immunogen might allow the generation of specific antibodies, providing a foundation for a rational vaccine that combine a capacity to neutralize HIV-1 and to protect CD4(+) T cells.
Morales, Lucia; Mateos-Gomez, Pedro A.; Capiscol, Carmen; del Palacio, Lorena; Sola, Isabel
2013-01-01
Preferential RNA packaging in coronaviruses involves the recognition of viral genomic RNA, a crucial process for viral particle morphogenesis mediated by RNA-specific sequences, known as packaging signals. An essential packaging signal component of transmissible gastroenteritis coronavirus (TGEV) has been further delimited to the first 598 nucleotides (nt) from the 5′ end of its RNA genome, by using recombinant viruses transcribing subgenomic mRNA that included potential packaging signals. The integrity of the entire sequence domain was necessary because deletion of any of the five structural motifs defined within this region abrogated specific packaging of this viral RNA. One of these RNA motifs was the stem-loop SL5, a highly conserved motif in coronaviruses located at nucleotide positions 106 to 136. Partial deletion or point mutations within this motif also abrogated packaging. Using TGEV-derived defective minigenomes replicated in trans by a helper virus, we have shown that TGEV RNA packaging is a replication-independent process. Furthermore, the last 494 nt of the genomic 3′ end were not essential for packaging, although this region increased packaging efficiency. TGEV RNA sequences identified as necessary for viral genome packaging were not sufficient to direct packaging of a heterologous sequence derived from the green fluorescent protein gene. These results indicated that TGEV genome packaging is a complex process involving many factors in addition to the identified RNA packaging signal. The identification of well-defined RNA motifs within the TGEV RNA genome that are essential for packaging will be useful for designing packaging-deficient biosafe coronavirus-derived vectors and providing new targets for antiviral therapies. PMID:23966403
Effective Feature Selection for Classification of Promoter Sequences.
K, Kouser; P G, Lavanya; Rangarajan, Lalitha; K, Acharya Kshitish
2016-01-01
Exploring novel computational methods in making sense of biological data has not only been a necessity, but also productive. A part of this trend is the search for more efficient in silico methods/tools for analysis of promoters, which are parts of DNA sequences that are involved in regulation of expression of genes into other functional molecules. Promoter regions vary greatly in their function based on the sequence of nucleotides and the arrangement of protein-binding short-regions called motifs. In fact, the regulatory nature of the promoters seems to be largely driven by the selective presence and/or the arrangement of these motifs. Here, we explore computational classification of promoter sequences based on the pattern of motif distributions, as such classification can pave a new way of functional analysis of promoters and to discover the functionally crucial motifs. We make use of Position Specific Motif Matrix (PSMM) features for exploring the possibility of accurately classifying promoter sequences using some of the popular classification techniques. The classification results on the complete feature set are low, perhaps due to the huge number of features. We propose two ways of reducing features. Our test results show improvement in the classification output after the reduction of features. The results also show that decision trees outperform SVM (Support Vector Machine), KNN (K Nearest Neighbor) and ensemble classifier LibD3C, particularly with reduced features. The proposed feature selection methods outperform some of the popular feature transformation methods such as PCA and SVD. Also, the methods proposed are as accurate as MRMR (feature selection method) but much faster than MRMR. Such methods could be useful to categorize new promoters and explore regulatory mechanisms of gene expressions in complex eukaryotic species.
Zhang, Yanju; Lameijer, Eric-Wubbo; 't Hoen, Peter A C; Ning, Zemin; Slagboom, P Eline; Ye, Kai
2012-02-15
RNA-seq is a powerful technology for the study of transcriptome profiles that uses deep-sequencing technologies. Moreover, it may be used for cellular phenotyping and help establishing the etiology of diseases characterized by abnormal splicing patterns. In RNA-Seq, the exact nature of splicing events is buried in the reads that span exon-exon boundaries. The accurate and efficient mapping of these reads to the reference genome is a major challenge. We developed PASSion, a pattern growth algorithm-based pipeline for splice site detection in paired-end RNA-Seq reads. Comparing the performance of PASSion to three existing RNA-Seq analysis pipelines, TopHat, MapSplice and HMMSplicer, revealed that PASSion is competitive with these packages. Moreover, the performance of PASSion is not affected by read length and coverage. It performs better than the other three approaches when detecting junctions in highly abundant transcripts. PASSion has the ability to detect junctions that do not have known splicing motifs, which cannot be found by the other tools. Of the two public RNA-Seq datasets, PASSion predicted ≈ 137,000 and 173,000 splicing events, of which on average 82 are known junctions annotated in the Ensembl transcript database and 18% are novel. In addition, our package can discover differential and shared splicing patterns among multiple samples. The code and utilities can be freely downloaded from https://trac.nbic.nl/passion and ftp://ftp.sanger.ac.uk/pub/zn1/passion.
Exact consideration of data redundancies for spiral cone-beam CT
NASA Astrophysics Data System (ADS)
Lauritsch, Guenter; Katsevich, Alexander; Hirsch, Michael
2004-05-01
In multi-slice spiral computed tomography (CT) there is an obvious trend in adding more and more detector rows. The goals are numerous: volume coverage, isotropic spatial resolution, and speed. Consequently, there will be a variety of scan protocols optimizing clinical applications. Flexibility in table feed requires consideration of data redundancies to ensure efficient detector usage. Until recently this was achieved by approximate reconstruction algorithms only. However, due to the increasing cone angles there is a need of exact treatment of the cone beam geometry. A new, exact and efficient 3-PI algorithm for considering three-fold data redundancies was derived from a general, theoretical framework based on 3D Radon inversion using Grangeat's formula. The 3-PI algorithm possesses a simple and efficient structure as the 1-PI method for non-redundant data previously proposed. Filtering is one-dimensional, performed along lines with variable tilt on the detector. This talk deals with a thorough evaluation of the performance of the 3-PI algorithm in comparison to the 1-PI method. Image quality of the 3-PI algorithm is superior. The prominent spiral artifacts and other discretization artifacts are significantly reduced due to averaging effects when taking into account redundant data. Certainly signal-to-noise ratio is increased. The computational expense is comparable even to that of approximate algorithms. The 3-PI algorithm proves its practicability for applications in medical imaging. Other exact n-PI methods for n-fold data redundancies (n odd) can be deduced from the general, theoretical framework.
Solving Integer Programs from Dependence and Synchronization Problems
1993-03-01
DEFF.NSNE Solving Integer Programs from Dependence and Synchronization Problems Jaspal Subhlok March 1993 CMU-CS-93-130 School of Computer ScienceT IC...method Is an exact and efficient way of solving integer programming problems arising in dependence and synchronization analysis of parallel programs...7/;- p Keywords: Exact dependence tesing, integer programming. parallelilzng compilers, parallel program analysis, synchronization analysis Solving
Federal Register 2010, 2011, 2012, 2013, 2014
2011-08-31
... undertaking a rewrite of its internal software applications and operating systems to promote efficiency and... believes there is no need to provide participants with a choice of match mode because MBSD's system already attempts to find an exact match for trade input and, only if an exact match is not found, will the system...
DOE Office of Scientific and Technical Information (OSTI.GOV)
Barnes, Taylor A.; Kurth, Thorsten; Carrier, Pierre
Here, we present an algorithm and implementation for the parallel computation of exact exchange in Quantum ESPRESSO (QE) that exhibits greatly improved strong scaling. QE is an open-source software package for electronic structure calculations using plane wave density functional theory, and supports the use of local, semi-local, and hybrid DFT functionals. Wider application of hybrid functionals is desirable for the improved simulation of electronic band energy alignments and thermodynamic properties, but the computational complexity of evaluating the exact exchange potential limits the practical application of hybrid functionals to large systems and requires efficient implementations. We demonstrate that existing implementations ofmore » hybrid DFT that utilize a single data structure for both the local and exact exchange regions of the code are significantly limited in the degree of parallelization achievable. We present a band-pair parallelization approach, in which the calculation of exact exchange is parallelized and evaluated independently from the parallelization of the remainder of the calculation, with the wavefunction data being efficiently transformed on-the-fly into a form that is optimal for each part of the calculation. For a 64 water molecule supercell, our new algorithm reduces the overall time to solution by nearly an order of magnitude.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bello-Rivas, Juan M.; Elber, Ron; Department of Chemistry, University of Texas at Austin, Austin, Texas 78712
A new theory and an exact computer algorithm for calculating kinetics and thermodynamic properties of a particle system are described. The algorithm avoids trapping in metastable states, which are typical challenges for Molecular Dynamics (MD) simulations on rough energy landscapes. It is based on the division of the full space into Voronoi cells. Prior knowledge or coarse sampling of space points provides the centers of the Voronoi cells. Short time trajectories are computed between the boundaries of the cells that we call milestones and are used to determine fluxes at the milestones. The flux function, an essential component of themore » new theory, provides a complete description of the statistical mechanics of the system at the resolution of the milestones. We illustrate the accuracy and efficiency of the exact Milestoning approach by comparing numerical results obtained on a model system using exact Milestoning with the results of long trajectories and with a solution of the corresponding Fokker-Planck equation. The theory uses an equation that resembles the approximate Milestoning method that was introduced in 2004 [A. K. Faradjian and R. Elber, J. Chem. Phys. 120(23), 10880-10889 (2004)]. However, the current formulation is exact and is still significantly more efficient than straightforward MD simulations on the system studied.« less
Foulk, Michael S.; Urban, John M.; Casella, Cinzia; Gerbi, Susan A.
2015-01-01
Nascent strand sequencing (NS-seq) is used to discover DNA replication origins genome-wide, allowing identification of features for their specification. NS-seq depends on the ability of lambda exonuclease (λ-exo) to efficiently digest parental DNA while leaving RNA-primer protected nascent strands intact. We used genomics and biochemical approaches to determine if λ-exo digests all parental DNA sequences equally. We report that λ-exo does not efficiently digest G-quadruplex (G4) structures in a plasmid. Moreover, λ-exo digestion of nonreplicating genomic DNA (LexoG0) enriches GC-rich DNA and G4 motifs genome-wide. We used LexoG0 data to control for nascent strand–independent λ-exo biases in NS-seq and validated this approach at the rDNA locus. The λ-exo–controlled NS-seq peaks are not GC-rich, and only 35.5% overlap with 6.8% of all G4s, suggesting that G4s are not general determinants for origin specification but may play a role for a subset. Interestingly, we observed a periodic spacing of G4 motifs and nucleosomes around the peak summits, suggesting that G4s may position nucleosomes at this subset of origins. Finally, we demonstrate that use of Na+ instead of K+ in the λ-exo digestion buffer reduced the effect of G4s on λ-exo digestion and discuss ways to increase both the sensitivity and specificity of NS-seq. PMID:25695952
Foulk, Michael S; Urban, John M; Casella, Cinzia; Gerbi, Susan A
2015-05-01
Nascent strand sequencing (NS-seq) is used to discover DNA replication origins genome-wide, allowing identification of features for their specification. NS-seq depends on the ability of lambda exonuclease (λ-exo) to efficiently digest parental DNA while leaving RNA-primer protected nascent strands intact. We used genomics and biochemical approaches to determine if λ-exo digests all parental DNA sequences equally. We report that λ-exo does not efficiently digest G-quadruplex (G4) structures in a plasmid. Moreover, λ-exo digestion of nonreplicating genomic DNA (LexoG0) enriches GC-rich DNA and G4 motifs genome-wide. We used LexoG0 data to control for nascent strand-independent λ-exo biases in NS-seq and validated this approach at the rDNA locus. The λ-exo-controlled NS-seq peaks are not GC-rich, and only 35.5% overlap with 6.8% of all G4s, suggesting that G4s are not general determinants for origin specification but may play a role for a subset. Interestingly, we observed a periodic spacing of G4 motifs and nucleosomes around the peak summits, suggesting that G4s may position nucleosomes at this subset of origins. Finally, we demonstrate that use of Na(+) instead of K(+) in the λ-exo digestion buffer reduced the effect of G4s on λ-exo digestion and discuss ways to increase both the sensitivity and specificity of NS-seq. © 2015 Foulk et al.; Published by Cold Spring Harbor Laboratory Press.
Gál, Zita; Hegedüs, Csilla; Szakács, Gergely; Váradi, András; Sarkadi, Balázs; Özvegy-Laczka, Csilla
2015-02-01
Human ABCG2 is a plasma membrane glycoprotein causing multidrug resistance in cancer. Membrane cholesterol and bile acids are efficient regulators of ABCG2 function, while the molecular nature of the sterol-sensing sites has not been elucidated. The cholesterol recognition amino acid consensus (CRAC, L/V-(X)(1-5)-Y-(X)(1-5)-R/K) sequence is one of the conserved motifs involved in cholesterol binding in several proteins. We have identified five potential CRAC motifs in the transmembrane domain of the human ABCG2 protein. In order to define their roles in sterol-sensing, the central tyrosines of these CRACs (Y413, 459, 469, 570 and 645) were mutated to S or F and the mutants were expressed both in insect and mammalian cells. We found that mutation in Y459 prevented protein expression; the Y469S and Y645S mutants lost their activity; while the Y570S, Y469F, and Y645F mutants retained function as well as cholesterol and bile acid sensitivity. We found that in the case of the Y413S mutant, drug transport was efficient, while modulation of the ATPase activity by cholesterol and bile acids was significantly altered. We suggest that the Y413 residue within a putative CRAC motif has a role in sterol-sensing and the ATPase/drug transport coupling in the ABCG2 multidrug transporter. Copyright © 2014. Published by Elsevier B.V.
Igura, Mayumi; Kohda, Daisuke
2011-04-15
Asn-linked glycosylation is the most ubiquitous posttranslational protein modification in eukaryotes and archaea, and in some eubacteria. Oligosaccharyltransferase (OST) catalyzes the transfer of preassembled oligosaccharides on lipid carriers onto asparagine residues in polypeptide chains. Inefficient oligosaccharide transfer results in glycoprotein heterogeneity, which is particularly bothersome in pharmaceutical glycoprotein production. Amino acid variation at the X position of the Asn-X-Ser/Thr sequon is known to modulate the glycosylation efficiency. The best amino acid at X is valine, for an archaeal Pyrococcus furiosus OST. We performed a systematic alanine mutagenesis study of the archaeal OST to identify the essential and dispensable amino acid residues in the three catalytic motifs. We then investigated the effects of the dispensable mutations on the amino acid preference in the N-glycosylation sequon. One residue position was found to selectively affect the amino acid preference at the X position. This residue is located within the recently identified DXXKXXX(M/I) motif, suggesting the involvement of this motif in N-glycosylation sequon recognition. In applications, mutations at this position may facilitate the design of OST variants adapted to particular N-glycosylation sites to reduce the heterogeneity of glycan occupancy. In fact, a mutation at this position led to 9-fold higher activity relative to the wild-type enzyme, toward a peptide containing arginine at X in place of valine. This mutational approach is potentially applicable to eukaryotic and eubacterial OSTs for the production of homogenous glycoproteins in engineered mammalian and Escherichia coli cells.
Igura, Mayumi; Kohda, Daisuke
2011-01-01
Asn-linked glycosylation is the most ubiquitous posttranslational protein modification in eukaryotes and archaea, and in some eubacteria. Oligosaccharyltransferase (OST) catalyzes the transfer of preassembled oligosaccharides on lipid carriers onto asparagine residues in polypeptide chains. Inefficient oligosaccharide transfer results in glycoprotein heterogeneity, which is particularly bothersome in pharmaceutical glycoprotein production. Amino acid variation at the X position of the Asn-X-Ser/Thr sequon is known to modulate the glycosylation efficiency. The best amino acid at X is valine, for an archaeal Pyrococcus furiosus OST. We performed a systematic alanine mutagenesis study of the archaeal OST to identify the essential and dispensable amino acid residues in the three catalytic motifs. We then investigated the effects of the dispensable mutations on the amino acid preference in the N-glycosylation sequon. One residue position was found to selectively affect the amino acid preference at the X position. This residue is located within the recently identified DXXKXXX(M/I) motif, suggesting the involvement of this motif in N-glycosylation sequon recognition. In applications, mutations at this position may facilitate the design of OST variants adapted to particular N-glycosylation sites to reduce the heterogeneity of glycan occupancy. In fact, a mutation at this position led to 9-fold higher activity relative to the wild-type enzyme, toward a peptide containing arginine at X in place of valine. This mutational approach is potentially applicable to eukaryotic and eubacterial OSTs for the production of homogenous glycoproteins in engineered mammalian and Escherichia coli cells. PMID:21357684
A Study of the Effects of Altitude on Thermal Ice Protection System Performance
NASA Technical Reports Server (NTRS)
Addy, Gene; Oleskiw, Myron; Broeren, Andy P.; Orchard, David
2013-01-01
Thermal ice protection systems use heat energy to prevent a dangerous buildup of ice on an aircraft. As aircraft become more efficient, less heat energy is available to operate a thermal ice protections system. This requires that thermal ice protection systems be designed to more exacting standards so as to more efficiently prevent a dangerous ice buildup without adversely affecting aircraft safety. While the effects of altitude have always beeing taked into account in the design of thermal ice protection systems, a better understanding of these effects is needed so as to enable more exact design, testing, and evaluation of these systems.
Efficient global biopolymer sampling with end-transfer configurational bias Monte Carlo
NASA Astrophysics Data System (ADS)
Arya, Gaurav; Schlick, Tamar
2007-01-01
We develop an "end-transfer configurational bias Monte Carlo" method for efficient thermodynamic sampling of complex biopolymers and assess its performance on a mesoscale model of chromatin (oligonucleosome) at different salt conditions compared to other Monte Carlo moves. Our method extends traditional configurational bias by deleting a repeating motif (monomer) from one end of the biopolymer and regrowing it at the opposite end using the standard Rosenbluth scheme. The method's sampling efficiency compared to local moves, pivot rotations, and standard configurational bias is assessed by parameters relating to translational, rotational, and internal degrees of freedom of the oligonucleosome. Our results show that the end-transfer method is superior in sampling every degree of freedom of the oligonucleosomes over other methods at high salt concentrations (weak electrostatics) but worse than the pivot rotations in terms of sampling internal and rotational sampling at low-to-moderate salt concentrations (strong electrostatics). Under all conditions investigated, however, the end-transfer method is several orders of magnitude more efficient than the standard configurational bias approach. This is because the characteristic sampling time of the innermost oligonucleosome motif scales quadratically with the length of the oligonucleosomes for the end-transfer method while it scales exponentially for the traditional configurational-bias method. Thus, the method we propose can significantly improve performance for global biomolecular applications, especially in condensed systems with weak nonbonded interactions and may be combined with local enhancements to improve local sampling.
NASA Astrophysics Data System (ADS)
Lu, Dianchen; Seadawy, Aly R.; Ali, Asghar
2018-06-01
In this current work, we employ novel methods to find the exact travelling wave solutions of Modified Liouville equation and the Symmetric Regularized Long Wave equation, which are called extended simple equation and exp(-Ψ(ξ))-expansion methods. By assigning the different values to the parameters, different types of the solitary wave solutions are derived from the exact traveling wave solutions, which shows the efficiency and precision of our methods. Some solutions have been represented by graphical. The obtained results have several applications in physical science.
A survey of motif finding Web tools for detecting binding site motifs in ChIP-Seq data
2014-01-01
Abstract ChIP-Seq (chromatin immunoprecipitation sequencing) has provided the advantage for finding motifs as ChIP-Seq experiments narrow down the motif finding to binding site locations. Recent motif finding tools facilitate the motif detection by providing user-friendly Web interface. In this work, we reviewed nine motif finding Web tools that are capable for detecting binding site motifs in ChIP-Seq data. We showed each motif finding Web tool has its own advantages for detecting motifs that other tools may not discover. We recommended the users to use multiple motif finding Web tools that implement different algorithms for obtaining significant motifs, overlapping resemble motifs, and non-overlapping motifs. Finally, we provided our suggestions for future development of motif finding Web tool that better assists researchers for finding motifs in ChIP-Seq data. Reviewers This article was reviewed by Prof. Sandor Pongor, Dr. Yuriy Gusev, and Dr. Shyam Prabhakar (nominated by Prof. Limsoon Wong). PMID:24555784
HIV-1 nucleocapsid protein localizes efficiently to the nucleus and nucleolus.
Yu, Kyung Lee; Lee, Sun Hee; Lee, Eun Soo; You, Ji Chang
2016-05-01
The HIV-1 nucleocapsid (NC) is an essential viral protein containing two highly conserved retroviral-type zinc finger (ZF) motifs, which functions in multiple stages of the HIV-1 life cycle. Although a number of functions for NC either in its mature form or as a domain of Gag have been revealed, little is known about the intracellular localization of NC and, moreover, its role in Gag protein trafficking. Here, we have investigated various forms of HIV-1 NC protein for its cellular localization and found that the NC has a strong nuclear and nucleolar localization activity. The linker region, composed of a stretch of basic amino acids between the two ZF motifs, was necessary and sufficient for the activity. Copyright © 2016 Elsevier Inc. All rights reserved.
Sloma, Michael F; Mathews, David H
2016-12-01
RNA secondary structure prediction is widely used to analyze RNA sequences. In an RNA partition function calculation, free energy nearest neighbor parameters are used in a dynamic programming algorithm to estimate statistical properties of the secondary structure ensemble. Previously, partition functions have largely been used to estimate the probability that a given pair of nucleotides form a base pair, the conditional stacking probability, the accessibility to binding of a continuous stretch of nucleotides, or a representative sample of RNA structures. Here it is demonstrated that an RNA partition function can also be used to calculate the exact probability of formation of hairpin loops, internal loops, bulge loops, or multibranch loops at a given position. This calculation can also be used to estimate the probability of formation of specific helices. Benchmarking on a set of RNA sequences with known secondary structures indicated that loops that were calculated to be more probable were more likely to be present in the known structure than less probable loops. Furthermore, highly probable loops are more likely to be in the known structure than the set of loops predicted in the lowest free energy structures. © 2016 Sloma and Mathews; Published by Cold Spring Harbor Laboratory Press for the RNA Society.
Nonlinear Multidimensional Assignment Problems Efficient Conic Optimization Methods and Applications
2015-06-24
WORK UNIT NUMBER 7. PERFORMING ORGANIZATION NAME(S) AND ADDRESS(ES) Arizona State University School of Mathematical & Statistical Sciences 901 S...SUPPLEMENTARY NOTES 14. ABSTRACT The major goals of this project were completed: the exact solution of previously unsolved challenging combinatorial optimization... combinatorial optimization problem, the Directional Sensor Problem, was solved in two ways. First, heuristically in an engineering fashion and second, exactly
The snoRNA domain of vertebrate telomerase RNA functions to localize the RNA within the nucleus.
Lukowiak, A A; Narayanan, A; Li, Z H; Terns, R M; Terns, M P
2001-01-01
Telomerase RNA is an essential component of the ribonucleoprotein enzyme involved in telomere length maintenance, a process implicated in cellular senescence and cancer. Vertebrate telomerase RNAs contain a box H/ACA snoRNA motif that is not required for telomerase activity in vitro but is essential in vivo. Using the Xenopus oocyte system, we have found that the box H/ACA motif functions in the subcellular localization of telomerase RNA. We have characterized the transport and biogenesis of telomerase RNA by injecting labeled wild-type and variant RNAs into Xenopus oocytes and assaying nucleocytoplasmic distribution, intranuclear localization, modification, and protein binding. Although yeast telomerase RNA shares characteristics of spliceosomal snRNAs, we show that human telomerase RNA is not associated with Sm proteins or efficiently imported into the nucleus. In contrast, the transport properties of vertebrate telomerase RNA resemble those of snoRNAs; telomerase RNA is retained in the nucleus and targeted to nucleoli. Furthermore, both nuclear retention and nucleolar localization depend on the box H/ACA motif. Our findings suggest that the H/ACA motif confers functional localization of vertebrate telomerase RNAs to the nucleus, the compartment where telomeres are synthesized. We have also found that telomerase RNA localizes to Cajal bodies, intranuclear structures where it is thought that assembly of various cellular RNPs takes place. Our results identify the Cajal body as a potential site of telomerase RNP biogenesis. PMID:11780638
Rajkovic, Andrei; Hummels, Katherine R; Witzky, Anne; Erickson, Sarah; Gafken, Philip R; Whitelegge, Julian P; Faull, Kym F; Kearns, Daniel B; Ibba, Michael
2016-05-20
Elongation factor P (EF-P) accelerates diprolyl synthesis and requires a posttranslational modification to maintain proteostasis. Two phylogenetically distinct EF-P modification pathways have been described and are encoded in the majority of Gram-negative bacteria, but neither is present in Gram-positive bacteria. Prior work suggested that the EF-P-encoding gene (efp) primarily supports Bacillus subtilis swarming differentiation, whereas EF-P in Gram-negative bacteria has a more global housekeeping role, prompting our investigation to determine whether EF-P is modified and how it impacts gene expression in motile cells. We identified a 5-aminopentanol moiety attached to Lys(32) of B. subtilis EF-P that is required for swarming motility. A fluorescent in vivo B. subtilis reporter system identified peptide motifs whose efficient synthesis was most dependent on 5-aminopentanol EF-P. Examination of the B. subtilis genome sequence showed that these EF-P-dependent peptide motifs were represented in flagellar genes. Taken together, these data show that, in B. subtilis, a previously uncharacterized posttranslational modification of EF-P can modulate the synthesis of specific diprolyl motifs present in proteins required for swarming motility. © 2016 by The American Society for Biochemistry and Molecular Biology, Inc.
A common minimal motif for the ligands of HLA-B*27 class I molecules.
Barriga, Alejandro; Lorente, Elena; Johnstone, Carolina; Mir, Carmen; del Val, Margarita; López, Daniel
2014-01-01
CD8(+) T cells identify and kill infected cells through the specific recognition of short viral antigens bound to human major histocompatibility complex (HLA) class I molecules. The colossal number of polymorphisms in HLA molecules makes it essential to characterize the antigen-presenting properties common to large HLA families or supertypes. In this context, the HLA-B*27 family comprising at least 100 different alleles, some of them widely distributed in the human population, is involved in the cellular immune response against pathogens and also associated to autoimmune spondyloarthritis being thus a relevant target of study. To this end, HLA binding assays performed using nine HLA-B*2705-restricted ligands endogenously processed and presented in virus-infected cells revealed a common minimal peptide motif for efficient binding to the HLA-B*27 family. The motif was independently confirmed using four unrelated peptides. This experimental approach, which could be easily transferred to other HLA class I families and supertypes, has implications for the validation of new bioinformatics tools in the functional clustering of HLA molecules, for the identification of antiviral cytotoxic T lymphocyte responses, and for future vaccine development.
[Prediction of Promoter Motifs in Virophages].
Gong, Chaowen; Zhou, Xuewen; Pan, Yingjie; Wang, Yongjie
2015-07-01
Virophages have crucial roles in ecosystems and are the transport vectors of genetic materials. To shed light on regulation and control mechanisms in virophage--host systems as well as evolution between virophages and their hosts, the promoter motifs of virophages were predicted on the upstream regions of start codons using an analytical tool for prediction of promoter motifs: Multiple EM for Motif Elicitation. Seventeen potential promoter motifs were identified based on the E-value, location, number and length of promoters in genomes. Sputnik and zamilon motif 2 with AT-rich regions were distributed widely on genomes, suggesting that these motifs may be associated with regulation of the expression of various genes. Motifs containing the TCTA box were predicted to be late promoter motif in mavirus; motifs containing the ATCT box were the potential late promoter motif in the Ace Lake mavirus . AT-rich regions were identified on motif 2 in the Organic Lake virophage, motif 3 in Yellowstone Lake virophage (YSLV)1 and 2, motif 1 in YSLV3, and motif 1 and 2 in YSLV4, respectively. AT-rich regions were distributed widely on the genomes of virophages. All of these motifs may be promoter motifs of virophages. Our results provide insights into further exploration of temporal expression of genes in virophages as well as associations between virophages and giant viruses.
NASA Astrophysics Data System (ADS)
Toufik, Mekkaoui; Atangana, Abdon
2017-10-01
Recently a new concept of fractional differentiation with non-local and non-singular kernel was introduced in order to extend the limitations of the conventional Riemann-Liouville and Caputo fractional derivatives. A new numerical scheme has been developed, in this paper, for the newly established fractional differentiation. We present in general the error analysis. The new numerical scheme was applied to solve linear and non-linear fractional differential equations. We do not need a predictor-corrector to have an efficient algorithm, in this method. The comparison of approximate and exact solutions leaves no doubt believing that, the new numerical scheme is very efficient and converges toward exact solution very rapidly.
Song, H Francis; Wang, Xiao-Jing
2014-12-01
Small-world networks-complex networks characterized by a combination of high clustering and short path lengths-are widely studied using the paradigmatic model of Watts and Strogatz (WS). Although the WS model is already quite minimal and intuitive, we describe an alternative formulation of the WS model in terms of a distance-dependent probability of connection that further simplifies, both practically and theoretically, the generation of directed and undirected WS-type small-world networks. In addition to highlighting an essential feature of the WS model that has previously been overlooked, namely the equivalence to a simple distance-dependent model, this alternative formulation makes it possible to derive exact expressions for quantities such as the degree and motif distributions and global clustering coefficient for both directed and undirected networks in terms of model parameters.
NASA Astrophysics Data System (ADS)
Song, H. Francis; Wang, Xiao-Jing
2014-12-01
Small-world networks—complex networks characterized by a combination of high clustering and short path lengths—are widely studied using the paradigmatic model of Watts and Strogatz (WS). Although the WS model is already quite minimal and intuitive, we describe an alternative formulation of the WS model in terms of a distance-dependent probability of connection that further simplifies, both practically and theoretically, the generation of directed and undirected WS-type small-world networks. In addition to highlighting an essential feature of the WS model that has previously been overlooked, namely the equivalence to a simple distance-dependent model, this alternative formulation makes it possible to derive exact expressions for quantities such as the degree and motif distributions and global clustering coefficient for both directed and undirected networks in terms of model parameters.
Structural and functional networks in complex systems with delay.
Eguíluz, Víctor M; Pérez, Toni; Borge-Holthoefer, Javier; Arenas, Alex
2011-05-01
Functional networks of complex systems are obtained from the analysis of the temporal activity of their components, and are often used to infer their unknown underlying connectivity. We obtain the equations relating topology and function in a system of diffusively delay-coupled elements in complex networks. We solve exactly the resulting equations in motifs (directed structures of three nodes) and in directed networks. The mean-field solution for directed uncorrelated networks shows that the clusterization of the activity is dominated by the in-degree of the nodes, and that the locking frequency decreases with increasing average degree. We find that the exponent of a power law degree distribution of the structural topology γ is related to the exponent of the associated functional network as α=(2-γ)(-1) for γ<2. © 2011 American Physical Society
Serang, Oliver
2014-01-01
Exact Bayesian inference can sometimes be performed efficiently for special cases where a function has commutative and associative symmetry of its inputs (called "causal independence"). For this reason, it is desirable to exploit such symmetry on big data sets. Here we present a method to exploit a general form of this symmetry on probabilistic adder nodes by transforming those probabilistic adder nodes into a probabilistic convolution tree with which dynamic programming computes exact probabilities. A substantial speedup is demonstrated using an illustration example that can arise when identifying splice forms with bottom-up mass spectrometry-based proteomics. On this example, even state-of-the-art exact inference algorithms require a runtime more than exponential in the number of splice forms considered. By using the probabilistic convolution tree, we reduce the runtime to O(k log(k)2) and the space to O(k log(k)) where k is the number of variables joined by an additive or cardinal operator. This approach, which can also be used with junction tree inference, is applicable to graphs with arbitrary dependency on counting variables or cardinalities and can be used on diverse problems and fields like forward error correcting codes, elemental decomposition, and spectral demixing. The approach also trivially generalizes to multiple dimensions.
Serang, Oliver
2014-01-01
Exact Bayesian inference can sometimes be performed efficiently for special cases where a function has commutative and associative symmetry of its inputs (called “causal independence”). For this reason, it is desirable to exploit such symmetry on big data sets. Here we present a method to exploit a general form of this symmetry on probabilistic adder nodes by transforming those probabilistic adder nodes into a probabilistic convolution tree with which dynamic programming computes exact probabilities. A substantial speedup is demonstrated using an illustration example that can arise when identifying splice forms with bottom-up mass spectrometry-based proteomics. On this example, even state-of-the-art exact inference algorithms require a runtime more than exponential in the number of splice forms considered. By using the probabilistic convolution tree, we reduce the runtime to and the space to where is the number of variables joined by an additive or cardinal operator. This approach, which can also be used with junction tree inference, is applicable to graphs with arbitrary dependency on counting variables or cardinalities and can be used on diverse problems and fields like forward error correcting codes, elemental decomposition, and spectral demixing. The approach also trivially generalizes to multiple dimensions. PMID:24626234
Exact solutions to the time-fractional differential equations via local fractional derivatives
NASA Astrophysics Data System (ADS)
Guner, Ozkan; Bekir, Ahmet
2018-01-01
This article utilizes the local fractional derivative and the exp-function method to construct the exact solutions of nonlinear time-fractional differential equations (FDEs). For illustrating the validity of the method, it is applied to the time-fractional Camassa-Holm equation and the time-fractional-generalized fifth-order KdV equation. Moreover, the exact solutions are obtained for the equations which are formed by different parameter values related to the time-fractional-generalized fifth-order KdV equation. This method is an reliable and efficient mathematical tool for solving FDEs and it can be applied to other non-linear FDEs.
Kümmel, Stephan; Perdew, John P
2003-01-31
For exchange-correlation functionals that depend explicitly on the Kohn-Sham orbitals, the potential V(xcsigma)(r) must be obtained as the solution of the optimized effective potential (OEP) integral equation. This is very demanding and has limited the use of orbital functionals. We demonstrate that instead the OEP can be obtained iteratively by solving the partial differential equations for the orbital shifts that exactify the Krieger-Li-Iafrate approximation. Unoccupied orbitals do not need to be calculated. Accuracy and efficiency of the method are shown for atoms and clusters using the exact-exchange energy. Counterintuitive asymptotic limits of the exact OEP are presented.
Alić, Nikola; Papen, George; Saperstein, Robert; Milstein, Laurence; Fainman, Yeshaiahu
2005-06-13
Exact signal statistics for fiber-optic links containing a single optical pre-amplifier are calculated and applied to sequence estimation for electronic dispersion compensation. The performance is evaluated and compared with results based on the approximate chi-square statistics. We show that detection in existing systems based on exact statistics can be improved relative to using a chi-square distribution for realistic filter shapes. In contrast, for high-spectral efficiency systems the difference between the two approaches diminishes, and performance tends to be less dependent on the exact shape of the filter used.
NASA Astrophysics Data System (ADS)
Sulc, Miroslav; Hernandez, Henar; Martinez, Todd J.; Vanicek, Jiri
2014-03-01
We recently showed that the Dephasing Representation (DR) provides an efficient tool for computing ultrafast electronic spectra and that cellularization yields further acceleration [M. Šulc and J. Vaníček, Mol. Phys. 110, 945 (2012)]. Here we focus on increasing its accuracy by first implementing an exact Gaussian basis method (GBM) combining the accuracy of quantum dynamics and efficiency of classical dynamics. The DR is then derived together with ten other methods for computing time-resolved spectra with intermediate accuracy and efficiency. These include the Gaussian DR (GDR), an exact generalization of the DR, in which trajectories are replaced by communicating frozen Gaussians evolving classically with an average Hamiltonian. The methods are tested numerically on time correlation functions and time-resolved stimulated emission spectra in the harmonic potential, pyrazine S0 /S1 model, and quartic oscillator. Both the GBM and the GDR are shown to increase the accuracy of the DR. Surprisingly, in chaotic systems the GDR can outperform the presumably more accurate GBM, in which the two bases evolve separately. This research was supported by the Swiss NSF Grant No. 200021_124936/1 and NCCR Molecular Ultrafast Science & Technology (MUST), and by the EPFL.
Substrate Specificity and Possible Heterologous Targets of Phytaspase, a Plant Cell Death Protease*
Galiullina, Raisa A.; Kasperkiewicz, Paulina; Chichkova, Nina V.; Szalek, Aleksandra; Serebryakova, Marina V.; Poreba, Marcin; Drag, Marcin; Vartapetian, Andrey B.
2015-01-01
Plants lack aspartate-specific cell death proteases homologous to animal caspases. Instead, a subtilisin-like serine-dependent plant protease named phytaspase shown to be involved in the accomplishment of programmed death of plant cells is able to hydrolyze a number of peptide-based caspase substrates. Here, we determined the substrate specificity of rice (Oryza sativa) phytaspase by using the positional scanning substrate combinatorial library approach. Phytaspase was shown to display an absolute specificity of hydrolysis after an aspartic acid residue. The preceding amino acid residues, however, significantly influence the efficiency of hydrolysis. Efficient phytaspase substrates demonstrated a remarkable preference for an aromatic amino acid residue in the P3 position. The deduced optimum phytaspase recognition motif has the sequence IWLD and is strikingly hydrophobic. The established pattern was confirmed through synthesis and kinetic analysis of cleavage of a set of optimized peptide substrates. An amino acid motif similar to the phytaspase cleavage site is shared by the human gastrointestinal peptide hormones gastrin and cholecystokinin. In agreement with the established enzyme specificity, phytaspase was shown to hydrolyze gastrin-1 and cholecystokinin at the predicted sites in vitro, thus destroying the active moieties of the hormones. PMID:26283788
Efficient Exact Inference With Loss Augmented Objective in Structured Learning.
Bauer, Alexander; Nakajima, Shinichi; Muller, Klaus-Robert
2016-08-19
Structural support vector machine (SVM) is an elegant approach for building complex and accurate models with structured outputs. However, its applicability relies on the availability of efficient inference algorithms--the state-of-the-art training algorithms repeatedly perform inference to compute a subgradient or to find the most violating configuration. In this paper, we propose an exact inference algorithm for maximizing nondecomposable objectives due to special type of a high-order potential having a decomposable internal structure. As an important application, our method covers the loss augmented inference, which enables the slack and margin scaling formulations of structural SVM with a variety of dissimilarity measures, e.g., Hamming loss, precision and recall, Fβ-loss, intersection over union, and many other functions that can be efficiently computed from the contingency table. We demonstrate the advantages of our approach in natural language parsing and sequence segmentation applications.
Romer, Katherine A.; Kayombya, Guy-Richard; Fraenkel, Ernest
2007-01-01
WebMOTIFS provides a web interface that facilitates the discovery and analysis of DNA-sequence motifs. Several studies have shown that the accuracy of motif discovery can be significantly improved by using multiple de novo motif discovery programs and using randomized control calculations to identify the most significant motifs or by using Bayesian approaches. WebMOTIFS makes it easy to apply these strategies. Using a single submission form, users can run several motif discovery programs and score, cluster and visualize the results. In addition, the Bayesian motif discovery program THEME can be used to determine the class of transcription factors that is most likely to regulate a set of sequences. Input can be provided as a list of gene or probe identifiers. Used with the default settings, WebMOTIFS accurately identifies biologically relevant motifs from diverse data in several species. WebMOTIFS is freely available at http://fraenkel.mit.edu/webmotifs. PMID:17584794
Kishine, Masahiro; Tsutsumi, Katsuji; Kitta, Kazumi
2017-12-01
Simple sequence repeat (SSR) is a popular tool for individual fingerprinting. The long-core motif (e.g. tetra-, penta-, and hexa-nucleotide) simple sequence repeats (SSRs) are preferred because they make it easier to separate and distinguish neighbor alleles. In the present study, a new set of 8 tetra-nucleotide SSRs in potato ( Solanum tuberosum ) is reported. By using these 8 markers, 72 out of 76 cultivars obtained from Japan and the United States were clearly discriminated, while two pairs, both of which arose from natural variation, showed identical profiles. The combined probability of identity between two random cultivars for the set of 8 SSR markers was estimated to be 1.10 × 10 -8 , confirming the usefulness of the proposed SSR markers for fingerprinting analyses of potato.
van Anken, Eelco; Pena, Florentina; Hafkemeijer, Nicole; Christis, Chantal; Romijn, Edwin P.; Grauschopf, Ulla; Oorschot, Viola M. J.; Pertel, Thomas; Engels, Sander; Ora, Ari; Lástun, Viorica; Glockshuber, Rudi; Klumperman, Judith; Heck, Albert J. R.; Luban, Jeremy; Braakman, Ineke
2009-01-01
Plasma cells daily secrete their own mass in antibodies, which fold and assemble in the endoplasmic reticulum (ER). To reach these levels, cells require pERp1, a novel lymphocyte-specific small ER-resident protein, which attains expression levels as high as BiP when B cells differentiate into plasma cells. Although pERp1 has no homology with known ER proteins, it does contain a CXXC motif typical for oxidoreductases. In steady state, the CXXC cysteines are locked by two parallel disulfide bonds with a downstream C(X)6C motif, and pERp1 displays only modest oxidoreductase activity. pERp1 emerged as a dedicated folding factor for IgM, associating with both heavy and light chains and promoting assembly and secretion of mature IgM. PMID:19805154
Morrison, Abigail; Straube, Sirko; Plesser, Hans Ekkehard; Diesmann, Markus
2007-01-01
Very large networks of spiking neurons can be simulated efficiently in parallel under the constraint that spike times are bound to an equidistant time grid. Within this scheme, the subthreshold dynamics of a wide class of integrate-and-fire-type neuron models can be integrated exactly from one grid point to the next. However, the loss in accuracy caused by restricting spike times to the grid can have undesirable consequences, which has led to interest in interpolating spike times between the grid points to retrieve an adequate representation of network dynamics. We demonstrate that the exact integration scheme can be combined naturally with off-grid spike events found by interpolation. We show that by exploiting the existence of a minimal synaptic propagation delay, the need for a central event queue is removed, so that the precision of event-driven simulation on the level of single neurons is combined with the efficiency of time-driven global scheduling. Further, for neuron models with linear subthreshold dynamics, even local event queuing can be avoided, resulting in much greater efficiency on the single-neuron level. These ideas are exemplified by two implementations of a widely used neuron model. We present a measure for the efficiency of network simulations in terms of their integration error and show that for a wide range of input spike rates, the novel techniques we present are both more accurate and faster than standard techniques.
Green's function enriched Poisson solver for electrostatics in many-particle systems
NASA Astrophysics Data System (ADS)
Sutmann, Godehard
2016-06-01
A highly accurate method is presented for the construction of the charge density for the solution of the Poisson equation in particle simulations. The method is based on an operator adjusted source term which can be shown to produce exact results up to numerical precision in the case of a large support of the charge distribution, therefore compensating the discretization error of finite difference schemes. This is achieved by balancing an exact representation of the known Green's function of regularized electrostatic problem with a discretized representation of the Laplace operator. It is shown that the exact calculation of the potential is possible independent of the order of the finite difference scheme but the computational efficiency for higher order methods is found to be superior due to a faster convergence to the exact result as a function of the charge support.
2012-01-01
Background Discovery of functionally significant short, statistically overrepresented subsequence patterns (motifs) in a set of sequences is a challenging problem in bioinformatics. Oftentimes, not all sequences in the set contain a motif. These non-motif-containing sequences complicate the algorithmic discovery of motifs. Filtering the non-motif-containing sequences from the larger set of sequences while simultaneously determining the identity of the motif is, therefore, desirable and a non-trivial problem in motif discovery research. Results We describe MotifCatcher, a framework that extends the sensitivity of existing motif-finding tools by employing random sampling to effectively remove non-motif-containing sequences from the motif search. We developed two implementations of our algorithm; each built around a commonly used motif-finding tool, and applied our algorithm to three diverse chromatin immunoprecipitation (ChIP) data sets. In each case, the motif finder with the MotifCatcher extension demonstrated improved sensitivity over the motif finder alone. Our approach organizes candidate functionally significant discovered motifs into a tree, which allowed us to make additional insights. In all cases, we were able to support our findings with experimental work from the literature. Conclusions Our framework demonstrates that additional processing at the sequence entry level can significantly improve the performance of existing motif-finding tools. For each biological data set tested, we were able to propose novel biological hypotheses supported by experimental work from the literature. Specifically, in Escherichia coli, we suggested binding site motifs for 6 non-traditional LexA protein binding sites; in Saccharomyces cerevisiae, we hypothesize 2 disparate mechanisms for novel binding sites of the Cse4p protein; and in Halobacterium sp. NRC-1, we discoverd subtle differences in a general transcription factor (GTF) binding site motif across several data sets. We suggest that small differences in our discovered motif could confer specificity for one or more homologous GTF proteins. We offer a free implementation of the MotifCatcher software package at http://www.bme.ucdavis.edu/facciotti/resources_data/software/. PMID:23181585
Trabalza, Antonio; Eleftheriadou, Ioanna; Sgourou, Argyro; Liao, Ting-Yi; Patsali, Petros; Lee, Heyne
2014-01-01
ABSTRACT To investigate the potential benefits which may arise from pseudotyping the HIV-1 lentiviral vector with its homologous gp41 envelope glycoprotein (GP) cytoplasmic tail (CT), we created chimeric RVG/HIV-1gp41 GPs composed of the extracellular and transmembrane sequences of RVG and either the full-length gp41 CT or C terminus gp41 truncations sequentially removing existing conserved motifs. Lentiviruses (LVs) pseudotyped with the chimeric GPs were evaluated in terms of particle release (physical titer), biological titers, infectivity, and in vivo central nervous system (CNS) transduction. We report here that LVs carrying shorter CTs expressed higher levels of envelope GP and showed a higher average infectivity than those bearing full-length GPs. Interestingly, complete removal of GP CT led to vectors with the highest transduction efficiency. Removal of all C-terminal gp41 CT conserved motifs, leaving just 17 amino acids (aa), appeared to preserve infectivity and resulted in a significantly increased physical titer. Furthermore, incorporation of these 17 aa in the RVG CT notably enhanced the physical titer. In vivo stereotaxic delivery of LV vectors exhibiting the best in vitro titers into rodent striatum facilitated efficient transduction of the CNS at the site of injection. A particular observation was the improved retrograde transduction of neurons in connected distal sites that resulted from the chimeric envelope R5 which included the “Kennedy” sequence (Ken) and lentivirus lytic peptide 2 (LLP2) conserved motifs in the CT, and although it did not exhibit a comparable high titer upon pseudotyping, it led to a significant increase in distal retrograde transduction of neurons. IMPORTANCE In this study, we have produced novel chimeric envelopes bearing the extracellular domain of rabies fused to the cytoplasmic tail (CT) of gp41 and pseudotyped lentiviral vectors with them. Here we report novel effects on the transduction efficiency and physical titer of these vectors, depending on CT length and context. We also managed to achieve increased neuronal transduction in vivo in the rodent CNS, thus demonstrating that the efficiency of these vectors can be enhanced following merely CT manipulation. We believe that this paper is a novel contribution to the field and opens the way for further attempts to surface engineer lentiviral vectors and make them more amenable for applications in human disease. PMID:24371049
Bergmann, Tobias; Moore, Carrie; Sidney, John; Miller, Donald; Tallmadge, Rebecca; Harman, Rebecca M; Oseroff, Carla; Wriston, Amanda; Shabanowitz, Jeffrey; Hunt, Donald F; Osterrieder, Nikolaus; Peters, Bjoern; Antczak, Douglas F; Sette, Alessandro
2015-11-01
Here we describe a detailed quantitative peptide-binding motif for the common equine leukocyte antigen (ELA) class I allele Eqca-1*00101, present in roughly 25 % of Thoroughbred horses. We determined a preliminary binding motif by sequencing endogenously bound ligands. Subsequently, a positional scanning combinatorial library (PSCL) was used to further characterize binding specificity and derive a quantitative motif involving aspartic acid in position 2 and hydrophobic residues at the C-terminus. Using this motif, we selected and tested 9- and 10-mer peptides derived from the equine herpesvirus type 1 (EHV-1) proteome for their capacity to bind Eqca-1*00101. PSCL predictions were very efficient, with an receiver operating characteristic (ROC) curve performance of 0.877, and 87 peptides derived from 40 different EHV-1 proteins were identified with affinities of 500 nM or higher. Quantitative analysis revealed that Eqca-1*00101 has a narrow peptide-binding repertoire, in comparison to those of most human, non-human primate, and mouse class I alleles. Peripheral blood mononuclear cells from six EHV-1-infected, or vaccinated but uninfected, Eqca-1*00101-positive horses were used in IFN-γ enzyme-linked immunospot (ELISPOT) assays. When we screened the 87 Eqca-1*00101-binding peptides for T cell reactivity, only one Eqca-1*00101 epitope, derived from the intermediate-early protein ICP4, was identified. Thus, despite its common occurrence in several horse breeds, Eqca-1*00101 is associated with a narrow binding repertoire and a similarly narrow T cell response to an important equine viral pathogen. Intriguingly, these features are shared with other human and macaque major histocompatibility complex (MHC) molecules with a similar specificity for D in position 2 or 3 in their main anchor motif.
Bergmann, Tobias; Moore, Carrie; Sidney, John; Miller, Donald; Tallmadge, Rebecca; Harman, Rebecca M.; Oseroff, Carla; Wriston, Amanda; Shabanowitz, Jeffrey; Hunt, Donald F.; Osterrieder, Nikolaus; Peters, Bjoern; Antczak, Douglas F.; Sette, Alessandro
2016-01-01
Here we describe a detailed quantitative peptide-binding motif for the common equine leukocyte antigen (ELA) class I allele Eqca-1*00101, present in roughly 25 % of Thoroughbred horses. We determined a preliminary binding motif by sequencing endogenously bound ligands. Subsequently, a positional scanning combinatorial library (PSCL) was used to further characterize binding specificity and derive a quantitative motif involving aspartic acid in position 2 and hydrophobic residues at the C-terminus. Using this motif, we selected and tested 9- and 10-mer peptides derived from the equine herpesvirus type 1 (EHV-1) proteome for their capacity to bind Eqca-1*00101. PSCL predictions were very efficient, with an receiver operating characteristic (ROC) curve performance of 0.877, and 87 peptides derived from 40 different EHV-1 proteins were identified with affinities of 500 nM or higher. Quantitative analysis revealed that Eqca-1*00101 has a narrow peptide-binding repertoire, in comparison to those of most human, non-human primate, and mouse class I alleles. Peripheral blood mononuclear cells from six EHV-1-infected, or vaccinated but uninfected, Eqca-1*00101-positive horses were used in IFN-γ enzyme-linked immunospot (ELISPOT) assays. When we screened the 87 Eqca-1*00101-binding peptides for T cell reactivity, only one Eqca-1*00101 epitope, derived from the intermediate-early protein ICP4, was identified. Thus, despite its common occurrence in several horse breeds, Eqca-1*00101 is associated with a narrow binding repertoire and a similarly narrow T cell response to an important equine viral pathogen. Intriguingly, these features are shared with other human and macaque major histocompatibility complex (MHC) molecules with a similar specificity for D in position 2 or 3 in their main anchor motif. PMID:26399241
SiteBinder: an improved approach for comparing multiple protein structural motifs.
Sehnal, David; Vařeková, Radka Svobodová; Huber, Heinrich J; Geidl, Stanislav; Ionescu, Crina-Maria; Wimmerová, Michaela; Koča, Jaroslav
2012-02-27
There is a paramount need to develop new techniques and tools that will extract as much information as possible from the ever growing repository of protein 3D structures. We report here on the development of a software tool for the multiple superimposition of large sets of protein structural motifs. Our superimposition methodology performs a systematic search for the atom pairing that provides the best fit. During this search, the RMSD values for all chemically relevant pairings are calculated by quaternion algebra. The number of evaluated pairings is markedly decreased by using PDB annotations for atoms. This approach guarantees that the best fit will be found and can be applied even when sequence similarity is low or does not exist at all. We have implemented this methodology in the Web application SiteBinder, which is able to process up to thousands of protein structural motifs in a very short time, and which provides an intuitive and user-friendly interface. Our benchmarking analysis has shown the robustness, efficiency, and versatility of our methodology and its implementation by the successful superimposition of 1000 experimentally determined structures for each of 32 eukaryotic linear motifs. We also demonstrate the applicability of SiteBinder using three case studies. We first compared the structures of 61 PA-IIL sugar binding sites containing nine different sugars, and we found that the sugar binding sites of PA-IIL and its mutants have a conserved structure despite their binding different sugars. We then superimposed over 300 zinc finger central motifs and revealed that the molecular structure in the vicinity of the Zn atom is highly conserved. Finally, we superimposed 12 BH3 domains from pro-apoptotic proteins. Our findings come to support the hypothesis that there is a structural basis for the functional segregation of BH3-only proteins into activators and enablers.
DMINDA: an integrated web server for DNA motif identification and analyses
Ma, Qin; Zhang, Hanyuan; Mao, Xizeng; Zhou, Chuan; Liu, Bingqiang; Chen, Xin; Xu, Ying
2014-01-01
DMINDA (DNA motif identification and analyses) is an integrated web server for DNA motif identification and analyses, which is accessible at http://csbl.bmb.uga.edu/DMINDA/. This web site is freely available to all users and there is no login requirement. This server provides a suite of cis-regulatory motif analysis functions on DNA sequences, which are important to elucidation of the mechanisms of transcriptional regulation: (i) de novo motif finding for a given set of promoter sequences along with statistical scores for the predicted motifs derived based on information extracted from a control set, (ii) scanning motif instances of a query motif in provided genomic sequences, (iii) motif comparison and clustering of identified motifs, and (iv) co-occurrence analyses of query motifs in given promoter sequences. The server is powered by a backend computer cluster with over 150 computing nodes, and is particularly useful for motif prediction and analyses in prokaryotic genomes. We believe that DMINDA, as a new and comprehensive web server for cis-regulatory motif finding and analyses, will benefit the genomic research community in general and prokaryotic genome researchers in particular. PMID:24753419
Motivated Proteins: A web application for studying small three-dimensional protein motifs
Leader, David P; Milner-White, E James
2009-01-01
Background Small loop-shaped motifs are common constituents of the three-dimensional structure of proteins. Typically they comprise between three and seven amino acid residues, and are defined by a combination of dihedral angles and hydrogen bonding partners. The most abundant of these are αβ-motifs, asx-motifs, asx-turns, β-bulges, β-bulge loops, β-turns, nests, niches, Schellmann loops, ST-motifs, ST-staples and ST-turns. We have constructed a database of such motifs from a range of high-quality protein structures and built a web application as a visual interface to this. Description The web application, Motivated Proteins, provides access to these 12 motifs (with 48 sub-categories) in a database of over 400 representative proteins. Queries can be made for specific categories or sub-categories of motif, motifs in the vicinity of ligands, motifs which include part of an enzyme active site, overlapping motifs, or motifs which include a particular amino acid sequence. Individual proteins can be specified, or, where appropriate, motifs for all proteins listed. The results of queries are presented in textual form as an (X)HTML table, and may be saved as parsable plain text or XML. Motifs can be viewed and manipulated either individually or in the context of the protein in the Jmol applet structural viewer. Cartoons of the motifs imposed on a linear representation of protein secondary structure are also provided. Summary information for the motifs is available, as are histograms of amino acid distribution, and graphs of dihedral angles at individual positions in the motifs. Conclusion Motivated Proteins is a publicly and freely accessible web application that enables protein scientists to study small three-dimensional motifs without requiring knowledge of either Structured Query Language or the underlying database schema. PMID:19210785
Díez-Villaseñor, César; Guzmán, Noemí M.; Almendros, Cristóbal; García-Martínez, Jesús; Mojica, Francisco J.M.
2013-01-01
Prokaryotes immunize themselves against transmissible genetic elements by the integration (acquisition) in clustered regularly interspaced short palindromic repeats (CRISPR) loci of spacers homologous to invader nucleic acids, defined as protospacers. Following acquisition, mono-spacer CRISPR RNAs (termed crRNAs) guide CRISPR-associated (Cas) proteins to degrade (interference) protospacers flanked by an adjacent motif in extrachomosomal DNA. During acquisition, selection of spacer-precursors adjoining the protospacer motif and proper orientation of the integrated fragment with respect to the leader (sequence leading transcription of the flanking CRISPR array) grant efficient interference by at least some CRISPR-Cas systems. This adaptive stage of the CRISPR action is poorly characterized, mainly due to the lack of appropriate genetic strategies to address its study and, at least in Escherichia coli, the need of Cas overproduction for insertion detection. In this work, we describe the development and application in Escherichia coli strains of an interference-independent assay based on engineered selectable CRISPR-spacer integration reporter plasmids. By using this tool without the constraint of interference or cas overexpression, we confirmed fundamental aspects of this process such as the critical requirement of Cas1 and Cas2 and the identity of the CTT protospacer motif for the E. coli K12 system. In addition, we defined the CWT motif for a non-K12 CRISPR-Cas variant, and obtained data supporting the implication of the leader in spacer orientation, the preferred acquisition from plasmids harboring cas genes and the occurrence of a sequential cleavage at the insertion site by a ruler mechanism. PMID:23445770
Díez-Villaseñor, César; Guzmán, Noemí M; Almendros, Cristóbal; García-Martínez, Jesús; Mojica, Francisco J M
2013-05-01
Prokaryotes immunize themselves against transmissible genetic elements by the integration (acquisition) in clustered regularly interspaced short palindromic repeats (CRISPR) loci of spacers homologous to invader nucleic acids, defined as protospacers. Following acquisition, mono-spacer CRISPR RNAs (termed crRNAs) guide CRISPR-associated (Cas) proteins to degrade (interference) protospacers flanked by an adjacent motif in extrachomosomal DNA. During acquisition, selection of spacer-precursors adjoining the protospacer motif and proper orientation of the integrated fragment with respect to the leader (sequence leading transcription of the flanking CRISPR array) grant efficient interference by at least some CRISPR-Cas systems. This adaptive stage of the CRISPR action is poorly characterized, mainly due to the lack of appropriate genetic strategies to address its study and, at least in Escherichia coli, the need of Cas overproduction for insertion detection. In this work, we describe the development and application in Escherichia coli strains of an interference-independent assay based on engineered selectable CRISPR-spacer integration reporter plasmids. By using this tool without the constraint of interference or cas overexpression, we confirmed fundamental aspects of this process such as the critical requirement of Cas1 and Cas2 and the identity of the CTT protospacer motif for the E. coli K12 system. In addition, we defined the CWT motif for a non-K12 CRISPR-Cas variant, and obtained data supporting the implication of the leader in spacer orientation, the preferred acquisition from plasmids harboring cas genes and the occurrence of a sequential cleavage at the insertion site by a ruler mechanism.
Motif-based analysis of large nucleotide data sets using MEME-ChIP
Ma, Wenxiu; Noble, William S; Bailey, Timothy L
2014-01-01
MEME-ChIP is a web-based tool for analyzing motifs in large DNA or RNA data sets. It can analyze peak regions identified by ChIP-seq, cross-linking sites identified by cLIP-seq and related assays, as well as sets of genomic regions selected using other criteria. MEME-ChIP performs de novo motif discovery, motif enrichment analysis, motif location analysis and motif clustering, providing a comprehensive picture of the DNA or RNA motifs that are enriched in the input sequences. MEME-ChIP performs two complementary types of de novo motif discovery: weight matrix–based discovery for high accuracy; and word-based discovery for high sensitivity. Motif enrichment analysis using DNA or RNA motifs from human, mouse, worm, fly and other model organisms provides even greater sensitivity. MEME-ChIP’s interactive HTML output groups and aligns significant motifs to ease interpretation. this protocol takes less than 3 h, and it provides motif discovery approaches that are distinct and complementary to other online methods. PMID:24853928
NASA Astrophysics Data System (ADS)
Zeng, Lang; He, Yu; Povolotskyi, Michael; Liu, XiaoYan; Klimeck, Gerhard; Kubis, Tillmann
2013-06-01
In this work, the low rank approximation concept is extended to the non-equilibrium Green's function (NEGF) method to achieve a very efficient approximated algorithm for coherent and incoherent electron transport. This new method is applied to inelastic transport in various semiconductor nanodevices. Detailed benchmarks with exact NEGF solutions show (1) a very good agreement between approximated and exact NEGF results, (2) a significant reduction of the required memory, and (3) a large reduction of the computational time (a factor of speed up as high as 150 times is observed). A non-recursive solution of the inelastic NEGF transport equations of a 1000 nm long resistor on standard hardware illustrates nicely the capability of this new method.
Safari, Roghaiyeh; Salimi, Reza; Tunca, Zeliha; Ozerdem, Aysegul; Ceylan, Deniz; Sakizli, Meral
2016-06-01
Calcium signaling is important for synaptic plasticity, generation of brain rhythms, regulating neuronal excitability, data processing and cognition. Impairment in calcium homeostasis contributed to the development of psychiatric disorders such as bipolar disorder (BP). MCU is the most important calcium transporter in mitochondria inner membrane responsible for influx of Ca[Formula: see text]. MICU1 is linked with MCU and has two canonical EF hands that are vital for its activity and regulates MCU-mediated Ca[Formula: see text] influx. In the current study, we aimed to investigate the role of genetic alteration of EF hand calcium binding motifs of MICU1 on the development of BP. We examined patients with BP, first degree relatives of these patients and healthy volunteers for mutations and polymorphisms in EF hand calcium binding motifs of MICU1. The result showed no SNP/mutation in BP patients, in healthy subjects and in first degree relatives. Additionally, alignment of the EF hand calcium binding regions among species (Gallus-gallus, Canis-lupus-familiaris, Bos-taurus, Mus-musculus, Rattus-norvegicus, Pan-troglodytes, Homosapiens and Danio-rerio) showed exactly the same amino acids (DLNGDGEVDMEE and DCDGNGELSNKE) except in one of the calcium binding domain of Danio-rerio that there was only one difference; leucine instead of Methionine. Our results showed that the SNP on EF-hand Ca[Formula: see text] binding domains of MICU1 gene had no effect in phenotypic characters of BP patients.
van Lith, Marcel; Hartigan, Nichola; Hatch, Jennifer; Benham, Adam M
2005-01-14
Protein disulfide isomerase (PDI) is the archetypal enzyme involved in the formation and reshuffling of disulfide bonds in the endoplasmic reticulum (ER). PDI achieves its redox function through two highly conserved thioredoxin domains, and PDI can also operate as an ER chaperone. The substrate specificities and the exact functions of most other PDI family proteins remain important unsolved questions in biology. Here, we characterize a new and striking member of the PDI family, which we have named protein disulfide isomerase-like protein of the testis (PDILT). PDILT is the first eukaryotic SXXC protein to be characterized in the ER. Our experiments have unveiled a novel, glycosylated PDI-like protein whose tissue-specific expression and unusual motifs have implications for the evolution, catalytic function, and substrate selection of thioredoxin family proteins. We show that PDILT is an ER resident glycoprotein that liaises with partner proteins in disulfide-dependent complexes within the testis. PDILT interacts with the oxidoreductase Ero1alpha, demonstrating that the N-terminal cysteine of the CXXC sequence is not required for binding of PDI family proteins to ER oxidoreductases. The expression of PDILT, in addition to PDI in the testis, suggests that PDILT performs a specialized chaperone function in testicular cells. PDILT is an unusual PDI relative that highlights the adaptability of chaperone and redox function in enzymes of the endoplasmic reticulum.
Interactions of the SAP Domain of Human Ku70 with DNA Substrate: A Molecular Dynamics Study
NASA Technical Reports Server (NTRS)
Hu, Shaowen; Carra, Claudio; Huff, Janice; Pluth, Janice M.; Cucinotta, Francis A.
2007-01-01
NASA is developing a systems biology approach to improve the assessment of health risks associated with space radiation. The primary toxic and mutagenic lesion following radiation exposure is the DNA double strand break (DSB), thus a model incorporating proteins and pathways important in response and repair of this lesion is critical. One key protein heterodimer for systems models of radiation effects is the Ku70/80 complex. The Ku70/80 complex is important in the initial binding of DSB ends following DNA damage, and is a component of nonhomologous end joining repair, the primary pathway for DSB repair in mammalian cells. The SAP domain of Ku70 (residues 556-609), contains an a helix-extended strand-helix motif and similar motifs have been found in other nucleic acid-binding proteins critical for DNA repair. However, the exact mechanism of damage recognition and substrate specificity for the Ku heterodimer remains unclear in part due to the absence of a high-resolution structure of the SAP/DNA complex. We performed a series of molecular dynamics (MD) simulations on a system with the SAP domain of Ku70 and a 10 base pairs DNA duplex. Large-scale conformational changes were observed and some putative binding modes were suggested based on energetic analysis. These modes are consistent with previous experimental investigations. In addition, the results indicate that cooperation of SAP with other domains of Ku70/80 is necessary to explain the high affinity of binding as observed in experiments.
BayesMotif: de novo protein sorting motif discovery from impure datasets.
Hu, Jianjun; Zhang, Fan
2010-01-18
Protein sorting is the process that newly synthesized proteins are transported to their target locations within or outside of the cell. This process is precisely regulated by protein sorting signals in different forms. A major category of sorting signals are amino acid sub-sequences usually located at the N-terminals or C-terminals of protein sequences. Genome-wide experimental identification of protein sorting signals is extremely time-consuming and costly. Effective computational algorithms for de novo discovery of protein sorting signals is needed to improve the understanding of protein sorting mechanisms. We formulated the protein sorting motif discovery problem as a classification problem and proposed a Bayesian classifier based algorithm (BayesMotif) for de novo identification of a common type of protein sorting motifs in which a highly conserved anchor is present along with a less conserved motif regions. A false positive removal procedure is developed to iteratively remove sequences that are unlikely to contain true motifs so that the algorithm can identify motifs from impure input sequences. Experiments on both implanted motif datasets and real-world datasets showed that the enhanced BayesMotif algorithm can identify anchored sorting motifs from pure or impure protein sequence dataset. It also shows that the false positive removal procedure can help to identify true motifs even when there is only 20% of the input sequences containing true motif instances. We proposed BayesMotif, a novel Bayesian classification based algorithm for de novo discovery of a special category of anchored protein sorting motifs from impure datasets. Compared to conventional motif discovery algorithms such as MEME, our algorithm can find less-conserved motifs with short highly conserved anchors. Our algorithm also has the advantage of easy incorporation of additional meta-sequence features such as hydrophobicity or charge of the motifs which may help to overcome the limitations of PWM (position weight matrix) motif model.
Wang, Rui; He, Anyu; Ramu, Errabelli; Falck, John R
2015-02-14
An efficient and asymmetric synthetic approach towards one of the biologically interesting 4(S)-11-diHDHA derivatives was developed. This process mainly relied on two reactions, one is the copper-catalyzed mild cross-coupling that allows for the efficient construction of a chiral α-alkynyl α-hydroxy motif and another is the synthesis of chiral α-hydroxy α-stannanes that has previously been developed by our group featuring the asymmetric stannylation using the well-established tributyltin hydride/diethyl zinc system from an aldehyde.
DLocalMotif: a discriminative approach for discovering local motifs in protein sequences.
Mehdi, Ahmed M; Sehgal, Muhammad Shoaib B; Kobe, Bostjan; Bailey, Timothy L; Bodén, Mikael
2013-01-01
Local motifs are patterns of DNA or protein sequences that occur within a sequence interval relative to a biologically defined anchor or landmark. Current protein motif discovery methods do not adequately consider such constraints to identify biologically significant motifs that are only weakly over-represented but spatially confined. Using negatives, i.e. sequences known to not contain a local motif, can further increase the specificity of their discovery. This article introduces the method DLocalMotif that makes use of positional information and negative data for local motif discovery in protein sequences. DLocalMotif combines three scoring functions, measuring degrees of motif over-representation, entropy and spatial confinement, specifically designed to discriminatively exploit the availability of negative data. The method is shown to outperform current methods that use only a subset of these motif characteristics. We apply the method to several biological datasets. The analysis of peroxisomal targeting signals uncovers several novel motifs that occur immediately upstream of the dominant peroxisomal targeting signal-1 signal. The analysis of proline-tyrosine nuclear localization signals uncovers multiple novel motifs that overlap with C2H2 zinc finger domains. We also evaluate the method on classical nuclear localization signals and endoplasmic reticulum retention signals and find that DLocalMotif successfully recovers biologically relevant sequence properties. http://bioinf.scmb.uq.edu.au/dlocalmotif/
Pan, Xiaoyong; Shen, Hong-Bin
2017-02-28
RNAs play key roles in cells through the interactions with proteins known as the RNA-binding proteins (RBP) and their binding motifs enable crucial understanding of the post-transcriptional regulation of RNAs. How the RBPs correctly recognize the target RNAs and why they bind specific positions is still far from clear. Machine learning-based algorithms are widely acknowledged to be capable of speeding up this process. Although many automatic tools have been developed to predict the RNA-protein binding sites from the rapidly growing multi-resource data, e.g. sequence, structure, their domain specific features and formats have posed significant computational challenges. One of current difficulties is that the cross-source shared common knowledge is at a higher abstraction level beyond the observed data, resulting in a low efficiency of direct integration of observed data across domains. The other difficulty is how to interpret the prediction results. Existing approaches tend to terminate after outputting the potential discrete binding sites on the sequences, but how to assemble them into the meaningful binding motifs is a topic worth of further investigation. In viewing of these challenges, we propose a deep learning-based framework (iDeep) by using a novel hybrid convolutional neural network and deep belief network to predict the RBP interaction sites and motifs on RNAs. This new protocol is featured by transforming the original observed data into a high-level abstraction feature space using multiple layers of learning blocks, where the shared representations across different domains are integrated. To validate our iDeep method, we performed experiments on 31 large-scale CLIP-seq datasets, and our results show that by integrating multiple sources of data, the average AUC can be improved by 8% compared to the best single-source-based predictor; and through cross-domain knowledge integration at an abstraction level, it outperforms the state-of-the-art predictors by 6%. Besides the overall enhanced prediction performance, the convolutional neural network module embedded in iDeep is also able to automatically capture the interpretable binding motifs for RBPs. Large-scale experiments demonstrate that these mined binding motifs agree well with the experimentally verified results, suggesting iDeep is a promising approach in the real-world applications. The iDeep framework not only can achieve promising performance than the state-of-the-art predictors, but also easily capture interpretable binding motifs. iDeep is available at http://www.csbio.sjtu.edu.cn/bioinf/iDeep.
Unitary circular code motifs in genomes of eukaryotes.
El Soufi, Karim; Michel, Christian J
A set X of 20 trinucleotides was identified in genes of bacteria, eukaryotes, plasmids and viruses, which has in average the highest occurrence in reading frame compared to its two shifted frames (Michel, 2015; Arquès and Michel, 1996). This set X has an interesting mathematical property as X is a circular code (Arquès and Michel, 1996). Thus, the motifs from this circular code X, called X motifs, have the property to always retrieve, synchronize and maintain the reading frame in genes. The origin of this circular code X in genes is an open problem since its discovery in 1996. Here, we first show that the unitary circular codes (UCC), i.e. sets of one word, allow to generate unitary circular code motifs (UCC motifs), i.e. a concatenation of the same motif (simple repeats) leading to low complexity DNA. Three classes of UCC motifs are studied here: repeated dinucleotides (D + motifs), repeated trinucleotides (T + motifs) and repeated tetranucleotides (T + motifs). Thus, the D + , T + and T + motifs allow to retrieve, synchronize and maintain a frame modulo 2, modulo 3 and modulo 4, respectively, and their shifted frames (1 modulo 2; 1 and 2 modulo 3; 1, 2 and 3 modulo 4 according to the C 2 , C 3 and C 4 properties, respectively) in the DNA sequences. The statistical distribution of the D + , T + and T + motifs is analyzed in the genomes of eukaryotes. A UCC motif and its comp lementary UCC motif have the same distribution in the eukaryotic genomes. Furthermore, a UCC motif and its complementary UCC motif have increasing occurrences contrary to their number of hydrogen bonds, very significant with the T + motifs. The longest D + , T + and T + motifs in the studied eukaryotic genomes are also given. Surprisingly, a scarcity of repeated trinucleotides (T + motifs) in the large eukaryotic genomes is observed compared to the D + and T + motifs. This result has been investigated and may be explained by two outcomes. Repeated trinucleotides (T + motifs) are identified in the X motifs of low composition (cardinality less than 10) in the genomes of eukaryotes. Furthermore, identical trinucleotide pairs of the circular code X are preferentially used in the gene sequences of eukaryotes. These two results suggest that the unitary circular codes of trinucleotides may have been involved in the formation of the trinucleotide circular code X. Indeed, repeated trinucleotides in the X motifs in the genomes of eukaryotes may represent an intermediary evolution from repeated trinucleotides of cardinality 1 (T + motifs) in the genomes of eukaryotes up to the X motifs of cardinality 20 in the gene sequences of eukaryotes. Copyright © 2017 Elsevier B.V. All rights reserved.
Identifying the scale-dependent motifs in atmospheric surface layer by ordinal pattern analysis
NASA Astrophysics Data System (ADS)
Li, Qinglei; Fu, Zuntao
2018-07-01
Ramp-like structures in various atmospheric surface layer time series have been long studied, but the presence of motifs with the finer scale embedded within larger scale ramp-like structures has largely been overlooked in the reported literature. Here a novel, objective and well-adapted methodology, the ordinal pattern analysis, is adopted to study the finer-scaled motifs in atmospheric boundary-layer (ABL) time series. The studies show that the motifs represented by different ordinal patterns take clustering properties and 6 dominated motifs out of the whole 24 motifs account for about 45% of the time series under particular scales, which indicates the higher contribution of motifs with the finer scale to the series. Further studies indicate that motif statistics are similar for both stable conditions and unstable conditions at larger scales, but large discrepancies are found at smaller scales, and the frequencies of motifs "1234" and/or "4321" are a bit higher under stable conditions than unstable conditions. Under stable conditions, there are great changes for the occurrence frequencies of motifs "1234" and "4321", where the occurrence frequencies of motif "1234" decrease from nearly 24% to 4.5% with the scale factor increasing, and the occurrence frequencies of motif "4321" change nonlinearly with the scale increasing. These great differences of dominated motifs change with scale can be taken as an indicator to quantify the flow structure changes under different stability conditions, and motif entropy can be defined just by only 6 dominated motifs to quantify this time-scale independent property of the motifs. All these results suggest that the defined scale of motifs with the finer scale should be carefully taken into consideration in the interpretation of turbulence coherent structures.
DMINDA: an integrated web server for DNA motif identification and analyses.
Ma, Qin; Zhang, Hanyuan; Mao, Xizeng; Zhou, Chuan; Liu, Bingqiang; Chen, Xin; Xu, Ying
2014-07-01
DMINDA (DNA motif identification and analyses) is an integrated web server for DNA motif identification and analyses, which is accessible at http://csbl.bmb.uga.edu/DMINDA/. This web site is freely available to all users and there is no login requirement. This server provides a suite of cis-regulatory motif analysis functions on DNA sequences, which are important to elucidation of the mechanisms of transcriptional regulation: (i) de novo motif finding for a given set of promoter sequences along with statistical scores for the predicted motifs derived based on information extracted from a control set, (ii) scanning motif instances of a query motif in provided genomic sequences, (iii) motif comparison and clustering of identified motifs, and (iv) co-occurrence analyses of query motifs in given promoter sequences. The server is powered by a backend computer cluster with over 150 computing nodes, and is particularly useful for motif prediction and analyses in prokaryotic genomes. We believe that DMINDA, as a new and comprehensive web server for cis-regulatory motif finding and analyses, will benefit the genomic research community in general and prokaryotic genome researchers in particular. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Identity and functions of CxxC-derived motifs.
Fomenko, Dmitri E; Gladyshev, Vadim N
2003-09-30
Two cysteines separated by two other residues (the CxxC motif) are employed by many redox proteins for formation, isomerization, and reduction of disulfide bonds and for other redox functions. The place of the C-terminal cysteine in this motif may be occupied by serine (the CxxS motif), modifying the functional repertoire of redox proteins. Here we found that the CxxC motif may also give rise to a motif, in which the C-terminal cysteine is replaced with threonine (the CxxT motif). Moreover, in contrast to a view that the N-terminal cysteine in the CxxC motif always serves as a nucleophilic attacking group, this residue could also be replaced with threonine (the TxxC motif), serine (the SxxC motif), or other residues. In each of these CxxC-derived motifs, the presence of a downstream alpha-helix was strongly favored. A search for conserved CxxC-derived motif/helix patterns in four complete genomes representing bacteria, archaea, and eukaryotes identified known redox proteins and suggested possible redox functions for several additional proteins. Catalytic sites in peroxiredoxins were major representatives of the TxxC motif, whereas those in glutathione peroxidases represented the CxxT motif. Structural assessments indicated that threonines in these enzymes could stabilize catalytic thiolates, suggesting revisions to previously proposed catalytic triads. Each of the CxxC-derived motifs was also observed in natural selenium-containing proteins, in which selenocysteine was present in place of a catalytic cysteine.
NASA Astrophysics Data System (ADS)
Zhai, Peng-Wang; Hu, Yongxiang; Josset, Damien B.; Trepte, Charles R.; Lucker, Patricia L.; Lin, Bing
2012-06-01
We have developed a Vector Radiative Transfer (VRT) code for coupled atmosphere and ocean systems based on the successive order of scattering (SOS) method. In order to achieve efficiency and maintain accuracy, the scattering matrix is expanded in terms of the Wigner d functions and the delta fit or delta-M technique is used to truncate the commonly-present large forward scattering peak. To further improve the accuracy of the SOS code, we have implemented the analytical first order scattering treatment using the exact scattering matrix of the medium in the SOS code. The expansion and truncation techniques are kept for higher order scattering. The exact first order scattering correction was originally published by Nakajima and Takana.1 A new contribution of this work is to account for the exact secondary light scattering caused by the light reflected by and transmitted through the rough air-sea interface.
Well balancing of the SWE schemes for moving-water steady flows
NASA Astrophysics Data System (ADS)
Caleffi, Valerio; Valiani, Alessandro
2017-08-01
In this work, the exact reproduction of a moving-water steady flow via the numerical solution of the one-dimensional shallow water equations is studied. A new scheme based on a modified version of the HLLEM approximate Riemann solver (Dumbser and Balsara (2016) [18]) that exactly preserves the total head and the discharge in the simulation of smooth steady flows and that correctly dissipates mechanical energy in the presence of hydraulic jumps is presented. This model is compared with a selected set of schemes from the literature, including models that exactly preserve quiescent flows and models that exactly preserve moving-water steady flows. The comparison highlights the strengths and weaknesses of the different approaches. In particular, the results show that the increase in accuracy in the steady state reproduction is counterbalanced by a reduced robustness and numerical efficiency of the models. Some solutions to reduce these drawbacks, at the cost of increased algorithm complexity, are presented.
Miner, Daniel C; Triesch, Jochen
2014-01-01
The neuroanatomical connectivity of cortical circuits is believed to follow certain rules, the exact origins of which are still poorly understood. In particular, numerous nonrandom features, such as common neighbor clustering, overrepresentation of reciprocal connectivity, and overrepresentation of certain triadic graph motifs have been experimentally observed in cortical slice data. Some of these data, particularly regarding bidirectional connectivity are seemingly contradictory, and the reasons for this are unclear. Here we present a simple static geometric network model with distance-dependent connectivity on a realistic scale that naturally gives rise to certain elements of these observed behaviors, and may provide plausible explanations for some of the conflicting findings. Specifically, investigation of the model shows that experimentally measured nonrandom effects, especially bidirectional connectivity, may depend sensitively on experimental parameters such as slice thickness and sampling area, suggesting potential explanations for the seemingly conflicting experimental results.
Miner, Daniel C.; Triesch, Jochen
2014-01-01
The neuroanatomical connectivity of cortical circuits is believed to follow certain rules, the exact origins of which are still poorly understood. In particular, numerous nonrandom features, such as common neighbor clustering, overrepresentation of reciprocal connectivity, and overrepresentation of certain triadic graph motifs have been experimentally observed in cortical slice data. Some of these data, particularly regarding bidirectional connectivity are seemingly contradictory, and the reasons for this are unclear. Here we present a simple static geometric network model with distance-dependent connectivity on a realistic scale that naturally gives rise to certain elements of these observed behaviors, and may provide plausible explanations for some of the conflicting findings. Specifically, investigation of the model shows that experimentally measured nonrandom effects, especially bidirectional connectivity, may depend sensitively on experimental parameters such as slice thickness and sampling area, suggesting potential explanations for the seemingly conflicting experimental results. PMID:25414647
Quantum delocalization of protons in the hydrogen-bond network of an enzyme active site.
Wang, Lu; Fried, Stephen D; Boxer, Steven G; Markland, Thomas E
2014-12-30
Enzymes use protein architectures to create highly specialized structural motifs that can greatly enhance the rates of complex chemical transformations. Here, we use experiments, combined with ab initio simulations that exactly include nuclear quantum effects, to show that a triad of strongly hydrogen-bonded tyrosine residues within the active site of the enzyme ketosteroid isomerase (KSI) facilitates quantum proton delocalization. This delocalization dramatically stabilizes the deprotonation of an active-site tyrosine residue, resulting in a very large isotope effect on its acidity. When an intermediate analog is docked, it is incorporated into the hydrogen-bond network, giving rise to extended quantum proton delocalization in the active site. These results shed light on the role of nuclear quantum effects in the hydrogen-bond network that stabilizes the reactive intermediate of KSI, and the behavior of protons in biological systems containing strong hydrogen bonds.
Quantum delocalization of protons in the hydrogen-bond network of an enzyme active site
Wang, Lu; Fried, Stephen D.; Boxer, Steven G.; Markland, Thomas E.
2014-01-01
Enzymes use protein architectures to create highly specialized structural motifs that can greatly enhance the rates of complex chemical transformations. Here, we use experiments, combined with ab initio simulations that exactly include nuclear quantum effects, to show that a triad of strongly hydrogen-bonded tyrosine residues within the active site of the enzyme ketosteroid isomerase (KSI) facilitates quantum proton delocalization. This delocalization dramatically stabilizes the deprotonation of an active-site tyrosine residue, resulting in a very large isotope effect on its acidity. When an intermediate analog is docked, it is incorporated into the hydrogen-bond network, giving rise to extended quantum proton delocalization in the active site. These results shed light on the role of nuclear quantum effects in the hydrogen-bond network that stabilizes the reactive intermediate of KSI, and the behavior of protons in biological systems containing strong hydrogen bonds. PMID:25503367
RacGAP50C is sufficient to signal cleavage furrow formation during cytokinesis.
D'Avino, Pier Paolo; Savoian, Matthew S; Capalbo, Luisa; Glover, David M
2006-11-01
Several studies indicate that spindle microtubules determine the position of the cleavage plane at the end of cell division, but their exact role in triggering the formation and ingression of the cleavage furrow is still unclear. Here we show that in Drosophila depletion of either the GAP (GTPase-activating protein) or the kinesin-like subunit of the evolutionary conserved centralspindlin complex prevents furrowing without affecting the association of astral microtubules with the cell cortex. Moreover, time-lapse imaging indicates that astral microtubules serve to deliver the centralspindlin complex to the equatorial cortex just before furrow formation. However, when the GAP-signaling component was mislocalized around the entire cortex using a membrane-tethering motif, this caused ectopic furrowing even in the absence of its motor partner. Thus, the GAP component of centralspindlin is both necessary and sufficient for furrow formation and ingression and astral microtubules provide a route for its delivery to the cleavage site.
Robledo, Marta; Peregrina, Alexandra; Millán, Vicenta; García-Tomsig, Natalia I; Torres-Quesada, Omar; Mateos, Pedro F; Becker, Anke; Jiménez-Zurdo, José I
2017-07-01
Small non-coding RNAs (sRNAs) are expected to have pivotal roles in the adaptive responses underlying symbiosis of nitrogen-fixing rhizobia with legumes. Here, we provide primary insights into the function and activity mechanism of the Sinorhizobium meliloti trans-sRNA NfeR1 (Nodule Formation Efficiency RNA). Northern blot probing and transcription tracking with fluorescent promoter-reporter fusions unveiled high nfeR1 expression in response to salt stress and throughout the symbiotic interaction. The strength and differential regulation of nfeR1 transcription are conferred by a motif, which is conserved in nfeR1 promoter regions in α-proteobacteria. NfeR1 loss-of-function compromised osmoadaptation of free-living bacteria, whilst causing misregulation of salt-responsive genes related to stress adaptation, osmolytes catabolism and membrane trafficking. Nodulation tests revealed that lack of NfeR1 affected competitiveness, infectivity, nodule development and symbiotic efficiency of S. meliloti on alfalfa roots. Comparative computer predictions and a genetic reporter assay evidenced a redundant role of three identical NfeR1 unpaired anti Shine-Dalgarno motifs for targeting and downregulation of translation of multiple mRNAs from transporter genes. Our data provide genetic evidence of the hyperosmotic conditions of the endosymbiotic compartments. NfeR1-mediated gene regulation in response to this cue could contribute to coordinate nutrient uptake with the metabolic reprogramming concomitant to symbiotic transitions. © 2017 Society for Applied Microbiology and John Wiley & Sons Ltd.
NASA Astrophysics Data System (ADS)
Hoy, Robert S.; Harwayne-Gidansky, Jared; O'Hern, Corey S.
2012-05-01
We analyze the geometric structure and mechanical stability of a complete set of isostatic and hyperstatic sphere packings obtained via exact enumeration. The number of nonisomorphic isostatic packings grows exponentially with the number of spheres N, and their diversity of structure and symmetry increases with increasing N and decreases with increasing hyperstaticity H≡Nc-NISO, where Nc is the number of pair contacts and NISO=3N-6. Maximally contacting packings are in general neither the densest nor the most symmetric. Analyses of local structure show that the fraction f of nuclei with order compatible with the bulk (rhcp) crystal decreases sharply with increasing N due to a high propensity for stacking faults, five- and near-fivefold symmetric structures, and other motifs that preclude rhcp order. While f increases with increasing H, a significant fraction of hyperstatic nuclei for N as small as 11 retain non-rhcp structure. Classical theories of nucleation that consider only spherical nuclei, or only nuclei with the same ordering as the bulk crystal, cannot capture such effects. Our results provide an explanation for the failure of classical nucleation theory for hard-sphere systems of N≲10 particles; we argue that in this size regime, it is essential to consider nuclei of unconstrained geometry. Our results are also applicable to understanding kinetic arrest and jamming in systems that interact via hard-core-like repulsive and short-ranged attractive interactions.
Viral infection and human disease - insights from minimotifs
Kadaveru, Krishna; Vyas, Jay; Schiller, Martin R.
2008-01-01
Short functional peptide motifs cooperate in many molecular functions including protein interactions, protein trafficking, and posttranslational modifications. Viruses exploit these motifs as a principal mechanism for hijacking cells and many motifs are necessary for the viral life-cycle. A virus can accommodate many short motifs in its small genome size providing a plethora of ways for the virus to acquire host molecular machinery. Host enzymes that act on motifs such as kinases, proteases, and lipidation enzymes, as well as protein interaction domains, are commonly mutated in human disease, suggesting that the short peptide motif targets of these enzymes may also be mutated in disease; however, this is not observed. How can we explain why viruses have evolved to be so dependent on motifs, yet these motifs, in general do not seem to be as necessary for human viability? We propose that short motifs are used at the system level. This system architecture allows viruses to exploit a motif, whereas the viability of the host is not affected by mutation of a single motif. PMID:18508672
Modular and configurable optimal sequence alignment software: Cola.
Zamani, Neda; Sundström, Görel; Höppner, Marc P; Grabherr, Manfred G
2014-01-01
The fundamental challenge in optimally aligning homologous sequences is to define a scoring scheme that best reflects the underlying biological processes. Maximising the overall number of matches in the alignment does not always reflect the patterns by which nucleotides mutate. Efficiently implemented algorithms that can be parameterised to accommodate more complex non-linear scoring schemes are thus desirable. We present Cola, alignment software that implements different optimal alignment algorithms, also allowing for scoring contiguous matches of nucleotides in a nonlinear manner. The latter places more emphasis on short, highly conserved motifs, and less on the surrounding nucleotides, which can be more diverged. To illustrate the differences, we report results from aligning 14,100 sequences from 3' untranslated regions of human genes to 25 of their mammalian counterparts, where we found that a nonlinear scoring scheme is more consistent than a linear scheme in detecting short, conserved motifs. Cola is freely available under LPGL from https://github.com/nedaz/cola.
Mammalian Fe-S proteins: definition of a consensus motif recognized by the co-chaperone HSC20.
Maio, N; Rouault, T A
2016-10-01
Iron-sulfur (Fe-S) clusters are inorganic cofactors that are fundamental to several biological processes in all three kingdoms of life. In most organisms, Fe-S clusters are initially assembled on a scaffold protein, ISCU, and subsequently transferred to target proteins or to intermediate carriers by a dedicated chaperone/co-chaperone system. The delivery of assembled Fe-S clusters to recipient proteins is a crucial step in the biogenesis of Fe-S proteins, and, in mammals, it relies on the activity of a multiprotein transfer complex that contains the chaperone HSPA9, the co-chaperone HSC20 and the scaffold ISCU. How the transfer complex efficiently engages recipient Fe-S target proteins involves specific protein interactions that are not fully understood. This mini review focuses on recent insights into the molecular mechanism of amino acid motif recognition and discrimination by the co-chaperone HSC20, which guides Fe-S cluster delivery.
Kobayashi, Takehito; Yagi, Yusuke; Nakamura, Takahiro
2016-01-01
The pentatricopeptide repeat (PPR) motif is a sequence-specific RNA/DNA-binding module. Elucidation of the RNA/DNA recognition mechanism has enabled engineering of PPR motifs as new RNA/DNA manipulation tools in living cells, including for genome editing. However, the biochemical characteristics of PPR proteins remain unknown, mostly due to the instability and/or unfolding propensities of PPR proteins in heterologous expression systems such as bacteria and yeast. To overcome this issue, we constructed reporter systems using animal cultured cells. The cell-based system has highly attractive features for PPR engineering: robust eukaryotic gene expression; availability of various vectors, reagents, and antibodies; highly efficient DNA delivery ratio (>80 %); and rapid, high-throughput data production. In this chapter, we introduce an example of such reporter systems: a PPR-based sequence-specific translational activation system. The cell-based reporter system can be applied to characterize plant genes of interested and to PPR engineering.
Donor-σ-Acceptor Motifs: Thermally Activated Delayed Fluorescence Emitters with Dual Upconversion.
Geng, Yan; D'Aleo, Anthony; Inada, Ko; Cui, Lin-Song; Kim, Jong Uk; Nakanotani, Hajime; Adachi, Chihaya
2017-12-22
A family of organic emitters with a donor-σ-acceptor (D-σ-A) motif is presented. Owing to the weakly coupled D-σ-A intramolecular charge-transfer state, a transition from the localized excited triplet state ( 3 LE) and charge-transfer triplet state ( 3 CT) to the charge-transfer singlet state ( 1 CT) occurred with a small activation energy and high photoluminescence quantum efficiency. Two thermally activated delayed fluorescence (TADF) components were identified, one of which has a very short lifetime of 200-400 ns and the other a longer TADF lifetime of the order of microseconds. In particular, the two D-σ-A materials presented strong blue emission with TADF properties in toluene. These results will shed light on the molecular design of new TADF emitters with short delayed lifetimes. © 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
Constitutional Dynamics of Metal-Organic Motifs on a Au(111) Surface.
Kong, Huihui; Zhang, Chi; Xie, Lei; Wang, Likun; Xu, Wei
2016-06-13
Constitutional dynamic chemistry (CDC), including both dynamic covalent chemistry and dynamic noncovalent chemistry, relies on reversible formation and breakage of bonds to achieve continuous changes in constitution by reorganization of components. In this regard, CDC is considered to be an efficient and appealing strategy for selective fabrication of surface nanostructures by virtue of dynamic diversity. Although constitutional dynamics of monolayered structures has been recently demonstrated at liquid/solid interfaces, most of molecular reorganization/reaction processes were thought to be irreversible under ultrahigh vacuum (UHV) conditions where CDC is therefore a challenge to be achieved. Here, we have successfully constructed a system that presents constitutional dynamics on a solid surface based on dynamic coordination chemistry, in which selective formation of metal-organic motifs is achieved under UHV conditions. The key to making this reversible switching successful is the molecule-substrate interaction as revealed by DFT calculations. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Automated classification of RNA 3D motifs and the RNA 3D Motif Atlas
Petrov, Anton I.; Zirbel, Craig L.; Leontis, Neocles B.
2013-01-01
The analysis of atomic-resolution RNA three-dimensional (3D) structures reveals that many internal and hairpin loops are modular, recurrent, and structured by conserved non-Watson–Crick base pairs. Structurally similar loops define RNA 3D motifs that are conserved in homologous RNA molecules, but can also occur at nonhomologous sites in diverse RNAs, and which often vary in sequence. To further our understanding of RNA motif structure and sequence variability and to provide a useful resource for structure modeling and prediction, we present a new method for automated classification of internal and hairpin loop RNA 3D motifs and a new online database called the RNA 3D Motif Atlas. To classify the motif instances, a representative set of internal and hairpin loops is automatically extracted from a nonredundant list of RNA-containing PDB files. Their structures are compared geometrically, all-against-all, using the FR3D program suite. The loops are clustered into motif groups, taking into account geometric similarity and structural annotations and making allowance for a variable number of bulged bases. The automated procedure that we have implemented identifies all hairpin and internal loop motifs previously described in the literature. All motif instances and motif groups are assigned unique and stable identifiers and are made available in the RNA 3D Motif Atlas (http://rna.bgsu.edu/motifs), which is automatically updated every four weeks. The RNA 3D Motif Atlas provides an interactive user interface for exploring motif diversity and tools for programmatic data access. PMID:23970545
Efficient scheme for parametric fitting of data in arbitrary dimensions.
Pang, Ning-Ning; Tzeng, Wen-Jer; Kao, Hisen-Ching
2008-07-01
We propose an efficient scheme for parametric fitting expressed in terms of the Legendre polynomials. For continuous systems, our scheme is exact and the derived explicit expression is very helpful for further analytical studies. For discrete systems, our scheme is almost as accurate as the method of singular value decomposition. Through a few numerical examples, we show that our algorithm costs much less CPU time and memory space than the method of singular value decomposition. Thus, our algorithm is very suitable for a large amount of data fitting. In addition, the proposed scheme can also be used to extract the global structure of fluctuating systems. We then derive the exact relation between the correlation function and the detrended variance function of fluctuating systems in arbitrary dimensions and give a general scaling analysis.
NASA Astrophysics Data System (ADS)
Bentz, Jonathan L.; Kozak, John J.; Nicolis, Gregoire
2005-08-01
The influence of non-nearest-neighbor displacements on the efficiency of diffusion-reaction processes involving one and two mobile diffusing reactants is studied. An exact analytic result is given for dimension d=1 from which, for large lattices, one can recover the asymptotic estimate reported 30 years ago by Lakatos-Lindenberg and Shuler. For dimensions d=2,3 we present numerically exact values for the mean time to reaction, as gauged by the mean walklength before reactive encounter, obtained via the theory of finite Markov processes and supported by Monte Carlo simulations. Qualitatively different results are found between processes occurring on d=1 versus d>1 lattices, and between results obtained assuming nearest-neighbor (only) versus non-nearest-neighbor displacements.
Geometric Heat Engines Featuring Power that Grows with Efficiency.
Raz, O; Subaşı, Y; Pugatch, R
2016-04-22
Thermodynamics places a limit on the efficiency of heat engines, but not on their output power or on how the power and efficiency change with the engine's cycle time. In this Letter, we develop a geometrical description of the power and efficiency as a function of the cycle time, applicable to an important class of heat engine models. This geometrical description is used to design engine protocols that attain both the maximal power and maximal efficiency at the fast driving limit. Furthermore, using this method, we also prove that no protocol can exactly attain the Carnot efficiency at nonzero power.
NASA Astrophysics Data System (ADS)
Hosseini, Kamyar; Mayeli, Peyman; Ansari, Reza
2018-07-01
Finding the exact solutions of nonlinear fractional differential equations has gained considerable attention, during the past two decades. In this paper, the conformable time-fractional Klein-Gordon equations with quadratic and cubic nonlinearities are studied. Several exact soliton solutions, including the bright (non-topological) and singular soliton solutions are formally extracted by making use of the ansatz method. Results demonstrate that the method can efficiently handle the time-fractional Klein-Gordon equations with different nonlinearities.
MotifNet: a web-server for network motif analysis.
Smoly, Ilan Y; Lerman, Eugene; Ziv-Ukelson, Michal; Yeger-Lotem, Esti
2017-06-15
Network motifs are small topological patterns that recur in a network significantly more often than expected by chance. Their identification emerged as a powerful approach for uncovering the design principles underlying complex networks. However, available tools for network motif analysis typically require download and execution of computationally intensive software on a local computer. We present MotifNet, the first open-access web-server for network motif analysis. MotifNet allows researchers to analyze integrated networks, where nodes and edges may be labeled, and to search for motifs of up to eight nodes. The output motifs are presented graphically and the user can interactively filter them by their significance, number of instances, node and edge labels, and node identities, and view their instances. MotifNet also allows the user to distinguish between motifs that are centered on specific nodes and motifs that recur in distinct parts of the network. MotifNet is freely available at http://netbio.bgu.ac.il/motifnet . The website was implemented using ReactJs and supports all major browsers. The server interface was implemented in Python with data stored on a MySQL database. estiyl@bgu.ac.il or michaluz@cs.bgu.ac.il. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
Biomaterials and cells for neural tissue engineering: Current choices.
Sensharma, Prerana; Madhumathi, G; Jayant, Rahul D; Jaiswal, Amit K
2017-08-01
The treatment of nerve injuries has taken a new dimension with the development of tissue engineering techniques. Prior to tissue engineering, suturing and surgery were the only options for effective treatment. With the advent of tissue engineering, it is now possible to design a scaffold that matches the exact biological and mechanical properties of the tissue. This has led to substantial reduction in the complications posed by surgeries and suturing to the patients. New synthetic and natural polymers are being applied to test their efficiency in generating an ideal scaffold. Along with these, cells and growth factors are also being incorporated to increase the efficiency of a scaffold. Efforts are being made to devise a scaffold that is biodegradable, biocompatible, conducting and immunologically inert. The ultimate goal is to exactly mimic the extracellular matrix in our body, and to elicit a combination of biochemical, topographical and electrical cues via various polymers, cells and growth factors, using which nerve regeneration can efficiently occur. Copyright © 2017 Elsevier B.V. All rights reserved.
Reward-based spatial crowdsourcing with differential privacy preservation
NASA Astrophysics Data System (ADS)
Xiong, Ping; Zhang, Lefeng; Zhu, Tianqing
2017-11-01
In recent years, the popularity of mobile devices has transformed spatial crowdsourcing (SC) into a novel mode for performing complicated projects. Workers can perform tasks at specified locations in return for rewards offered by employers. Existing methods ensure the efficiency of their systems by submitting the workers' exact locations to a centralised server for task assignment, which can lead to privacy violations. Thus, implementing crowsourcing applications while preserving the privacy of workers' location is a key issue that needs to be tackled. We propose a reward-based SC method that achieves acceptable utility as measured by task assignment success rates, while efficiently preserving privacy. A differential privacy model ensures rigorous privacy guarantee, and Laplace noise is introduced to protect workers' exact locations. We then present a reward allocation mechanism that adjusts each piece of the reward for a task using the distribution of the workers' locations. Through experimental results, we demonstrate that this optimised-reward method is efficient for SC applications.
Tasiopoulos, Christos Panagiotis; Widhe, Mona; Hedhammar, My
2018-05-02
In vitro endothelialization of synthetic grafts or engineered vascular constructs is considered a promising alternative to overcome shortcomings in the availability of autologous vessels and in-graft complications with synthetics. A number of cell-seeding techniques have been implemented to render vascular grafts accessible for cells to attach, proliferate, and spread over the surface area. Nonetheless, seeding efficiency and the time needed for cells to adhere varies dramatically. Herein, we investigated a novel cell-seeding approach (denoted co-seeding) that enables cells to bind to a motif from fibronectin included in a recombinant spider silk protein. Entrapment of cells occurs at the same time as the silk assembles into a nanofibrillar coating on various substrates. Cell adhesion analysis showed that the technique can markedly improve cell-seeding efficiency to nonfunctionalized polystyrene surfaces, as well as establish cell attachment and growth of human dermal microvascular endothelial cells on bare polyethylene terephthalate and polytetrafluoroethylene (PTFE) substrates. Scanning electron microscopy images revealed a uniform endothelial cell layer and cell-substratum compliance with the functionalized silk protein to PTFE surfaces. The co-seeding technique holds a great promise as a method to reliably and quickly cellularize engineered vascular constructs as well as to in vitro endothelialize commercially available cardiovascular grafts.
Substrate Specificity and Possible Heterologous Targets of Phytaspase, a Plant Cell Death Protease.
Galiullina, Raisa A; Kasperkiewicz, Paulina; Chichkova, Nina V; Szalek, Aleksandra; Serebryakova, Marina V; Poreba, Marcin; Drag, Marcin; Vartapetian, Andrey B
2015-10-09
Plants lack aspartate-specific cell death proteases homologous to animal caspases. Instead, a subtilisin-like serine-dependent plant protease named phytaspase shown to be involved in the accomplishment of programmed death of plant cells is able to hydrolyze a number of peptide-based caspase substrates. Here, we determined the substrate specificity of rice (Oryza sativa) phytaspase by using the positional scanning substrate combinatorial library approach. Phytaspase was shown to display an absolute specificity of hydrolysis after an aspartic acid residue. The preceding amino acid residues, however, significantly influence the efficiency of hydrolysis. Efficient phytaspase substrates demonstrated a remarkable preference for an aromatic amino acid residue in the P3 position. The deduced optimum phytaspase recognition motif has the sequence IWLD and is strikingly hydrophobic. The established pattern was confirmed through synthesis and kinetic analysis of cleavage of a set of optimized peptide substrates. An amino acid motif similar to the phytaspase cleavage site is shared by the human gastrointestinal peptide hormones gastrin and cholecystokinin. In agreement with the established enzyme specificity, phytaspase was shown to hydrolyze gastrin-1 and cholecystokinin at the predicted sites in vitro, thus destroying the active moieties of the hormones. © 2015 by The American Society for Biochemistry and Molecular Biology, Inc.
Grumbt, Barbara; Stroobant, Vincent; Terziyska, Nadia; Israel, Lars; Hell, Kai
2007-12-28
Mia40p and Erv1p are components of a translocation pathway for the import of cysteine-rich proteins into the intermembrane space of mitochondria. We have characterized the redox behavior of Mia40p and reconstituted the disulfide transfer system of Mia40p by using recombinant functional C-terminal fragment of Mia40p, Mia40C, and Erv1p. Oxidized Mia40p contains three intramolecular disulfide bonds. One disulfide bond connects the first two cysteine residues in the CPC motif. The second and the third bonds belong to the twin CX(9)C motif and bridge the cysteine residues of two CX(9)C segments. In contrast to the stabilizing disulfide bonds of the twin CX(9)C motif, the first disulfide bond was easily accessible to reducing agents. Partially reduced Mia40C generated by opening of this bond as well as fully reduced Mia40C were oxidized by Erv1p in vitro. In the course of this reaction, mixed disulfides of Mia40C and Erv1p were formed. Reoxidation of fully reduced Mia40C required the presence of the first two cysteine residues in Mia40C. However, efficient reoxidation of a Mia40C variant containing only the cysteine residues of the twin CX(9)C motif was observed when in addition to Erv1p low amounts of wild type Mia40C were present. In the reconstituted system the thiol oxidase Erv1p was sufficient to transfer disulfide bonds to Mia40C, which then could oxidize the variant of Mia40C. In summary, we reconstituted a disulfide relay system consisting of Mia40C and Erv1p.
Curto, M-Ángeles; Moro, Sandra; Yanguas, Francisco; Gutiérrez-González, Carmen; Valdivieso, M-Henar
2018-05-01
Dni1 and Dni2 facilitate cell fusion during mating. Here, we show that these proteins are interdependent for their localization in a plasma membrane subdomain, which we have termed the mating fusion domain. Dni1 compartmentation in the domain is required for cell fusion. The contribution of actin, sterol-dependent membrane organization, and Dni2 to this compartmentation was analysed, and the results showed that Dni2 plays the most relevant role in the process. In turn, the Dni2 exit from the endoplasmic reticulum depends on Dni1. These proteins share the presence of a cysteine motif in their first extracellular loop related to the claudin GLWxxC(8-10 aa)C signature motif. Structure-function analyses show that mutating each Dni1 conserved cysteine has mild effects, and that only simultaneous elimination of several cysteines leads to a mating defect. On the contrary, eliminating each single cysteine and the C-terminal tail in Dni2 abrogates Dni1 compartmentation and cell fusion. Sequence alignments show that claudin trans-membrane helixes bear small-XXX-small motifs at conserved positions. The fourth Dni2 trans-membrane helix tends to form homo-oligomers in Escherichia plasma membrane, and two concatenated small-XXX-small motifs are required for efficient oligomerization and for Dni2 export from the yeast endoplasmic reticulum. Together, our results strongly suggest that Dni2 is an ancient claudin that blocks Dni1 diffusion from the intercellular region where two plasma membranes are in close proximity, and that this function is required for Dni1 to facilitate cell fusion.
Assembly mechanism of FCT region type 1 pili in serotype M6 Streptococcus pyogenes.
Nakata, Masanobu; Kimura, Keiji Richard; Sumitomo, Tomoko; Wada, Satoshi; Sugauchi, Akinari; Oiki, Eiji; Higashino, Miharu; Kreikemeyer, Bernd; Podbielski, Andreas; Okahashi, Nobuo; Hamada, Shigeyuki; Isoda, Ryutaro; Terao, Yutaka; Kawabata, Shigetada
2011-10-28
The human pathogen Streptococcus pyogenes produces diverse pili depending on the serotype. We investigated the assembly mechanism of FCT type 1 pili in a serotype M6 strain. The pili were found to be assembled from two precursor proteins, the backbone protein T6 and ancillary protein FctX, and anchored to the cell wall in a manner that requires both a housekeeping sortase enzyme (SrtA) and pilus-associated sortase enzyme (SrtB). SrtB is primarily required for efficient formation of the T6 and FctX complex and subsequent polymerization of T6, whereas proper anchoring of the pili to the cell wall is mainly mediated by SrtA. Because motifs essential for polymerization of pilus backbone proteins in other Gram-positive bacteria are not present in T6, we sought to identify the functional residues involved in this process. Our results showed that T6 encompasses the novel VAKS pilin motif conserved in streptococcal T6 homologues and that the lysine residue (Lys-175) within the motif and cell wall sorting signal of T6 are prerequisites for isopeptide linkage of T6 molecules. Because Lys-175 and the cell wall sorting signal of FctX are indispensable for substantial incorporation of FctX into the T6 pilus shaft, FctX is suggested to be located at the pilus tip, which was also implied by immunogold electron microscopy findings. Thus, the elaborate assembly of FCT type 1 pili is potentially organized by sortase-mediated cross-linking between sorting signals and the amino group of Lys-175 positioned in the VAKS motif of T6, thereby displaying T6 and FctX in a temporospatial manner.
Zolotarov, Yevgen; Strömvik, Martina
2015-01-01
Plants accumulate dehydrins in response to osmotic stresses. Dehydrins are divided into five different classes, which are thought to be regulated in different manners. To better understand differences in transcriptional regulation of the five dehydrin classes, de novo motif discovery was performed on 350 dehydrin promoter sequences from a total of 51 plant genomes. Overrepresented motifs were identified in the promoters of five dehydrin classes. The Kn dehydrin promoters contain motifs linked with meristem specific expression, as well as motifs linked with cold/dehydration and abscisic acid response. KS dehydrin promoters contain a motif with a GATA core. SKn and YnSKn dehydrin promoters contain motifs that match elements connected with cold/dehydration, abscisic acid and light response. YnKn dehydrin promoters contain motifs that match abscisic acid and light response elements, but not cold/dehydration response elements. Conserved promoter motifs are present in the dehydrin classes and across different plant lineages, indicating that dehydrin gene regulation is likely also conserved.
The BaMM web server for de-novo motif discovery and regulatory sequence analysis.
Kiesel, Anja; Roth, Christian; Ge, Wanwan; Wess, Maximilian; Meier, Markus; Söding, Johannes
2018-05-28
The BaMM web server offers four tools: (i) de-novo discovery of enriched motifs in a set of nucleotide sequences, (ii) scanning a set of nucleotide sequences with motifs to find motif occurrences, (iii) searching with an input motif for similar motifs in our BaMM database with motifs for >1000 transcription factors, trained from the GTRD ChIP-seq database and (iv) browsing and keyword searching the motif database. In contrast to most other servers, we represent sequence motifs not by position weight matrices (PWMs) but by Bayesian Markov Models (BaMMs) of order 4, which we showed previously to perform substantially better in ROC analyses than PWMs or first order models. To address the inadequacy of P- and E-values as measures of motif quality, we introduce the AvRec score, the average recall over the TP-to-FP ratio between 1 and 100. The BaMM server is freely accessible without registration at https://bammmotif.mpibpc.mpg.de.
Understanding the Development of Mathematical Work in the Context of the Classroom
ERIC Educational Resources Information Center
Kuzniak, Alain; Nechache, Assia; Drouhard, J. P.
2016-01-01
According to our approach to mathematics education, the optimal aim of the teaching of mathematics is to assist students in achieving efficient mathematical work. But, what does efficient exactly mean in that case? And how can teachers reach this objective? The model of Mathematical Working Spaces with its three dimensions--semiotic, instrumental,…
Efficiently Sorting Zoo-Mesh Data Sets
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cook, R; Max, N; Silva, C
The authors describe the SXMPVO algorithm for performing a visibility ordering zoo-meshed polyhedra. The algorithm runs in practice in linear time and the visibility ordering which it produces is exact.
Discovering Sequence Motifs with Arbitrary Insertions and Deletions
Frith, Martin C.; Saunders, Neil F. W.; Kobe, Bostjan; Bailey, Timothy L.
2008-01-01
Biology is encoded in molecular sequences: deciphering this encoding remains a grand scientific challenge. Functional regions of DNA, RNA, and protein sequences often exhibit characteristic but subtle motifs; thus, computational discovery of motifs in sequences is a fundamental and much-studied problem. However, most current algorithms do not allow for insertions or deletions (indels) within motifs, and the few that do have other limitations. We present a method, GLAM2 (Gapped Local Alignment of Motifs), for discovering motifs allowing indels in a fully general manner, and a companion method GLAM2SCAN for searching sequence databases using such motifs. glam2 is a generalization of the gapless Gibbs sampling algorithm. It re-discovers variable-width protein motifs from the PROSITE database significantly more accurately than the alternative methods PRATT and SAM-T2K. Furthermore, it usefully refines protein motifs from the ELM database: in some cases, the refined motifs make orders of magnitude fewer overpredictions than the original ELM regular expressions. GLAM2 performs respectably on the BAliBASE multiple alignment benchmark, and may be superior to leading multiple alignment methods for “motif-like” alignments with N- and C-terminal extensions. Finally, we demonstrate the use of GLAM2 to discover protein kinase substrate motifs and a gapped DNA motif for the LIM-only transcriptional regulatory complex: using GLAM2SCAN, we identify promising targets for the latter. GLAM2 is especially promising for short protein motifs, and it should improve our ability to identify the protein cleavage sites, interaction sites, post-translational modification attachment sites, etc., that underlie much of biology. It may be equally useful for arbitrarily gapped motifs in DNA and RNA, although fewer examples of such motifs are known at present. GLAM2 is public domain software, available for download at http://bioinformatics.org.au/glam2. PMID:18437229
A Bioinformatics Approach for Detecting Repetitive Nested Motifs using Pattern Matching.
Romero, José R; Carballido, Jessica A; Garbus, Ingrid; Echenique, Viviana C; Ponzoni, Ignacio
2016-01-01
The identification of nested motifs in genomic sequences is a complex computational problem. The detection of these patterns is important to allow the discovery of transposable element (TE) insertions, incomplete reverse transcripts, deletions, and/or mutations. In this study, a de novo strategy for detecting patterns that represent nested motifs was designed based on exhaustive searches for pairs of motifs and combinatorial pattern analysis. These patterns can be grouped into three categories, motifs within other motifs, motifs flanked by other motifs, and motifs of large size. The methodology used in this study, applied to genomic sequences from the plant species Aegilops tauschii and Oryza sativa , revealed that it is possible to identify putative nested TEs by detecting these three types of patterns. The results were validated through BLAST alignments, which revealed the efficacy and usefulness of the new method, which is called Mamushka.
FPGA implementation of motifs-based neuronal network and synchronization analysis
NASA Astrophysics Data System (ADS)
Deng, Bin; Zhu, Zechen; Yang, Shuangming; Wei, Xile; Wang, Jiang; Yu, Haitao
2016-06-01
Motifs in complex networks play a crucial role in determining the brain functions. In this paper, 13 kinds of motifs are implemented with Field Programmable Gate Array (FPGA) to investigate the relationships between the networks properties and motifs properties. We use discretization method and pipelined architecture to construct various motifs with Hindmarsh-Rose (HR) neuron as the node model. We also build a small-world network based on these motifs and conduct the synchronization analysis of motifs as well as the constructed network. We find that the synchronization properties of motif determine that of motif-based small-world network, which demonstrates effectiveness of our proposed hardware simulation platform. By imitation of some vital nuclei in the brain to generate normal discharges, our proposed FPGA-based artificial neuronal networks have the potential to replace the injured nuclei to complete the brain function in the treatment of Parkinson's disease and epilepsy.
Karnik, Rahul; Beer, Michael A.
2015-01-01
The generation of genomic binding or accessibility data from massively parallel sequencing technologies such as ChIP-seq and DNase-seq continues to accelerate. Yet state-of-the-art computational approaches for the identification of DNA binding motifs often yield motifs of weak predictive power. Here we present a novel computational algorithm called MotifSpec, designed to find predictive motifs, in contrast to over-represented sequence elements. The key distinguishing feature of this algorithm is that it uses a dynamic search space and a learned threshold to find discriminative motifs in combination with the modeling of motifs using a full PWM (position weight matrix) rather than k-mer words or regular expressions. We demonstrate that our approach finds motifs corresponding to known binding specificities in several mammalian ChIP-seq datasets, and that our PWMs classify the ChIP-seq signals with accuracy comparable to, or marginally better than motifs from the best existing algorithms. In other datasets, our algorithm identifies novel motifs where other methods fail. Finally, we apply this algorithm to detect motifs from expression datasets in C. elegans using a dynamic expression similarity metric rather than fixed expression clusters, and find novel predictive motifs. PMID:26465884
Karnik, Rahul; Beer, Michael A
2015-01-01
The generation of genomic binding or accessibility data from massively parallel sequencing technologies such as ChIP-seq and DNase-seq continues to accelerate. Yet state-of-the-art computational approaches for the identification of DNA binding motifs often yield motifs of weak predictive power. Here we present a novel computational algorithm called MotifSpec, designed to find predictive motifs, in contrast to over-represented sequence elements. The key distinguishing feature of this algorithm is that it uses a dynamic search space and a learned threshold to find discriminative motifs in combination with the modeling of motifs using a full PWM (position weight matrix) rather than k-mer words or regular expressions. We demonstrate that our approach finds motifs corresponding to known binding specificities in several mammalian ChIP-seq datasets, and that our PWMs classify the ChIP-seq signals with accuracy comparable to, or marginally better than motifs from the best existing algorithms. In other datasets, our algorithm identifies novel motifs where other methods fail. Finally, we apply this algorithm to detect motifs from expression datasets in C. elegans using a dynamic expression similarity metric rather than fixed expression clusters, and find novel predictive motifs.
Regad, Leslie; Martin, Juliette; Camproux, Anne-Claude
2011-06-20
One of the strategies for protein function annotation is to search particular structural motifs that are known to be shared by proteins with a given function. Here, we present a systematic extraction of structural motifs of seven residues from protein loops and we explore their correspondence with functional sites. Our approach is based on the structural alphabet HMM-SA (Hidden Markov Model - Structural Alphabet), which allows simplification of protein structures into uni-dimensional sequences, and advanced pattern statistics adapted to short sequences. Structural motifs of interest are selected by looking for structural motifs significantly over-represented in SCOP superfamilies in protein loops. We discovered two types of structural motifs significantly over-represented in SCOP superfamilies: (i) ubiquitous motifs, shared by several superfamilies and (ii) superfamily-specific motifs, over-represented in few superfamilies. A comparison of ubiquitous words with known small structural motifs shows that they contain well-described motifs as turn, niche or nest motifs. A comparison between superfamily-specific motifs and biological annotations of Swiss-Prot reveals that some of them actually correspond to functional sites involved in the binding sites of small ligands, such as ATP/GTP, NAD(P) and SAH/SAM. Our findings show that statistical over-representation in SCOP superfamilies is linked to functional features. The detection of over-represented motifs within structures simplified by HMM-SA is therefore a promising approach for prediction of functional sites and annotation of uncharacterized proteins.
2011-01-01
Background One of the strategies for protein function annotation is to search particular structural motifs that are known to be shared by proteins with a given function. Results Here, we present a systematic extraction of structural motifs of seven residues from protein loops and we explore their correspondence with functional sites. Our approach is based on the structural alphabet HMM-SA (Hidden Markov Model - Structural Alphabet), which allows simplification of protein structures into uni-dimensional sequences, and advanced pattern statistics adapted to short sequences. Structural motifs of interest are selected by looking for structural motifs significantly over-represented in SCOP superfamilies in protein loops. We discovered two types of structural motifs significantly over-represented in SCOP superfamilies: (i) ubiquitous motifs, shared by several superfamilies and (ii) superfamily-specific motifs, over-represented in few superfamilies. A comparison of ubiquitous words with known small structural motifs shows that they contain well-described motifs as turn, niche or nest motifs. A comparison between superfamily-specific motifs and biological annotations of Swiss-Prot reveals that some of them actually correspond to functional sites involved in the binding sites of small ligands, such as ATP/GTP, NAD(P) and SAH/SAM. Conclusions Our findings show that statistical over-representation in SCOP superfamilies is linked to functional features. The detection of over-represented motifs within structures simplified by HMM-SA is therefore a promising approach for prediction of functional sites and annotation of uncharacterized proteins. PMID:21689388
Molecular helices as electron acceptors in high-performance bulk heterojunction solar cells
Yu M. Zhong; Nam, Chang -Yong; Trinh, M. Tuan; ...
2015-09-18
Despite numerous organic semiconducting materials synthesized for organic photovoltaics in the past decade, fullerenes are widely used as electron acceptors in highly efficient bulk-heterojunction solar cells. None of the non-fullerene bulk heterojunction solar cells have achieved efficiencies as high as fullerene-based solar cells. Design principles for fullerene-free acceptors remain unclear in the field. Here we report examples of helical molecular semiconductors as electron acceptors that are on par with fullerene derivatives in efficient solar cells. We achieved an 8.3% power conversion efficiency in a solar cell, which is a record high for non-fullerene bulk heterojunctions. Femtosecond transient absorption spectroscopy revealedmore » both electron and hole transfer processes at the donor–acceptor interfaces. Atomic force microscopy reveals a mesh-like network of acceptors with pores that are tens of nanometres in diameter for efficient exciton separation and charge transport. As a result, this study describes a new motif for designing highly efficient acceptors for organic solar cells.« less
Molecular helices as electron acceptors in high-performance bulk heterojunction solar cells.
Zhong, Yu; Trinh, M Tuan; Chen, Rongsheng; Purdum, Geoffrey E; Khlyabich, Petr P; Sezen, Melda; Oh, Seokjoon; Zhu, Haiming; Fowler, Brandon; Zhang, Boyuan; Wang, Wei; Nam, Chang-Yong; Sfeir, Matthew Y; Black, Charles T; Steigerwald, Michael L; Loo, Yueh-Lin; Ng, Fay; Zhu, X-Y; Nuckolls, Colin
2015-09-18
Despite numerous organic semiconducting materials synthesized for organic photovoltaics in the past decade, fullerenes are widely used as electron acceptors in highly efficient bulk-heterojunction solar cells. None of the non-fullerene bulk heterojunction solar cells have achieved efficiencies as high as fullerene-based solar cells. Design principles for fullerene-free acceptors remain unclear in the field. Here we report examples of helical molecular semiconductors as electron acceptors that are on par with fullerene derivatives in efficient solar cells. We achieved an 8.3% power conversion efficiency in a solar cell, which is a record high for non-fullerene bulk heterojunctions. Femtosecond transient absorption spectroscopy revealed both electron and hole transfer processes at the donor-acceptor interfaces. Atomic force microscopy reveals a mesh-like network of acceptors with pores that are tens of nanometres in diameter for efficient exciton separation and charge transport. This study describes a new motif for designing highly efficient acceptors for organic solar cells.
Eisinga, Rob; Heskes, Tom; Pelzer, Ben; Te Grotenhuis, Manfred
2017-01-25
The Friedman rank sum test is a widely-used nonparametric method in computational biology. In addition to examining the overall null hypothesis of no significant difference among any of the rank sums, it is typically of interest to conduct pairwise comparison tests. Current approaches to such tests rely on large-sample approximations, due to the numerical complexity of computing the exact distribution. These approximate methods lead to inaccurate estimates in the tail of the distribution, which is most relevant for p-value calculation. We propose an efficient, combinatorial exact approach for calculating the probability mass distribution of the rank sum difference statistic for pairwise comparison of Friedman rank sums, and compare exact results with recommended asymptotic approximations. Whereas the chi-squared approximation performs inferiorly to exact computation overall, others, particularly the normal, perform well, except for the extreme tail. Hence exact calculation offers an improvement when small p-values occur following multiple testing correction. Exact inference also enhances the identification of significant differences whenever the observed values are close to the approximate critical value. We illustrate the proposed method in the context of biological machine learning, were Friedman rank sum difference tests are commonly used for the comparison of classifiers over multiple datasets. We provide a computationally fast method to determine the exact p-value of the absolute rank sum difference of a pair of Friedman rank sums, making asymptotic tests obsolete. Calculation of exact p-values is easy to implement in statistical software and the implementation in R is provided in one of the Additional files and is also available at http://www.ru.nl/publish/pages/726696/friedmanrsd.zip .
A structural-alphabet-based strategy for finding structural motifs across protein families
Wu, Chih Yuan; Chen, Yao Chi; Lim, Carmay
2010-01-01
Proteins with insignificant sequence and overall structure similarity may still share locally conserved contiguous structural segments; i.e. structural/3D motifs. Most methods for finding 3D motifs require a known motif to search for other similar structures or functionally/structurally crucial residues. Here, without requiring a query motif or essential residues, a fully automated method for discovering 3D motifs of various sizes across protein families with different folds based on a 16-letter structural alphabet is presented. It was applied to structurally non-redundant proteins bound to DNA, RNA, obligate/non-obligate proteins as well as free DNA-binding proteins (DBPs) and proteins with known structures but unknown function. Its usefulness was illustrated by analyzing the 3D motifs found in DBPs. A non-specific motif was found with a ‘corner’ architecture that confers a stable scaffold and enables diverse interactions, making it suitable for binding not only DNA but also RNA and proteins. Furthermore, DNA-specific motifs present ‘only’ in DBPs were discovered. The motifs found can provide useful guidelines in detecting binding sites and computational protein redesign. PMID:20525797
An exact solution for orbit view-periods from a station on a tri-axial ellipsoidal planet
NASA Technical Reports Server (NTRS)
Tang, C. C. H.
1986-01-01
This paper presents the concise exact solution for predicting view-periods to be observed from a masked or unmasked tracking station on a tri-axial ellipsoidal surface. The new exact approach expresses the azimuth and elevation angles of a spacecraft in terms of the station-centered geodetic topocentric coordinates in an elegantly concise manner. A simple and efficient algorithm is developed to avoid costly repetitive computations in searching for neighborhoods near the rise and set times of each satellite orbit for each station. Only one search for each orbit is necessary for each station. Sample results indicate that the use of an assumed spherical earth instead of an 'actual' tri-axial ellipsoidal earth could introduce an error up to a few minutes in a view-period prediction for circular orbits of low or medium altitude. For an elliptical orbit of high eccentricity and long period, the maximum error could be even larger. The analytic treatment and the efficient algorithm are designed for geocentric orbits, but they should be applicable to interplanetary trajectories by an appropriate coordinates transformation at each view-period calculation. This analysis can be accomplished only by not using the classical orbital elements.
An exact solution for orbit view-periods from a station on a tri-axial ellipsoidal planet
NASA Astrophysics Data System (ADS)
Tang, C. C. H.
1986-08-01
This paper presents the concise exact solution for predicting view-periods to be observed from a masked or unmasked tracking station on a tri-axial ellipsoidal surface. The new exact approach expresses the azimuth and elevation angles of a spacecraft in terms of the station-centered geodetic topocentric coordinates in an elegantly concise manner. A simple and efficient algorithm is developed to avoid costly repetitive computations in searching for neighborhoods near the rise and set times of each satellite orbit for each station. Only one search for each orbit is necessary for each station. Sample results indicate that the use of an assumed spherical earth instead of an 'actual' tri-axial ellipsoidal earth could introduce an error up to a few minutes in a view-period prediction for circular orbits of low or medium altitude. For an elliptical orbit of high eccentricity and long period, the maximum error could be even larger. The analytic treatment and the efficient algorithm are designed for geocentric orbits, but they should be applicable to interplanetary trajectories by an appropriate coordinates transformation at each view-period calculation. This analysis can be accomplished only by not using the classical orbital elements.
An exact and efficient first passage time algorithm for reaction-diffusion processes on a 2D-lattice
NASA Astrophysics Data System (ADS)
Bezzola, Andri; Bales, Benjamin B.; Alkire, Richard C.; Petzold, Linda R.
2014-01-01
We present an exact and efficient algorithm for reaction-diffusion-nucleation processes on a 2D-lattice. The algorithm makes use of first passage time (FPT) to replace the computationally intensive simulation of diffusion hops in KMC by larger jumps when particles are far away from step-edges or other particles. Our approach computes exact probability distributions of jump times and target locations in a closed-form formula, based on the eigenvectors and eigenvalues of the corresponding 1D transition matrix, maintaining atomic-scale resolution of resulting shapes of deposit islands. We have applied our method to three different test cases of electrodeposition: pure diffusional aggregation for large ranges of diffusivity rates and for simulation domain sizes of up to 4096×4096 sites, the effect of diffusivity on island shapes and sizes in combination with a KMC edge diffusion, and the calculation of an exclusion zone in front of a step-edge, confirming statistical equivalence to standard KMC simulations. The algorithm achieves significant speedup compared to standard KMC for cases where particles diffuse over long distances before nucleating with other particles or being captured by larger islands.
An exact and efficient first passage time algorithm for reaction–diffusion processes on a 2D-lattice
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bezzola, Andri, E-mail: andri.bezzola@gmail.com; Bales, Benjamin B., E-mail: bbbales2@gmail.com; Alkire, Richard C., E-mail: r-alkire@uiuc.edu
2014-01-01
We present an exact and efficient algorithm for reaction–diffusion–nucleation processes on a 2D-lattice. The algorithm makes use of first passage time (FPT) to replace the computationally intensive simulation of diffusion hops in KMC by larger jumps when particles are far away from step-edges or other particles. Our approach computes exact probability distributions of jump times and target locations in a closed-form formula, based on the eigenvectors and eigenvalues of the corresponding 1D transition matrix, maintaining atomic-scale resolution of resulting shapes of deposit islands. We have applied our method to three different test cases of electrodeposition: pure diffusional aggregation for largemore » ranges of diffusivity rates and for simulation domain sizes of up to 4096×4096 sites, the effect of diffusivity on island shapes and sizes in combination with a KMC edge diffusion, and the calculation of an exclusion zone in front of a step-edge, confirming statistical equivalence to standard KMC simulations. The algorithm achieves significant speedup compared to standard KMC for cases where particles diffuse over long distances before nucleating with other particles or being captured by larger islands.« less
Boehm, Elizabeth M.; Powers, Kyle T.; Kondratick, Christine M.; Spies, Maria; Houtman, Jon C. D.; Washington, M. Todd
2016-01-01
Y-family DNA polymerases, such as polymerase η, polymerase ι, and polymerase κ, catalyze the bypass of DNA damage during translesion synthesis. These enzymes are recruited to sites of DNA damage by interacting with the essential replication accessory protein proliferating cell nuclear antigen (PCNA) and the scaffold protein Rev1. In most Y-family polymerases, these interactions are mediated by one or more conserved PCNA-interacting protein (PIP) motifs that bind in a hydrophobic pocket on the front side of PCNA as well as by conserved Rev1-interacting region (RIR) motifs that bind in a hydrophobic pocket on the C-terminal domain of Rev1. Yeast polymerase η, a prototypical translesion synthesis polymerase, binds both PCNA and Rev1. It possesses a single PIP motif but not an RIR motif. Here we show that the PIP motif of yeast polymerase η mediates its interactions both with PCNA and with Rev1. Moreover, the PIP motif of polymerase η binds in the hydrophobic pocket on the Rev1 C-terminal domain. We also show that the RIR motif of human polymerase κ and the PIP motif of yeast Msh6 bind both PCNA and Rev1. Overall, these findings demonstrate that PIP motifs and RIR motifs have overlapping specificities and can interact with both PCNA and Rev1 in structurally similar ways. These findings also suggest that PIP motifs are a more versatile protein interaction motif than previously believed. PMID:26903512
Han, Ziying; Madara, Jonathan J; Liu, Yuliang; Liu, Wenbo; Ruthel, Gordon; Freedman, Bruce D; Harty, Ronald N
2015-10-01
Ebola (EBOV) is an enveloped, negative-sense RNA virus belonging to the family Filoviridae that causes hemorrhagic fever syndromes with high-mortality rates. To date, there are no licensed vaccines or therapeutics to control EBOV infection and prevent transmission. Consequently, the need to better understand the mechanisms that regulate virus transmission is critical to developing countermeasures. The EBOV VP40 matrix protein plays a central role in late stages of virion assembly and egress, and independent expression of VP40 leads to the production of virus-like particles (VLPs) by a mechanism that accurately mimics budding of live virus. VP40 late (L) budding domains mediate efficient virus-cell separation by recruiting host ESCRT and ESCRT-associated proteins to complete the membrane fission process. L-domains consist of core consensus amino acid motifs including PPxY, P(T/S)AP, and YPx(n)L/I, and EBOV VP40 contains overlapping PPxY and PTAP motifs whose interactions with Nedd4 and Tsg101, respectively, have been characterized extensively. Here, we present data demonstrating for the first time that EBOV VP40 possesses a third L-domain YPx(n)L/I consensus motif that interacts with the ESCRT-III protein Alix. We show that the YPx(n)L/I motif mapping to amino acids 18-26 of EBOV VP40 interacts with the Alix Bro1-V fragment, and that siRNA knockdown of endogenous Alix expression inhibits EBOV VP40 VLP egress. Furthermore, overexpression of Alix Bro1-V rescues VLP production of the budding deficient EBOV VP40 double PTAP/PPEY L-domain deletion mutant to wild-type levels. Together, these findings demonstrate that EBOV VP40 recruits host Alix via a YPx(n)L/I motif that can function as an alternative L-domain to promote virus egress. © The Author 2015. Published by Oxford University Press on behalf of the Infectious Diseases Society of America. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Biological production models as elements of coupled, atmosphere-ocean models for climate research
NASA Technical Reports Server (NTRS)
Platt, Trevor; Sathyendranath, Shubha
1991-01-01
Process models of phytoplankton production are discussed with respect to their suitability for incorporation into global-scale numerical ocean circulation models. Exact solutions are given for integrals over the mixed layer and the day of analytic, wavelength-independent models of primary production. Within this class of model, the bias incurred by using a triangular approximation (rather than a sinusoidal one) to the variation of surface irradiance through the day is computed. Efficient computation algorithms are given for the nonspectral models. More exact calculations require a spectrally sensitive treatment. Such models exist but must be integrated numerically over depth and time. For these integrations, resolution in wavelength, depth, and time are considered and recommendations made for efficient computation. The extrapolation of the one-(spatial)-dimension treatment to large horizontal scale is discussed.
Simulation of biochemical reactions with time-dependent rates by the rejection-based algorithm
DOE Office of Scientific and Technical Information (OSTI.GOV)
Thanh, Vo Hong, E-mail: vo@cosbi.eu; Priami, Corrado, E-mail: priami@cosbi.eu; Department of Mathematics, University of Trento, Trento
We address the problem of simulating biochemical reaction networks with time-dependent rates and propose a new algorithm based on our rejection-based stochastic simulation algorithm (RSSA) [Thanh et al., J. Chem. Phys. 141(13), 134116 (2014)]. The computation for selecting next reaction firings by our time-dependent RSSA (tRSSA) is computationally efficient. Furthermore, the generated trajectory is exact by exploiting the rejection-based mechanism. We benchmark tRSSA on different biological systems with varying forms of reaction rates to demonstrate its applicability and efficiency. We reveal that for nontrivial cases, the selection of reaction firings in existing algorithms introduces approximations because the integration of reactionmore » rates is very computationally demanding and simplifying assumptions are introduced. The selection of the next reaction firing by our approach is easier while preserving the exactness.« less
Making Optical-Fiber Chemical Detectors More Sensitive
NASA Technical Reports Server (NTRS)
Rogowski, Robert S.; Egalon, Claudio O.
1993-01-01
Calculations based on exact theory of optical fiber shown how to increase optical efficiency and sensitivity of active-cladding step-index-profile optical-fiber fluorosensor using evanescent wave coupling. Optical-fiber fluorosensor contains molecules fluorescing when illuminated by suitable light in presence of analyte. Fluorescence coupled into and launched along core by evanescent-wave interaction. Efficiency increases with difference in refractive indices.
Exact and efficient simulation of concordant computation
NASA Astrophysics Data System (ADS)
Cable, Hugo; Browne, Daniel E.
2015-11-01
Concordant computation is a circuit-based model of quantum computation for mixed states, that assumes that all correlations within the register are discord-free (i.e. the correlations are essentially classical) at every step of the computation. The question of whether concordant computation always admits efficient simulation by a classical computer was first considered by Eastin in arXiv:quant-ph/1006.4402v1, where an answer in the affirmative was given for circuits consisting only of one- and two-qubit gates. Building on this work, we develop the theory of classical simulation of concordant computation. We present a new framework for understanding such computations, argue that a larger class of concordant computations admit efficient simulation, and provide alternative proofs for the main results of arXiv:quant-ph/1006.4402v1 with an emphasis on the exactness of simulation which is crucial for this model. We include detailed analysis of the arithmetic complexity for solving equations in the simulation, as well as extensions to larger gates and qudits. We explore the limitations of our approach, and discuss the challenges faced in developing efficient classical simulation algorithms for all concordant computations.
Watanabe, Susan M; Simon, Viviana; Durham, Natasha D; Kemp, Brittney R; Machihara, Satoshi; Kemal, Kimdar Sherefa; Shi, Binshan; Foley, Brian; Li, Hongru; Chen, Benjamin K; Weiser, Barbara; Burger, Harold; Anastos, Kathryn; Chen, Chaoping; Carter, Carol A
2016-09-06
The p6 region of the HIV-1 structural precursor polyprotein, Gag, contains two motifs, P7TAP11 and L35YPLXSL41, designated as late (L) domain-1 and -2, respectively. These motifs bind the ESCRT-I factor Tsg101 and the ESCRT adaptor Alix, respectively, and are critical for efficient budding of virus particles from the plasma membrane. L domain-2 is thought to be functionally redundant to PTAP. To identify possible other functions of L domain-2, we examined this motif in dominant viruses that emerged in a group of 14 women who had detectable levels of HIV-1 in both plasma and genital tract despite a history of current or previous antiretroviral therapy. Remarkably, variants possessing mutations or rare polymorphisms in the highly conserved L domain-2 were identified in seven of these women. A mutation in a conserved residue (S40A) that does not reduce Gag interaction with Alix and therefore did not reduce budding efficiency was further investigated. This mutation causes a simultaneous change in the Pol reading frame but exhibits little deficiency in Gag processing and virion maturation. Whether introduced into the HIV-1 NL4-3 strain genome or a model protease (PR) precursor, S40A reduced production of mature PR. This same mutation also led to high level detection of two extended forms of PR that were fairly stable compared to the WT in the presence of IDV at various concentrations; one of the extended forms was effective in trans processing even at micromolar IDV. Our results indicate that L domain-2, considered redundant in vitro, can undergo mutations in vivo that significantly alter PR function. These may contribute fitness benefits in both the absence and presence of PR inhibitor.
Efficient Calculation of Exact Exchange Within the Quantum Espresso Software Package
NASA Astrophysics Data System (ADS)
Barnes, Taylor; Kurth, Thorsten; Carrier, Pierre; Wichmann, Nathan; Prendergast, David; Kent, Paul; Deslippe, Jack
Accurate simulation of condensed matter at the nanoscale requires careful treatment of the exchange interaction between electrons. In the context of plane-wave DFT, these interactions are typically represented through the use of approximate functionals. Greater accuracy can often be obtained through the use of functionals that incorporate some fraction of exact exchange; however, evaluation of the exact exchange potential is often prohibitively expensive. We present an improved algorithm for the parallel computation of exact exchange in Quantum Espresso, an open-source software package for plane-wave DFT simulation. Through the use of aggressive load balancing and on-the-fly transformation of internal data structures, our code exhibits speedups of approximately an order of magnitude for practical calculations. Additional optimizations are presented targeting the many-core Intel Xeon-Phi ``Knights Landing'' architecture, which largely powers NERSC's new Cori system. We demonstrate the successful application of the code to difficult problems, including simulation of water at a platinum interface and computation of the X-ray absorption spectra of transition metal oxides.
A hierarchical exact accelerated stochastic simulation algorithm
NASA Astrophysics Data System (ADS)
Orendorff, David; Mjolsness, Eric
2012-12-01
A new algorithm, "HiER-leap" (hierarchical exact reaction-leaping), is derived which improves on the computational properties of the ER-leap algorithm for exact accelerated simulation of stochastic chemical kinetics. Unlike ER-leap, HiER-leap utilizes a hierarchical or divide-and-conquer organization of reaction channels into tightly coupled "blocks" and is thereby able to speed up systems with many reaction channels. Like ER-leap, HiER-leap is based on the use of upper and lower bounds on the reaction propensities to define a rejection sampling algorithm with inexpensive early rejection and acceptance steps. But in HiER-leap, large portions of intra-block sampling may be done in parallel. An accept/reject step is used to synchronize across blocks. This method scales well when many reaction channels are present and has desirable asymptotic properties. The algorithm is exact, parallelizable and achieves a significant speedup over the stochastic simulation algorithm and ER-leap on certain problems. This algorithm offers a potentially important step towards efficient in silico modeling of entire organisms.
A generic motif discovery algorithm for sequential data.
Jensen, Kyle L; Styczynski, Mark P; Rigoutsos, Isidore; Stephanopoulos, Gregory N
2006-01-01
Motif discovery in sequential data is a problem of great interest and with many applications. However, previous methods have been unable to combine exhaustive search with complex motif representations and are each typically only applicable to a certain class of problems. Here we present a generic motif discovery algorithm (Gemoda) for sequential data. Gemoda can be applied to any dataset with a sequential character, including both categorical and real-valued data. As we show, Gemoda deterministically discovers motifs that are maximal in composition and length. As well, the algorithm allows any choice of similarity metric for finding motifs. Finally, Gemoda's output motifs are representation-agnostic: they can be represented using regular expressions, position weight matrices or any number of other models for any type of sequential data. We demonstrate a number of applications of the algorithm, including the discovery of motifs in amino acids sequences, a new solution to the (l,d)-motif problem in DNA sequences and the discovery of conserved protein substructures. Gemoda is freely available at http://web.mit.edu/bamel/gemoda
Convergent evolution and mimicry of protein linear motifs in host-pathogen interactions.
Chemes, Lucía Beatriz; de Prat-Gay, Gonzalo; Sánchez, Ignacio Enrique
2015-06-01
Pathogen linear motif mimics are highly evolvable elements that facilitate rewiring of host protein interaction networks. Host linear motifs and pathogen mimics differ in sequence, leading to thermodynamic and structural differences in the resulting protein-protein interactions. Moreover, the functional output of a mimic depends on the motif and domain repertoire of the pathogen protein. Regulatory evolution mediated by linear motifs can be understood by measuring evolutionary rates, quantifying positive and negative selection and performing phylogenetic reconstructions of linear motif natural history. Convergent evolution of linear motif mimics is widespread among unrelated proteins from viral, prokaryotic and eukaryotic pathogens and can also take place within individual protein phylogenies. Statistics, biochemistry and laboratory models of infection link pathogen linear motifs to phenotypic traits such as tropism, virulence and oncogenicity. In vitro evolution experiments and analysis of natural sequences suggest that changes in linear motif composition underlie pathogen adaptation to a changing environment. Copyright © 2015 Elsevier Ltd. All rights reserved.
Enrichment of Circular Code Motifs in the Genes of the Yeast Saccharomyces cerevisiae.
Michel, Christian J; Ngoune, Viviane Nguefack; Poch, Olivier; Ripp, Raymond; Thompson, Julie D
2017-12-03
A set X of 20 trinucleotides has been found to have the highest average occurrence in the reading frame, compared to the two shifted frames, of genes of bacteria, archaea, eukaryotes, plasmids and viruses. This set X has an interesting mathematical property, since X is a maximal C3 self-complementary trinucleotide circular code. Furthermore, any motif obtained from this circular code X has the capacity to retrieve, maintain and synchronize the original (reading) frame. Since 1996, the theory of circular codes in genes has mainly been developed by analysing the properties of the 20 trinucleotides of X, using combinatorics and statistical approaches. For the first time, we test this theory by analysing the X motifs, i.e., motifs from the circular code X, in the complete genome of the yeast Saccharomyces cerevisiae . Several properties of X motifs are identified by basic statistics (at the frequency level), and evaluated by comparison to R motifs, i.e., random motifs generated from 30 different random codes R. We first show that the frequency of X motifs is significantly greater than that of R motifs in the genome of S. cerevisiae . We then verify that no significant difference is observed between the frequencies of X and R motifs in the non-coding regions of S. cerevisiae , but that the occurrence number of X motifs is significantly higher than R motifs in the genes (protein-coding regions). This property is true for all cardinalities of X motifs (from 4 to 20) and for all 16 chromosomes. We further investigate the distribution of X motifs in the three frames of S. cerevisiae genes and show that they occur more frequently in the reading frame, regardless of their cardinality or their length. Finally, the ratio of X genes, i.e., genes with at least one X motif, to non-X genes, in the set of verified genes is significantly different to that observed in the set of putative or dubious genes with no experimental evidence. These results, taken together, represent the first evidence for a significant enrichment of X motifs in the genes of an extant organism. They raise two hypotheses: the X motifs may be evolutionary relics of the primitive codes used for translation, or they may continue to play a functional role in the complex processes of genome decoding and protein synthesis.
Andreatta, Massimo; Schafer-Nielsen, Claus; Lund, Ole; Buus, Søren; Nielsen, Morten
2011-01-01
Recent advances in high-throughput technologies have made it possible to generate both gene and protein sequence data at an unprecedented rate and scale thereby enabling entirely new “omics”-based approaches towards the analysis of complex biological processes. However, the amount and complexity of data that even a single experiment can produce seriously challenges researchers with limited bioinformatics expertise, who need to handle, analyze and interpret the data before it can be understood in a biological context. Thus, there is an unmet need for tools allowing non-bioinformatics users to interpret large data sets. We have recently developed a method, NNAlign, which is generally applicable to any biological problem where quantitative peptide data is available. This method efficiently identifies underlying sequence patterns by simultaneously aligning peptide sequences and identifying motifs associated with quantitative readouts. Here, we provide a web-based implementation of NNAlign allowing non-expert end-users to submit their data (optionally adjusting method parameters), and in return receive a trained method (including a visual representation of the identified motif) that subsequently can be used as prediction method and applied to unknown proteins/peptides. We have successfully applied this method to several different data sets including peptide microarray-derived sets containing more than 100,000 data points. NNAlign is available online at http://www.cbs.dtu.dk/services/NNAlign. PMID:22073191
Andreatta, Massimo; Schafer-Nielsen, Claus; Lund, Ole; Buus, Søren; Nielsen, Morten
2011-01-01
Recent advances in high-throughput technologies have made it possible to generate both gene and protein sequence data at an unprecedented rate and scale thereby enabling entirely new "omics"-based approaches towards the analysis of complex biological processes. However, the amount and complexity of data that even a single experiment can produce seriously challenges researchers with limited bioinformatics expertise, who need to handle, analyze and interpret the data before it can be understood in a biological context. Thus, there is an unmet need for tools allowing non-bioinformatics users to interpret large data sets. We have recently developed a method, NNAlign, which is generally applicable to any biological problem where quantitative peptide data is available. This method efficiently identifies underlying sequence patterns by simultaneously aligning peptide sequences and identifying motifs associated with quantitative readouts. Here, we provide a web-based implementation of NNAlign allowing non-expert end-users to submit their data (optionally adjusting method parameters), and in return receive a trained method (including a visual representation of the identified motif) that subsequently can be used as prediction method and applied to unknown proteins/peptides. We have successfully applied this method to several different data sets including peptide microarray-derived sets containing more than 100,000 data points. NNAlign is available online at http://www.cbs.dtu.dk/services/NNAlign.
Systematic and fully automated identification of protein sequence patterns.
Hart, R K; Royyuru, A K; Stolovitzky, G; Califano, A
2000-01-01
We present an efficient algorithm to systematically and automatically identify patterns in protein sequence families. The procedure is based on the Splash deterministic pattern discovery algorithm and on a framework to assess the statistical significance of patterns. We demonstrate its application to the fully automated discovery of patterns in 974 PROSITE families (the complete subset of PROSITE families which are defined by patterns and contain DR records). Splash generates patterns with better specificity and undiminished sensitivity, or vice versa, in 28% of the families; identical statistics were obtained in 48% of the families, worse statistics in 15%, and mixed behavior in the remaining 9%. In about 75% of the cases, Splash patterns identify sequence sites that overlap more than 50% with the corresponding PROSITE pattern. The procedure is sufficiently rapid to enable its use for daily curation of existing motif and profile databases. Third, our results show that the statistical significance of discovered patterns correlates well with their biological significance. The trypsin subfamily of serine proteases is used to illustrate this method's ability to exhaustively discover all motifs in a family that are statistically and biologically significant. Finally, we discuss applications of sequence patterns to multiple sequence alignment and the training of more sensitive score-based motif models, akin to the procedure used by PSI-BLAST. All results are available at httpl//www.research.ibm.com/spat/.
Blanden, Melanie J; Suazo, Kiall F; Hildebrandt, Emily R; Hardgrove, Daniel S; Patel, Meet; Saunders, William P; Distefano, Mark D; Schmidt, Walter K; Hougland, James L
2018-02-23
Protein prenylation is a post-translational modification that has been most commonly associated with enabling protein trafficking to and interaction with cellular membranes. In this process, an isoprenoid group is attached to a cysteine near the C terminus of a substrate protein by protein farnesyltransferase (FTase) or protein geranylgeranyltransferase type I or II (GGTase-I and GGTase-II). FTase and GGTase-I have long been proposed to specifically recognize a four-amino acid C AAX C-terminal sequence within their substrates. Surprisingly, genetic screening reveals that yeast FTase can modify sequences longer than the canonical C AAX sequence, specifically C( x ) 3 X sequences with four amino acids downstream of the cysteine. Biochemical and cell-based studies using both peptide and protein substrates reveal that mammalian FTase orthologs can also prenylate C( x ) 3 X sequences. As the search to identify physiologically relevant C( x ) 3 X proteins begins, this new prenylation motif nearly doubles the number of proteins within the yeast and human proteomes that can be explored as potential FTase substrates. This work expands our understanding of prenylation's impact within the proteome, establishes the biologically relevant reactivity possible with this new motif, and opens new frontiers in determining the impact of non-canonically prenylated proteins on cell function. © 2018 by The American Society for Biochemistry and Molecular Biology, Inc.
Geneva, Ivayla I.; Tan, Han Yen; Calvert, Peter D.
2017-01-01
Resolution limitations of optical systems are major obstacles for determining whether proteins are enriched within cell compartments. Here we use an approach to determine the degree of membrane protein ciliary enrichment that quantitatively accounts for the differences in sampling of the ciliary and apical membranes inherent to confocal microscopes. Theory shows that cilia will appear more than threefold brighter than the surrounding apical membrane when the densities of fluorescently labeled proteins are the same, thus providing a benchmark for ciliary enrichment. Using this benchmark, we examined the ciliary enrichment signals of two G protein–coupled receptors (GPCRs)—the somatostatin receptor 3 and rhodopsin. Remarkably, we found that the C-terminal VxPx motif, required for efficient enrichment of rhodopsin within rod photoreceptor sensory cilia, inhibited enrichment of the somatostatin receptor in primary cilia. Similarly, VxPx inhibited primary cilium enrichment of a chimera of rhodopsin and somatostatin receptor 3, where the dual Ax(S/A)xQ ciliary targeting motifs within the third intracellular loop of the somatostatin receptor replaced the third intracellular loop of rhodopsin. Rhodopsin was depleted from primary cilia but gained access, without being enriched, with the dual Ax(S/A)xQ motifs. Ciliary enrichment of these GPCRs thus operates via distinct mechanisms in different cells. PMID:27974638
Giancaspero, Teresa Anna; Dipalo, Emilia; Miccolis, Angelica; Boles, Eckhard; Caselle, Michele; Barile, Maria
2014-01-01
This paper deals with the control exerted by the mitochondrial translocator FLX1, which catalyzes the movement of the redox cofactor FAD across the mitochondrial membrane, on the efficiency of ATP production, ROS homeostasis, and lifespan of S. cerevisiae. The deletion of the FLX1 gene resulted in respiration-deficient and small-colony phenotype accompanied by a significant ATP shortage and ROS unbalance in glycerol-grown cells. Moreover, the flx1Δ strain showed H2O2 hypersensitivity and decreased lifespan. The impaired biochemical phenotype found in the flx1Δ strain might be justified by an altered expression of the flavoprotein subunit of succinate dehydrogenase, a key enzyme in bioenergetics and cell regulation. A search for possible cis-acting consensus motifs in the regulatory region upstream SDH1-ORF revealed a dozen of upstream motifs that might respond to induced metabolic changes by altering the expression of Flx1p. Among these motifs, two are present in the regulatory region of genes encoding proteins involved in flavin homeostasis. This is the first evidence that the mitochondrial flavin cofactor status is involved in controlling the lifespan of yeasts, maybe by changing the cellular succinate level. This is not the only case in which the homeostasis of redox cofactors underlies complex phenotypical behaviours, as lifespan in yeasts. PMID:24895546
2010-01-01
Background An important focus of genomic science is the discovery and characterization of all functional elements within genomes. In silico methods are used in genome studies to discover putative regulatory genomic elements (called words or motifs). Although a number of methods have been developed for motif discovery, most of them lack the scalability needed to analyze large genomic data sets. Methods This manuscript presents WordSeeker, an enumerative motif discovery toolkit that utilizes multi-core and distributed computational platforms to enable scalable analysis of genomic data. A controller task coordinates activities of worker nodes, each of which (1) enumerates a subset of the DNA word space and (2) scores words with a distributed Markov chain model. Results A comprehensive suite of performance tests was conducted to demonstrate the performance, speedup and efficiency of WordSeeker. The scalability of the toolkit enabled the analysis of the entire genome of Arabidopsis thaliana; the results of the analysis were integrated into The Arabidopsis Gene Regulatory Information Server (AGRIS). A public version of WordSeeker was deployed on the Glenn cluster at the Ohio Supercomputer Center. Conclusion WordSeeker effectively utilizes concurrent computing platforms to enable the identification of putative functional elements in genomic data sets. This capability facilitates the analysis of the large quantity of sequenced genomic data. PMID:21210985
NASA Astrophysics Data System (ADS)
Zhu, Yuanjun; Li, Ruyi; Lin, Yuan; Shui, Mengyang; Liu, Xiaoyan; Chen, Huan; Wang, Yinye
2016-07-01
Targeted delivery of antithrombotic drugs centralizes the effects in the thrombosis site and reduces the hemorrhage side effects in uninjured vessels. We have recently reported that the platelet-targeting factor Xa (FXa) inhibitors, constructed by engineering one Arg-Gly-Asp (RGD) motif into Ancylostoma caninum anticoagulant peptide 5 (AcAP5), can reduce the risk of systemic bleeding than non-targeted AcAP5 in mouse arterial injury model. Increasing the number of platelet-binding sites of FXa inhibitors may facilitate their adhesion to activated platelets, and further lower the bleeding risks. For this purpose, we introduced three RGD motifs into AcAP5 to generate a variant NR4 containing three platelet-binding sites. NR4 reserved its inherent anti-FXa activity. Protein-protein docking showed that all three RGD motifs were capable of binding to platelet receptor αIIbβ3. Molecular dynamics simulation demonstrated that NR4 has more opportunities to interact with αIIbβ3 than single-RGD-containing NR3. Flow cytometry analysis and rat arterial thrombosis model further confirmed that NR4 possesses enhanced platelet targeting activity. Moreover, NR4-treated mice showed a trend toward less tail bleeding time than NR3-treated mice in carotid artery endothelium injury model. Therefore, our data suggest that engineering multiple binding sites in one recombinant protein is a useful tool to improve its platelet-targeting efficiency.
Structural and Functional Investigations of the N-Terminal Ubiquitin Binding Region of Usp25.
Yang, Yuanyuan; Shi, Li; Ding, Yiluan; Shi, Yanhong; Hu, Hong-Yu; Wen, Yi; Zhang, Naixia
2017-05-23
Ubiquitin-specific protease 25 (Usp25) is a deubiquitinase that is involved in multiple biological processes. The N-terminal ubiquitin-binding region (UBR) of Usp25 contains one ubiquitin-associated domain, one small ubiquitin-like modifier (SUMO)-interacting motif and two ubiquitin-interacting motifs. Previous studies suggest that the covalent sumoylation in the UBR of Usp25 impairs its enzymatic activity. Here, we raise the hypothesis that non-covalent binding of SUMO, a prerequisite for efficient sumoylation, will impair Usp25's catalytic activity as well. To test our hypothesis and elucidate the underlying molecular mechanism, we investigated the structure and function of the Usp25 N-terminal UBR. The solution structure of Usp25 1-146 is obtained, and the key residues responsible for recognition of ubiquitin and SUMO2 are identified. Our data suggest inhibition of Usp25's catalytic activity upon the non-covalent binding of SUMO2 to the Usp25 SUMO-interacting motif. We also find that SUMO2 can competitively block the interaction between the Usp25 UBR and its ubiquitin substrates. Based on our findings, we have proposed a working model to depict the regulatory role of the Usp25 UBR in the functional display of the enzyme. Copyright © 2017 Biophysical Society. Published by Elsevier Inc. All rights reserved.
Jayaraman, Dhileepkumar; Richards, Alicia L; Westphall, Michael S; Coon, Joshua J; Ané, Jean-Michel
2017-06-01
Detecting the phosphorylation substrates of multiple kinases in a single experiment is a challenge, and new techniques are being developed to overcome this challenge. Here, we used a multiplexed assay for kinase specificity (MAKS) to identify the substrates directly and to map the phosphorylation site(s) of plant symbiotic receptor-like kinases. The symbiotic receptor-like kinases nodulation receptor-like kinase (NORK) and lysin motif domain-containing receptor-like kinase 3 (LYK3) are indispensable for the establishment of root nodule symbiosis. Although some interacting proteins have been identified for these symbiotic receptor-like kinases, very little is known about their phosphorylation substrates. Using this high-throughput approach, we identified several other potential phosphorylation targets for both these symbiotic receptor-like kinases. In particular, we also discovered the phosphorylation of LYK3 by NORK itself, which was also confirmed by pairwise kinase assays. Motif analysis of potential targets for these kinases revealed that the acidic motif xxxsDxxx was common to both of them. In summary, this high-throughput technique catalogs the potential phosphorylation substrates of multiple kinases in a single efficient experiment, the biological characterization of which should provide a better understanding of phosphorylation signaling cascade in symbiosis. © 2017 The Authors The Plant Journal © 2017 John Wiley & Sons Ltd.
Zhang, Yanju; Lameijer, Eric-Wubbo; 't Hoen, Peter A. C.; Ning, Zemin; Slagboom, P. Eline; Ye, Kai
2012-01-01
Motivation: RNA-seq is a powerful technology for the study of transcriptome profiles that uses deep-sequencing technologies. Moreover, it may be used for cellular phenotyping and help establishing the etiology of diseases characterized by abnormal splicing patterns. In RNA-Seq, the exact nature of splicing events is buried in the reads that span exon–exon boundaries. The accurate and efficient mapping of these reads to the reference genome is a major challenge. Results: We developed PASSion, a pattern growth algorithm-based pipeline for splice site detection in paired-end RNA-Seq reads. Comparing the performance of PASSion to three existing RNA-Seq analysis pipelines, TopHat, MapSplice and HMMSplicer, revealed that PASSion is competitive with these packages. Moreover, the performance of PASSion is not affected by read length and coverage. It performs better than the other three approaches when detecting junctions in highly abundant transcripts. PASSion has the ability to detect junctions that do not have known splicing motifs, which cannot be found by the other tools. Of the two public RNA-Seq datasets, PASSion predicted ∼ 137 000 and 173 000 splicing events, of which on average 82 are known junctions annotated in the Ensembl transcript database and 18% are novel. In addition, our package can discover differential and shared splicing patterns among multiple samples. Availability: The code and utilities can be freely downloaded from https://trac.nbic.nl/passion and ftp://ftp.sanger.ac.uk/pub/zn1/passion Contact: y.zhang@lumc.nl; k.ye@lumc.nl Supplementary information: Supplementary data are available at Bioinformatics online. PMID:22219203
Correlation energy functional within the GW -RPA: Exact forms, approximate forms, and challenges
NASA Astrophysics Data System (ADS)
Ismail-Beigi, Sohrab
2010-05-01
In principle, the Luttinger-Ward Green’s-function formalism allows one to compute simultaneously the total energy and the quasiparticle band structure of a many-body electronic system from first principles. We present approximate and exact expressions for the correlation energy within the GW -random-phase approximation that are more amenable to computation and allow for developing efficient approximations to the self-energy operator and correlation energy. The exact form is a sum over differences between plasmon and interband energies. The approximate forms are based on summing over screened interband transitions. We also demonstrate that blind extremization of such functionals leads to unphysical results: imposing physical constraints on the allowed solutions (Green’s functions) is necessary. Finally, we present some relevant numerical results for atomic systems.
NASA Astrophysics Data System (ADS)
Nakwaski, W.
2008-03-01
Comprehensive computer simulations are currently the most efficient and cheap methods in designing and optimisation of semiconductor device structures. Seemingly they should be as exact as possible, but in practice it is well known that the most exact approaches are also the most involved and the most time-consuming ones and need powerful computers. In some cases, cheaper somewhat simplified modelling simulations are sufficiently accurate. Therefore, an appropriate modelling approach should be chosen taking into account a compromise between our needs and our possibilities. Modelling of operation and designing of structures of vertical-cavity surface-emitting diode lasers (VCSELs) requires appropriate mathematical description of physical processes crucial for devices operation, i.e., various optical, electrical, thermal, recombination and sometimes also mechanical phenomena taking place within their volumes. Equally important are mutual interactions between above individual processes, usually strongly non-linear and creating a real network of various inter-relations. Chain is as strong as its weakest link. Analogously, model is as exact as its less exact part. Therefore it is useless to improve exactness of its more accurate parts and not to care about less exact ones. All model parts should exhibit similar accuracy. In any individual case, a reasonable compromise should be reached between high modelling fidelity and its practical convenience depending on a main modelling goal, importance and urgency of expected results, available equipment and also financial possibilities. In the present paper, some simplifications used in VCSEL modelling are discussed and their impact on exactness of VCSEL designing is analysed.
Systematic comparison of the response properties of protein and RNA mediated gene regulatory motifs.
Iyengar, Bharat Ravi; Pillai, Beena; Venkatesh, K V; Gadgil, Chetan J
2017-05-30
We present a framework enabling the dissection of the effects of motif structure (feedback or feedforward), the nature of the controller (RNA or protein), and the regulation mode (transcriptional, post-transcriptional or translational) on the response to a step change in the input. We have used a common model framework for gene expression where both motif structures have an activating input and repressing regulator, with the same set of parameters, to enable a comparison of the responses. We studied the global sensitivity of the system properties, such as steady-state gain, overshoot, peak time, and peak duration, to parameters. We find that, in all motifs, overshoot correlated negatively whereas peak duration varied concavely with peak time. Differences in the other system properties were found to be mainly dependent on the nature of the controller rather than the motif structure. Protein mediated motifs showed a higher degree of adaptation i.e. a tendency to return to baseline levels; in particular, feedforward motifs exhibited perfect adaptation. RNA mediated motifs had a mild regulatory effect; they also exhibited a lower peaking tendency and mean overshoot. Protein mediated feedforward motifs showed higher overshoot and lower peak time compared to the corresponding feedback motifs.
Discriminative motif optimization based on perceptron training
Patel, Ronak Y.; Stormo, Gary D.
2014-01-01
Motivation: Generating accurate transcription factor (TF) binding site motifs from data generated using the next-generation sequencing, especially ChIP-seq, is challenging. The challenge arises because a typical experiment reports a large number of sequences bound by a TF, and the length of each sequence is relatively long. Most traditional motif finders are slow in handling such enormous amount of data. To overcome this limitation, tools have been developed that compromise accuracy with speed by using heuristic discrete search strategies or limited optimization of identified seed motifs. However, such strategies may not fully use the information in input sequences to generate motifs. Such motifs often form good seeds and can be further improved with appropriate scoring functions and rapid optimization. Results: We report a tool named discriminative motif optimizer (DiMO). DiMO takes a seed motif along with a positive and a negative database and improves the motif based on a discriminative strategy. We use area under receiver-operating characteristic curve (AUC) as a measure of discriminating power of motifs and a strategy based on perceptron training that maximizes AUC rapidly in a discriminative manner. Using DiMO, on a large test set of 87 TFs from human, drosophila and yeast, we show that it is possible to significantly improve motifs identified by nine motif finders. The motifs are generated/optimized using training sets and evaluated on test sets. The AUC is improved for almost 90% of the TFs on test sets and the magnitude of increase is up to 39%. Availability and implementation: DiMO is available at http://stormo.wustl.edu/DiMO Contact: rpatel@genetics.wustl.edu, ronakypatel@gmail.com PMID:24369152
2012-01-01
Background To discover a compound inhibiting multiple proteins (i.e. polypharmacological targets) is a new paradigm for the complex diseases (e.g. cancers and diabetes). In general, the polypharmacological proteins often share similar local binding environments and motifs. As the exponential growth of the number of protein structures, to find the similar structural binding motifs (pharma-motifs) is an emergency task for drug discovery (e.g. side effects and new uses for old drugs) and protein functions. Results We have developed a Space-Related Pharmamotifs (called SRPmotif) method to recognize the binding motifs by searching against protein structure database. SRPmotif is able to recognize conserved binding environments containing spatially discontinuous pharma-motifs which are often short conserved peptides with specific physico-chemical properties for protein functions. Among 356 pharma-motifs, 56.5% interacting residues are highly conserved. Experimental results indicate that 81.1% and 92.7% polypharmacological targets of each protein-ligand complex are annotated with same biological process (BP) and molecular function (MF) terms, respectively, based on Gene Ontology (GO). Our experimental results show that the identified pharma-motifs often consist of key residues in functional (active) sites and play the key roles for protein functions. The SRPmotif is available at http://gemdock.life.nctu.edu.tw/SRP/. Conclusions SRPmotif is able to identify similar pharma-interfaces and pharma-motifs sharing similar binding environments for polypharmacological targets by rapidly searching against the protein structure database. Pharma-motifs describe the conservations of binding environments for drug discovery and protein functions. Additionally, these pharma-motifs provide the clues for discovering new sequence-based motifs to predict protein functions from protein sequence databases. We believe that SRPmotif is useful for elucidating protein functions and drug discovery. PMID:23281852
Chiu, Yi-Yuan; Lin, Chun-Yu; Lin, Chih-Ta; Hsu, Kai-Cheng; Chang, Li-Zen; Yang, Jinn-Moon
2012-01-01
To discover a compound inhibiting multiple proteins (i.e. polypharmacological targets) is a new paradigm for the complex diseases (e.g. cancers and diabetes). In general, the polypharmacological proteins often share similar local binding environments and motifs. As the exponential growth of the number of protein structures, to find the similar structural binding motifs (pharma-motifs) is an emergency task for drug discovery (e.g. side effects and new uses for old drugs) and protein functions. We have developed a Space-Related Pharmamotifs (called SRPmotif) method to recognize the binding motifs by searching against protein structure database. SRPmotif is able to recognize conserved binding environments containing spatially discontinuous pharma-motifs which are often short conserved peptides with specific physico-chemical properties for protein functions. Among 356 pharma-motifs, 56.5% interacting residues are highly conserved. Experimental results indicate that 81.1% and 92.7% polypharmacological targets of each protein-ligand complex are annotated with same biological process (BP) and molecular function (MF) terms, respectively, based on Gene Ontology (GO). Our experimental results show that the identified pharma-motifs often consist of key residues in functional (active) sites and play the key roles for protein functions. The SRPmotif is available at http://gemdock.life.nctu.edu.tw/SRP/. SRPmotif is able to identify similar pharma-interfaces and pharma-motifs sharing similar binding environments for polypharmacological targets by rapidly searching against the protein structure database. Pharma-motifs describe the conservations of binding environments for drug discovery and protein functions. Additionally, these pharma-motifs provide the clues for discovering new sequence-based motifs to predict protein functions from protein sequence databases. We believe that SRPmotif is useful for elucidating protein functions and drug discovery.
An exact analysis of a rectangular plate piezoelectric generator.
Yang, Jiashi; Chen, Ziguang; Hu, Yuantai
2007-01-01
We study thickness-twist vibration of a finite, piezoelectric plate of polarized ceramics or 6-mm crystals driven by surface mechanical loads. An exact solution from the three-dimensional equations of piezoelectricity is obtained. The plate is properly electroded and connected to a circuit such that an electric output is generated. The structure analyzed represents a piezoelectric generator for converting mechanical energy to electrical energy. Analytical expressions for the output voltage, current, power, efficiency, and power density are given. The basic behaviors of the generator are shown by numerical results.
SARNAclust: Semi-automatic detection of RNA protein binding motifs from immunoprecipitation data
Dotu, Ivan; Adamson, Scott I.; Coleman, Benjamin; Fournier, Cyril; Ricart-Altimiras, Emma; Eyras, Eduardo
2018-01-01
RNA-protein binding is critical to gene regulation, controlling fundamental processes including splicing, translation, localization and stability, and aberrant RNA-protein interactions are known to play a role in a wide variety of diseases. However, molecular understanding of RNA-protein interactions remains limited; in particular, identification of RNA motifs that bind proteins has long been challenging, especially when such motifs depend on both sequence and structure. Moreover, although RNA binding proteins (RBPs) often contain more than one binding domain, algorithms capable of identifying more than one binding motif simultaneously have not been developed. In this paper we present a novel pipeline to determine binding peaks in crosslinking immunoprecipitation (CLIP) data, to discover multiple possible RNA sequence/structure motifs among them, and to experimentally validate such motifs. At the core is a new semi-automatic algorithm SARNAclust, the first unsupervised method to identify and deconvolve multiple sequence/structure motifs simultaneously. SARNAclust computes similarity between sequence/structure objects using a graph kernel, providing the ability to isolate the impact of specific features through the bulge graph formalism. Application of SARNAclust to synthetic data shows its capability of clustering 5 motifs at once with a V-measure value of over 0.95, while GraphClust achieves only a V-measure of 0.083 and RNAcontext cannot detect any of the motifs. When applied to existing eCLIP sets, SARNAclust finds known motifs for SLBP and HNRNPC and novel motifs for several other RBPs such as AGGF1, AKAP8L and ILF3. We demonstrate an experimental validation protocol, a targeted Bind-n-Seq-like high-throughput sequencing approach that relies on RNA inverse folding for oligo pool design, that can validate the components within the SLBP motif. Finally, we use this protocol to experimentally interrogate the SARNAclust motif predictions for protein ILF3. Our results support a newly identified partially double-stranded UUUUUGAGA motif similar to that known for the splicing factor HNRNPC. PMID:29596423
Jaeger, Sébastien; Thieffry, Denis
2017-01-01
Abstract Transcription factor (TF) databases contain multitudes of binding motifs (TFBMs) from various sources, from which non-redundant collections are derived by manual curation. The advent of high-throughput methods stimulated the production of novel collections with increasing numbers of motifs. Meta-databases, built by merging these collections, contain redundant versions, because available tools are not suited to automatically identify and explore biologically relevant clusters among thousands of motifs. Motif discovery from genome-scale data sets (e.g. ChIP-seq) also produces redundant motifs, hampering the interpretation of results. We present matrix-clustering, a versatile tool that clusters similar TFBMs into multiple trees, and automatically creates non-redundant TFBM collections. A feature unique to matrix-clustering is its dynamic visualisation of aligned TFBMs, and its capability to simultaneously treat multiple collections from various sources. We demonstrate that matrix-clustering considerably simplifies the interpretation of combined results from multiple motif discovery tools, and highlights biologically relevant variations of similar motifs. We also ran a large-scale application to cluster ∼11 000 motifs from 24 entire databases, showing that matrix-clustering correctly groups motifs belonging to the same TF families, and drastically reduced motif redundancy. matrix-clustering is integrated within the RSAT suite (http://rsat.eu/), accessible through a user-friendly web interface or command-line for its integration in pipelines. PMID:28591841
Discovery of phosphorylation motif mixtures in phosphoproteomics data
Ritz, Anna; Shakhnarovich, Gregory; Salomon, Arthur R.; Raphael, Benjamin J.
2009-01-01
Motivation: Modification of proteins via phosphorylation is a primary mechanism for signal transduction in cells. Phosphorylation sites on proteins are determined in part through particular patterns, or motifs, present in the amino acid sequence. Results: We describe an algorithm that simultaneously discovers multiple motifs in a set of peptides that were phosphorylated by several different kinases. Such sets of peptides are routinely produced in proteomics experiments.Our motif-finding algorithm uses the principle of minimum description length to determine a mixture of sequence motifs that distinguish a foreground set of phosphopeptides from a background set of unphosphorylated peptides. We show that our algorithm outperforms existing motif-finding algorithms on synthetic datasets consisting of mixtures of known phosphorylation sites. We also derive a motif specificity score that quantifies whether or not the phosphoproteins containing an instance of a motif have a significant number of known interactions. Application of our motif-finding algorithm to recently published human and mouse proteomic studies recovers several known phosphorylation motifs and reveals a number of novel motifs that are enriched for interactions with a particular kinase or phosphatase. Our tools provide a new approach for uncovering the sequence specificities of uncharacterized kinases or phosphatases. Availability: Software is available at http:/cs.brown.edu/people/braphael/software.html. Contact: aritz@cs.brown.edu; braphael@cs.brown.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:18996944
Motif discovery and motif finding from genome-mapped DNase footprint data.
Kulakovskiy, Ivan V; Favorov, Alexander V; Makeev, Vsevolod J
2009-09-15
Footprint data is an important source of information on transcription factor recognition motifs. However, a footprinting fragment can contain no sequences similar to known protein recognition sites. Inspection of genome fragments nearby can help to identify missing site positions. Genome fragments containing footprints were supplied to a pipeline that constructed a position weight matrix (PWM) for different motif lengths and selected the optimal PWM. Fragments were aligned with the SeSiMCMC sampler and a new heuristic algorithm, Bigfoot. Footprints with missing hits were found for approximately 50% of factors. Adding only 2 bp on both sides of a footprinting fragment recovered most hits. We automatically constructed motifs for 41 Drosophila factors. New motifs can recognize footprints with a greater sensitivity at the same false positive rate than existing models. Also we discuss possible overfitting of constructed motifs. Software and the collection of regulatory motifs are freely available at http://line.imb.ac.ru/DMMPMM.
Gorochowski, Thomas E; Grierson, Claire S; di Bernardo, Mario
2018-03-01
Network motifs are significantly overrepresented subgraphs that have been proposed as building blocks for natural and engineered networks. Detailed functional analysis has been performed for many types of motif in isolation, but less is known about how motifs work together to perform complex tasks. To address this issue, we measure the aggregation of network motifs via methods that extract precisely how these structures are connected. Applying this approach to a broad spectrum of networked systems and focusing on the widespread feed-forward loop motif, we uncover striking differences in motif organization. The types of connection are often highly constrained, differ between domains, and clearly capture architectural principles. We show how this information can be used to effectively predict functionally important nodes in the metabolic network of Escherichia coli . Our findings have implications for understanding how networked systems are constructed from motif parts and elucidate constraints that guide their evolution.
Grierson, Claire S.
2018-01-01
Network motifs are significantly overrepresented subgraphs that have been proposed as building blocks for natural and engineered networks. Detailed functional analysis has been performed for many types of motif in isolation, but less is known about how motifs work together to perform complex tasks. To address this issue, we measure the aggregation of network motifs via methods that extract precisely how these structures are connected. Applying this approach to a broad spectrum of networked systems and focusing on the widespread feed-forward loop motif, we uncover striking differences in motif organization. The types of connection are often highly constrained, differ between domains, and clearly capture architectural principles. We show how this information can be used to effectively predict functionally important nodes in the metabolic network of Escherichia coli. Our findings have implications for understanding how networked systems are constructed from motif parts and elucidate constraints that guide their evolution. PMID:29670941
Fauteux, François; Strömvik, Martina V
2009-01-01
Background Accurate computational identification of cis-regulatory motifs is difficult, particularly in eukaryotic promoters, which typically contain multiple short and degenerate DNA sequences bound by several interacting factors. Enrichment in combinations of rare motifs in the promoter sequence of functionally or evolutionarily related genes among several species is an indicator of conserved transcriptional regulatory mechanisms. This provides a basis for the computational identification of cis-regulatory motifs. Results We have used a discriminative seeding DNA motif discovery algorithm for an in-depth analysis of 54 seed storage protein (SSP) gene promoters from three plant families, namely Brassicaceae (mustards), Fabaceae (legumes) and Poaceae (grasses) using backgrounds based on complete sets of promoters from a representative species in each family, namely Arabidopsis (Arabidopsis thaliana (L.) Heynh.), soybean (Glycine max (L.) Merr.) and rice (Oryza sativa L.) respectively. We have identified three conserved motifs (two RY-like and one ACGT-like) in Brassicaceae and Fabaceae SSP gene promoters that are similar to experimentally characterized seed-specific cis-regulatory elements. Fabaceae SSP gene promoter sequences are also enriched in a novel, seed-specific E2Fb-like motif. Conserved motifs identified in Poaceae SSP gene promoters include a GCN4-like motif, two prolamin-box-like motifs and an Skn-1-like motif. Evidence of the presence of a variant of the TATA-box is found in the SSP gene promoters from the three plant families. Motifs discovered in SSP gene promoters were used to score whole-genome sets of promoters from Arabidopsis, soybean and rice. The highest-scoring promoters are associated with genes coding for different subunits or precursors of seed storage proteins. Conclusion Seed storage protein gene promoter motifs are conserved in diverse species, and different plant families are characterized by a distinct combination of conserved motifs. The majority of discovered motifs match experimentally characterized cis-regulatory elements. These results provide a good starting point for further experimental analysis of plant seed-specific promoters and our methodology can be used to unravel more transcriptional regulatory mechanisms in plants and other eukaryotes. PMID:19843335
Yang, Peng; Wu, Min; Guo, Jing; Kwoh, Chee Keong; Przytycka, Teresa M; Zheng, Jie
2014-02-17
As a fundamental genomic element, meiotic recombination hotspot plays important roles in life sciences. Thus uncovering its regulatory mechanisms has broad impact on biomedical research. Despite the recent identification of the zinc finger protein PRDM9 and its 13-mer binding motif as major regulators for meiotic recombination hotspots, other regulators remain to be discovered. Existing methods for finding DNA sequence motifs of recombination hotspots often rely on the enrichment of co-localizations between hotspots and short DNA patterns, which ignore the cross-individual variation of recombination rates and sequence polymorphisms in the population. Our objective in this paper is to capture signals encoded in genetic variations for the discovery of recombination-associated DNA motifs. Recently, an algorithm called "LDsplit" has been designed to detect the association between single nucleotide polymorphisms (SNPs) and proximal meiotic recombination hotspots. The association is measured by the difference of population recombination rates at a hotspot between two alleles of a candidate SNP. Here we present an open source software tool of LDsplit, with integrative data visualization for recombination hotspots and their proximal SNPs. Applying LDsplit on SNPs inside an established 7-mer motif bound by PRDM9 we observed that SNP alleles preserving the original motif tend to have higher recombination rates than the opposite alleles that disrupt the motif. Running on SNP windows around hotspots each containing an occurrence of the 7-mer motif, LDsplit is able to guide the established motif finding algorithm of MEME to recover the 7-mer motif. In contrast, without LDsplit the 7-mer motif could not be identified. LDsplit is a software tool for the discovery of cis-regulatory DNA sequence motifs stimulating meiotic recombination hotspots by screening and narrowing down to hotspot associated SNPs. It is the first computational method that utilizes the genetic variation of recombination hotspots among individuals, opening a new avenue for motif finding. Tested on an established motif and simulated datasets, LDsplit shows promise to discover novel DNA motifs for meiotic recombination hotspots.
2014-01-01
Background As a fundamental genomic element, meiotic recombination hotspot plays important roles in life sciences. Thus uncovering its regulatory mechanisms has broad impact on biomedical research. Despite the recent identification of the zinc finger protein PRDM9 and its 13-mer binding motif as major regulators for meiotic recombination hotspots, other regulators remain to be discovered. Existing methods for finding DNA sequence motifs of recombination hotspots often rely on the enrichment of co-localizations between hotspots and short DNA patterns, which ignore the cross-individual variation of recombination rates and sequence polymorphisms in the population. Our objective in this paper is to capture signals encoded in genetic variations for the discovery of recombination-associated DNA motifs. Results Recently, an algorithm called “LDsplit” has been designed to detect the association between single nucleotide polymorphisms (SNPs) and proximal meiotic recombination hotspots. The association is measured by the difference of population recombination rates at a hotspot between two alleles of a candidate SNP. Here we present an open source software tool of LDsplit, with integrative data visualization for recombination hotspots and their proximal SNPs. Applying LDsplit on SNPs inside an established 7-mer motif bound by PRDM9 we observed that SNP alleles preserving the original motif tend to have higher recombination rates than the opposite alleles that disrupt the motif. Running on SNP windows around hotspots each containing an occurrence of the 7-mer motif, LDsplit is able to guide the established motif finding algorithm of MEME to recover the 7-mer motif. In contrast, without LDsplit the 7-mer motif could not be identified. Conclusions LDsplit is a software tool for the discovery of cis-regulatory DNA sequence motifs stimulating meiotic recombination hotspots by screening and narrowing down to hotspot associated SNPs. It is the first computational method that utilizes the genetic variation of recombination hotspots among individuals, opening a new avenue for motif finding. Tested on an established motif and simulated datasets, LDsplit shows promise to discover novel DNA motifs for meiotic recombination hotspots. PMID:24533858
Punetha, Ankita; Shanmugam, Karthi; Sundar, Durai
2011-04-01
Aromatase is an important pharmacological target in the anti-cancer therapy as the intratumoral aromatase is the source of local estrogen production in breast cancer tissues. Suppression of estrogen biosynthesis by aromatase inhibition represents an effective approach for the treatment of hormone-sensitive breast cancer. Because of the membrane-bound character and heme-binding instability, no crystal structure of aromatase was reported for a long time, until recently when crystal structure of human placental aromatase cytochrome P450 in complex with androstenedione was deposited in PDB. The present study is towards understanding the structural and functional characteristics of aromatase to address unsolved mysteries about this enzyme and elucidate the exact mode of binding of aromatase inhibitors. We have performed molecular docking simulation with twelve different inhibitors (ligands), which includes four FDA approved drugs; two flavonoids; three herbal compounds and three compounds having biphenyl motif with known IC(50) values into the active site of the human aromatase enzyme. All ligands showed favorable interactions and most of them seemed to interact to hydrophobic amino acids Ile133, Phe134, Phe221, Trp224, Ala306, Val370, Val373, Met374 and Leu477 and hydrophilic Arg115 and neutral Thr310 residues. The elucidation of the actual structure-function relationship of aromatase and the exact binding mode described in this study will be of significant interest as its inhibitors have shown great promise in fighting breast cancer.
DNA motif alignment by evolving a population of Markov chains.
Bi, Chengpeng
2009-01-30
Deciphering cis-regulatory elements or de novo motif-finding in genomes still remains elusive although much algorithmic effort has been expended. The Markov chain Monte Carlo (MCMC) method such as Gibbs motif samplers has been widely employed to solve the de novo motif-finding problem through sequence local alignment. Nonetheless, the MCMC-based motif samplers still suffer from local maxima like EM. Therefore, as a prerequisite for finding good local alignments, these motif algorithms are often independently run a multitude of times, but without information exchange between different chains. Hence it would be worth a new algorithm design enabling such information exchange. This paper presents a novel motif-finding algorithm by evolving a population of Markov chains with information exchange (PMC), each of which is initialized as a random alignment and run by the Metropolis-Hastings sampler (MHS). It is progressively updated through a series of local alignments stochastically sampled. Explicitly, the PMC motif algorithm performs stochastic sampling as specified by a population-based proposal distribution rather than individual ones, and adaptively evolves the population as a whole towards a global maximum. The alignment information exchange is accomplished by taking advantage of the pooled motif site distributions. A distinct method for running multiple independent Markov chains (IMC) without information exchange, or dubbed as the IMC motif algorithm, is also devised to compare with its PMC counterpart. Experimental studies demonstrate that the performance could be improved if pooled information were used to run a population of motif samplers. The new PMC algorithm was able to improve the convergence and outperformed other popular algorithms tested using simulated and biological motif sequences.
Liu, Yanbin; Koh, Chong Mei John; Ngoh, Si Te; Ji, Lianghui
2015-10-26
Rhodosporidium and Rhodotorula are two genera of oleaginous red yeast with great potential for industrial biotechnology. To date, there is no effective method for inducible expression of proteins and RNAs in these hosts. We have developed a luciferase gene reporter assay based on a new codon-optimized LUC2 reporter gene (RtLUC2), which is flanked with CAR2 homology arms and can be integrated into the CAR2 locus in the nuclear genome at >90 % efficiency. We characterized the upstream DNA sequence of a D-amino acid oxidase gene (DAO1) from R. toruloides ATCC 10657 by nested deletions. By comparing the upstream DNA sequences of several putative DAO1 homologs of Basidiomycetous fungi, we identified a conserved DNA motif with a consensus sequence of AGGXXGXAGX11GAXGAXGG within a 0.2 kb region from the mRNA translation initiation site. Deletion of this motif led to strong mRNA transcription under non-inducing conditions. Interestingly, DAO1 promoter activity was enhanced about fivefold when the 108 bp intron 1 was included in the reporter construct. We identified a conserved CT-rich motif in the intron with a consensus sequence of TYTCCCYCTCCYCCCCACWYCCGA, deletion or point mutations of which drastically reduced promoter strength under both inducing and non-inducing conditions. Additionally, we created a selection marker-free DAO1-null mutant (∆dao1e) which displayed greatly improved inducible gene expression, particularly when both glucose and nitrogen were present in high levels. To avoid adding unwanted peptide to proteins to be expressed, we converted the original translation initiation codon to ATC and re-created a translation initiation codon at the start of exon 2. This promoter, named P DAO1-in1m1 , showed very similar luciferase activity to the wild-type promoter upon induction with D-alanine. The inducible system was tunable by adjusting the levels of inducers, carbon source and nitrogen source. The intron 1-containing DAO1 promoters coupled with a DAO1 null mutant makes an efficient and tight D-amino acid-inducible gene expression system in Rhodosporidium and Rhodotorula genera. The system will be a valuable tool for metabolic engineering and enzyme expression in these yeast hosts.
SA-Mot: a web server for the identification of motifs of interest extracted from protein loops
Regad, Leslie; Saladin, Adrien; Maupetit, Julien; Geneix, Colette; Camproux, Anne-Claude
2011-01-01
The detection of functional motifs is an important step for the determination of protein functions. We present here a new web server SA-Mot (Structural Alphabet Motif) for the extraction and location of structural motifs of interest from protein loops. Contrary to other methods, SA-Mot does not focus only on functional motifs, but it extracts recurrent and conserved structural motifs involved in structural redundancy of loops. SA-Mot uses the structural word notion to extract all structural motifs from uni-dimensional sequences corresponding to loop structures. Then, SA-Mot provides a description of these structural motifs using statistics computed in the loop data set and in SCOP superfamily, sequence and structural parameters. SA-Mot results correspond to an interactive table listing all structural motifs extracted from a target structure and their associated descriptors. Using this information, the users can easily locate loop regions that are important for the protein folding and function. The SA-Mot web server is available at http://sa-mot.mti.univ-paris-diderot.fr. PMID:21665924
SA-Mot: a web server for the identification of motifs of interest extracted from protein loops.
Regad, Leslie; Saladin, Adrien; Maupetit, Julien; Geneix, Colette; Camproux, Anne-Claude
2011-07-01
The detection of functional motifs is an important step for the determination of protein functions. We present here a new web server SA-Mot (Structural Alphabet Motif) for the extraction and location of structural motifs of interest from protein loops. Contrary to other methods, SA-Mot does not focus only on functional motifs, but it extracts recurrent and conserved structural motifs involved in structural redundancy of loops. SA-Mot uses the structural word notion to extract all structural motifs from uni-dimensional sequences corresponding to loop structures. Then, SA-Mot provides a description of these structural motifs using statistics computed in the loop data set and in SCOP superfamily, sequence and structural parameters. SA-Mot results correspond to an interactive table listing all structural motifs extracted from a target structure and their associated descriptors. Using this information, the users can easily locate loop regions that are important for the protein folding and function. The SA-Mot web server is available at http://sa-mot.mti.univ-paris-diderot.fr.
Liseron-Monfils, Christophe; Lewis, Tim; Ashlock, Daniel; McNicholas, Paul D; Fauteux, François; Strömvik, Martina; Raizada, Manish N
2013-03-15
The discovery of genetic networks and cis-acting DNA motifs underlying their regulation is a major objective of transcriptome studies. The recent release of the maize genome (Zea mays L.) has facilitated in silico searches for regulatory motifs. Several algorithms exist to predict cis-acting elements, but none have been adapted for maize. A benchmark data set was used to evaluate the accuracy of three motif discovery programs: BioProspector, Weeder and MEME. Analysis showed that each motif discovery tool had limited accuracy and appeared to retrieve a distinct set of motifs. Therefore, using the benchmark, statistical filters were optimized to reduce the false discovery ratio, and then remaining motifs from all programs were combined to improve motif prediction. These principles were integrated into a user-friendly pipeline for motif discovery in maize called Promzea, available at http://www.promzea.org and on the Discovery Environment of the iPlant Collaborative website. Promzea was subsequently expanded to include rice and Arabidopsis. Within Promzea, a user enters cDNA sequences or gene IDs; corresponding upstream sequences are retrieved from the maize genome. Predicted motifs are filtered, combined and ranked. Promzea searches the chosen plant genome for genes containing each candidate motif, providing the user with the gene list and corresponding gene annotations. Promzea was validated in silico using a benchmark data set: the Promzea pipeline showed a 22% increase in nucleotide sensitivity compared to the best standalone program tool, Weeder, with equivalent nucleotide specificity. Promzea was also validated by its ability to retrieve the experimentally defined binding sites of transcription factors that regulate the maize anthocyanin and phlobaphene biosynthetic pathways. Promzea predicted additional promoter motifs, and genome-wide motif searches by Promzea identified 127 non-anthocyanin/phlobaphene genes that each contained all five predicted promoter motifs in their promoters, perhaps uncovering a broader co-regulated gene network. Promzea was also tested against tissue-specific microarray data from maize. An online tool customized for promoter motif discovery in plants has been generated called Promzea. Promzea was validated in silico by its ability to retrieve benchmark motifs and experimentally defined motifs and was tested using tissue-specific microarray data. Promzea predicted broader networks of gene regulation associated with the historic anthocyanin and phlobaphene biosynthetic pathways. Promzea is a new bioinformatics tool for understanding transcriptional gene regulation in maize and has been expanded to include rice and Arabidopsis.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cai, Zhao; Zhou, Daojin; Wang, Maoyu
Exploring materials with regulated local structures and understanding how the atomic motifs govern the reactivity and durability of catalysts are a critical challenge for designing advanced catalysts. Here we report the tuning of the local atomic structure of nickel–iron layered double hydroxides (NiFe–LDHs) by partially substituting Ni 2+ with Fe 2+ to introduce Fe–O–Fe moieties. These Fe 2+–containing NiFe–LDHs exhibit enhanced oxygen evolution reaction (OER) activity with an ultralow overpotential of 195 mV at the current density of 10 mA/cm 2, which is among the best OER catalytic performance reported to date. In–situ X–ray absorption, Raman, and electrochemical analysis jointlymore » reveal that the Fe–O–Fe motifs could stabilize high–valent metal sites at low overpotentials, thereby enhancing the OER activity. Lastly, these results reveal the importance of tuning the local atomic structure for designing high efficiency electrocatalysts.« less
Schmidtgall, Boris; Höbartner, Claudia; Ducho, Christian
2015-01-01
Modifications of the nucleic acid backbone are essential for the development of oligonucleotide-derived bioactive agents. The NAA-modification represents a novel artificial internucleotide linkage which enables the site-specific introduction of positive charges into the otherwise polyanionic backbone of DNA oligonucleotides. Following initial studies with the introduction of the NAA-linkage at T-T sites, it is now envisioned to prepare NAA-modified oligonucleotides bearing the modification at X-T motifs (X = A, C, G). We have therefore developed the efficient and stereoselective synthesis of NAA-linked 'dimeric' A-T phosphoramidite building blocks for automated DNA synthesis. Both the (S)- and the (R)-configured NAA-motifs were constructed with high diastereoselectivities to furnish two different phosphoramidite reagents, which were employed for the solid phase-supported automated synthesis of two NAA-modified DNA oligonucleotides. This represents a significant step to further establish the NAA-linkage as a useful addition to the existing 'toolbox' of backbone modifications for the design of bioactive oligonucleotide analogues.
The corepressor CtBP interacts with Evi-1 to repress transforming growth factor beta signaling.
Izutsu, K; Kurokawa, M; Imai, Y; Maki, K; Mitani, K; Hirai, H
2001-05-01
Evi-1 is a zinc finger nuclear protein whose inappropriate expression leads to leukemic transformation of hematopoietic cells in mice and humans. This was previously shown to block the antiproliferative effect of transforming growth factor beta (TGF-beta). Evi-1 represses TGF-beta signaling by direct interaction with Smad3 through its first zinc finger motif. Here, it is demonstrated that Evi-1 represses Smad-induced transcription by recruiting C-terminal binding protein (CtBP) as a corepressor. Evi-1 associates with CtBP1 through one of the consensus binding motifs, and this association is required for efficient inhibition of TGF-beta signaling. A specific inhibitor for histone deacetylase (HDAc) alleviates Evi-1-mediated repression of TGF-beta signaling, suggesting that HDAc is involved in the transcriptional repression by Evi-1. This identifies a novel function of Evi-1 as a member of corepressor complexes and suggests that aberrant recruitment of corepressors is one of the mechanisms for Evi-1-induced leukemogenesis.
Template-constrained macrocyclic peptides prepared from native, unprotected precursors
Lawson, Kenneth V.; Rose, Tristan E.; Harran, Patrick G.
2013-01-01
Peptide–protein interactions are important mediators of cellular-signaling events. Consensus binding motifs (also known as short linear motifs) within these contacts underpin molecular recognition, yet have poor pharmacological properties as discrete species. Here, we present methods to transform intact peptides into stable, templated macrocycles. Two simple steps install the template. The key reaction is a palladium-catalyzed macrocyclization. The catalysis has broad scope and efficiently forms large rings by engaging native peptide functionality including phenols, imidazoles, amines, and carboxylic acids without the necessity of protecting groups. The tunable reactivity of the template gives the process special utility. Defined changes in reaction conditions markedly alter chemoselectivity. In all cases examined, cyclization occurs rapidly and in high yield at room temperature, regardless of peptide composition or chain length. We show that conformational restraints imparted by the template stabilize secondary structure and enhance proteolytic stability in vitro. Palladium-catalyzed internal cinnamylation is a strong complement to existing methods for peptide modification. PMID:24043790
Willwand, Kurt; Moroianu, Adela; Hörlein, Rita; Stremmel, Wolfgang; Rommelaere, Jean
2002-07-01
The linear single-stranded DNA genome of minute virus of mice (MVM) is replicated via a double-stranded replicative form (RF) intermediate DNA. Amplification of viral RF DNA requires the structural transition of the right-end palindrome from a linear duplex into a double-hairpin structure, which serves for the repriming of unidirectional DNA synthesis. This conformational transition was found previously to be induced by the MVM nonstructural protein NS1. Elimination of the cognate NS1-binding sites, [ACCA](2), from the central region of the right-end palindrome next to the axis of symmetry was shown to markedly reduce the efficiency of hairpin-primed DNA replication, as measured in a reconstituted in vitro replication system. Thus, [ACCA](2) sequence motifs are essential as NS1-binding elements in the context of the structural transition of the right-end MVM palindrome.
Pilla, Kala Bharath; Otting, Gottfried; Huber, Thomas
2017-03-07
Computational and nuclear magnetic resonance hybrid approaches provide efficient tools for 3D structure determination of small proteins, but currently available algorithms struggle to perform with larger proteins. Here we demonstrate a new computational algorithm that assembles the 3D structure of a protein from its constituent super-secondary structural motifs (Smotifs) with the help of pseudocontact shift (PCS) restraints for backbone amide protons, where the PCSs are produced from different metal centers. The algorithm, DINGO-PCS (3D assembly of Individual Smotifs to Near-native Geometry as Orchestrated by PCSs), employs the PCSs to recognize, orient, and assemble the constituent Smotifs of the target protein without any other experimental data or computational force fields. Using a universal Smotif database, the DINGO-PCS algorithm exhaustively enumerates any given Smotif. We benchmarked the program against ten different protein targets ranging from 100 to 220 residues with different topologies. For nine of these targets, the method was able to identify near-native Smotifs. Copyright © 2017 Elsevier Ltd. All rights reserved.
Mechanism of foreign DNA selection in a bacterial adaptive immune system
Sashital, Dipali G.; Wiedenheft, Blake; Doudna, Jennifer A.
2012-01-01
Summary In bacterial and archaeal CRISPR immune pathways, DNA sequences from invading bacteriophage or plasmids are integrated into CRISPR loci within the host genome, conferring immunity against subsequent infections. The ribonucleoprotein complex Cascade utilizes RNAs generated from these loci to target complementary “non-self” DNA sequences for destruction, while avoiding binding to “self” sequences within the CRISPR locus. Here we show that CasA, the largest protein subunit of Cascade, is required for non-self target recognition and binding. Combining a 2.3 Å crystal structure of CasA with cryo-EM structures of Cascade, we have identified a loop that is required for viral defense. This loop contacts a conserved 3-base pair motif that is required for non-self target selection. Our data suggest a model in which the CasA loop scans DNA for this short motif prior to target destabilization and binding, maximizing the efficiency of DNA surveillance by Cascade. PMID:22521690
Freimuth, P; Anderson, C W
1993-03-01
The sequence of a 1158-base pair fragment of the human adenovirus serotype 12 (Ad12) genome was determined. This segment encodes the precursors for virion components Mu and VI. Both Ad12 precursors contain two sequences that conform to a consensus sequence motif for cleavage by the endoproteinase of adenovirus 2 (Ad2). Analysis of the amino terminus of VI and of the peptide fragments found in Ad12 virions demonstrated that these sites are cleaved during Ad12 maturation. This observation suggests that the recognition motif for adenovirus endoproteinases is highly conserved among human serotypes. The adenovirus 2 endoproteinase polypeptide requires additional co-factors for activity (C. W. Anderson, Protein Expression Purif., 1993, 4, 8-15). Synthetic Ad12 or Ad2 pVI carboxy-terminal peptides each permitted efficient cleavage of an artificial endoproteinase substrate by recombinant Ad2 endoproteinase polypeptide.
Mammalian Fe-S proteins: definition of a consensus motif recognized by the co-chaperone HSC20
Maio, N.; Rouault, T. A.
2017-01-01
Iron-sulfur (Fe-S) clusters are inorganic cofactors that are fundamental to several biological processes in all three kingdoms of life. In most organisms, Fe-S clusters are initially assembled on a scaffold protein, ISCU, and subsequently transferred to target proteins or to intermediate carriers by a dedicated chaperone/co-chaperone system. The delivery of assembled Fe-S clusters to recipient proteins is a crucial step in the biogenesis of Fe-S proteins, and, in mammals, it relies on the activity of a multiprotein transfer complex that contains the chaperone HSPA9, the co-chaperone HSC20 and the scaffold ISCU. How the transfer complex efficiently engages recipient Fe-S target proteins involves specific protein interactions that are not fully understood. This mini review focuses on recent insights into the molecular mechanism of amino acid motif recognition and discrimination by the co-chaperone HSC20, which guides Fe-S cluster delivery. PMID:27714045
Exaptive origins of regulated mRNA decay in eukaryotes.
Hamid, Fursham M; Makeyev, Eugene V
2016-09-01
Eukaryotic gene expression is extensively controlled at the level of mRNA stability and the mechanisms underlying this regulation are markedly different from their archaeal and bacterial counterparts. We propose that two such mechanisms, nonsense-mediated decay (NMD) and motif-specific transcript destabilization by CCCH-type zinc finger RNA-binding proteins, originated as a part of cellular defense against RNA pathogens. These branches of the mRNA turnover pathway might have been used by primeval eukaryotes alongside RNA interference to distinguish their own messages from those of RNA viruses and retrotransposable elements. We further hypothesize that the subsequent advent of "professional" innate and adaptive immunity systems allowed NMD and the motif-triggered mechanisms to be efficiently repurposed for regulation of endogenous cellular transcripts. This scenario explains the rapid emergence of archetypical mRNA destabilization pathways in eukaryotes and argues that other aspects of post-transcriptional gene regulation in this lineage might have been derived through a similar exaptation route. © 2016 The Authors BioEssays Published by WILEY Periodicals, Inc.
HIV-1 nucleocapsid protein localizes efficiently to the nucleus and nucleolus
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yu, Kyung Lee; Lee, Sun Hee; Lee, Eun Soo
The HIV-1 nucleocapsid (NC) is an essential viral protein containing two highly conserved retroviral-type zinc finger (ZF) motifs, which functions in multiple stages of the HIV-1 life cycle. Although a number of functions for NC either in its mature form or as a domain of Gag have been revealed, little is known about the intracellular localization of NC and, moreover, its role in Gag protein trafficking. Here, we have investigated various forms of HIV-1 NC protein for its cellular localization and found that the NC has a strong nuclear and nucleolar localization activity. The linker region, composed of a stretchmore » of basic amino acids between the two ZF motifs, was necessary and sufficient for the activity. - Highlights: • HIV-1 NC possess a NLS and leads to nuclear and nucleolus localization. • Mutations in basic residues between two ZFs in NC decrease the nucleus localization. • ZFs of NC affect cytoplasmic organelles localization rather than nucleus localization.« less
Cai, Zhao; Zhou, Daojin; Wang, Maoyu; Bak, Seongmin; Wu, Yueshen; Wu, Zishan; Tian, Yang; Xiong, Xuya; Li, Yaping; Liu, Wen; Siahrostami, Samira; Kuang, Yun; Yang, Xiao-Qing; Duan, Haohong; Feng, Zhenxing; Wang, Hailiang; Sun, Xiaoming
2018-06-11
Exploring materials with regulated local structures and understanding how the atomic motifs govern the reactivity and durability of catalysts are a critical challenge for designing advanced catalysts. Here we report the tuning of the local atomic structure of nickel-iron layered double hydroxides (NiFe-LDHs) by partially substituting Ni2+ with Fe2+ to introduce Fe-O-Fe moieties. These Fe2+-containing NiFe-LDHs exhibit enhanced oxygen evolution reaction (OER) activity with an ultralow overpotential of 195 mV at the current density of 10 mA/cm2, which is among the best OER catalytic performance reported to date. In-situ X-ray absorption, Raman, and electrochemical analysis jointly reveal that the Fe-O-Fe motifs could stabilize high-valent metal sites at low overpotentials, thereby enhancing the OER activity. These results reveal the importance of tuning the local atomic structure for designing high efficiency electrocatalysts. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Cai, Zhao; Zhou, Daojin; Wang, Maoyu; ...
2018-06-11
Exploring materials with regulated local structures and understanding how the atomic motifs govern the reactivity and durability of catalysts are a critical challenge for designing advanced catalysts. Here we report the tuning of the local atomic structure of nickel–iron layered double hydroxides (NiFe–LDHs) by partially substituting Ni 2+ with Fe 2+ to introduce Fe–O–Fe moieties. These Fe 2+–containing NiFe–LDHs exhibit enhanced oxygen evolution reaction (OER) activity with an ultralow overpotential of 195 mV at the current density of 10 mA/cm 2, which is among the best OER catalytic performance reported to date. In–situ X–ray absorption, Raman, and electrochemical analysis jointlymore » reveal that the Fe–O–Fe motifs could stabilize high–valent metal sites at low overpotentials, thereby enhancing the OER activity. Lastly, these results reveal the importance of tuning the local atomic structure for designing high efficiency electrocatalysts.« less
Synthetic incoherent feedforward circuits show adaptation to the amount of their genetic template
Bleris, Leonidas; Xie, Zhen; Glass, David; Adadey, Asa; Sontag, Eduardo; Benenson, Yaakov
2011-01-01
Natural and synthetic biological networks must function reliably in the face of fluctuating stoichiometry of their molecular components. These fluctuations are caused in part by changes in relative expression efficiency and the DNA template amount of the network-coding genes. Gene product levels could potentially be decoupled from these changes via built-in adaptation mechanisms, thereby boosting network reliability. Here, we show that a mechanism based on an incoherent feedforward motif enables adaptive gene expression in mammalian cells. We modeled, synthesized, and tested transcriptional and post-transcriptional incoherent loops and found that in all cases the gene product adapts to changes in DNA template abundance. We also observed that the post-transcriptional form results in superior adaptation behavior, higher absolute expression levels, and lower intrinsic fluctuations. Our results support a previously hypothesized endogenous role in gene dosage compensation for such motifs and suggest that their incorporation in synthetic networks will improve their robustness and reliability. PMID:21811230
McCune, Broc T; Tang, Wei; Lu, Jia; Eaglesham, James B; Thorne, Lucy; Mayer, Anne E; Condiff, Emily; Nice, Timothy J; Goodfellow, Ian; Krezel, Andrzej M; Virgin, Herbert W
2017-07-11
The Norovirus genus contains important human pathogens, but the role of host pathways in norovirus replication is largely unknown. Murine noroviruses provide the opportunity to study norovirus replication in cell culture and in small animals. The human norovirus nonstructural protein NS1/2 interacts with the host protein VAMP-associated protein A (VAPA), but the significance of the NS1/2-VAPA interaction is unexplored. Here we report decreased murine norovirus replication in VAPA- and VAPB-deficient cells. We characterized the role of VAPA in detail. VAPA was required for the efficiency of a step(s) in the viral replication cycle after entry of viral RNA into the cytoplasm but before the synthesis of viral minus-sense RNA. The interaction of VAPA with viral NS1/2 proteins is conserved between murine and human noroviruses. Murine norovirus NS1/2 directly bound the major sperm protein (MSP) domain of VAPA through its NS1 domain. Mutations within NS1 that disrupted interaction with VAPA inhibited viral replication. Structural analysis revealed that the viral NS1 domain contains a mimic of the phenylalanine-phenylalanine-acidic-tract (FFAT) motif that enables host proteins to bind to the VAPA MSP domain. The NS1/2-FFAT mimic region interacted with the VAPA-MSP domain in a manner similar to that seen with bona fide host FFAT motifs. Amino acids in the FFAT mimic region of the NS1 domain that are important for viral replication are highly conserved across murine norovirus strains. Thus, VAPA interaction with a norovirus protein that functionally mimics host FFAT motifs is important for murine norovirus replication. IMPORTANCE Human noroviruses are a leading cause of gastroenteritis worldwide, but host factors involved in norovirus replication are incompletely understood. Murine noroviruses have been studied to define mechanisms of norovirus replication. Here we defined the importance of the interaction between the hitherto poorly studied NS1/2 norovirus protein and the VAPA host protein. The NS1/2-VAPA interaction is conserved between murine and human noroviruses and was important for early steps in murine norovirus replication. Using structure-function analysis, we found that NS1/2 contains a short sequence that molecularly mimics the FFAT motif that is found in multiple host proteins that bind VAPA. This represents to our knowledge the first example of functionally important mimicry of a host FFAT motif by a microbial protein. Copyright © 2017 McCune et al.
D-MATRIX: A web tool for constructing weight matrix of conserved DNA motifs
Sen, Naresh; Mishra, Manoj; Khan, Feroz; Meena, Abha; Sharma, Ashok
2009-01-01
Despite considerable efforts to date, DNA motif prediction in whole genome remains a challenge for researchers. Currently the genome wide motif prediction tools required either direct pattern sequence (for single motif) or weight matrix (for multiple motifs). Although there are known motif pattern databases and tools for genome level prediction but no tool for weight matrix construction. Considering this, we developed a D-MATRIX tool which predicts the different types of weight matrix based on user defined aligned motif sequence set and motif width. For retrieval of known motif sequences user can access the commonly used databases such as TFD, RegulonDB, DBTBS, Transfac. DMATRIX program uses a simple statistical approach for weight matrix construction, which can be converted into different file formats according to user requirement. It provides the possibility to identify the conserved motifs in the coregulated genes or whole genome. As example, we successfully constructed the weight matrix of LexA transcription factor binding site with the help of known sosbox cisregulatory elements in Deinococcus radiodurans genome. The algorithm is implemented in C-Sharp and wrapped in ASP.Net to maintain a user friendly web interface. DMATRIX tool is accessible through the CIMAP domain network. Availability http://203.190.147.116/dmatrix/ PMID:19759861
D-MATRIX: a web tool for constructing weight matrix of conserved DNA motifs.
Sen, Naresh; Mishra, Manoj; Khan, Feroz; Meena, Abha; Sharma, Ashok
2009-07-27
Despite considerable efforts to date, DNA motif prediction in whole genome remains a challenge for researchers. Currently the genome wide motif prediction tools required either direct pattern sequence (for single motif) or weight matrix (for multiple motifs). Although there are known motif pattern databases and tools for genome level prediction but no tool for weight matrix construction. Considering this, we developed a D-MATRIX tool which predicts the different types of weight matrix based on user defined aligned motif sequence set and motif width. For retrieval of known motif sequences user can access the commonly used databases such as TFD, RegulonDB, DBTBS, Transfac. D-MATRIX program uses a simple statistical approach for weight matrix construction, which can be converted into different file formats according to user requirement. It provides the possibility to identify the conserved motifs in the co-regulated genes or whole genome. As example, we successfully constructed the weight matrix of LexA transcription factor binding site with the help of known sos-box cis-regulatory elements in Deinococcus radiodurans genome. The algorithm is implemented in C-Sharp and wrapped in ASP.Net to maintain a user friendly web interface. D-MATRIX tool is accessible through the CIMAP domain network. http://203.190.147.116/dmatrix/
A motif detection and classification method for peptide sequences using genetic programming.
Tomita, Yasuyuki; Kato, Ryuji; Okochi, Mina; Honda, Hiroyuki
2008-08-01
An exploration of common rules (property motifs) in amino acid sequences has been required for the design of novel sequences and elucidation of the interactions between molecules controlled by the structural or physical environment. In the present study, we developed a new method to search property motifs that are common in peptide sequence data. Our method comprises the following two characteristics: (i) the automatic determination of the position and length of common property motifs by calculating the physicochemical similarity of amino acids, and (ii) the quick and effective exploration of motif candidates that discriminates the positives and negatives by the introduction of genetic programming (GP). Our method was evaluated by two types of model data sets. First, the intentionally buried property motifs were searched in the artificially derived peptide data containing intentionally buried property motifs. As a result, the expected property motifs were correctly extracted by our algorithm. Second, the peptide data that interact with MHC class II molecules were analyzed as one of the models of biologically active peptides with buried motifs in various lengths. Twofold MHC class II binding peptides were identified with the rule using our method, compared to the existing scoring matrix method. In conclusion, our GP based motif searching approach enabled to obtain knowledge of functional aspects of the peptides without any prior knowledge.
NASA Astrophysics Data System (ADS)
Chen, Guohai; Meng, Zeng; Yang, Dixiong
2018-01-01
This paper develops an efficient method termed as PE-PIM to address the exact nonstationary responses of pavement structure, which is modeled as a rectangular thin plate resting on bi-parametric Pasternak elastic foundation subjected to stochastic moving loads with constant acceleration. Firstly, analytical power spectral density (PSD) functions of random responses for thin plate are derived by integrating pseudo excitation method (PEM) with Duhamel's integral. Based on PEM, the new equivalent von Mises stress (NEVMS) is proposed, whose PSD function contains all cross-PSD functions between stress components. Then, the PE-PIM that combines the PEM with precise integration method (PIM) is presented to achieve efficiently stochastic responses of the plate by replacing Duhamel's integral with the PIM. Moreover, the semi-analytical Monte Carlo simulation is employed to verify the computational results of the developed PE-PIM. Finally, numerical examples demonstrate the high accuracy and efficiency of PE-PIM for nonstationary random vibration analysis. The effects of velocity and acceleration of moving load, boundary conditions of the plate and foundation stiffness on the deflection and NEVMS responses are scrutinized.
Polynomial-time quantum algorithm for the simulation of chemical dynamics
Kassal, Ivan; Jordan, Stephen P.; Love, Peter J.; Mohseni, Masoud; Aspuru-Guzik, Alán
2008-01-01
The computational cost of exact methods for quantum simulation using classical computers grows exponentially with system size. As a consequence, these techniques can be applied only to small systems. By contrast, we demonstrate that quantum computers could exactly simulate chemical reactions in polynomial time. Our algorithm uses the split-operator approach and explicitly simulates all electron-nuclear and interelectronic interactions in quadratic time. Surprisingly, this treatment is not only more accurate than the Born–Oppenheimer approximation but faster and more efficient as well, for all reactions with more than about four atoms. This is the case even though the entire electronic wave function is propagated on a grid with appropriately short time steps. Although the preparation and measurement of arbitrary states on a quantum computer is inefficient, here we demonstrate how to prepare states of chemical interest efficiently. We also show how to efficiently obtain chemically relevant observables, such as state-to-state transition probabilities and thermal reaction rates. Quantum computers using these techniques could outperform current classical computers with 100 qubits. PMID:19033207
Quantum dynamical framework for Brownian heat engines
NASA Astrophysics Data System (ADS)
Agarwal, G. S.; Chaturvedi, S.
2013-07-01
We present a self-contained formalism modeled after the Brownian motion of a quantum harmonic oscillator for describing the performance of microscopic Brownian heat engines such as Carnot, Stirling, and Otto engines. Our theory, besides reproducing the standard thermodynamics results in the steady state, enables us to study the role dissipation plays in determining the efficiency of Brownian heat engines under actual laboratory conditions. In particular, we analyze in detail the dynamics associated with decoupling a system in equilibrium with one bath and recoupling it to another bath and obtain exact analytical results, which are shown to have significant ramifications on the efficiencies of engines involving such a step. We also develop a simple yet powerful technique for computing corrections to the steady state results arising from finite operation time and use it to arrive at the thermodynamic complementarity relations for various operating conditions and also to compute the efficiencies of the three engines cited above at maximum power. Some of the methods and exactly solvable models presented here are interesting in their own right and could find useful applications in other contexts as well.
NASA Astrophysics Data System (ADS)
Franzke, Yannick J.; Middendorf, Nils; Weigend, Florian
2018-03-01
We present an efficient algorithm for one- and two-component analytical energy gradients with respect to nuclear displacements in the exact two-component decoupling approach to the one-electron Dirac equation (X2C). Our approach is a generalization of the spin-free ansatz by Cheng and Gauss [J. Chem. Phys. 135, 084114 (2011)], where the perturbed one-electron Hamiltonian is calculated by solving a first-order response equation. Computational costs are drastically reduced by applying the diagonal local approximation to the unitary decoupling transformation (DLU) [D. Peng and M. Reiher, J. Chem. Phys. 136, 244108 (2012)] to the X2C Hamiltonian. The introduced error is found to be almost negligible as the mean absolute error of the optimized structures amounts to only 0.01 pm. Our implementation in TURBOMOLE is also available within the finite nucleus model based on a Gaussian charge distribution. For a X2C/DLU gradient calculation, computational effort scales cubically with the molecular size, while storage increases quadratically. The efficiency is demonstrated in calculations of large silver clusters and organometallic iridium complexes.
Visualization of Stereoselective Supramolecular Polymers by Chirality-Controlled Energy Transfer.
Sarkar, Aritra; Dhiman, Shikha; Chalishazar, Aditya; George, Subi J
2017-10-23
Chirality-driven self-sorting is envisaged to efficiently control functional properties in supramolecular materials. However, the challenge arises because of a lack of analytical methods to directly monitor the enantioselectivity of the resulting supramolecular assemblies. Presented herein are two fluorescent core-substituted naphthalene-diimide-based donor and acceptor molecules with minimal structural mismatch and they comprise strong self-recognizing chiral motifs to determine the self-sorting process. As a consequence, stereoselective supramolecular polymerization with an unprecedented chirality control over energy transfer has been achieved. This chirality-controlled energy transfer has been further exploited as an efficient probe to visualize microscopically the chirality driven self-sorting. © 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
Castro-Mondragon, Jaime Abraham; Jaeger, Sébastien; Thieffry, Denis; Thomas-Chollier, Morgane; van Helden, Jacques
2017-07-27
Transcription factor (TF) databases contain multitudes of binding motifs (TFBMs) from various sources, from which non-redundant collections are derived by manual curation. The advent of high-throughput methods stimulated the production of novel collections with increasing numbers of motifs. Meta-databases, built by merging these collections, contain redundant versions, because available tools are not suited to automatically identify and explore biologically relevant clusters among thousands of motifs. Motif discovery from genome-scale data sets (e.g. ChIP-seq) also produces redundant motifs, hampering the interpretation of results. We present matrix-clustering, a versatile tool that clusters similar TFBMs into multiple trees, and automatically creates non-redundant TFBM collections. A feature unique to matrix-clustering is its dynamic visualisation of aligned TFBMs, and its capability to simultaneously treat multiple collections from various sources. We demonstrate that matrix-clustering considerably simplifies the interpretation of combined results from multiple motif discovery tools, and highlights biologically relevant variations of similar motifs. We also ran a large-scale application to cluster ∼11 000 motifs from 24 entire databases, showing that matrix-clustering correctly groups motifs belonging to the same TF families, and drastically reduced motif redundancy. matrix-clustering is integrated within the RSAT suite (http://rsat.eu/), accessible through a user-friendly web interface or command-line for its integration in pipelines. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Ahnert, S E; Fink, T M A
2016-07-01
Network motifs have been studied extensively over the past decade, and certain motifs, such as the feed-forward loop, play an important role in regulatory networks. Recent studies have used Boolean network motifs to explore the link between form and function in gene regulatory networks and have found that the structure of a motif does not strongly determine its function, if this is defined in terms of the gene expression patterns the motif can produce. Here, we offer a different, higher-level definition of the 'function' of a motif, in terms of two fundamental properties of its dynamical state space as a Boolean network. One is the basin entropy, which is a complexity measure of the dynamics of Boolean networks. The other is the diversity of cyclic attractor lengths that a given motif can produce. Using these two measures, we examine all 104 topologically distinct three-node motifs and show that the structural properties of a motif, such as the presence of feedback loops and feed-forward loops, predict fundamental characteristics of its dynamical state space, which in turn determine aspects of its functional versatility. We also show that these higher-level properties have a direct bearing on real regulatory networks, as both basin entropy and cycle length diversity show a close correspondence with the prevalence, in neural and genetic regulatory networks, of the 13 connected motifs without self-interactions that have been studied extensively in the literature. © 2016 The Authors.
An antifungal protein from Ginkgo biloba binds actin and can trigger cell death.
Gao, Ningning; Wadhwani, Parvesh; Mühlhäuser, Philipp; Liu, Qiong; Riemann, Michael; Ulrich, Anne S; Nick, Peter
2016-07-01
Ginkbilobin is a short antifungal protein that had been purified and cloned from the seeds of the living fossil Ginkgo biloba. Homologues of this protein can be detected in all seed plants and the heterosporic fern Selaginella and are conserved with respect to domain structures, peptide motifs, and specific cysteine signatures. To get insight into the cellular functions of these conserved motifs, we expressed green fluorescent protein fusions of full-length and truncated ginkbilobin in tobacco BY-2 cells. We show that the signal peptide confers efficient secretion of ginkbilobin. When this signal peptide is either cleaved or masked, ginkbilobin binds and visualizes the actin cytoskeleton. This actin-binding activity of ginkbilobin is mediated by a specific subdomain just downstream of the signal peptide, and this subdomain can also coassemble with actin in vitro. Upon stable overexpression of this domain, we observe a specific delay in premitotic nuclear positioning indicative of a reduced dynamicity of actin. To elucidate the cellular response to the binding of this subdomain to actin, we use chemical engineering based on synthetic peptides comprising different parts of the actin-binding subdomain conjugated with the cell-penetrating peptide BP100 and with rhodamine B as a fluorescent reporter. Binding of this synthetic construct to actin efficiently induces programmed cell death. We discuss these findings in terms of a working model, where ginkbilobin can activate actin-dependent cell death.
Lee, Choongho
2013-01-01
Chronic hepatitis C virus (HCV) infection is responsible for the development of liver cirrhosis and hepatocellular carcinoma. HCV core protein plays not only a structural role in the virion morphogenesis by encapsidating a virus RNA genome but also a non-structural role in HCV-induced pathogenesis by blocking innate immunity. Especially, it has been shown to regulate JAK-STAT signaling pathway through its direct interaction with Janus kinase (JAK) via its proline-rich JAK-binding motif (79PGYPWP84). However, little is known about the physiological significance of this HCV core-JAK association in the context of the virus life cycle. In order to gain an insight, a mutant HCV genome (J6/JFH1-79A82A) was constructed to express the mutant core with a defective JAK-binding motif (79AGYAWP84) using an HCV genotype 2a infectious clone (J6/JFH1). When this mutant HCV genome was introduced into hepatocarcinoma cells, it was found to be severely impaired in its ability to produce infectious viruses in spite of its robust RNA genome replication. Taken together, all these results suggest an essential requirement of HCV core-JAK protein interaction for efficient production of infectious viruses and the potential of using core-JAK blockers as a new anti-HCV therapy. PMID:24009866
Mouse Visual Neocortex Supports Multiple Stereotyped Patterns of Microcircuit Activity
Sadovsky, Alexander J.
2014-01-01
Spiking correlations between neocortical neurons provide insight into the underlying synaptic connectivity that defines cortical microcircuitry. Here, using two-photon calcium fluorescence imaging, we observed the simultaneous dynamics of hundreds of neurons in slices of mouse primary visual cortex (V1). Consistent with a balance of excitation and inhibition, V1 dynamics were characterized by a linear scaling between firing rate and circuit size. Using lagged firing correlations between neurons, we generated functional wiring diagrams to evaluate the topological features of V1 microcircuitry. We found that circuit connectivity exhibited both cyclic graph motifs, indicating recurrent wiring, and acyclic graph motifs, indicating feedforward wiring. After overlaying the functional wiring diagrams onto the imaged field of view, we found properties consistent with Rentian scaling: wiring diagrams were topologically efficient because they minimized wiring with a modular architecture. Within single imaged fields of view, V1 contained multiple discrete circuits that were overlapping and highly interdigitated but were still distinct from one another. The majority of neurons that were shared between circuits displayed peri-event spiking activity whose timing was specific to the active circuit, whereas spike times for a smaller percentage of neurons were invariant to circuit identity. These data provide evidence that V1 microcircuitry exhibits balanced dynamics, is efficiently arranged in anatomical space, and is capable of supporting a diversity of multineuron spike firing patterns from overlapping sets of neurons. PMID:24899701
NASA Astrophysics Data System (ADS)
Chen, Guangye; Chacón, Luis; CoCoMans Team
2014-10-01
For decades, the Vlasov-Darwin model has been recognized to be attractive for PIC simulations (to avoid radiative noise issues) in non-radiative electromagnetic regimes. However, the Darwin model results in elliptic field equations that renders explicit time integration unconditionally unstable. Improving on linearly implicit schemes, fully implicit PIC algorithms for both electrostatic and electromagnetic regimes, with exact discrete energy and charge conservation properties, have been recently developed in 1D. This study builds on these recent algorithms to develop an implicit, orbit-averaged, time-space-centered finite difference scheme for the particle-field equations in multiple dimensions. The algorithm conserves energy, charge, and canonical-momentum exactly, even with grid packing. A simple fluid preconditioner allows efficient use of large timesteps, O (√{mi/me}c/veT) larger than the explicit CFL. We demonstrate the accuracy and efficiency properties of the of the algorithm with various numerical experiments in 2D3V.
Sugisaki, Kenji; Yamamoto, Satoru; Nakazawa, Shigeaki; Toyota, Kazuo; Sato, Kazunobu; Shiomi, Daisuke; Takui, Takeji
2016-08-18
Quantum computers are capable to efficiently perform full configuration interaction (FCI) calculations of atoms and molecules by using the quantum phase estimation (QPE) algorithm. Because the success probability of the QPE depends on the overlap between approximate and exact wave functions, efficient methods to prepare accurate initial guess wave functions enough to have sufficiently large overlap with the exact ones are highly desired. Here, we propose a quantum algorithm to construct the wave function consisting of one configuration state function, which is suitable for the initial guess wave function in QPE-based FCI calculations of open-shell molecules, based on the addition theorem of angular momentum. The proposed quantum algorithm enables us to prepare the wave function consisting of an exponential number of Slater determinants only by a polynomial number of quantum operations.
Induced drag ideal efficiency factor of arbitrary lateral-vertical wing forms
NASA Technical Reports Server (NTRS)
Deyoung, J.
1980-01-01
A relatively simple equation is presented for estimating the induced drag ideal efficiency factor e for arbitrary cross sectional wing forms. This equation is based on eight basic but varied wing configurations which have exact solutions. The e function which relates the basic wings is developed statistically and is a continuous function of configuration geometry. The basic wing configurations include boxwings shaped as a rectangle, ellipse, and diamond; the V-wing; end-plate wing; 90 degree cruciform; circle dumbbell; and biplane. Example applications of the e equations are made to many wing forms such as wings with struts which form partial span rectangle dumbbell wings; bowtie, cruciform, winglet, and fan wings; and multiwings. Derivations are presented in the appendices of exact closed form solutions found of e for the V-wing and 90 degree cruciform wing and for an asymptotic solution for multiwings.
Efficient steady-state solver for hierarchical quantum master equations
NASA Astrophysics Data System (ADS)
Zhang, Hou-Dao; Qiao, Qin; Xu, Rui-Xue; Zheng, Xiao; Yan, YiJing
2017-07-01
Steady states play pivotal roles in many equilibrium and non-equilibrium open system studies. Their accurate evaluations call for exact theories with rigorous treatment of system-bath interactions. Therein, the hierarchical equations-of-motion (HEOM) formalism is a nonperturbative and non-Markovian quantum dissipation theory, which can faithfully describe the dissipative dynamics and nonlinear response of open systems. Nevertheless, solving the steady states of open quantum systems via HEOM is often a challenging task, due to the vast number of dynamical quantities involved. In this work, we propose a self-consistent iteration approach that quickly solves the HEOM steady states. We demonstrate its high efficiency with accurate and fast evaluations of low-temperature thermal equilibrium of a model Fenna-Matthews-Olson pigment-protein complex. Numerically exact evaluation of thermal equilibrium Rényi entropies and stationary emission line shapes is presented with detailed discussion.
Assessment of composite motif discovery methods.
Klepper, Kjetil; Sandve, Geir K; Abul, Osman; Johansen, Jostein; Drablos, Finn
2008-02-26
Computational discovery of regulatory elements is an important area of bioinformatics research and more than a hundred motif discovery methods have been published. Traditionally, most of these methods have addressed the problem of single motif discovery - discovering binding motifs for individual transcription factors. In higher organisms, however, transcription factors usually act in combination with nearby bound factors to induce specific regulatory behaviours. Hence, recent focus has shifted from single motifs to the discovery of sets of motifs bound by multiple cooperating transcription factors, so called composite motifs or cis-regulatory modules. Given the large number and diversity of methods available, independent assessment of methods becomes important. Although there have been several benchmark studies of single motif discovery, no similar studies have previously been conducted concerning composite motif discovery. We have developed a benchmarking framework for composite motif discovery and used it to evaluate the performance of eight published module discovery tools. Benchmark datasets were constructed based on real genomic sequences containing experimentally verified regulatory modules, and the module discovery programs were asked to predict both the locations of these modules and to specify the single motifs involved. To aid the programs in their search, we provided position weight matrices corresponding to the binding motifs of the transcription factors involved. In addition, selections of decoy matrices were mixed with the genuine matrices on one dataset to test the response of programs to varying levels of noise. Although some of the methods tested tended to score somewhat better than others overall, there were still large variations between individual datasets and no single method performed consistently better than the rest in all situations. The variation in performance on individual datasets also shows that the new benchmark datasets represents a suitable variety of challenges to most methods for module discovery.
Automatic annotation of protein motif function with Gene Ontology terms.
Lu, Xinghua; Zhai, Chengxiang; Gopalakrishnan, Vanathi; Buchanan, Bruce G
2004-09-02
Conserved protein sequence motifs are short stretches of amino acid sequence patterns that potentially encode the function of proteins. Several sequence pattern searching algorithms and programs exist foridentifying candidate protein motifs at the whole genome level. However, a much needed and important task is to determine the functions of the newly identified protein motifs. The Gene Ontology (GO) project is an endeavor to annotate the function of genes or protein sequences with terms from a dynamic, controlled vocabulary and these annotations serve well as a knowledge base. This paper presents methods to mine the GO knowledge base and use the association between the GO terms assigned to a sequence and the motifs matched by the same sequence as evidence for predicting the functions of novel protein motifs automatically. The task of assigning GO terms to protein motifs is viewed as both a binary classification and information retrieval problem, where PROSITE motifs are used as samples for mode training and functional prediction. The mutual information of a motif and aGO term association is found to be a very useful feature. We take advantage of the known motifs to train a logistic regression classifier, which allows us to combine mutual information with other frequency-based features and obtain a probability of correct association. The trained logistic regression model has intuitively meaningful and logically plausible parameter values, and performs very well empirically according to our evaluation criteria. In this research, different methods for automatic annotation of protein motifs have been investigated. Empirical result demonstrated that the methods have a great potential for detecting and augmenting information about the functions of newly discovered candidate protein motifs.
Boyen, Peter; Van Dyck, Dries; Neven, Frank; van Ham, Roeland C H J; van Dijk, Aalt D J
2011-01-01
Correlated motif mining (cmm) is the problem of finding overrepresented pairs of patterns, called motifs, in sequences of interacting proteins. Algorithmic solutions for cmm thereby provide a computational method for predicting binding sites for protein interaction. In this paper, we adopt a motif-driven approach where the support of candidate motif pairs is evaluated in the network. We experimentally establish the superiority of the Chi-square-based support measure over other support measures. Furthermore, we obtain that cmm is an np-hard problem for a large class of support measures (including Chi-square) and reformulate the search for correlated motifs as a combinatorial optimization problem. We then present the generic metaheuristic slider which uses steepest ascent with a neighborhood function based on sliding motifs and employs the Chi-square-based support measure. We show that slider outperforms existing motif-driven cmm methods and scales to large protein-protein interaction networks. The slider-implementation and the data used in the experiments are available on http://bioinformatics.uhasselt.be.
RNA 3D Structural Motifs: Definition, Identification, Annotation, and Database Searching
NASA Astrophysics Data System (ADS)
Nasalean, Lorena; Stombaugh, Jesse; Zirbel, Craig L.; Leontis, Neocles B.
Structured RNA molecules resemble proteins in the hierarchical organization of their global structures, folding and broad range of functions. Structured RNAs are composed of recurrent modular motifs that play specific functional roles. Some motifs direct the folding of the RNA or stabilize the folded structure through tertiary interactions. Others bind ligands or proteins or catalyze chemical reactions. Therefore, it is desirable, starting from the RNA sequence, to be able to predict the locations of recurrent motifs in RNA molecules. Conversely, the potential occurrence of one or more known 3D RNA motifs may indicate that a genomic sequence codes for a structured RNA molecule. To identify known RNA structural motifs in new RNA sequences, precise structure-based definitions are needed that specify the core nucleotides of each motif and their conserved interactions. By comparing instances of each recurrent motif and applying base pair isosteriCity relations, one can identify neutral mutations that preserve its structure and function in the contexts in which it occurs.
Effector prediction in host-pathogen interaction based on a Markov model of a ubiquitous EPIYA motif
2010-01-01
Background Effector secretion is a common strategy of pathogen in mediating host-pathogen interaction. Eight EPIYA-motif containing effectors have recently been discovered in six pathogens. Once these effectors enter host cells through type III/IV secretion systems (T3SS/T4SS), tyrosine in the EPIYA motif is phosphorylated, which triggers effectors binding other proteins to manipulate host-cell functions. The objectives of this study are to evaluate the distribution pattern of EPIYA motif in broad biological species, to predict potential effectors with EPIYA motif, and to suggest roles and biological functions of potential effectors in host-pathogen interactions. Results A hidden Markov model (HMM) of five amino acids was built for the EPIYA-motif based on the eight known effectors. Using this HMM to search the non-redundant protein database containing 9,216,047 sequences, we obtained 107,231 sequences with at least one EPIYA motif occurrence and 3115 sequences with multiple repeats of the EPIYA motif. Although the EPIYA motif exists among broad species, it is significantly over-represented in some particular groups of species. For those proteins containing at least four copies of EPIYA motif, most of them are from intracellular bacteria, extracellular bacteria with T3SS or T4SS or intracellular protozoan parasites. By combining the EPIYA motif and the adjacent SH2 binding motifs (KK, R4, Tarp and Tir), we built HMMs of nine amino acids and predicted many potential effectors in bacteria and protista by the HMMs. Some potential effectors for pathogens (such as Lawsonia intracellularis, Plasmodium falciparum and Leishmania major) are suggested. Conclusions Our study indicates that the EPIYA motif may be a ubiquitous functional site for effectors that play an important pathogenicity role in mediating host-pathogen interactions. We suggest that some intracellular protozoan parasites could secrete EPIYA-motif containing effectors through secretion systems similar to the T3SS/T4SS in bacteria. Our predicted effectors provide useful hypotheses for further studies. PMID:21143776
Computing exact bundle compliance control charts via probability generating functions.
Chen, Binchao; Matis, Timothy; Benneyan, James
2016-06-01
Compliance to evidenced-base practices, individually and in 'bundles', remains an important focus of healthcare quality improvement for many clinical conditions. The exact probability distribution of composite bundle compliance measures used to develop corresponding control charts and other statistical tests is based on a fairly large convolution whose direct calculation can be computationally prohibitive. Various series expansions and other approximation approaches have been proposed, each with computational and accuracy tradeoffs, especially in the tails. This same probability distribution also arises in other important healthcare applications, such as for risk-adjusted outcomes and bed demand prediction, with the same computational difficulties. As an alternative, we use probability generating functions to rapidly obtain exact results and illustrate the improved accuracy and detection over other methods. Numerical testing across a wide range of applications demonstrates the computational efficiency and accuracy of this approach.
Fernández, J J; Tablero, C; Wahnón, P
2004-06-08
In this paper we present an analysis of the convergence of the band structure properties, particularly the influence on the modification of the bandgap and bandwidth values in half metallic compounds by the use of the exact exchange formalism. This formalism for general solids has been implemented using a localized basis set of numerical functions to represent the exchange density. The implementation has been carried out using a code which uses a linear combination of confined numerical pseudoatomic functions to represent the Kohn-Sham orbitals. The application of this exact exchange scheme to a half-metallic semiconductor compound, in particular to Ga(4)P(3)Ti, a promising material in the field of high efficiency solar cells, confirms the existence of the isolated intermediate band in this compound. (c) 2004 American Institute of Physics.
Occurrence probability of structured motifs in random sequences.
Robin, S; Daudin, J-J; Richard, H; Sagot, M-F; Schbath, S
2002-01-01
The problem of extracting from a set of nucleic acid sequences motifs which may have biological function is more and more important. In this paper, we are interested in particular motifs that may be implicated in the transcription process. These motifs, called structured motifs, are composed of two ordered parts separated by a variable distance and allowing for substitutions. In order to assess their statistical significance, we propose approximations of the probability of occurrences of such a structured motif in a given sequence. An application of our method to evaluate candidate promoters in E. coli and B. subtilis is presented. Simulations show the goodness of the approximations.
Efficacy of function specific 3D-motifs in enzyme classification according to their EC-numbers.
Rahimi, Amir; Madadkar-Sobhani, Armin; Touserkani, Rouzbeh; Goliaei, Bahram
2013-11-07
Due to the increasing number of protein structures with unknown function originated from structural genomics projects, protein function prediction has become an important subject in bioinformatics. Among diverse function prediction methods, exploring known 3D-motifs, which are associated with functional elements in unknown protein structures is one of the most biologically meaningful methods. Homologous enzymes inherit such motifs in their active sites from common ancestors. However, slight differences in the properties of these motifs, results in variation in the reactions and substrates of the enzymes. In this study, we examined the possibility of discriminating highly related active site patterns according to their EC-numbers by 3D-motifs. For each EC-number, the spatial arrangement of an active site, which has minimum average distance to other active sites with the same function, was selected as a representative 3D-motif. In order to characterize the motifs, various points in active site elements were tested. The results demonstrated the possibility of predicting full EC-number of enzymes by 3D-motifs. However, the discriminating power of 3D-motifs varies among different enzyme families and depends on selecting the appropriate points and features. © 2013 Elsevier Ltd. All rights reserved.
ELM: the status of the 2010 eukaryotic linear motif resource
Gould, Cathryn M.; Diella, Francesca; Via, Allegra; Puntervoll, Pål; Gemünd, Christine; Chabanis-Davidson, Sophie; Michael, Sushama; Sayadi, Ahmed; Bryne, Jan Christian; Chica, Claudia; Seiler, Markus; Davey, Norman E.; Haslam, Niall; Weatheritt, Robert J.; Budd, Aidan; Hughes, Tim; Paś, Jakub; Rychlewski, Leszek; Travé, Gilles; Aasland, Rein; Helmer-Citterich, Manuela; Linding, Rune; Gibson, Toby J.
2010-01-01
Linear motifs are short segments of multidomain proteins that provide regulatory functions independently of protein tertiary structure. Much of intracellular signalling passes through protein modifications at linear motifs. Many thousands of linear motif instances, most notably phosphorylation sites, have now been reported. Although clearly very abundant, linear motifs are difficult to predict de novo in protein sequences due to the difficulty of obtaining robust statistical assessments. The ELM resource at http://elm.eu.org/ provides an expanding knowledge base, currently covering 146 known motifs, with annotation that includes >1300 experimentally reported instances. ELM is also an exploratory tool for suggesting new candidates of known linear motifs in proteins of interest. Information about protein domains, protein structure and native disorder, cellular and taxonomic contexts is used to reduce or deprecate false positive matches. Results are graphically displayed in a ‘Bar Code’ format, which also displays known instances from homologous proteins through a novel ‘Instance Mapper’ protocol based on PHI-BLAST. ELM server output provides links to the ELM annotation as well as to a number of remote resources. Using the links, researchers can explore the motifs, proteins, complex structures and associated literature to evaluate whether candidate motifs might be worth experimental investigation. PMID:19920119
Composite Structural Motifs of Binding Sites for Delineating Biological Functions of Proteins
Kinjo, Akira R.; Nakamura, Haruki
2012-01-01
Most biological processes are described as a series of interactions between proteins and other molecules, and interactions are in turn described in terms of atomic structures. To annotate protein functions as sets of interaction states at atomic resolution, and thereby to better understand the relation between protein interactions and biological functions, we conducted exhaustive all-against-all atomic structure comparisons of all known binding sites for ligands including small molecules, proteins and nucleic acids, and identified recurring elementary motifs. By integrating the elementary motifs associated with each subunit, we defined composite motifs that represent context-dependent combinations of elementary motifs. It is demonstrated that function similarity can be better inferred from composite motif similarity compared to the similarity of protein sequences or of individual binding sites. By integrating the composite motifs associated with each protein function, we define meta-composite motifs each of which is regarded as a time-independent diagrammatic representation of a biological process. It is shown that meta-composite motifs provide richer annotations of biological processes than sequence clusters. The present results serve as a basis for bridging atomic structures to higher-order biological phenomena by classification and integration of binding site structures. PMID:22347478
World Color Survey color naming reveals universal motifs and their within-language diversity
Lindsey, Delwin T.; Brown, Angela M.
2009-01-01
We analyzed the color terms in the World Color Survey (WCS) (www.icsi.berkeley.edu/wcs/), a large color-naming database obtained from informants of mostly unwritten languages spoken in preindustrialized cultures that have had limited contact with modern, industrialized society. The color naming idiolects of 2,367 WCS informants fall into three to six “motifs,” where each motif is a different color-naming system based on a subset of a universal glossary of 11 color terms. These motifs are universal in that they occur worldwide, with some individual variation, in completely unrelated languages. Strikingly, these few motifs are distributed across the WCS informants in such a way that multiple motifs occur in most languages. Thus, the culture a speaker comes from does not completely determine how he or she will use color terms. An analysis of the modern patterns of motif usage in the WCS languages, based on the assumption that they reflect historical patterns of color term evolution, suggests that color lexicons have changed over time in a complex but orderly way. The worldwide distribution of the motifs and the cooccurrence of multiple motifs within languages suggest that universal processes control the naming of colors. PMID:19901327
Multidimensional, fully implicit, exactly conserving electromagnetic particle-in-cell simulations
NASA Astrophysics Data System (ADS)
Chacon, Luis
2015-09-01
We discuss a new, conservative, fully implicit 2D-3V particle-in-cell algorithm for non-radiative, electromagnetic kinetic plasma simulations, based on the Vlasov-Darwin model. Unlike earlier linearly implicit PIC schemes and standard explicit PIC schemes, fully implicit PIC algorithms are unconditionally stable and allow exact discrete energy and charge conservation. This has been demonstrated in 1D electrostatic and electromagnetic contexts. In this study, we build on these recent algorithms to develop an implicit, orbit-averaged, time-space-centered finite difference scheme for the Darwin field and particle orbit equations for multiple species in multiple dimensions. The Vlasov-Darwin model is very attractive for PIC simulations because it avoids radiative noise issues in non-radiative electromagnetic regimes. The algorithm conserves global energy, local charge, and particle canonical-momentum exactly, even with grid packing. The nonlinear iteration is effectively accelerated with a fluid preconditioner, which allows efficient use of large timesteps, O(√{mi/me}c/veT) larger than the explicit CFL. In this presentation, we will introduce the main algorithmic components of the approach, and demonstrate the accuracy and efficiency properties of the algorithm with various numerical experiments in 1D and 2D. Support from the LANL LDRD program and the DOE-SC ASCR office.
Vexler, Albert; Tanajian, Hovig; Hutson, Alan D
In practice, parametric likelihood-ratio techniques are powerful statistical tools. In this article, we propose and examine novel and simple distribution-free test statistics that efficiently approximate parametric likelihood ratios to analyze and compare distributions of K groups of observations. Using the density-based empirical likelihood methodology, we develop a Stata package that applies to a test for symmetry of data distributions and compares K -sample distributions. Recognizing that recent statistical software packages do not sufficiently address K -sample nonparametric comparisons of data distributions, we propose a new Stata command, vxdbel, to execute exact density-based empirical likelihood-ratio tests using K samples. To calculate p -values of the proposed tests, we use the following methods: 1) a classical technique based on Monte Carlo p -value evaluations; 2) an interpolation technique based on tabulated critical values; and 3) a new hybrid technique that combines methods 1 and 2. The third, cutting-edge method is shown to be very efficient in the context of exact-test p -value computations. This Bayesian-type method considers tabulated critical values as prior information and Monte Carlo generations of test statistic values as data used to depict the likelihood function. In this case, a nonparametric Bayesian method is proposed to compute critical values of exact tests.
Generalized Buneman Pruning for Inferring the Most Parsimonious Multi-state Phylogeny
NASA Astrophysics Data System (ADS)
Misra, Navodit; Blelloch, Guy; Ravi, R.; Schwartz, Russell
Accurate reconstruction of phylogenies remains a key challenge in evolutionary biology. Most biologically plausible formulations of the problem are formally NP-hard, with no known efficient solution. The standard in practice are fast heuristic methods that are empirically known to work very well in general, but can yield results arbitrarily far from optimal. Practical exact methods, which yield exponential worst-case running times but generally much better times in practice, provide an important alternative. We report progress in this direction by introducing a provably optimal method for the weighted multi-state maximum parsimony phylogeny problem. The method is based on generalizing the notion of the Buneman graph, a construction key to efficient exact methods for binary sequences, so as to apply to sequences with arbitrary finite numbers of states with arbitrary state transition weights. We implement an integer linear programming (ILP) method for the multi-state problem using this generalized Buneman graph and demonstrate that the resulting method is able to solve data sets that are intractable by prior exact methods in run times comparable with popular heuristics. Our work provides the first method for provably optimal maximum parsimony phylogeny inference that is practical for multi-state data sets of more than a few characters.
Maurer-Stroh, Sebastian; Gao, He; Han, Hao; Baeten, Lies; Schymkowitz, Joost; Rousseau, Frederic; Zhang, Louxin; Eisenhaber, Frank
2013-02-01
Data mining in protein databases, derivatives from more fundamental protein 3D structure and sequence databases, has considerable unearthed potential for the discovery of sequence motif--structural motif--function relationships as the finding of the U-shape (Huf-Zinc) motif, originally a small student's project, exemplifies. The metal ion zinc is critically involved in universal biological processes, ranging from protein-DNA complexes and transcription regulation to enzymatic catalysis and metabolic pathways. Proteins have evolved a series of motifs to specifically recognize and bind zinc ions. Many of these, so called zinc fingers, are structurally independent globular domains with discontinuous binding motifs made up of residues mostly far apart in sequence. Through a systematic approach starting from the BRIX structure fragment database, we discovered that there exists another predictable subset of zinc-binding motifs that not only have a conserved continuous sequence pattern but also share a characteristic local conformation, despite being included in totally different overall folds. While this does not allow general prediction of all Zn binding motifs, a HMM-based web server, Huf-Zinc, is available for prediction of these novel, as well as conventional, zinc finger motifs in protein sequences. The Huf-Zinc webserver can be freely accessed through this URL (http://mendel.bii.a-star.edu.sg/METHODS/hufzinc/).
Krystkowiak, Izabella; Manguy, Jean; Davey, Norman E
2018-06-05
There is a pressing need for in silico tools that can aid in the identification of the complete repertoire of protein binding (SLiMs, MoRFs, miniMotifs) and modification (moiety attachment/removal, isomerization, cleavage) motifs. We have created PSSMSearch, an interactive web-based tool for rapid statistical modeling, visualization, discovery and annotation of protein motif specificity determinants to discover novel motifs in a proteome-wide manner. PSSMSearch analyses proteomes for regions with significant similarity to a motif specificity determinant model built from a set of aligned motif-containing peptides. Multiple scoring methods are available to build a position-specific scoring matrix (PSSM) describing the motif specificity determinant model. This model can then be modified by a user to add prior knowledge of specificity determinants through an interactive PSSM heatmap. PSSMSearch includes a statistical framework to calculate the significance of specificity determinant model matches against a proteome of interest. PSSMSearch also includes the SLiMSearch framework's annotation, motif functional analysis and filtering tools to highlight relevant discriminatory information. Additional tools to annotate statistically significant shared keywords and GO terms, or experimental evidence of interaction with a motif-recognizing protein have been added. Finally, PSSM-based conservation metrics have been created for taxonomic range analyses. The PSSMSearch web server is available at http://slim.ucd.ie/pssmsearch/.
ssHMM: extracting intuitive sequence-structure motifs from high-throughput RNA-binding protein data
Krestel, Ralf; Ohler, Uwe; Vingron, Martin; Marsico, Annalisa
2017-01-01
Abstract RNA-binding proteins (RBPs) play an important role in RNA post-transcriptional regulation and recognize target RNAs via sequence-structure motifs. The extent to which RNA structure influences protein binding in the presence or absence of a sequence motif is still poorly understood. Existing RNA motif finders either take the structure of the RNA only partially into account, or employ models which are not directly interpretable as sequence-structure motifs. We developed ssHMM, an RNA motif finder based on a hidden Markov model (HMM) and Gibbs sampling which fully captures the relationship between RNA sequence and secondary structure preference of a given RBP. Compared to previous methods which output separate logos for sequence and structure, it directly produces a combined sequence-structure motif when trained on a large set of sequences. ssHMM’s model is visualized intuitively as a graph and facilitates biological interpretation. ssHMM can be used to find novel bona fide sequence-structure motifs of uncharacterized RBPs, such as the one presented here for the YY1 protein. ssHMM reaches a high motif recovery rate on synthetic data, it recovers known RBP motifs from CLIP-Seq data, and scales linearly on the input size, being considerably faster than MEMERIS and RNAcontext on large datasets while being on par with GraphProt. It is freely available on Github and as a Docker image. PMID:28977546
Counting motifs in dynamic networks.
Mukherjee, Kingshuk; Hasan, Md Mahmudul; Boucher, Christina; Kahveci, Tamer
2018-04-11
A network motif is a sub-network that occurs frequently in a given network. Detection of such motifs is important since they uncover functions and local properties of the given biological network. Finding motifs is however a computationally challenging task as it requires solving the costly subgraph isomorphism problem. Moreover, the topology of biological networks change over time. These changing networks are called dynamic biological networks. As the network evolves, frequency of each motif in the network also changes. Computing the frequency of a given motif from scratch in a dynamic network as the network topology evolves is infeasible, particularly for large and fast evolving networks. In this article, we design and develop a scalable method for counting the number of motifs in a dynamic biological network. Our method incrementally updates the frequency of each motif as the underlying network's topology evolves. Our experiments demonstrate that our method can update the frequency of each motif in orders of magnitude faster than counting the motif embeddings every time the network changes. If the network evolves more frequently, the margin with which our method outperforms the existing static methods, increases. We evaluated our method extensively using synthetic and real datasets, and show that our method is highly accurate(≥ 96%) and that it can be scaled to large dense networks. The results on real data demonstrate the utility of our method in revealing interesting insights on the evolution of biological processes.
QuadBase2: web server for multiplexed guanine quadruplex mining and visualization
Dhapola, Parashar; Chowdhury, Shantanu
2016-01-01
DNA guanine quadruplexes or G4s are non-canonical DNA secondary structures which affect genomic processes like replication, transcription and recombination. G4s are computationally identified by specific nucleotide motifs which are also called putative G4 (PG4) motifs. Despite the general relevance of these structures, there is currently no tool available that can allow batch queries and genome-wide analysis of these motifs in a user-friendly interface. QuadBase2 (quadbase.igib.res.in) presents a completely reinvented web server version of previously published QuadBase database. QuadBase2 enables users to mine PG4 motifs in up to 178 eukaryotes through the EuQuad module. This module interfaces with Ensembl Compara database, to allow users mine PG4 motifs in the orthologues of genes of interest across eukaryotes. PG4 motifs can be mined across genes and their promoter sequences in 1719 prokaryotes through ProQuad module. This module includes a feature that allows genome-wide mining of PG4 motifs and their visualization as circular histograms. TetraplexFinder, the module for mining PG4 motifs in user-provided sequences is now capable of handling up to 20 MB of data. QuadBase2 is a comprehensive PG4 motif mining tool that further expands the configurations and algorithms for mining PG4 motifs in a user-friendly way. PMID:27185890
Li, Wan; Chen, Lina; Li, Xia; Jia, Xu; Feng, Chenchen; Zhang, Liangcai; He, Weiming; Lv, Junjie; He, Yuehan; Li, Weiguo; Qu, Xiaoli; Zhou, Yanyan; Shi, Yuchen
2013-12-01
Network motifs in central positions are considered to not only have more in-coming and out-going connections but are also localized in an area where more paths reach the networks. These central motifs have been extensively investigated to determine their consistent functions or associations with specific function categories. However, their functional potentials in the maintenance of cross-talk between different functional communities are unclear. In this paper, we constructed an integrated human signaling network from the Pathway Interaction Database. We identified 39 essential cancer-related motifs in central roles, which we called cancer-related marketing centrality motifs, using combined centrality indices on the system level. Our results demonstrated that these cancer-related marketing centrality motifs were pivotal units in the signaling network, and could mediate cross-talk between 61 biological pathways (25 could be mediated by one motif on average), most of which were cancer-related pathways. Further analysis showed that molecules of most marketing centrality motifs were in the same or adjacent subcellular localizations, such as the motif containing PI3K, PDK1 and AKT1 in the plasma membrane, to mediate signal transduction between 32 cancer-related pathways. Finally, we analyzed the pivotal roles of cancer genes in these marketing centrality motifs in the pathogenesis of cancers, and found that non-cancer genes were potential cancer-related genes.
Gibbs motif sampling: detection of bacterial outer membrane protein repeats.
Neuwald, A. F.; Liu, J. S.; Lawrence, C. E.
1995-01-01
The detection and alignment of locally conserved regions (motifs) in multiple sequences can provide insight into protein structure, function, and evolution. A new Gibbs sampling algorithm is described that detects motif-encoding regions in sequences and optimally partitions them into distinct motif models; this is illustrated using a set of immunoglobulin fold proteins. When applied to sequences sharing a single motif, the sampler can be used to classify motif regions into related submodels, as is illustrated using helix-turn-helix DNA-binding proteins. Other statistically based procedures are described for searching a database for sequences matching motifs found by the sampler. When applied to a set of 32 very distantly related bacterial integral outer membrane proteins, the sampler revealed that they share a subtle, repetitive motif. Although BLAST (Altschul SF et al., 1990, J Mol Biol 215:403-410) fails to detect significant pairwise similarity between any of the sequences, the repeats present in these outer membrane proteins, taken as a whole, are highly significant (based on a generally applicable statistical test for motifs described here). Analysis of bacterial porins with known trimeric beta-barrel structure and related proteins reveals a similar repetitive motif corresponding to alternating membrane-spanning beta-strands. These beta-strands occur on the membrane interface (as opposed to the trimeric interface) of the beta-barrel. The broad conservation and structural location of these repeats suggests that they play important functional roles. PMID:8520488
Combinatorics of feedback in cellular uptake and metabolism of small molecules.
Krishna, Sandeep; Semsey, Szabolcs; Sneppen, Kim
2007-12-26
We analyze the connection between structure and function for regulatory motifs associated with cellular uptake and usage of small molecules. Based on the boolean logic of the feedback we suggest four classes: the socialist, consumer, fashion, and collector motifs. We find that the socialist motif is good for homeostasis of a useful but potentially poisonous molecule, whereas the consumer motif is optimal for nutrition molecules. Accordingly, examples of these motifs are found in, respectively, the iron homeostasis system in various organisms and in the uptake of sugar molecules in bacteria. The remaining two motifs have no obvious analogs in small molecule regulation, but we illustrate their behavior using analogies to fashion and obesity. These extreme motifs could inspire construction of synthetic systems that exhibit bistable, history-dependent states, and homeostasis of flux (rather than concentration).
MOTIFSIM 2.1: An Enhanced Software Platform for Detecting Similarity in Multiple DNA Motif Data Sets
Huang, Chun-Hsi
2017-01-01
Abstract Finding binding site motifs plays an important role in bioinformatics as it reveals the transcription factors that control the gene expression. The development for motif finders has flourished in the past years with many tools have been introduced to the research community. Although these tools possess exceptional features for detecting motifs, they report different results for an identical data set. Hence, using multiple tools is recommended because motifs reported by several tools are likely biologically significant. However, the results from multiple tools need to be compared for obtaining common significant motifs. MOTIFSIM web tool and command-line tool were developed for this purpose. In this work, we present several technical improvements as well as additional features to further support the motif analysis in our new release MOTIFSIM 2.1. PMID:28632401
Targeted inhibition of oncogenic miR-21 maturation with designed RNA-binding proteins
Chen, Yu; Yang, Fan; Zubovic, Lorena; Pavelitz, Tom; Yang, Wen; Godin, Katherine; Walker, Matthew; Zheng, Suxin; Macchi, Paolo; Varani, Gabriele
2016-01-01
The RNA Recognition Motif (RRM) is the largest family of eukaryotic RNA-binding proteins. Engineered RRMs with new specificity would provide valuable tools and an exacting test of our understanding of specificity. We have achieved the first successful re-design of the specificity of an RRM using rational methods and demonstrated re-targeting of activity in cells. We engineered the conserved RRM of human Rbfox proteins to specifically bind to the terminal loop of miR-21 precursor with high affinity and inhibit its processing by Drosha and Dicer. We further engineered Giardia Dicer by replacing its PAZ domain with the designed RRM. The reprogrammed enzyme degrades pre-miR-21 specifically in vitro and suppresses mature miR-21 levels in cells, which results in increased expression of PDCD4 and significantly decreased viability for cancer cells. The results demonstrate the feasibility of engineering the sequence-specificity of RRMs and of using this ubiquitous platform for diverse biological applications. PMID:27428511
Classification and assessment tools for structural motif discovery algorithms.
Badr, Ghada; Al-Turaiki, Isra; Mathkour, Hassan
2013-01-01
Motif discovery is the problem of finding recurring patterns in biological data. Patterns can be sequential, mainly when discovered in DNA sequences. They can also be structural (e.g. when discovering RNA motifs). Finding common structural patterns helps to gain a better understanding of the mechanism of action (e.g. post-transcriptional regulation). Unlike DNA motifs, which are sequentially conserved, RNA motifs exhibit conservation in structure, which may be common even if the sequences are different. Over the past few years, hundreds of algorithms have been developed to solve the sequential motif discovery problem, while less work has been done for the structural case. In this paper, we survey, classify, and compare different algorithms that solve the structural motif discovery problem, where the underlying sequences may be different. We highlight their strengths and weaknesses. We start by proposing a benchmark dataset and a measurement tool that can be used to evaluate different motif discovery approaches. Then, we proceed by proposing our experimental setup. Finally, results are obtained using the proposed benchmark to compare available tools. To the best of our knowledge, this is the first attempt to compare tools solely designed for structural motif discovery. Results show that the accuracy of discovered motifs is relatively low. The results also suggest a complementary behavior among tools where some tools perform well on simple structures, while other tools are better for complex structures. We have classified and evaluated the performance of available structural motif discovery tools. In addition, we have proposed a benchmark dataset with tools that can be used to evaluate newly developed tools.
Park, In Seob; Komiyama, Hideaki; Yasuda, Takuma
2017-02-01
Deep-blue emitters that can harvest both singlet and triplet excited states to give high electron-to-photon conversion efficiencies are highly desired for applications in full-color displays and white lighting devices based on organic light-emitting diodes (OLEDs). Thermally activated delayed fluorescence (TADF) molecules based on highly twisted donor-acceptor (D-A) configurations are promising emitting dopants for the construction of efficient deep-blue OLEDs. In this study, a simple and versatile D-A system combining acridan-based donors and pyrimidine-based acceptors has been developed as a new platform for high-efficiency deep-blue TADF emitters. The designed pre-twisted acridan-pyrimidine D-A molecules exhibit small singlet-triplet energy splitting and high photoluminescence quantum yields, functioning as efficient deep-blue TADF emitters. The OLEDs utilizing these TADF emitters display bright blue electroluminescence with external quantum efficiencies of up to 20.4%, maximum current efficiencies of 41.7 cd A -1 , maximum power efficiencies of 37.2 lm W -1 , and color coordinates of (0.16, 0.23). The design strategy featuring such acridan-pyrimidine D-A motifs can offer great prospects for further developing high-performance deep-blue TADF emitters and TADF-OLEDs.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, Jin-Hua; Zhang, E.; Tang, Gui-Mei, E-mail: meiguit@163.com
2016-09-15
Three new metal coordination complexes, namely, [Co(BPO){sub 2}(H{sub 2}O){sub 4}](BS){sub 2}(H{sub 2}O){sub 2} (1), [Co(BPO){sub 2}(H{sub 2}O){sub 4}](ABS){sub 2}(H{sub 2}O){sub 2} (2), [Co(BPO){sub 2}(H{sub 2}O){sub 4}](MBS){sub 2}(H{sub 2}O){sub 2} (3) [BPO=2,5-di(pyridin-4-yl)-1,3,4-oxadiazole, BS=benzenesulphonate, ABS=4-aminobenzenesulphonate, MBS=4-methylbenzenesulphonate] were obtained under hydrothermal conditions. Complexes 1–3 were structurally characterized by single-crystal X-ray diffraction, powder X-ray diffraction, IR and thermogravimetric analyses (TGA). All of them display a zero-dimensional motif, in which strong intermolecular hydrogen bonding interactions (O–H···O/N) and packing interactions (C–H···π and π···π) make them achieve a three-dimensional supramolecular architecture. The primary catalytic results of these three complexes show that high efficiency for the green synthesismore » of a variety of 3,4-dihydropyrimidin-2(1H)-ones was observed under solvent free conditions through Biginelli reactions. The present catalytic protocols exhibit advantages such as excellent yield, easy isolation, eco-friendly conditions, and short reaction time. - Graphical abstract: Three new metal coordination complexes with bipyridinyl-oxadiazole were obtained under hydrothermal conditions, which display a zero-dimensional motif, and show high efficiency for the green synthesis of a variety of 3,4-dihydropyrimidin-2(1H)-ones under solvent free conditions through Biginelli reactions. The present catalytic protocols exhibit advantages such as excellent yield, easy isolation, eco-friendly conditions, and short reaction time. Display Omitted.« less
Soroka, Daria; Li de la Sierra-Gallay, Inès; Dubée, Vincent; Triboulet, Sébastien; van Tilbeurgh, Herman; Compain, Fabrice; Ballell, Lluis; Barros, David; Mainardi, Jean-Luc; Hugonnet, Jean-Emmanuel; Arthur, Michel
2015-09-01
Combinations of β-lactams with clavulanate are currently being investigated for tuberculosis treatment. Since Mycobacterium tuberculosis produces a broad spectrum β-lactamase, BlaC, the success of this approach could be compromised by the emergence of clavulanate-resistant variants, as observed for inhibitor-resistant TEM variants in enterobacteria. Previous analyses based on site-directed mutagenesis of BlaC have led to the conclusion that this risk was limited. Here, we used a different approach based on determination of the crystal structure of β-lactamase BlaMAb of Mycobacterium abscessus, which efficiently hydrolyzes clavulanate. Comparison of BlaMAb and BlaC allowed for structure-assisted site-directed mutagenesis of BlaC and identification of the G(132)N substitution that was sufficient to switch the interaction of BlaC with clavulanate from irreversible inactivation to efficient hydrolysis. The substitution, which restored the canonical SDN motif (SDG→SDN), allowed for efficient hydrolysis of clavulanate, with a more than 10(4)-fold increase in k cat (0.41 s(-1)), without affecting the hydrolysis of other β-lactams. Mass spectrometry revealed that acylation of BlaC and of its G(132)N variant by clavulanate follows similar paths, involving sequential formation of two acylenzymes. Decarboxylation of the first acylenzyme results in a stable secondary acylenzyme in BlaC, whereas hydrolysis occurs in the G(132)N variant. The SDN/SDG polymorphism defines two mycobacterial lineages comprising rapidly and slowly growing species, respectively. Together, these results suggest that the efficacy of β-lactam-clavulanate combinations may be limited by the emergence of resistance. β-Lactams active without clavulanate, such as faropenem, should be prioritized for the development of new therapies. Copyright © 2015, American Society for Microbiology. All Rights Reserved.
Soroka, Daria; Li de la Sierra-Gallay, Inès; Dubée, Vincent; Triboulet, Sébastien; van Tilbeurgh, Herman; Compain, Fabrice; Ballell, Lluis; Barros, David; Mainardi, Jean-Luc
2015-01-01
Combinations of β-lactams with clavulanate are currently being investigated for tuberculosis treatment. Since Mycobacterium tuberculosis produces a broad spectrum β-lactamase, BlaC, the success of this approach could be compromised by the emergence of clavulanate-resistant variants, as observed for inhibitor-resistant TEM variants in enterobacteria. Previous analyses based on site-directed mutagenesis of BlaC have led to the conclusion that this risk was limited. Here, we used a different approach based on determination of the crystal structure of β-lactamase BlaMAb of Mycobacterium abscessus, which efficiently hydrolyzes clavulanate. Comparison of BlaMAb and BlaC allowed for structure-assisted site-directed mutagenesis of BlaC and identification of the G132N substitution that was sufficient to switch the interaction of BlaC with clavulanate from irreversible inactivation to efficient hydrolysis. The substitution, which restored the canonical SDN motif (SDG→SDN), allowed for efficient hydrolysis of clavulanate, with a more than 104-fold increase in kcat (0.41 s−1), without affecting the hydrolysis of other β-lactams. Mass spectrometry revealed that acylation of BlaC and of its G132N variant by clavulanate follows similar paths, involving sequential formation of two acylenzymes. Decarboxylation of the first acylenzyme results in a stable secondary acylenzyme in BlaC, whereas hydrolysis occurs in the G132N variant. The SDN/SDG polymorphism defines two mycobacterial lineages comprising rapidly and slowly growing species, respectively. Together, these results suggest that the efficacy of β-lactam–clavulanate combinations may be limited by the emergence of resistance. β-Lactams active without clavulanate, such as faropenem, should be prioritized for the development of new therapies. PMID:26149997
An efficient technique for higher order fractional differential equation.
Ali, Ayyaz; Iqbal, Muhammad Asad; Ul-Hassan, Qazi Mahmood; Ahmad, Jamshad; Mohyud-Din, Syed Tauseef
2016-01-01
In this study, we establish exact solutions of fractional Kawahara equation by using the idea of [Formula: see text]-expansion method. The results of different studies show that the method is very effective and can be used as an alternative for finding exact solutions of nonlinear evolution equations (NLEEs) in mathematical physics. The solitary wave solutions are expressed by the hyperbolic, trigonometric, exponential and rational functions. Graphical representations along with the numerical data reinforce the efficacy of the used procedure. The specified idea is very effective, expedient for fractional PDEs, and could be extended to other physical problems.
Pharmer: efficient and exact pharmacophore search.
Koes, David Ryan; Camacho, Carlos J
2011-06-27
Pharmacophore search is a key component of many drug discovery efforts. Pharmer is a new computational approach to pharmacophore search that scales with the breadth and complexity of the query, not the size of the compound library being screened. Two novel methods for organizing pharmacophore data, the Pharmer KDB-tree and Bloom fingerprints, enable Pharmer to perform an exact pharmacophore search of almost two million structures in less than a minute. In general, Pharmer is more than an order of magnitude faster than existing technologies. The complete source code is available under an open-source license at http://pharmer.sourceforge.net .
Identifying DNA-binding proteins using structural motifs and the electrostatic potential
Shanahan, Hugh P.; Garcia, Mario A.; Jones, Susan; Thornton, Janet M.
2004-01-01
Robust methods to detect DNA-binding proteins from structures of unknown function are important for structural biology. This paper describes a method for identifying such proteins that (i) have a solvent accessible structural motif necessary for DNA-binding and (ii) a positive electrostatic potential in the region of the binding region. We focus on three structural motifs: helix–turn-helix (HTH), helix–hairpin–helix (HhH) and helix–loop–helix (HLH). We find that the combination of these variables detect 78% of proteins with an HTH motif, which is a substantial improvement over previous work based purely on structural templates and is comparable to more complex methods of identifying DNA-binding proteins. Similar true positive fractions are achieved for the HhH and HLH motifs. We see evidence of wide evolutionary diversity for DNA-binding proteins with an HTH motif, and much smaller diversity for those with an HhH or HLH motif. PMID:15356290
A Gibbs sampler for motif detection in phylogenetically close sequences
NASA Astrophysics Data System (ADS)
Siddharthan, Rahul; van Nimwegen, Erik; Siggia, Eric
2004-03-01
Genes are regulated by transcription factors that bind to DNA upstream of genes and recognize short conserved ``motifs'' in a random intergenic ``background''. Motif-finders such as the Gibbs sampler compare the probability of these short sequences being represented by ``weight matrices'' to the probability of their arising from the background ``null model'', and explore this space (analogous to a free-energy landscape). But closely related species may show conservation not because of functional sites but simply because they have not had sufficient time to diverge, so conventional methods will fail. We introduce a new Gibbs sampler algorithm that accounts for common ancestry when searching for motifs, while requiring minimal ``prior'' assumptions on the number and types of motifs, assessing the significance of detected motifs by ``tracking'' clusters that stay together. We apply this scheme to motif detection in sporulation-cycle genes in the yeast S. cerevisiae, using recent sequences of other closely-related Saccharomyces species.
Daithankar, Vidyadhar N; Farrell, Scott R; Thorpe, Colin
2009-06-09
Augmenter of liver regeneration (ALR) is both a growth factor and a sulfhydryl oxidase that binds FAD in an unusual helix-rich domain containing a redox-active CxxC disulfide proximal to the flavin ring. In addition to the cytokine form of ALR (sfALR) that circulates in serum, a longer form, lfALR, is believed to participate in oxidative trapping of reduced proteins entering the mitochondrial intermembrane space (IMS). This longer form has an 80-residue N-terminal extension containing an additional, distal, CxxC motif. This work presents the first enzymological characterization of human lfALR. The N-terminal region conveys no catalytic advantage toward the oxidation of the model substrate dithiothreitol (DTT). In addition, a C71A or C74A mutation of the distal disulfide does not increase the turnover number toward DTT. Unlike Erv1p, the yeast homologue of lfALR, static spectrophotometric experiments with the human oxidase provide no evidence of communication between distal and proximal disulfides. An N-terminal His-tagged version of human Mia40, a resident oxidoreductase of the IMS and a putative physiological reductant of lfALR, was subcloned and expressed in Escherichia coli BL21 DE3 cells. Mia40, as isolated, shows a visible spectrum characteristic of an Fe-S center and contains 0.56 +/- 0.02 atom of iron per subunit. Treatment of Mia40 with guanidine hydrochloride and triscarboxyethylphosphine hydrochloride during purification removed this chromophore. The resulting protein, with a reduced CxC motif, was a good substrate of lfALR. However, neither sfALR nor lfALR mutants lacking the distal disulfide could oxidize reduced Mia40 efficiently. Thus, catalysis involves a flow of reducing equivalents from the reduced CxC motif of Mia40 to distal and then proximal CxxC motifs of lfALR to the flavin ring and, finally, to cytochrome c or molecular oxygen.
Daithankar, Vidyadhar N.; Farrell, Scott R.; Thorpe, Colin
2009-01-01
Augmenter of liver regeneration (ALR) is both a growth factor and a sulfhydryl oxidase that binds FAD in an unusual helix-rich domain containing a redox-active CxxC disulfide proximal to the flavin ring. In addition to the cytokine form of ALR (sfALR) that circulates in serum, a longer form, lfALR, is believed to participate in oxidative trapping of reduced proteins entering the mitochondrial intermembrane space (IMS). This longer form has an 80-residue N-terminal extension containing an additional, distal, CxxC motif. This work presents the first enzymological characterization of human lfALR. The N-terminal region conveys no catalytic advantage towards the oxidation of the model substrate dithiothreitol (DTT). In addition, C71A or C74A mutations of the distal disulfide do not increase the turnover number towards DTT. Unlike Erv1p, the yeast homolog of lfALR, static spectrophotometric experiments of the human oxidase provide no evidence for communication between distal and proximal disulfides. An N-terminal his-tagged version of human Mia40, a resident oxidoreductase of the IMS and a putative physiological reductant of lfALR, was subcloned and expressed in Escherichia coli BL21 DE3 cells. Mia40, as isolated, shows a visible spectrum characteristic of an Fe/S center and contains 0.56 ± 0.02 atoms of iron per subunit. Treatment of Mia40 with guanidine hydrochloride and triscarboxyethylphosphine hydrochloride during purification removed this chromophore. The resulting protein, with a reduced CxC motif, was a good substrate of lfALR. However, neither sfALR, nor lfALR mutants lacking the distal disulfide, could oxidize reduced Mia40 efficiently. Thus, catalysis involves a flow of reducing equivalents from the reduced CxC motif of Mia40, to distal- and then proximal CxxC motifs of lfALR, to the flavin ring, and, finally, to cytochrome c or molecular oxygen. PMID:19397338
Huang, Xiaoqiang; Xue, Jing; Lin, Min; Zhu, Yushan
2016-01-01
Active site preorganization helps native enzymes electrostatically stabilize the transition state better than the ground state for their primary substrates and achieve significant rate enhancement. In this report, we hypothesize that a complex active site model for active site preorganization modeling should help to create preorganized active site design and afford higher starting activities towards target reactions. Our matching algorithm ProdaMatch was improved by invoking effective pruning strategies and the native active sites for ten scaffolds in a benchmark test set were reproduced. The root-mean squared deviations between the matched transition states and those in the crystal structures were < 1.0 Å for the ten scaffolds, and the repacking calculation results showed that 91% of the hydrogen bonds within the active sites are recovered, indicating that the active sites can be preorganized based on the predicted positions of transition states. The application of the complex active site model for de novo enzyme design was evaluated by scaffold selection using a classic catalytic triad motif for the hydrolysis of p-nitrophenyl acetate. Eighty scaffolds were identified from a scaffold library with 1,491 proteins and four scaffolds were native esterase. Furthermore, enzyme design for complicated substrates was investigated for the hydrolysis of cephalexin using scaffold selection based on two different catalytic motifs. Only three scaffolds were identified from the scaffold library by virtue of the classic catalytic triad-based motif. In contrast, 40 scaffolds were identified using a more flexible, but still preorganized catalytic motif, where one scaffold corresponded to the α-amino acid ester hydrolase that catalyzes the hydrolysis and synthesis of cephalexin. Thus, the complex active site modeling approach for de novo enzyme design with the aid of the improved ProdaMatch program is a promising approach for the creation of active sites with high catalytic efficiencies towards target reactions.
Huang, Xiaoqiang; Xue, Jing; Lin, Min; Zhu, Yushan
2016-01-01
Active site preorganization helps native enzymes electrostatically stabilize the transition state better than the ground state for their primary substrates and achieve significant rate enhancement. In this report, we hypothesize that a complex active site model for active site preorganization modeling should help to create preorganized active site design and afford higher starting activities towards target reactions. Our matching algorithm ProdaMatch was improved by invoking effective pruning strategies and the native active sites for ten scaffolds in a benchmark test set were reproduced. The root-mean squared deviations between the matched transition states and those in the crystal structures were < 1.0 Å for the ten scaffolds, and the repacking calculation results showed that 91% of the hydrogen bonds within the active sites are recovered, indicating that the active sites can be preorganized based on the predicted positions of transition states. The application of the complex active site model for de novo enzyme design was evaluated by scaffold selection using a classic catalytic triad motif for the hydrolysis of p-nitrophenyl acetate. Eighty scaffolds were identified from a scaffold library with 1,491 proteins and four scaffolds were native esterase. Furthermore, enzyme design for complicated substrates was investigated for the hydrolysis of cephalexin using scaffold selection based on two different catalytic motifs. Only three scaffolds were identified from the scaffold library by virtue of the classic catalytic triad-based motif. In contrast, 40 scaffolds were identified using a more flexible, but still preorganized catalytic motif, where one scaffold corresponded to the α-amino acid ester hydrolase that catalyzes the hydrolysis and synthesis of cephalexin. Thus, the complex active site modeling approach for de novo enzyme design with the aid of the improved ProdaMatch program is a promising approach for the creation of active sites with high catalytic efficiencies towards target reactions. PMID:27243223
Butler, David C.; Messer, Anne
2011-01-01
Huntington's disease (HD) is a fatal autosomal dominant neurodegenerative disorder caused by a trinucleotide (CAG)n repeat expansion in the coding sequence of the huntingtin gene, and an expanded polyglutamine (>37Q) tract in the protein. This results in misfolding and accumulation of huntingtin protein (htt), formation of neuronal intranuclear and cytoplasmic inclusions, and neuronal dysfunction/degeneration. Single-chain Fv antibodies (scFvs), expressed as intrabodies that bind htt and prevent aggregation, show promise as immunotherapeutics for HD. Intrastriatal delivery of anti-N-terminal htt scFv-C4 using an adeno-associated virus vector (AAV2/1) significantly reduces the size and number of aggregates in HDR6/1 transgenic mice; however, this protective effect diminishes with age and time after injection. We therefore explored enhancing intrabody efficacy via fusions to heterologous functional domains. Proteins containing a PEST motif are often targeted for proteasomal degradation and generally have a short half life. In ST14A cells, fusion of the C-terminal PEST region of mouse ornithine decarboxylase (mODC) to scFv-C4 reduces htt exon 1 protein fragments with 72 glutamine repeats (httex1-72Q) by ∼80–90% when compared to scFv-C4 alone. Proteasomal targeting was verified by either scrambling the mODC-PEST motif, or via proteasomal inhibition with epoxomicin. For these constructs, the proteasomal degradation of the scFv intrabody proteins themselves was reduced<25% by the addition of the mODC-PEST motif, with or without antigens. The remaining intrabody levels were amply sufficient to target N-terminal httex1-72Q protein fragment turnover. Critically, scFv-C4-PEST prevents aggregation and toxicity of httex1-72Q fragments at significantly lower doses than scFv-C4. Fusion of the mODC-PEST motif to intrabodies is a valuable general approach to specifically target toxic antigens to the proteasome for degradation. PMID:22216210
Ulfig, Agnes; Fröbel, Julia; Lausberg, Frank; Blümmel, Anne-Sophie; Heide, Anna Katharina; Müller, Matthias; Freudl, Roland
2017-06-30
The twin-arginine translocation (Tat) pathway transports folded proteins across bacterial membranes. Tat precursor proteins possess a conserved twin-arginine (RR) motif in their signal peptides that is involved in their binding to the Tat translocase, but some facets of this interaction remain unclear. Here, we investigated the role of the hydrophobic (h-) region of the Escherichia coli trimethylamine N -oxide reductase (TorA) signal peptide in TatBC receptor binding in vivo and in vitro We show that besides the RR motif, a minimal, functional h-region in the signal peptide is required for Tat-dependent export in Escherichia coli Furthermore, we identified mutations in the h-region that synergistically suppressed the export defect of a TorA[KQ]-30aa-MalE Tat reporter protein in which the RR motif was replaced with a lysine-glutamine pair. Strikingly, all suppressor mutations increased the hydrophobicity of the h-region. By systematically replacing a neutral residue in the h-region with various amino acids, we detected a positive correlation between the hydrophobicity of the h-region and the translocation efficiency of the resulting reporter variants. In vitro cross-linking of residues located in the periplasmically-oriented part of the TatBC receptor to TorA[KQ]-30aa-MalE reporter variants harboring a more hydrophobic h-region in their signal peptides confirmed that unlike in TorA[KQ]-30aa-MalE with an unaltered h-region, the mutated reporters moved deep into the TatBC-binding cavity. Our results clearly indicate that, besides the Tat motif, the h-region of the Tat signal peptides is another important binding determinant that significantly contributes to the productive interaction of Tat precursor proteins with the TatBC receptor complex. © 2017 by The American Society for Biochemistry and Molecular Biology, Inc.
Rutsdottir, Gudrun; Härmark, Johan; Weide, Yoran; Hebert, Hans; Rasmussen, Morten I; Wernersson, Sven; Respondek, Michal; Akke, Mikael; Højrup, Peter; Koeck, Philip J B; Söderberg, Christopher A G; Emanuelsson, Cecilia
2017-05-12
Small heat-shock proteins (sHsps) prevent aggregation of thermosensitive client proteins in a first line of defense against cellular stress. The mechanisms by which they perform this function have been hard to define due to limited structural information; currently, there is only one high-resolution structure of a plant sHsp published, that of the cytosolic Hsp16.9. We took interest in Hsp21, a chloroplast-localized sHsp crucial for plant stress resistance, which has even longer N-terminal arms than Hsp16.9, with a functionally important and conserved methionine-rich motif. To provide a framework for investigating structure-function relationships of Hsp21 and understanding these sequence variations, we developed a structural model of Hsp21 based on homology modeling, cryo-EM, cross-linking mass spectrometry, NMR, and small-angle X-ray scattering. Our data suggest a dodecameric arrangement of two trimer-of-dimer discs stabilized by the C-terminal tails, possibly through tail-to-tail interactions between the discs, mediated through extended I X V X I motifs. Our model further suggests that six N-terminal arms are located on the outside of the dodecamer, accessible for interaction with client proteins, and distinct from previous undefined or inwardly facing arms. To test the importance of the I X V X I motif, we created the point mutant V181A, which, as expected, disrupts the Hsp21 dodecamer and decreases chaperone activity. Finally, our data emphasize that sHsp chaperone efficiency depends on oligomerization and that client interactions can occur both with and without oligomer dissociation. These results provide a generalizable workflow to explore sHsps, expand our understanding of sHsp structural motifs, and provide a testable Hsp21 structure model to inform future investigations. © 2017 by The American Society for Biochemistry and Molecular Biology, Inc.
Farfán, Pamela; Lee, Jiyeon; Larios, Jorge; Sotelo, Pablo; Bu, Guojun; Marzolo, María-Paz
2013-01-01
Sorting nexin 17 (SNX17) is an adaptor protein present in EEA1-positive sorting endosomes that promotes the efficient recycling of low-density lipoprotein receptor-related protein 1 (LRP1) to the plasma membrane through recognition of the first NPxY motif in the cytoplasmic tail of this receptor. The interaction of LRP1 with SNX17 also regulates the basolateral recycling of the receptor from the basolateral sorting endosome (BSE). In contrast, megalin, which is apically distributed in polarized epithelial cells and localizes poorly to EEA1-positive sorting endosomes, does not interact with SNX17, despite containing three NPxY motifs, indicating that this motif is not sufficient for receptor recognition by SNX17. Here, we identified a cluster of 32 amino acids within the cytoplasmic domain of LRP1 that is both necessary and sufficient for SNX17 binding. To delineate the function of this SNX17-binding domain, we generated chimeric proteins in which the SNX17-binding domain was inserted into the cytoplasmic tail of megalin. This insertion mediated the binding of megalin to SNX17 and modified the cell surface expression and recycling of megalin in non-polarized cells. However, the polarized localization of chimeric megalin was not modified in polarized MDCK cells. These results provide evidence regarding the molecular and cellular mechanisms underlying the specificity of SNX17-binding receptors and the restricted function of SNX17 in the BSE. PMID:23593972
Krystkowiak, Izabella; Lenart, Jakub; Debski, Konrad; Kuterba, Piotr; Petas, Michal; Kaminska, Bozena; Dabrowski, Michal
2013-01-01
We present the Nencki Genomics Database, which extends the functionality of Ensembl Regulatory Build (funcgen) for the three species: human, mouse and rat. The key enhancements over Ensembl funcgen include the following: (i) a user can add private data, analyze them alongside the public data and manage access rights; (ii) inside the database, we provide efficient procedures for computing intersections between regulatory features and for mapping them to the genes. To Ensembl funcgen-derived data, which include data from ENCODE, we add information on conserved non-coding (putative regulatory) sequences, and on genome-wide occurrence of transcription factor binding site motifs from the current versions of two major motif libraries, namely, Jaspar and Transfac. The intersections and mapping to the genes are pre-computed for the public data, and the result of any procedure run on the data added by the users is stored back into the database, thus incrementally increasing the body of pre-computed data. As the Ensembl funcgen schema for the rat is currently not populated, our database is the first database of regulatory features for this frequently used laboratory animal. The database is accessible without registration using the mysql client: mysql -h database.nencki-genomics.org -u public. Registration is required only to add or access private data. A WSDL webservice provides access to the database from any SOAP client, including the Taverna Workbench with a graphical user interface.
MicroRNA categorization using sequence motifs and k-mers.
Yousef, Malik; Khalifa, Waleed; Acar, İlhan Erkin; Allmer, Jens
2017-03-14
Post-transcriptional gene dysregulation can be a hallmark of diseases like cancer and microRNAs (miRNAs) play a key role in the modulation of translation efficiency. Known pre-miRNAs are listed in miRBase, and they have been discovered in a variety of organisms ranging from viruses and microbes to eukaryotic organisms. The computational detection of pre-miRNAs is of great interest, and such approaches usually employ machine learning to discriminate between miRNAs and other sequences. Many features have been proposed describing pre-miRNAs, and we have previously introduced the use of sequence motifs and k-mers as useful ones. There have been reports of xeno-miRNAs detected via next generation sequencing. However, they may be contaminations and to aid that important decision-making process, we aimed to establish a means to differentiate pre-miRNAs from different species. To achieve distinction into species, we used one species' pre-miRNAs as the positive and another species' pre-miRNAs as the negative training and test data for the establishment of machine learned models based on sequence motifs and k-mers as features. This approach resulted in higher accuracy values between distantly related species while species with closer relation produced lower accuracy values. We were able to differentiate among species with increasing success when the evolutionary distance increases. This conclusion is supported by previous reports of fast evolutionary changes in miRNAs since even in relatively closely related species a fairly good discrimination was possible.
Krystkowiak, Izabella; Lenart, Jakub; Debski, Konrad; Kuterba, Piotr; Petas, Michal; Kaminska, Bozena; Dabrowski, Michal
2013-01-01
We present the Nencki Genomics Database, which extends the functionality of Ensembl Regulatory Build (funcgen) for the three species: human, mouse and rat. The key enhancements over Ensembl funcgen include the following: (i) a user can add private data, analyze them alongside the public data and manage access rights; (ii) inside the database, we provide efficient procedures for computing intersections between regulatory features and for mapping them to the genes. To Ensembl funcgen-derived data, which include data from ENCODE, we add information on conserved non-coding (putative regulatory) sequences, and on genome-wide occurrence of transcription factor binding site motifs from the current versions of two major motif libraries, namely, Jaspar and Transfac. The intersections and mapping to the genes are pre-computed for the public data, and the result of any procedure run on the data added by the users is stored back into the database, thus incrementally increasing the body of pre-computed data. As the Ensembl funcgen schema for the rat is currently not populated, our database is the first database of regulatory features for this frequently used laboratory animal. The database is accessible without registration using the mysql client: mysql –h database.nencki-genomics.org –u public. Registration is required only to add or access private data. A WSDL webservice provides access to the database from any SOAP client, including the Taverna Workbench with a graphical user interface. Database URL: http://www.nencki-genomics.org. PMID:24089456
RNA motif search with data-driven element ordering.
Rampášek, Ladislav; Jimenez, Randi M; Lupták, Andrej; Vinař, Tomáš; Brejová, Broňa
2016-05-18
In this paper, we study the problem of RNA motif search in long genomic sequences. This approach uses a combination of sequence and structure constraints to uncover new distant homologs of known functional RNAs. The problem is NP-hard and is traditionally solved by backtracking algorithms. We have designed a new algorithm for RNA motif search and implemented a new motif search tool RNArobo. The tool enhances the RNAbob descriptor language, allowing insertions in helices, which enables better characterization of ribozymes and aptamers. A typical RNA motif consists of multiple elements and the running time of the algorithm is highly dependent on their ordering. By approaching the element ordering problem in a principled way, we demonstrate more than 100-fold speedup of the search for complex motifs compared to previously published tools. We have developed a new method for RNA motif search that allows for a significant speedup of the search of complex motifs that include pseudoknots. Such speed improvements are crucial at a time when the rate of DNA sequencing outpaces growth in computing. RNArobo is available at http://compbio.fmph.uniba.sk/rnarobo .
CircularLogo: A lightweight web application to visualize intra-motif dependencies.
Ye, Zhenqing; Ma, Tao; Kalmbach, Michael T; Dasari, Surendra; Kocher, Jean-Pierre A; Wang, Liguo
2017-05-22
The sequence logo has been widely used to represent DNA or RNA motifs for more than three decades. Despite its intelligibility and intuitiveness, the traditional sequence logo is unable to display the intra-motif dependencies and therefore is insufficient to fully characterize nucleotide motifs. Many methods have been developed to quantify the intra-motif dependencies, but fewer tools are available for visualization. We developed CircularLogo, a web-based interactive application, which is able to not only visualize the position-specific nucleotide consensus and diversity but also display the intra-motif dependencies. Applying CircularLogo to HNF6 binding sites and tRNA sequences demonstrated its ability to show intra-motif dependencies and intuitively reveal biomolecular structure. CircularLogo is implemented in JavaScript and Python based on the Django web framework. The program's source code and user's manual are freely available at http://circularlogo.sourceforge.net . CircularLogo web server can be accessed from http://bioinformaticstools.mayo.edu/circularlogo/index.html . CircularLogo is an innovative web application that is specifically designed to visualize and interactively explore intra-motif dependencies.
Triadic motifs in the dependence networks of virtual societies.
Xie, Wen-Jie; Li, Ming-Xia; Jiang, Zhi-Qiang; Zhou, Wei-Xing
2014-06-10
In friendship networks, individuals have different numbers of friends, and the closeness or intimacy between an individual and her friends is heterogeneous. Using a statistical filtering method to identify relationships about who depends on whom, we construct dependence networks (which are directed) from weighted friendship networks of avatars in more than two hundred virtual societies of a massively multiplayer online role-playing game (MMORPG). We investigate the evolution of triadic motifs in dependence networks. Several metrics show that the virtual societies evolved through a transient stage in the first two to three weeks and reached a relatively stable stage. We find that the unidirectional loop motif (M9) is underrepresented and does not appear, open motifs are also underrepresented, while other close motifs are overrepresented. We also find that, for most motifs, the overall level difference of the three avatars in the same motif is significantly lower than average, whereas the sum of ranks is only slightly larger than average. Our findings show that avatars' social status plays an important role in the formation of triadic motifs.
Triadic motifs in the dependence networks of virtual societies
NASA Astrophysics Data System (ADS)
Xie, Wen-Jie; Li, Ming-Xia; Jiang, Zhi-Qiang; Zhou, Wei-Xing
2014-06-01
In friendship networks, individuals have different numbers of friends, and the closeness or intimacy between an individual and her friends is heterogeneous. Using a statistical filtering method to identify relationships about who depends on whom, we construct dependence networks (which are directed) from weighted friendship networks of avatars in more than two hundred virtual societies of a massively multiplayer online role-playing game (MMORPG). We investigate the evolution of triadic motifs in dependence networks. Several metrics show that the virtual societies evolved through a transient stage in the first two to three weeks and reached a relatively stable stage. We find that the unidirectional loop motif (M9) is underrepresented and does not appear, open motifs are also underrepresented, while other close motifs are overrepresented. We also find that, for most motifs, the overall level difference of the three avatars in the same motif is significantly lower than average, whereas the sum of ranks is only slightly larger than average. Our findings show that avatars' social status plays an important role in the formation of triadic motifs.
Triadic motifs in the dependence networks of virtual societies
Xie, Wen-Jie; Li, Ming-Xia; Jiang, Zhi-Qiang; Zhou, Wei-Xing
2014-01-01
In friendship networks, individuals have different numbers of friends, and the closeness or intimacy between an individual and her friends is heterogeneous. Using a statistical filtering method to identify relationships about who depends on whom, we construct dependence networks (which are directed) from weighted friendship networks of avatars in more than two hundred virtual societies of a massively multiplayer online role-playing game (MMORPG). We investigate the evolution of triadic motifs in dependence networks. Several metrics show that the virtual societies evolved through a transient stage in the first two to three weeks and reached a relatively stable stage. We find that the unidirectional loop motif (M9) is underrepresented and does not appear, open motifs are also underrepresented, while other close motifs are overrepresented. We also find that, for most motifs, the overall level difference of the three avatars in the same motif is significantly lower than average, whereas the sum of ranks is only slightly larger than average. Our findings show that avatars' social status plays an important role in the formation of triadic motifs. PMID:24912755
SLiMSearch 2.0: biological context for short linear motifs in proteins
Davey, Norman E.; Haslam, Niall J.; Shields, Denis C.
2011-01-01
Short, linear motifs (SLiMs) play a critical role in many biological processes. The SLiMSearch 2.0 (Short, Linear Motif Search) web server allows researchers to identify occurrences of a user-defined SLiM in a proteome, using conservation and protein disorder context statistics to rank occurrences. User-friendly output and visualizations of motif context allow the user to quickly gain insight into the validity of a putatively functional motif occurrence. For each motif occurrence, overlapping UniProt features and annotated SLiMs are displayed. Visualization also includes annotated multiple sequence alignments surrounding each occurrence, showing conservation and protein disorder statistics in addition to known and predicted SLiMs, protein domains and known post-translational modifications. In addition, enrichment of Gene Ontology terms and protein interaction partners are provided as indicators of possible motif function. All web server results are available for download. Users can search motifs against the human proteome or a subset thereof defined by Uniprot accession numbers or GO term. The SLiMSearch server is available at: http://bioware.ucd.ie/slimsearch2.html. PMID:21622654
PAM multiplicity marks genomic target sites as inhibitory to CRISPR-Cas9 editing.
Malina, Abba; Cameron, Christopher J F; Robert, Francis; Blanchette, Mathieu; Dostie, Josée; Pelletier, Jerry
2015-12-08
In CRISPR-Cas9 genome editing, the underlying principles for selecting guide RNA (gRNA) sequences that would ensure for efficient target site modification remain poorly understood. Here we show that target sites harbouring multiple protospacer adjacent motifs (PAMs) are refractory to Cas9-mediated repair in situ. Thus we refine which substrates should be avoided in gRNA design, implicating PAM density as a novel sequence-specific feature that inhibits in vivo Cas9-driven DNA modification.
PAM multiplicity marks genomic target sites as inhibitory to CRISPR-Cas9 editing
Malina, Abba; Cameron, Christopher J. F.; Robert, Francis; Blanchette, Mathieu; Dostie, Josée; Pelletier, Jerry
2015-01-01
In CRISPR-Cas9 genome editing, the underlying principles for selecting guide RNA (gRNA) sequences that would ensure for efficient target site modification remain poorly understood. Here we show that target sites harbouring multiple protospacer adjacent motifs (PAMs) are refractory to Cas9-mediated repair in situ. Thus we refine which substrates should be avoided in gRNA design, implicating PAM density as a novel sequence-specific feature that inhibits in vivo Cas9-driven DNA modification. PMID:26644285
Identifying novel sequence variants of RNA 3D motifs
Zirbel, Craig L.; Roll, James; Sweeney, Blake A.; Petrov, Anton I.; Pirrung, Meg; Leontis, Neocles B.
2015-01-01
Predicting RNA 3D structure from sequence is a major challenge in biophysics. An important sub-goal is accurately identifying recurrent 3D motifs from RNA internal and hairpin loop sequences extracted from secondary structure (2D) diagrams. We have developed and validated new probabilistic models for 3D motif sequences based on hybrid Stochastic Context-Free Grammars and Markov Random Fields (SCFG/MRF). The SCFG/MRF models are constructed using atomic-resolution RNA 3D structures. To parameterize each model, we use all instances of each motif found in the RNA 3D Motif Atlas and annotations of pairwise nucleotide interactions generated by the FR3D software. Isostericity relations between non-Watson–Crick basepairs are used in scoring sequence variants. SCFG techniques model nested pairs and insertions, while MRF ideas handle crossing interactions and base triples. We use test sets of randomly-generated sequences to set acceptance and rejection thresholds for each motif group and thus control the false positive rate. Validation was carried out by comparing results for four motif groups to RMDetect. The software developed for sequence scoring (JAR3D) is structured to automatically incorporate new motifs as they accumulate in the RNA 3D Motif Atlas when new structures are solved and is available free for download. PMID:26130723
Catania, Francesco; Lynch, Michael
2010-05-04
In protozoa, the identification of preserved motifs by comparative genomics is often impeded by difficulties to generate reliable alignments for non-coding sequences. Moreover, the evolutionary dynamics of regulatory elements in 3' untranslated regions (both in protozoa and metazoa) remains a virtually unexplored issue. By screening Paramecium tetraurelia's 3' untranslated regions for 8-mers that were previously found to be preserved in mammalian 3' UTRs, we detect and characterize a motif that is distinctly conserved in the ribosomal genes of this ciliate. The motif appears to be conserved across Paramecium aurelia species but is absent from the ribosomal genes of four additional non-Paramecium species surveyed, including another ciliate, Tetrahymena thermophila. Motif-free ribosomal genes retain fewer paralogs in the genome and appear to be lost more rapidly relative to motif-containing genes. Features associated with the discovered preserved motif are consistent with this 8-mer playing a role in post-transcriptional regulation. Our observations 1) shed light on the evolution of a putative regulatory motif across large phylogenetic distances; 2) are expected to facilitate the understanding of the modulation of ribosomal genes expression in Paramecium; and 3) reveal a largely unexplored--and presumably not restricted to Paramecium--association between the presence/absence of a DNA motif and the evolutionary fate of its host genes.
Multiple Dileucine-like Motifs Direct VGLUT1 Trafficking
Foss, Sarah M.; Li, Haiyan; Santos, Magda S.; Edwards, Robert H.
2013-01-01
The vesicular glutamate transporters (VGLUTs) package glutamate into synaptic vesicles, and the two principal isoforms VGLUT1 and VGLUT2 have been suggested to influence the properties of release. To understand how a VGLUT isoform might influence transmitter release, we have studied their trafficking and previously identified a dileucine-like endocytic motif in the C terminus of VGLUT1. Disruption of this motif impairs the activity-dependent recycling of VGLUT1, but does not eliminate its endocytosis. We now report the identification of two additional dileucine-like motifs in the N terminus of VGLUT1 that are not well conserved in the other isoforms. In the absence of all three motifs, rat VGLUT1 shows limited accumulation at synaptic sites and no longer responds to stimulation. In addition, shRNA-mediated knockdown of clathrin adaptor proteins AP-1 and AP-2 shows that the C-terminal motif acts largely via AP-2, whereas the N-terminal motifs use AP-1. Without the C-terminal motif, knockdown of AP-1 reduces the proportion of VGLUT1 that responds to stimulation. VGLUT1 thus contains multiple sorting signals that engage distinct trafficking mechanisms. In contrast to VGLUT1, the trafficking of VGLUT2 depends almost entirely on the conserved C-terminal dileucine-like motif: without this motif, a substantial fraction of VGLUT2 redistributes to the plasma membrane and the transporter's synaptic localization is disrupted. Consistent with these differences in trafficking signals, wild-type VGLUT1 and VGLUT2 differ in their response to stimulation. PMID:23804088
Multiple dileucine-like motifs direct VGLUT1 trafficking.
Foss, Sarah M; Li, Haiyan; Santos, Magda S; Edwards, Robert H; Voglmaier, Susan M
2013-06-26
The vesicular glutamate transporters (VGLUTs) package glutamate into synaptic vesicles, and the two principal isoforms VGLUT1 and VGLUT2 have been suggested to influence the properties of release. To understand how a VGLUT isoform might influence transmitter release, we have studied their trafficking and previously identified a dileucine-like endocytic motif in the C terminus of VGLUT1. Disruption of this motif impairs the activity-dependent recycling of VGLUT1, but does not eliminate its endocytosis. We now report the identification of two additional dileucine-like motifs in the N terminus of VGLUT1 that are not well conserved in the other isoforms. In the absence of all three motifs, rat VGLUT1 shows limited accumulation at synaptic sites and no longer responds to stimulation. In addition, shRNA-mediated knockdown of clathrin adaptor proteins AP-1 and AP-2 shows that the C-terminal motif acts largely via AP-2, whereas the N-terminal motifs use AP-1. Without the C-terminal motif, knockdown of AP-1 reduces the proportion of VGLUT1 that responds to stimulation. VGLUT1 thus contains multiple sorting signals that engage distinct trafficking mechanisms. In contrast to VGLUT1, the trafficking of VGLUT2 depends almost entirely on the conserved C-terminal dileucine-like motif: without this motif, a substantial fraction of VGLUT2 redistributes to the plasma membrane and the transporter's synaptic localization is disrupted. Consistent with these differences in trafficking signals, wild-type VGLUT1 and VGLUT2 differ in their response to stimulation.
Source polarization effects in an optical fiber fluorosensor
NASA Technical Reports Server (NTRS)
Egalon, Claudio O.; Rogowski, Robert S.
1992-01-01
The exact field solution of a step-index profile fiber was used to determine the injection efficiency of a thin-film distribution of polarized sources located in the cladding of an optical fiber. Previous results for random source orientation were confirmed. The behavior of the power efficiency, P(eff), of a polarized distribution of sources was found to be similar to the behavior of a fiber with sources with random orientation. However, for sources polarized in either the x or y direction, P(eff) was found to be more efficient.
Overexpression of TRIM25 in Lung Cancer Regulates Tumor Cell Progression.
Qin, Ying; Cui, He; Zhang, Hua
2016-10-01
Lung cancer is one of the most common causes of cancer-related deaths worldwide. Although great efforts and progressions have been made in the study of the lung cancer in the recent decades, the mechanism of lung cancer formation remains elusive. To establish effective therapeutic methods, new targets implied in lung cancer processes have to be identified. Tripartite motif-containing 25 has been associated with ovarian and breast cancer and is thought to positively promote cell growth by targeting the cell cycle. However, whether tripartite motif-containing 25 has a function in lung cancer development remains unknown. In this study, we found that tripartite motif-containing 25 was overexpressed in human lung cancer tissues. Expression of tripartite motif-containing 25 in lung cancer cells is important for cell proliferation and migration. Knockdown of tripartite motif-containing 25 markedly reduced proliferation of lung cancer cells both in vitro and in vivo and reduced migration of lung cancer cells in vitro Meanwhile, tripartite motif-containing 25 silencing also increased the sensitivity of doxorubicin and significantly increased death and apoptosis of lung cancer cells by doxorubicin were achieved with knockdown of tripartite motif-containing 25. We also observed that tripartite motif-containing 25 formed a complex with p53 and mouse double minute 2 homolog (MDM2) in both human lung cancer tissues and in lung cancer cells and tripartite motif-containing 25 silencing increased the expression of p53. These results provide evidence that tripartite motif-containing 25 contributes to the pathogenesis of lung cancer probably by promoting proliferation and migration of lung cancer cells. Therefore, targeting tripartite motif-containing 25 may provide a potential therapeutic intervention for lung cancer. © The Author(s) 2015.
MOCCS: Clarifying DNA-binding motif ambiguity using ChIP-Seq data.
Ozaki, Haruka; Iwasaki, Wataru
2016-08-01
As a key mechanism of gene regulation, transcription factors (TFs) bind to DNA by recognizing specific short sequence patterns that are called DNA-binding motifs. A single TF can accept ambiguity within its DNA-binding motifs, which comprise both canonical (typical) and non-canonical motifs. Clarification of such DNA-binding motif ambiguity is crucial for revealing gene regulatory networks and evaluating mutations in cis-regulatory elements. Although chromatin immunoprecipitation sequencing (ChIP-seq) now provides abundant data on the genomic sequences to which a given TF binds, existing motif discovery methods are unable to directly answer whether a given TF can bind to a specific DNA-binding motif. Here, we report a method for clarifying the DNA-binding motif ambiguity, MOCCS. Given ChIP-Seq data of any TF, MOCCS comprehensively analyzes and describes every k-mer to which that TF binds. Analysis of simulated datasets revealed that MOCCS is applicable to various ChIP-Seq datasets, requiring only a few minutes per dataset. Application to the ENCODE ChIP-Seq datasets proved that MOCCS directly evaluates whether a given TF binds to each DNA-binding motif, even if known position weight matrix models do not provide sufficient information on DNA-binding motif ambiguity. Furthermore, users are not required to provide numerous parameters or background genomic sequence models that are typically unavailable. MOCCS is implemented in Perl and R and is freely available via https://github.com/yuifu/moccs. By complementing existing motif-discovery software, MOCCS will contribute to the basic understanding of how the genome controls diverse cellular processes via DNA-protein interactions. Copyright © 2016 Elsevier Ltd. All rights reserved.
MOHANTY, BIJAYALAXMI; KRISHNAN, S. P. T.; SWARUP, SANJAY; BAJIC, VLADIMIR B.
2005-01-01
• Background and Aims Plants can suffer from oxygen limitation during flooding or more complete submergence and may therefore switch from Kreb's cycle respiration to fermentation in association with the expression of anaerobically inducible genes coding for enzymes involved in glycolysis and fermentation. The aim of this study was to clarify mechanisms of transcriptional regulation of these anaerobic genes by identifying motifs shared by their promoter regions. • Methods Statistically significant motifs were detected by an in silico method from 13 promoters of anaerobic genes. The selected motifs were common for the majority of analysed promoters. Their significance was evaluated by searching for their presence in transcription factor-binding site databases (TRANSFAC, PlantCARE and PLACE). Using several negative control data sets, it was tested whether the motifs found were specific to the anaerobic group. • Key Results Previously, anaerobic response elements have been identified in maize (Zea mays) and arabidopsis (Arabidopsis thaliana) genes. Known functional motifs were detected, such as GT and GC motifs, but also other motifs shared by most of the genes examined. Five motifs detected have not been found in plants hitherto but are present in the promoters of animal genes with various functions. The consensus sequences of these novel motifs are 5′-AAACAAA-3′, 5′-AGCAGC-3′, 5′-TCATCAC-3′, 5′-GTTT(A/C/T)GCAA-3′ and 5′-TTCCCTGTT-3′. • Conclusions It is believed that the promoter motifs identified could be functional by conferring anaerobic sensitivity to the genes that possess them. This proposal now requires experimental verification. PMID:16027132
Small Deletion Variants Have Stable Breakpoints Commonly Associated with Alu Elements
Coin, Lachlan J. M.; Steinfeld, Israel; Yakhini, Zohar; Sladek, Rob; Froguel, Philippe; Blakemore, Alexandra I. F.
2008-01-01
Copy number variants (CNVs) contribute significantly to human genomic variation, with over 5000 loci reported, covering more than 18% of the euchromatic human genome. Little is known, however, about the origin and stability of variants of different size and complexity. We investigated the breakpoints of 20 small, common deletions, representing a subset of those originally identified by array CGH, using Agilent microarrays, in 50 healthy French Caucasian subjects. By sequencing PCR products amplified using primers designed to span the deleted regions, we determined the exact size and genomic position of the deletions in all affected samples. For each deletion studied, all individuals carrying the deletion share identical upstream and downstream breakpoints at the sequence level, suggesting that the deletion event occurred just once and later became common in the population. This is supported by linkage disequilibrium (LD) analysis, which has revealed that most of the deletions studied are in moderate to strong LD with surrounding SNPs, and have conserved long-range haplotypes. Analysis of the sequences flanking the deletion breakpoints revealed an enrichment of microhomology at the breakpoint junctions. More significantly, we found an enrichment of Alu repeat elements, the overwhelming majority of which intersected deletion breakpoints at their poly-A tails. We found no enrichment of LINE elements or segmental duplications, in contrast to other reports. Sequence analysis revealed enrichment of a conserved motif in the sequences surrounding the deletion breakpoints, although whether this motif has any mechanistic role in the formation of some deletions has yet to be determined. Considered together with existing information on more complex inherited variant regions, and reports of de novo variants associated with autism, these data support the presence of different subgroups of CNV in the genome which may have originated through different mechanisms. PMID:18769679
Interactions of the C-terminal Domain of Human Ku70 with DNA Substrate: A Molecular Dynamics Study
NASA Technical Reports Server (NTRS)
Hu, Shaowen; Huff, Janice; Pluth, Janice M.; Cucinotta, Francis A.
2007-01-01
NASA is developing a systems biology approach to improve the assessment of health risks associated with space radiation. The primary toxic and mutagenic lesion following radiation exposure is the DNA double strand break (DSB), thus a model incorporating proteins and pathways important in response and repair of this lesion is critical. One key protein heterodimer for systems models of radiation effects is the Ku(sub 70/80) complex. The Ku70/80 complex is important in the initial binding of DSB ends following DNA damage, and is a component of nonhomologous end joining repair, the primary pathway for DSB repair in mammalian cells. The C-terminal domain of Ku70 (Ku70c, residues 559-609), contains an helix-extended strand-helix motif and similar motifs have been found in other nucleic acid-binding proteins critical for DNA repair. However, the exact mechanism of damage recognition and substrate specificity for the Ku heterodimer remains unclear in part due to the absence of a high-resolution structure of the Ku70c/DNA complex. We performed a series of molecular dynamics (MD) simulations on a system with the subunit Ku70c and a 14 base pairs DNA duplex, whose starting structures are designed to be variable so as to mimic their different binding modes. By analyzing conformational changes and energetic properties of the complex during MD simulations, we found that interactions are preferred at DNA ends, and within the major groove, which is consistent with previous experimental investigations. In addition, the results indicate that cooperation of Ku70c with other subunits of Ku(sub 70/80) is necessary to explain the high affinity of binding as observed in experiments.
Laver, John D; Li, Xiao; Ray, Debashish; Cook, Kate B; Hahn, Noah A; Nabeel-Shah, Syed; Kekis, Mariana; Luo, Hua; Marsolais, Alexander J; Fung, Karen Yy; Hughes, Timothy R; Westwood, J Timothy; Sidhu, Sachdev S; Morris, Quaid; Lipshitz, Howard D; Smibert, Craig A
2015-05-12
Brain tumor (BRAT) is a Drosophila member of the TRIM-NHL protein family. This family is conserved among metazoans and its members function as post-transcriptional regulators. BRAT was thought to be recruited to mRNAs indirectly through interaction with the RNA-binding protein Pumilio (PUM). However, it has recently been demonstrated that BRAT directly binds to RNA. The precise sequence recognized by BRAT, the extent of BRAT-mediated regulation, and the exact roles of PUM and BRAT in post-transcriptional regulation are unknown. Genome-wide identification of transcripts associated with BRAT or with PUM in Drosophila embryos shows that they bind largely non-overlapping sets of mRNAs. BRAT binds mRNAs that encode proteins associated with a variety of functions, many of which are distinct from those implemented by PUM-associated transcripts. Computational analysis of in vitro and in vivo data identified a novel RNA motif recognized by BRAT that confers BRAT-mediated regulation in tissue culture cells. The regulatory status of BRAT-associated mRNAs suggests a prominent role for BRAT in post-transcriptional regulation, including a previously unidentified role in transcript degradation. Transcriptomic analysis of embryos lacking functional BRAT reveals an important role in mediating the decay of hundreds of maternal mRNAs during the maternal-to-zygotic transition. Our results represent the first genome-wide analysis of the mRNAs associated with a TRIM-NHL protein and the first identification of an RNA motif bound by this protein family. BRAT is a prominent post-transcriptional regulator in the early embryo through mechanisms that are largely independent of PUM.
The EGFR family of receptors sensitizes cancer cells towards UV light
NASA Astrophysics Data System (ADS)
Petersen, Steffen; Neves-Petersen, Maria Teresa; Olsen, Birgitte
2008-02-01
A combination of bioinformatics, biophysical, advanced laser studies and cell biology lead to the realization that laser-pulsed UV light stops cancer growth and induces apoptosis. We have previously shown that laser-pulsed UV (LP-UV) illumination of two different skin-derived cancer cell lines both over expressing the EGF receptor, lead to arrest of the EGFR signaling pathway. We have investigated the available sequence and experimental 3D structures available in the Protein Data Bank. The EGF receptor contains a Furin like cystein rich extracellular domain. The cystein content is highly unusual, 25 disulphide bridges supports the 621 amino acid extracellular protein domain scaffold (1mb6.pdb). In two cases a tryptophan is neighboring a cystein in the primary sequence, which in itself is a rare observation. Aromatic residues is observed to be spatially close to all observed 25 disulphide bridges. The EGF receptor is often overexpressed in cancers and other proliferative skin disorders, it might be possible to significantly reduce the proliferative potential of these cells making them good targets for laser-pulsed UV-light treatment. The discovery that UV light can be used to open disulphide bridges in proteins upon illumination of nearby aromatic amino acids was the first step that lead to the hypothesis that UV light could modulate the structure and therefore the function of these key receptor proteins. The observation that membrane receptors (EGFR) contained exactly the motifs that are sensitive to UV light lead to the prediction that UV light could modify these receptors permanently and stop cancer proliferation. We hereby show that the EGFR family of receptors has the necessary structural motifs that make this family of proteins highly sensitive to UV light.
Manna, Moutusi; Mukhopadhyay, Chaitali
2013-01-01
Interactions of amyloid-β (Aβ) with neuronal membrane are associated with the progression of Alzheimer’s disease (AD). Ganglioside GM1 has been shown to promote the structural conversion of Aβ and increase the rate of peptide aggregation; but the exact nature of interaction driving theses processes remains to be explored. In this work, we have carried out atomistic-scale computer simulations (totaling 2.65 µs) to investigate the behavior of Aβ monomer and dimers in GM1-containing raft-like membrane. The oligosaccharide head-group of GM1 was observed to act as scaffold for Aβ-binding through sugar-specific interactions. Starting from the initial helical peptide conformation, a β-hairpin motif was formed at the C-terminus of the GM1-bound Aβ-monomer; that didn’t appear in absence of GM1 (both in fluid POPC and liquid-ordered cholesterol/POPC bilayers and also in aqueous medium) within the simulation time span. For Aβ-dimers, the β-structure was further enhanced by peptide-peptide interactions, which might influence the propensity of Aβ to aggregate into higher-ordered structures. The salt-bridges and inter-peptide hydrogen bonds were found to account for dimer stability. We observed spontaneous formation of intra-peptide D23-K28 salt-bridge and a turn at V24GSN27 region - long been accepted as characteristic structural-motifs for amyloid self-assembly. Altogether, our results provide atomistic details of Aβ-GM1 and Aβ-Aβ interactions and demonstrate their importance in the early-stages of GM1-mediated Aβ-oligomerisation on membrane surface. PMID:23951128
Exploring protein interiors: the role of a buried histidine in the KH module fold.
Fraternali, F; Amodeo, P; Musco, G; Nilges, M; Pastore, A
1999-03-01
The K-homology (KH) module is a novel RNA-binding motif. The structures of a representative KH motif from vigilin (vig-KH6) and of the first KH domain of fmr1 have been recently solved by nuclear magnetic resonance (NMR) and automated assignment-refinement techniques (ARIA). While a hydrophobic residue is found at position 21 in most of the KH modules, a buried His is conserved in all the 15 KH repeats of vigilin. This position must therefore have a key structural role in stabilizing the hydrophobic core. In the present work, we have addressed the following questions in order to obtain a detailed description of the role of His 21: i) what is the exact role of the histidine in the hydrophobic core of vig-KH6? ii) can we define the interactions that allow a conserved buried position to be occupied by a histidine both in vig-KH6 and in the whole vigilin KH sub-family? iii) how is the structure and stability of vig-KH6 influenced by the state of protonation of this histidine? To answer these questions, we have carried out an extensive refinement of the vig-KH6 structure using both an improved ARIA protocol starting from different initial structures and successively running restrained and unrestrained trajectories in water. An analysis of the stability of secondary structural elements, solvent accessibility, and hydrogen bonding patterns allows hypothesis on the structural role of residue His 21 and on the interactions that this residue forms with the environment. The importance of the protonation state of His 21 on the stability of the KH fold was addressed and validated by experimental results.
Bremel, Robert D.; Homan, E. Jane
2015-01-01
T-cell receptor binding to MHC-bound peptides plays a key role in discrimination between self and non-self. Only a subset, typically a pentamer, of amino acids in a MHC-bound peptide form the motif exposed to the T-cell receptor. We categorize and compare the T-cell exposed amino acid motif repertoire of the total proteomes of two groups of bacteria, comprising pathogens and gastrointestinal microbiome organisms, with the human proteome and immunoglobulins. Given the maximum 205, or 3.2 million of such motifs that bind T-cell receptors, there is considerable overlap in motif usage. We show that the human proteome, exclusive of immunoglobulins, only comprises three quarters of the possible motifs, of which 65.3% are also present in both composite bacterial proteomes. Very few motifs are unique to the human proteome. Immunoglobulin variable regions carry a broad diversity of T-cell exposed motifs (TCEMs) that provides a stratified random sample of the motifs found in pathogens, microbiome, and the human proteome. Individual bacterial genera and species vary in the content of immunoglobulin and human proteome matched motifs that they carry. Mycobacteria and Burkholderia spp carry a particularly high content of such matched motifs. Some bacteria retain a unique motif signature and motif sharing pattern with the human proteome. The implication is that distinguishing self from non-self does not depend on individual TCEMs, but on a complex and dynamic overlay of signals wherein the same TCEM may play different roles in different organisms, and the frequency with which a particular TCEM appears influences its effect. The patterns observed provide clues to bacterial immune evasion and to strategies for intervention, including vaccine design. The breadth and distinct frequency patterns of the immunoglobulin-derived peptides suggest a role of immunoglobulins in maintaining a broadly responsive T-cell repertoire. PMID:26557118
Prediction of virus-host protein-protein interactions mediated by short linear motifs.
Becerra, Andrés; Bucheli, Victor A; Moreno, Pedro A
2017-03-09
Short linear motifs in host organisms proteins can be mimicked by viruses to create protein-protein interactions that disable or control metabolic pathways. Given that viral linear motif instances of host motif regular expressions can be found by chance, it is necessary to develop filtering methods of functional linear motifs. We conduct a systematic comparison of linear motifs filtering methods to develop a computational approach for predicting motif-mediated protein-protein interactions between human and the human immunodeficiency virus 1 (HIV-1). We implemented three filtering methods to obtain linear motif sets: 1) conserved in viral proteins (C), 2) located in disordered regions (D) and 3) rare or scarce in a set of randomized viral sequences (R). The sets C,D,R are united and intersected. The resulting sets are compared by the number of protein-protein interactions correctly inferred with them - with experimental validation. The comparison is done with HIV-1 sequences and interactions from the National Institute of Allergy and Infectious Diseases (NIAID). The number of correctly inferred interactions allows to rank the interactions by the sets used to deduce them: D∪R and C. The ordering of the sets is descending on the probability of capturing functional interactions. With respect to HIV-1, the sets C∪R, D∪R, C∪D∪R infer all known interactions between HIV1 and human proteins mediated by linear motifs. We found that the majority of conserved linear motifs in the virus are located in disordered regions. We have developed a method for predicting protein-protein interactions mediated by linear motifs between HIV-1 and human proteins. The method only use protein sequences as inputs. We can extend the software developed to any other eukaryotic virus and host in order to find and rank candidate interactions. In future works we will use it to explore possible viral attack mechanisms based on linear motif mimicry.
Sebestyén, Endre; Nagy, Tibor; Suhai, Sándor; Barta, Endre
2009-01-01
Background The comparative genomic analysis of a large number of orthologous promoter regions of the chordate and plant genes from the DoOP databases shows thousands of conserved motifs. Most of these motifs differ from any known transcription factor binding site (TFBS). To identify common conserved motifs, we need a specific tool to be able to search amongst them. Since conserved motifs from the DoOP databases are linked to genes, the result of such a search can give a list of genes that are potentially regulated by the same transcription factor(s). Results We have developed a new tool called DoOPSearch for the analysis of the conserved motifs in the promoter regions of chordate or plant genes. We used the orthologous promoters of the DoOP database to extract thousands of conserved motifs from different taxonomic groups. The advantage of this approach is that different sets of conserved motifs might be found depending on how broad the taxonomic coverage of the underlying orthologous promoter sequence collection is (consider e.g. primates vs. mammals or Brassicaceae vs. Viridiplantae). The DoOPSearch tool allows the users to search these motif collections or the promoter regions of DoOP with user supplied query sequences or any of the conserved motifs from the DoOP database. To find overrepresented gene ontologies, the gene lists obtained can be analysed further using a modified version of the GeneMerge program. Conclusion We present here a comparative genomics based promoter analysis tool. Our system is based on a unique collection of conserved promoter motifs characteristic of different taxonomic groups. We offer both a command line and a web-based tool for searching in these motif collections using user specified queries. These can be either short promoter sequences or consensus sequences of known transcription factor binding sites. The GeneMerge analysis of the search results allows the user to identify statistically overrepresented Gene Ontology terms that might provide a clue on the function of the motifs and genes. PMID:19534755
Ellenberger, David; Friede, Tim
2016-08-05
Methods for change point (also sometimes referred to as threshold or breakpoint) detection in binary sequences are not new and were introduced as early as 1955. Much of the research in this area has focussed on asymptotic and exact conditional methods. Here we develop an exact unconditional test. An unconditional exact test is developed which assumes the total number of events as random instead of conditioning on the number of observed events. The new test is shown to be uniformly more powerful than Worsley's exact conditional test and means for its efficient numerical calculations are given. Adaptions of methods by Berger and Boos are made to deal with the issue that the unknown event probability imposes a nuisance parameter. The methods are compared in a Monte Carlo simulation study and applied to a cohort of patients undergoing traumatic orthopaedic surgery involving external fixators where a change in pin site infections is investigated. The unconditional test controls the type I error rate at the nominal level and is uniformly more powerful than (or to be more precise uniformly at least as powerful as) Worsley's exact conditional test which is very conservative for small sample sizes. In the application a beneficial effect associated with the introduction of a new treatment procedure for pin site care could be revealed. We consider the new test an effective and easy to use exact test which is recommended in small sample size change point problems in binary sequences.
Scheltema, Richard Alexander; Hauschild, Jan-Peter; Lange, Oliver; Hornburg, Daniel; Denisov, Eduard; Damoc, Eugen; Kuehn, Andreas; Makarov, Alexander; Mann, Matthias
2014-01-01
The quadrupole Orbitrap mass spectrometer (Q Exactive) made a powerful proteomics instrument available in a benchtop format. It significantly boosted the number of proteins analyzable per hour and has now evolved into a proteomics analysis workhorse for many laboratories. Here we describe the Q Exactive Plus and Q Exactive HF mass spectrometers, which feature several innovations in comparison to the original Q Exactive instrument. A low-resolution pre-filter has been implemented within the injection flatapole, preventing unwanted ions from entering deep into the system, and thereby increasing its robustness. A new segmented quadrupole, with higher fidelity of isolation efficiency over a wide range of isolation windows, provides an almost 2-fold improvement of transmission at narrow isolation widths. Additionally, the Q Exactive HF has a compact Orbitrap analyzer, leading to higher field strength and almost doubling the resolution at the same transient times. With its very fast isolation and fragmentation capabilities, the instrument achieves overall cycle times of 1 s for a top 15 to 20 higher energy collisional dissociation method. We demonstrate the identification of 5000 proteins in standard 90-min gradients of tryptic digests of mammalian cell lysate, an increase of over 40% for detected peptides and over 20% for detected proteins. Additionally, we tested the instrument on peptide phosphorylation enriched samples, for which an improvement of up to 60% class I sites was observed. PMID:25360005
Identification and preliminary characterization of a protein motif related to the zinc finger.
Lovering, R; Hanson, I M; Borden, K L; Martin, S; O'Reilly, N J; Evan, G I; Rahman, D; Pappin, D J; Trowsdale, J; Freemont, P S
1993-01-01
We have identified a protein motif, related to the zinc finger, which defines a newly discovered family of proteins. The motif was found in the sequence of the human RING1 gene, which is proximal to the major histocompatibility complex region on chromosome six. We propose naming this motif the "RING finger" and it is found in 27 proteins, all of which have putative DNA binding functions. We have synthesized a peptide corresponding to the RING1 motif and examined a number of properties, including metal and DNA binding. We provide evidence to support the suggestion that the RING finger motif is the DNA binding domain of this newly defined family of proteins. Images Fig. 1 Fig. 4 PMID:7681583
Duque, Hernando; Baxt, Barry
2003-01-01
Three members of the αV integrin family of cellular receptors, αVβ1, αVβ3, and αVβ6, have been identified as receptors for foot-and-mouth disease virus (FMDV) in vitro. The virus interacts with these receptors via a highly conserved arginine-glycine-aspartic acid (RGD) amino acid sequence motif located within the βG-βH (G-H) loop of VP1. Other αV integrins, as well as several other integrins, recognize and bind to RGD motifs on their natural ligands and also may be candidate receptors for FMDV. To analyze the roles of the αV integrins from a susceptible species as viral receptors, we molecularly cloned the bovine β1, β5, and β6 integrin subunits. Using these subunits, along with previously cloned bovine αV and β3 subunits, in a transient expression assay system, we compared the efficiencies of infection mediated by αVβ1, αVβ3, αVβ5, and αVβ6 among three strains of FMDV serotype A and two strains of serotype O. While all the viruses could infect cells expressing these integrins, they exhibited different efficiencies of integrin utilization. All the type A viruses used αVβ3 and αVβ6 with relatively high efficiency, while only one virus utilized αVβ1 with moderate efficiency. In contrast, both type O viruses utilized αVβ6 and αVβ1 with higher efficiency than αVβ3. Only low levels of viral replication were detected in αVβ5-expressing cells infected with either serotype. Experiments in which the ligand-binding domains among the β subunits were exchanged indicated that this region of the integrin subunit appears to contribute to the differences in integrin utilizations among strains. In contrast, the G-H loops of the different viruses do not appear to be involved in this phenomenon. Thus, the ability of the virus to utilize multiple integrins in vitro may be a reflection of the use of multiple receptors during the course of infection within the susceptible host. PMID:12551988
Two-lane traffic-flow model with an exact steady-state solution.
Kanai, Masahiro
2010-12-01
We propose a stochastic cellular-automaton model for two-lane traffic flow based on the misanthrope process in one dimension. The misanthrope process is a stochastic process allowing for an exact steady-state solution; hence, we have an exact flow-density diagram for two-lane traffic. In addition, we introduce two parameters that indicate, respectively, driver's driving-lane preference and passing-lane priority. Due to the additional parameters, the model shows a deviation of the density ratio for driving-lane use and a biased lane efficiency in flow. Then, a mean-field approach explicitly describes the asymmetric flow by the hop rates, the driving-lane preference, and the passing-lane priority. Meanwhile, the simulation results are in good agreement with an observational data, and we thus estimate these parameters. We conclude that the proposed model successfully produces two-lane traffic flow particularly with the driving-lane preference and the passing-lane priority.
Exact geodesic distances in FLRW spacetimes
NASA Astrophysics Data System (ADS)
Cunningham, William J.; Rideout, David; Halverson, James; Krioukov, Dmitri
2017-11-01
Geodesics are used in a wide array of applications in cosmology and astrophysics. However, it is not a trivial task to efficiently calculate exact geodesic distances in an arbitrary spacetime. We show that in spatially flat (3 +1 )-dimensional Friedmann-Lemaître-Robertson-Walker (FLRW) spacetimes, it is possible to integrate the second-order geodesic differential equations, and derive a general method for finding both timelike and spacelike distances given initial-value or boundary-value constraints. In flat spacetimes with either dark energy or matter, whether dust, radiation, or a stiff fluid, we find an exact closed-form solution for geodesic distances. In spacetimes with a mixture of dark energy and matter, including spacetimes used to model our physical universe, there exists no closed-form solution, but we provide a fast numerical method to compute geodesics. A general method is also described for determining the geodesic connectedness of an FLRW manifold, provided only its scale factor.
Colombo, Miriam; Fiandra, Luisa; Alessio, Giulia; Mazzucchelli, Serena; Nebuloni, Manuela; De Palma, Clara; Kantner, Karsten; Pelaz, Beatriz; Rotem, Rany; Corsi, Fabio; Parak, Wolfgang J.; Prosperi, Davide
2016-01-01
Active targeting of nanoparticles to tumours can be achieved by conjugation with specific antibodies. Specific active targeting of the HER2 receptor is demonstrated in vitro and in vivo with a subcutaneous MCF-7 breast cancer mouse model with trastuzumab-functionalized gold nanoparticles. The number of attached antibodies per nanoparticle was precisely controlled in a way that each nanoparticle was conjugated with either exactly one or exactly two antibodies. As expected, in vitro we found a moderate increase in targeting efficiency of nanoparticles with two instead of just one antibody attached per nanoparticle. However, the in vivo data demonstrate that best effect is obtained for nanoparticles with only exactly one antibody. There is indication that this is based on a size-related effect. These results highlight the importance of precisely controlling the ligand density on the nanoparticle surface for optimizing active targeting, and that less antibodies can exhibit more effect. PMID:27991503
NASA Astrophysics Data System (ADS)
Mu, Wanlu; Li, Xiaowei; Wang, Longfei; Chen, Yong; Wu, Yanchao
2017-08-01
An efficient aerobic oxidative annulation of cyclohexanones and 2-aminophenyl ketones approach to substituted acridines, a structural motif for a large number of pharmaceuticals and functional materials is described. The key feature of this method is the use of oxygen as the sole oxidant and Pd catalyst, which resulting in the high regioselectivity with unsymmetrical meta-substituted cyclohexanones. The electron gap of the global redox condensation process is filled and the reaction efficiency is significantly promoted by O2 as a redox moderator. This protocol possesses many advantages such as using O2 as a cheap and nonhazardous oxidant, high regioselectivity and water as the only by-product, which meet the principle of green chemistry.
Excitation efficiency of an optical fiber core source
NASA Technical Reports Server (NTRS)
Egalon, Claudio O.; Rogowski, Robert S.; Tai, Alan C.
1992-01-01
The exact field solution of a step-index profile fiber is used to determine the excitation efficiency of a distribution of sources in the core of an optical fiber. Previous results of a thin-film cladding source distribution to its core source counterpart are used for comparison. The behavior of power efficiency with the fiber parameters is examined and found to be similar to the behavior exhibited by cladding sources. It is also found that a core-source fiber is two orders of magnitude more efficient than a fiber with a bulk distribution of cladding sources. This result agrees qualitatively with previous ones obtained experimentally.
Inhibition of human papillomavirus expression using DNAzymes.
Benítez-Hess, María Luisa; Reyes-Gutiérrez, Pablo; Alvarez-Salas, Luis Marat
2011-01-01
Deoxyribozymes (DXZs) are catalytic oligodeoxynucleotides capable of performing diverse functions including the specific cleavage of a target RNA. These molecules represent a new type of therapeutic oligonucleotides combining the efficiency of ribozymes and the intracellular endurance and simplicity of modified antisense oligonucleotides. Commonly used DXZs include the 8-17 and 10-23 motifs, which have been engineered to destroy disease-associated genes with remarkable efficiency. Targeting DXZs to disease-associated transcripts requires extensive biochemical testing to establish target RNA accessibility, catalytic efficiency, and nuclease sensibility. The usage of modified nucleotides to render nuclease-resistance DXZs must be counterweighted against deleterious consequences on catalytic activity. Further intracellular testing is required to establish the effect of microenvironmental conditions on DXZ activity and off-target issues. Application of modified DXZs to cervical cancer results in specific growth inhibition, cell death, and apoptosis. Thus, DXZs represent a highly effective antisense moiety with minimal secondary effects.
Brendolise, Cyril; Espley, Richard V; Lin-Wang, Kui; Laing, William; Peng, Yongyan; McGhie, Tony; Dejnoprat, Supinya; Tomes, Sumathi; Hellens, Roger P; Allan, Andrew C
2017-01-01
In apple, the MYB transcription factor MYB10 controls the accumulation of anthocyanins. MYB10 is able to auto-activate its expression by binding its own promoter at a specific motif, the R1 motif. In some apple accessions a natural mutation, termed R6, has more copies of this motif within the MYB10 promoter resulting in stronger auto-activation and elevated anthocyanins. Here we show that other anthocyanin-related MYBs selected from apple, pear, strawberry, petunia, kiwifruit and Arabidopsis are able to activate promoters containing the R6 motif. To examine the specificity of this motif, members of the R2R3 MYB family were screened against a promoter harboring the R6 mutation. Only MYBs from subgroups 5 and 6 activate expression by binding the R6 motif, with these MYBs sharing conserved residues in their R2R3 DNA binding domains. Insertion of the apple R6 motif into orthologous promoters of MYB10 in pear ( PcMYB10 ) and Arabidopsis ( AtMY75 ) elevated anthocyanin levels. Introduction of the R6 motif into the promoter region of an anthocyanin biosynthetic enzyme F3'5'H of kiwifruit imparts regulation by MYB10. This results in elevated levels of delphinidin in both tobacco and kiwifruit. Finally, an R6 motif inserted into the promoter the vitamin C biosynthesis gene GDP-L-Gal phosphorylase increases vitamin C content in a MYB10-dependent manner. This motif therefore provides a tool to re-engineer novel MYB-regulated responses in plants.
Evolutionary dynamics of a conserved sequence motif in the ribosomal genes of the ciliate Paramecium
2010-01-01
Background In protozoa, the identification of preserved motifs by comparative genomics is often impeded by difficulties to generate reliable alignments for non-coding sequences. Moreover, the evolutionary dynamics of regulatory elements in 3' untranslated regions (both in protozoa and metazoa) remains a virtually unexplored issue. Results By screening Paramecium tetraurelia's 3' untranslated regions for 8-mers that were previously found to be preserved in mammalian 3' UTRs, we detect and characterize a motif that is distinctly conserved in the ribosomal genes of this ciliate. The motif appears to be conserved across Paramecium aurelia species but is absent from the ribosomal genes of four additional non-Paramecium species surveyed, including another ciliate, Tetrahymena thermophila. Motif-free ribosomal genes retain fewer paralogs in the genome and appear to be lost more rapidly relative to motif-containing genes. Features associated with the discovered preserved motif are consistent with this 8-mer playing a role in post-transcriptional regulation. Conclusions Our observations 1) shed light on the evolution of a putative regulatory motif across large phylogenetic distances; 2) are expected to facilitate the understanding of the modulation of ribosomal genes expression in Paramecium; and 3) reveal a largely unexplored--and presumably not restricted to Paramecium--association between the presence/absence of a DNA motif and the evolutionary fate of its host genes. PMID:20441586
Dienogest inhibits C-C motif chemokine ligand 20 expression in human endometriotic epithelial cells.
Mita, Shizuka; Nakakuki, Masanori; Ichioka, Masayuki; Shimizu, Yutaka; Hashiba, Masamichi; Miyazaki, Hiroyasu; Kyo, Satoru
2017-07-01
C-C motif chemokine ligand 20 is thought to contribute to the development of endometriosis by recruiting Th17 lymphocytes into endometriotic foci. The present study investigated the effects of dienogest, a progesterone receptor agonist used to treat endometriosis, on C-C motif chemokine ligand 20 expression by endometriotic cells. Effects of dienogest on mRNA expression and protein secretion of C-C motif chemokine ligand 20 induced by interleukin 1β were assessed in three immortalized endometriotic epithelial cell lines, parental cells (EMosis-CC/TERT1), and stably expressing human progesterone receptor isoform A (EMosis-CC/TERT1/PRA+) or isoform B (EMosis-CC/TERT1/PRA-/PRB+). Dienogest markedly inhibited interleukin 1β-stimulated C-C motif chemokine ligand 20 mRNA expression and protein secretion in EMosis-CC/TERT1/PRA-/PRB+, which was abrogated by the progesterone receptor antagonist RU486. In EMosis-CC/TERT1/PRA+, dienogest slightly inhibited C-C motif chemokine ligand 20 mRNA and protein. In EMosis-CC/TERT1, dienogest slightly inhibited C-C motif chemokine ligand 20 mRNA, but had no effect on C-C motif chemokine ligand 20 protein. Dienogest inhibited interleukin 1β-induced up-regulation of C-C motif chemokine ligand 20 in endometriotic epithelial cells, mainly mediated by progesterone receptor B. Copyright © 2017 Elsevier B.V. All rights reserved.
Transcriptional regulation of Saccharomyces cerevisiaeCYS3 encoding cystathionine γ-lyase
Hiraishi, Hiroyuki; Miyake, Tsuyoshi
2008-01-01
In studying the regulation of GSH11, the structural gene of the high-affinity glutathione transporter (GSH-P1) in Saccharomyces cerevisiae, a cis-acting cysteine responsive element, CCGCCACAC (CCG motif), was detected. Like GSH-P1, the cystathionine γ-lyase encoded by CYS3 is induced by sulfur starvation and repressed by addition of cysteine to the growth medium. We detected a CCG motif (−311 to −303) and a CGC motif (CGCCACAC; −193 to −186), which is one base shorter than the CCG motif, in the 5′-upstream region of CYS3. One copy of the centromere determining element 1, CDE1 (TCACGTGA; −217 to −210), being responsible for regulation of the sulfate assimilation pathway genes, was also detected. We tested the roles of these three elements in the regulation of CYS3. Using a lacZ-reporter assay system, we found that the CCG/CGC motif is required for activation of CYS3, as well as for its repression by cysteine. In contrast, the CDE1 motif was responsible for only activation of CYS3. We also found that two transcription factors, Met4 and VDE, are responsible for activation of CYS3 through the CCG/CGC and CDE1 motifs. These observations suggest a dual regulation of CYS3 by factors that interact with the CDE1 motif and the CCG/CGC motifs. PMID:18317767
Wang, Lilin; Smith, Dan; Bot, Simona; Dellamary, Luis; Bloom, Amy; Bot, Adrian
2002-01-01
The adaptive immune response is triggered by recognition of T and B cell epitopes and is influenced by “danger” motifs that act via innate immune receptors. This study shows that motifs associated with noncoding RNA are essential features in the immune response reminiscent of viral infection, mediating rapid induction of proinflammatory chemokine expression, recruitment and activation of antigen-presenting cells, modulation of regulatory cytokines, subsequent differentiation of Th1 cells, isotype switching, and stimulation of cross-priming. The heterogeneity of RNA-associated motifs results in differential binding to cellular receptors, and specifically impacts the immune profile. Naturally occurring double-stranded RNA (dsRNA) triggered activation of dendritic cells and enhancement of specific immunity, similar to selected synthetic dsRNA motifs. Based on the ability of specific RNA motifs to block tolerance induction and effectively organize the immune defense during viral infection, we conclude that such RNA species are potent danger motifs. We also demonstrate the feasibility of using selected RNA motifs as adjuvants in the context of novel aerosol carriers for optimizing the immune response to subunit vaccines. In conclusion, RNA-associated motifs produced during viral infection bridge the early response with the late adaptive phase, regulating the activation and differentiation of antigen-specific B and T cells, in addition to a short-term impact on innate immunity. PMID:12393853
The Methionine-aromatic Motif Plays a Unique Role in Stabilizing Protein Structure*
Valley, Christopher C.; Cembran, Alessandro; Perlmutter, Jason D.; Lewis, Andrew K.; Labello, Nicholas P.; Gao, Jiali; Sachs, Jonathan N.
2012-01-01
Of the 20 amino acids, the precise function of methionine (Met) remains among the least well understood. To establish a determining characteristic of methionine that fundamentally differentiates it from purely hydrophobic residues, we have used in vitro cellular experiments, molecular simulations, quantum calculations, and a bioinformatics screen of the Protein Data Bank. We show that approximately one-third of all known protein structures contain an energetically stabilizing Met-aromatic motif and, remarkably, that greater than 10,000 structures contain this motif more than 10 times. Critically, we show that as compared with a purely hydrophobic interaction, the Met-aromatic motif yields an additional stabilization of 1–1.5 kcal/mol. To highlight its importance and to dissect the energetic underpinnings of this motif, we have studied two clinically relevant TNF ligand-receptor complexes, namely TRAIL-DR5 and LTα-TNFR1. In both cases, we show that the motif is necessary for high affinity ligand binding as well as function. Additionally, we highlight previously overlooked instances of the motif in several disease-related Met mutations. Our results strongly suggest that the Met-aromatic motif should be exploited in the rational design of therapeutics targeting a range of proteins. PMID:22859300
MotifMark: Finding regulatory motifs in DNA sequences.
Hassanzadeh, Hamid Reza; Kolhe, Pushkar; Isbell, Charles L; Wang, May D
2017-07-01
The interaction between proteins and DNA is a key driving force in a significant number of biological processes such as transcriptional regulation, repair, recombination, splicing, and DNA modification. The identification of DNA-binding sites and the specificity of target proteins in binding to these regions are two important steps in understanding the mechanisms of these biological activities. A number of high-throughput technologies have recently emerged that try to quantify the affinity between proteins and DNA motifs. Despite their success, these technologies have their own limitations and fall short in precise characterization of motifs, and as a result, require further downstream analysis to extract useful and interpretable information from a haystack of noisy and inaccurate data. Here we propose MotifMark, a new algorithm based on graph theory and machine learning, that can find binding sites on candidate probes and rank their specificity in regard to the underlying transcription factor. We developed a pipeline to analyze experimental data derived from compact universal protein binding microarrays and benchmarked it against two of the most accurate motif search methods. Our results indicate that MotifMark can be a viable alternative technique for prediction of motif from protein binding microarrays and possibly other related high-throughput techniques.
Deletion of transcription factor binding motifs using the CRISPR/spCas9 system in the β-globin LCR.
Kim, Yea Woon; Kim, AeRi
2017-07-20
Transcription factors play roles in gene transcription through direct binding to their motifs in genome, and inhibiting this binding provides an effective strategy for studying their roles. Here we applied the CRISPR/spCas9 system to mutate the binding motifs of transcription factors. Binding motifs for erythroid specific transcription factors were mutated in the locus control region hypersensitive sites of the human β-globin locus. Guide RNAs targeting binding motifs were cloned into lentiviral CRISPR vector containing the spCas9 gene, and transduced into MEL/ch11 cells carrying a human chromosome 11. DNA mutations in clonal cells were initially screened by quantitative PCR in genomic DNA and then clarified by sequencing. Mutations in binding motifs reduced occupancy by transcription factors in a chromatin environment. Characterization of mutations revealed that the CRISPR/spCas9 system mainly induced deletions in short regions of <20 bp and preferentially deleted nucleotides around the fifth nucleotide upstream of Protospacer adjacent motifs. These results indicate that the CRISPR/Cas9 system is suitable for mutating the binding motifs of transcription factors, and, consequently, would contribute to elucidate the direct roles of transcription factors. ©2017 The Author(s).
Wang, Jichao; Zhang, Tongchuan; Liu, Ruicun; Song, Meilin; Wang, Juncheng; Hong, Jiong; Chen, Quan; Liu, Haiyan
2017-02-01
An interesting way of generating novel artificial proteins is to combine sequence motifs from natural proteins, mimicking the evolutionary path suggested by natural proteins comprising recurring motifs. We analyzed the βα and αβ modules of TIM barrel proteins by structure alignment-based sequence clustering. A number of preferred motifs were identified. A chimeric TIM was designed by using recurring elements as mutually compatible interfaces. The foldability of the designed TIM protein was then significantly improved by six rounds of directed evolution. The melting temperature has been improved by more than 20°C. A variety of characteristics suggested that the resulting protein is well-folded. Our analysis provided a library of peptide motifs that is potentially useful for different protein engineering studies. The protein engineering strategy of using recurring motifs as interfaces to connect partial natural proteins may be applied to other protein folds. Copyright © 2016 Elsevier B.V. All rights reserved.
Discriminative motif discovery via simulated evolution and random under-sampling.
Song, Tao; Gu, Hong
2014-01-01
Conserved motifs in biological sequences are closely related to their structure and functions. Recently, discriminative motif discovery methods have attracted more and more attention. However, little attention has been devoted to the data imbalance problem, which is one of the main reasons affecting the performance of the discriminative models. In this article, a simulated evolution method is applied to solve the multi-class imbalance problem at the stage of data preprocessing, and at the stage of Hidden Markov Models (HMMs) training, a random under-sampling method is introduced for the imbalance between the positive and negative datasets. It is shown that, in the task of discovering targeting motifs of nine subcellular compartments, the motifs found by our method are more conserved than the methods without considering data imbalance problem and recover the most known targeting motifs from Minimotif Miner and InterPro. Meanwhile, we use the found motifs to predict protein subcellular localization and achieve higher prediction precision and recall for the minority classes.
Tran, Tuan; Disney, Matthew D
2012-01-01
RNA is an important therapeutic target but information about RNA-ligand interactions is limited. Here, we report a screening method that probes over 3,000,000 combinations of RNA motif-small molecule interactions to identify the privileged RNA structures and chemical spaces that interact. Specifically, a small molecule library biased for binding RNA was probed for binding to over 70,000 unique RNA motifs in a high throughput solution-based screen. The RNA motifs that specifically bind each small molecule were identified by microarray-based selection. In this library-versus-library or multidimensional combinatorial screening approach, hairpin loops (among a variety of RNA motifs) were the preferred RNA motif space that binds small molecules. Furthermore, it was shown that indole, 2-phenyl indole, 2-phenyl benzimidazole and pyridinium chemotypes allow for specific recognition of RNA motifs. As targeting RNA with small molecules is an extremely challenging area, these studies provide new information on RNA-ligand interactions that has many potential uses.
Tran, Tuan; Disney, Matthew D.
2012-01-01
RNA is an important therapeutic target but information about RNA-ligand interactions is limited. Here we report a screening method that probes over 3,000,000 combinations of RNA motif-small molecule interactions to identify the privileged RNA structures and chemical spaces that interact. Specifically, a small molecule library biased for binding RNA was probed for binding to over 70,000 unique RNA motifs in a high throughput solution-based screen. The RNA motifs that specifically bind each small molecule were identified by microarray-based selection. In this library-versus-library or multidimensional combinatorial screening approach, hairpin loops (amongst a variety of RNA motifs) were the preferred RNA motif space that binds small molecules. Furthermore, it was shown that indole, 2-phenyl indole, 2-phenyl benzimidazole, and pyridinium chemotypes allow for specific recognition of RNA motifs. Since targeting RNA with small molecules is an extremely challenging area, these studies provide new information on RNA-ligand interactions that has many potential uses. PMID:23047683
Effect of C(60) fullerene on the duplex formation of i-motif DNA with complementary DNA in solution.
Jin, Kyeong Sik; Shin, Su Ryon; Ahn, Byungcheol; Jin, Sangwoo; Rho, Yecheol; Kim, Heesoo; Kim, Seon Jeong; Ree, Moonhor
2010-04-15
The structural effects of fullerene on i-motif DNA were investigated by characterizing the structures of fullerene-free and fullerene-bound i-motif DNA, in the presence of cDNA and in solutions of varying pH, using circular dichroism and synchrotron small-angle X-ray scattering. To facilitate a direct structural comparison between the i-motif and duplex structures in response to pH stimulus, we developed atomic scale structural models for the duplex and i-motif DNA structures, and for the C(60)/i-motif DNA hybrid associated with the cDNA strand, assuming that the DNA strands are present in an ideal right-handed helical conformation. We found that fullerene shifted the pH-induced conformational transition between the i-motif and the duplex structure, possibly due to the hydrophobic interactions between the terminal fullerenes and between the terminal fullerenes and an internal TAA loop in the DNA strand. The hybrid structure showed a dramatic reduction in cyclic hysteresis.
Anion induced conformational preference of Cα NN motif residues in functional proteins.
Patra, Piya; Ghosh, Mahua; Banerjee, Raja; Chakrabarti, Jaydeb
2017-12-01
Among different ligand binding motifs, anion binding C α NN motif consisting of peptide backbone atoms of three consecutive residues are observed to be important for recognition of free anions, like sulphate or biphosphate and participate in different key functions. Here we study the interaction of sulphate and biphosphate with C α NN motif present in different proteins. Instead of total protein, a peptide fragment has been studied keeping C α NN motif flanked in between other residues. We use classical force field based molecular dynamics simulations to understand the stability of this motif. Our data indicate fluctuations in conformational preferences of the motif residues in absence of the anion. The anion gives stability to one of these conformations. However, the anion induced conformational preferences are highly sequence dependent and specific to the type of anion. In particular, the polar residues are more favourable compared to the other residues for recognising the anion. © 2017 Wiley Periodicals, Inc.
Lathrop, R H; Casale, M; Tobias, D J; Marsh, J L; Thompson, L M
1998-01-01
We describe a prototype system (Poly-X) for assisting an expert user in modeling protein repeats. Poly-X reduces the large number of degrees of freedom required to specify a protein motif in complete atomic detail. The result is a small number of parameters that are easily understood by, and under the direct control of, a domain expert. The system was applied to the polyglutamine (poly-Q) repeat in the first exon of huntingtin, the gene implicated in Huntington's disease. We present four poly-Q structural motifs: two poly-Q beta-sheet motifs (parallel and antiparallel) that constitute plausible alternatives to a similar previously published poly-Q beta-sheet motif, and two novel poly-Q helix motifs (alpha-helix and pi-helix). To our knowledge, helical forms of polyglutamine have not been proposed before. The motifs suggest that there may be several plausible aggregation structures for the intranuclear inclusion bodies which have been found in diseased neurons, and may help in the effort to understand the structural basis for Huntington's disease.
NASA Astrophysics Data System (ADS)
Kim, Hunmo
In the brake systems, it is important to reduce the rear brake pressure in order to secure the safety of the vehicle in braking. So, there was some research that reduced and controlled the rear brake pressure exactly like a L. S. P. V and a E. L. S. P. V. However, the previous research has some weaknesses: the L. S. P. V is a mechanical system and its brake efficiency is lower than the efficiency of E. L. S. P. V. But, the cost of E. L. S. P. V is very higher so its application to the vehicle is very difficult. Additionally, when a fail appears in the circuit which controls the valves, the fail results in some wrong operation of the valves. But, the previous researchers didn't take the effect of fail into account. Hence, the efficiency of them is low and the safety of the vehicle is not confirmed. So, in this paper we develop a new economical pressure modulator that exactly controls brake pressure and confirms the safety of the vehicle in any case using a direct adaptive fuzzy controller.
Column generation algorithms for virtual network embedding in flexi-grid optical networks.
Lin, Rongping; Luo, Shan; Zhou, Jingwei; Wang, Sheng; Chen, Bin; Zhang, Xiaoning; Cai, Anliang; Zhong, Wen-De; Zukerman, Moshe
2018-04-16
Network virtualization provides means for efficient management of network resources by embedding multiple virtual networks (VNs) to share efficiently the same substrate network. Such virtual network embedding (VNE) gives rise to a challenging problem of how to optimize resource allocation to VNs and to guarantee their performance requirements. In this paper, we provide VNE algorithms for efficient management of flexi-grid optical networks. We provide an exact algorithm aiming to minimize the total embedding cost in terms of spectrum cost and computation cost for a single VN request. Then, to achieve scalability, we also develop a heuristic algorithm for the same problem. We apply these two algorithms for a dynamic traffic scenario where many VN requests arrive one-by-one. We first demonstrate by simulations for the case of a six-node network that the heuristic algorithm obtains very close blocking probabilities to exact algorithm (about 0.2% higher). Then, for a network of realistic size (namely, USnet) we demonstrate that the blocking probability of our new heuristic algorithm is about one magnitude lower than a simpler heuristic algorithm, which was a component of an earlier published algorithm.
Li, Chuan; Li, Lin; Zhang, Jie; Alexov, Emil
2012-01-01
The Gauss-Seidel method is a standard iterative numerical method widely used to solve a system of equations and, in general, is more efficient comparing to other iterative methods, such as the Jacobi method. However, standard implementation of the Gauss-Seidel method restricts its utilization in parallel computing due to its requirement of using updated neighboring values (i.e., in current iteration) as soon as they are available. Here we report an efficient and exact (not requiring assumptions) method to parallelize iterations and to reduce the computational time as a linear/nearly linear function of the number of CPUs. In contrast to other existing solutions, our method does not require any assumptions and is equally applicable for solving linear and nonlinear equations. This approach is implemented in the DelPhi program, which is a finite difference Poisson-Boltzmann equation solver to model electrostatics in molecular biology. This development makes the iterative procedure on obtaining the electrostatic potential distribution in the parallelized DelPhi several folds faster than that in the serial code. Further we demonstrate the advantages of the new parallelized DelPhi by computing the electrostatic potential and the corresponding energies of large supramolecular structures. PMID:22674480
Identification of the sequence motif of glycoside hydrolase 13 family members
Kumar, Vikash
2011-01-01
A bioinformatics analysis of sequences of enzymes of the glycoside hydrolase (GH) 13 family members such as α-amylase, cyclodextrin glycosyltransferase (CGTase), branching enzyme and cyclomaltodextrinase has been carried out in order to find out the sequence motifs that govern the reactions specificities of these enzymes by using hidden Markov model (HMM) profile. This analysis suggests the existence of such sequence motifs and residues of these motifs constituting the −1 to +3 catalytic subsites of the enzyme. Hence, by introducing mutations in the residues of these four subsites, one can change the reaction specificities of the enzymes. In general it has been observed that α -amylase sequence motif have low sequence conservation than rest of the motifs of the GH13 family members. PMID:21544166
Kieken, Fabien; Jović, Marko; Tonelli, Marco; Naslavsky, Naava; Caplan, Steve; Sorgen, Paul L
2009-01-01
Eps15 homology (EH)-domain containing proteins are regulators of endocytic membrane trafficking. EH-domain binding to proteins containing the tripeptide NPF has been well characterized, but recent studies have shown that EH-domains are also able to interact with ligands containing DPF or GPF motifs. We demonstrate that the three motifs interact in a similar way with the EH-domain of EHD1, with the NPF motif having the highest affinity due to the presence of an intermolecular hydrogen bond. The weaker affinity for the DPF and GPF motifs suggests that if complex formation occurs in vivo, they may require high ligand concentrations, the presence of successive motifs and/or specific flanking residues. PMID:19798736
Barouch-Bentov, Rina; Neveu, Gregory; Xiao, Fei; Beer, Melanie; Bekerman, Elena; Schor, Stanford; Campbell, Joseph; Boonyaratanakornkit, Jim; Lindenbach, Brett; Lu, Albert; Jacob, Yves; Einav, Shirit
2016-11-01
Enveloped viruses commonly utilize late-domain motifs, sometimes cooperatively with ubiquitin, to hijack the endosomal sorting complex required for transport (ESCRT) machinery for budding at the plasma membrane. However, the mechanisms underlying budding of viruses lacking defined late-domain motifs and budding into intracellular compartments are poorly characterized. Here, we map a network of hepatitis C virus (HCV) protein interactions with the ESCRT machinery using a mammalian-cell-based protein interaction screen and reveal nine novel interactions. We identify HRS (hepatocyte growth factor-regulated tyrosine kinase substrate), an ESCRT-0 complex component, as an important entry point for HCV into the ESCRT pathway and validate its interactions with the HCV nonstructural (NS) proteins NS2 and NS5A in HCV-infected cells. Infectivity assays indicate that HRS is an important factor for efficient HCV assembly. Specifically, by integrating capsid oligomerization assays, biophysical analysis of intracellular viral particles by continuous gradient centrifugations, proteolytic digestion protection, and RNase digestion protection assays, we show that HCV co-opts HRS to mediate a late assembly step, namely, envelopment. In the absence of defined late-domain motifs, K63-linked polyubiquitinated lysine residues in the HCV NS2 protein bind the HRS ubiquitin-interacting motif to facilitate assembly. Finally, ESCRT-III and VPS/VTA1 components are also recruited by HCV proteins to mediate assembly. These data uncover involvement of ESCRT proteins in intracellular budding of a virus lacking defined late-domain motifs and a novel mechanism by which HCV gains entry into the ESCRT network, with potential implications for other viruses. Viruses commonly bud at the plasma membrane by recruiting the host ESCRT machinery via conserved motifs termed late domains. The mechanism by which some viruses, such as HCV, bud intracellularly is, however, poorly characterized. Moreover, whether envelopment of HCV and other viruses lacking defined late domains is ESCRT mediated and, if so, what the entry points into the ESCRT pathway are remain unknown. Here, we report the interaction network of HCV with the ESCRT machinery and a critical role for HRS, an ESCRT-0 complex component, in HCV envelopment. Viral protein ubiquitination was discovered to be a signal for HRS binding and HCV assembly, thereby functionally compensating for the absence of late domains. These findings characterize how a virus lacking defined late domains co-opts ESCRT to bud intracellularly. Since the ESCRT machinery is essential for the life cycle of multiple viruses, better understanding of this virus-host interplay may yield targets for broad-spectrum antiviral therapies. Copyright © 2016 Barouch-Bentov et al.
Hellander, Andreas; Lawson, Michael J; Drawert, Brian; Petzold, Linda
2015-01-01
The efficiency of exact simulation methods for the reaction-diffusion master equation (RDME) is severely limited by the large number of diffusion events if the mesh is fine or if diffusion constants are large. Furthermore, inherent properties of exact kinetic-Monte Carlo simulation methods limit the efficiency of parallel implementations. Several approximate and hybrid methods have appeared that enable more efficient simulation of the RDME. A common feature to most of them is that they rely on splitting the system into its reaction and diffusion parts and updating them sequentially over a discrete timestep. This use of operator splitting enables more efficient simulation but it comes at the price of a temporal discretization error that depends on the size of the timestep. So far, existing methods have not attempted to estimate or control this error in a systematic manner. This makes the solvers hard to use for practitioners since they must guess an appropriate timestep. It also makes the solvers potentially less efficient than if the timesteps are adapted to control the error. Here, we derive estimates of the local error and propose a strategy to adaptively select the timestep when the RDME is simulated via a first order operator splitting. While the strategy is general and applicable to a wide range of approximate and hybrid methods, we exemplify it here by extending a previously published approximate method, the Diffusive Finite-State Projection (DFSP) method, to incorporate temporal adaptivity. PMID:26865735
Hellander, Andreas; Lawson, Michael J; Drawert, Brian; Petzold, Linda
2014-06-01
The efficiency of exact simulation methods for the reaction-diffusion master equation (RDME) is severely limited by the large number of diffusion events if the mesh is fine or if diffusion constants are large. Furthermore, inherent properties of exact kinetic-Monte Carlo simulation methods limit the efficiency of parallel implementations. Several approximate and hybrid methods have appeared that enable more efficient simulation of the RDME. A common feature to most of them is that they rely on splitting the system into its reaction and diffusion parts and updating them sequentially over a discrete timestep. This use of operator splitting enables more efficient simulation but it comes at the price of a temporal discretization error that depends on the size of the timestep. So far, existing methods have not attempted to estimate or control this error in a systematic manner. This makes the solvers hard to use for practitioners since they must guess an appropriate timestep. It also makes the solvers potentially less efficient than if the timesteps are adapted to control the error. Here, we derive estimates of the local error and propose a strategy to adaptively select the timestep when the RDME is simulated via a first order operator splitting. While the strategy is general and applicable to a wide range of approximate and hybrid methods, we exemplify it here by extending a previously published approximate method, the Diffusive Finite-State Projection (DFSP) method, to incorporate temporal adaptivity.
kpLogo: positional k-mer analysis reveals hidden specificity in biological sequences
2017-01-01
Abstract Motifs of only 1–4 letters can play important roles when present at key locations within macromolecules. Because existing motif-discovery tools typically miss these position-specific short motifs, we developed kpLogo, a probability-based logo tool for integrated detection and visualization of position-specific ultra-short motifs from a set of aligned sequences. kpLogo also overcomes the limitations of conventional motif-visualization tools in handling positional interdependencies and utilizing ranked or weighted sequences increasingly available from high-throughput assays. kpLogo can be found at http://kplogo.wi.mit.edu/. PMID:28460012
Deletion and site-specific mutagenesis of nucleolin's carboxy GAR domain.
Pellar, Gregory J; DiMario, Patrick J
2003-04-01
Vertebrate nucleolin is an abundant RNA-binding protein in the dense fibrillar component of active nucleoli. Nucleolin is modular in composition. Its amino-terminal third contains alternating acidic and basic domains, its middle section contains four consensus RNA-binding domains (cRBDs), and its carboxy-terminus contains a distinctive glycine/arginine-rich (GAR) domain with several RGG motifs. The arginines within these motifs are asymmetrically dimethylated. Several laboratories have shown that the GAR domain is necessary but not sufficient for the efficient localization of nucleolin to nucleoli. We examined the distribution of endogenous fibrillarin, Nopp140, and B23 when full-length and DeltaGAR nucleolin were expressed exogenously as enhanced green fluorescent protein (EGFP)-tagged fusions. Only B23 redistributed when DeltaGAR-EGFP was expressed at moderate to high levels, suggesting an in vivo interaction between nucleolin and B23. Next we substituted all ten arginines within the GAR domain of Chinese hamster ovary (CHO) nucleolin with lysines to test the hypothesis that methylation of the carboxy GAR domain is necessary for the nucleolar association of nucleolin. The lysine-substituted mutant was not an in vitro substrate for the yeast protein methyltransferase, Hmt1p/Rmt1. It was, however, able to associate properly with interphase nucleoli and with interphase pre-nucleolar bodies upon recovery from hypotonic shock. We conclude, therefore, that although the GAR domain is necessary for the efficient localization of nucleolin to nucleoli, methylation of this domain is not required for proper nucleolar localization.
Verhoeven, Esther E. A.; van Kesteren, Marian; Turner, John J.; van der Marel, Gijs A.; van Boom, Jacques H.; Moolenaar, Geri F.; Goosen, Nora
2002-01-01
Nucleotide excision repair in Escherichia coli involves formation of the UvrB–DNA complex and subsequent DNA incisions on either site of the damage by UvrC. In this paper, we studied the incision of substrates with different damages in varying sequence contexts. We show that there is not always a correlation between the incision efficiency and the stability of the UvrB–DNA complex. Both stable and unstable UvrB–DNA complexes can be efficiently incised. However some lesions that give rise to stable UvrB–DNA complexes do result in a very low incision. We present evidence that this poor incision is due to sterical hindrance of the damage itself. In its C-terminal region UvrC contains two helix–hairpin–helix (HhH) motifs. Mutational analysis shows that these motifs constitute one functional unit, probably folded as one structural unit; the (HhH)2 domain. This (HhH)2 domain was previously shown to be important for the 5′ incision on a substrate containing a (cis-Pt)·GG adduct, but not for 3′ incision. Here we show that, mainly depending on the sequence context of the lesion, the (HhH)2 domain can be important for 3′ and/or 5′ incision. We propose that the (HhH)2 domain stabilises specific DNA structures required for the two incisions, thereby contributing to the flexibility of the UvrABC repair system. PMID:12034838
Gene regulatory and signaling networks exhibit distinct topological distributions of motifs
NASA Astrophysics Data System (ADS)
Ferreira, Gustavo Rodrigues; Nakaya, Helder Imoto; Costa, Luciano da Fontoura
2018-04-01
The biological processes of cellular decision making and differentiation involve a plethora of signaling pathways and gene regulatory circuits. These networks in turn exhibit a multitude of motifs playing crucial parts in regulating network activity. Here we compare the topological placement of motifs in gene regulatory and signaling networks and observe that it suggests different evolutionary strategies in motif distribution for distinct cellular subnetworks.
Chen, Connie; Gribble, Matthew O; Bartroff, Jay; Bay, Steven M; Goldstein, Larry
2017-05-01
The United States's Clean Water Act stipulates in section 303(d) that states must identify impaired water bodies for which total maximum daily loads (TMDLs) of pollution inputs into water bodies are developed. Decision-making procedures about how to list, or delist, water bodies as impaired, or not, per Clean Water Act 303(d) differ across states. In states such as California, whether or not a particular monitoring sample suggests that water quality is impaired can be regarded as a binary outcome variable, and California's current regulatory framework invokes a version of the exact binomial test to consolidate evidence across samples and assess whether the overall water body complies with the Clean Water Act. Here, we contrast the performance of California's exact binomial test with one potential alternative, the Sequential Probability Ratio Test (SPRT). The SPRT uses a sequential testing framework, testing samples as they become available and evaluating evidence as it emerges, rather than measuring all the samples and calculating a test statistic at the end of the data collection process. Through simulations and theoretical derivations, we demonstrate that the SPRT on average requires fewer samples to be measured to have comparable Type I and Type II error rates as the current fixed-sample binomial test. Policymakers might consider efficient alternatives such as SPRT to current procedure. Copyright © 2017 Elsevier Ltd. All rights reserved.
An Epoch of Reionization simulation pipeline based on BEARS
NASA Astrophysics Data System (ADS)
Krause, Fabian; Thomas, Rajat M.; Zaroubi, Saleem; Abdalla, Filipe B.
2018-10-01
The quest to unlock the mysteries of the Epoch of Reionization (EoR) is well poised with many experiments at diverse wavelengths beginning to gather data. Albeit these efforts, we are yet uncertain about the various factors that influence the EoR which include, the nature of the sources, their spectral characteristics (blackbody temperatures, power-law indices), clustering property, efficiency, duty cycle etc. Given these physical uncertainties that define the EoR, we need fast and efficient computational methods to model and analyze the data in order to provide confidence bounds on the parameters that influence the brightness temperature at 21-cm. Towards this goal we developed a pipeline that combines dark matter-only N-body simulations with exact 1-dimensional radiative transfer computations to approximate exact 3-dimensional radiative transfer. Because these simulations are about two to three orders of magnitude faster than the exact 3-dimensional methods, they can be used to explore the parameter space of the EoR systematically. A fast scheme like this pipeline could be incorporated into a Bayesian framework for parameter estimation. In this paper we detail the construction of the pipeline and describe how to use the software which is being made publicly available. We show the results of running the pipeline for four test cases of sources with various spectral energy distributions and compare their outputs using various statistics.
NASA Astrophysics Data System (ADS)
Mandrà, Salvatore; Giacomo Guerreschi, Gian; Aspuru-Guzik, Alán
2016-07-01
We present an exact quantum algorithm for solving the Exact Satisfiability problem, which belongs to the important NP-complete complexity class. The algorithm is based on an intuitive approach that can be divided into two parts: the first step consists in the identification and efficient characterization of a restricted subspace that contains all the valid assignments of the Exact Satisfiability; while the second part performs a quantum search in such restricted subspace. The quantum algorithm can be used either to find a valid assignment (or to certify that no solution exists) or to count the total number of valid assignments. The query complexities for the worst-case are respectively bounded by O(\\sqrt{{2}n-{M\\prime }}) and O({2}n-{M\\prime }), where n is the number of variables and {M}\\prime the number of linearly independent clauses. Remarkably, the proposed quantum algorithm results to be faster than any known exact classical algorithm to solve dense formulas of Exact Satisfiability. As a concrete application, we provide the worst-case complexity for the Hamiltonian cycle problem obtained after mapping it to a suitable Occupation problem. Specifically, we show that the time complexity for the proposed quantum algorithm is bounded by O({2}n/4) for 3-regular undirected graphs, where n is the number of nodes. The same worst-case complexity holds for (3,3)-regular bipartite graphs. As a reference, the current best classical algorithm has a (worst-case) running time bounded by O({2}31n/96). Finally, when compared to heuristic techniques for Exact Satisfiability problems, the proposed quantum algorithm is faster than the classical WalkSAT and Adiabatic Quantum Optimization for random instances with a density of constraints close to the satisfiability threshold, the regime in which instances are typically the hardest to solve. The proposed quantum algorithm can be straightforwardly extended to the generalized version of the Exact Satisfiability known as Occupation problem. The general version of the algorithm is presented and analyzed.
Faham, Malek; Carlton, Victoria; Moorhead, Martin; Zheng, Jianbiao; Klinger, Mark; Pepin, Francois; Asbury, Thomas; Vignali, Marissa; Emerson, Ryan O; Robins, Harlan S; Ireland, James; Baechler-Gillespie, Emily; Inman, Robert D
2017-04-01
Ankylosing spondylitis (AS), a chronic inflammatory disorder, has a notable association with HLA-B27. One hypothesis suggests that a common antigen that binds to HLA-B27 is important for AS disease pathogenesis. This study was undertaken to determine sequences and motifs that are shared among HLA-B27-positive AS patients, using T cell repertoire next-generation sequencing. To identify motifs enriched among B27-positive AS patients, we performed T cell receptor β (TCRβ) repertoire sequencing on samples from 191 B27-positive AS patients, 43 B27-negative AS patients, and 227 controls, and we obtained >77 million TCRβ clonotype sequences. First, we assessed whether any of 50 previously published sequences were enriched in B27-positive AS patients. We then used training and test cohorts to identify discovered motifs that were enriched in B27-positive AS patients versus controls. Six previously published and 11 discovered motifs were enriched in the B27-positive AS samples as compared to controls. After combining motifs related by sequence, we identified a total of 15 independent motifs. Both the full set of 15 motifs and a set of 6 published motifs were enriched in the B27-positive AS patients as compared to B27-positive healthy individuals (P = 0.049 and P = 0.001, respectively). Using an independent cohort, we validated that at least some of these motifs were associated with AS, and not simply with B27-positive status. We identified TCRβ motifs that are enriched in B27-positive AS patients as compared to B27-positive healthy controls. This suggests that a common antigen, presented by HLA-B27 and detected by CD8+ T cells, may be associated with AS disease pathogenesis. © 2016, American College of Rheumatology.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shimomura, Tadanori; Miyamura, Norio; Hata, Shoji
2014-01-17
Highlights: •Loss of the PDZ-binding motif inhibits constitutively active YAP (5SA)-induced oncogenic cell transformation. •The PDZ-binding motif of YAP promotes its nuclear localization in cultured cells and mouse liver. •Loss of the PDZ-binding motif inhibits YAP (5SA)-induced CTGF transcription in cultured cells and mouse liver. -- Abstract: YAP is a transcriptional co-activator that acts downstream of the Hippo signaling pathway and regulates multiple cellular processes, including proliferation. Hippo pathway-dependent phosphorylation of YAP negatively regulates its function. Conversely, attenuation of Hippo-mediated phosphorylation of YAP increases its ability to stimulate proliferation and eventually induces oncogenic transformation. The C-terminus of YAP contains amore » highly conserved PDZ-binding motif that regulates YAP’s functions in multiple ways. However, to date, the importance of the PDZ-binding motif to the oncogenic cell transforming activity of YAP has not been determined. In this study, we disrupted the PDZ-binding motif in the YAP (5SA) protein, in which the sites normally targeted by Hippo pathway-dependent phosphorylation are mutated. We found that loss of the PDZ-binding motif significantly inhibited the oncogenic transformation of cultured cells induced by YAP (5SA). In addition, the increased nuclear localization of YAP (5SA) and its enhanced activation of TEAD-dependent transcription of the cell proliferation gene CTGF were strongly reduced when the PDZ-binding motif was deleted. Similarly, in mouse liver, deletion of the PDZ-binding motif suppressed nuclear localization of YAP (5SA) and YAP (5SA)-induced CTGF expression. Taken together, our results indicate that the PDZ-binding motif of YAP is critical for YAP-mediated oncogenesis, and that this effect is mediated by YAP’s co-activation of TEAD-mediated CTGF transcription.« less