Sample records for predicted secondary structures

  1. RNA-SSPT: RNA Secondary Structure Prediction Tools.

    PubMed

    Ahmad, Freed; Mahboob, Shahid; Gulzar, Tahsin; Din, Salah U; Hanif, Tanzeela; Ahmad, Hifza; Afzal, Muhammad

    2013-01-01

    Predicting RNA structure is useful for understanding evolution in both in silico and in vitro studies. Physical methods for determining RNA secondary structure, such as NMR, are expensive and difficult; computational prediction is easier. Comparative sequence analysis provides the best solution, but predicting the secondary structure of a single RNA sequence remains challenging. RNA-SSPT is a tool that computationally predicts the secondary structure of a single RNA sequence. Most RNA secondary structure prediction tools either do not allow pseudoknots in the structure or are unable to locate them. RNA-SSPT implements the Nussinov dynamic programming algorithm. Current studies show that only the energetically most favorable secondary structure is required, and a modification of the algorithm is also available that produces base pairs to lower the total free energy of the secondary structure. For visualization of RNA secondary structure, NAVIEW, written in C, was adapted to C# to meet the tool's requirements. RNA-SSPT is built in C# using .NET 2.0 in Microsoft Visual Studio 2005 Professional edition. The accuracy of RNA-SSPT is tested in terms of sensitivity and positive predictive value. The tool serves both secondary structure prediction and secondary structure visualization purposes.
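
    The Nussinov dynamic programming algorithm named in this abstract can be illustrated with a minimal Python sketch. It is not the RNA-SSPT implementation (which is in C#): the pairing rule (Watson-Crick plus G-U wobble), the `min_loop` parameter, and the function names are assumptions made for illustration, and the sketch maximizes the number of base pairs rather than minimizing free energy.

```python
def can_pair(a, b):
    """Assumed pairing rule: Watson-Crick pairs plus G-U wobble."""
    return (a, b) in {("A", "U"), ("U", "A"), ("G", "C"),
                      ("C", "G"), ("G", "U"), ("U", "G")}


def nussinov(seq, min_loop=3):
    """Return (max_pairs, pairs) for a pseudoknot-free structure of seq.

    min_loop is the minimum number of unpaired bases enclosed by a pair
    (a hypothetical default, not taken from the abstract).
    """
    n = len(seq)
    if n == 0:
        return 0, []
    # dp[i][j] holds the maximum number of pairs within seq[i..j].
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case: position j is unpaired
            for k in range(i, j - min_loop):  # case: j pairs with k
                if can_pair(seq[k], seq[j]):
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best

    # Trace back through the table to recover one optimal set of pairs.
    pairs, stack = [], [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= min_loop:
            continue
        if dp[i][j] == dp[i][j - 1]:
            stack.append((i, j - 1))
            continue
        for k in range(i, j - min_loop):
            if can_pair(seq[k], seq[j]):
                left = dp[i][k - 1] if k > i else 0
                if dp[i][j] == left + 1 + dp[k + 1][j - 1]:
                    pairs.append((k, j))
                    if k > i:
                        stack.append((i, k - 1))
                    stack.append((k + 1, j - 1))
                    break
    return dp[0][n - 1], sorted(pairs)


count, pairs = nussinov("GGGAAAUCC")
```

    Because the recursion only combines nested and adjacent subintervals, a sketch like this cannot produce crossing base pairs, which matches the abstract's remark that many tools do not allow pseudoknots.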

  2. RNA-SSPT: RNA Secondary Structure Prediction Tools

    PubMed Central

    Ahmad, Freed; Mahboob, Shahid; Gulzar, Tahsin; Din, Salah U; Hanif, Tanzeela; Ahmad, Hifza; Afzal, Muhammad

    2013-01-01

    Predicting RNA structure is useful for understanding evolution in both in silico and in vitro studies. Physical methods for determining RNA secondary structure, such as NMR, are expensive and difficult; computational prediction is easier. Comparative sequence analysis provides the best solution, but predicting the secondary structure of a single RNA sequence remains challenging. RNA-SSPT is a tool that computationally predicts the secondary structure of a single RNA sequence. Most RNA secondary structure prediction tools either do not allow pseudoknots in the structure or are unable to locate them. RNA-SSPT implements the Nussinov dynamic programming algorithm. Current studies show that only the energetically most favorable secondary structure is required, and a modification of the algorithm is also available that produces base pairs to lower the total free energy of the secondary structure. For visualization of RNA secondary structure, NAVIEW, written in C, was adapted to C# to meet the tool's requirements. RNA-SSPT is built in C# using .NET 2.0 in Microsoft Visual Studio 2005 Professional edition. The accuracy of RNA-SSPT is tested in terms of sensitivity and positive predictive value. The tool serves both secondary structure prediction and secondary structure visualization purposes. PMID:24250115
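
    The two accuracy measures named in this abstract can be computed from sets of base pairs. The sketch below assumes the usual definitions (sensitivity = correctly predicted pairs / reference pairs; positive predictive value = correctly predicted pairs / predicted pairs); the function name and the example pairs are hypothetical, and the record does not state how RNA-SSPT itself counts matches.

```python
def sensitivity_ppv(predicted, reference):
    """Return (sensitivity, ppv) for two collections of (i, j) base pairs."""
    pred, ref = set(predicted), set(reference)
    true_pos = len(pred & ref)  # pairs present in both structures
    sensitivity = true_pos / len(ref) if ref else 0.0
    ppv = true_pos / len(pred) if pred else 0.0
    return sensitivity, ppv


# Hypothetical example: two of three predicted pairs are in the reference.
sens, ppv = sensitivity_ppv(
    predicted=[(0, 8), (1, 7), (3, 5)],
    reference=[(0, 8), (1, 7), (2, 6), (10, 20)],
)
```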

  3. A semi-supervised learning approach for RNA secondary structure prediction.

    PubMed

    Yonemoto, Haruka; Asai, Kiyoshi; Hamada, Michiaki

    2015-08-01

    RNA secondary structure prediction is a key technology in RNA bioinformatics. Most algorithms for RNA secondary structure prediction use probabilistic models, in which the model parameters are trained with reliable RNA secondary structures. Because of the difficulty of determining RNA secondary structures by experimental procedures, such as NMR or X-ray crystal structural analyses, there are still many RNA sequences that could be useful for training whose secondary structures have not been experimentally determined. In this paper, we introduce a novel semi-supervised learning approach for training parameters in a probabilistic model of RNA secondary structures in which we employ not only RNA sequences with annotated secondary structures but also ones with unknown secondary structures. Our model is based on a hybrid of generative (stochastic context-free grammars) and discriminative models (conditional random fields) that has been successfully applied to natural language processing. Computational experiments indicate that the accuracy of secondary structure prediction is improved by incorporating RNA sequences with unknown secondary structures into training. To our knowledge, this is the first study of a semi-supervised learning approach for RNA secondary structure prediction. This technique will be useful when the number of reliable structures is limited.

  4. RNA Secondary Structure Prediction by Using Discrete Mathematics: An Interdisciplinary Research Experience for Undergraduate Students

    ERIC Educational Resources Information Center

    Ellington, Roni; Wachira, James; Nkwanta, Asamoah

    2010-01-01

    The focus of this Research Experience for Undergraduates (REU) project was on RNA secondary structure prediction by using a lattice walk approach. The lattice walk approach is a combinatorial and computational biology method used to enumerate possible secondary structures and predict RNA secondary structure from RNA sequences. The method uses…

  5. RNAstructure: software for RNA secondary structure prediction and analysis.

    PubMed

    Reuter, Jessica S; Mathews, David H

    2010-03-15

    To understand an RNA sequence's mechanism of action, the structure must be known. Furthermore, target RNA structure is an important consideration in the design of small interfering RNAs and antisense DNA oligonucleotides. RNA secondary structure prediction, using thermodynamics, can be used to develop hypotheses about the structure of an RNA sequence. RNAstructure is a software package for RNA secondary structure prediction and analysis. It uses thermodynamics and utilizes the most recent set of nearest neighbor parameters from the Turner group. It includes methods for secondary structure prediction (using several algorithms), prediction of base pair probabilities, bimolecular structure prediction, and prediction of a structure common to two sequences. This contribution describes new extensions to the package, including a library of C++ classes for incorporation into other programs, a user-friendly graphical user interface written in JAVA, and new Unix-style text interfaces. The original graphical user interface for Microsoft Windows is still maintained. The extensions to RNAstructure serve to make RNA secondary structure prediction user-friendly. The package is available for download from the Mathews lab homepage at http://rna.urmc.rochester.edu/RNAstructure.html.

  6. RNA secondary structure prediction with pseudoknots: Contribution of algorithm versus energy model.

    PubMed

    Jabbari, Hosna; Wark, Ian; Montemagno, Carlo

    2018-01-01

    RNA is a biopolymer with various applications inside the cell and in biotechnology. The structure of an RNA molecule mainly determines its function and is essential to guide nanostructure design. Since experimental structure determination is time-consuming and expensive, accurate computational prediction of RNA structure is of great importance. Predicting RNA secondary structure is simpler than predicting tertiary structure and provides information about the tertiary structure; RNA secondary structure prediction has therefore received attention in the past decades. Numerous methods with different folding approaches have been developed for RNA secondary structure prediction. While methods for prediction of RNA pseudoknot-free structure (structures with no crossing base pairs) have greatly improved in terms of their accuracy, methods for prediction of RNA pseudoknotted secondary structure (structures with crossing base pairs) still have room for improvement. A long-standing question for improving the prediction accuracy of RNA pseudoknotted secondary structure is whether to focus on the prediction algorithm or the underlying energy model, as there is a trade-off between the computational cost of the prediction algorithm and the generality of the method. The aim of this work is to argue that, when comparing different methods for RNA pseudoknotted structure prediction, the combination of algorithm and energy model should be considered, and a method should not be considered superior or inferior to others if they do not use the same scoring model. We demonstrate that while the folding approach is important in structure prediction, it is not the only important factor in the prediction accuracy of a given method, as the underlying energy model is also of great value. Therefore we encourage researchers to pay particular attention when comparing methods with different energy models.
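
    The abstract defines pseudoknotted structures as those with crossing base pairs. As a minimal sketch of that definition (the function name and the 0-based index convention are assumptions), two pairs (i, j) and (k, l) cross when i < k < j < l:

```python
def has_pseudoknot(pairs):
    """Return True if any two base pairs cross, i.e. i < k < j < l."""
    # Normalise each pair so the smaller index comes first, then sort.
    ordered = sorted((min(i, j), max(i, j)) for i, j in pairs)
    for a, (i, j) in enumerate(ordered):
        for k, l in ordered[a + 1:]:
            if i < k < j < l:
                return True
    return False


nested = has_pseudoknot([(0, 8), (1, 7)])    # nested pairs, no crossing
crossing = has_pseudoknot([(0, 5), (3, 9)])  # 0 < 3 < 5 < 9, crossing
```

    The quadratic pairwise check is chosen for readability; it is adequate for short structures but is not meant as an efficient implementation.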

  7. Building a knowledge-based statistical potential by capturing high-order inter-residue interactions and its applications in protein secondary structure assessment.

    PubMed

    Li, Yaohang; Liu, Hui; Rata, Ionel; Jakobsson, Eric

    2013-02-25

    The rapidly increasing number of protein crystal structures available in the Protein Data Bank (PDB) has naturally made statistical analyses feasible in studying complex high-order inter-residue correlations. In this paper, we report a context-based secondary structure potential (CSSP) for assessing the quality of predicted protein secondary structures generated by various prediction servers. CSSP is a sequence-position-specific knowledge-based potential generated based on the potentials of mean force approach, where high-order inter-residue interactions are taken into consideration. The CSSP potential is effective in identifying secondary structure predictions with good quality. In 56% of the targets in the CB513 benchmark, the optimal CSSP potential is able to recognize the native secondary structure or a prediction with Q3 accuracy higher than 90% as best scored in the predicted secondary structures generated by 10 popularly used secondary structure prediction servers. In more than 80% of the CB513 targets, the predicted secondary structures with the lowest CSSP potential values yield higher than 80% Q3 accuracy. Similar performance of CSSP is found on the CASP9 targets as well. Moreover, our computational results also show that the CSSP potential using triplets outperforms the CSSP potential using doublets and is currently better than the CSSP potential using quartets.
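
    Q3 accuracy, used throughout this abstract, is conventionally the percentage of residues whose three-state label (helix, strand, coil) is predicted correctly. A minimal sketch under that assumption follows; the H/E/C letters and the function name are illustrative and not taken from the CSSP paper.

```python
def q3_accuracy(predicted, reference):
    """Percentage of positions where the three-state labels agree."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have equal length")
    if not reference:
        raise ValueError("sequences must be non-empty")
    matches = sum(p == r for p, r in zip(predicted, reference))
    return 100.0 * matches / len(reference)


# Hypothetical 8-residue example with one mismatch at the last position.
score = q3_accuracy("HHHEECCC", "HHHEECCH")
```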

  8. Secondary Structure Predictions for Long RNA Sequences Based on Inversion Excursions and MapReduce.

    PubMed

    Yehdego, Daniel T; Zhang, Boyu; Kodimala, Vikram K R; Johnson, Kyle L; Taufer, Michela; Leung, Ming-Ying

    2013-05-01

    Secondary structures of ribonucleic acid (RNA) molecules play important roles in many biological processes including gene expression and regulation. Experimental observations and computing limitations suggest that we can approach the secondary structure prediction problem for long RNA sequences by segmenting them into shorter chunks, predicting the secondary structures of each chunk individually using existing prediction programs, and then assembling the results to give the structure of the original sequence. The selection of cutting points is a crucial component of the segmenting step. Noting that stem-loops and pseudoknots always contain an inversion, i.e., a stretch of nucleotides followed closely by its inverse complementary sequence, we developed two cutting methods for segmenting long RNA sequences based on inversion excursions: the centered and optimized method. Each step of searching for inversions, chunking, and predictions can be performed in parallel. In this paper we use a MapReduce framework, i.e., Hadoop, to extensively explore meaningful inversion stem lengths and gap sizes for the segmentation and identify correlations between chunking methods and prediction accuracy. We show that for a set of long RNA sequences in the RFAM database, whose secondary structures are known to contain pseudoknots, our approach predicts secondary structures more accurately than methods that do not segment the sequence, when the latter predictions are possible computationally. We also show that, as sequences exceed certain lengths, some programs cannot computationally predict pseudoknots while our chunking methods can. Overall, our predicted structures still retain the accuracy level of the original prediction programs when compared with known experimental secondary structure.
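
    The abstract describes an inversion as a stretch of nucleotides followed closely by its inverse complementary sequence, characterized by a stem length and a gap size. A naive search for such inversions might look like the sketch below. It is not the authors' centered or optimized cutting method and does not use MapReduce; it assumes strict Watson-Crick complementarity, an RNA alphabet of A, C, G, U only, and hypothetical function and parameter names.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(segment):
    """Inverse complementary sequence (raises KeyError on non-ACGU input)."""
    return "".join(COMPLEMENT[base] for base in reversed(segment))


def find_inversions(seq, stem_len, max_gap):
    """Return (start, gap) for every stem of length stem_len at position
    start whose reverse complement occurs again after gap unpaired bases,
    with 0 <= gap <= max_gap."""
    hits = []
    n = len(seq)
    for start in range(n - 2 * stem_len + 1):
        target = reverse_complement(seq[start:start + stem_len])
        for gap in range(max_gap + 1):
            right = start + stem_len + gap
            if right + stem_len > n:
                break  # the second arm would run past the end of seq
            if seq[right:right + stem_len] == target:
                hits.append((start, gap))
    return hits


# Hypothetical stem-loop: GGGC, a 4-base gap, then GCCC.
inversions = find_inversions("GGGCAAAAGCCC", stem_len=4, max_gap=5)
```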

  9. Sixty-five years of the long march in protein secondary structure prediction: the final stretch?

    PubMed Central

    Yang, Yuedong; Gao, Jianzhao; Wang, Jihua; Heffernan, Rhys; Hanson, Jack; Paliwal, Kuldip; Zhou, Yaoqi

    2018-01-01

    Protein secondary structure prediction began in 1951 when Pauling and Corey predicted helical and sheet conformations for protein polypeptide backbone even before the first protein structure was determined. Sixty-five years later, powerful new methods breathe new life into this field. The highest three-state accuracy without relying on structure templates is now at 82–84%, a number unthinkable just a few years ago. These improvements came from increasingly larger databases of protein sequences and structures for training, the use of template secondary structure information and more powerful deep learning techniques. As we approach the theoretical limit of three-state prediction (88–90%), alternatives to secondary structure prediction (prediction of backbone torsion angles and Cα-atom-based angles and torsion angles) not only have more room for further improvement but also allow direct prediction of three-dimensional fragment structures with constantly improved accuracy. About 20% of all 40-residue fragments in a database of 1199 non-redundant proteins have <6 Å root-mean-squared distance from the native conformations by SPIDER2. More powerful deep learning methods with improved capability of capturing long-range interactions begin to emerge as the next generation of techniques for secondary structure prediction. The time has come to finish off the final stretch of the long march towards protein secondary structure prediction. PMID:28040746

  10. PreSSAPro: a software for the prediction of secondary structure by amino acid properties.

    PubMed

    Costantini, Susan; Colonna, Giovanni; Facchiano, Angelo M

    2007-10-01

    PreSSAPro is software, available to the scientific community as a free web service, designed to provide predictions of secondary structures starting from the amino acid sequence of a given protein. Predictions are based on our recently published work on the amino acid propensities for secondary structures both in large but not homogeneous protein data sets and in smaller but homogeneous data sets corresponding to protein structural classes, i.e. all-alpha, all-beta, or alpha-beta proteins. Predictions are improved by the use of propensities evaluated for the right protein class. PreSSAPro predicts the secondary structure according to the right protein class, if known, or gives a multiple prediction with reference to the different structural classes. The comparison of these predictions represents a novel tool to evaluate which sequence regions can assume different secondary structures depending on the structural class assignment, with a view to identifying proteins able to fold in different conformations. The service is available at the URL http://bioinformatica.isa.cnr.it/PRESSAPRO/.

  11. Free energy minimization to predict RNA secondary structures and computational RNA design.

    PubMed

    Churkin, Alexander; Weinbrand, Lina; Barash, Danny

    2015-01-01

    Determining the RNA secondary structure from sequence data by computational predictions is a long-standing problem. Its solution has been approached in two distinctive ways. If a multiple sequence alignment of a collection of homologous sequences is available, the comparative method uses phylogeny to determine conserved base pairs that are more likely to form as a result of billions of years of evolution than by chance. In the case of single sequences, recursive algorithms that compute free energy structures by using empirically derived energy parameters have been developed. This latter approach of RNA folding prediction by energy minimization is widely used to predict RNA secondary structure from sequence. For a significant number of RNA molecules, the secondary structure of the RNA molecule is indicative of its function and its computational prediction by minimizing its free energy is important for its functional analysis. A general method for free energy minimization to predict RNA secondary structures is dynamic programming, although other optimization methods have been developed as well along with empirically derived energy parameters. In this chapter, we introduce and illustrate by examples the approach of free energy minimization to predict RNA secondary structures.

  12. Ensemble-based prediction of RNA secondary structures.

    PubMed

    Aghaeepour, Nima; Hoos, Holger H

    2013-04-24

    Accurate structure prediction methods play an important role for the understanding of RNA function. Energy-based, pseudoknot-free secondary structure prediction is one of the most widely used and versatile approaches, and improved methods for this task have received much attention over the past five years. Despite the impressive progress that has been achieved in this area, existing evaluations of the prediction accuracy achieved by various algorithms do not provide a comprehensive, statistically sound assessment. Furthermore, while there is increasing evidence that no prediction algorithm consistently outperforms all others, no work has been done to exploit the complementary strengths of multiple approaches. In this work, we present two contributions to the area of RNA secondary structure prediction. Firstly, we use state-of-the-art, resampling-based statistical methods together with a previously published and increasingly widely used dataset of high-quality RNA structures to conduct a comprehensive evaluation of existing RNA secondary structure prediction procedures. The results from this evaluation clarify the performance relationship between ten well-known existing energy-based pseudoknot-free RNA secondary structure prediction methods and clearly demonstrate the progress that has been achieved in recent years. Secondly, we introduce AveRNA, a generic and powerful method for combining a set of existing secondary structure prediction procedures into an ensemble-based method that achieves significantly higher prediction accuracies than obtained from any of its component procedures. Our new, ensemble-based method, AveRNA, improves the state of the art for energy-based, pseudoknot-free RNA secondary structure prediction by exploiting the complementary strengths of multiple existing prediction procedures, as demonstrated using a state-of-the-art statistical resampling approach. In addition, AveRNA allows an intuitive and effective control of the trade-off between false negative and false positive base pair predictions. Finally, AveRNA can make use of arbitrary sets of secondary structure prediction procedures and can therefore be used to leverage improvements in prediction accuracy offered by algorithms and energy models developed in the future. Our data, MATLAB software and a web-based version of AveRNA are publicly available at http://www.cs.ubc.ca/labs/beta/Software/AveRNA.

  13. JNSViewer—A JavaScript-based Nucleotide Sequence Viewer for DNA/RNA secondary structures

    PubMed Central

    Dong, Min; Graham, Mitchell; Yadav, Nehul

    2017-01-01

    Many tools are available for visualizing RNA or DNA secondary structures, but there is scarce implementation in JavaScript that provides seamless integration with the increasingly popular web computational platforms. We have developed JNSViewer, a highly interactive web service, which is bundled with several popular tools for DNA/RNA secondary structure prediction and can provide precise and interactive correspondence among nucleotides, dot-bracket data, secondary structure graphs, and genic annotations. In JNSViewer, users can perform RNA secondary structure predictions with different programs and settings, add customized genic annotations in GFF format to structure graphs, search for specific linear motifs, and extract relevant structure graphs of sub-sequences. JNSViewer also allows users to choose a transcript or specific segment of Arabidopsis thaliana genome sequences and predict the corresponding secondary structure. Popular genome browsers (i.e., JBrowse and BrowserGenome) were integrated into JNSViewer to provide powerful visualizations of chromosomal locations, genic annotations, and secondary structures. In addition, we used StructureFold with default settings to predict some RNA structures for Arabidopsis by incorporating in vivo high-throughput RNA structure profiling data and stored the results in our web server, which might be a useful resource for RNA secondary structure studies in plants. JNSViewer is available at http://bioinfolab.miamioh.edu/jnsviewer/index.html. PMID:28582416
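
    The correspondence between dot-bracket data and base pairs mentioned in this abstract can be sketched with a stack. This is an illustrative Python sketch, not JNSViewer's JavaScript code; it assumes plain dot-bracket notation with only `(`, `)` and `.` (no pseudoknot brackets), and the function name is hypothetical.

```python
def dot_bracket_to_pairs(structure):
    """Convert a dot-bracket string into a sorted list of (i, j) pairs."""
    stack, pairs = [], []
    for index, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(index)  # remember the opening position
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {index}")
            pairs.append((stack.pop(), index))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at {index}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


pairs = dot_bracket_to_pairs("(((...)))")
```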

  14. New insights from cluster analysis methods for RNA secondary structure prediction

    PubMed Central

    Rogers, Emily; Heitsch, Christine

    2016-01-01

    A widening gap exists between the best practices for RNA secondary structure prediction developed by computational researchers and the methods used in practice by experimentalists. Minimum free energy (MFE) predictions, although broadly used, are outperformed by methods which sample from the Boltzmann distribution and data mine the results. In particular, moving beyond the single structure prediction paradigm yields substantial gains in accuracy. Furthermore, the largest improvements in accuracy and precision come from viewing secondary structures not at the base pair level but at lower granularity/higher abstraction. This suggests that random errors affecting precision and systematic ones affecting accuracy are both reduced by this “fuzzier” view of secondary structures. Thus experimentalists who are willing to adopt a more rigorous, multilayered approach to secondary structure prediction by iterating through these levels of granularity will be much better able to capture fundamental aspects of RNA base pairing. PMID:26971529
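
    The Boltzmann distribution referred to in this abstract assigns each structure a probability proportional to exp(-E/RT), where E is its free energy. The sketch below computes these probabilities for a small, explicitly enumerated set of energies; real sampling tools work over the full structure ensemble without enumeration, and the energies, the temperature (37 °C) and the function name here are assumptions for illustration.

```python
import math

# Gas constant in kcal/(mol*K) times 310.15 K (37 degrees Celsius).
RT_KCAL_PER_MOL = 0.0019872 * 310.15


def boltzmann_probabilities(energies, rt=RT_KCAL_PER_MOL):
    """Return the Boltzmann probability of each structure given its
    free energy in kcal/mol."""
    lowest = min(energies)
    # Subtracting the lowest energy keeps the exponentials well scaled
    # and does not change the normalised probabilities.
    weights = [math.exp(-(energy - lowest) / rt) for energy in energies]
    partition = sum(weights)
    return [weight / partition for weight in weights]


# Hypothetical ensemble of three structures; the first is the MFE structure.
probs = boltzmann_probabilities([-2.0, -1.0, 0.0])
```

    Even in this toy ensemble the minimum free energy structure does not carry all of the probability mass, which is the motivation the abstract gives for looking beyond a single predicted structure.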

  15. Rtools: a web server for various secondary structural analyses on single RNA sequences.

    PubMed

    Hamada, Michiaki; Ono, Yukiteru; Kiryu, Hisanori; Sato, Kengo; Kato, Yuki; Fukunaga, Tsukasa; Mori, Ryota; Asai, Kiyoshi

    2016-07-08

    The secondary structures, as well as the nucleotide sequences, are important features of RNA molecules that characterize their functions. According to the thermodynamic model, however, the probability of any secondary structure is very small. As a consequence, any tool to predict the secondary structures of RNAs has limited accuracy. On the other hand, there are a few tools to compensate for the imperfect predictions by calculating and visualizing the secondary structural information from RNA sequences. It is desirable to obtain the rich information from those tools through a friendly interface. We implemented a web server of the tools to predict secondary structures and to calculate various structural features based on the energy models of secondary structures. By just giving an RNA sequence to the web server, the user can get the different types of solutions of the secondary structures, the marginal probabilities such as base-pairing probabilities, loop probabilities and accessibilities of the local bases, the energy changes by arbitrary base mutations as well as the measures for validations of the predicted secondary structures. The web server is available at http://rtools.cbrc.jp, which integrates software tools, CentroidFold, CentroidHomfold, IPKnot, CapR, Raccess, Rchange and RintD.

  16. Characterising RNA secondary structure space using information entropy

    PubMed Central

    2013-01-01

    Comparative methods for RNA secondary structure prediction use evolutionary information from RNA alignments to increase prediction accuracy. The model is often described in terms of stochastic context-free grammars (SCFGs), which generate a probability distribution over secondary structures. It is, however, unclear how this probability distribution changes as a function of the input alignment. As prediction programs typically only return a single secondary structure, better characterisation of the underlying probability space of RNA secondary structures is of great interest. In this work, we show how to efficiently compute the information entropy of the probability distribution over RNA secondary structures produced for RNA alignments by a phylo-SCFG, and implement it for the PPfold model. We also discuss interpretations and applications of this quantity, including how it can clarify reasons for low prediction reliability scores. PPfold and its source code are available from http://birc.au.dk/software/ppfold/. PMID:23368905
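
    The information entropy discussed in this abstract is the Shannon entropy of the probability distribution over secondary structures. The sketch below computes it for an explicitly listed distribution; this is only a definition-level illustration with a hypothetical function name, since the paper's contribution is computing the quantity efficiently for a phylo-SCFG without enumerating structures.

```python
import math


def shannon_entropy(probabilities):
    """Entropy in bits of a discrete probability distribution."""
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Terms with zero probability contribute nothing to the sum.
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


# A distribution concentrated on one structure has zero entropy;
# four equally likely structures give two bits.
certain = shannon_entropy([1.0])
uniform = shannon_entropy([0.25, 0.25, 0.25, 0.25])
```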

  17. The predicted secondary structures of class I fructose-bisphosphate aldolases.

    PubMed Central

    Sawyer, L; Fothergill-Gilmore, L A; Freemont, P S

    1988-01-01

    The results of several secondary-structure prediction programs were combined to produce an estimate of the regions of alpha-helix, beta-sheet and reverse turns for fructose-bisphosphate aldolases from human and rat muscle and liver, from Trypanosoma brucei and from Drosophila melanogaster. All the aldolase sequences gave essentially the same pattern of secondary-structure predictions despite having sequences up to 50% different. One exception to this pattern was an additional strongly predicted helix in the rat liver and Drosophila enzymes. Regions of relatively high sequence variation generally were predicted as reverse turns, and probably occur as surface loops. Most of the positions corresponding to exon boundaries are located between regions predicted to have secondary-structural elements consistent with a compact structure. The predominantly alternating alpha/beta structure predicted is consistent with the alpha/beta-barrel structure indicated by preliminary high-resolution X-ray diffraction studies on rabbit muscle aldolase [Sygusch, Beaudry & Allaire (1986) Biophys. J. 49, 287a]. PMID:3128269

  18. Prediction of beta-turns at over 80% accuracy based on an ensemble of predicted secondary structures and multiple alignments.

    PubMed

    Zheng, Ce; Kurgan, Lukasz

    2008-10-10

    The beta-turn is a secondary protein structure type that plays a significant role in protein folding, stability, and molecular recognition. To date, several methods for prediction of beta-turns from protein sequences have been developed, but they are characterized by relatively poor prediction quality. The novelty of the proposed sequence-based beta-turn predictor stems from the use of window-based information extracted from four predicted three-state secondary structures, which together with a selected set of position specific scoring matrix (PSSM) values serve as an input to the support vector machine (SVM) predictor. We show that (1) all four predicted secondary structures are useful; (2) the most useful information extracted from the predicted secondary structure includes the structure of the predicted residue, secondary structure content in a window around the predicted residue, and features that indicate whether the predicted residue is inside a secondary structure segment; (3) the PSSM values of Asn, Asp, Gly, Ile, Leu, Met, Pro, and Val were among the top ranked features, which corroborates recent studies. The Asn, Asp, Gly, and Pro indicate potential beta-turns, while the remaining four amino acids are useful to predict non-beta-turns. Empirical evaluation using three nonredundant datasets shows favorable Q total, Q predicted and MCC values when compared with over a dozen modern competing methods. Our method is the first to break the 80% Q total barrier and achieves Q total = 80.9%, MCC = 0.47, and Q predicted higher by over 6% when compared with the second best method. We use feature selection to reduce the dimensionality of the feature vector used as the input for the proposed prediction method. The applied feature set is smaller by 86, 62 and 37% when compared with the second and two third-best (with respect to MCC) competing methods, respectively. Experiments show that the proposed method constitutes an improvement over the competing prediction methods. The proposed prediction model can better discriminate between beta-turns and non-beta-turns due to obtaining lower numbers of false positive predictions. The prediction model and datasets are freely available at http://biomine.ece.ualberta.ca/BTNpred/BTNpred.html.
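
    The quality measures quoted in this abstract can be computed from a per-residue confusion matrix. The sketch below assumes the definitions commonly used for turn prediction: Q total as overall percentage accuracy, Q predicted as the percentage of predicted turns that are correct, and MCC as the Matthews correlation coefficient. The abstract does not spell these out, and the counts in the example are hypothetical, not results from the paper.

```python
import math


def turn_metrics(tp, tn, fp, fn):
    """Return (q_total, q_predicted, mcc) from confusion-matrix counts,
    where a positive is a residue labelled as a beta-turn."""
    total = tp + tn + fp + fn
    q_total = 100.0 * (tp + tn) / total
    q_predicted = 100.0 * tp / (tp + fp) if (tp + fp) else 0.0
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return q_total, q_predicted, mcc


# Hypothetical counts for 100 residues.
q_total, q_predicted, mcc = turn_metrics(tp=50, tn=30, fp=10, fn=10)
```

    Lowering the number of false positives raises both Q predicted and MCC, which is consistent with the abstract's explanation of why the model discriminates turns from non-turns better.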

  19. A statistical learning approach to the modeling of chromatographic retention of oligonucleotides incorporating sequence and secondary structure data

    PubMed Central

    Sturm, Marc; Quinten, Sascha; Huber, Christian G.; Kohlbacher, Oliver

    2007-01-01

    We propose a new model for predicting the retention time of oligonucleotides. The model is based on ν support vector regression using features derived from base sequence and predicted secondary structure of oligonucleotides. Because of the secondary structure information, the model is applicable even at relatively low temperatures where the secondary structure is not suppressed by thermal denaturing. This makes the prediction of oligonucleotide retention time for arbitrary temperatures possible, provided that the target temperature lies within the temperature range of the training data. We describe different possibilities of feature calculation from base sequence and secondary structure, present the results and compare our model to existing models. PMID:17567619

  20. PARTS: Probabilistic Alignment for RNA joinT Secondary structure prediction

    PubMed Central

    Harmanci, Arif Ozgun; Sharma, Gaurav; Mathews, David H.

    2008-01-01

    A novel method is presented for joint prediction of alignment and common secondary structures of two RNA sequences. The joint consideration of common secondary structures and alignment is accomplished by structural alignment over a search space defined by the newly introduced motif called matched helical regions. The matched helical region formulation generalizes previously employed constraints for structural alignment and thereby better accommodates the structural variability within RNA families. A probabilistic model based on pseudo free energies obtained from precomputed base pairing and alignment probabilities is utilized for scoring structural alignments. Maximum a posteriori (MAP) common secondary structures, sequence alignment and joint posterior probabilities of base pairing are obtained from the model via a dynamic programming algorithm called PARTS. The advantage of the more general structural alignment of PARTS is seen in secondary structure predictions for the RNase P family. For this family, the PARTS MAP predictions of secondary structures and alignment perform significantly better than prior methods that utilize a more restrictive structural alignment model. For the tRNA and 5S rRNA families, the richer structural alignment model of PARTS does not offer a benefit and the method therefore performs comparably with existing alternatives. For all RNA families studied, the posterior probability estimates obtained from PARTS offer an improvement over posterior probability estimates from a single sequence prediction. When considering the base pairings predicted over a threshold value of confidence, the combination of sensitivity and positive predictive value is superior for PARTS than for the single sequence prediction. PARTS source code is available for download under the GNU public license at http://rna.urmc.rochester.edu. PMID:18304945

  1. Prediction of beta-turns at over 80% accuracy based on an ensemble of predicted secondary structures and multiple alignments

    PubMed Central

    Zheng, Ce; Kurgan, Lukasz

    2008-01-01

    Background β-turn is a secondary protein structure type that plays a significant role in protein folding, stability, and molecular recognition. To date, several methods for prediction of β-turns from protein sequences have been developed, but they are characterized by relatively poor prediction quality. The novelty of the proposed sequence-based β-turn predictor stems from the usage of window-based information extracted from four predicted three-state secondary structures, which together with a selected set of position specific scoring matrix (PSSM) values serve as an input to the support vector machine (SVM) predictor. Results We show that (1) all four predicted secondary structures are useful; (2) the most useful information extracted from the predicted secondary structure includes the structure of the predicted residue, secondary structure content in a window around the predicted residue, and features that indicate whether the predicted residue is inside a secondary structure segment; (3) the PSSM values of Asn, Asp, Gly, Ile, Leu, Met, Pro, and Val were among the top ranked features, which corroborates recent studies. The Asn, Asp, Gly, and Pro indicate potential β-turns, while the remaining four amino acids are useful to predict non-β-turns. Empirical evaluation using three nonredundant datasets shows favorable Qtotal, Qpredicted and MCC values when compared with over a dozen modern competing methods. Our method is the first to break the 80% Qtotal barrier and achieves Qtotal = 80.9%, MCC = 0.47, and Qpredicted higher by over 6% when compared with the second best method. We use feature selection to reduce the dimensionality of the feature vector used as the input for the proposed prediction method. The applied feature set is smaller by 86, 62 and 37% when compared with the second and two third-best (with respect to MCC) competing methods, respectively. 
Conclusion Experiments show that the proposed method constitutes an improvement over the competing prediction methods. The proposed prediction model can better discriminate between β-turns and non-β-turns due to obtaining lower numbers of false positive predictions. The prediction model and datasets are freely available at . PMID:18847492
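
    The quality measures named in this abstract (Qtotal, Qpredicted and MCC) can be computed from a binary confusion matrix. The Python sketch below uses the definitions commonly found in the β-turn prediction literature, which are assumed here to match the paper's: Qtotal is the percentage of correctly classified residues, Qpredicted is the percentage of predicted turn residues that are correct, and MCC is the Matthews correlation coefficient. The function name and the counts are hypothetical.

```python
import math


def turn_metrics(tp, tn, fp, fn):
    """Return (Qtotal, Qpredicted, MCC) from confusion-matrix counts."""
    total = tp + tn + fp + fn
    # Qtotal: percentage of residues classified correctly.
    q_total = 100.0 * (tp + tn) / total
    # Qpredicted: percentage of predicted turn residues that are true turns.
    q_predicted = 100.0 * tp / (tp + fp) if (tp + fp) else 0.0
    # Matthews correlation coefficient; defined as 0 when a margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return q_total, q_predicted, mcc


# Hypothetical counts for illustration only.
q_total, q_predicted, mcc = turn_metrics(tp=40, tn=120, fp=20, fn=20)
```

    Because non-turn residues dominate typical datasets, MCC is reported alongside Qtotal: it stays low when a predictor simply labels everything as non-turn.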

  2. K-Partite RNA Secondary Structures

    NASA Astrophysics Data System (ADS)

    Jiang, Minghui; Tejada, Pedro J.; Lasisi, Ramoni O.; Cheng, Shanhong; Fechser, D. Scott

    RNA secondary structure prediction is a fundamental problem in structural bioinformatics. The prediction problem is difficult because RNA secondary structures may contain pseudoknots formed by crossing base pairs. We introduce k-partite secondary structures as a simple classification of RNA secondary structures with pseudoknots. An RNA secondary structure is k-partite if it is the union of k pseudoknot-free sub-structures. Most known RNA secondary structures are either bipartite or tripartite. We show that there exists a constant number k such that any secondary structure can be modified into a k-partite secondary structure with approximately the same free energy. This offers a partial explanation of the prevalence of k-partite secondary structures with small k. We give a complete characterization of the computational complexities of recognizing k-partite secondary structures for all k ≥ 2, and show that this recognition problem is essentially the same as the k-colorability problem on circle graphs. We present two simple heuristics, iterated peeling and first-fit packing, for finding k-partite RNA secondary structures. For maximizing the number of base pair stackings, our iterated peeling heuristic achieves a constant approximation ratio of at most k for 2 ≤ k ≤ 5, and at most $\frac{6}{1-(1-6/k)^k} \le \frac{6}{1-e^{-6}} < 6.01491$ for k ≥ 6. Experiment on sequences from PseudoBase shows that our first-fit packing heuristic outperforms the leading method HotKnots in predicting RNA secondary structures with pseudoknots. Source code, data set, and experimental results are available at http://www.cs.usu.edu/ mjiang/rna/kpartite/.
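
    The definition of a k-partite structure can be made concrete with a short Python sketch. It greedily places each base pair into the first pseudoknot-free group in which it crosses no other pair. This is only an illustration of the definition on a known set of base pairs: it is not the authors' first-fit packing heuristic (which predicts structures from sequence), and the number of groups it returns is an upper bound on the smallest k rather than the minimum. The function names and the example pairs are assumptions.

```python
def crosses(a, b):
    """Return True if base pairs a and b cross, i.e. form a pseudoknot."""
    (i, j), (k, l) = a, b
    return i < k < j < l or k < i < l < j


def first_fit_partition(pairs):
    """Greedily split base pairs into pseudoknot-free groups."""
    groups = []
    for pair in sorted(pairs):
        for group in groups:
            # Place the pair in the first group where nothing crosses it.
            if not any(crosses(pair, other) for other in group):
                group.append(pair)
                break
        else:
            # No existing group can take the pair, so open a new one.
            groups.append([pair])
    return groups


# A simple H-type pseudoknot (hypothetical positions): two crossing stems.
pseudoknot = [(1, 10), (2, 9), (5, 15), (6, 14)]
groups = first_fit_partition(pseudoknot)
```

    For this example the two stems land in separate groups, so the structure is bipartite. The recognition problem in general corresponds to coloring a circle graph, as the abstract notes, so a greedy pass like this one cannot be expected to find the smallest k in every case.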

  3. Prediction of protein secondary structure content for the twilight zone sequences.

    PubMed

    Homaeian, Leila; Kurgan, Lukasz A; Ruan, Jishou; Cios, Krzysztof J; Chen, Ke

    2007-11-15

    Secondary protein structure carries information about local structural arrangements, which include three major conformations: alpha-helices, beta-strands, and coils. The significant majority of successful methods for prediction of the secondary structure are based on multiple sequence alignment. However, multiple alignment fails to provide accurate results when a sequence comes from the twilight zone, that is, it is characterized by low (<30%) homology. To this end, we propose a novel method for prediction of secondary structure content through comprehensive sequence representation, called PSSC-core. The method uses a multiple linear regression model and introduces a comprehensive feature-based sequence representation to predict the amount of helices and strands for sequences from the twilight zone. The PSSC-core method was tested and compared with two other state-of-the-art prediction methods on a set of 2187 twilight zone sequences. The results indicate that our method provides better predictions for both helix and strand content. The PSSC-core is shown to provide statistically significantly better results when compared with the competing methods, reducing the prediction error by 5-7% for helix and 7-9% for strand content predictions. The proposed feature-based sequence representation uses a comprehensive set of physicochemical properties that are custom-designed for each of the helix and strand content predictions. It includes composition and composition moment vectors, frequency of tetra-peptides associated with helical and strand conformations, various property-based groups like exchange groups, chemical groups of the side chains and hydrophobic group, auto-correlations based on hydrophobicity, side-chain masses, hydropathy, and conformational patterns for beta-sheets. The PSSC-core method provides an alternative for predicting the secondary structure content that can be used to validate and constrain results of other structure prediction methods. 
At the same time, it also provides useful insight into design of successful protein sequence representations that can be used in developing new methods related to prediction of different aspects of the secondary protein structure. (c) 2007 Wiley-Liss, Inc.

  4. Fourier-based classification of protein secondary structures.

    PubMed

    Shu, Jian-Jun; Yong, Kian Yan

    2017-04-15

    The correct prediction of protein secondary structures is one of the key issues in predicting the correct protein folded shape, which is used for determining gene function. Existing methods make use of amino acid properties as indices to classify protein secondary structures, but are faced with a significant number of misclassifications. The paper presents a technique for the classification of protein secondary structures based on protein "signal-plotting" and the use of the Fourier technique for digital signal processing. New indices are proposed to classify protein secondary structures by analyzing hydrophobicity profiles. The approach is simple and straightforward. Results show that more types of protein secondary structures can be classified by means of these newly-proposed indices. Copyright © 2017 Elsevier Inc. All rights reserved.

  5. Protein secondary structure prediction using modular reciprocal bidirectional recurrent neural networks.

    PubMed

    Babaei, Sepideh; Geranmayeh, Amir; Seyyedsalehi, Seyyed Ali

    2010-12-01

    The supervised learning of recurrent neural networks well-suited for prediction of protein secondary structures from the underlying amino acid sequence is studied. Modular reciprocal recurrent neural networks (MRR-NN) are proposed to model the strong correlations between adjacent secondary structure elements. Besides, a multilayer bidirectional recurrent neural network (MBR-NN) is introduced to capture the long-range intramolecular interactions between amino acids in formation of the secondary structure. The final modular prediction system is devised based on the interactive integration of the MRR-NN and the MBR-NN structures to arbitrarily engage the neighboring effects of the secondary structure types concurrent with memorizing the sequential dependencies of amino acids along the protein chain. The advanced combined network augments the percentage accuracy (Q₃) to 79.36% and boosts the segment overlap (SOV) up to 70.09% when tested on the PSIPRED dataset in three-fold cross-validation. Copyright © 2010 Elsevier Ireland Ltd. All rights reserved.
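
    For readers unfamiliar with the Q₃ measure quoted here and in several other records, a minimal Python sketch follows. Q₃ is the percentage of residues whose predicted three-state label (helix H, strand E, coil C) matches the observed label. The function name and the example strings are assumptions for illustration; the segment overlap (SOV) score is more involved and is not shown.

```python
def q3_accuracy(predicted, observed):
    """Percentage of residues with a correctly predicted three-state label."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    # Count positions where the predicted state equals the observed state.
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


# Hypothetical 10-residue example: 8 of 10 states agree.
q3 = q3_accuracy("HHHECCCEEE", "HHHHCCCEEC")
```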

  6. A generalized analysis of hydrophobic and loop clusters within globular protein sequences

    PubMed Central

    Eudes, Richard; Le Tuan, Khanh; Delettré, Jean; Mornon, Jean-Paul; Callebaut, Isabelle

    2007-01-01

    Background Hydrophobic Cluster Analysis (HCA) is an efficient way to compare highly divergent sequences through the implicit secondary structure information directly derived from hydrophobic clusters. However, its efficiency and application are currently limited by the need of user expertise. In order to help the analysis of HCA plots, we report here the structural preferences of hydrophobic cluster species, which are frequently encountered in globular domains of proteins. These species are characterized only by their hydrophobic/non-hydrophobic dichotomy. This analysis has been extended to loop-forming clusters, using an appropriate loop alphabet. Results The structural behavior of hydrophobic cluster species, which are typical of protein globular domains, was investigated within banks of experimental structures, considered at different levels of sequence redundancy. The 294 more frequent hydrophobic cluster species were analyzed with regard to their association with the different secondary structures (frequencies of association with secondary structures and secondary structure propensities). Hydrophobic cluster species are predominantly associated with regular secondary structures, and a large part (60 %) reveals preferences for α-helices or β-strands. Moreover, the analysis of the hydrophobic cluster amino acid composition generally allows for finer prediction of the regular secondary structure associated with the considered cluster within a cluster species. We also investigated the behavior of loop forming clusters, using a "PGDNS" alphabet. These loop clusters do not overlap with hydrophobic clusters and are highly associated with coils. Finally, the structural information contained in the hydrophobic structural words, as deduced from experimental structures, was compared to the PSI-PRED predictions, revealing that β-strands and especially α-helices are generally over-predicted within the limits of typical β and α hydrophobic clusters. 
Conclusion The dictionary of hydrophobic clusters described here can help the HCA user to interpret and compare the HCA plots of globular protein sequences, as well as provides an original fundamental insight into the structural bricks of protein folds. Moreover, the novel loop cluster analysis brings additional information for secondary structure prediction on the whole sequence through a generalized cluster analysis (GCA), and not only on regular secondary structures. Such information lays the foundations for developing a new and original tool for secondary structure prediction. PMID:17210072

  7. Robust prediction of consensus secondary structures using averaged base pairing probability matrices.

    PubMed

    Kiryu, Hisanori; Kin, Taishin; Asai, Kiyoshi

    2007-02-15

    Recent transcriptomic studies have revealed the existence of a considerable number of non-protein-coding RNA transcripts in higher eukaryotic cells. To investigate the functional roles of these transcripts, it is of great interest to find conserved secondary structures from multiple alignments on a genomic scale. Since multiple alignments are often created using alignment programs that neglect the special conservation patterns of RNA secondary structures for computational efficiency, alignment failures can cause potential risks of overlooking conserved stem structures. We investigated the dependence of the accuracy of secondary structure prediction on the quality of alignments. We compared three algorithms that maximize the expected accuracy of secondary structures as well as other frequently used algorithms. We found that one of our algorithms, called McCaskill-MEA, was more robust against alignment failures than others. The McCaskill-MEA method first computes the base pairing probability matrices for all the sequences in the alignment and then obtains the base pairing probability matrix of the alignment by averaging over these matrices. The consensus secondary structure is predicted from this matrix such that the expected accuracy of the prediction is maximized. We show that the McCaskill-MEA method performs better than other methods, particularly when the alignment quality is low and when the alignment consists of many sequences. Our model has a parameter that controls the sensitivity and specificity of predictions. We discussed the uses of that parameter for multi-step screening procedures to search for conserved secondary structures and for assigning confidence values to the predicted base pairs. The C++ source code that implements the McCaskill-MEA algorithm and the test dataset used in this paper are available at http://www.ncrna.org/papers/McCaskillMEA/. Supplementary data are available at Bioinformatics online.
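
    The two steps described in this abstract (averaging per-sequence base pairing probability matrices, then predicting a consensus structure from the averaged matrix) can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the McCaskill-MEA implementation: the matrices are assumed to be already mapped onto alignment columns, the objective is simply the sum of averaged probabilities over nested pairs, the `min_prob` cutoff stands in for the paper's tunable parameter, and no minimum hairpin loop length is enforced. All names are hypothetical.

```python
def average_matrices(matrices):
    """Element-wise mean of equally sized square probability matrices."""
    n = len(matrices[0])
    count = len(matrices)
    return [[sum(m[i][j] for m in matrices) / count for j in range(n)]
            for i in range(n)]


def _sub(score, a, b):
    """Score of columns a..b, or 0 for an empty or single-column interval."""
    return score[a][b] if a < b else 0.0


def consensus_pairs(avg, min_prob=0.5):
    """Nested pairs maximizing the summed averaged pairing probability."""
    n = len(avg)
    score = [[0.0] * n for _ in range(n)]
    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            best = _sub(score, i + 1, j)  # column i left unpaired
            for k in range(i + 1, j + 1):  # column i paired with column k
                if avg[i][k] > min_prob:
                    cand = (avg[i][k] + _sub(score, i + 1, k - 1)
                            + _sub(score, k + 1, j))
                    best = max(best, cand)
            score[i][j] = best
    # Trace back through the table to recover one optimal set of pairs.
    pairs, stack = [], [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if i >= j:
            continue
        if score[i][j] == _sub(score, i + 1, j):
            stack.append((i + 1, j))
            continue
        for k in range(i + 1, j + 1):
            if avg[i][k] > min_prob:
                cand = (avg[i][k] + _sub(score, i + 1, k - 1)
                        + _sub(score, k + 1, j))
                if cand == score[i][j]:
                    pairs.append((i, k))
                    stack.extend([(i + 1, k - 1), (k + 1, j)])
                    break
    return sorted(pairs)


# Two toy 6-column matrices (hypothetical values, upper triangle only).
a = [[0.0] * 6 for _ in range(6)]
b = [[0.0] * 6 for _ in range(6)]
a[0][5], a[1][4] = 0.9, 0.8
b[0][5], b[1][4], b[2][3] = 0.7, 0.6, 0.4
avg = average_matrices([a, b])
pairs = consensus_pairs(avg, min_prob=0.5)
```

    Averaging means that a pair supported weakly by one sequence, such as columns 2 and 3 above, falls below the cutoff, while pairs supported across sequences are retained.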

  8. A parallel strategy for predicting the secondary structure of polycistronic microRNAs.

    PubMed

    Han, Dianwei; Tang, Guiliang; Zhang, Jun

    2013-01-01

    The biogenesis of a functional microRNA is largely dependent on the secondary structure of the microRNA precursor (pre-miRNA). Recently, it has been shown that microRNAs are present in the genome in the form of polycistronic transcriptional units in plants and animals. It will be important to design efficient computational methods to predict such structures for microRNA discovery and its applications in gene silencing. In this paper, we propose a parallel algorithm based on the master-slave architecture to predict the secondary structure from an input sequence. We conducted some experiments to verify the effectiveness of our parallel algorithm. The experimental results show that our algorithm is able to produce the optimal secondary structure of polycistronic microRNAs.

  9. FlexStem: improving predictions of RNA secondary structures with pseudoknots by reducing the search space.

    PubMed

    Chen, Xiang; He, Si-Min; Bu, Dongbo; Zhang, Fa; Wang, Zhiyong; Chen, Runsheng; Gao, Wen

    2008-09-15

    RNA secondary structures with pseudoknots are often predicted by minimizing free energy, which is proved to be NP-hard. Due to kinetic reasons the real RNA secondary structure often has local instead of global minimum free energy. This implies that we may improve the performance of RNA secondary structure prediction by taking kinetics into account and minimize free energy in a local area. We propose a novel algorithm named FlexStem to predict RNA secondary structures with pseudoknots. Still based on MFE criterion, FlexStem adopts comprehensive energy models that allow complex pseudoknots. Unlike classical thermodynamic methods, our approach aims to simulate the RNA folding process by successive addition of maximal stems, reducing the search space while maintaining or even improving the prediction accuracy. This reduced space is constructed by our maximal stem strategy and stem-adding rule induced from elaborate statistical experiments on real RNA secondary structures. The strategy and the rule also reflect the folding characteristic of RNA from a new angle and help compensate for the deficiency of merely relying on MFE in RNA structure prediction. We validate FlexStem by applying it to tRNAs, 5S rRNAs and a large number of pseudoknotted structures and compare it with the well-known algorithms such as RNAfold, PKNOTS, PknotsRG, HotKnots and ILM according to their overall sensitivities and specificities, as well as positive and negative controls on pseudoknots. The results show that FlexStem significantly increases the prediction accuracy through its local search strategy. Software is available at http://pfind.ict.ac.cn/FlexStem/. Supplementary data are available at Bioinformatics online.

  10. A Method for WD40 Repeat Detection and Secondary Structure Prediction

    PubMed Central

    Wang, Yang; Jiang, Fan; Zhuo, Zhu; Wu, Xian-Hui; Wu, Yun-Dong

    2013-01-01

    WD40-repeat proteins (WD40s), as one of the largest protein families in eukaryotes, play vital roles in assembling protein-protein/DNA/RNA complexes. WD40s fold into similar β-propeller structures despite diversified sequences. A program WDSP (WD40 repeat protein Structure Predictor) has been developed to accurately identify WD40 repeats and predict their secondary structures. The method is designed specifically for WD40 proteins by incorporating both local residue information and non-local family-specific structural features. It overcomes the problem of highly diversified protein sequences and variable loops. In addition, WDSP achieves a better prediction in identifying multiple WD40-domain proteins by taking the global combination of repeats into consideration. In secondary structure prediction, the average Q3 accuracy of WDSP in jack-knife test reaches 93.7%. A disease-related protein LRRK2 was used as a representative example to demonstrate the structure prediction. PMID:23776530

  11. Protein Secondary Structure Prediction Using AutoEncoder Network and Bayes Classifier

    NASA Astrophysics Data System (ADS)

    Wang, Leilei; Cheng, Jinyong

    2018-03-01

    Protein secondary structure prediction belongs to bioinformatics, and it is an important research area. In this paper, we propose a new method for predicting protein secondary structure using a Bayes classifier and an autoencoder network. Our experiments cover several algorithms, including the construction of the model, the classification of parameters and so on. The data set is the typical CB513 protein data set. Accuracy is evaluated with 3-fold cross-validation, from which we obtain the Q3 accuracy. The results illustrate that the autoencoder network improves the prediction accuracy of protein secondary structure.

  12. Knowledge-based computational intelligence development for predicting protein secondary structures from sequences.

    PubMed

    Shen, Hong-Bin; Yi, Dong-Liang; Yao, Li-Xiu; Yang, Jie; Chou, Kuo-Chen

    2008-10-01

    In the postgenomic age, with the avalanche of protein sequences generated and relatively slow progress in determining their structures by experiments, it is important to develop automated methods to predict the structure of a protein from its sequence. The membrane proteins are a special group in the protein family that accounts for approximately 30% of all proteins; however, solved membrane protein structures only represent less than 1% of known protein structures to date. Although a great success has been achieved for developing computational intelligence techniques to predict secondary structures in both globular and membrane proteins, there is still much challenging work in this regard. In this review article, we firstly summarize the recent progress of automation methodology development in predicting protein secondary structures, especially in membrane proteins; we will then give some future directions in this research field.

  13. Computational analysis of conserved RNA secondary structure in transcriptomes and genomes.

    PubMed

    Eddy, Sean R

    2014-01-01

    Transcriptomics experiments and computational predictions both enable systematic discovery of new functional RNAs. However, many putative noncoding transcripts arise instead from artifacts and biological noise, and current computational prediction methods have high false positive rates. I discuss prospects for improving computational methods for analyzing and identifying functional RNAs, with a focus on detecting signatures of conserved RNA secondary structure. An interesting new front is the application of chemical and enzymatic experiments that probe RNA structure on a transcriptome-wide scale. I review several proposed approaches for incorporating structure probing data into the computational prediction of RNA secondary structure. Using probabilistic inference formalisms, I show how all these approaches can be unified in a well-principled framework, which in turn allows RNA probing data to be easily integrated into a wide range of analyses that depend on RNA secondary structure inference. Such analyses include homology search and genome-wide detection of new structural RNAs.

  14. Artificial Intelligence in Prediction of Secondary Protein Structure Using CB513 Database

    PubMed Central

    Avdagic, Zikrija; Purisevic, Elvir; Omanovic, Samir; Coralic, Zlatan

    2009-01-01

    In this paper we describe CB513, a non-redundant dataset suitable for development of algorithms for prediction of secondary protein structure. A program was made in Borland Delphi for transforming data from our dataset to make it suitable for training a neural network for prediction of secondary protein structure, implemented in MATLAB Neural-Network Toolbox. Learning (training and testing) of the neural network is researched with different sizes of windows, different numbers of neurons in the hidden layer and different numbers of training epochs, while using dataset CB513. PMID:21347158

  15. MicroRNAfold: pre-microRNA secondary structure prediction based on modified NCM model with thermodynamics-based scoring strategy.

    PubMed

    Han, Dianwei; Zhang, Jun; Tang, Guiliang

    2012-01-01

    An accurate prediction of the pre-microRNA secondary structure is important in miRNA informatics. Based on a recently proposed model, nucleotide cyclic motifs (NCM), to predict RNA secondary structure, we propose and implement a Modified NCM (MNCM) model with a physics-based scoring strategy to tackle the problem of pre-microRNA folding. Our microRNAfold is implemented using a global optimal algorithm based on the bottom-up local optimal solutions. Our experimental results show that microRNAfold outperforms the current leading prediction tools in terms of True Negative rate, False Negative rate, Specificity, and Matthews coefficient ratio.

  16. Correlation of MFOLD-predicted DNA secondary structures with separation patterns obtained by capillary electrophoresis single-strand conformation polymorphism (CE-SSCP) analysis.

    PubMed

    Glavac, Damjan; Potocnik, Uros; Podpecnik, Darja; Zizek, Teofil; Smerkolj, Sava; Ravnik-Glavac, Metka

    2002-04-01

    We have studied 57 different mutations within three beta-globin gene promoter fragments with sizes 52 bp, 77 bp, and 193 bp by fluorescent capillary electrophoresis CE-SSCP analysis. For each mutation and wild type, energetically most-favorable predicted secondary structures were calculated for sense and antisense strands using the MFOLD DNA-folding algorithm in order to investigate if any correlation exists between predicted DNA structures and actual CE migration time shifts. The overall CE-SSCP detection rate was 100% for all mutations in three studied DNA fragments. For shorter 52 bp and 77 bp DNA fragments we obtained a positive correlation between the migration time shifts and difference in free energy values of predicted secondary structures at all temperatures. For longer 193 bp beta-globin gene fragments with 46 mutations MFOLD predicted different secondary structures for 89% of mutated strands at 25 degrees C and 40 degrees C. However, the magnitude of the mobility shifts did not necessarily correlate with their secondary structures and free energy values except for the sense strand at 40 degrees C where this correlation was statistically significant (r = 0.312, p = 0.033). Results of this study provided more direct insight into the mechanism of CE-SSCP and showed that MFOLD prediction could be helpful in making decisions about the running temperatures and in prediction of CE-SSCP data patterns, especially for shorter (50-100 bp) DNA fragments. Copyright 2002 Wiley-Liss, Inc.

  17. Role of DNA secondary structures in fragile site breakage along human chromosome 10

    PubMed Central

    Dillon, Laura W.; Pierce, Levi C. T.; Ng, Maggie C. Y.; Wang, Yuh-Hwa

    2013-01-01

    The formation of alternative DNA secondary structures can result in DNA breakage leading to cancer and other diseases. Chromosomal fragile sites, which are regions of the genome that exhibit chromosomal breakage under conditions of mild replication stress, are predicted to form stable DNA secondary structures. DNA breakage at fragile sites is associated with regions that are deleted, amplified or rearranged in cancer. Despite the correlation, unbiased examination of the ability to form secondary structures has not been evaluated in fragile sites. Here, using the Mfold program, we predict potential DNA secondary structure formation on the human chromosome 10 sequence, and utilize this analysis to compare fragile and non-fragile DNA. We found that aphidicolin (APH)-induced common fragile sites contain more sequence segments with potential high secondary structure-forming ability, and these segments clustered more densely than those in non-fragile DNA. Additionally, using a threshold of secondary structure-forming ability, we refined legitimate fragile sites within the cytogenetically defined boundaries, and identified potential fragile regions within non-fragile DNA. In vitro detection of alternative DNA structure formation and a DNA breakage cell assay were used to validate the computational predictions. Many of the regions identified by our analysis coincide with genes mutated in various diseases and regions of copy number alteration in cancer. This study supports the role of DNA secondary structures in common fragile site instability, provides a systematic method for their identification and suggests a mechanism by which DNA secondary structures can lead to human disease. PMID:23297364

  18. Thermodynamic heuristics with case-based reasoning: combined insights for RNA pseudoknot secondary structure.

    PubMed

    Al-Khatib, Ra'ed M; Rashid, Nur'Aini Abdul; Abdullah, Rosni

    2011-08-01

    The secondary structure of RNA pseudoknots has been extensively inferred and scrutinized by computational approaches. Experimental methods for determining RNA structure are time consuming and tedious; therefore, predictive computational approaches are required. Predicting the most accurate and energy-stable pseudoknot RNA secondary structure has been proven to be an NP-hard problem. In this paper, a new RNA folding approach, termed MSeeker, is presented; it includes KnotSeeker (a heuristic method) and Mfold (a thermodynamic algorithm). The global optimization of this thermodynamic heuristic approach was further enhanced by using a case-based reasoning technique as a local optimization method. MSeeker is a proposed algorithm for predicting RNA pseudoknot structure from individual sequences, especially long ones. This research demonstrates that MSeeker improves the sensitivity and specificity of existing RNA pseudoknot structure predictions. The performance and structural results from this proposed method were evaluated against seven other state-of-the-art pseudoknot prediction methods. The MSeeker method had better sensitivity than the DotKnot, FlexStem, HotKnots, pknotsRG, ILM, NUPACK and pknotsRE methods, with 79% of the predicted pseudoknot base-pairs being correct.

  19. An O(n^5) algorithm for MFE prediction of kissing hairpins and 4-chains in nucleic acids.

    PubMed

    Chen, Ho-Lin; Condon, Anne; Jabbari, Hosna

    2009-06-01

    Efficient methods for prediction of minimum free energy (MFE) nucleic secondary structures are widely used, both to better understand structure and function of biological RNAs and to design novel nano-structures. Here, we present a new algorithm for MFE secondary structure prediction, which significantly expands the class of structures that can be handled in O(n^5) time. Our algorithm can handle H-type pseudoknotted structures, kissing hairpins, and chains of four overlapping stems, as well as nested substructures of these types.

  20. Personal goals as predictors of intended classroom goals: comparing elementary and secondary school pre-service teachers.

    PubMed

    Daniels, Lia M; Frenzel, Anne C; Stupnisky, Robert H; Stewart, Tara L; Perry, Raymond P

    2013-09-01

    The literature documents fewer classroom mastery goal structures in secondary school compared to elementary. However, little is known about how personal achievement goals may influence classroom goal structures. This is especially true at the level of pre-service teachers. Our objective was to investigate if pre-service teachers' personal goals predicted their intended classroom goal structures. Participants were 125 elementary and 175 secondary school pre-service teachers from two Western Canadian universities. Structural equation modelling was used to examine if the structural relationships and latent means of personal and intended classroom goal structures differed for elementary and secondary school pre-service teachers. The results revealed that personal goals predicted the goal structures that pre-service teachers intended to establish; however, the relationships and means differed between elementary and secondary school pre-service teachers. Specifically, personal mastery-approach goals positively predicted classroom mastery goals much more strongly at the elementary than the secondary level. Furthermore, elementary pre-service teachers had significantly higher latent mean scores on personal mastery-approach goals than their secondary counterparts. It seems possible that the currently documented differences between classroom goal structures noted for elementary compared to secondary school may be based on the personal goals endorsed as pre-service teachers. The results are further discussed in terms of alignment with research on practising teachers' personal and classroom goals and implications for teacher education. © 2012 The British Psychological Society.

  1. RNA Secondary Structure Prediction by Using Discrete Mathematics: An Interdisciplinary Research Experience for Undergraduate Students

    PubMed Central

    Ellington, Roni; Wachira, James

    2010-01-01

    The focus of this Research Experience for Undergraduates (REU) project was on RNA secondary structure prediction by using a lattice walk approach. The lattice walk approach is a combinatorial and computational biology method used to enumerate possible secondary structures and predict RNA secondary structure from RNA sequences. The method uses discrete mathematical techniques and identifies specified base pairs as parameters. The goal of the REU was to introduce upper-level undergraduate students to the principles and challenges of interdisciplinary research in molecular biology and discrete mathematics. At the beginning of the project, students from the biology and mathematics departments of a mid-sized university received instruction on the role of secondary structure in the function of eukaryotic RNAs and RNA viruses, RNA related to combinatorics, and the National Center for Biotechnology Information resources. The student research projects focused on RNA secondary structure prediction on a regulatory region of the yellow fever virus RNA genome and on an untranslated region of an mRNA of a gene associated with the neurological disorder epilepsy. At the end of the project, the REU students gave poster and oral presentations, and they submitted written final project reports to the program director. The outcome of the REU was that the students gained transferable knowledge and skills in bioinformatics and an awareness of the applications of discrete mathematics to biological research problems. PMID:20810968
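
    The idea of enumerating possible secondary structures can be illustrated with the classical counting recurrence for pseudoknot-free structures on n bases. The Python sketch below is not the lattice walk method used in the REU project. It assumes that any two bases may pair and that at least `min_loop` unpaired bases separate the two ends of a pair, and the function name is hypothetical. For each length, the last base is either unpaired or paired with an earlier base j, which splits the problem into the bases before j and the bases enclosed by the pair.

```python
def count_secondary_structures(n, min_loop=1):
    """Count pseudoknot-free secondary structures on n bases.

    Sequence-independent: every pair of bases separated by at least
    min_loop unpaired bases is allowed to pair.
    """
    counts = [1] * (n + 1)  # counts[0] = 1 for the empty structure
    for length in range(1, n + 1):
        total = counts[length - 1]  # the last base is unpaired
        # The last base pairs with base j (1-based); the pair encloses
        # length - j - 1 bases, which must be at least min_loop.
        for j in range(1, length - min_loop):
            total += counts[j - 1] * counts[length - j - 1]
        counts[length] = total
    return counts[n]


first_counts = [count_secondary_structures(n) for n in range(8)]
```

    With `min_loop=1` the counts for n = 0 to 7 are 1, 1, 1, 2, 4, 8, 17 and 37, which shows how quickly the search space grows even for very short sequences.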

  2. RNA secondary structure prediction by using discrete mathematics: an interdisciplinary research experience for undergraduate students.

    PubMed

    Ellington, Roni; Wachira, James; Nkwanta, Asamoah

    2010-01-01

    The focus of this Research Experience for Undergraduates (REU) project was on RNA secondary structure prediction by using a lattice walk approach. The lattice walk approach is a combinatorial and computational biology method used to enumerate possible secondary structures and predict RNA secondary structure from RNA sequences. The method uses discrete mathematical techniques and identifies specified base pairs as parameters. The goal of the REU was to introduce upper-level undergraduate students to the principles and challenges of interdisciplinary research in molecular biology and discrete mathematics. At the beginning of the project, students from the biology and mathematics departments of a mid-sized university received instruction on the role of secondary structure in the function of eukaryotic RNAs and RNA viruses, RNA related to combinatorics, and the National Center for Biotechnology Information resources. The student research projects focused on RNA secondary structure prediction on a regulatory region of the yellow fever virus RNA genome and on an untranslated region of an mRNA of a gene associated with the neurological disorder epilepsy. At the end of the project, the REU students gave poster and oral presentations, and they submitted written final project reports to the program director. The outcome of the REU was that the students gained transferable knowledge and skills in bioinformatics and an awareness of the applications of discrete mathematics to biological research problems.

  3. Critical Features of Fragment Libraries for Protein Structure Prediction

    PubMed Central

    dos Santos, Karina Baptista

    2017-01-01

    The use of fragment libraries is a popular approach among protein structure prediction methods and has proven to substantially improve the quality of predicted structures. However, some vital aspects of a fragment library that influence the accuracy of modeling a native structure remain to be determined. This study investigates some of these features. Particularly, we analyze the effect of using secondary structure prediction guiding fragments selection, different fragments sizes and the effect of structural clustering of fragments within libraries. To have a clearer view of how these factors affect protein structure prediction, we isolated the process of model building by fragment assembly from some common limitations associated with prediction methods, e.g., imprecise energy functions and optimization algorithms, by employing an exact structure-based objective function under a greedy algorithm. Our results indicate that shorter fragments reproduce the native structure more accurately than the longer. Libraries composed of multiple fragment lengths generate even better structures, where longer fragments show to be more useful at the beginning of the simulations. The use of many different fragment sizes shows little improvement when compared to predictions carried out with libraries that comprise only three different fragment sizes. Models obtained from libraries built using only sequence similarity are, on average, better than those built with a secondary structure prediction bias. However, we found that the use of secondary structure prediction allows greater reduction of the search space, which is invaluable for prediction methods. The results of this study can be critical guidelines for the use of fragment libraries in protein structure prediction. PMID:28085928

  4. Critical Features of Fragment Libraries for Protein Structure Prediction.

    PubMed

    Trevizani, Raphael; Custódio, Fábio Lima; Dos Santos, Karina Baptista; Dardenne, Laurent Emmanuel

    2017-01-01

    The use of fragment libraries is a popular approach among protein structure prediction methods and has proven to substantially improve the quality of predicted structures. However, some vital aspects of a fragment library that influence the accuracy of modeling a native structure remain to be determined. This study investigates some of these features. Particularly, we analyze the effect of using secondary structure prediction guiding fragments selection, different fragments sizes and the effect of structural clustering of fragments within libraries. To have a clearer view of how these factors affect protein structure prediction, we isolated the process of model building by fragment assembly from some common limitations associated with prediction methods, e.g., imprecise energy functions and optimization algorithms, by employing an exact structure-based objective function under a greedy algorithm. Our results indicate that shorter fragments reproduce the native structure more accurately than the longer. Libraries composed of multiple fragment lengths generate even better structures, where longer fragments show to be more useful at the beginning of the simulations. The use of many different fragment sizes shows little improvement when compared to predictions carried out with libraries that comprise only three different fragment sizes. Models obtained from libraries built using only sequence similarity are, on average, better than those built with a secondary structure prediction bias. However, we found that the use of secondary structure prediction allows greater reduction of the search space, which is invaluable for prediction methods. The results of this study can be critical guidelines for the use of fragment libraries in protein structure prediction.

  5. Correlation of RNA secondary structure and attenuation of Sabin vaccine strains of poliovirus in tissue culture.

    PubMed

    Macadam, A J; Ferguson, G; Burlison, J; Stone, D; Skuce, R; Almond, J W; Minor, P D

    1992-08-01

    Part of the 5' noncoding regions of all three Sabin vaccine strains of poliovirus contains determinants of attenuation that are shown here to influence the ability of these strains to grow at elevated temperatures in BGM cells. The predicted RNA secondary structure of this region (nt 464-542 in P3/Sabin) suggests that both phenotypes are due to perturbation of base-paired stems. Ts phenotypes of site-directed mutants with defined changes in this region correlated well with predicted secondary structure stabilities. Reversal of base-pair orientation had little effect whereas stem disruption led to marked increases in temperature sensitivity. Phenotypic revertants of such viruses displayed mutations on either side of the stem. Mutations destabilizing stems led to intermediate phenotypes. These results provided evidence for the biological significance of the predicted RNA secondary structure.

  6. bpRNA: large-scale automated annotation and analysis of RNA secondary structure.

    PubMed

    Danaee, Padideh; Rouches, Mason; Wiley, Michelle; Deng, Dezhong; Huang, Liang; Hendrix, David

    2018-05-09

    While RNA secondary structure prediction from sequence data has made remarkable progress, there is a need for improved strategies for annotating the features of RNA secondary structures. Here, we present bpRNA, a novel annotation tool capable of parsing RNA structures, including complex pseudoknot-containing RNAs, to yield an objective, precise, compact, unambiguous, easily-interpretable description of all loops, stems, and pseudoknots, along with the positions, sequence, and flanking base pairs of each such structural feature. We also introduce several new informative representations of RNA structure types to improve structure visualization and interpretation. We have further used bpRNA to generate a web-accessible meta-database, 'bpRNA-1m', of over 100 000 single-molecule, known secondary structures; this is both more fully and accurately annotated and over 20-times larger than existing databases. We use a subset of the database with highly similar (≥90% identical) sequences filtered out to report on statistical trends in sequence, flanking base pairs, and length. Both the bpRNA method and the bpRNA-1m database will be valuable resources both for specific analysis of individual RNA molecules and large-scale analyses such as are useful for updating RNA energy parameters for computational thermodynamic predictions, improving machine learning models for structure prediction, and for benchmarking structure-prediction algorithms.

  7. MUFOLD-SS: New deep inception-inside-inception networks for protein secondary structure prediction.

    PubMed

    Fang, Chao; Shang, Yi; Xu, Dong

    2018-05-01

    Protein secondary structure prediction can provide important information for protein 3D structure prediction and protein functions. Deep learning offers a new opportunity to significantly improve prediction accuracy. In this article, a new deep neural network architecture, named the Deep inception-inside-inception (Deep3I) network, is proposed for protein secondary structure prediction and implemented as a software tool MUFOLD-SS. The input to MUFOLD-SS is a carefully designed feature matrix corresponding to the primary amino acid sequence of a protein, which consists of a rich set of information derived from individual amino acid, as well as the context of the protein sequence. Specifically, the feature matrix is a composition of physio-chemical properties of amino acids, PSI-BLAST profile, and HHBlits profile. MUFOLD-SS is composed of a sequence of nested inception modules and maps the input matrix to either eight states or three states of secondary structures. The architecture of MUFOLD-SS enables effective processing of local and global interactions between amino acids in making accurate prediction. In extensive experiments on multiple datasets, MUFOLD-SS outperformed the best existing methods and other deep neural networks significantly. MUFold-SS can be downloaded from http://dslsrv8.cs.missouri.edu/~cf797/MUFoldSS/download.html. © 2018 Wiley Periodicals, Inc.

  8. Polarization-dependent two-photon absorption for the determination of protein secondary structure: A theoretical study

    NASA Astrophysics Data System (ADS)

    Wanapun, Duangporn; Wampler, Ronald D.; Begue, Nathan J.; Simpson, Garth J.

    2008-03-01

    A new method for sensitive determination of protein secondary structure via multi-photon absorption is considered theoretically. Perturbation theory is developed to describe the polarization-dependent two-photon absorption (TPA) of α-helix and β-sheet protein secondary structures. The exciton coupling interactions responsible for relatively weak electronic circular dichroism in one-photon absorption are predicted to give rise to large changes in the TPA cross-section (>200%) for circular versus linear incident polarizations, defined as CLD. The CLD effect in TPA is electric dipole-allowed, which explains the much greater sensitivity. These predictions suggest TPA should be a viable means of sensitively probing protein secondary structure.

  9. Compilation of mRNA Polyadenylation Signals in Arabidopsis Revealed a New Signal Element and Potential Secondary Structures

    PubMed Central

    Loke, Johnny C.; Stahlberg, Eric A.; Strenski, David G.; Haas, Brian J.; Wood, Paul Chris; Li, Qingshun Quinn

    2005-01-01

    Using a novel program, SignalSleuth, and a database containing authenticated polyadenylation [poly(A)] sites, we analyzed the composition of mRNA poly(A) signals in Arabidopsis (Arabidopsis thaliana), and reevaluated previously described cis-elements within the 3′-untranslated (UTR) regions, including near upstream elements and far upstream elements. As predicted, there are absences of high-consensus signal patterns. The AAUAAA signal topped the near upstream elements patterns and was found within the predicted location to only approximately 10% of 3′-UTRs. More importantly, we identified a new set, named cleavage elements, of poly(A) signals flanking both sides of the cleavage site. These cis-elements were not previously revealed by conventional mutagenesis and are contemplated as a cluster of signals for cleavage site recognition. Moreover, a single-nucleotide profile scan on the 3′-UTR regions unveiled a distinct arrangement of alternate stretches of U and A nucleotides, which led to a prediction of the formation of secondary structures. Using an RNA secondary structure prediction program, mFold, we identified three main types of secondary structures on the sequences analyzed. Surprisingly, these observed secondary structures were all interrupted in previously constructed mutations in these regions. These results will enable us to revise the current model of plant poly(A) signals and to develop tools to predict 3′-ends for gene annotation. PMID:15965016

  10. Predicting residue-wise contact orders in proteins by support vector regression.

    PubMed

    Song, Jiangning; Burrage, Kevin

    2006-10-03

    The residue-wise contact order (RWCO) describes the sequence separations between the residues of interest and its contacting residues in a protein sequence. It is a new kind of one-dimensional protein structure that represents the extent of long-range contacts and is considered as a generalization of contact order. Together with secondary structure, accessible surface area, the B factor, and contact number, RWCO provides comprehensive and indispensable important information to reconstructing the protein three-dimensional structure from a set of one-dimensional structural properties. Accurately predicting RWCO values could have many important applications in protein three-dimensional structure prediction and protein folding rate prediction, and give deep insights into protein sequence-structure relationships. We developed a novel approach to predict residue-wise contact order values in proteins based on support vector regression (SVR), starting from primary amino acid sequences. We explored seven different sequence encoding schemes to examine their effects on the prediction performance, including local sequence in the form of PSI-BLAST profiles, local sequence plus amino acid composition, local sequence plus molecular weight, local sequence plus secondary structure predicted by PSIPRED, local sequence plus molecular weight and amino acid composition, local sequence plus molecular weight and predicted secondary structure, and local sequence plus molecular weight, amino acid composition and predicted secondary structure. When using local sequences with multiple sequence alignments in the form of PSI-BLAST profiles, we could predict the RWCO distribution with a Pearson correlation coefficient (CC) between the predicted and observed RWCO values of 0.55, and root mean square error (RMSE) of 0.82, based on a well-defined dataset with 680 protein sequences. 
    Moreover, by incorporating global features such as molecular weight and amino acid composition we could further improve the prediction performance with the CC to 0.57 and an RMSE of 0.79. In addition, combining the predicted secondary structure by PSIPRED was found to significantly improve the prediction performance and could yield the best prediction accuracy with a CC of 0.60 and RMSE of 0.78, which provided at least comparable performance compared with the other existing methods. The SVR method shows a prediction performance competitive with or at least comparable to the previously developed linear regression-based methods for predicting RWCO values. In contrast to support vector classification (SVC), SVR is very good at estimating the raw value profiles of the samples. The successful application of the SVR approach in this study reinforces the fact that support vector regression is a powerful tool in extracting the protein sequence-structure relationship and in estimating the protein structural profiles from amino acid sequences.
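
    The record above reports its results as a Pearson correlation coefficient (CC) and a root mean square error (RMSE) between predicted and observed values. As a minimal sketch of how these two measures are commonly computed (the function names `pearson_cc` and `rmse` are illustrative and are not taken from the paper), one might write:

```python
import math


def pearson_cc(predicted, observed):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(predicted) != len(observed) or len(predicted) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    # Covariance and variances are left unnormalised; the 1/n factors cancel.
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    return cov / math.sqrt(var_p * var_o)


def rmse(predicted, observed):
    """Root mean square error between predicted and observed values."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    squared = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return math.sqrt(squared / len(predicted))
```

    A higher CC and a lower RMSE both indicate closer agreement, which is how the reported change from 0.55/0.82 to 0.60/0.78 should be read.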

  11. Modular prediction of protein structural classes from sequences of twilight-zone identity with predicting sequences.

    PubMed

    Mizianty, Marcin J; Kurgan, Lukasz

    2009-12-13

    Knowledge of structural class is used by numerous methods for identification of structural/functional characteristics of proteins and could be used for the detection of remote homologues, particularly for chains that share twilight-zone similarity. In contrast to existing sequence-based structural class predictors, which target four major classes and which are designed for high identity sequences, we predict seven classes from sequences that share twilight-zone identity with the training sequences. The proposed MODular Approach to Structural class prediction (MODAS) method is unique as it allows for selection of any subset of the classes. MODAS is also the first to utilize a novel, custom-built feature-based sequence representation that combines evolutionary profiles and predicted secondary structure. The features quantify information relevant to the definition of the classes including conservation of residues and arrangement and number of helix/strand segments. Our comprehensive design considers 8 feature selection methods and 4 classifiers to develop Support Vector Machine-based classifiers that are tailored for each of the seven classes. Tests on 5 twilight-zone and 1 high-similarity benchmark datasets and comparison with over two dozens of modern competing predictors show that MODAS provides the best overall accuracy that ranges between 80% and 96.7% (83.5% for the twilight-zone datasets), depending on the dataset. This translates into 19% and 8% error rate reduction when compared against the best performing competing method on two largest datasets. The proposed predictor provides accurate predictions at 58% accuracy for membrane proteins class, which is not considered by majority of existing methods, in spite that this class accounts for only 2% of the data. Our predictive model is analyzed to demonstrate how and why the input features are associated with the corresponding classes. 
    The improved predictions stem from the novel features that express collocation of the secondary structure segments in the protein sequence and that combine evolutionary and secondary structure information. Our work demonstrates that conservation and arrangement of the secondary structure segments predicted along the protein chain can successfully predict structural classes which are defined based on the spatial arrangement of the secondary structures. A web server is available at http://biomine.ece.ualberta.ca/MODAS/.

  12. Modular prediction of protein structural classes from sequences of twilight-zone identity with predicting sequences

    PubMed Central

    2009-01-01

    Background: Knowledge of structural class is used by numerous methods for identification of structural/functional characteristics of proteins and could be used for the detection of remote homologues, particularly for chains that share twilight-zone similarity. In contrast to existing sequence-based structural class predictors, which target four major classes and which are designed for high identity sequences, we predict seven classes from sequences that share twilight-zone identity with the training sequences. Results: The proposed MODular Approach to Structural class prediction (MODAS) method is unique as it allows for selection of any subset of the classes. MODAS is also the first to utilize a novel, custom-built feature-based sequence representation that combines evolutionary profiles and predicted secondary structure. The features quantify information relevant to the definition of the classes including conservation of residues and arrangement and number of helix/strand segments. Our comprehensive design considers 8 feature selection methods and 4 classifiers to develop Support Vector Machine-based classifiers that are tailored for each of the seven classes. Tests on 5 twilight-zone and 1 high-similarity benchmark datasets and comparison with over two dozens of modern competing predictors show that MODAS provides the best overall accuracy that ranges between 80% and 96.7% (83.5% for the twilight-zone datasets), depending on the dataset. This translates into 19% and 8% error rate reduction when compared against the best performing competing method on two largest datasets. The proposed predictor provides accurate predictions at 58% accuracy for membrane proteins class, which is not considered by majority of existing methods, in spite that this class accounts for only 2% of the data. Our predictive model is analyzed to demonstrate how and why the input features are associated with the corresponding classes.
    Conclusions: The improved predictions stem from the novel features that express collocation of the secondary structure segments in the protein sequence and that combine evolutionary and secondary structure information. Our work demonstrates that conservation and arrangement of the secondary structure segments predicted along the protein chain can successfully predict structural classes which are defined based on the spatial arrangement of the secondary structures. A web server is available at http://biomine.ece.ualberta.ca/MODAS/. PMID:20003388

  13. RNA secondary structure prediction using soft computing.

    PubMed

    Ray, Shubhra Sankar; Pal, Sankar K

    2013-01-01

    Prediction of RNA structure is invaluable in creating new drugs and understanding genetic diseases. Several deterministic algorithms and soft computing-based techniques have been developed for more than a decade to determine the structure from a known RNA sequence. Soft computing gained importance with the need to get approximate solutions for RNA sequences by considering the issues related with kinetic effects, cotranscriptional folding, and estimation of certain energy parameters. A brief description of some of the soft computing-based techniques, developed for RNA secondary structure prediction, is presented along with their relevance. The basic concepts of RNA and its different structural elements like helix, bulge, hairpin loop, internal loop, and multiloop are described. These are followed by different methodologies, employing genetic algorithms, artificial neural networks, and fuzzy logic. The role of various metaheuristics, like simulated annealing, particle swarm optimization, ant colony optimization, and tabu search is also discussed. A relative comparison among different techniques, in predicting 12 known RNA secondary structures, is presented, as an example. Future challenging issues are then mentioned.

  14. A fast and robust iterative algorithm for prediction of RNA pseudoknotted secondary structures

    PubMed Central

    2014-01-01

    Background: Improving accuracy and efficiency of computational methods that predict pseudoknotted RNA secondary structures is an ongoing challenge. Existing methods based on free energy minimization tend to be very slow and are limited in the types of pseudoknots that they can predict. Incorporating known structural information can improve prediction accuracy; however, there are not many methods for prediction of pseudoknotted structures that can incorporate structural information as input. There is even less understanding of the relative robustness of these methods with respect to partial information. Results: We present a new method, Iterative HFold, for pseudoknotted RNA secondary structure prediction. Iterative HFold takes as input a pseudoknot-free structure, and produces a possibly pseudoknotted structure whose energy is at least as low as that of any (density-2) pseudoknotted structure containing the input structure. Iterative HFold leverages strengths of earlier methods, namely the fast running time of HFold, a method that is based on the hierarchical folding hypothesis, and the energy parameters of HotKnots V2.0. Our experimental evaluation on a large data set shows that Iterative HFold is robust with respect to partial information, with average accuracy on pseudoknotted structures steadily increasing from roughly 54% to 79% as the user provides up to 40% of the input structure. Iterative HFold is much faster than HotKnots V2.0, while having comparable accuracy. Iterative HFold also has significantly better accuracy than IPknot on our HK-PK and IP-pk168 data sets. Conclusions: Iterative HFold is a robust method for prediction of pseudoknotted RNA secondary structures, whose accuracy with more than 5% information about true pseudoknot-free structures is better than that of IPknot, and with about 35% information about true pseudoknot-free structures compares well with that of HotKnots V2.0 while being significantly faster.
    Iterative HFold and all data used in this work are freely available at http://www.cs.ubc.ca/~hjabbari/software.php. PMID:24884954

  15. R-chie: a web server and R package for visualizing RNA secondary structures

    PubMed Central

    Lai, Daniel; Proctor, Jeff R.; Zhu, Jing Yun A.; Meyer, Irmtraud M.

    2012-01-01

    Visually examining RNA structures can greatly aid in understanding their potential functional roles and in evaluating the performance of structure prediction algorithms. As many functional roles of RNA structures can already be studied given the secondary structure of the RNA, various methods have been devised for visualizing RNA secondary structures. Most of these methods depict a given RNA secondary structure as a planar graph consisting of base-paired stems interconnected by roundish loops. In this article, we present an alternative method of depicting RNA secondary structure as arc diagrams. This is well suited for structures that are difficult or impossible to represent as planar stem-loop diagrams. Arc diagrams can intuitively display pseudo-knotted structures, as well as transient and alternative structural features. In addition, they facilitate the comparison of known and predicted RNA secondary structures. An added benefit is that structure information can be displayed in conjunction with a corresponding multiple sequence alignments, thereby highlighting structure and primary sequence conservation and variation. We have implemented the visualization algorithm as a web server R-chie as well as a corresponding R package called R4RNA, which allows users to run the software locally and across a range of common operating systems. PMID:22434875
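
    The arc-diagram idea in the record above can be illustrated with a small sketch. It assumes the structure is given in dot-bracket notation with extra bracket types for pseudoknots; this is a common convention but is not stated in the abstract, and the functions below are hypothetical helpers, not part of R-chie or R4RNA. Each base pair becomes one arc `(i, j)`, and two arcs that interleave indicate a pseudoknot, which is why arcs can show structures that planar stem-loop drawings cannot.

```python
def dot_bracket_to_arcs(structure):
    """Return sorted (i, j) base-pair arcs from a dot-bracket string.

    Assumed notation: '.' is unpaired; '()', '[]', '{}' and '<>' are
    independent bracket families, so pseudoknots can be expressed.
    """
    closer_to_opener = {")": "(", "]": "[", "}": "{", ">": "<"}
    stacks = {opener: [] for opener in closer_to_opener.values()}
    arcs = []
    for i, ch in enumerate(structure):
        if ch in stacks:
            stacks[ch].append(i)              # remember the opening position
        elif ch in closer_to_opener:
            stack = stacks[closer_to_opener[ch]]
            if not stack:
                raise ValueError(f"unmatched {ch!r} at position {i}")
            arcs.append((stack.pop(), i))     # one arc per base pair
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r} at position {i}")
    if any(stacks.values()):
        raise ValueError("unmatched opening bracket")
    return sorted(arcs)


def arcs_cross(a, b):
    """True if two arcs interleave (i < k < j < l), i.e. form a pseudoknot."""
    (i, j), (k, l) = sorted([a, b])
    return i < k < j < l
```

    For example, `dot_bracket_to_arcs("((..))")` yields two nested arcs, while the two arcs of `"([)]"` cross.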

  16. Bi-objective integer programming for RNA secondary structure prediction with pseudoknots.

    PubMed

    Legendre, Audrey; Angel, Eric; Tahi, Fariza

    2018-01-15

    RNA structure prediction is an important field in bioinformatics, and numerous methods and tools have been proposed. Pseudoknots are specific motifs of RNA secondary structures that are difficult to predict. Almost all existing methods are based on a single model and return one solution, often missing the real structure. An alternative approach would be to combine different models and return a (small) set of solutions, maximizing its quality and diversity in order to increase the probability that it contains the real structure. We propose here an original method for predicting RNA secondary structures with pseudoknots, based on integer programming. We developed a generic bi-objective integer programming algorithm allowing to return optimal and sub-optimal solutions optimizing simultaneously two models. This algorithm was then applied to the combination of two known models of RNA secondary structure prediction, namely MEA and MFE. The resulting tool, called BiokoP, is compared with the other methods in the literature. The results show that the best solution (structure with the highest F1-score) is, in most cases, given by BiokoP. Moreover, the results of BiokoP are homogeneous, regardless of the pseudoknot type or the presence or not of pseudoknots. Indeed, the F1-scores are always higher than 70% for any number of solutions returned. The results obtained by BiokoP show that combining the MEA and the MFE models, as well as returning several optimal and several sub-optimal solutions, allow to improve the prediction of secondary structures. One perspective of our work is to combine better mono-criterion models, in particular to combine a model based on the comparative approach with the MEA and the MFE models. This leads to develop in the future a new multi-objective algorithm to combine more than two models. BiokoP is available on the EvryRNA platform: https://EvryRNA.ibisc.univ-evry.fr.
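
    The F1-score used to rank structures in the record above is, in its usual form, the harmonic mean of positive predictive value and sensitivity over base pairs. The sketch below assumes base pairs are compared as exact `(i, j)` matches; the abstract does not say how BiokoP's evaluation treats near-misses, so this is an illustration and not the authors' scoring code.

```python
def base_pair_f1(predicted_pairs, reference_pairs):
    """F1-score between predicted and reference base pairs (exact matches)."""
    predicted = set(predicted_pairs)
    reference = set(reference_pairs)
    true_positives = len(predicted & reference)
    if true_positives == 0:
        return 0.0  # also covers empty predictions or references
    ppv = true_positives / len(predicted)          # positive predictive value
    sensitivity = true_positives / len(reference)  # recall
    return 2 * ppv * sensitivity / (ppv + sensitivity)
```

    With three predicted pairs of which two are in a two-pair reference, PPV is 2/3 and sensitivity is 1, giving an F1-score of 0.8.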

  17. Cascaded bidirectional recurrent neural networks for protein secondary structure prediction.

    PubMed

    Chen, Jinmiao; Chaudhari, Narendra

    2007-01-01

    Protein secondary structure (PSS) prediction is an important topic in bioinformatics. Our study on a large set of non-homologous proteins shows that long-range interactions commonly exist and negatively affect PSS prediction. Besides, we also reveal strong correlations between secondary structure (SS) elements. In order to take into account the long-range interactions and SS-SS correlations, we propose a novel prediction system based on cascaded bidirectional recurrent neural network (BRNN). We compare the cascaded BRNN against another two BRNN architectures, namely the original BRNN architecture used for speech recognition as well as Pollastri's BRNN that was proposed for PSS prediction. Our cascaded BRNN achieves an overall three state accuracy Q3 of 74.38%, and reaches a high Segment OVerlap (SOV) of 66.0455. It outperforms the original BRNN and Pollastri's BRNN in both Q3 and SOV. Specifically, it improves the SOV score by 4-6%.
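
    The three-state accuracy Q3 quoted in the record above is conventionally the percentage of residues whose predicted state (helix, strand or coil) matches the observed state. A minimal sketch follows, assuming both structures are given as equal-length strings over the letters H, E and C; the letter coding and the function name `q3_accuracy` are assumptions for illustration. The segment-overlap measure SOV is more involved and is not reproduced here.

```python
def q3_accuracy(predicted, observed):
    """Percentage of residues with a correctly predicted three-state label."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("need two non-empty strings of equal length")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)
```

    For instance, a nine-residue prediction with one wrong state scores 8/9, or about 88.9%.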

  18. BeStSel: a web server for accurate protein secondary structure prediction and fold recognition from the circular dichroism spectra.

    PubMed

    Micsonai, András; Wien, Frank; Bulyáki, Éva; Kun, Judit; Moussong, Éva; Lee, Young-Ho; Goto, Yuji; Réfrégiers, Matthieu; Kardos, József

    2018-06-11

    Circular dichroism (CD) spectroscopy is a widely used method to study the protein secondary structure. However, for decades, the general opinion was that the correct estimation of β-sheet content is challenging because of the large spectral and structural diversity of β-sheets. Recently, we showed that the orientation and twisting of β-sheets account for the observed spectral diversity, and developed a new method to estimate accurately the secondary structure (PNAS, 112, E3095). BeStSel web server provides the Beta Structure Selection method to analyze the CD spectra recorded by conventional or synchrotron radiation CD equipment. Both normalized and measured data can be uploaded to the server either as a single spectrum or series of spectra. The originality of BeStSel is that it carries out a detailed secondary structure analysis providing information on eight secondary structure components including parallel-β structure and antiparallel β-sheets with three different groups of twist. Based on these, it predicts the protein fold down to the topology/homology level of the CATH protein fold classification. The server also provides a module to analyze the structures deposited in the PDB for BeStSel secondary structure contents in relation to Dictionary of Secondary Structure of Proteins data. The BeStSel server is freely accessible at http://bestsel.elte.hu.

  19. Extracting physicochemical features to predict protein secondary structure.

    PubMed

    Huang, Yin-Fu; Chen, Shu-Ying

    2013-01-01

    We propose a protein secondary structure prediction method based on position-specific scoring matrix (PSSM) profiles and four physicochemical features including conformation parameters, net charges, hydrophobic, and side chain mass. First, the SVM with the optimal window size and the optimal parameters of the kernel function is found. Then, we train the SVM using the PSSM profiles generated from PSI-BLAST and the physicochemical features extracted from the CB513 data set. Finally, we use the filter to refine the predicted results from the trained SVM. For all the performance measures of our method, Q3 reaches 79.52, SOV94 reaches 86.10, and SOV99 reaches 74.60; all the measures are higher than those of the SVMpsi method and the SVMfreq method. This validates that considering these physicochemical features in predicting protein secondary structure would exhibit better performances.

  20. Extracting Physicochemical Features to Predict Protein Secondary Structure

    PubMed Central

    Chen, Shu-Ying

    2013-01-01

    We propose a protein secondary structure prediction method based on position-specific scoring matrix (PSSM) profiles and four physicochemical features: conformation parameters, net charges, hydrophobicity, and side chain mass. First, the SVM with the optimal window size and the optimal parameters of the kernel function is found. Then, we train the SVM using the PSSM profiles generated from PSI-BLAST and the physicochemical features extracted from the CB513 data set. Finally, we use a filter to refine the predicted results from the trained SVM. For the performance measures of our method, Q3 reaches 79.52, SOV94 reaches 86.10, and SOV99 reaches 74.60; all the measures are higher than those of the SVMpsi method and the SVMfreq method. This validates that considering these physicochemical features improves the prediction of protein secondary structure. PMID:23766688

  1. Correlation of RNA secondary structure statistics with thermodynamic stability and applications to folding.

    PubMed

    Wu, Johnny C; Gardner, David P; Ozer, Stuart; Gutell, Robin R; Ren, Pengyu

    2009-08-28

    The accurate prediction of the secondary and tertiary structure of an RNA with different folding algorithms is dependent on several factors, including the energy functions. However, an RNA higher-order structure cannot be predicted accurately from its sequence based on a limited set of energy parameters. The inter- and intramolecular forces between this RNA and other small molecules and macromolecules, in addition to other factors in the cell such as pH, ionic strength, and temperature, influence the complex dynamics associated with transition of a single stranded RNA to its secondary and tertiary structure. Since all of the factors that affect the formation of an RNA's 3D structure cannot be determined experimentally, statistically derived potential energy has been used in the prediction of protein structure. In the current work, we evaluate the statistical free energy of various secondary structure motifs, including base-pair stacks, hairpin loops, and internal loops, using their statistical frequency obtained from the comparative analysis of more than 50,000 RNA sequences stored in the RNA Comparative Analysis Database (rCAD) at the Comparative RNA Web (CRW) Site. Statistical energy was computed from the structural statistics for several datasets. While the statistical energy for a base-pair stack correlates with experimentally derived free energy values, suggesting a Boltzmann-like distribution, variation is observed between different molecules and their location on the phylogenetic tree of life. Our statistical energy values calculated for several structural elements were utilized in the Mfold RNA-folding algorithm. The combined statistical energy values for base-pair stacks, hairpins and internal loop flanks result in a significant improvement in the accuracy of secondary structure prediction; the hairpin flanks contribute the most.
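
    A Boltzmann-like relation between motif frequency and energy is commonly written as E = -RT ln(f_obs / f_ref). The sketch below illustrates that general inverse-Boltzmann form only; the abstract does not state the reference state or temperature the authors used, so the function name, the default temperature, and the example frequencies are assumptions.

```python
import math

R_KCAL_PER_MOL_K = 1.987204e-3  # gas constant in kcal/(mol*K)


def statistical_energy(observed_freq: float,
                       reference_freq: float,
                       temperature_k: float = 310.15) -> float:
    """Inverse-Boltzmann energy (kcal/mol) from observed and reference frequencies.

    The reference frequency and the 310.15 K default are illustrative assumptions.
    """
    if observed_freq <= 0 or reference_freq <= 0:
        raise ValueError("frequencies must be positive")
    return -R_KCAL_PER_MOL_K * temperature_k * math.log(observed_freq / reference_freq)


# A motif seen twice as often as the reference expectation gets a negative
# (favorable) statistical energy; one seen exactly as often gets zero.
favorable = statistical_energy(0.2, 0.1)
neutral = statistical_energy(0.1, 0.1)
```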

  2. Predicting β-turns and their types using predicted backbone dihedral angles and secondary structures

    PubMed Central

    2010-01-01

    Background β-turns are secondary structure elements usually classified as coil. Their prediction is important, because of their role in protein folding and their frequent occurrence in protein chains. Results We have developed a novel method that predicts β-turns and their types using information from multiple sequence alignments, predicted secondary structures and, for the first time, predicted dihedral angles. Our method uses support vector machines, a supervised classification technique, and is trained and tested on three established datasets of 426, 547 and 823 protein chains. We achieve a Matthews correlation coefficient of up to 0.49, when predicting the location of β-turns, the highest reported value to date. Moreover, the additional dihedral information improves the prediction of β-turn types I, II, IV, VIII and "non-specific", achieving correlation coefficients up to 0.39, 0.33, 0.27, 0.14 and 0.38, respectively. Our results are more accurate than other methods. Conclusions We have created an accurate predictor of β-turns and their types. Our method, called DEBT, is available online at http://comp.chem.nottingham.ac.uk/debt/. PMID:20673368

  3. Predicting beta-turns and their types using predicted backbone dihedral angles and secondary structures.

    PubMed

    Kountouris, Petros; Hirst, Jonathan D

    2010-07-31

    Beta-turns are secondary structure elements usually classified as coil. Their prediction is important, because of their role in protein folding and their frequent occurrence in protein chains. We have developed a novel method that predicts beta-turns and their types using information from multiple sequence alignments, predicted secondary structures and, for the first time, predicted dihedral angles. Our method uses support vector machines, a supervised classification technique, and is trained and tested on three established datasets of 426, 547 and 823 protein chains. We achieve a Matthews correlation coefficient of up to 0.49, when predicting the location of beta-turns, the highest reported value to date. Moreover, the additional dihedral information improves the prediction of beta-turn types I, II, IV, VIII and "non-specific", achieving correlation coefficients up to 0.39, 0.33, 0.27, 0.14 and 0.38, respectively. Our results are more accurate than other methods. We have created an accurate predictor of beta-turns and their types. Our method, called DEBT, is available online at http://comp.chem.nottingham.ac.uk/debt/.
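
    The Matthews correlation coefficient reported above is computed from the four cells of a binary confusion matrix (turn versus non-turn). A minimal sketch, with a hypothetical function name and made-up counts rather than the paper's data:

```python
import math


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a binary confusion matrix.

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


perfect = mcc(tp=50, tn=50, fp=0, fn=0)
inverted = mcc(tp=0, tn=0, fp=50, fn=50)
chance = mcc(tp=25, tn=25, fp=25, fn=25)
```

    Because beta-turn residues are a minority class, MCC is more informative here than raw accuracy, which a predictor could inflate by labelling everything as non-turn.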

  4. Prediction of Spontaneous Protein Deamidation from Sequence-Derived Secondary Structure and Intrinsic Disorder.

    PubMed

    Lorenzo, J Ramiro; Alonso, Leonardo G; Sánchez, Ignacio E

    2015-01-01

    Asparagine residues in proteins undergo spontaneous deamidation, a post-translational modification that may act as a molecular clock for the regulation of protein function and turnover. Asparagine deamidation is modulated by protein local sequence, secondary structure and hydrogen bonding. We present NGOME, an algorithm able to predict non-enzymatic deamidation of internal asparagine residues in proteins in the absence of structural data, using sequence-based predictions of secondary structure and intrinsic disorder. Compared to previous algorithms, NGOME does not require three-dimensional structures yet yields better predictions than available sequence-only methods. Four case studies of specific proteins show how NGOME may help the user identify deamidation-prone asparagine residues, often related to protein gain of function, protein degradation or protein misfolding in pathological processes. A fifth case study applies NGOME at a proteomic scale and unveils a correlation between asparagine deamidation and protein degradation in yeast. NGOME is freely available as a webserver at the National EMBnet node Argentina, URL: http://www.embnet.qb.fcen.uba.ar/ in the subpage "Protein and nucleic acid structure and sequence analysis".

  5. CompaRNA: a server for continuous benchmarking of automated methods for RNA secondary structure prediction

    PubMed Central

    Puton, Tomasz; Kozlowski, Lukasz P.; Rother, Kristian M.; Bujnicki, Janusz M.

    2013-01-01

    We present a continuous benchmarking approach for the assessment of RNA secondary structure prediction methods implemented in the CompaRNA web server. As of 3 October 2012, the performance of 28 single-sequence and 13 comparative methods has been evaluated on RNA sequences/structures released weekly by the Protein Data Bank. We also provide a static benchmark generated on RNA 2D structures derived from the RNAstrand database. Benchmarks on both data sets offer insight into the relative performance of RNA secondary structure prediction methods on RNAs of different size and with respect to different types of structure. According to our tests, on the average, the most accurate predictions obtained by a comparative approach are generated by CentroidAlifold, MXScarna, RNAalifold and TurboFold. On the average, the most accurate predictions obtained by single-sequence analyses are generated by CentroidFold, ContextFold and IPknot. The best comparative methods typically outperform the best single-sequence methods if an alignment of homologous RNA sequences is available. This article presents the results of our benchmarks as of 3 October 2012, whereas the rankings presented online are continuously updated. We will gladly include new prediction methods and new measures of accuracy in the new editions of CompaRNA benchmarks. PMID:23435231
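
    The abstract does not list the accuracy measures used in the benchmark. Sensitivity and positive predictive value over base pairs are common choices in this field, and the sketch below shows how they could be computed from dot-bracket strings. The function names and the example structures are hypothetical, and pseudoknot brackets are not handled.

```python
def pairs_from_dot_bracket(structure: str) -> set:
    """Return the set of (i, j) base pairs encoded by a dot-bracket string."""
    stack, pairs = [], set()
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure: unmatched ')'")
            pairs.add((stack.pop(), i))
    if stack:
        raise ValueError("unbalanced structure: unmatched '('")
    return pairs


def sensitivity_ppv(predicted: str, reference: str) -> tuple:
    """Sensitivity = TP / reference pairs; PPV = TP / predicted pairs."""
    pred = pairs_from_dot_bracket(predicted)
    ref = pairs_from_dot_bracket(reference)
    tp = len(pred & ref)
    sensitivity = tp / len(ref) if ref else 0.0
    ppv = tp / len(pred) if pred else 0.0
    return sensitivity, ppv


# The prediction misses the outermost pair of a 4-pair hairpin.
sens, ppv = sensitivity_ppv(".(((...))).", "((((...))))")
```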

  6. Bioinformatics approaches for structural and functional analysis of proteins in secondary metabolism in Withania somnifera.

    PubMed

    Sanchita; Singh, Swati; Sharma, Ashok

    2014-11-01

    Withania somnifera (Ashwagandha) is a rich storehouse of a large number of pharmacologically active secondary metabolites known as withanolides. These secondary metabolites are produced by the withanolide biosynthetic pathway. Very little information is available on the structural and functional aspects of the enzymes involved in the withanolide biosynthetic pathways of Withania somnifera. We therefore performed a bioinformatics analysis to look at the functional and structural properties of these important enzymes. The pathway enzymes taken for this study were 3-Hydroxy-3-methylglutaryl coenzyme A reductase, 1-Deoxy-D-xylulose-5-phosphate synthase, 1-Deoxy-D-xylulose-5-phosphate reductase, farnesyl pyrophosphate synthase, squalene synthase, squalene epoxidase, and cycloartenol synthase. The prediction of secondary structure was performed for basic structural information. Three-dimensional structures for these enzymes were predicted. The physico-chemical properties such as pI, AI, GRAVY and instability index were also studied. The current information will provide a platform to know the structural attributes responsible for the function of these proteins until experimental structures become available.

  7. Accurate SHAPE-directed RNA secondary structure modeling, including pseudoknots.

    PubMed

    Hajdin, Christine E; Bellaousov, Stanislav; Huggins, Wayne; Leonard, Christopher W; Mathews, David H; Weeks, Kevin M

    2013-04-02

    A pseudoknot forms in an RNA when nucleotides in a loop pair with a region outside the helices that close the loop. Pseudoknots occur relatively rarely in RNA but are highly overrepresented in functionally critical motifs in large catalytic RNAs, in riboswitches, and in regulatory elements of viruses. Pseudoknots are usually excluded from RNA structure prediction algorithms. When included, these pairings are difficult to model accurately, especially in large RNAs, because allowing this structure dramatically increases the number of possible incorrect folds and because it is difficult to search the fold space for an optimal structure. We have developed a concise secondary structure modeling approach that combines SHAPE (selective 2'-hydroxyl acylation analyzed by primer extension) experimental chemical probing information and a simple, but robust, energy model for the entropic cost of single pseudoknot formation. Structures are predicted with iterative refinement, using a dynamic programming algorithm. This melded experimental and thermodynamic energy function predicted the secondary structures and the pseudoknots for a set of 21 challenging RNAs of known structure ranging in size from 34 to 530 nt. On average, 93% of known base pairs were predicted, and all pseudoknots in well-folded RNAs were identified.

  8. Predicting turns in proteins with a unified model.

    PubMed

    Song, Qi; Li, Tonghua; Cong, Peisheng; Sun, Jiangming; Li, Dapeng; Tang, Shengnan

    2012-01-01

    Turns are a critical element of the structure of a protein; turns play a crucial role in loops, folds, and interactions. Current prediction methods are well developed for the prediction of individual turn types, including α-turn, β-turn, and γ-turn, etc. However, for further protein structure and function prediction it is necessary to develop a uniform model that can accurately predict all types of turns simultaneously. In this study, we present a novel approach, TurnP, which offers the ability to investigate all the turns in a protein based on a unified model. The main characteristics of TurnP are: (i) using newly exploited features of structural evolution information (secondary structure and shape string of protein) based on structure homologies, (ii) considering all types of turns in a unified model, and (iii) practical capability of accurate prediction of all turns simultaneously for a query. TurnP utilizes predicted secondary structures and predicted shape strings, both of which have greater accuracy, based on innovative technologies which were both developed by our group. Then, sequence and structural evolution features, which are the profile of sequence, the profile of secondary structures and the profile of shape strings, are generated by sequence and structure alignment. When TurnP was validated on a non-redundant dataset (4,107 entries) by five-fold cross-validation, we achieved an accuracy of 88.8% and a sensitivity of 71.8%, which exceeded those of most state-of-the-art predictors of individual turn types. Newly determined sequences and the EVA and CASP9 datasets were used as independent tests, and the results we achieved were outstanding for turn predictions and confirmed the good performance of TurnP for practical applications.

  9. Predicting Turns in Proteins with a Unified Model

    PubMed Central

    Song, Qi; Li, Tonghua; Cong, Peisheng; Sun, Jiangming; Li, Dapeng; Tang, Shengnan

    2012-01-01

    Motivation Turns are a critical element of the structure of a protein; turns play a crucial role in loops, folds, and interactions. Current prediction methods are well developed for the prediction of individual turn types, including α-turn, β-turn, and γ-turn, etc. However, for further protein structure and function prediction it is necessary to develop a uniform model that can accurately predict all types of turns simultaneously. Results In this study, we present a novel approach, TurnP, which offers the ability to investigate all the turns in a protein based on a unified model. The main characteristics of TurnP are: (i) using newly exploited features of structural evolution information (secondary structure and shape string of protein) based on structure homologies, (ii) considering all types of turns in a unified model, and (iii) practical capability of accurate prediction of all turns simultaneously for a query. TurnP utilizes predicted secondary structures and predicted shape strings, both of which have greater accuracy, based on innovative technologies which were both developed by our group. Then, sequence and structural evolution features, which are the profile of sequence, the profile of secondary structures and the profile of shape strings, are generated by sequence and structure alignment. When TurnP was validated on a non-redundant dataset (4,107 entries) by five-fold cross-validation, we achieved an accuracy of 88.8% and a sensitivity of 71.8%, which exceeded those of most state-of-the-art predictors of individual turn types. Newly determined sequences and the EVA and CASP9 datasets were used as independent tests, and the results we achieved were outstanding for turn predictions and confirmed the good performance of TurnP for practical applications. PMID:23144872

  10. Predicted secondary structure similarity in the absence of primary amino acid sequence homology: hepatitis B virus open reading frames.

    PubMed Central

    Schaeffer, E; Sninsky, J J

    1984-01-01

    Proteins that are related evolutionarily may have diverged at the level of primary amino acid sequence while maintaining similar secondary structures. Computer analysis has been used to compare the open reading frames of the hepatitis B virus to those of the woodchuck hepatitis virus at the level of amino acid sequence, and to predict the relative hydrophilic character and the secondary structure of putative polypeptides. Similarity is seen at the levels of relative hydrophilicity and secondary structure, in the absence of sequence homology. These data reinforce the proposal that these open reading frames encode viral proteins. Computer analysis of this type can be more generally used to establish structural similarities between proteins that do not share obvious sequence homology as well as to assess whether an open reading frame is fortuitous or codes for a protein. PMID:6585835

  11. Accurate secondary structure prediction and fold recognition for circular dichroism spectroscopy

    PubMed Central

    Micsonai, András; Wien, Frank; Kernya, Linda; Lee, Young-Ho; Goto, Yuji; Réfrégiers, Matthieu; Kardos, József

    2015-01-01

    Circular dichroism (CD) spectroscopy is a widely used technique for the study of protein structure. Numerous algorithms have been developed for the estimation of the secondary structure composition from the CD spectra. These methods often fail to provide acceptable results on α/β-mixed or β-structure–rich proteins. The problem arises from the spectral diversity of β-structures, which has hitherto been considered as an intrinsic limitation of the technique. The predictions are less reliable for proteins of unusual β-structures such as membrane proteins, protein aggregates, and amyloid fibrils. Here, we show that the parallel/antiparallel orientation and the twisting of the β-sheets account for the observed spectral diversity. We have developed a method called β-structure selection (BeStSel) for the secondary structure estimation that takes into account the twist of β-structures. This method can reliably distinguish parallel and antiparallel β-sheets and accurately estimates the secondary structure for a broad range of proteins. Moreover, the secondary structure components applied by the method are characteristic to the protein fold, and thus the fold can be predicted to the level of topology in the CATH classification from a single CD spectrum. By constructing a web server, we offer a general tool for a quick and reliable structure analysis using conventional CD or synchrotron radiation CD (SRCD) spectroscopy for the protein science research community. The method is especially useful when X-ray or NMR techniques fail. Using BeStSel on data collected by SRCD spectroscopy, we investigated the structure of amyloid fibrils of various disease-related proteins and peptides. PMID:26038575

  12. NIAS-Server: Neighbors Influence of Amino acids and Secondary Structures in Proteins.

    PubMed

    Borguesan, Bruno; Inostroza-Ponta, Mario; Dorn, Márcio

    2017-03-01

    The exponential growth in the number of experimentally determined three-dimensional protein structures provides new and relevant knowledge about the conformation of amino acids in proteins. Only a few probability densities of amino acids are publicly available for use in structure validation and prediction methods. NIAS (Neighbors Influence of Amino acids and Secondary structures) is a web-based tool used to extract information about conformational preferences of amino acid residues and secondary structures in experimentally determined protein templates. This information is useful, for example, to characterize folds and local motifs in proteins and molecular folding, and can help solve complex problems such as protein structure prediction and protein design, among others. The NIAS-Server and supplementary data are available at http://sbcb.inf.ufrgs.br/nias.

  13. SVM-PB-Pred: SVM based protein block prediction method using sequence profiles and secondary structures.

    PubMed

    Suresh, V; Parthasarathy, S

    2014-01-01

    We developed a support vector machine based web server called SVM-PB-Pred, to predict the Protein Block for any given amino acid sequence. The input features of SVM-PB-Pred include i) sequence profiles (PSSM) and ii) actual secondary structures (SS) from DSSP method or predicted secondary structures from NPS@ and GOR4 methods. There were three combined input features PSSM+SS(DSSP), PSSM+SS(NPS@) and PSSM+SS(GOR4) used to test and train the SVM models. Similarly, four datasets RS90, DB433, LI1264 and SP1577 were used to develop the SVM models. These four SVM models developed were tested using three different benchmarking tests namely; (i) self consistency, (ii) seven fold cross validation test and (iii) independent case test. The maximum possible prediction accuracy of ~70% was observed in self consistency test for the SVM models of both LI1264 and SP1577 datasets, where PSSM+SS(DSSP) input features was used to test. The prediction accuracies were reduced to ~53% for PSSM+SS(NPS@) and ~43% for PSSM+SS(GOR4) in independent case test, for the SVM models of above two same datasets. Using our method, it is possible to predict the protein block letters for any query protein sequence with ~53% accuracy, when the SP1577 dataset and predicted secondary structure from NPS@ server were used. The SVM-PB-Pred server can be freely accessed through http://bioinfo.bdu.ac.in/~svmpbpred.

  14. Predicting beta-turns in proteins using support vector machines with fractional polynomials

    PubMed Central

    2013-01-01

    Background β-turns are a secondary structure type that has an essential role in molecular recognition, protein folding, and stability. They are found to be the most common type of non-repetitive structure, since 25% of amino acids in protein structures are situated on them. Their prediction is considered to be one of the crucial problems in bioinformatics and molecular biology, and it can provide valuable insights and inputs for fold recognition and drug design. Results We propose an approach that combines support vector machines (SVMs) and logistic regression (LR) in a hybrid prediction method, which we call H-SVM-LR, to predict β-turns in proteins. Fractional polynomials are used for LR modeling. We utilize position specific scoring matrices (PSSMs) and predicted secondary structure (PSS) as features. Our simulation studies show that H-SVM-LR achieves Qtotal of 82.87%, 82.84%, and 82.32% on the BT426, BT547, and BT823 datasets respectively. These values are the highest among β-turn prediction methods that are based on PSSMs and secondary structure information. H-SVM-LR also achieves favorable performance in predicting β-turns as measured by the Matthews correlation coefficient (MCC) on these datasets. Furthermore, H-SVM-LR shows good performance when considering shape strings as additional features. Conclusions In this paper, we present a comprehensive approach for β-turn prediction. Experiments show that our proposed approach achieves better performance compared to other competing prediction methods. PMID:24565438

  15. Predicting beta-turns in proteins using support vector machines with fractional polynomials.

    PubMed

    Elbashir, Murtada; Wang, Jianxin; Wu, Fang-Xiang; Wang, Lusheng

    2013-11-07

    β-turns are a secondary structure type that has an essential role in molecular recognition, protein folding, and stability. They are found to be the most common type of non-repetitive structure, since 25% of amino acids in protein structures are situated on them. Their prediction is considered to be one of the crucial problems in bioinformatics and molecular biology, and it can provide valuable insights and inputs for fold recognition and drug design. We propose an approach that combines support vector machines (SVMs) and logistic regression (LR) in a hybrid prediction method, which we call H-SVM-LR, to predict β-turns in proteins. Fractional polynomials are used for LR modeling. We utilize position specific scoring matrices (PSSMs) and predicted secondary structure (PSS) as features. Our simulation studies show that H-SVM-LR achieves Qtotal of 82.87%, 82.84%, and 82.32% on the BT426, BT547, and BT823 datasets respectively. These values are the highest among β-turn prediction methods that are based on PSSMs and secondary structure information. H-SVM-LR also achieves favorable performance in predicting β-turns as measured by the Matthews correlation coefficient (MCC) on these datasets. Furthermore, H-SVM-LR shows good performance when considering shape strings as additional features. In this paper, we present a comprehensive approach for β-turn prediction. Experiments show that our proposed approach achieves better performance compared to other competing prediction methods.

  16. CNNH_PSS: protein 8-class secondary structure prediction by convolutional neural network with highway.

    PubMed

    Zhou, Jiyun; Wang, Hongpeng; Zhao, Zhishan; Xu, Ruifeng; Lu, Qin

    2018-05-08

    Protein secondary structure is the three dimensional form of local segments of proteins and its prediction is an important problem in protein tertiary structure prediction. Developing computational approaches for protein secondary structure prediction is becoming increasingly urgent. We present a novel deep learning based model, referred to as CNNH_PSS, by using multi-scale CNN with highway. In CNNH_PSS, any two neighboring convolutional layers have a highway to deliver information from the current layer to the output of the next one to keep local contexts. As lower layers extract local context while higher layers extract long-range interdependencies, the highways between neighboring layers allow CNNH_PSS to extract both local contexts and long-range interdependencies. We evaluate CNNH_PSS on two commonly used datasets: CB6133 and CB513. CNNH_PSS outperforms the multi-scale CNN without highway by at least 0.010 Q8 accuracy and also performs better than CNF, DeepCNF and SSpro8, which cannot extract long-range interdependencies, by at least 0.020 Q8 accuracy, demonstrating that both local contexts and long-range interdependencies are indeed useful for prediction. Furthermore, CNNH_PSS also performs better than GSM and DCRNN, which need an extra complex model to extract long-range interdependencies. This demonstrates that CNNH_PSS not only costs fewer computing resources, but also achieves better prediction performance. CNNH_PSS is able to extract both local contexts and long-range interdependencies by combining multi-scale CNN and highway network. The evaluations on common datasets and comparisons with state-of-the-art methods indicate that CNNH_PSS is a useful and efficient tool for protein secondary structure prediction.
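
    The abstract does not give the exact form of the highway used in CNNH_PSS. As an illustration only, the sketch below shows the gating rule of a general highway connection, y = t * h + (1 - t) * x, where x is the current layer's output, h is the next layer's transformed output, and t is a sigmoid gate. All names and numbers are hypothetical, and a real model would apply this to learned tensors rather than plain lists.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function mapping a gate logit to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def highway_combine(x, h, gate_logits):
    """Element-wise highway mix: y_i = t_i * h_i + (1 - t_i) * x_i."""
    if not (len(x) == len(h) == len(gate_logits)):
        raise ValueError("x, h and gate_logits must have the same length")
    out = []
    for xi, hi, gi in zip(x, h, gate_logits):
        t = sigmoid(gi)  # t near 1 passes the transformed value, near 0 carries x
        out.append(t * hi + (1.0 - t) * xi)
    return out


# With a gate logit of 0 the gate is 0.5, so the output is the midpoint.
mixed = highway_combine([1.0, 2.0], [3.0, 4.0], [0.0, 0.0])
```

    The carry path (1 - t) * x is what lets local context from a lower layer reach higher layers unchanged.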

  17. Validation of Molecular Dynamics Simulations for Prediction of Three-Dimensional Structures of Small Proteins.

    PubMed

    Kato, Koichi; Nakayoshi, Tomoki; Fukuyoshi, Shuichi; Kurimoto, Eiji; Oda, Akifumi

    2017-10-12

    Although various higher-order protein structure prediction methods have been developed, almost all of them were developed based on the three-dimensional (3D) structure information of known proteins. Here we predicted short protein structures by molecular dynamics (MD) simulations in which only Newton's equations of motion were used and 3D structural information of known proteins was not required. To evaluate the ability of MD simulation to predict protein structures, we simulated seven short test proteins (10-46 residues) starting from the denatured state and compared their predicted and experimental structures. The predicted structure for Trp-cage (20 residues) was close to the experimental structure after a 200-ns MD simulation. For proteins shorter or longer than Trp-cage, root-mean-square deviation values were larger than those for Trp-cage. However, secondary structures could be reproduced by MD simulations for proteins with 10-34 residues. Simulations by replica exchange MD were performed, but the results were similar to those from normal MD simulations. These results suggest that normal MD simulations can roughly predict short protein structures and that 200-ns simulations are frequently sufficient for estimating the secondary structures of proteins (approximately 20 residues). Structural prediction methods using only fundamental physical laws are useful for investigating non-natural proteins, such as primitive proteins and artificial proteins for peptide-based drug delivery systems.

  18. Secondary structure of the 3'-noncoding region of flavivirus genomes: comparative analysis of base pairing probabilities.

    PubMed

    Rauscher, S; Flamm, C; Mandl, C W; Heinz, F X; Stadler, P F

    1997-07-01

    The prediction of the complete matrix of base pairing probabilities was applied to the 3' noncoding region (NCR) of flavivirus genomes. This approach identifies not only well-defined secondary structure elements, but also regions of high structural flexibility. Flaviviruses, many of which are important human pathogens, have a common genomic organization, but exhibit a significant degree of RNA sequence diversity in the functionally important 3'-NCR. We demonstrate the presence of secondary structures shared by all flaviviruses, as well as structural features that are characteristic for groups of viruses within the genus reflecting the established classification scheme. The significance of most of the predicted structures is corroborated by compensatory mutations. The availability of infectious clones for several flaviviruses will allow the assessment of these structural elements in processes of the viral life cycle, such as replication and assembly.

  19. Finding the target sites of RNA-binding proteins

    PubMed Central

    Li, Xiao; Kazan, Hilal; Lipshitz, Howard D; Morris, Quaid D

    2014-01-01

    RNA–protein interactions differ from DNA–protein interactions because of the central role of RNA secondary structure. Some RNA-binding domains (RBDs) recognize their target sites mainly by their shape and geometry and others are sequence-specific but are sensitive to secondary structure context. A number of small- and large-scale experimental approaches have been developed to measure RNAs associated in vitro and in vivo with RNA-binding proteins (RBPs). Generalizing outside of the experimental conditions tested by these assays requires computational motif finding. Often RBP motif finding is done by adapting DNA motif finding methods; but modeling secondary structure context leads to better recovery of RBP-binding preferences. Genome-wide assessment of mRNA secondary structure has recently become possible, but these data must be combined with computational predictions of secondary structure before they add value in predicting in vivo binding. There are two main approaches to incorporating structural information into motif models: supplementing primary sequence motif models with preferred secondary structure contexts (e.g., MEMERIS and RNAcontext) and directly modeling secondary structure recognized by the RBP using stochastic context-free grammars (e.g., CMfinder and RNApromo). The former better reconstruct known binding preferences for sequence-specific RBPs but are not suitable for modeling RBPs that recognize shape and geometry of RNAs. Future work in RBP motif finding should incorporate interactions between multiple RBDs and multiple RBPs in binding to RNA. WIREs RNA 2014, 5:111–130. doi: 10.1002/wrna.1201 PMID:24217996

  20. Structural protein descriptors in 1-dimension and their sequence-based predictions.

    PubMed

    Kurgan, Lukasz; Disfani, Fatemeh Miri

    2011-09-01

    The last few decades observed an increasing interest in development and application of 1-dimensional (1D) descriptors of protein structure. These descriptors project 3D structural features onto 1D strings of residue-wise structural assignments. They cover a wide-range of structural aspects including conformation of the backbone, burying depth/solvent exposure and flexibility of residues, and inter-chain residue-residue contacts. We perform first-of-its-kind comprehensive comparative review of the existing 1D structural descriptors. We define, review and categorize ten structural descriptors and we also describe, summarize and contrast over eighty computational models that are used to predict these descriptors from the protein sequences. We show that the majority of the recent sequence-based predictors utilize machine learning models, with the most popular being neural networks, support vector machines, hidden Markov models, and support vector and linear regressions. These methods provide high-throughput predictions and most of them are accessible to a non-expert user via web servers and/or stand-alone software packages. We empirically evaluate several recent sequence-based predictors of secondary structure, disorder, and solvent accessibility descriptors using a benchmark set based on CASP8 targets. Our analysis shows that the secondary structure can be predicted with over 80% accuracy and segment overlap (SOV), disorder with over 0.9 AUC, 0.6 Matthews Correlation Coefficient (MCC), and 75% SOV, and relative solvent accessibility with PCC of 0.7 and MCC of 0.6 (0.86 when homology is used). We demonstrate that the secondary structure predicted from sequence without the use of homology modeling is as good as the structure extracted from the 3D folds predicted by top-performing template-based methods.

  1. On the importance of cotranscriptional RNA structure formation

    PubMed Central

    Lai, Daniel; Proctor, Jeff R.; Meyer, Irmtraud M.

    2013-01-01

    The expression of genes, both coding and noncoding, can be significantly influenced by RNA structural features of their corresponding transcripts. There is by now mounting experimental and some theoretical evidence that structure formation in vivo starts during transcription and that this cotranscriptional folding determines the functional RNA structural features that are being formed. Several decades of research in bioinformatics have resulted in a wide range of computational methods for predicting RNA secondary structures. Almost all state-of-the-art methods in terms of prediction accuracy, however, completely ignore the process of structure formation and focus exclusively on the final RNA structure. This review hopes to bridge this gap. We summarize the existing evidence for cotranscriptional folding and then review the different, currently used strategies for RNA secondary-structure prediction. Finally, we propose a range of ideas on how state-of-the-art methods could be potentially improved by explicitly capturing the process of cotranscriptional structure formation. PMID:24131802

  2. Parallel protein secondary structure prediction based on neural networks.

    PubMed

    Zhong, Wei; Altun, Gulsah; Tian, Xinmin; Harrison, Robert; Tai, Phang C; Pan, Yi

    2004-01-01

    Protein secondary structure prediction has a fundamental influence on today's bioinformatics research. In this work, binary and tertiary classifiers of protein secondary structure prediction are implemented on Denoeux belief neural network (DBNN) architecture. Hydrophobicity matrix, orthogonal matrix, BLOSUM62 and PSSM (position specific scoring matrix) are experimented separately as the encoding schemes for DBNN. The experimental results contribute to the design of new encoding schemes. The new binary classifier for Helix versus not Helix (~H) for DBNN produces prediction accuracy of 87% when PSSM is used for the input profile. The performance of the DBNN binary classifier is comparable to other best prediction methods. The good test results for binary classifiers open a new approach for protein structure prediction with neural networks. Due to the time consuming task of training the neural networks, Pthread and OpenMP are employed to parallelize DBNN in the hyperthreading enabled Intel architecture. Speedup for 16 Pthreads is 4.9 and speedup for 16 OpenMP threads is 4 in the 4 processors shared memory architecture. The speedup performance of both OpenMP and Pthread is superior to that of other research. With the new parallel training algorithm, thousands of amino acids can be processed in a reasonable amount of time. Our research also shows that hyperthreading technology for Intel architecture is efficient for parallel biological algorithms.

  3. Correlations of nucleotide substitution rates and base composition of mammalian coding sequences with protein structure.

    PubMed

    Chiusano, M L; D'Onofrio, G; Alvarez-Valin, F; Jabbari, K; Colonna, G; Bernardi, G

    1999-09-30

    We investigated the relationships between the nucleotide substitution rates and the predicted secondary structures in the three states representation (alpha-helix, beta-sheet, and coil). The analysis was carried out on 34 alignments, each of which comprised sequences belonging to at least four different mammalian orders. The rates of synonymous substitution were found to be significantly different in regions predicted to be alpha-helix, beta-sheet, or coil. Likewise, the nonsynonymous rates also differ, although expectedly at a lower extent, in the three types of secondary structure, suggesting that different selective constraints associated with the different structures are affecting in a similar way the synonymous and nonsynonymous rates. Moreover, the base composition of the third codon positions is different in coding sequence regions corresponding to different secondary structures of proteins.

  4. RNA design using simulated SHAPE data.

    PubMed

    Lotfi, Mohadeseh; Zare-Mirakabad, Fatemeh; Montaseri, Soheila

    2018-05-03

    It has long been established that in addition to being involved in protein translation, RNA plays essential roles in numerous other cellular processes, including gene regulation and DNA replication. Such roles are known to be dictated by higher-order structures of RNA molecules. It is therefore of prime importance to find an RNA sequence that can fold to acquire a particular function that is desirable for use in pharmaceuticals and basic research. The challenge of finding an RNA sequence for a given structure is known as the RNA design problem. Although there are several algorithms to solve this problem, they mainly consider hard constraints, such as minimum free energy, to evaluate the predicted sequences. Recently, SHAPE data has emerged as a new soft constraint for RNA secondary structure prediction. To take advantage of this new experimental constraint, we report here a new method for accurate design of RNA sequences based on their secondary structures using SHAPE data as pseudo-free energy. We then compare our algorithm with four others: INFO-RNA, ERD, MODENA and RNAifold 2.0. Our algorithm precisely predicts 26 out of 29 new sequences for the structures extracted from the Rfam dataset, while the other four algorithms predict no more than 22 out of 29. The proposed algorithm is comparable to the above algorithms on RNA-SSD datasets, where they can predict up to 33 appropriate sequences for RNA secondary structures out of 34.

  5. RDNAnalyzer: A tool for DNA secondary structure prediction and sequence analysis.

    PubMed

    Afzal, Muhammad; Shahid, Ahmad Ali; Shehzadi, Abida; Nadeem, Shahid; Husnain, Tayyab

    2012-01-01

    RDNAnalyzer is an innovative computer based tool designed for DNA secondary structure prediction and sequence analysis. It can randomly generate the DNA sequence or users can upload the sequences of their own interest in RAW format. It uses and extends the Nussinov dynamic programming algorithm and has various applications for sequence analysis. It predicts the DNA secondary structure and base pairings. It also provides tools for sequence analysis routinely performed by biological scientists, such as DNA replication, reverse complement generation, transcription, translation, sequence specific information such as the total number of nucleotide bases and ATGC base contents along with their respective percentages, and a sequence cleaner. RDNAnalyzer is a unique tool developed in Microsoft Visual Studio 2008 using Microsoft Visual C# and Windows Presentation Foundation and provides a user friendly environment for sequence analysis. It is freely available. http://www.cemb.edu.pk/sw.html RDNAnalyzer - Random DNA Analyser, GUI - Graphical user interface, XAML - Extensible Application Markup Language.
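
    The base Nussinov recursion referred to in the abstract above can be sketched as follows. This is a minimal illustration of the textbook algorithm, which maximizes the number of nested base pairs; it is not RDNAnalyzer's extended version, whose details the abstract does not give. The function name, the Watson-Crick-only DNA pairing table and the `min_loop` parameter are assumptions made here.

```python
# Watson-Crick pairs for DNA (assumption: no non-canonical pairs allowed).
DNA_PAIRS = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}


def nussinov_max_pairs(seq: str, min_loop: int = 3) -> int:
    """Maximum number of nested base pairs in seq (Nussinov recursion).

    min_loop is the minimum number of unpaired bases enclosed by a pair.
    """
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0
    # dp[i][j] = best pair count for the subsequence seq[i..j]; 0 if i >= j.
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: position j stays unpaired
            # case 2: position j pairs with some k, splitting the interval
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in DNA_PAIRS:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]


if __name__ == "__main__":
    print(nussinov_max_pairs("GGGAAACCC"))
```

    The table fill is cubic in sequence length; a traceback over `dp` would be needed to recover the actual pairings rather than only their count.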

  6. Improved Model for Predicting the Free Energy Contribution of Dinucleotide Bulges to RNA Duplex Stability.

    PubMed

    Tomcho, Jeremy C; Tillman, Magdalena R; Znosko, Brent M

    2015-09-01

    Predicting the secondary structure of RNA is an intermediate in predicting RNA three-dimensional structure. Commonly, determining RNA secondary structure from sequence uses free energy minimization and nearest neighbor parameters. Current algorithms utilize a sequence-independent model to predict free energy contributions of dinucleotide bulges. To determine if a sequence-dependent model would be more accurate, short RNA duplexes containing dinucleotide bulges with different sequences and nearest neighbor combinations were optically melted to derive thermodynamic parameters. These data suggested energy contributions of dinucleotide bulges were sequence-dependent, and a sequence-dependent model was derived. This model assigns free energy penalties based on the identity of nucleotides in the bulge (3.06 kcal/mol for two purines, 2.93 kcal/mol for two pyrimidines, 2.71 kcal/mol for 5'-purine-pyrimidine-3', and 2.41 kcal/mol for 5'-pyrimidine-purine-3'). The predictive model also includes a 0.45 kcal/mol penalty for an A-U pair adjacent to the bulge and a -0.28 kcal/mol bonus for a G-U pair adjacent to the bulge. The new sequence-dependent model results in predicted values within, on average, 0.17 kcal/mol of experimental values, a significant improvement over the sequence-independent model. This model and new experimental values can be incorporated into algorithms that predict RNA stability and secondary structure from sequence.
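
    The sequence-dependent penalties quoted in the abstract above can be combined as in the sketch below. The numerical values are taken from the abstract; everything else is an assumption made for illustration: the function and constant names, reading the bulge 5' to 3', and applying the A-U and G-U terms once for each adjacent pair passed in. The paper itself should be consulted for how the adjacent-pair terms are counted.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "U"}

# Penalties in kcal/mol, keyed by (5' class, 3' class); R = purine, Y = pyrimidine.
BULGE_PENALTY = {
    ("R", "R"): 3.06,
    ("Y", "Y"): 2.93,
    ("R", "Y"): 2.71,
    ("Y", "R"): 2.41,
}
AU_ADJACENT_PENALTY = 0.45   # A-U pair adjacent to the bulge
GU_ADJACENT_BONUS = -0.28    # G-U pair adjacent to the bulge


def _base_class(nt: str) -> str:
    if nt in PURINES:
        return "R"
    if nt in PYRIMIDINES:
        return "Y"
    raise ValueError(f"not an RNA nucleotide: {nt!r}")


def dinucleotide_bulge_energy(bulge: str, adjacent_pairs=()) -> float:
    """Free energy contribution (kcal/mol) of a two-nucleotide bulge.

    bulge: the two bulged nucleotides, written 5' to 3' (e.g. "AG").
    adjacent_pairs: hypothetical input listing base pairs next to the
    bulge as two-letter strings (e.g. ["AU", "GC"]).
    """
    bulge = bulge.upper()
    if len(bulge) != 2:
        raise ValueError("a dinucleotide bulge has exactly two nucleotides")
    energy = BULGE_PENALTY[(_base_class(bulge[0]), _base_class(bulge[1]))]
    for pair in adjacent_pairs:
        members = frozenset(pair.upper())
        if members == frozenset("AU"):
            energy += AU_ADJACENT_PENALTY
        elif members == frozenset("GU"):
            energy += GU_ADJACENT_BONUS
    return round(energy, 2)


if __name__ == "__main__":
    print(dinucleotide_bulge_energy("AG", ["AU"]))
```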

  7. An object programming based environment for protein secondary structure prediction.

    PubMed

    Giacomini, M; Ruggiero, C; Sacile, R

    1996-01-01

    The most frequently used methods for protein secondary structure prediction are empirical statistical methods and rule based methods. A consensus system based on object-oriented programming is presented, which integrates the two approaches with the aim of improving the prediction quality. This system uses an object-oriented knowledge representation based on the concepts of conformation, residue and protein, where the conformation class is the basis, the residue class derives from it and the protein class derives from the residue class. The system has been tested with satisfactory results on several proteins of the Brookhaven Protein Data Bank. Its results have been compared with the results of the most widely used prediction methods, and they show a higher prediction capability and greater stability. Moreover, the system itself provides an index of the reliability of its current prediction. This system can also be regarded as a basis structure for programs of this kind.

  8. VITAL NMR: Using Chemical Shift Derived Secondary Structure Information for a Limited Set of Amino Acids to Assess Homology Model Accuracy

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Brothers, Michael C; Nesbitt, Anna E; Hallock, Michael J

    2011-01-01

    Homology modeling is a powerful tool for predicting protein structures, whose success depends on obtaining a reasonable alignment between a given structural template and the protein sequence being analyzed. In order to leverage greater predictive power for proteins with few structural templates, we have developed a method to rank homology models based upon their compliance to secondary structure derived from experimental solid-state NMR (SSNMR) data. Such data is obtainable in a rapid manner by simple SSNMR experiments (e.g., (13)C-(13)C 2D correlation spectra). To test our homology model scoring procedure for various amino acid labeling schemes, we generated a library of 7,474 homology models for 22 protein targets culled from the TALOS+/SPARTA+ training set of protein structures. Using subsets of amino acids that are plausibly assigned by SSNMR, we discovered that pairs of the residues Val, Ile, Thr, Ala and Leu (VITAL) emulate an ideal dataset where all residues are site specifically assigned. Scoring the models with a predicted VITAL site-specific dataset and calculating secondary structure with the Chemical Shift Index resulted in a Pearson correlation coefficient (-0.75) commensurate to the control (-0.77), where secondary structure was scored site specifically for all amino acids (ALL 20) using STRIDE. This method promises to accelerate structure procurement by SSNMR for proteins with unknown folds through guiding the selection of remotely homologous protein templates and assessing model quality.

  9. Secondary structure prediction and structure-specific sequence analysis of single-stranded DNA.

    PubMed

    Dong, F; Allawi, H T; Anderson, T; Neri, B P; Lyamichev, V I

    2001-08-01

    DNA sequence analysis by oligonucleotide binding is often affected by interference with the secondary structure of the target DNA. Here we describe an approach that improves DNA secondary structure prediction by combining enzymatic probing of DNA by structure-specific 5'-nucleases with an energy minimization algorithm that utilizes the 5'-nuclease cleavage sites as constraints. The method can identify structural differences between two DNA molecules caused by minor sequence variations such as a single nucleotide mutation. It also demonstrates the existence of long-range interactions between DNA regions separated by >300 nt and the formation of multiple alternative structures by a 244 nt DNA molecule. The differences in the secondary structure of DNA molecules revealed by 5'-nuclease probing were used to design structure-specific probes for mutation discrimination that target the regions of structural, rather than sequence, differences. We also demonstrate the performance of structure-specific 'bridge' probes complementary to non-contiguous regions of the target molecule. The structure-specific probes do not require the high stringency binding conditions necessary for methods based on mismatch formation and permit mutation detection at temperatures from 4 to 37 degrees C. Structure-specific sequence analysis is applied for mutation detection in the Mycobacterium tuberculosis katG gene and for genotyping of the hepatitis C virus.

  10. COOLAIR Antisense RNAs Form Evolutionarily Conserved Elaborate Secondary Structures

    DOE PAGES

    Hawkes, Emily J.; Hennelly, Scott P.; Novikova, Irina V.; ...

    2016-09-20

    There is considerable debate about the functionality of long non-coding RNAs (lncRNAs). Lack of sequence conservation has been used to argue against functional relevance. Here, we investigated antisense lncRNAs, called COOLAIR, at the A. thaliana FLC locus and experimentally determined their secondary structure. The major COOLAIR variants are highly structured, organized by exon. The distally polyadenylated transcript has a complex multi-domain structure, altered by a single non-coding SNP defining a functionally distinct A. thaliana FLC haplotype. The A. thaliana COOLAIR secondary structure was used to predict COOLAIR exons in evolutionarily divergent Brassicaceae species. These predictions were validated through chemical probing and cloning. Despite the relatively low nucleotide sequence identity, the structures, including multi-helix junctions, show remarkable evolutionary conservation. In a number of places, the structure is conserved through covariation of a non-contiguous DNA sequence. This structural conservation supports a functional role for COOLAIR transcripts rather than, or in addition to, antisense transcription.

  11. COOLAIR Antisense RNAs Form Evolutionarily Conserved Elaborate Secondary Structures

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hawkes, Emily J.; Hennelly, Scott P.; Novikova, Irina V.

    There is considerable debate about the functionality of long non-coding RNAs (lncRNAs). Lack of sequence conservation has been used to argue against functional relevance. Here, we investigated antisense lncRNAs, called COOLAIR, at the A. thaliana FLC locus and experimentally determined their secondary structure. The major COOLAIR variants are highly structured, organized by exon. The distally polyadenylated transcript has a complex multi-domain structure, altered by a single non-coding SNP defining a functionally distinct A. thaliana FLC haplotype. The A. thaliana COOLAIR secondary structure was used to predict COOLAIR exons in evolutionarily divergent Brassicaceae species. These predictions were validated through chemical probing and cloning. Despite the relatively low nucleotide sequence identity, the structures, including multi-helix junctions, show remarkable evolutionary conservation. In a number of places, the structure is conserved through covariation of a non-contiguous DNA sequence. This structural conservation supports a functional role for COOLAIR transcripts rather than, or in addition to, antisense transcription.

  12. Data-directed RNA secondary structure prediction using probabilistic modeling

    PubMed Central

    Deng, Fei; Ledda, Mirko; Vaziri, Sana; Aviran, Sharon

    2016-01-01

    Structure dictates the function of many RNAs, but secondary RNA structure analysis is either labor intensive and costly or relies on computational predictions that are often inaccurate. These limitations are alleviated by integration of structure probing data into prediction algorithms. However, existing algorithms are optimized for a specific type of probing data. Recently, new chemistries combined with advances in sequencing have facilitated structure probing at unprecedented scale and sensitivity. These novel technologies and anticipated wealth of data highlight a need for algorithms that readily accommodate more complex and diverse input sources. We implemented and investigated a recently outlined probabilistic framework for RNA secondary structure prediction and extended it to accommodate further refinement of structural information. This framework utilizes direct likelihood-based calculations of pseudo-energy terms per considered structural context and can readily accommodate diverse data types and complex data dependencies. We use real data in conjunction with simulations to evaluate performances of several implementations and to show that proper integration of structural contexts can lead to improvements. Our tests also reveal discrepancies between real data and simulations, which we show can be alleviated by refined modeling. We then propose statistical preprocessing approaches to standardize data interpretation and integration into such a generic framework. We further systematically quantify the information content of data subsets, demonstrating that high reactivities are major drivers of SHAPE-directed predictions and that better understanding of less informative reactivities is key to further improvements. Finally, we provide evidence for the adaptive capability of our framework using mock probe simulations. PMID:27251549

  13. Exact calculation of loop formation probability identifies folding motifs in RNA secondary structures

    PubMed Central

    Sloma, Michael F.; Mathews, David H.

    2016-01-01

    RNA secondary structure prediction is widely used to analyze RNA sequences. In an RNA partition function calculation, free energy nearest neighbor parameters are used in a dynamic programming algorithm to estimate statistical properties of the secondary structure ensemble. Previously, partition functions have largely been used to estimate the probability that a given pair of nucleotides form a base pair, the conditional stacking probability, the accessibility to binding of a continuous stretch of nucleotides, or a representative sample of RNA structures. Here it is demonstrated that an RNA partition function can also be used to calculate the exact probability of formation of hairpin loops, internal loops, bulge loops, or multibranch loops at a given position. This calculation can also be used to estimate the probability of formation of specific helices. Benchmarking on a set of RNA sequences with known secondary structures indicated that loops that were calculated to be more probable were more likely to be present in the known structure than less probable loops. Furthermore, highly probable loops are more likely to be in the known structure than the set of loops predicted in the lowest free energy structures. PMID:27852924

  14. CPU-GPU hybrid accelerating the Zuker algorithm for RNA secondary structure prediction applications.

    PubMed

    Lei, Guoqing; Dou, Yong; Wan, Wen; Xia, Fei; Li, Rongchun; Ma, Meng; Zou, Dan

    2012-01-01

    Prediction of ribonucleic acid (RNA) secondary structure remains one of the most important research areas in bioinformatics. The Zuker algorithm is one of the most popular methods of free energy minimization for RNA secondary structure prediction. Thus far, few studies have been reported on the acceleration of the Zuker algorithm on general-purpose processors or on extra accelerators such as Field Programmable Gate-Array (FPGA) and Graphics Processing Units (GPU). To the best of our knowledge, no implementation combines both CPU and extra accelerators, such as GPUs, to accelerate the Zuker algorithm applications. In this paper, a CPU-GPU hybrid computing system that accelerates Zuker algorithm applications for RNA secondary structure prediction is proposed. The computing tasks are allocated between CPU and GPU for parallel cooperate execution. Performance differences between the CPU and the GPU in the task-allocation scheme are considered to obtain workload balance. To improve the hybrid system performance, the Zuker algorithm is optimally implemented with special methods for CPU and GPU architecture. Speedup of 15.93× over optimized multi-core SIMD CPU implementation and performance advantage of 16% over optimized GPU implementation are shown in the experimental results. More than 14% of the sequences are executed on CPU in the hybrid system. The system combining CPU and GPU to accelerate the Zuker algorithm is proven to be promising and can be applied to other bioinformatics applications.

  15. Predicting disulfide connectivity from protein sequence using multiple sequence feature vectors and secondary structure.

    PubMed

    Song, Jiangning; Yuan, Zheng; Tan, Hao; Huber, Thomas; Burrage, Kevin

    2007-12-01

    Disulfide bonds are primary covalent crosslinks between two cysteine residues in proteins that play critical roles in stabilizing the protein structures and are commonly found in extracytoplasmic or secreted proteins. In protein folding prediction, the localization of disulfide bonds can greatly reduce the search in conformational space. Therefore, there is a great need to develop computational methods capable of accurately predicting disulfide connectivity patterns in proteins that could have potentially important applications. We have developed a novel method to predict disulfide connectivity patterns from protein primary sequence, using a support vector regression (SVR) approach based on multiple sequence feature vectors and predicted secondary structure by the PSIPRED program. The results indicate that our method could achieve a prediction accuracy of 74.4% and 77.9%, respectively, when averaged on proteins with two to five disulfide bridges using 4-fold cross-validation, measured on the protein and cysteine pair on a well-defined non-homologous dataset. We assessed the effects of different sequence encoding schemes on the prediction performance of disulfide connectivity. It has been shown that the sequence encoding scheme based on multiple sequence feature vectors coupled with predicted secondary structure can significantly improve the prediction accuracy, thus enabling our method to outperform most other currently available predictors. Our work provides a complementary approach to the current algorithms that should be useful in computationally assigning disulfide connectivity patterns and helps in the annotation of protein sequences generated by large-scale whole-genome projects. The prediction web server and Supplementary Material are accessible at http://foo.maths.uq.edu.au/~huber/disulfide

  16. Computational modeling of membrane proteins

    PubMed Central

    Leman, Julia Koehler; Ulmschneider, Martin B.; Gray, Jeffrey J.

    2014-01-01

    The determination of membrane protein (MP) structures has always trailed that of soluble proteins due to difficulties in their overexpression, reconstitution into membrane mimetics, and subsequent structure determination. The percentage of MP structures in the protein databank (PDB) has been at a constant 1-2% for the last decade. In contrast, over half of all drugs target MPs, only highlighting how little we understand about drug-specific effects in the human body. To reduce this gap, researchers have attempted to predict structural features of MPs even before the first structure was experimentally elucidated. In this review, we present current computational methods to predict MP structure, starting with secondary structure prediction, prediction of trans-membrane spans, and topology. Even though these methods generate reliable predictions, challenges such as predicting kinks or precise beginnings and ends of secondary structure elements are still waiting to be addressed. We describe recent developments in the prediction of 3D structures of both α-helical MPs as well as β-barrels using comparative modeling techniques, de novo methods, and molecular dynamics (MD) simulations. The increase of MP structures has (1) facilitated comparative modeling due to availability of more and better templates, and (2) improved the statistics for knowledge-based scoring functions. Moreover, de novo methods have benefitted from the use of correlated mutations as restraints. Finally, we outline current advances that will likely shape the field in the forthcoming decade. PMID:25355688

  17. Integration of QUARK and I-TASSER for ab initio protein structure prediction in CASP11

    PubMed Central

    Zhang, Wenxuan; Yang, Jianyi; He, Baoji; Walker, Sara Elizabeth; Zhang, Hongjiu; Govindarajoo, Brandon; Virtanen, Jouko; Xue, Zhidong; Shen, Hong-Bin; Zhang, Yang

    2015-01-01

    We tested two pipelines developed for template-free protein structure prediction in the CASP11 experiment. First, the QUARK pipeline constructs structure models by reassembling fragments of continuously distributed lengths excised from unrelated proteins. Five free-modeling (FM) targets have the model successfully constructed by QUARK with a TM-score above 0.4, including the first model of T0837-D1, which has a TM-score=0.736 and RMSD=2.9 Å to the native. Detailed analysis showed that the success is partly attributed to the high-resolution contact map prediction derived from fragment-based distance-profiles, which are mainly located between regular secondary structure elements and loops/turns and help guide the orientation of secondary structure assembly. In the Zhang-Server pipeline, weakly scoring threading templates are re-ordered by the structural similarity to the ab initio folding models, which are then reassembled by I-TASSER based structure assembly simulations; 60% more domains with length up to 204 residues, compared to the QUARK pipeline, were successfully modeled by the I-TASSER pipeline with a TM-score above 0.4. The robustness of the I-TASSER pipeline can stem from the composite fragment-assembly simulations that combine structures from both ab initio folding and threading template refinements. Despite the promising cases, challenges still exist in long-range beta-strand folding, domain parsing, and the uncertainty of secondary structure prediction; the latter of which was found to affect nearly all aspects of FM structure predictions, from fragment identification, target classification, structure assembly, to final model selection. Significant efforts are needed to solve these problems before real progress on FM could be made. PMID:26370505

  18. Integration of QUARK and I-TASSER for Ab Initio Protein Structure Prediction in CASP11.

    PubMed

    Zhang, Wenxuan; Yang, Jianyi; He, Baoji; Walker, Sara Elizabeth; Zhang, Hongjiu; Govindarajoo, Brandon; Virtanen, Jouko; Xue, Zhidong; Shen, Hong-Bin; Zhang, Yang

    2016-09-01

    We tested two pipelines developed for template-free protein structure prediction in the CASP11 experiment. First, the QUARK pipeline constructs structure models by reassembling fragments of continuously distributed lengths excised from unrelated proteins. Five free-modeling (FM) targets have the model successfully constructed by QUARK with a TM-score above 0.4, including the first model of T0837-D1, which has a TM-score = 0.736 and RMSD = 2.9 Å to the native. Detailed analysis showed that the success is partly attributed to the high-resolution contact map prediction derived from fragment-based distance-profiles, which are mainly located between regular secondary structure elements and loops/turns and help guide the orientation of secondary structure assembly. In the Zhang-Server pipeline, weakly scoring threading templates are re-ordered by the structural similarity to the ab initio folding models, which are then reassembled by I-TASSER based structure assembly simulations; 60% more domains with length up to 204 residues, compared to the QUARK pipeline, were successfully modeled by the I-TASSER pipeline with a TM-score above 0.4. The robustness of the I-TASSER pipeline can stem from the composite fragment-assembly simulations that combine structures from both ab initio folding and threading template refinements. Despite the promising cases, challenges still exist in long-range beta-strand folding, domain parsing, and the uncertainty of secondary structure prediction; the latter of which was found to affect nearly all aspects of FM structure predictions, from fragment identification, target classification, structure assembly, to final model selection. Significant efforts are needed to solve these problems before real progress on FM could be made. Proteins 2016; 84(Suppl 1):76-86. © 2015 Wiley Periodicals, Inc.

  19. Computational prediction and biochemical characterization of novel RNA aptamers to Rift Valley fever virus nucleocapsid protein.

    PubMed

    Ellenbecker, Mary; St Goddard, Jeremy; Sundet, Alec; Lanchy, Jean-Marc; Raiford, Douglas; Lodmell, J Stephen

    2015-10-01

    Rift Valley fever virus (RVFV) is a potent human and livestock pathogen endemic to sub-Saharan Africa and the Arabian Peninsula that has potential to spread to other parts of the world. Although there is no proven effective and safe treatment for RVFV infections, a potential therapeutic target is the virally encoded nucleocapsid protein (N). During the course of infection, N binds to viral RNA, and perturbation of this interaction can inhibit viral replication. To gain insight into how N recognizes viral RNA specifically, we designed an algorithm that uses a distance matrix and multidimensional scaling to compare the predicted secondary structures of known N-binding RNAs, or aptamers, that were isolated and characterized in a previous in vitro evolution experiment. These aptamers did not exhibit overt sequence or predicted structure similarity, so we employed bioinformatic methods to propose novel aptamers based on analysis and clustering of secondary structures. We screened and scored the predicted secondary structures of novel randomly generated RNA sequences in silico and selected several of these putative N-binding RNAs whose secondary structures were similar to those of known N-binding RNAs. We found that overall the in silico generated RNA sequences bound well to N in vitro. Furthermore, introduction of these RNAs into cells prior to infection with RVFV inhibited viral replication in cell culture. This proof of concept study demonstrates how the predictive power of bioinformatics and the empirical power of biochemistry can be jointly harnessed to discover, synthesize, and test new RNA sequences that bind tightly to RVFV N protein. The approach would be easily generalizable to other applications. Copyright © 2015 Elsevier Ltd. All rights reserved.

  20. A range of complex probabilistic models for RNA secondary structure prediction that includes the nearest-neighbor model and more.

    PubMed

    Rivas, Elena; Lang, Raymond; Eddy, Sean R

    2012-02-01

    The standard approach for single-sequence RNA secondary structure prediction uses a nearest-neighbor thermodynamic model with several thousand experimentally determined energy parameters. An attractive alternative is to use statistical approaches with parameters estimated from growing databases of structural RNAs. Good results have been reported for discriminative statistical methods using complex nearest-neighbor models, including CONTRAfold, Simfold, and ContextFold. Little work has been reported on generative probabilistic models (stochastic context-free grammars [SCFGs]) of comparable complexity, although probabilistic models are generally easier to train and to use. To explore a range of probabilistic models of increasing complexity, and to directly compare probabilistic, thermodynamic, and discriminative approaches, we created TORNADO, a computational tool that can parse a wide spectrum of RNA grammar architectures (including the standard nearest-neighbor model and more) using a generalized super-grammar that can be parameterized with probabilities, energies, or arbitrary scores. By using TORNADO, we find that probabilistic nearest-neighbor models perform comparably to (but not significantly better than) discriminative methods. We find that complex statistical models are prone to overfitting RNA structure and that evaluations should use structurally nonhomologous training and test data sets. Overfitting has affected at least one published method (ContextFold). The most important barrier to improving statistical approaches for RNA secondary structure prediction is the lack of diversity of well-curated single-sequence RNA secondary structures in current RNA databases.

  1. A range of complex probabilistic models for RNA secondary structure prediction that includes the nearest-neighbor model and more

    PubMed Central

    Rivas, Elena; Lang, Raymond; Eddy, Sean R.

    2012-01-01

    The standard approach for single-sequence RNA secondary structure prediction uses a nearest-neighbor thermodynamic model with several thousand experimentally determined energy parameters. An attractive alternative is to use statistical approaches with parameters estimated from growing databases of structural RNAs. Good results have been reported for discriminative statistical methods using complex nearest-neighbor models, including CONTRAfold, Simfold, and ContextFold. Little work has been reported on generative probabilistic models (stochastic context-free grammars [SCFGs]) of comparable complexity, although probabilistic models are generally easier to train and to use. To explore a range of probabilistic models of increasing complexity, and to directly compare probabilistic, thermodynamic, and discriminative approaches, we created TORNADO, a computational tool that can parse a wide spectrum of RNA grammar architectures (including the standard nearest-neighbor model and more) using a generalized super-grammar that can be parameterized with probabilities, energies, or arbitrary scores. By using TORNADO, we find that probabilistic nearest-neighbor models perform comparably to (but not significantly better than) discriminative methods. We find that complex statistical models are prone to overfitting RNA structure and that evaluations should use structurally nonhomologous training and test data sets. Overfitting has affected at least one published method (ContextFold). The most important barrier to improving statistical approaches for RNA secondary structure prediction is the lack of diversity of well-curated single-sequence RNA secondary structures in current RNA databases. PMID:22194308

  2. Protein Secondary Structure Prediction Using Deep Convolutional Neural Fields.

    PubMed

    Wang, Sheng; Peng, Jian; Ma, Jianzhu; Xu, Jinbo

    2016-01-11

    Protein secondary structure (SS) prediction is important for studying protein structure and function. When only the sequence (profile) information is used as input feature, currently the best predictors can obtain ~80% Q3 accuracy, which has not been improved in the past decade. Here we present DeepCNF (Deep Convolutional Neural Fields) for protein SS prediction. DeepCNF is a Deep Learning extension of Conditional Neural Fields (CNF), which is an integration of Conditional Random Fields (CRF) and shallow neural networks. DeepCNF can model not only complex sequence-structure relationship by a deep hierarchical architecture, but also interdependency between adjacent SS labels, so it is much more powerful than CNF. Experimental results show that DeepCNF can obtain ~84% Q3 accuracy, ~85% SOV score, and ~72% Q8 accuracy, respectively, on the CASP and CAMEO test proteins, greatly outperforming currently popular predictors. As a general framework, DeepCNF can be used to predict other protein structure properties such as contact number, disorder regions, and solvent accessibility.
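
    Q3 accuracy, as quoted above, is the fraction of residues whose predicted three-state label (helix H, strand E, coil C) matches the observed one. A minimal sketch follows; the function name and the two example strings are made up for illustration and are not DeepCNF output.

```python
def q3_accuracy(predicted, observed):
    """Fraction of positions where the predicted 3-state label matches."""
    if len(predicted) != len(observed):
        raise ValueError("label strings must have the same length")
    if not observed:
        raise ValueError("label strings must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)

# Hypothetical labels for a 14-residue fragment.
observed = "CHHHHHHCCEEEEC"
predicted = "CHHHHHCCCEEECC"
q3 = q3_accuracy(predicted, observed)   # 12 of 14 residues agree
```

    Q8 accuracy is computed the same way over eight-state labels.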

  3. Protein Secondary Structure Prediction Using Deep Convolutional Neural Fields

    NASA Astrophysics Data System (ADS)

    Wang, Sheng; Peng, Jian; Ma, Jianzhu; Xu, Jinbo

    2016-01-01

    Protein secondary structure (SS) prediction is important for studying protein structure and function. When only the sequence (profile) information is used as input feature, currently the best predictors can obtain ~80% Q3 accuracy, which has not been improved in the past decade. Here we present DeepCNF (Deep Convolutional Neural Fields) for protein SS prediction. DeepCNF is a Deep Learning extension of Conditional Neural Fields (CNF), which is an integration of Conditional Random Fields (CRF) and shallow neural networks. DeepCNF can model not only complex sequence-structure relationship by a deep hierarchical architecture, but also interdependency between adjacent SS labels, so it is much more powerful than CNF. Experimental results show that DeepCNF can obtain ~84% Q3 accuracy, ~85% SOV score, and ~72% Q8 accuracy, respectively, on the CASP and CAMEO test proteins, greatly outperforming currently popular predictors. As a general framework, DeepCNF can be used to predict other protein structure properties such as contact number, disorder regions, and solvent accessibility.

  4. Order within disorder: Aggrecan chondroitin sulphate-attachment region provides new structural insights into protein sequences classified as disordered

    PubMed Central

    Jowitt, Thomas A; Murdoch, Alan D; Baldock, Clair; Berry, Richard; Day, Joanna M; Hardingham, Timothy E

    2010-01-01

    Structural investigation of proteins containing large stretches of sequences without predicted secondary structure is the focus of much increased attention. Here, we have produced an unglycosylated 30 kDa peptide from the chondroitin sulphate (CS)-attachment region of human aggrecan (CS-peptide), which was predicted to be intrinsically disordered and compared its structure with the adjacent aggrecan G3 domain. Biophysical analyses, including analytical ultracentrifugation, light scattering, and circular dichroism showed that the CS-peptide had an elongated and stiffened conformation in contrast to the globular G3 domain. The results suggested that it contained significant secondary structure, which was sensitive to urea, and we propose that the CS-peptide forms an elongated wormlike molecule based on a dynamic range of energetically equivalent secondary structures stabilized by hydrogen bonds. The dimensions of the structure predicted from small-angle X-ray scattering analysis were compatible with EM images of fully glycosylated aggrecan and a partly glycosylated aggrecan CS2-G3 construct. The semiordered structure identified in CS-peptide was not predicted by common structural algorithms and identified a potentially distinct class of semiordered structure within sequences currently identified as disordered. Sequence comparisons suggested some evidence for comparable structures in proteins encoded by other genes (PRG4, MUC5B, and CBP). The function of these semiordered sequences may serve to spatially position attached folded modules and/or to present polypeptides for modification, such as glycosylation, and to provide templates for the multiple pleiotropic interactions proposed for disordered proteins. Proteins 2010. © 2010 Wiley-Liss, Inc. PMID:20806220

  5. NNvPDB: Neural Network based Protein Secondary Structure Prediction with PDB Validation.

    PubMed

    Sakthivel, Seethalakshmi; S K M, Habeeb

    2015-01-01

    The predicted secondary structural states are not cross-validated by any of the existing servers. Hence, information on the level of accuracy for every sequence is not reported by the existing servers. This was overcome by NNvPDB, which not only reports greater Q3 but also validates every prediction with the homologous PDB entries. NNvPDB is based on a neural network, with a new and different approach of training the network every time with five PDB structures that are similar to the query sequence. The average accuracy for helix is 76%, beta sheet is 71% and overall (helix, sheet and coil) is 66%. http://bit.srmuniv.ac.in/cgi-bin/bit/cfpdb/nnsecstruct.pl.

  6. A high-throughput approach to profile RNA structure.

    PubMed

    Delli Ponti, Riccardo; Marti, Stefanie; Armaos, Alexandros; Tartaglia, Gian Gaetano

    2017-03-17

    Here we introduce the Computational Recognition of Secondary Structure (CROSS) method to calculate the structural profile of an RNA sequence (single- or double-stranded state) at single-nucleotide resolution and without sequence length restrictions. We trained CROSS using data from high-throughput experiments such as Selective 2′-Hydroxyl Acylation analyzed by Primer Extension (SHAPE; Mouse and HIV transcriptomes) and Parallel Analysis of RNA Structure (PARS; Human and Yeast transcriptomes) as well as high-quality NMR/X-ray structures (PDB database). The algorithm uses primary structure information alone to predict experimental structural profiles with >80% accuracy, showing high performance on large RNAs such as Xist (17 900 nucleotides; Area Under the ROC Curve AUC of 0.75 on dimethyl sulfate (DMS) experiments). We integrated CROSS in thermodynamics-based methods to predict secondary structure and observed an increase in their predictive power by up to 30%. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  7. Analysis of protein circular dichroism spectra for secondary structure using a simple matrix multiplication.

    PubMed

    Compton, L A; Johnson, W C

    1986-05-15

    Inverse circular dichroism (CD) spectra are presented for each of the five major secondary structures of proteins: alpha-helix, antiparallel and parallel beta-sheet, beta-turn, and other (random) structures. The fraction of each secondary structure in a protein is predicted by forming the dot product of the corresponding inverse CD spectrum, expressed as a vector, with the CD spectrum of the protein digitized in the same way. We show how this method is based on the construction of the generalized inverse from the singular value decomposition of a set of CD spectra corresponding to proteins whose secondary structures are known from X-ray crystallography. These inverse spectra compute secondary structure directly from protein CD spectra without resorting to least-squares fitting and standard matrix inversion techniques. In addition, spectra corresponding to the individual secondary structures, analogous to the CD spectra of synthetic polypeptides, are generated from the five most significant CD eigenvectors.
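
    The generalized-inverse idea described above can be sketched with NumPy. This is an illustration on synthetic data, not the authors' reference set: the matrix shapes, the random "pure component" spectra, and the function name are all assumptions. Each row of the returned matrix plays the role of one inverse CD spectrum, and a structure fraction is its dot product with a digitized protein spectrum.

```python
import numpy as np

def inverse_cd_spectra(spectra, fractions, rank=5):
    """Build inverse spectra X such that fractions ~= X @ spectra.

    spectra:   (n_wavelengths, n_proteins) reference CD spectra
    fractions: (n_structures, n_proteins) known structure fractions
    rank:      number of singular vectors kept in the generalized inverse
    """
    u, s, vt = np.linalg.svd(spectra, full_matrices=False)
    u, s, vt = u[:, :rank], s[:rank], vt[:rank]
    pseudo_inverse = vt.T @ np.diag(1.0 / s) @ u.T   # (n_proteins, n_wavelengths)
    return fractions @ pseudo_inverse                # (n_structures, n_wavelengths)

# Synthetic reference set: 16 "proteins", 5 structure classes, 40 wavelengths.
rng = np.random.default_rng(0)
n_wavelengths, n_structures, n_proteins = 40, 5, 16
basis = rng.normal(size=(n_wavelengths, n_structures))   # made-up component spectra
fractions = rng.dirichlet(np.ones(n_structures), size=n_proteins).T
spectra = basis @ fractions

inverse = inverse_cd_spectra(spectra, fractions)

# Analyse a new synthetic spectrum with one matrix-vector product.
new_fractions = rng.dirichlet(np.ones(n_structures))
estimate = inverse @ (basis @ new_fractions)
```

    On noise-free synthetic data the estimate reproduces the input fractions; with real spectra the truncation rank controls how much noise is carried into the inverse.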

  8. Secondary structure prediction for complete rDNA sequences (18S, 5.8S, and 28S rDNA) of Demodex folliculorum, and comparison of divergent domains structures across Acari.

    PubMed

    Zhao, Ya-E; Wang, Zheng-Hang; Xu, Yang; Wu, Li-Ping; Hu, Li

    2013-10-01

    According to base pairing, the rRNA folds into corresponding secondary structures, which contain additional phylogenetic information. On the basis of sequencing for complete rDNA sequences (18S, ITS1, 5.8S, ITS2 and 28S rDNA) of Demodex, we predicted the secondary structure of the complete rDNA sequence (18S, 5.8S, and 28S rDNA) of Demodex folliculorum, which was in concordance with that of the main arthropod lineages in past studies. And together with the sequence data from GenBank, we also predicted the secondary structures of divergent domains in SSU rRNA of 51 species and in LSU rRNA of 43 species from four superfamilies in Acari (Cheyletoidea, Tetranychoidea, Analgoidea and Ixodoidea). The multiple alignment among the four superfamilies in Acari showed that, insertions from Tetranychoidea SSU rRNA formed two newly proposed helixes, and helix c3-2b of LSU rRNA was absent in Demodex (Cheyletoidea) taxa. Generally speaking, LSU rRNA presented more remarkable differences than SSU rRNA did, mainly in D2, D3, D5, D7a, D7b, D8 and D10. Copyright © 2013 Elsevier Inc. All rights reserved.

  9. A new model for approximating RNA folding trajectories and population kinetics

    NASA Astrophysics Data System (ADS)

    Kirkpatrick, Bonnie; Hajiaghayi, Monir; Condon, Anne

    2013-01-01

    RNA participates both in functional aspects of the cell and in gene regulation. The interactions of these molecules are mediated by their secondary structure which can be viewed as a planar circle graph with arcs for all the chemical bonds between pairs of bases in the RNA sequence. The problem of predicting RNA secondary structure, specifically the chemically most probable structure, has many useful and efficient algorithms. This leaves RNA folding, the problem of predicting the dynamic behavior of RNA structure over time, as the main open problem. RNA folding is important for functional understanding because some RNA molecules change secondary structure in response to interactions with the environment. The full RNA folding model on at most O(3^n) secondary structures is the gold standard. We present a new subset approximation model for the full model, give methods to analyze its accuracy and discuss the relative merits of our model as compared with a pre-existing subset approximation. The main advantage of our model is that it generates Monte Carlo folding pathways with the same probabilities with which they are generated under the full model. The pre-existing subset approximation does not have this property.

  10. RDNAnalyzer: A tool for DNA secondary structure prediction and sequence analysis

    PubMed Central

    Afzal, Muhammad; Shahid, Ahmad Ali; Shehzadi, Abida; Nadeem, Shahid; Husnain, Tayyab

    2012-01-01

    RDNAnalyzer is an innovative computer-based tool designed for DNA secondary structure prediction and sequence analysis. It can randomly generate a DNA sequence, or users can upload sequences of their own interest in RAW format. It uses and extends the Nussinov dynamic programming algorithm and has various applications for sequence analysis. It predicts the DNA secondary structure and base pairings. It also provides tools for sequence analyses routinely performed by biological scientists, such as DNA replication, reverse complement generation, transcription, translation, sequence-specific information such as the total number of nucleotide bases and ATGC base contents along with their respective percentages, and a sequence cleaner. RDNAnalyzer is a unique tool developed in Microsoft Visual Studio 2008 using Microsoft Visual C# and Windows Presentation Foundation and provides a user-friendly environment for sequence analysis. It is freely available. Availability: http://www.cemb.edu.pk/sw.html. Abbreviations: RDNAnalyzer - Random DNA Analyser, GUI - Graphical user interface, XAML - Extensible Application Markup Language. PMID:23055611
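
    For orientation, the textbook Nussinov recursion maximizes the number of complementary base pairs by dynamic programming. The sketch below is not RDNAnalyzer's code (the tool is written in C# and is said to extend the algorithm); it assumes Watson-Crick DNA pairs only and a hypothetical minimum loop length of three unpaired bases, and it adds a reverse-complement helper of the kind listed above.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    """Reverse the sequence and swap each base for its complement."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def nussinov(seq, min_loop=3):
    """Return (max number of base pairs, one optimal dot-bracket structure)."""
    n = len(seq)
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]                    # case: j is unpaired
            for k in range(i, j - min_loop):          # case: k pairs with j
                if COMPLEMENT[seq[k]] == seq[j]:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score

    # Trace back one optimal structure.
    structure = ["."] * n
    stack = [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= min_loop:
            continue
        if best[i][j] == best[i][j - 1]:
            stack.append((i, j - 1))
            continue
        for k in range(i, j - min_loop):
            if COMPLEMENT[seq[k]] == seq[j]:
                left = best[i][k - 1] if k > i else 0
                if best[i][j] == left + 1 + best[k + 1][j - 1]:
                    structure[k], structure[j] = "(", ")"
                    if k > i:
                        stack.append((i, k - 1))
                    stack.append((k + 1, j - 1))
                    break
    return (best[0][n - 1] if n else 0), "".join(structure)

pairs, fold = nussinov("GGGAAATCCC")
```

    The recursion runs in O(n^3) time and O(n^2) memory, which is why it remains practical for short sequences in a desktop tool.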

  11. An efficient method for the prediction of deleterious multiple-point mutations in the secondary structure of RNAs using suboptimal folding solutions

    PubMed Central

    Churkin, Alexander; Barash, Danny

    2008-01-01

    Background RNAmute is an interactive Java application which, given an RNA sequence, calculates the secondary structure of all single point mutations and organizes them into categories according to their similarity to the predicted structure of the wild type. The secondary structure predictions are performed using the Vienna RNA package. A more efficient implementation of RNAmute is needed, however, to extend from the case of single point mutations to the general case of multiple point mutations, which may often be desired for computational predictions alongside mutagenesis experiments. But analyzing multiple point mutations, a process that requires traversing all possible mutations, becomes highly expensive since the running time is O(n^m) for a sequence of length n with m-point mutations. Using Vienna's RNAsubopt, we present a method that selects only those mutations, based on stability considerations, which are likely to be conformationally rearranging. The approach is best examined using the dot plot representation for RNA secondary structure. Results Using RNAsubopt, the suboptimal solutions for a given wild-type sequence are calculated once. Then, specific mutations are selected that are most likely to cause a conformational rearrangement. For an RNA sequence of about 100 nts and 3-point mutations (n = 100, m = 3), for example, the proposed method reduces the running time from several hours or even days to several minutes, thus enabling the practical application of RNAmute to the analysis of multiple-point mutations. Conclusion A highly efficient addition to RNAmute that is as user friendly as the original application but that facilitates the practical analysis of multiple-point mutations is presented. Such an extension can now be exploited prior to site-directed mutagenesis experiments by virologists, for example, who investigate the change of function in an RNA virus via mutations that disrupt important motifs in its secondary structure. A complete explanation of the application, called MultiRNAmute, is available at [1]. PMID:18445289
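
    To see why exhaustive enumeration grows polynomially in n with exponent m, one can count the mutants directly: choose m of the n positions, then one of the three alternative bases at each. The short sketch below does this arithmetic; the function name is hypothetical and the count is a combinatorial illustration, not a figure taken from the paper.

```python
from math import comb

def count_point_mutants(n, m, alphabet_size=4):
    """Number of distinct sequences that differ from the wild type at exactly m sites."""
    return comb(n, m) * (alphabet_size - 1) ** m

single = count_point_mutants(100, 1)   # 300 single-point mutants
triple = count_point_mutants(100, 3)   # 4365900 three-point mutants
```

    Each of those mutants would need its own folding run in a brute-force scan, which is consistent with the hours-to-days running time quoted for n = 100, m = 3.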

  12. Evidence of pervasive biologically functional secondary structures within the genomes of eukaryotic single-stranded DNA viruses.

    PubMed

    Muhire, Brejnev Muhizi; Golden, Michael; Murrell, Ben; Lefeuvre, Pierre; Lett, Jean-Michel; Gray, Alistair; Poon, Art Y F; Ngandu, Nobubelo Kwanele; Semegni, Yves; Tanov, Emil Pavlov; Monjane, Adérito Luis; Harkins, Gordon William; Varsani, Arvind; Shepherd, Dionne Natalie; Martin, Darren Patrick

    2014-02-01

    Single-stranded DNA (ssDNA) viruses have genomes that are potentially capable of forming complex secondary structures through Watson-Crick base pairing between their constituent nucleotides. A few of the structural elements formed by such base pairings are, in fact, known to have important functions during the replication of many ssDNA viruses. Unknown, however, are (i) whether numerous additional ssDNA virus genomic structural elements predicted to exist by computational DNA folding methods actually exist and (ii) whether those structures that do exist have any biological relevance. We therefore computationally inferred lists of the most evolutionarily conserved structures within a diverse selection of animal- and plant-infecting ssDNA viruses drawn from the families Circoviridae, Anelloviridae, Parvoviridae, Nanoviridae, and Geminiviridae and analyzed these for evidence of natural selection favoring the maintenance of these structures. While we find evidence that is consistent with purifying selection being stronger at nucleotide sites that are predicted to be base paired than at sites predicted to be unpaired, we also find strong associations between sites that are predicted to pair with one another and site pairs that are apparently coevolving in a complementary fashion. Collectively, these results indicate that natural selection actively preserves much of the pervasive secondary structure that is evident within eukaryote-infecting ssDNA virus genomes and, therefore, that much of this structure is biologically functional. Lastly, we provide examples of various highly conserved but completely uncharacterized structural elements that likely have important functions within some of the ssDNA virus genomes analyzed here.

  13. Evidence of Pervasive Biologically Functional Secondary Structures within the Genomes of Eukaryotic Single-Stranded DNA Viruses

    PubMed Central

    Muhire, Brejnev Muhizi; Golden, Michael; Murrell, Ben; Lefeuvre, Pierre; Lett, Jean-Michel; Gray, Alistair; Poon, Art Y. F.; Ngandu, Nobubelo Kwanele; Semegni, Yves; Tanov, Emil Pavlov; Monjane, Adérito Luis; Harkins, Gordon William; Varsani, Arvind; Shepherd, Dionne Natalie

    2014-01-01

    Single-stranded DNA (ssDNA) viruses have genomes that are potentially capable of forming complex secondary structures through Watson-Crick base pairing between their constituent nucleotides. A few of the structural elements formed by such base pairings are, in fact, known to have important functions during the replication of many ssDNA viruses. Unknown, however, are (i) whether numerous additional ssDNA virus genomic structural elements predicted to exist by computational DNA folding methods actually exist and (ii) whether those structures that do exist have any biological relevance. We therefore computationally inferred lists of the most evolutionarily conserved structures within a diverse selection of animal- and plant-infecting ssDNA viruses drawn from the families Circoviridae, Anelloviridae, Parvoviridae, Nanoviridae, and Geminiviridae and analyzed these for evidence of natural selection favoring the maintenance of these structures. While we find evidence that is consistent with purifying selection being stronger at nucleotide sites that are predicted to be base paired than at sites predicted to be unpaired, we also find strong associations between sites that are predicted to pair with one another and site pairs that are apparently coevolving in a complementary fashion. Collectively, these results indicate that natural selection actively preserves much of the pervasive secondary structure that is evident within eukaryote-infecting ssDNA virus genomes and, therefore, that much of this structure is biologically functional. Lastly, we provide examples of various highly conserved but completely uncharacterized structural elements that likely have important functions within some of the ssDNA virus genomes analyzed here. PMID:24284329

  14. Protein 8-class secondary structure prediction using conditional neural fields.

    PubMed

    Wang, Zhiyong; Zhao, Feng; Peng, Jian; Xu, Jinbo

    2011-10-01

    Compared with the protein 3-class secondary structure (SS) prediction, the 8-class prediction gains less attention and is also much more challenging, especially for proteins with few sequence homologs. This paper presents a new probabilistic method for 8-class SS prediction using conditional neural fields (CNFs), a recently invented probabilistic graphical model. This CNF method not only models the complex relationship between sequence features and SS, but also exploits the interdependency among SS types of adjacent residues. In addition to sequence profiles, our method also makes use of non-evolutionary information for SS prediction. Tested on the CB513 and RS126 data sets, our method achieves Q8 accuracy of 64.9 and 64.7%, respectively, which are much better than the SSpro8 web server (51.0 and 48.0%, respectively). Our method can also be used to predict other structure properties (e.g. solvent accessibility) of a protein or the SS of RNA. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  15. Exact calculation of loop formation probability identifies folding motifs in RNA secondary structures.

    PubMed

    Sloma, Michael F; Mathews, David H

    2016-12-01

    RNA secondary structure prediction is widely used to analyze RNA sequences. In an RNA partition function calculation, free energy nearest neighbor parameters are used in a dynamic programming algorithm to estimate statistical properties of the secondary structure ensemble. Previously, partition functions have largely been used to estimate the probability that a given pair of nucleotides form a base pair, the conditional stacking probability, the accessibility to binding of a continuous stretch of nucleotides, or a representative sample of RNA structures. Here it is demonstrated that an RNA partition function can also be used to calculate the exact probability of formation of hairpin loops, internal loops, bulge loops, or multibranch loops at a given position. This calculation can also be used to estimate the probability of formation of specific helices. Benchmarking on a set of RNA sequences with known secondary structures indicated that loops that were calculated to be more probable were more likely to be present in the known structure than less probable loops. Furthermore, highly probable loops are more likely to be in the known structure than the set of loops predicted in the lowest free energy structures. © 2016 Sloma and Mathews; Published by Cold Spring Harbor Laboratory Press for the RNA Society.

  16. CPU-GPU hybrid accelerating the Zuker algorithm for RNA secondary structure prediction applications

    PubMed Central

    2012-01-01

    Background Prediction of ribonucleic acid (RNA) secondary structure remains one of the most important research areas in bioinformatics. The Zuker algorithm is one of the most popular methods of free energy minimization for RNA secondary structure prediction. Thus far, few studies have been reported on the acceleration of the Zuker algorithm on general-purpose processors or on extra accelerators such as Field Programmable Gate-Array (FPGA) and Graphics Processing Units (GPU). To the best of our knowledge, no implementation combines both CPU and extra accelerators, such as GPUs, to accelerate the Zuker algorithm applications. Results In this paper, a CPU-GPU hybrid computing system that accelerates Zuker algorithm applications for RNA secondary structure prediction is proposed. The computing tasks are allocated between CPU and GPU for parallel cooperative execution. Performance differences between the CPU and the GPU in the task-allocation scheme are considered to obtain workload balance. To improve the hybrid system performance, the Zuker algorithm is optimally implemented with special methods for CPU and GPU architecture. Conclusions Speedup of 15.93× over optimized multi-core SIMD CPU implementation and performance advantage of 16% over optimized GPU implementation are shown in the experimental results. More than 14% of the sequences are executed on CPU in the hybrid system. The system combining CPU and GPU to accelerate the Zuker algorithm is proven to be promising and can be applied to other bioinformatics applications. PMID:22369626

  17. Prediction of beta-turns and beta-turn types by a novel bidirectional Elman-type recurrent neural network with multiple output layers (MOLEBRNN).

    PubMed

    Kirschner, Andreas; Frishman, Dmitrij

    2008-10-01

    Prediction of beta-turns from amino acid sequences has long been recognized as an important problem in structural bioinformatics due to their frequent occurrence as well as their structural and functional significance. Because various structural features of proteins are intercorrelated, secondary structure information has been often employed as an additional input for machine learning algorithms while predicting beta-turns. Here we present a novel bidirectional Elman-type recurrent neural network with multiple output layers (MOLEBRNN) capable of predicting multiple mutually dependent structural motifs and demonstrate its efficiency in recognizing three aspects of protein structure: beta-turns, beta-turn types, and secondary structure. The advantage of our method compared to other predictors is that it does not require any external input except for sequence profiles because interdependencies between different structural features are taken into account implicitly during the learning process. In a sevenfold cross-validation experiment on a standard test dataset our method exhibits the total prediction accuracy of 77.9% and the Matthews correlation coefficient of 0.45, the highest performance reported so far. It also outperforms other known methods in delineating individual turn types. We demonstrate how simultaneous prediction of multiple targets influences prediction performance on single targets. The MOLEBRNN presented here is a generic method applicable in a variety of research fields where multiple mutually dependent target classes need to be predicted. http://webclu.bio.wzw.tum.de/predator-web/.
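
    The two figures of merit quoted above, total accuracy and the Matthews correlation coefficient, are both derived from the binary confusion matrix (turn versus non-turn). The sketch below uses made-up counts purely to show the arithmetic; it does not reproduce the paper's results.

```python
from math import sqrt

def accuracy(tp, tn, fp, fn):
    """Fraction of residues assigned to the correct class."""
    return (tp + tn) / (tp + tn + fp + fn)

def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 when the denominator vanishes."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Hypothetical counts for 100 residues (not taken from the study).
acc = accuracy(tp=30, tn=50, fp=10, fn=10)            # 80 of 100 correct
mcc = matthews_corrcoef(tp=30, tn=50, fp=10, fn=10)   # 1400 / 2400
```

    Because beta-turn residues are the minority class, the correlation coefficient is usually the more informative of the two numbers.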

  18. Predicting backbone Cα angles and dihedrals from protein sequences by stacked sparse auto-encoder deep neural network.

    PubMed

    Lyons, James; Dehzangi, Abdollah; Heffernan, Rhys; Sharma, Alok; Paliwal, Kuldip; Sattar, Abdul; Zhou, Yaoqi; Yang, Yuedong

    2014-10-30

    Because of the nearly constant distance between two neighbouring Cα atoms, local backbone structure of proteins can be represented accurately by the angle between C(αi-1)-C(αi)-C(αi+1) (θ) and a dihedral angle rotated about the C(αi)-C(αi+1) bond (τ). θ and τ angles, as the representative of structural properties of three to four amino-acid residues, offer a description of backbone conformations that is complementary to φ and ψ angles (single residue) and secondary structures (>3 residues). Here, we report the first machine-learning technique for sequence-based prediction of θ and τ angles. Predicted angles based on an independent test have a mean absolute error of 9° for θ and 34° for τ with a distribution on the θ-τ plane close to that of native values. The average root-mean-square distance of 10-residue fragment structures constructed from predicted θ and τ angles is only 1.9 Å from their corresponding native structures. Predicted θ and τ angles are expected to be complementary to predicted φ and ψ angles and secondary structures for use in model validation and template-based as well as template-free structure prediction. The deep neural network learning technique is available as an on-line server called Structural Property prediction with Integrated DEep neuRal network (SPIDER) at http://sparks-lab.org. Copyright © 2014 Wiley Periodicals, Inc.
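
    The θ and τ angles defined above are plain geometry on consecutive Cα coordinates. The sketch below shows one common way to compute them with NumPy; the function names and the toy coordinates are illustrative assumptions, and the sign convention of the dihedral is the one produced by this particular formula rather than something stated in the abstract.

```python
import numpy as np

def theta_angle(a, b, c):
    """Angle in degrees at atom b formed by three consecutive Cα atoms."""
    u, v = a - b, c - b
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

def tau_dihedral(a, b, c, d):
    """Dihedral in degrees about the b-c bond for four consecutive Cα atoms."""
    b0, b1, b2 = a - b, c - b, d - c
    axis = b1 / np.linalg.norm(b1)
    v = b0 - np.dot(b0, axis) * axis     # component of b0 perpendicular to the bond
    w = b2 - np.dot(b2, axis) * axis     # component of b2 perpendicular to the bond
    x = np.dot(v, w)
    y = np.dot(np.cross(axis, v), w)
    return np.degrees(np.arctan2(y, x))

# Toy coordinates (arbitrary units), not taken from a real protein.
ca = np.array([[0.0, 1.0, 0.0],
               [0.0, 0.0, 0.0],
               [1.0, 0.0, 0.0],
               [1.0, 0.0, 1.0]])
theta = theta_angle(ca[0], ca[1], ca[2])
tau = tau_dihedral(ca[0], ca[1], ca[2], ca[3])
```

    Sliding these two functions along a chain of Cα coordinates yields the per-residue θ and τ profiles that a predictor would be trained to reproduce.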

  19. RNA 3D Modules in Genome-Wide Predictions of RNA 2D Structure

    PubMed Central

    Theis, Corinna; Zirbel, Craig L.; zu Siederdissen, Christian Höner; Anthon, Christian; Hofacker, Ivo L.; Nielsen, Henrik; Gorodkin, Jan

    2015-01-01

    Recent experimental and computational progress has revealed a large potential for RNA structure in the genome. This has been driven by computational strategies that exploit multiple genomes of related organisms to identify common sequences and secondary structures. However, these computational approaches have two main challenges: they are computationally expensive and they have a relatively high false discovery rate (FDR). Simultaneously, RNA 3D structure analysis has revealed modules composed of non-canonical base pairs which occur in non-homologous positions, apparently by independent evolution. These modules can, for example, occur inside structural elements which in RNA 2D predictions appear as internal loops. Hence one question is if the use of such RNA 3D information can improve the prediction accuracy of RNA secondary structure at a genome-wide level. Here, we use RNAz in combination with 3D module prediction tools and apply them on a 13-way vertebrate sequence-based alignment. We find that RNA 3D modules predicted by metaRNAmodules and JAR3D are significantly enriched in the screened windows compared to their shuffled counterparts. The initially estimated FDR of 47.0% is lowered to below 25% when certain 3D module predictions are present in the window of the 2D prediction. We discuss the implications and prospects for further development of computational strategies for detection of RNA 2D structure in genomic sequence. PMID:26509713

  20. Predicting RNA 3D structure using a coarse-grain helix-centered model

    PubMed Central

    Kerpedjiev, Peter; Höner zu Siederdissen, Christian; Hofacker, Ivo L.

    2015-01-01

    A 3D model of RNA structure can provide information about its function and regulation that is not possible with just the sequence or secondary structure. Current models suffer from low accuracy and long running times and either neglect or presume knowledge of the long-range interactions which stabilize the tertiary structure. Our coarse-grained, helix-based, tertiary structure model operates with only a few degrees of freedom compared with all-atom models while preserving the ability to sample tertiary structures given a secondary structure. It strikes a balance between the precision of an all-atom tertiary structure model and the simplicity and effectiveness of a secondary structure representation. It provides a simplified tool for exploring global arrangements of helices and loops within RNA structures. We provide an example of a novel energy function relying only on the positions of stems and loops. We show that coupling our model to this energy function produces predictions as good as or better than the current state-of-the-art tools. We propose that given the wide range of conformational space that needs to be explored, a coarse-grain approach can explore more conformations in fewer iterations than an all-atom model coupled to a fine-grain energy function. Finally, we emphasize the overarching theme of providing an ensemble of predicted structures, something which our tool excels at, rather than providing a handful of the lowest energy structures. PMID:25904133

  1. Protein secondary structure and stability determined by combining exoproteolysis and matrix-assisted laser desorption/ionization time-of-flight mass spectrometry.

    PubMed

    Villanueva, Josep; Villegas, Virtudes; Querol, Enrique; Avilés, Francesc X; Serrano, Luis

    2002-09-01

    In the post-genomic era, several projects focused on the massive experimental resolution of the three-dimensional structures of all the proteins of different organisms have been initiated. Simultaneously, significant progress has been made in the ab initio prediction of protein three-dimensional structure. One of the keys to the success of such a prediction is the use of local information (i.e. secondary structure). Here we describe a new limited proteolysis methodology, based on the use of unspecific exoproteases coupled with matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS), to map quickly secondary structure elements of a protein from both ends, the N- and C-termini. We show that the proteolytic patterns (mass spectra series) obtained can be interpreted in the light of the conformation and local stability of the analyzed proteins, a direct correlation being observed between the predicted and the experimentally derived protein secondary structure. Further, this methodology can be easily applied to check rapidly the folding state of a protein and characterize mutational effects on protein conformation and stability. Moreover, given global stability information, this methodology allows one to locate the protein regions of increased or decreased conformational stability. All of this can be done with a small fraction of the amount of protein required by most of the other methods for conformational analysis. Thus limited exoproteolysis, together with MALDI-TOF MS, can be a useful tool to achieve quickly the elucidation of protein structure and stability. Copyright 2002 John Wiley & Sons, Ltd.

  2. SCPRED: accurate prediction of protein structural class for sequences of twilight-zone similarity with predicting sequences.

    PubMed

    Kurgan, Lukasz; Cios, Krzysztof; Chen, Ke

    2008-05-01

    Protein structure prediction methods provide accurate results when a homologous protein is predicted, while poorer predictions are obtained in the absence of homologous templates. However, some protein chains that share twilight-zone pairwise identity can form similar folds and thus determining structural similarity without the sequence similarity would be desirable for the structure prediction. The folding type of a protein or its domain is defined as the structural class. Current structural class prediction methods that predict the four structural classes defined in SCOP provide up to 63% accuracy for the datasets in which sequence identity of any pair of sequences belongs to the twilight-zone. We propose SCPRED method that improves prediction accuracy for sequences that share twilight-zone pairwise similarity with sequences used for the prediction. SCPRED uses a support vector machine classifier that takes several custom-designed features as its input to predict the structural classes. Based on extensive design that considers over 2300 index-, composition- and physicochemical properties-based features along with features based on the predicted secondary structure and content, the classifier's input includes 8 features based on information extracted from the secondary structure predicted with PSI-PRED and one feature computed from the sequence. Tests performed with datasets of 1673 protein chains, in which any pair of sequences shares twilight-zone similarity, show that SCPRED obtains 80.3% accuracy when predicting the four SCOP-defined structural classes, which is superior when compared with over a dozen recent competing methods that are based on support vector machine, logistic regression, and ensemble of classifiers predictors. The SCPRED can accurately find similar structures for sequences that share low identity with sequence used for the prediction. 
    The high predictive accuracy achieved by SCPRED is attributed to the design of the features, which are capable of separating the structural classes in spite of their low dimensionality. We also demonstrate that the SCPRED's predictions can be successfully used as a post-processing filter to improve performance of modern fold classification methods.

  3. SCPRED: Accurate prediction of protein structural class for sequences of twilight-zone similarity with predicting sequences

    PubMed Central

    Kurgan, Lukasz; Cios, Krzysztof; Chen, Ke

    2008-01-01

    Background Protein structure prediction methods provide accurate results when a homologous protein is predicted, while poorer predictions are obtained in the absence of homologous templates. However, some protein chains that share twilight-zone pairwise identity can form similar folds and thus determining structural similarity without the sequence similarity would be desirable for the structure prediction. The folding type of a protein or its domain is defined as the structural class. Current structural class prediction methods that predict the four structural classes defined in SCOP provide up to 63% accuracy for the datasets in which sequence identity of any pair of sequences belongs to the twilight-zone. We propose SCPRED method that improves prediction accuracy for sequences that share twilight-zone pairwise similarity with sequences used for the prediction. Results SCPRED uses a support vector machine classifier that takes several custom-designed features as its input to predict the structural classes. Based on extensive design that considers over 2300 index-, composition- and physicochemical properties-based features along with features based on the predicted secondary structure and content, the classifier's input includes 8 features based on information extracted from the secondary structure predicted with PSI-PRED and one feature computed from the sequence. Tests performed with datasets of 1673 protein chains, in which any pair of sequences shares twilight-zone similarity, show that SCPRED obtains 80.3% accuracy when predicting the four SCOP-defined structural classes, which is superior when compared with over a dozen recent competing methods that are based on support vector machine, logistic regression, and ensemble of classifiers predictors. Conclusion The SCPRED can accurately find similar structures for sequences that share low identity with sequence used for the prediction. 
    The high predictive accuracy achieved by SCPRED is attributed to the design of the features, which are capable of separating the structural classes in spite of their low dimensionality. We also demonstrate that the SCPRED's predictions can be successfully used as a post-processing filter to improve performance of modern fold classification methods. PMID:18452616

  4. Prediction of pi-turns in proteins using PSI-BLAST profiles and secondary structure information.

    PubMed

    Wang, Yan; Xue, Zhi-Dong; Shi, Xiao-Hong; Xu, Jin

    2006-09-01

    Due to the structural and functional importance of tight turns, some methods have been proposed to predict gamma-turns, beta-turns, and alpha-turns in proteins. In the past, studies of pi-turns were made, but not a single prediction approach has been developed so far. It will be useful to develop a method for identifying pi-turns in a protein sequence. In this paper, the support vector machine (SVM) method has been introduced to predict pi-turns from the amino acid sequence. The training and testing of this approach is performed with a newly collected data set of 640 non-homologous protein chains containing 1931 pi-turns. Different sequence encoding schemes have been explored in order to investigate their effects on the prediction performance. With multiple sequence alignment and predicted secondary structure, the final SVM model yields a Matthews correlation coefficient (MCC) of 0.556 by a 7-fold cross-validation. A web server implementing the prediction method is available at the following URL: http://210.42.106.80/piturn/.
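
    The Matthews correlation coefficient quoted above summarizes a binary confusion matrix in a single value between -1 and 1. The following is a minimal sketch of that standard formula, not the authors' evaluation code; the function name and the example counts are hypothetical and are not taken from the paper.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from binary confusion-matrix counts.

    tp/tn/fp/fn are the numbers of true positives, true negatives,
    false positives and false negatives.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention: report 0 when any row or column of the
        # confusion matrix is empty and the ratio is undefined.
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not results from the paper).
    print(matthews_corrcoef(tp=50, tn=40, fp=10, fn=0))
```

    In a k-fold cross-validation such as the 7-fold scheme mentioned above, the counts would typically be accumulated over the held-out folds before the coefficient is computed, although the abstract does not say which pooling was used.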

  5. STITCHER: Dynamic assembly of likely amyloid and prion β-structures from secondary structure predictions

    PubMed Central

    Bryan, Allen W; O’Donnell, Charles W; Menke, Matthew; Cowen, Lenore J; Lindquist, Susan; Berger, Bonnie

    2012-01-01

    The supersecondary structure of amyloids and prions, proteins of intense clinical and biological interest, are difficult to determine by standard experimental or computational means. In addition, significant conformational heterogeneity is known or suspected to exist in many amyloid fibrils. Previous work has demonstrated that probability-based prediction of discrete β-strand pairs can offer insight into these structures. Here, we devise a system of energetic rules that can be used to dynamically assemble these discrete β-strand pairs into complete amyloid β-structures. The STITCHER algorithm progressively ‘stitches’ strand-pairs into full β-sheets based on a novel free-energy model, incorporating experimentally observed amino-acid side-chain stacking contributions, entropic estimates, and steric restrictions for amyloidal parallel β-sheet construction. A dynamic program computes the top 50 structures and returns both the highest scoring structure and a consensus structure taken by polling this list for common discrete elements. Putative structural heterogeneity can be inferred from sequence regions that compose poorly. Predictions show agreement with experimental models of Alzheimer’s amyloid beta peptide and the Podospora anserina Het-s prion. Predictions of the HET-s homolog HET-S also reflect experimental observations of poor amyloid formation. We put forward predicted structures for the yeast prion Sup35, suggesting N-terminal structural stability enabled by tyrosine ladders, and C-terminal heterogeneity. Predictions for the Rnq1 prion and alpha-synuclein are also given, identifying a similar mix of homogenous and heterogeneous secondary structure elements. STITCHER provides novel insight into the energetic basis of amyloid structure, provides accurate structure predictions, and can help guide future experimental studies. Proteins 2012. © 2011 Wiley Periodicals, Inc. PMID:22095906

  6. STITCHER: Dynamic assembly of likely amyloid and prion β-structures from secondary structure predictions.

    PubMed

    Bryan, Allen W; O'Donnell, Charles W; Menke, Matthew; Cowen, Lenore J; Lindquist, Susan; Berger, Bonnie

    2012-02-01

    The supersecondary structure of amyloids and prions, proteins of intense clinical and biological interest, are difficult to determine by standard experimental or computational means. In addition, significant conformational heterogeneity is known or suspected to exist in many amyloid fibrils. Previous work has demonstrated that probability-based prediction of discrete β-strand pairs can offer insight into these structures. Here, we devise a system of energetic rules that can be used to dynamically assemble these discrete β-strand pairs into complete amyloid β-structures. The STITCHER algorithm progressively 'stitches' strand-pairs into full β-sheets based on a novel free-energy model, incorporating experimentally observed amino-acid side-chain stacking contributions, entropic estimates, and steric restrictions for amyloidal parallel β-sheet construction. A dynamic program computes the top 50 structures and returns both the highest scoring structure and a consensus structure taken by polling this list for common discrete elements. Putative structural heterogeneity can be inferred from sequence regions that compose poorly. Predictions show agreement with experimental models of Alzheimer's amyloid beta peptide and the Podospora anserina Het-s prion. Predictions of the HET-s homolog HET-S also reflect experimental observations of poor amyloid formation. We put forward predicted structures for the yeast prion Sup35, suggesting N-terminal structural stability enabled by tyrosine ladders, and C-terminal heterogeneity. Predictions for the Rnq1 prion and alpha-synuclein are also given, identifying a similar mix of homogenous and heterogeneous secondary structure elements. STITCHER provides novel insight into the energetic basis of amyloid structure, provides accurate structure predictions, and can help guide future experimental studies. Copyright © 2011 Wiley Periodicals, Inc.

  7. Prediction of cis/trans isomerization in proteins using PSI-BLAST profiles and secondary structure information.

    PubMed

    Song, Jiangning; Burrage, Kevin; Yuan, Zheng; Huber, Thomas

    2006-03-09

    The majority of peptide bonds in proteins are found to occur in the trans conformation. However, for proline residues, a considerable fraction of Prolyl peptide bonds adopt the cis form. Proline cis/trans isomerization is known to play a critical role in protein folding, splicing, cell signaling and transmembrane active transport. Accurate prediction of proline cis/trans isomerization in proteins would have many important applications towards the understanding of protein structure and function. In this paper, we propose a new approach to predict the proline cis/trans isomerization in proteins using support vector machine (SVM). The preliminary results indicated that using Radial Basis Function (RBF) kernels could lead to better prediction performance than that of polynomial and linear kernel functions. We used single sequence information of different local window sizes, amino acid compositions of different local sequences, multiple sequence alignment obtained from PSI-BLAST and the secondary structure information predicted by PSIPRED. We explored these different sequence encoding schemes in order to investigate their effects on the prediction performance. The training and testing of this approach was performed on a newly enlarged dataset of 2424 non-homologous proteins determined by X-Ray diffraction method using 5-fold cross-validation. Selecting the window size 11 provided the best performance for determining the proline cis/trans isomerization based on the single amino acid sequence. It was found that using multiple sequence alignments in the form of PSI-BLAST profiles could significantly improve the prediction performance, the prediction accuracy increased from 62.8% with single sequence to 69.8% and Matthews Correlation Coefficient (MCC) improved from 0.26 with single local sequence to 0.40. 
    Furthermore, when coupled with the secondary structure information predicted by PSIPRED, our method yielded a prediction accuracy of 71.5% and an MCC of 0.43, which are about 9 percentage points and 0.17 higher, respectively, than the values achieved with single-sequence information alone. A new method has been developed to predict the proline cis/trans isomerization in proteins based on support vector machine, which used the single amino acid sequence with different local window sizes, the amino acid compositions of local sequence flanking centered proline residues, the position-specific scoring matrices (PSSMs) extracted by PSI-BLAST and the predicted secondary structures generated by PSIPRED. The successful application of the SVM approach in this study reinforced that SVM is a powerful tool in predicting proline cis/trans isomerization in proteins and in biological sequence analysis.

  8. Automated 3D structure composition for large RNAs

    PubMed Central

    Popenda, Mariusz; Szachniuk, Marta; Antczak, Maciej; Purzycka, Katarzyna J.; Lukasiak, Piotr; Bartol, Natalia; Blazewicz, Jacek; Adamiak, Ryszard W.

    2012-01-01

    Understanding the numerous functions that RNAs play in living cells depends critically on knowledge of their three-dimensional structure. Due to the difficulties in experimentally assessing structures of large RNAs, there is currently great demand for new high-resolution structure prediction methods. We present the novel method for the fully automated prediction of RNA 3D structures from a user-defined secondary structure. The concept is founded on the machine translation system. The translation engine operates on the RNA FRABASE database tailored to the dictionary relating the RNA secondary structure and tertiary structure elements. The translation algorithm is very fast. Initial 3D structure is composed in a range of seconds on a single processor. The method assures the prediction of large RNA 3D structures of high quality. Our approach needs neither structural templates nor RNA sequence alignment, required for comparative methods. This enables the building of unresolved yet native and artificial RNA structures. The method is implemented in a publicly available, user-friendly server RNAComposer. It works in an interactive mode and a batch mode. The batch mode is designed for large-scale modelling and accepts atomic distance restraints. Presently, the server is set to build RNA structures of up to 500 residues. PMID:22539264

  9. Capturing non-local interactions by long short-term memory bidirectional recurrent neural networks for improving prediction of protein secondary structure, backbone angles, contact numbers and solvent accessibility.

    PubMed

    Heffernan, Rhys; Yang, Yuedong; Paliwal, Kuldip; Zhou, Yaoqi

    2017-09-15

    The accuracy of predicting protein local and global structural properties such as secondary structure and solvent accessible surface area has been stagnant for many years because of the challenge of accounting for non-local interactions between amino acid residues that are close in three-dimensional structural space but far from each other in their sequence positions. All existing machine-learning techniques relied on a sliding window of 10-20 amino acid residues to capture some 'short to intermediate' non-local interactions. Here, we employed Long Short-Term Memory (LSTM) Bidirectional Recurrent Neural Networks (BRNNs) which are capable of capturing long range interactions without using a window. We showed that the application of LSTM-BRNN to the prediction of protein structural properties makes the most significant improvement for residues with the most long-range contacts (|i-j| >19) over a previous window-based, deep-learning method SPIDER2. Capturing long-range interactions allows the accuracy of three-state secondary structure prediction to reach 84% and the correlation coefficient between predicted and actual solvent accessible surface areas to reach 0.80, plus a reduction of 5%, 10%, 5% and 10% in the mean absolute error for backbone ϕ , ψ , θ and τ angles, respectively, from SPIDER2. More significantly, 27% of 182724 40-residue models directly constructed from predicted C α atom-based θ and τ have similar structures to their corresponding native structures (6Å RMSD or less), which is 3% better than models built by ϕ and ψ angles. We expect the method to be useful for assisting protein structure and function prediction. The method is available as a SPIDER3 server and standalone package at http://sparks-lab.org . yaoqi.zhou@griffith.edu.au or yuedong.yang@griffith.edu.au. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. 

  10. Secondary structural analyses of ITS1 in Paramecium.

    PubMed

    Hoshina, Ryo

    2010-01-01

    The nuclear ribosomal RNA gene operon is interrupted by internal transcribed spacer (ITS) 1 and ITS2. Although the secondary structure of ITS2 has been widely investigated, less is known about ITS1 and its structure. In this study, the secondary structure of ITS1 sequences for Paramecium and other ciliates was predicted. Each Paramecium ITS1 forms an open loop with three helices, A through C. Helix B was highly conserved among Paramecium, and similar helices were found in other ciliates. A phylogenetic analysis using the ITS1 sequences showed high-resolution, implying that ITS1 is a good tool for species-level analyses.

  11. RNApdbee 2.0: multifunctional tool for RNA structure annotation.

    PubMed

    Zok, Tomasz; Antczak, Maciej; Zurkowski, Michal; Popenda, Mariusz; Blazewicz, Jacek; Adamiak, Ryszard W; Szachniuk, Marta

    2018-04-30

    In the field of RNA structural biology and bioinformatics, an access to correctly annotated RNA structure is of crucial importance, especially in the secondary and 3D structure predictions. RNApdbee webserver, introduced in 2014, primarily aimed to address the problem of RNA secondary structure extraction from the PDB files. Its new version, RNApdbee 2.0, is a highly advanced multifunctional tool for RNA structure annotation, revealing the relationship between RNA secondary and 3D structure given in the PDB or PDBx/mmCIF format. The upgraded version incorporates new algorithms for recognition and classification of high-ordered pseudoknots in large RNA structures. It allows analysis of isolated base pairs impact on RNA structure. It can visualize RNA secondary structures-including that of quadruplexes-with depiction of non-canonical interactions. It also annotates motifs to ease identification of stems, loops and single-stranded fragments in the input RNA structure. RNApdbee 2.0 is implemented as a publicly available webserver with an intuitive interface and can be freely accessed at http://rnapdbee.cs.put.poznan.pl/.

  12. Computing the Partition Function for Kinetically Trapped RNA Secondary Structures

    PubMed Central

    Lorenz, William A.; Clote, Peter

    2011-01-01

    An RNA secondary structure is locally optimal if there is no lower energy structure that can be obtained by the addition or removal of a single base pair, where energy is defined according to the widely accepted Turner nearest neighbor model. Locally optimal structures form kinetic traps, since any evolution away from a locally optimal structure must involve energetically unfavorable folding steps. Here, we present a novel, efficient algorithm to compute the partition function over all locally optimal secondary structures of a given RNA sequence. Our software, RNAlocopt, runs in time and space. Additionally, RNAlocopt samples a user-specified number of structures from the Boltzmann subensemble of all locally optimal structures. We apply RNAlocopt to show that (1) the number of locally optimal structures is far fewer than the total number of structures – indeed, the number of locally optimal structures is approximately equal to the square root of the number of all structures, (2) the structural diversity of this subensemble may be either similar to or quite different from the structural diversity of the entire Boltzmann ensemble, a situation that depends on the type of input RNA, (3) the (modified) maximum expected accuracy structure, computed by taking into account base pairing frequencies of locally optimal structures, is a more accurate prediction of the native structure than other current thermodynamics-based methods. The software RNAlocopt constitutes a technical breakthrough in our study of the folding landscape for RNA secondary structures. For the first time, locally optimal structures (kinetic traps in the Turner energy model) can be rapidly generated for long RNA sequences, previously impossible with methods that involved exhaustive enumeration. Use of locally optimal structure leads to state-of-the-art secondary structure prediction, as benchmarked against methods involving the computation of minimum free energy and of maximum expected accuracy.
    Web server and source code available at http://bioinformatics.bc.edu/clotelab/RNAlocopt/. PMID:21297972
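
    The definition of local optimality given above can be illustrated with a deliberately simplified energy model. The sketch below is not RNAlocopt and does not use the Turner nearest neighbor model: it assumes a Nussinov-style energy of -1 per base pair, under which removing a pair always raises the energy, so a structure is locally optimal exactly when no further pair can be added. The names `can_add`, `is_locally_optimal`, `CANONICAL` and `MIN_LOOP` are hypothetical, as is the choice of three unpaired bases as the minimum hairpin loop.

```python
from typing import Set, Tuple

Pair = Tuple[int, int]  # (i, j) with i < j, 0-based sequence positions

# Assumption: Watson-Crick and GU wobble pairs are the only allowed pairs.
CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
# Assumption: a hairpin loop must contain at least this many unpaired bases.
MIN_LOOP = 3


def can_add(seq: str, pairs: Set[Pair], i: int, j: int) -> bool:
    """True if pair (i, j) can be added without violating the structure rules."""
    if j - i <= MIN_LOOP:
        return False  # hairpin loop would be too short
    if (seq[i], seq[j]) not in CANONICAL:
        return False  # bases cannot pair
    for k, l in pairs:
        if len({i, j, k, l}) < 4:
            return False  # one of the bases is already paired
        if k < i < l < j or i < k < j < l:
            return False  # would cross an existing pair (pseudoknot)
    return True


def is_locally_optimal(seq: str, pairs: Set[Pair]) -> bool:
    """Local optimality under the simplified -1-per-pair energy model.

    Removing a pair always increases this energy, so only additions
    need to be checked.
    """
    n = len(seq)
    return not any(
        can_add(seq, pairs, i, j) for i in range(n) for j in range(i + 1, n)
    )


if __name__ == "__main__":
    seq = "GGGAAACCC"
    print(is_locally_optimal(seq, {(0, 8), (1, 7), (2, 6)}))  # full hairpin stem
    print(is_locally_optimal(seq, {(0, 8)}))  # the stem can still be extended
```

    With a realistic nearest neighbor model, removing a pair can also lower the energy, so both moves would have to be tested.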

  13. Prediction of Long Loops with Embedded Secondary Structure using the Protein Local Optimization Program

    PubMed Central

    Miller, Edward B.; Murrett, Colleen S.; Zhu, Kai; Zhao, Suwen; Goldfeld, Dahlia A.; Bylund, Joseph H.; Friesner, Richard A.

    2013-01-01

    Robust homology modeling to atomic-level accuracy requires in the general case successful prediction of protein loops containing small segments of secondary structure. Further, as loop prediction advances to success with larger loops, the exclusion of loops containing secondary structure becomes awkward. Here, we extend the applicability of the Protein Local Optimization Program (PLOP) to loops up to 17 residues in length that contain either helical or hairpin segments. In general, PLOP hierarchically samples conformational space and ranks candidate loops with a high-quality molecular mechanics force field. For loops identified to possess α-helical segments, we employ an alternative dihedral library composed of (ϕ,ψ) angles commonly found in helices. The alternative library is searched over a user-specified range of residues that define the helical bounds. The source of these helical bounds can be from popular secondary structure prediction software or from analysis of past loop predictions where a propensity to form a helix is observed. Due to the maturity of our energy model, the lowest energy loop across all experiments can be selected with an accuracy of sub-Ångström RMSD in 80% of cases, 1.0 to 1.5 Å RMSD in 14% of cases, and poorer than 1.5 Å RMSD in 6% of cases. The effectiveness of our current methods in predicting hairpin-containing loops is explored with hairpins up to 13 residues in length and again reaching an accuracy of sub-Ångström RMSD in 83% of cases, 1.0 to 1.5 Å RMSD in 10% of cases, and poorer than 1.5 Å RMSD in 7% of cases. Finally, we explore the effect of an imprecise surrounding environment, in which side chains, but not the backbone, are initially in perturbed geometries. In these cases, loops perturbed to 3Å RMSD from the native environment were restored to their native conformation with sub-Ångström RMSD. PMID:23814507
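
    The accuracy figures above are expressed as root-mean-square deviation (RMSD) between predicted and native loop coordinates. The following is a minimal sketch of a plain coordinate RMSD and of the three accuracy bins used in the abstract. It assumes both coordinate sets are already in the same reference frame, so no superposition is performed; whether and how PLOP superimposes structures is not stated here. The function names `rmsd` and `accuracy_bin` are hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def rmsd(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Root-mean-square deviation between two equally long coordinate lists.

    Assumption: both lists are in the same frame; no fitting is done.
    """
    if len(a) != len(b) or not a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = sum(
        (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
        for (x1, y1, z1), (x2, y2, z2) in zip(a, b)
    )
    return math.sqrt(total / len(a))


def accuracy_bin(value: float) -> str:
    """Assign an RMSD value (in angstroms) to one of the bins quoted above."""
    if value < 1.0:
        return "sub-angstrom"
    if value <= 1.5:
        return "1.0 to 1.5 angstrom"
    return "poorer than 1.5 angstrom"


if __name__ == "__main__":
    # Hypothetical coordinates for illustration only.
    native = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
    model = [(0.0, 0.5, 0.0), (1.5, 0.5, 0.0), (3.0, 0.5, 0.0)]
    value = rmsd(native, model)
    print(value, accuracy_bin(value))
```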

  14. Three-Dimensional Molecular Modeling of a Diverse Range of SC Clan Serine Proteases

    PubMed Central

    Laskar, Aparna; Chatterjee, Aniruddha; Chatterjee, Somnath; Rodger, Euan J.

    2012-01-01

    Serine proteases are involved in a variety of biological processes and are classified into clans sharing structural homology. Although various three-dimensional structures of SC clan proteases have been experimentally determined, they are mostly bacterial and animal proteases, with some from archaea, plants, and fungi, and as yet no structures have been determined for protozoa. To bridge this gap, we have used molecular modeling techniques to investigate the structural properties of different SC clan serine proteases from a diverse range of taxa. Either SWISS-MODEL was used for homology-based structure prediction or the LOOPP server was used for threading-based structure prediction. The predicted models were refined using Insight II and SCRWL and validated against experimental structures. Investigation of secondary structures and electrostatic surface potential was performed using MOLMOL. The structural geometry of the catalytic core shows clear deviations between taxa, but the relative positions of the catalytic triad residues were conserved. Evolutionary divergence was also exhibited by large variation in secondary structure features outside the core, differences in overall amino acid distribution, and unique surface electrostatic potential patterns between species. Encompassing a wide range of taxa, our structural analysis provides an evolutionary perspective on SC clan serine proteases. PMID:23213528

  15. A systematic review on popularity, application and characteristics of protein secondary structure prediction tools.

    PubMed

    Kashani-Amin, Elaheh; Tabatabaei-Malazy, Ozra; Sakhteman, Amirhossein; Larijani, Bagher; Ebrahim-Habibi, Azadeh

    2018-02-27

    Prediction of proteins' secondary structure is one of the major steps in the generation of homology models. These models provide structural information which is used to design suitable ligands for potential medicinal targets. However, selecting a proper tool between multiple secondary structure prediction (SSP) options is challenging. The current study is an insight onto currently favored methods and tools, within various contexts. A systematic review was performed for a comprehensive access to recent (2013-2016) studies which used or recommended protein SSP tools. Three databases, Web of Science, PubMed and Scopus were systematically searched and 99 out of 209 studies were finally found eligible to extract data. Four categories of applications for 59 retrieved SSP tools were: (I) prediction of structural features of a given sequence, (II) evaluation of a method, (III) providing input for a new SSP method and (IV) integrating a SSP tool as a component for a program. PSIPRED was found to be the most popular tool in all four categories. JPred and tools utilizing PHD (Profile network from HeiDelberg) method occupied second and third places of popularity in categories I and II. JPred was only found in the two first categories, while PHD was present in three fields. This study provides a comprehensive insight about the recent usage of SSP tools which could be helpful for selecting a proper tool's choice. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.

  16. SimRNA: a coarse-grained method for RNA folding simulations and 3D structure prediction.

    PubMed

    Boniecki, Michal J; Lach, Grzegorz; Dawson, Wayne K; Tomala, Konrad; Lukasz, Pawel; Soltysinski, Tomasz; Rother, Kristian M; Bujnicki, Janusz M

    2016-04-20

    RNA molecules play fundamental roles in cellular processes. Their function and interactions with other biomolecules are dependent on the ability to form complex three-dimensional (3D) structures. However, experimental determination of RNA 3D structures is laborious and challenging, and therefore, the majority of known RNAs remain structurally uncharacterized. Here, we present SimRNA: a new method for computational RNA 3D structure prediction, which uses a coarse-grained representation, relies on the Monte Carlo method for sampling the conformational space, and employs a statistical potential to approximate the energy and identify conformations that correspond to biologically relevant structures. SimRNA can fold RNA molecules using only sequence information, and, on established test sequences, it recapitulates secondary structure with high accuracy, including correct prediction of pseudoknots. For modeling of complex 3D structures, it can use additional restraints, derived from experimental or computational analyses, including information about secondary structure and/or long-range contacts. SimRNA also can be used to analyze conformational landscapes and identify potential alternative structures. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  17. PSS-3D1D: an improved 3D1D profile method of protein fold recognition for the annotation of twilight zone sequences.

    PubMed

    Ganesan, K; Parthasarathy, S

    2011-12-01

    Annotation of any newly determined protein sequence depends on the pairwise sequence identity with known sequences. However, for the twilight zone sequences which have only 15-25% identity, the pair-wise comparison methods are inadequate and the annotation becomes a challenging task. Such sequences can be annotated by using methods that recognize their fold. Bowie et al. described a 3D1D profile method in which the amino acid sequences that fold into a known 3D structure are identified by their compatibility to that known 3D structure. We have improved the above method by using the predicted secondary structure information and employ it for fold recognition from the twilight zone sequences. In our Protein Secondary Structure 3D1D (PSS-3D1D) method, a score (w) for the predicted secondary structure of the query sequence is included in finding the compatibility of the query sequence to the known fold 3D structures. In the benchmarks, the PSS-3D1D method shows a maximum of 21% improvement in predicting correctly the α + β class of folds from the sequences with twilight zone level of identity, when compared with the 3D1D profile method. Hence, the PSS-3D1D method could offer more clues than the 3D1D method for the annotation of twilight zone sequences. The web based PSS-3D1D method is freely available in the PredictFold server at http://bioinfo.bdu.ac.in/servers/ .

  18. Dynamic/Jitter Assessment of Multiple Potential HabEx Structural Designs

    NASA Technical Reports Server (NTRS)

    Knight, J. Brent; Stahl, H. Philip; Singleton, Andy; Hunt, Ron; Therrell, Melissa; Caldwell, Kate; Garcia, Jay; Baysinger, Mike

    2017-01-01

    One of the driving structural requirements of the Habitable Exo-Planet (HabEx) telescope is to maintain Line Of Sight (LOS) stability between the Primary Mirror (PM) and Secondary Mirror (SM) of <= 5 mas. Dynamic analyses of two configurations of a proposed (HabEx) 4 meter off-axis telescope structure were performed to predict effects of jitter on primary/secondary mirror alignment. The dynamic disturbance used as the forcing function was the James Webb Space Telescope reaction wheel assembly vibration emission specification level. The objective of these analyses was to predict "order-of-magnitude" performance for various structural configurations which will roll into efforts to define the HabEx structural design's global architecture. Two variations of the basic architectural design were analyzed. Relative motion between the PM and the SM for each design configuration are reported.

  19. Dynamic/jitter assessment of multiple potential HabEx structural designs

    NASA Astrophysics Data System (ADS)

    Knight, J. Brent; Stahl, H. Philip; Singleton, Andy; Hunt, Ron; Therrell, Melissa; Caldwell, Kate; Garcia, Jay; Baysinger, Mike

    2017-09-01

    One of the driving structural requirements of the Habitable Exo-Planet (HabEx) telescope is to maintain Line Of Sight (LOS) stability between the Primary Mirror (PM) and Secondary Mirror (SM) of <= 5 milli-arc seconds (mas). Dynamic analyses of two configurations of a proposed HabEx 4 meter off-axis telescope structure were performed to predict effects of a vibration input on primary/secondary mirror alignment. The dynamic disturbance used as the forcing function was the James Webb Space Telescope reaction wheel assembly vibration emission specification level. The objective of these analyses was to predict "order-of-magnitude" performance for various structural configurations which contribute to efforts in defining the HabEx structural design's global architecture. Two variations of the basic architectural design were analyzed. Relative motion between the PM and the SM for each design configuration are reported.

  20. A novel Multi-Agent Ada-Boost algorithm for predicting protein structural class with the information of protein secondary structure.

    PubMed

    Fan, Ming; Zheng, Bin; Li, Lihua

    2015-10-01

    Knowledge of the structural class of a given protein is important for understanding its folding patterns. Although a lot of efforts have been made, it still remains a challenging problem for prediction of protein structural class solely from protein sequences. The feature extraction and classification of proteins are the main problems in prediction. In this research, we extended our earlier work regarding these two aspects. In protein feature extraction, we proposed a scheme by calculating the word frequency and word position from sequences of amino acid, reduced amino acid, and secondary structure. For an accurate classification of the structural class of protein, we developed a novel Multi-Agent Ada-Boost (MA-Ada) method by integrating the features of Multi-Agent system into Ada-Boost algorithm. Extensive experiments were taken to test and compare the proposed method using four benchmark datasets in low homology. The results showed classification accuracies of 88.5%, 96.0%, 88.4%, and 85.5%, respectively, which are much better compared with the existing methods. The source code and dataset are available on request.

  1. Predicting oligonucleotide affinity to nucleic acid targets.

    PubMed Central

    Mathews, D H; Burkard, M E; Freier, S M; Wyatt, J R; Turner, D H

    1999-01-01

    A computer program, OligoWalk, is reported that predicts the equilibrium affinity of complementary DNA or RNA oligonucleotides to an RNA target. This program considers the predicted stability of the oligonucleotide-target helix and the competition with predicted secondary structure of both the target and the oligonucleotide. Both unimolecular and bimolecular oligonucleotide self structure are considered with a user-defined concentration. The application of OligoWalk is illustrated with three comparisons to experimental results drawn from the literature. PMID:10580474

  2. Visualizing the global secondary structure of a viral RNA genome with cryo-electron microscopy

    PubMed Central

    Garmann, Rees F.; Gopal, Ajaykumar; Athavale, Shreyas S.; Knobler, Charles M.; Gelbart, William M.; Harvey, Stephen C.

    2015-01-01

    The lifecycle, and therefore the virulence, of single-stranded (ss)-RNA viruses is regulated not only by their particular protein gene products, but also by the secondary and tertiary structure of their genomes. The secondary structure of the entire genomic RNA of satellite tobacco mosaic virus (STMV) was recently determined by selective 2′-hydroxyl acylation analyzed by primer extension (SHAPE). The SHAPE analysis suggested a single highly extended secondary structure with much less branching than occurs in the ensemble of structures predicted by purely thermodynamic algorithms. Here we examine the solution-equilibrated STMV genome by direct visualization with cryo-electron microscopy (cryo-EM), using an RNA of similar length transcribed from the yeast genome as a control. The cryo-EM data reveal an ensemble of branching patterns that are collectively consistent with the SHAPE-derived secondary structure model. Thus, our results both elucidate the statistical nature of the secondary structure of large ss-RNAs and give visual support for modern RNA structure determination methods. Additionally, this work introduces cryo-EM as a means to distinguish between competing secondary structure models if the models differ significantly in terms of the number and/or length of branches. Furthermore, with the latest advances in cryo-EM technology, we suggest the possibility of developing methods that incorporate restraints from cryo-EM into the next generation of algorithms for the determination of RNA secondary and tertiary structures. PMID:25752599

  3. Impact of target mRNA structure on siRNA silencing efficiency: A large-scale study.

    PubMed

    Gredell, Joseph A; Berger, Angela K; Walton, S Patrick

    2008-07-01

    The selection of active siRNAs is generally based on identifying siRNAs with certain sequence and structural properties. However, the efficiency of RNA interference has also been shown to depend on the structure of the target mRNA, primarily through studies using exogenous transcripts with well-defined secondary structures in the vicinity of the target sequence. While these studies provide a means for examining the impact of target sequence and structure independently, the predicted secondary structures for these transcripts are often not reflective of structures that form in full-length, native mRNAs where interactions can occur between relatively remote segments of the mRNAs. Here, using a combination of experimental results and analysis of a large dataset, we demonstrate that the accessibility of certain local target structures on the mRNA is an important determinant in the gene silencing ability of siRNAs. siRNAs targeting the enhanced green fluorescent protein were chosen using a minimal siRNA selection algorithm followed by classification based on the predicted minimum free energy structures of the target transcripts. Transfection into HeLa and HepG2 cells revealed that siRNAs targeting regions of the mRNA predicted to have unpaired 5'- and 3'-ends resulted in greater gene silencing than regions predicted to have other types of secondary structure. These results were confirmed by analysis of gene silencing data from previously published siRNAs, which showed that mRNA target regions unpaired at either the 5'-end or 3'-end were silenced, on average, approximately 10% more strongly than target regions unpaired in the center or primarily paired throughout. We found this effect to be independent of the structure of the siRNA guide strand. Taken together, these results suggest minimal requirements for nucleation of hybridization between the siRNA guide strand and mRNA and that both mRNA and guide strand structure should be considered when choosing candidate siRNAs. (c) 2008 Wiley Periodicals, Inc.

  4. Impact of target mRNA structure on siRNA silencing efficiency: a large-scale study

    PubMed Central

    Gredell, Joseph A.; Berger, Angela K.; Walton, S. Patrick

    2009-01-01

    The selection of active siRNAs is generally based on identifying siRNAs with certain sequence and structural properties. However, the efficiency of RNA interference has also been shown to depend on the structure of the target mRNA, primarily through studies using exogenous transcripts with well-defined secondary structures in the vicinity of the target sequence. While these studies provide a means for examining the impact of target sequence and structure independently, the predicted secondary structures for these transcripts are often not reflective of structures that form in full-length, native mRNAs where interactions can occur between relatively remote segments of the mRNAs. Here, using a combination of experimental results and analysis of a large dataset, we demonstrate that the accessibility of certain local target structures on the mRNA is an important determinant in the gene silencing ability of siRNAs. siRNAs targeting the enhanced green fluorescent protein were chosen using a minimal siRNA selection algorithm followed by classification based on the predicted minimum free energy structures of the target transcripts. Transfection into HeLa and HepG2 cells revealed that siRNAs targeting regions of the mRNA predicted to have unpaired 5’- and 3’-ends resulted in greater gene silencing than regions predicted to have other types of secondary structure. These results were confirmed by analysis of gene silencing data from previously published siRNAs, which showed that mRNA target regions unpaired at either the 5’-end or 3’-end were silenced, on average, ~10% more strongly than target regions unpaired in the center or primarily paired throughout. We found this effect to be independent of the structure of the siRNA guide strand. Taken together, these results suggest minimal requirements for nucleation of hybridization between the siRNA guide strand and mRNA and that both mRNA and guide strand structure should be considered when choosing candidate siRNAs. PMID:18306428

  5. Predicting β-Turns in Protein Using Kernel Logistic Regression

    PubMed Central

    Elbashir, Murtada Khalafallah; Sheng, Yu; Wang, Jianxin; Wu, FangXiang; Li, Min

    2013-01-01

    A β-turn is a secondary protein structure type that plays a significant role in protein configuration and function. On average 25% of amino acids in protein structures are located in β-turns. It is very important to develop an accurate and efficient method for β-turns prediction. Most of the current successful β-turns prediction methods use support vector machines (SVMs) or neural networks (NNs). The kernel logistic regression (KLR) is a powerful classification technique that has been applied successfully in many classification problems. However, it is often not found in β-turns classification, mainly because it is computationally expensive. In this paper, we used KLR to obtain sparse β-turns prediction in short evolution time. Secondary structure information and position-specific scoring matrices (PSSMs) are utilized as input features. We achieved Q total of 80.7% and MCC of 50% on BT426 dataset. These results show that KLR method with the right algorithm can yield performance equivalent to or even better than NNs and SVMs in β-turns prediction. In addition, KLR yields probabilistic outcome and has a well-defined extension to multiclass case. PMID:23509793
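
    The two scores quoted in this record, Q total and the Matthews correlation coefficient (MCC), can be illustrated with a minimal sketch. It assumes the usual definitions for a binary turn/non-turn labelling (Q total as the percentage of residues classified correctly, MCC computed from the confusion matrix); the function names and the toy labels are hypothetical and are not taken from the paper.

```python
import math


def confusion_counts(true_labels, predicted_labels):
    """Count TP, FP, TN, FN for binary labels (1 = beta-turn, 0 = non-turn)."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth and pred:
            tp += 1
        elif not truth and pred:
            fp += 1
        elif not truth and not pred:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def q_total(tp, fp, tn, fn):
    """Percentage of residues whose class was predicted correctly."""
    return 100.0 * (tp + tn) / (tp + fp + tn + fn)


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; returns 0.0 when it is undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy example (made-up labels, not data from the BT426 benchmark).
truth = [1, 1, 0, 0, 1, 0, 0, 0]
pred = [1, 0, 0, 0, 1, 1, 0, 0]
counts = confusion_counts(truth, pred)
print(q_total(*counts), mcc(*counts))
```

    MCC is commonly reported alongside Q total because β-turn residues are a minority class, so a predictor that labels everything as non-turn can still reach a high Q total while its MCC stays at zero.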

  6. Predicting β-turns in protein using kernel logistic regression.

    PubMed

    Elbashir, Murtada Khalafallah; Sheng, Yu; Wang, Jianxin; Wu, Fangxiang; Li, Min

    2013-01-01

    A β-turn is a secondary protein structure type that plays a significant role in protein configuration and function. On average 25% of amino acids in protein structures are located in β-turns. It is very important to develop an accurate and efficient method for β-turns prediction. Most of the current successful β-turns prediction methods use support vector machines (SVMs) or neural networks (NNs). The kernel logistic regression (KLR) is a powerful classification technique that has been applied successfully in many classification problems. However, it is often not found in β-turns classification, mainly because it is computationally expensive. In this paper, we used KLR to obtain sparse β-turns prediction in short evolution time. Secondary structure information and position-specific scoring matrices (PSSMs) are utilized as input features. We achieved Q total of 80.7% and MCC of 50% on BT426 dataset. These results show that KLR method with the right algorithm can yield performance equivalent to or even better than NNs and SVMs in β-turns prediction. In addition, KLR yields probabilistic outcome and has a well-defined extension to multiclass case.

  7. A statistical analysis of RNA folding algorithms through thermodynamic parameter perturbation.

    PubMed

    Layton, D M; Bundschuh, R

    2005-01-01

    Computational RNA secondary structure prediction is rather well established. However, such prediction algorithms always depend on a large number of experimentally measured parameters. Here, we study how sensitive structure prediction algorithms are to changes in these parameters. We found that even for changes corresponding to the actual experimental error to which these parameters have been determined, 30% of the structure is falsely predicted, whereas the ground state structure is preserved under parameter perturbation in only 5% of all the cases. We establish that base-pairing probabilities calculated in a thermal ensemble are a viable although not perfect measure for the reliability of the prediction of individual structure elements. Here, a new measure of stability using parameter perturbation is proposed, and its limitations are discussed.

  8. Efficient pairwise RNA structure prediction using probabilistic alignment constraints in Dynalign

    PubMed Central

    2007-01-01

    Background Joint alignment and secondary structure prediction of two RNA sequences can significantly improve the accuracy of the structural predictions. Methods addressing this problem, however, are forced to employ constraints that reduce computation by restricting the alignments and/or structures (i.e. folds) that are permissible. In this paper, a new methodology is presented for the purpose of establishing alignment constraints based on nucleotide alignment and insertion posterior probabilities. Using a hidden Markov model, posterior probabilities of alignment and insertion are computed for all possible pairings of nucleotide positions from the two sequences. These alignment and insertion posterior probabilities are additively combined to obtain probabilities of co-incidence for nucleotide position pairs. A suitable alignment constraint is obtained by thresholding the co-incidence probabilities. The constraint is integrated with Dynalign, a free energy minimization algorithm for joint alignment and secondary structure prediction. The resulting method is benchmarked against the previous version of Dynalign and against other programs for pairwise RNA structure prediction. Results The proposed technique eliminates manual parameter selection in Dynalign and provides significant computational time savings in comparison to prior constraints in Dynalign while simultaneously providing a small improvement in the structural prediction accuracy. Savings are also realized in memory. In experiments over a 5S RNA dataset with average sequence length of approximately 120 nucleotides, the method reduces computation by a factor of 2. The method performs favorably in comparison to other programs for pairwise RNA structure prediction: yielding better accuracy, on average, and requiring significantly fewer computational resources. Conclusion Probabilistic analysis can be utilized in order to automate the determination of alignment constraints for pairwise RNA structure prediction methods in a principled fashion. These constraints can reduce the computational and memory requirements of these methods while maintaining or improving their accuracy of structural prediction. This extends the practical reach of these methods to longer length sequences. The revised Dynalign code is freely available for download. PMID:17445273
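
    The constraint construction described in this record (additively combining alignment and insertion posteriors into a co-incidence probability, then thresholding) can be sketched as follows. This is only an illustration under stated assumptions: the posterior matrices are taken as given rather than computed from a hidden Markov model, and the function name, the matrix layout and the threshold value are hypothetical, not the Dynalign implementation.

```python
def coincidence_constraint(p_align, p_insert_1, p_insert_2, threshold):
    """Return a boolean matrix: True where position pair (i, k) stays permitted.

    p_align[i][k]    -- posterior that nucleotide i of sequence 1 aligns to k of sequence 2
    p_insert_1[i][k] -- posterior of an insertion in sequence 1 at (i, k)
    p_insert_2[i][k] -- posterior of an insertion in sequence 2 at (i, k)
    All three are assumed inputs; the threshold is an arbitrary illustrative value.
    """
    allowed = []
    for i in range(len(p_align)):
        row = []
        for k in range(len(p_align[i])):
            # Additive combination gives the co-incidence probability.
            p_co = p_align[i][k] + p_insert_1[i][k] + p_insert_2[i][k]
            row.append(p_co >= threshold)
        allowed.append(row)
    return allowed


# Toy 2 x 2 example with made-up posteriors.
p_align = [[0.9, 0.0], [0.0, 0.8]]
p_ins_1 = [[0.05, 0.1], [0.0, 0.1]]
p_ins_2 = [[0.0, 0.0], [0.2, 0.05]]
print(coincidence_constraint(p_align, p_ins_1, p_ins_2, threshold=0.15))
```

    Only the position pairs marked True would then need to be explored by the joint folding and alignment recursion, which is where the reported time and memory savings would come from.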

  9. Predicting the helix packing of globular proteins by self-correcting distance geometry.

    PubMed

    Mumenthaler, C; Braun, W

    1995-05-01

    A new self-correcting distance geometry method for predicting the three-dimensional structure of small globular proteins was assessed with a test set of 8 helical proteins. With the knowledge of the amino acid sequence and the helical segments, our completely automated method calculated the correct backbone topology of six proteins. The accuracy of the predicted structures ranged from 2.3 Å to 3.1 Å for the helical segments compared to the experimentally determined structures. For two proteins, the predicted constraints were not restrictive enough to yield a conclusive prediction. The method can be applied to all small globular proteins, provided the secondary structure is known from NMR analysis or can be predicted with high reliability.

  10. Statistical mechanical approach to secondary processes and structural relaxation in glasses and glass formers: a leading model to describe the onset of Johari-Goldstein processes and their relationship with fully cooperative processes.

    PubMed

    Crisanti, A; Leuzzi, L; Paoluzzi, M

    2011-09-01

    The interrelation of dynamic processes active on separated time-scales in glasses and viscous liquids is investigated using a model displaying two time-scale bifurcations both between fast and secondary relaxation and between secondary and structural relaxation. The study of the dynamics allows for predictions on the system relaxation above the temperature of dynamic arrest in the mean-field approximation, that are compared with the outcomes of the equations of motion directly derived within the Mode Coupling Theory (MCT) for under-cooled viscous liquids. By varying the external thermodynamic parameters, a wide range of phenomenology can be represented, from a very clear separation of structural and secondary peak in the susceptibility loss to excess wing structures.

  11. Structural alterations in rat liver proteins due to streptozotocin-induced diabetes and the recovery effect of selenium: Fourier transform infrared microspectroscopy and neural network study

    NASA Astrophysics Data System (ADS)

    Bozkurt, Ozlem; Haman Bayari, Sevgi; Severcan, Mete; Krafft, Christoph; Popp, Jürgen; Severcan, Feride

    2012-07-01

    The relation between protein structural alterations and tissue dysfunction is a major concern as protein fibrillation and/or aggregation due to structural alterations has been reported in many disease states. In the current study, Fourier transform infrared microspectroscopic imaging has been used to investigate diabetes-induced changes on protein secondary structure and macromolecular content in streptozotocin-induced diabetic rat liver. Protein secondary structural alterations were predicted using neural network approach utilizing the amide I region. Moreover, the role of selenium in the recovery of diabetes-induced alterations on macromolecular content and protein secondary structure was also studied. The results revealed that diabetes induced a decrease in lipid to protein and glycogen to protein ratios in diabetic livers. Significant alterations in protein secondary structure were observed with a decrease in α-helical and an increase in β-sheet content. Both doses of selenium restored diabetes-induced changes in lipid to protein and glycogen to protein ratios. However, low-dose selenium supplementation was not sufficient to recover the effects of diabetes on protein secondary structure, while a higher dose of selenium fully restored diabetes-induced alterations in protein structure.

  12. A Deep Learning Network Approach to ab initio Protein Secondary Structure Prediction

    PubMed Central

    Spencer, Matt; Eickholt, Jesse; Cheng, Jianlin

    2014-01-01

    Ab initio protein secondary structure (SS) predictions are utilized to generate tertiary structure predictions, which are increasingly demanded due to the rapid discovery of proteins. Although recent developments have slightly exceeded previous methods of SS prediction, accuracy has stagnated around 80% and many wonder if prediction cannot be advanced beyond this ceiling. Disciplines that have traditionally employed neural networks are experimenting with novel deep learning techniques in attempts to stimulate progress. Since neural networks have historically played an important role in SS prediction, we wanted to determine whether deep learning could contribute to the advancement of this field as well. We developed an SS predictor that makes use of the position-specific scoring matrix generated by PSI-BLAST and deep learning network architectures, which we call DNSS. Graphical processing units and CUDA software optimize the deep network architecture and efficiently train the deep networks. Optimal parameters for the training process were determined, and a workflow comprising three separately trained deep networks was constructed in order to make refined predictions. This deep learning network approach was used to predict SS for a fully independent test data set of 198 proteins, achieving a Q3 accuracy of 80.7% and a Sov accuracy of 74.2%. PMID:25750595
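
    The Q3 figure quoted in this record is a per-residue accuracy over three states. A minimal sketch, assuming the common convention of helix (H), strand (E) and coil (C) labels given as equal-length strings; the function name and the example strings are hypothetical, and the segment-based Sov score is not covered here.

```python
def q3_accuracy(true_ss, pred_ss):
    """Percentage of residues whose three-state label (H/E/C) matches."""
    if len(true_ss) != len(pred_ss):
        raise ValueError("sequences must have the same length")
    if not true_ss:
        raise ValueError("sequences must not be empty")
    correct = sum(t == p for t, p in zip(true_ss, pred_ss))
    return 100.0 * correct / len(true_ss)


# Toy example: 7 of 8 residues agree.
print(q3_accuracy("HHHEECCC", "HHHEECCH"))
```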

  13. A Deep Learning Network Approach to ab initio Protein Secondary Structure Prediction.

    PubMed

    Spencer, Matt; Eickholt, Jesse; Cheng, Jianlin

    2015-01-01

    Ab initio protein secondary structure (SS) predictions are utilized to generate tertiary structure predictions, which are increasingly demanded due to the rapid discovery of proteins. Although recent developments have slightly exceeded previous methods of SS prediction, accuracy has stagnated around 80 percent and many wonder if prediction cannot be advanced beyond this ceiling. Disciplines that have traditionally employed neural networks are experimenting with novel deep learning techniques in attempts to stimulate progress. Since neural networks have historically played an important role in SS prediction, we wanted to determine whether deep learning could contribute to the advancement of this field as well. We developed an SS predictor that makes use of the position-specific scoring matrix generated by PSI-BLAST and deep learning network architectures, which we call DNSS. Graphical processing units and CUDA software optimize the deep network architecture and efficiently train the deep networks. Optimal parameters for the training process were determined, and a workflow comprising three separately trained deep networks was constructed in order to make refined predictions. This deep learning network approach was used to predict SS for a fully independent test dataset of 198 proteins, achieving a Q3 accuracy of 80.7 percent and a Sov accuracy of 74.2 percent.

  14. A search for H/ACA snoRNAs in yeast using MFE secondary structure prediction.

    PubMed

    Edvardsson, Sverker; Gardner, Paul P; Poole, Anthony M; Hendy, Michael D; Penny, David; Moulton, Vincent

    2003-05-01

    Noncoding RNA genes produce functional RNA molecules rather than coding for proteins. One such family is the H/ACA snoRNAs. Unlike the related C/D snoRNAs these have resisted automated detection to date. We develop an algorithm to screen the yeast genome for novel H/ACA snoRNAs. To achieve this, we introduce some new methods for facilitating the search for noncoding RNAs in genomic sequences which are based on properties of predicted minimum free-energy (MFE) secondary structures. The algorithm has been implemented and can be generalized to enable screening of other eukaryote genomes. We find that use of primary sequence alone is insufficient for identifying novel H/ACA snoRNAs. Only the use of secondary structure filters reduces the number of candidates to a manageable size. From genomic context, we identify three strong H/ACA snoRNA candidates. These together with a further 47 candidates obtained by our analysis are being experimentally screened.

  15. SeMPI: a genome-based secondary metabolite prediction and identification web server.

    PubMed

    Zierep, Paul F; Padilla, Natàlia; Yonchev, Dimitar G; Telukunta, Kiran K; Klementz, Dennis; Günther, Stefan

    2017-07-03

    The secondary metabolism of bacteria, fungi and plants yields a vast number of bioactive substances. The constantly increasing amount of published genomic data provides the opportunity for an efficient identification of gene clusters by genome mining. Conversely, for many natural products with resolved structures, the encoding gene clusters have not been identified yet. Even though genome mining tools have become significantly more efficient in the identification of biosynthetic gene clusters, structural elucidation of the actual secondary metabolite is still challenging, especially due to as yet unpredictable post-modifications. Here, we introduce SeMPI, a web server providing a prediction and identification pipeline for natural products synthesized by type I modular polyketide synthases (PKS). In order to limit the possible structures of PKS products and to include putative tailoring reactions, a structural comparison with annotated natural products was introduced. Furthermore, a benchmark was designed based on 40 gene clusters with annotated PKS products. The web server of the pipeline (SeMPI) is freely available at: http://www.pharmaceutical-bioinformatics.de/sempi. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  16. Metamorphic Proteins: Emergence of Dual Protein Folds from One Primary Sequence.

    PubMed

    Lella, Muralikrishna; Mahalakshmi, Radhakrishnan

    2017-06-20

    Every amino acid exhibits a different propensity for distinct structural conformations. Hence, decoding how the primary amino acid sequence undergoes the transition to a defined secondary structure and its final three-dimensional fold is presently considered predictable with reasonable certainty. However, protein sequences that defy the first principles of secondary structure prediction (they attain two different folds) have recently been discovered. Such proteins, aptly named metamorphic proteins, decrease the conformational constraint by increasing flexibility in the secondary structure and thereby result in efficient functionality. In this review, we discuss the major factors driving the conformational switch related both to protein sequence and to structure using illustrative examples. We discuss the concept of an evolutionary transition in sequence and structure, the functional impact of the tertiary fold, and the pressure of intrinsic and external factors that give rise to metamorphic proteins. We mainly focus on the major components of protein architecture, namely, the α-helix and β-sheet segments, which are involved in conformational switching within the same or highly similar sequences. These chameleonic sequences are widespread in both cytosolic and membrane proteins, and these folds are equally important for protein structure and function. We discuss the implications of metamorphic proteins and chameleonic peptide sequences in de novo peptide design.

  17. Modeling the Role of Alkanes, Polycyclic Aromatic Hydrocarbons, and Their Oligomers in Secondary Organic Aerosol Formation

    EPA Science Inventory

    A computationally efficient method to treat secondary organic aerosol (SOA) from various length and structure alkanes as well as SOA from polycyclic aromatic hydrocarbons (PAHs) is implemented in the Community Multiscale Air Quality (CMAQ) model to predict aerosol concentrations ...

  18. Functional formation of domain V of the poliovirus noncoding region: significance of unpaired bases.

    PubMed

    Rowe, A; Burlison, J; Macadam, A J; Minor, P D

    2001-10-10

    Previously we have shown that polioviruses with mutations that disrupt the predicted secondary structure of the 5' noncoding region of domain V are temperature sensitive for growth. Non-temperature-sensitive revertant viruses had mutations that re-formed secondary structure by a direct back mutation of changes in the opposite strand. We mutated unpaired regions and selected revertants of viruses with single base deletions, where no obvious back mutation was available in order to gain information on secondary structure. Results indicated that conservation of length of a three base loop between two double-stranded stems was essential for a functional domain V to form. The requirement for the unpaired "hinge" base at 484 which is implicated in the attenuation of Sabin 2 was also confirmed. Results also underline the necessity for functional folding over local secondary structure stability. Copyright 2001 Academic Press.

  19. Tertiary structure-based analysis of microRNA–target interactions

    PubMed Central

    Gan, Hin Hark; Gunsalus, Kristin C.

    2013-01-01

    Current computational analysis of microRNA interactions is based largely on primary and secondary structure analysis. Computationally efficient tertiary structure-based methods are needed to enable more realistic modeling of the molecular interactions underlying miRNA-mediated translational repression. We incorporate algorithms for predicting duplex RNA structures, ionic strength effects, duplex entropy and free energy, and docking of duplex–Argonaute protein complexes into a pipeline to model and predict miRNA–target duplex binding energies. To ensure modeling accuracy and computational efficiency, we use an all-atom description of RNA and a continuum description of ionic interactions using the Poisson–Boltzmann equation. Our method predicts the conformations of two constructs of Caenorhabditis elegans let-7 miRNA–target duplexes to an accuracy of ∼3.8 Å root mean square distance of their NMR structures. We also show that the computed duplex formation enthalpies, entropies, and free energies for eight miRNA–target duplexes agree with titration calorimetry data. Analysis of duplex–Argonaute docking shows that structural distortions arising from single-base-pair mismatches in the seed region influence the activity of the complex by destabilizing both duplex hybridization and its association with Argonaute. Collectively, these results demonstrate that tertiary structure-based modeling of miRNA interactions can reveal structural mechanisms not accessible with current secondary structure-based methods. PMID:23417009

  20. RNAmutants: a web server to explore the mutational landscape of RNA secondary structures

    PubMed Central

    Waldispühl, Jerome; Devadas, Srinivas; Berger, Bonnie; Clote, Peter

    2009-01-01

    The history and mechanism of molecular evolution in DNA have been greatly elucidated by contributions from genetics, probability theory and bioinformatics—indeed, mathematical developments such as Kimura's neutral theory, Kingman's coalescent theory and efficient software such as BLAST, ClustalW, Phylip, etc., provide the foundation for modern population genetics. In contrast to DNA, the function of most noncoding RNA depends on tertiary structure, experimentally known to be largely determined by secondary structure, for which dynamic programming can efficiently compute the minimum free energy secondary structure. For this reason, understanding the effect of pointwise mutations in RNA secondary structure could reveal fundamental properties of structural RNA molecules and improve our understanding of molecular evolution of RNA. The web server RNAmutants provides several efficient tools to compute the ensemble of low-energy secondary structures for all k-mutants of a given RNA sequence, where k is bounded by a user-specified upper bound. As we have previously shown, these tools can be used to predict putative deleterious mutations and to analyze regulatory sequences from the hepatitis C and human immunodeficiency genomes. Web server is available at http://bioinformatics.bc.edu/clotelab/RNAmutants/, and downloadable binaries at http://rnamutants.csail.mit.edu/. PMID:19531740
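
    To give a sense of why the k-mutant ensemble in this record calls for efficient algorithms rather than enumeration, the number of sequences involved can be counted directly. The sketch below assumes the four-letter RNA alphabet, so each mutated position has three alternative nucleotides; the function names are hypothetical and the counting is not part of the RNAmutants software.

```python
import math


def exact_k_mutants(length, k):
    """Number of sequences differing from the wild type at exactly k positions."""
    # Choose the k mutated positions, then one of 3 other nucleotides at each.
    return math.comb(length, k) * 3 ** k


def up_to_k_mutants(length, k_max):
    """Number of sequences with at most k_max point mutations (wild type included)."""
    return sum(exact_k_mutants(length, k) for k in range(k_max + 1))


# A 10-nucleotide sequence: exactly 2 mutations, then at most 2 mutations.
print(exact_k_mutants(10, 2), up_to_k_mutants(10, 2))
```

    The count grows combinatorially with both sequence length and k, so folding every mutant separately quickly becomes impractical.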

  1. 3x2 Classroom Goal Structures, Motivational Regulations, Self-Concept, and Affectivity in Secondary School.

    PubMed

    Méndez-Giménez, Antonio; Cecchini-Estrada, José-Antonio; Fernández-Río, Javier; Prieto Saborit, José Antonio; Méndez-Alonso, David

    2017-09-20

    The main objective was to analyze relationships and predictive patterns between 3x2 classroom goal structures (CGS), and motivational regulations, dimensions of self-concept, and affectivity in the context of secondary education. A sample of 1,347 secondary school students (56.6% young men, 43.4% young women) from 10 different provinces of Spain agreed to participate (M age = 13.43, SD = 1.05). Hierarchical regression analyses indicated the self-approach CGS was the most adaptive within the spectrum of self-determination, followed by the task-approach CGS. The other-approach CGS had an ambivalent influence on motivation. Task-approach and self-approach CGS predicted academic self-concept (p < .01; p < .001, respectively; R² = .134), and both along with other-approach CGS (negatively) predicted family self-concept (p < .05; p < .001; p < .01, respectively; R² = .064). Physical self-concept was predicted by the task-approach and other-approach CGS's (p < .05; p < .001, respectively; R² = .078). Finally, positive affect was predicted by all three approach-oriented CGS's (p < .001; R² = .137), whereas negative affect was predicted by other-approach (positively) and self-approach (negatively) CGS (p < .001; p < .05, respectively; R² = .028). These results expand the 3x2 achievement goal framework to include environmental factors, and reiterate that teachers should focus on raising levels of self- and task-based goals for students in their classes.

  2. Validation Evidence of the Motivation for Teaching Scale in Secondary Education.

    PubMed

    Abós, Ángel; Sevil, Javier; Martín-Albo, José; Aibar, Alberto; García-González, Luis

    2018-04-10

    Grounded in self-determination theory, the aim of this study was to develop a scale with adequate psychometric properties to assess motivation for teaching and to explain some outcomes of secondary education teachers at work. The sample comprised 584 secondary education teachers. Analyses supported the five-factor model (intrinsic motivation, identified regulation, introjected regulation, external regulation and amotivation) and indicated the presence of a continuum of self-determination. Evidence of reliability was provided by Cronbach's alpha, composite reliability and average variance extracted. Multigroup confirmatory factor analyses supported the partial invariance (configural and metric) of the scale in different sub-samples, in terms of gender and type of school. Concurrent validity was analyzed by structural equation modeling that explained 71% of the work dedication variance and 69% of the boredom at work variance. Work dedication was positively predicted by intrinsic motivation (β = .56, p < .001) and external regulation (β = .29, p < .001) and negatively predicted by introjected regulation (β = -.22, p < .001) and amotivation (β = -.49, p < .001). Boredom at work was negatively predicted by intrinsic motivation (β = -.28, p < .005) and positively predicted by amotivation (β = .68, p < .001). The Motivation for Teaching Scale in Secondary Education (Spanish acronym EME-ES, Escala de Motivación por la Enseñanza en Educación Secundaria) is discussed as a valid and reliable instrument. This is the first specific scale in the work context of secondary teachers that has integrated the five-factor structure together with their dedication and boredom at work.

  3. RaptorX server: a resource for template-based protein structure modeling.

    PubMed

    Källberg, Morten; Margaryan, Gohar; Wang, Sheng; Ma, Jianzhu; Xu, Jinbo

    2014-01-01

    Assigning functional properties to a newly discovered protein is a key challenge in modern biology. To this end, computational modeling of the three-dimensional atomic arrangement of the amino acid chain is often crucial in determining the role of the protein in biological processes. We present a community-wide web-based protocol, RaptorX server ( http://raptorx.uchicago.edu ), for automated protein secondary structure prediction, template-based tertiary structure modeling, and probabilistic alignment sampling. Given a target sequence, RaptorX server is able to detect even remotely related template sequences by means of a novel nonlinear context-specific alignment potential and probabilistic consistency algorithm. Using the protocol presented here it is thus possible to obtain high-quality structural models for many target protein sequences when only distantly related protein domains have experimentally solved structures. At present, RaptorX server can perform secondary and tertiary structure prediction of a 200 amino acid target sequence in approximately 30 min.

  4. Principles for Predicting RNA Secondary Structure Design Difficulty.

    PubMed

    Anderson-Lee, Jeff; Fisker, Eli; Kosaraju, Vineet; Wu, Michelle; Kong, Justin; Lee, Jeehyung; Lee, Minjae; Zada, Mathew; Treuille, Adrien; Das, Rhiju

    2016-02-27

    Designing RNAs that form specific secondary structures is enabling better understanding and control of living systems through RNA-guided silencing, genome editing and protein organization. Little is known, however, about which RNA secondary structures might be tractable for downstream sequence design, increasing the time and expense of design efforts due to inefficient secondary structure choices. Here, we present insights into specific structural features that increase the difficulty of finding sequences that fold into a target RNA secondary structure, summarizing the design efforts of tens of thousands of human participants and three automated algorithms (RNAInverse, INFO-RNA and RNA-SSD) in the Eterna massive open laboratory. Subsequent tests through three independent RNA design algorithms (NUPACK, DSS-Opt and MODENA) confirmed the hypothesized importance of several features in determining design difficulty, including sequence length, mean stem length, symmetry and specific difficult-to-design motifs such as zigzags. Based on these results, we have compiled an Eterna100 benchmark of 100 secondary structure design challenges that span a large range in design difficulty to help test future efforts. Our in silico results suggest new routes for improving computational RNA design methods and for extending these insights to assess "designability" of single RNA structures, as well as of switches for in vitro and in vivo applications. Copyright © 2015 The Authors. Published by Elsevier Ltd. All rights reserved.

  5. Understanding Quality of Life in Adults with Spinal Cord Injury Via SCI-Related Needs and Secondary Complications.

    PubMed

    Sweet, Shane N; Noreau, Luc; Leblond, Jean; Dumont, Frédéric S

    2014-01-01

    Understanding the factors that can predict greater quality of life (QoL) is important for adults with spinal cord injury (SCI), given that they report lower levels of QoL than the general population. The objective was to build a conceptual model linking SCI-related needs, secondary complications, and QoL in adults with SCI. Prior to testing the conceptual model, we aimed to develop and evaluate the factor structure for both SCI-related needs and secondary complications. Individuals with a traumatic SCI (N = 1,137) responded to an online survey measuring 13 SCI-related needs, 13 secondary complications, and the Life Satisfaction Questionnaire to assess QoL. The SCI-related needs and secondary complications were conceptualized into factors, tested with a confirmatory factor analysis, and subsequently evaluated in a structural equation model to predict QoL. The confirmatory factor analysis supported a 2-factor model for SCI-related needs, χ²(61, N = 1,137) = 250.40, P < .001, comparative fit index (CFI) = .93, root mean square error of approximation (RMSEA) = .05, standardized root mean square residual (SRMR) = .04, and for 11 of the 13 secondary complications, χ²(44, N = 1,137) = 305.67, P < .001, CFI = .91, RMSEA = .060, SRMR = .033. The final 2 secondary complications were kept as observed constructs. In the structural model, both vital and personal development unmet SCI-related needs (β = -.22 and -.20, P < .05, respectively) and the neuro-physiological systems factor (β = -.45, P < .05) were negatively related with QoL. Identifying unmet SCI-related needs of individuals with SCI and preventing or managing secondary complications are essential to their QoL.

  6. Prediction of RNA secondary structures: from theory to models and real molecules

    NASA Astrophysics Data System (ADS)

    Schuster, Peter

    2006-05-01

    RNA secondary structures are derived from RNA sequences, which are strings built from the natural four letter nucleotide alphabet, {AUGC}. These coarse-grained structures, in turn, are tantamount to constrained strings over a three letter alphabet. Hence, the secondary structures are discrete objects and the number of sequences always exceeds the number of structures. The sequences built from two letter alphabets form perfect structures when the nucleotides can form a base pair, as is the case with {GC} or {AU}, but the relation between the sequences and structures differs strongly from the four letter alphabet. A comprehensive theory of RNA structure is presented, which is based on the concepts of sequence space and shape space, being a space of structures. It sets the stage for modelling processes in ensembles of RNA molecules like evolutionary optimization or kinetic folding as dynamical phenomena guided by mappings between the two spaces. The number of minimum free energy (mfe) structures is always smaller than the number of sequences, even for two letter alphabets. Folding of RNA molecules into mfe energy structures constitutes a non-invertible mapping from sequence space onto shape space. The preimage of a structure in sequence space is defined as its neutral network. Similarly the set of suboptimal structures is the preimage of a sequence in shape space. This set represents the conformation space of a given sequence. The evolutionary optimization of structures in populations is a process taking place in sequence space, whereas kinetic folding occurs in molecular ensembles that optimize free energy in conformation space. Efficient folding algorithms based on dynamic programming are available for the prediction of secondary structures for given sequences. The inverse problem, the computation of sequences for predefined structures, is an important tool for the design of RNA molecules with tailored properties.
Simultaneous folding or cofolding of two or more RNA molecules can be modelled readily at the secondary structure level and allows prediction of the most stable (mfe) conformations of complexes together with suboptimal states. Cofolding algorithms are important tools for efficient and highly specific primer design in the polymerase chain reaction (PCR) and help to explain the mechanisms of small interference RNA (si-RNA) molecules in gene regulation. The evolutionary optimization of RNA structures is illustrated by the search for a target structure and mimics aptamer selection in evolutionary biotechnology. It occurs typically in steps consisting of short adaptive phases interrupted by long epochs of little or no obvious progress in optimization. During these quasi-stationary epochs the populations are essentially confined to neutral networks where they search for sequences that allow a continuation of the adaptive process. Modelling RNA evolution as a simultaneous process in sequence and shape space provides answers to questions of the optimal population size and mutation rates. Kinetic folding is a stochastic process in conformation space. Exact solutions are derived by direct simulation in the form of trajectory sampling or by solving the master equation. The exact solutions can be approximated straightforwardly by Arrhenius kinetics on barrier trees, which represent simplified versions of conformational energy landscapes. The existence of at least one sequence forming any arbitrarily chosen pair of structures is granted by the intersection theorem. Folding kinetics is the key to understanding and designing multistable RNA molecules or RNA switches. These RNAs form two or more long lived conformations, and conformational changes occur either spontaneously or are induced through binding of small molecules or other biopolymers. RNA switches are found in nature where they act as elements in genetic and metabolic regulation. 
The reliability of RNA secondary structure prediction is limited by the accuracy with which the empirical parameters can be determined and by principal deficiencies, for example by the lack of energy contributions resulting from tertiary interactions. In addition, native structures may be determined by folding kinetics rather than by thermodynamics. We address the first problem by considering base pair probabilities or base pairing entropies, which are derived from the partition function of conformations. A high base pair probability corresponding to a low pairing entropy is taken as an indicator of a high reliability of prediction. Pseudoknots are discussed as an example of a tertiary interaction that is highly important for RNA function. Moreover, pseudoknot formation is readily incorporated into structure prediction algorithms. Some examples of experimental data on RNA secondary structures that are readily explained using the landscape concept are presented. They deal with (i) properties of RNA molecules with random sequences, (ii) RNA molecules from restricted alphabets, (iii) existence of neutral networks, (iv) shape space covering, (v) riboswitches and (vi) evolution of non-coding RNAs as an example of evolution restricted to neutral networks.
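
    The link between base pair probabilities and pairing entropy mentioned in this abstract can be illustrated with a short Python sketch. It assumes the commonly used per-position definition S_i = -sum_j p_ij ln p_ij - p_i0 ln p_i0, where p_i0 is the probability that position i is unpaired; the abstract itself does not spell out the formula, and the function name pairing_entropy and the input format are hypothetical.

```python
import math


def pairing_entropy(pair_probs, n):
    """Return the per-position pairing entropy (in nats) for a sequence of length n.

    pair_probs maps 0-based index pairs (i, j) with i < j to the
    probability that positions i and j are paired (assumed input format).
    """
    # Collect, for every position, the probabilities of each possible partner.
    partners = [[] for _ in range(n)]
    for (i, j), p in pair_probs.items():
        partners[i].append(p)
        partners[j].append(p)

    entropies = []
    for probs in partners:
        # Whatever probability mass is left belongs to the unpaired state.
        unpaired = max(0.0, 1.0 - sum(probs))
        s = -sum(p * math.log(p) for p in probs + [unpaired] if p > 0.0)
        entropies.append(s)
    return entropies
```

    Under this definition a position that is paired with one partner at probability 1, or that is certainly unpaired, has entropy 0, whereas a position split evenly between one partner and the unpaired state has entropy ln 2. Low values therefore mark positions where the predicted pairing state is well defined.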

  7. Factors Predicting Burnout Among Chaplains: Compassion Satisfaction, Organizational Factors, and the Mediators of Mindful Self-Care and Secondary Traumatic Stress.

    PubMed

    Hotchkiss, Jason T; Lesher, Ruth

    2018-06-01

    This study predicted Burnout from the self-care practices, compassion satisfaction, secondary traumatic stress, and organizational factors among chaplains who participated from all 50 states (N = 534). A hierarchical regression model indicated that the combined effect of compassion satisfaction, secondary traumatic stress, mindful self-care, demographic, and organizational factors explained 83.2% of the variance in Burnout. Chaplains serving in a hospital were slightly more at risk for Burnout than those in hospice or other settings. Organizational factors that most predicted Burnout were feeling bogged down by the "system" (25.7%) and an overwhelming caseload (19.9%). Each self-care category was a statistically significant protective factor against Burnout risk. The strongest protective factors against Burnout in order of strength were self-compassion and purpose, supportive structure, mindful self-awareness, mindful relaxation, supportive relationships, and physical care. For secondary traumatic stress, supportive structure, mindful self-awareness, and self-compassion and purpose were the strongest protective factors. Chaplains who engaged in multiple and frequent self-care strategies experienced higher professional quality of life and low Burnout risk. In the chaplain's journey toward wellness, a reflective practice of feeling good about doing good and mindful self-care are vital. The significance, implications, and limitations of the study were discussed.

  8. Ab initio NMR Confirmed Evolutionary Structure Prediction for Organic Molecular Crystals

    NASA Astrophysics Data System (ADS)

    Pham, Cong-Huy; Kucukbenli, Emine; de Gironcoli, Stefano

    2015-03-01

    Ab initio crystal structure prediction of even small organic compounds is extremely challenging due to polymorphism, molecular flexibility and difficulties in addressing the dispersion interaction from first principles. We recently implemented vdW-aware density functionals and demonstrated their success in energy ordering of amino acid crystals. In this work we combine this development with the evolutionary structure prediction method to study cholesterol polymorphs. Cholesterol crystals have paramount importance in various diseases, from cancer to atherosclerosis. The structures of some polymorphs (e.g. ChM, ChAl, ChAh) have already been resolved while some others, which display distinct NMR spectra and are involved in disease formation, are yet to be determined. Here we thoroughly assess the applicability of evolutionary structure prediction to address such real world problems. We validate the newly predicted structures with ab initio NMR chemical shift data using secondary referencing for an improved comparison with experiments.

  9. Structure Prediction of the Second Extracellular Loop in G-Protein-Coupled Receptors

    PubMed Central

    Kmiecik, Sebastian; Jamroz, Michal; Kolinski, Michal

    2014-01-01

    G-protein-coupled receptors (GPCRs) play key roles in living organisms. Therefore, it is important to determine their functional structures. The second extracellular loop (ECL2) is a functionally important region of GPCRs, which poses a significant challenge for computational structure prediction methods. In this work, we evaluated CABS, a well-established protein modeling tool, for predicting ECL2 structure in 13 GPCRs. The ECL2s (with between 13 and 34 residues) are predicted in an environment of other extracellular loops being fully flexible and the transmembrane domain fixed in its x-ray conformation. The modeling procedure used theoretical predictions of ECL2 secondary structure and experimental constraints on disulfide bridges. Our approach yielded ensembles of low-energy conformers and the most populated conformers that contained models close to the available x-ray structures. The level of similarity between the predicted models and x-ray structures is comparable to that of other state-of-the-art computational methods. Our results extend other studies by including newly crystallized GPCRs. PMID:24896119

  10. fRMSDPred: Predicting Local RMSD Between Structural Fragments Using Sequence Information

    DTIC Science & Technology

    2007-04-04

    machine learning approaches for estimating the RMSD value of a pair of protein fragments. These estimated fragment-level RMSD values can be used to construct the alignment, assess the quality of an alignment, and identify high-quality alignment segments. We present algorithms to solve this fragment-level RMSD prediction problem using a supervised learning framework based on support vector regression and classification that incorporates protein profiles, predicted secondary structure, effective information encoding schemes, and novel second-order pairwise exponential kernel

  11. [Comparative analysis of clustered regularly interspaced short palindromic repeats (CRISPRs) loci in the genomes of halophilic archaea].

    PubMed

    Zhang, Fan; Zhang, Bing; Xiang, Hua; Hu, Songnian

    2009-11-01

    Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) is a widespread system that provides acquired resistance against phages in bacteria and archaea. Here we aim to perform a genome-wide analysis of CRISPR in the extremely halophilic archaea whose whole genome sequences are currently available. We used bioinformatics methods including alignment, conservation analysis, GC content and RNA structure prediction to analyze the CRISPR structures of 7 haloarchaeal genomes. We identified the CRISPR structures in 5 halophilic archaea and revealed a conserved palindromic motif in the flanking regions of these CRISPR structures. In addition, we found that the repeat sequences of large CRISPR structures in halophilic archaea were highly conserved, and two types of predicted RNA secondary structures derived from the repeat sequences were likely determined by the fourth base of the repeat sequence. Our results support the proposal that the leader sequence may function as a recognition site by having palindromic structures in flanking regions, and the stem-loop secondary structure formed by repeat sequences may function in mediating the interaction between foreign genetic elements and CAS-encoded proteins.

  12. SMARTIV: combined sequence and structure de-novo motif discovery for in-vivo RNA binding data.

    PubMed

    Polishchuk, Maya; Paz, Inbal; Yakhini, Zohar; Mandel-Gutfreund, Yael

    2018-05-25

    Gene expression regulation is highly dependent on binding of RNA-binding proteins (RBPs) to their RNA targets. Growing evidence supports the notion that both RNA primary sequence and its local secondary structure play a role in specific Protein-RNA recognition and binding. Despite the great advance in high-throughput experimental methods for identifying sequence targets of RBPs, predicting the specific sequence and structure binding preferences of RBPs remains a major challenge. We present a novel webserver, SMARTIV, designed for discovering and visualizing combined RNA sequence and structure motifs from high-throughput RNA-binding data, generated from in-vivo experiments. The uniqueness of SMARTIV is that it predicts motifs from enriched k-mers that combine information from ranked RNA sequences and their predicted secondary structure, obtained using various folding methods. Consequently, SMARTIV generates Position Weight Matrices (PWMs) in a combined sequence and structure alphabet with assigned P-values. SMARTIV concisely represents the sequence and structure motif content as a single graphical logo, which is informative and easy for visual perception. SMARTIV was examined extensively on a variety of high-throughput binding experiments for RBPs from different families, generated from different technologies, showing consistent and accurate results. Finally, SMARTIV is a user-friendly webserver, highly efficient in run-time and freely accessible via http://smartiv.technion.ac.il/.

  13. Unraveling the meaning of chemical shifts in protein NMR.

    PubMed

    Berjanskii, Mark V; Wishart, David S

    2017-11-01

    Chemical shifts are among the most informative parameters in protein NMR. They provide a wealth of information about protein secondary and tertiary structure, protein flexibility, and protein-ligand binding. In this report, we review the progress in interpreting and utilizing protein chemical shifts that has occurred over the past 25 years, with a particular focus on the large body of work arising from our group and other Canadian NMR laboratories. More specifically, this review focuses on describing, assessing, and providing some historical context for various chemical shift-based methods to: (1) determine protein secondary and super-secondary structure; (2) derive protein torsion angles; (3) assess protein flexibility; (4) predict residue accessible surface area; (5) refine 3D protein structures; (6) determine 3D protein structures and (7) characterize intrinsically disordered proteins. This review also briefly covers some of the methods that we previously developed to predict chemical shifts from 3D protein structures and/or protein sequence data. It is hoped that this review will help to increase awareness of the considerable utility of NMR chemical shifts in structural biology and facilitate more widespread adoption of chemical-shift based methods by the NMR spectroscopists, structural biologists, protein biophysicists, and biochemists worldwide. This article is part of a Special Issue entitled: Biophysics in Canada, edited by Lewis Kay, John Baenziger, Albert Berghuis and Peter Tieleman. Copyright © 2017 Elsevier B.V. All rights reserved.

  14. CONFOLD2: improved contact-driven ab initio protein structure modeling.

    PubMed

    Adhikari, Badri; Cheng, Jianlin

    2018-01-25

    Contact-guided protein structure prediction methods are becoming more and more successful because of the latest advances in residue-residue contact prediction. To support contact-driven structure prediction, effective tools that can quickly build tertiary structural models of good quality from predicted contacts need to be developed. We develop an improved contact-driven protein modelling method, CONFOLD2, and study how it may be effectively used for ab initio protein structure prediction with predicted contacts as input. It builds models using various subsets of input contacts to explore the fold space under the guidance of a soft square energy function, and then clusters the models to obtain the top five models. CONFOLD2 obtains an average reconstruction accuracy of 0.57 TM-score for the 150 proteins in the PSICOV contact prediction dataset. When benchmarked on the CASP11 contacts predicted using CONSIP2 and CASP12 contacts predicted using Raptor-X, CONFOLD2 achieves a mean TM-score of 0.41 on both datasets. CONFOLD2 can quickly generate the top five structural models for a protein sequence when its secondary structure and contact predictions are at hand. The source code of CONFOLD2 is publicly available at https://github.com/multicom-toolbox/CONFOLD2/ .

  15. Base pair probability estimates improve the prediction accuracy of RNA non-canonical base pairs

    PubMed Central

    2017-01-01

    Prediction of RNA tertiary structure from sequence is an important problem, but generating accurate structure models for even short sequences remains difficult. Predictions of RNA tertiary structure tend to be least accurate in loop regions, where non-canonical pairs are important for determining the details of structure. Non-canonical pairs can be predicted using a knowledge-based model of structure that scores nucleotide cyclic motifs, or NCMs. In this work, a partition function algorithm is introduced that allows the estimation of base pairing probabilities for both canonical and non-canonical interactions. Pairs that are predicted to be probable are more likely to be found in the true structure than pairs of lower probability. Pair probability estimates can be further improved by predicting the structure conserved across multiple homologous sequences using the TurboFold algorithm. These pairing probabilities, used in concert with prior knowledge of the canonical secondary structure, allow accurate inference of non-canonical pairs, an important step towards accurate prediction of the full tertiary structure. Software to predict non-canonical base pairs and pairing probabilities is now provided as part of the RNAstructure software package. PMID:29107980
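
    The statement that more probable pairs are more likely to appear in the true structure can be checked, for any set of predictions, with a simple thresholding experiment. The Python sketch below is only an illustration of that idea and is not part of the RNAstructure package; the function names and the dictionary input format are hypothetical.

```python
def confident_pairs(pair_probs, threshold):
    """Return the set of pairs whose estimated probability is at least threshold."""
    return {pair for pair, p in pair_probs.items() if p >= threshold}


def ppv_and_sensitivity(predicted, reference):
    """Compare a set of predicted pairs with a set of reference (known) pairs.

    PPV is the fraction of predicted pairs found in the reference;
    sensitivity is the fraction of reference pairs that were predicted.
    """
    true_positives = len(predicted & reference)
    ppv = true_positives / len(predicted) if predicted else 0.0
    sensitivity = true_positives / len(reference) if reference else 0.0
    return ppv, sensitivity
```

    Raising the threshold typically trades sensitivity for positive predictive value, so evaluating both numbers over a range of thresholds shows how informative the probability estimates are for a given data set.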

  16. Structural characterization of the α-mating factor prepro-peptide for secretion of recombinant proteins in Pichia pastoris.

    PubMed

    Chahal, Sabreen; Wei, Peter; Moua, Pachai; Park, Sung Pil James; Kwon, Janet; Patel, Arth; Vu, Anthony T; Catolico, Jason A; Tsai, Yu Fang Tina; Shaheen, Nadia; Chu, Tiffany T; Tam, Vivian; Khan, Zill-E-Huma; Joo, Hyun Henry; Xue, Liang; Lin-Cereghino, Joan; Tsai, Jerry W; Lin-Cereghino, Geoff P

    2017-01-20

    The methylotrophic yeast Pichia pastoris has been used extensively for expressing recombinant proteins because it combines the ease of genetic manipulation, the ability to provide complex posttranslational modifications and the capacity for efficient protein secretion. The most successful and commonly used secretion signal leader in Pichia pastoris has been the alpha mating factor (MATα) prepro secretion signal. However, limitations exist as some proteins cannot be secreted efficiently, leading to strategies to enhance secretion efficiency by modifying the secretion signal leader. Based on a Jpred secondary structure prediction and knob-socket modeling of tertiary structure, numerous deletions and duplications of the MATα prepro leader were engineered to evaluate the correlation between predicted secondary structure and the secretion level of the reporters horseradish peroxidase (HRP) and Candida antarctica lipase B. In addition, circular dichroism analyses were completed for the wild type and several mutant pro-peptides to evaluate actual differences in secondary structure. The results lead to a new model of MATα pro-peptide signal leader, which suggests that the N and C-termini of MATα pro-peptide need to be presented in a specific orientation for proper interaction with the cellular secretion machinery and for efficient protein secretion. Copyright © 2016 Elsevier B.V. All rights reserved.

  17. Class Anxiety in Secondary Education: Exploring Structural Relations with Perceived Control, Engagement, Disaffection, and Performance.

    PubMed

    González, Antonio; Faílde Garrido, José María; Rodríguez Castro, Yolanda; Carrera Rodríguez, María Victoria

    2015-09-14

    The aim of this study was to assess the relationships between class-related anxiety with perceived control, teacher-reported behavioral engagement, behavioral disaffection, and academic performance. Participants were 355 compulsory secondary students (9th and 10th grades; Mean age = 15.2 years; SD = 1.8 years). Structural equation models revealed performance was predicted by perceived control, anxiety, disaffection, and engagement. Perceived control predicted anxiety, disaffection, and engagement. Anxiety predicted disaffection and engagement, and partially mediated the effects from control on disaffection (β = -.277, p < .005; CI = -.378, -.197) and engagement (β = .170, p < .002; CI = .103, .258). The negative association between anxiety and performance was mediated by engagement and disaffection (β = -.295, p < .002; CI = -.439, -.182). Anxiety, engagement, and disaffection mediated the effects of control on performance (β = .352, p < .003; CI = .279, .440). The implications of these results are discussed in the light of current theory and educational interventions.

  18. Prelude and Fugue, predicting local protein structure, early folding regions and structural weaknesses.

    PubMed

    Kwasigroch, Jean Marc; Rooman, Marianne

    2006-07-15

    Prelude&Fugue are bioinformatics tools aiming at predicting the local 3D structure of a protein from its amino acid sequence in terms of seven backbone torsion angle domains, using database-derived potentials. Prelude(&Fugue) computes all lowest free energy conformations of a protein or protein region, ranked by increasing energy, and possibly satisfying some interresidue distance constraints specified by the user. (Prelude&)Fugue detects sequence regions whose predicted structure is significantly preferred relative to other conformations in the absence of tertiary interactions. These programs can be used for predicting secondary structure, tertiary structure of short peptides, flickering early folding sequences and peptides that adopt a preferred conformation in solution. They can also be used for detecting structural weaknesses, i.e. sequence regions that are not optimal with respect to the tertiary fold. http://babylone.ulb.ac.be/Prelude_and_Fugue.

  19. StruLocPred: structure-based protein subcellular localisation prediction using multi-class support vector machine.

    PubMed

    Zhou, Wengang; Dickerson, Julie A

    2012-01-01

    Knowledge of protein subcellular locations can help decipher a protein's biological function. This work proposes three new features: one sequence-based, Hybrid Amino Acid Pair (HAAP), and two structure-based, Secondary Structural Element Composition (SSEC) and solvent accessibility state frequency. A multi-class Support Vector Machine is developed to predict the locations. Testing on two established data sets yields better prediction accuracies than the best available systems. Comparisons with existing methods show comparable results to ESLPred2. When StruLocPred is applied to the entire Arabidopsis proteome, over 77% of proteins with known locations match the prediction results. An implementation of this system is at http://wgzhou.ece.iastate.edu/StruLocPred/.

  20. Secondary Structure of Rat and Human Amylin across Force Fields

    PubMed Central

    Hoffmann, Kyle Quynn; McGovern, Michael; Chiu, Chi-cheng; de Pablo, Juan J.

    2015-01-01

    The aggregation of human amylin has been strongly implicated in the progression of Type II diabetes. This 37-residue peptide forms a variety of secondary structures, including random coils, α-helices, and β-hairpins. The balance between these structures depends on the chemical environment, making amylin an ideal candidate to examine inherent biases in force fields. Rat amylin differs from human amylin by only 6 residues; however, it does not form fibrils. Therefore it provides a useful complement to human amylin in studies of the key events along the aggregation pathway. In this work, the free energy of rat and human amylin was determined as a function of α-helix and β-hairpin content for the Gromos96 53a6, OPLS-AA/L, CHARMM22/CMAP, CHARMM22*, Amberff99sb*-ILDN, and Amberff03w force fields using advanced sampling techniques, specifically bias exchange metadynamics. This work represents a first systematic attempt to evaluate the conformations and the corresponding free energy of a large, clinically relevant disordered peptide in solution across force fields. The NMR chemical shifts of rIAPP were calculated for each of the force fields using their respective free energy maps, allowing us to quantitatively assess their predictions. We show that the predicted distribution of secondary structures is sensitive to the choice of force-field: Gromos53a6 is biased towards β-hairpins, while CHARMM22/CMAP predicts structures that are overly α-helical. OPLS-AA/L favors disordered structures. Amberff99sb*-ILDN, AmberFF03w and CHARMM22* provide the balance between secondary structures that is most consistent with available experimental data. In contrast to previous reports, our findings suggest that the equilibrium conformations of human and rat amylin are remarkably similar, but that subtle differences arise in transient alpha-helical and beta-strand containing structures that the human peptide can more readily adopt. 
We hypothesize that these transient states enable dynamic pathways that facilitate the formation of aggregates and, eventually, amyloid fibrils. PMID:26221949

  1. Secondary structure of rat and human amylin across force fields

    DOE PAGES

    Hoffmann, Kyle Quynn; McGovern, Michael; Chiu, Chi-cheng; ...

    2015-07-29

    The aggregation of human amylin has been strongly implicated in the progression of Type II diabetes. This 37-residue peptide forms a variety of secondary structures, including random coils, α-helices, and β-hairpins. The balance between these structures depends on the chemical environment, making amylin an ideal candidate to examine inherent biases in force fields. Rat amylin differs from human amylin by only 6 residues; however, it does not form fibrils. Therefore it provides a useful complement to human amylin in studies of the key events along the aggregation pathway. In this work, the free energy of rat and human amylin was determined as a function of α-helix and β-hairpin content for the Gromos96 53a6, OPLS-AA/L, CHARMM22/CMAP, CHARMM22*, Amberff99sb*-ILDN, and Amberff03w force fields using advanced sampling techniques, specifically bias exchange metadynamics. This work represents a first systematic attempt to evaluate the conformations and the corresponding free energy of a large, clinically relevant disordered peptide in solution across force fields. The NMR chemical shifts of rIAPP were calculated for each of the force fields using their respective free energy maps, allowing us to quantitatively assess their predictions. We show that the predicted distribution of secondary structures is sensitive to the choice of force-field: Gromos53a6 is biased towards β-hairpins, while CHARMM22/CMAP predicts structures that are overly α-helical. OPLS-AA/L favors disordered structures. Amberff99sb*-ILDN, AmberFF03w and CHARMM22* provide the balance between secondary structures that is most consistent with available experimental data. In contrast to previous reports, our findings suggest that the equilibrium conformations of human and rat amylin are remarkably similar, but that subtle differences arise in transient alpha-helical and beta-strand containing structures that the human peptide can more readily adopt. 
We hypothesize that these transient states enable dynamic pathways that facilitate the formation of aggregates and, eventually, amyloid fibrils.

  2. Secondary structure model of the RNA recognized by the reverse transcriptase from the R2 retrotransposable element.

    PubMed Central

    Mathews, D H; Banerjee, A R; Luan, D D; Eickbush, T H; Turner, D H

    1997-01-01

    RNA transcripts corresponding to the 250-nt 3' untranslated region of the R2 non-LTR retrotransposable element are recognized by the R2 reverse transcriptase and are sufficient to serve as templates in the target DNA-primed reverse transcription (TPRT) reaction. The R2 protein encoded by the Bombyx mori R2 can recognize this region from both the B. mori and Drosophila melanogaster R2 elements even though these regions show little nucleotide sequence identity. A model for the RNA secondary structure of the 3' untranslated region of the D. melanogaster R2 retrotransposon was developed by sequence comparison of 10 species aided by free energy minimization. Chemical modification experiments are consistent with this prediction. A secondary structure model for the 3' untranslated region of R2 RNA from the R2 element from B. mori was obtained by a combination of chemical modification data and free energy minimization. These two secondary structure models, found independently, share several common sites. This study shows the utility of combining free energy minimization, sequence comparison, and chemical modification to model an RNA secondary structure. PMID:8990394

  3. Improve the prediction of RNA-binding residues using structural neighbours.

    PubMed

    Li, Quan; Cao, Zanxia; Liu, Haiyan

    2010-03-01

    The interactions of RNA-binding proteins (RBPs) with RNA play key roles in managing some of the cell's basic functions. The identification and prediction of RNA-binding sites is important for understanding the RNA-binding mechanism. Computational approaches are being developed to predict RNA-binding residues based on sequence- or structure-derived features. To achieve higher prediction accuracy, improvements on current prediction methods are necessary. We identified that the structural neighbors of RNA-binding and non-RNA-binding residues have different amino acid compositions. Combining this structure-derived feature with evolutionary (PSSM) and other structural information (secondary structure and solvent accessibility) significantly improves the predictions over existing methods. Using a multiple linear regression approach and 6-fold cross validation, our best model achieves an overall correct rate of 87.8% and an MCC of 0.47, with a specificity of 93.4%, and correctly predicts 52.4% of the RNA-binding residues for a dataset containing 107 non-homologous RNA-binding proteins. Compared with existing methods, including the amino acid compositions of structural neighbors leads to a clear improvement. A web server was developed for predicting RNA-binding residues in a protein sequence (or structure), which is available at http://mcgill.3322.org/RNA/.
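
    The figures quoted in this abstract (overall correct rate, specificity, fraction of binding residues recovered, and MCC) all derive from the four cells of a binary confusion matrix. The sketch below shows the standard definitions only; the function name and the counts are hypothetical and are not taken from the paper.

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Standard per-residue metrics from confusion-matrix counts.

    Assumes each class has at least one true member, so that the
    sensitivity and specificity denominators are non-zero.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total          # "overall correct rate"
    sensitivity = tp / (tp + fn)          # fraction of binding residues recovered
    specificity = tn / (tn + fp)          # fraction of non-binding residues recovered
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }

# Hypothetical counts, chosen only to illustrate the arithmetic.
metrics = binary_metrics(tp=50, fp=10, tn=90, fn=50)
print(metrics)
```

    Because binding residues are usually a small minority, accuracy and specificity can look high even when sensitivity is modest, which is why MCC is commonly reported alongside them.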

  4. Successive range expansion promotes diversity and accelerates evolution in spatially structured microbial populations.

    PubMed

    Goldschmidt, Felix; Regoes, Roland R; Johnson, David R

    2017-09-01

    Successive range expansions occur within all domains of life, where one population expands first (primary expansion) and one or more secondary populations then follow (secondary expansion). In general, genetic drift reduces diversity during range expansion. However, it is not clear whether the same effect applies during successive range expansion, mainly because the secondary population must expand into space occupied by the primary population. Here we used an experimental microbial model system to show that, in contrast to primary range expansion, successive range expansion promotes local population diversity. Because of mechanical constraints imposed by the presence of the primary population, the secondary population forms fractal-like dendritic structures. This divides the advancing secondary population into many small sub-populations and promotes intermixing between the primary and secondary populations. We further developed a mathematical model to simulate the formation of dendritic structures in the secondary population during succession. By introducing mutations in the primary or dendritic secondary populations, we found that mutations are more likely to accumulate in the dendritic secondary populations. Our results thus show that successive range expansion can promote intermixing over the short term and increase genetic diversity over the long term. Our results therefore have potentially important implications for predicting the ecological processes and evolutionary trajectories of microbial communities.

  5. Method of identifying hairpin DNA probes by partial fold analysis

    DOEpatents

    Miller, Benjamin L [Penfield, NY]; Strohsahl, Christopher M [Saugerties, NY]

    2009-10-06

    Method of identifying molecular beacons in which a secondary structure prediction algorithm is employed to identify oligonucleotide sequences within a target gene having the requisite hairpin structure. Isolated oligonucleotides, molecular beacons prepared from those oligonucleotides, and their use are also disclosed.

  6. Method of identifying hairpin DNA probes by partial fold analysis

    DOEpatents

    Miller, Benjamin L.; Strohsahl, Christopher M.

    2008-10-28

    Methods of identifying molecular beacons in which a secondary structure prediction algorithm is employed to identify oligonucleotide sequences within a target gene having the requisite hairpin structure. Isolated oligonucleotides, molecular beacons prepared from those oligonucleotides, and their use are also disclosed.

  7. In vivo genome-wide profiling of RNA secondary structure reveals novel regulatory features.

    PubMed

    Ding, Yiliang; Tang, Yin; Kwok, Chun Kit; Zhang, Yu; Bevilacqua, Philip C; Assmann, Sarah M

    2014-01-30

    RNA structure has critical roles in processes ranging from ligand sensing to the regulation of translation, polyadenylation and splicing. However, a lack of genome-wide in vivo RNA structural data has limited our understanding of how RNA structure regulates gene expression in living cells. Here we present a high-throughput, genome-wide in vivo RNA structure probing method, structure-seq, in which dimethyl sulphate methylation of unprotected adenines and cytosines is identified by next-generation sequencing. Application of this method to Arabidopsis thaliana seedlings yielded the first in vivo genome-wide RNA structure map at nucleotide resolution for any organism, with quantitative structural information across more than 10,000 transcripts. Our analysis reveals a three-nucleotide periodic repeat pattern in the structure of coding regions, as well as a less-structured region immediately upstream of the start codon, and shows that these features are strongly correlated with translation efficiency. We also find patterns of strong and weak secondary structure at sites of alternative polyadenylation, as well as strong secondary structure at 5' splice sites that correlates with unspliced events. Notably, in vivo structures of messenger RNAs annotated for stress responses are poorly predicted in silico, whereas mRNA structures of genes related to cell function maintenance are well predicted. Global comparison of several structural features between these two categories shows that the mRNAs associated with stress responses tend to have more single-strandedness, longer maximal loop length and higher free energy per nucleotide, features that may allow these RNAs to undergo conformational changes in response to environmental conditions. Structure-seq allows the RNA structurome and its biological roles to be interrogated on a genome-wide scale and should be applicable to any organism.

  8. Evaluating the accuracy of SHAPE-directed RNA secondary structure predictions

    PubMed Central

    Sükösd, Zsuzsanna; Swenson, M. Shel; Kjems, Jørgen; Heitsch, Christine E.

    2013-01-01

    Recent advances in RNA structure determination include using data from high-throughput probing experiments to improve thermodynamic prediction accuracy. We evaluate the extent and nature of improvements in data-directed predictions for a diverse set of 16S/18S ribosomal sequences using a stochastic model of experimental SHAPE data. The average accuracy for 1000 data-directed predictions always improves over the original minimum free energy (MFE) structure. However, the amount of improvement varies with the sequence, exhibiting a correlation with MFE accuracy. Further analysis of this correlation shows that accurate MFE base pairs are typically preserved in a data-directed prediction, whereas inaccurate ones are not. Thus, the positive predictive value of common base pairs is consistently higher than the directed prediction accuracy. Finally, we confirm sequence dependencies in the directability of thermodynamic predictions and investigate the potential for greater accuracy improvements in the worst performing test sequence. PMID:23325843
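
    Base-pair accuracy measures of the kind discussed in this abstract are usually computed by comparing the set of predicted pairs with the set of reference pairs. The sketch below is a minimal illustration that assumes both structures are given in pseudoknot-free dot-bracket notation; the function names and the two example structures are hypothetical and do not come from the paper.

```python
def base_pairs(dot_bracket):
    """Return the set of (i, j) index pairs encoded by a dot-bracket string."""
    stack, pairs = [], set()
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            pairs.add((stack.pop(), i))
    return pairs

def pair_accuracy(predicted, reference):
    """Sensitivity and positive predictive value of predicted base pairs."""
    pred, ref = base_pairs(predicted), base_pairs(reference)
    common = pred & ref
    sensitivity = len(common) / len(ref) if ref else 0.0
    ppv = len(common) / len(pred) if pred else 0.0
    return sensitivity, ppv

# Hypothetical 12-nt hairpin: the prediction misses the innermost pair.
reference = "((((....))))"
predicted = "(((......)))"
sensitivity, ppv = pair_accuracy(predicted, reference)
print(sensitivity, ppv)  # 0.75 1.0
```

    Here every predicted pair is correct (PPV of 1.0) while one reference pair is missed (sensitivity of 0.75), which shows why the two measures are reported separately.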

  9. PredictProtein—an open resource for online prediction of protein structural and functional features

    PubMed Central

    Yachdav, Guy; Kloppmann, Edda; Kajan, Laszlo; Hecht, Maximilian; Goldberg, Tatyana; Hamp, Tobias; Hönigschmid, Peter; Schafferhans, Andrea; Roos, Manfred; Bernhofer, Michael; Richter, Lothar; Ashkenazy, Haim; Punta, Marco; Schlessinger, Avner; Bromberg, Yana; Schneider, Reinhard; Vriend, Gerrit; Sander, Chris; Ben-Tal, Nir; Rost, Burkhard

    2014-01-01

    PredictProtein is a meta-service for sequence analysis that has been predicting structural and functional features of proteins since 1992. Queried with a protein sequence it returns: multiple sequence alignments, predicted aspects of structure (secondary structure, solvent accessibility, transmembrane helices (TMSEG) and strands, coiled-coil regions, disulfide bonds and disordered regions) and function. The service incorporates analysis methods for the identification of functional regions (ConSurf), homology-based inference of Gene Ontology terms (metastudent), comprehensive subcellular localization prediction (LocTree3), protein–protein binding sites (ISIS2), protein–polynucleotide binding sites (SomeNA) and predictions of the effect of point mutations (non-synonymous SNPs) on protein function (SNAP2). Our goal has always been to develop a system optimized to meet the demands of experimentalists not highly experienced in bioinformatics. To this end, the PredictProtein results are presented as both text and a series of intuitive, interactive and visually appealing figures. The web server and sources are available at http://ppopen.rostlab.org. PMID:24799431

  10. Lattice-free prediction of three-dimensional structure of programmed DNA assemblies

    PubMed Central

    Pan, Keyao; Kim, Do-Nyun; Zhang, Fei; Adendorff, Matthew R.; Yan, Hao; Bathe, Mark

    2014-01-01

    DNA can be programmed to self-assemble into high molecular weight 3D assemblies with precise nanometer-scale structural features. Although numerous sequence design strategies exist to realize these assemblies in solution, there is currently no computational framework to predict their 3D structures on the basis of programmed underlying multi-way junction topologies constrained by DNA duplexes. Here, we introduce such an approach and apply it to assemblies designed using the canonical immobile four-way junction. The procedure is used to predict the 3D structure of high molecular weight planar and spherical ring-like origami objects, a tile-based sheet-like ribbon, and a 3D crystalline tensegrity motif, in quantitative agreement with experiments. Our framework provides a new approach to predict programmed nucleic acid 3D structure on the basis of prescribed secondary structure motifs, with possible application to the design of such assemblies for use in biomolecular and materials science. PMID:25470497

  11. A New Secondary Structure Assignment Algorithm Using Cα Backbone Fragments

    PubMed Central

    Cao, Chen; Wang, Guishen; Liu, An; Xu, Shutan; Wang, Lincong; Zou, Shuxue

    2016-01-01

    The assignment of secondary structure elements in proteins is a key step in the analysis of their structures and functions. We have developed an algorithm, SACF (secondary structure assignment based on Cα fragments), for secondary structure element (SSE) assignment based on the alignment of Cα backbone fragments with central poses derived by clustering known SSE fragments. The assignment algorithm consists of three steps: First, the outlier fragments on known SSEs are detected. Next, the remaining fragments are clustered to obtain the central fragments for each cluster. Finally, the central fragments are used as a template to make assignments. Following a large-scale comparison of 11 secondary structure assignment methods, SACF, KAKSI and PROSS are found to have similar agreement with DSSP, while PCASSO agrees with DSSP best. SACF and PCASSO show a preference for reducing residues in N and C cap regions, whereas KAKSI, P-SEA and SEGNO tend to add residues to the termini when the DSSP assignment is taken as the standard. Moreover, our algorithm is able to assign subtle helices (3₁₀-helix, π-helix and left-handed helix) and make uniform assignments, as well as to detect rare SSEs in β-sheets or long helices as outlier fragments from other programs. The structural uniformity should be useful for protein structure classification and prediction, while outlier fragments underlie the structure–function relationship. PMID:26978354

  12. Structure prediction of the second extracellular loop in G-protein-coupled receptors.

    PubMed

    Kmiecik, Sebastian; Jamroz, Michal; Kolinski, Michal

    2014-06-03

    G-protein-coupled receptors (GPCRs) play key roles in living organisms. Therefore, it is important to determine their functional structures. The second extracellular loop (ECL2) is a functionally important region of GPCRs, which poses significant challenge for computational structure prediction methods. In this work, we evaluated CABS, a well-established protein modeling tool for predicting ECL2 structure in 13 GPCRs. The ECL2s (with between 13 and 34 residues) are predicted in an environment of other extracellular loops being fully flexible and the transmembrane domain fixed in its x-ray conformation. The modeling procedure used theoretical predictions of ECL2 secondary structure and experimental constraints on disulfide bridges. Our approach yielded ensembles of low-energy conformers and the most populated conformers that contained models close to the available x-ray structures. The level of similarity between the predicted models and x-ray structures is comparable to that of other state-of-the-art computational methods. Our results extend other studies by including newly crystallized GPCRs. Copyright © 2014 The Authors. Published by Elsevier Inc. All rights reserved.

  13. The secondary structure and the thermal unfolding parameters of the S-layer protein from Lactobacillus salivarius.

    PubMed

    Lighezan, Liliana; Georgieva, Ralitsa; Neagu, Adrian

    2016-09-01

    Surface layer (S-layer) proteins have been identified in the cell envelope of many organisms, such as bacteria and archaea. They self-assemble, forming monomolecular crystalline arrays. Isolated S-layer proteins are able to recrystallize into regular lattices, which proved useful in biotechnology. Here we investigate the structure and thermal unfolding of the S-layer protein isolated from Lactobacillus salivarius 16 strain of human origin. Using circular dichroism (CD) spectroscopy, and the software CDSSTR from DICHROWEB, CONTINLL from CDPro, as well as CDNN, we assess the fractions of the protein's secondary structural elements at temperatures ranging between 10 and 90 °C, and predict the tertiary class of the protein. To study the thermal unfolding of the protein, we analyze the temperature dependence of the CD signal in the far- and near-UV domains. Fitting the experimental data by two- and three-state models of thermal unfolding, we infer the midpoint temperatures, the temperature dependence of the changes in Gibbs free energy, enthalpy, and entropy of the unfolding transitions in standard conditions, and the temperature dependence of the equilibrium constant. We also estimate the changes in heat capacity at constant pressure in standard conditions. The results indicate that the thermal unfolding of the S-layer protein from L. salivarius is highly cooperative, since changes in the secondary and tertiary structures occur simultaneously. The thermodynamic analysis predicts a "cold" transition, at about -3 °C, of both the secondary and tertiary structures. Our findings may be important for the use of S-layer proteins in biotechnology and in biomedical applications.
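
    The "cold" transition mentioned in this abstract follows from the curvature that a positive heat-capacity change gives to the unfolding free energy. The sketch below uses the textbook two-state model with a temperature-independent heat-capacity change; it is not the authors' fitting procedure, and the midpoint temperature, enthalpy and heat-capacity values are hypothetical.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol K)

def delta_g_unfold(T, Tm, dH_m, dCp):
    """Gibbs free energy of unfolding (kJ/mol) at temperature T (K).

    Two-state model with constant dCp:
    dG(T) = dH_m * (1 - T/Tm) + dCp * ((T - Tm) - T * ln(T/Tm))
    """
    return dH_m * (1 - T / Tm) + dCp * ((T - Tm) - T * math.log(T / Tm))

def fraction_unfolded(T, Tm, dH_m, dCp):
    """Equilibrium fraction of unfolded protein at temperature T (K)."""
    K = math.exp(-delta_g_unfold(T, Tm, dH_m, dCp) / (R * T))
    return K / (1 + K)

# Hypothetical parameters: Tm = 330 K, dH_m = 400 kJ/mol, dCp = 8 kJ/(mol K).
for T in (230.0, 300.0, 330.0, 350.0):
    print(T, round(fraction_unfolded(T, 330.0, 400.0, 8.0), 4))
```

    With these illustrative parameters the protein is half unfolded at the midpoint, mostly folded at 300 K, and unfolded again at 230 K, so the model predicts a second, low-temperature transition in addition to the thermal one.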

  14. Cloud prediction of protein structure and function with PredictProtein for Debian.

    PubMed

    Kaján, László; Yachdav, Guy; Vicedo, Esmeralda; Steinegger, Martin; Mirdita, Milot; Angermüller, Christof; Böhm, Ariane; Domke, Simon; Ertl, Julia; Mertes, Christian; Reisinger, Eva; Staniewski, Cedric; Rost, Burkhard

    2013-01-01

    We report the release of PredictProtein for the Debian operating system and derivatives, such as Ubuntu, Bio-Linux, and Cloud BioLinux. The PredictProtein suite is available as a standard set of open source Debian packages. The release covers the most popular prediction methods from the Rost Lab, including methods for the prediction of secondary structure and solvent accessibility (profphd), nuclear localization signals (predictnls), and intrinsically disordered regions (norsnet). We also present two case studies that successfully utilize PredictProtein packages for high performance computing in the cloud: the first analyzes protein disorder for whole organisms, and the second analyzes the effect of all possible single sequence variants in protein coding regions of the human genome.

  15. Cloud Prediction of Protein Structure and Function with PredictProtein for Debian

    PubMed Central

    Kaján, László; Yachdav, Guy; Vicedo, Esmeralda; Steinegger, Martin; Mirdita, Milot; Angermüller, Christof; Böhm, Ariane; Domke, Simon; Ertl, Julia; Mertes, Christian; Reisinger, Eva; Rost, Burkhard

    2013-01-01

    We report the release of PredictProtein for the Debian operating system and derivatives, such as Ubuntu, Bio-Linux, and Cloud BioLinux. The PredictProtein suite is available as a standard set of open source Debian packages. The release covers the most popular prediction methods from the Rost Lab, including methods for the prediction of secondary structure and solvent accessibility (profphd), nuclear localization signals (predictnls), and intrinsically disordered regions (norsnet). We also present two case studies that successfully utilize PredictProtein packages for high performance computing in the cloud: the first analyzes protein disorder for whole organisms, and the second analyzes the effect of all possible single sequence variants in protein coding regions of the human genome. PMID:23971032

  16. Ab initio RNA folding by discrete molecular dynamics: From structure prediction to folding mechanisms

    PubMed Central

    Ding, Feng; Sharma, Shantanu; Chalasani, Poornima; Demidov, Vadim V.; Broude, Natalia E.; Dokholyan, Nikolay V.

    2008-01-01

    RNA molecules with novel functions have revived interest in the accurate prediction of RNA three-dimensional (3D) structure and folding dynamics. However, existing methods are inefficient in automated 3D structure prediction. Here, we report a robust computational approach for rapid folding of RNA molecules. We develop a simplified RNA model for discrete molecular dynamics (DMD) simulations, incorporating base-pairing and base-stacking interactions. We demonstrate correct folding of 150 structurally diverse RNA sequences. The majority of DMD-predicted 3D structures have <4 Å deviations from experimental structures. The secondary structures corresponding to the predicted 3D structures consist of 94% native base-pair interactions. Folding thermodynamics and kinetics of tRNAPhe, pseudoknots, and mRNA fragments in DMD simulations are in agreement with previous experimental findings. Folding of RNA molecules features transient, non-native conformations, suggesting non-hierarchical RNA folding. Our method allows rapid conformational sampling of RNA folding, with computational time increasing linearly with RNA length. We envision this approach as a promising tool for RNA structural and functional analyses. PMID:18456842

  17. Improved cryoEM-Guided Iterative Molecular Dynamics–Rosetta Protein Structure Refinement Protocol for High Precision Protein Structure Prediction

    PubMed Central

    2016-01-01

    Many excellent methods exist that incorporate cryo-electron microscopy (cryoEM) data to constrain computational protein structure prediction and refinement. Previously, it was shown that iteration of two such orthogonal sampling and scoring methods – Rosetta and molecular dynamics (MD) simulations – facilitated exploration of conformational space in principle. Here, we go beyond a proof-of-concept study and address significant remaining limitations of the iterative MD–Rosetta protein structure refinement protocol. Specifically, all parts of the iterative refinement protocol are now guided by medium-resolution cryoEM density maps, and previous knowledge about the native structure of the protein is no longer necessary. Models are identified solely based on score or simulation time. All four benchmark proteins showed substantial improvement through three rounds of the iterative refinement protocol. The best-scoring final models of two proteins had sub-Ångstrom RMSD to the native structure over residues in secondary structure elements. Molecular dynamics was most efficient in refining secondary structure elements and was thus highly complementary to the Rosetta refinement which is most powerful in refining side chains and loop regions. PMID:25883538

  18. How does vegetation structure influence woodpeckers and secondary cavity nesting birds in African cork oak forest?

    NASA Astrophysics Data System (ADS)

    Segura, Amalia

    2017-08-01

    The Great Spotted Woodpecker provides important information about the status of a forest in terms of structure and age. As a primary cavity creator, it provides small-medium size cavities for passerines. However, despite its interest as an ecosystem engineer, studies of this species in Africa are scarce. Here, spatially explicit predictive models were used to investigate how forest structural variables are related to both the Great Spotted Woodpecker and secondary cavity nesting birds in Maamora cork oak forest (northwest Morocco). A positive association between Great Spotted Woodpecker and both dead-tree density and large mature trees (>60 cm dbh) was found. This study area, Maamora, has an old-growth forest structure incorporating a broad range of size and condition of live and dead trees, favouring Great Spotted Woodpecker by providing high availability of foraging and excavating sites. Secondary cavity nesting birds, represented by Great Tit, African Blue Tit, and Hoopoe, were predicted by Great Spotted Woodpecker detections. The findings suggest that the conservation of the Maamora cork oak forest could be key to maintaining these hole-nesting birds. However, this forest is threatened by forestry practises and livestock overgrazing and the challenge is therefore to find sustainable management strategies that ensure conservation while allowing its exploitation.

  19. Ligand Based-Pharmacophore Modeling and Extended Bioactivity Prediction for Salinosporamide A, B and C from Marine Actinomycetes Salinispora tropica.

    PubMed

    Dineshkumar, Kesavan; Vasudevan, Aparna; Hopper, Waheeta

    2017-01-01

    Actinomycetes produce structurally unique secondary metabolites with pharmaceutically essential bioactivities. Salinispora, an obligate marine actinomycete, produces structurally varied and unique secondary metabolites. There is plenty of scope for development of drugs from the novel compounds isolated from Salinispora. Anticancer, antibacterial and anti-protozoa activities have been shown for Salinosporamides A, B and C, the secondary metabolites identified from Salinispora, which make them interesting subjects for further extended biological activity prediction. An in silico ligand based-pharmacophore approach was used for the prediction of extended biological targets for salinosporamide A, B and C. Pharmacophore models of salinosporamide A, B and C were generated individually and screened against known drug databases. The drugs with best fitness score were shortlisted, and their respective targets pertaining to their bioactivity were retrieved. The predicted biological drug targets were docked with salinosporamide A, B and C for validation. The glucocorticoid receptor and methionine aminopeptidase 2 showed good docking score and binding energy with salinosporamide A, B and C. Molecular dynamics studies of the protein-ligand complexes showed stable interactions suggesting that the predicted new targets for salinosporamides might be promising. The glucocorticoid receptor and methionine aminopeptidase 2 could be possible new drug targets of bioactivity of salinosporamides. These proteins could be the druggable targets for antiinflammatory and anticancer activity of salinosporamides. Copyright © Bentham Science Publishers.

  20. Internal and external relationships of the Cnidaria: implications of primary and predicted secondary structure of the 5'-end of the 23S-like rDNA.

    PubMed Central

    Odorico, D M; Miller, D J

    1997-01-01

    Since both internal (class-level) and external relationships of the Cnidaria remain unclear on the basis of analyses of 18S and (partial) 16S rDNA sequence data, we examined the informativeness of the 5'-end of the 23S-like rDNA. Here we describe analyses of both primary and predicted secondary structure data for this region from the ctenophore Bolinopsis sp., the placozoan Trichoplax adhaerens, the sponge Hymeniacidon heliophila, and representatives of all four cnidarian classes. Primary sequence analyses clearly resolved the Cnidaria from other lower Metazoa, supported sister group relationships between the Scyphozoa and Cubozoa and between the Ctenophora and the Placozoa, and confirmed the basal status of the Anthozoa within the Cnidaria. Additionally, in the ctenophore, placozoan and sponge, non-canonical base pairing is required to maintain the secondary structure of the B12 region, whereas amongst the Cnidaria this is not the case. Although the phylogenetic significance of this molecular character is unclear, our analyses do not support the close relationship between Cnidaria and Placozoa suggested by previous studies. PMID:9061962

  1. SOV_refine: A further refined definition of segment overlap score and its significance for protein structure similarity.

    PubMed

    Liu, Tong; Wang, Zheng

    2018-01-01

    The segment overlap score (SOV) has been used to evaluate the predicted protein secondary structures, a sequence composed of helix (H), strand (E), and coil (C), by comparing it with the native or reference secondary structures, another sequence of H, E, and C. SOV's advantage is that it can consider the size of continuous overlapping segments and assign extra allowance to longer continuous overlapping segments instead of only judging from the percentage of overlapping individual positions as Q3 score does. However, we have found a drawback from its previous definition, that is, it cannot ensure increasing allowance assignment when more residues in a segment are further predicted accurately. A new way of assigning allowance has been designed, which keeps all the advantages of the previous SOV score definitions and ensures that the amount of allowance assigned is incremental when more elements in a segment are predicted accurately. Furthermore, our improved SOV has achieved a higher correlation with the quality of protein models measured by GDT-TS score and TM-score, indicating its better abilities to evaluate tertiary structure quality at the secondary structure level. We analyzed the statistical significance of SOV scores and found the threshold values for distinguishing two protein structures (SOV_refine  > 0.19) and indicating whether two proteins are under the same CATH fold (SOV_refine > 0.94 and > 0.90 for three- and eight-state secondary structures respectively). We provided another two example applications, which are when used as a machine learning feature for protein model quality assessment and comparing different definitions of topologically associating domains. We proved that our newly defined SOV score resulted in better performance. The SOV score can be widely used in bioinformatics research and other fields that need to compare two sequences of letters in which continuous segments have important meanings. 
We also generalized the previous SOV definitions so that it can work for sequences composed of more than three states (e.g., it can work for the eight-state definition of protein secondary structures). A standalone software package has been implemented in Perl with source code released. The software can be downloaded from http://dna.cs.miami.edu/SOV/.
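
    The contrast the abstract draws between Q3 and segment-based scoring can be illustrated with a minimal Python sketch. It is not the SOV_refine definition or the authors' Perl package; the function names and the toy sequences are assumptions made for illustration. It computes Q3 and lists the overlapping same-state segment pairs (with the overlap length, minov, and the total extent, maxov) that SOV-style scores start from.

```python
from itertools import groupby


def q3(reference, predicted):
    """Percentage of positions whose state letter matches."""
    if len(reference) != len(predicted):
        raise ValueError("sequences must have equal length")
    matches = sum(r == p for r, p in zip(reference, predicted))
    return 100.0 * matches / len(reference)


def segments(states):
    """Split a state string into runs: (state, start, end), end inclusive."""
    runs = []
    pos = 0
    for state, group in groupby(states):
        length = len(list(group))
        runs.append((state, pos, pos + length - 1))
        pos += length
    return runs


def overlapping_pairs(reference, predicted):
    """Same-state segment pairs sharing at least one position.

    Each entry is (ref_segment, pred_segment, minov, maxov), where minov is
    the length of the shared region and maxov the total extent of the pair.
    """
    pairs = []
    for s1 in segments(reference):
        for s2 in segments(predicted):
            if s1[0] != s2[0]:
                continue
            minov = min(s1[2], s2[2]) - max(s1[1], s2[1]) + 1
            if minov <= 0:
                continue
            maxov = max(s1[2], s2[2]) - min(s1[1], s2[1]) + 1
            pairs.append((s1, s2, minov, maxov))
    return pairs


# Toy data (hypothetical): both predictions get one residue wrong.
reference = "CCHHHHHHCC"
pred_a = "CCHHHCHHCC"  # helix broken into two pieces
pred_b = "CCCHHHHHCC"  # helix shortened by one residue

print(q3(reference, pred_a), q3(reference, pred_b))
print(segments(pred_a))
print(overlapping_pairs(reference, pred_b))
```

    Both toy predictions have the same Q3, yet the first splits the helix into two segments while the second keeps it whole; a segment-based score can tell these cases apart, which is the property the abstract attributes to SOV.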

  2. Secondary relaxation dynamics in rigid glass-forming molecular liquids with related structures.

    PubMed

    Li, Xiangqian; Wang, Meng; Liu, Riping; Ngai, Kia L; Tian, Yongjun; Wang, Li-Min; Capaccioli, Simone

    2015-09-14

    The dielectric relaxation in three glass-forming molecular liquids, 1-methylindole (1MID), 5H-5-Methyl-6,7-dihydrocyclopentapyrazine (MDCP), and Quinaldine (QN) is studied focusing on the secondary relaxation and its relation to the structural α-relaxation. All three glass-formers are rigid and more or less planar molecules with related chemical structures but have dipoles of different strengths at different locations. A strong and fast secondary relaxation is detected in the dielectric spectra of 1MID, while no resolved β-relaxation is observed in MDCP and QN. If the observed secondary relaxation in 1MID is identified with the Johari-Goldstein (JG) β-relaxation, then apparently the relation between the α- and β-relaxation frequencies of 1MID is not in accord with the Coupling Model (CM). The possibility of the violation of the prediction in 1MID as due to either the formation of hydrogen-bond induced clusters or the involvement of intramolecular degree of freedom is ruled out. The violation is explained by the secondary relaxation originating from the in-plane rotation of the dipole located on the plane of the rigid molecule, contributing to dielectric loss at higher frequencies and more intense than the JG β-relaxation generated by the out-of-plane rotation. MDCP has smaller dipole moment located in the plane of the molecule; however, presence of the change of curvature of dielectric loss, ε″(f), at some frequency on the high-frequency flank of the α-relaxation reveals the JG β-relaxation in MDCP and which is in accord with the CM prediction. QN has as large an in-plane dipole moment as 1MID, and the absence of the resolved secondary relaxation is explained by the smaller coupling parameter than the latter in the framework of the CM.

  3. Exploring Student, Family, and School Predictors of Self-Determination Using NLTS2 Data

    ERIC Educational Resources Information Center

    Shogren, Karrie A.; Garnier Villarreal, Mauricio; Dowsett, Chantelle; Little, Todd D.

    2016-01-01

    This study conducted secondary analysis of data from the National Longitudinal Transition Study-2 (NLTS2) to examine the degree to which student, family, and school constructs predicted self-determination outcomes. Multi-group structural equation modeling was used to examine predictive relationships between 5 student, 4 family, and 7 school…

  4. Exploring Student, Family, and School Predictors of Self-Determination Using NLTS2 Data

    ERIC Educational Resources Information Center

    Shogren, Karrie A.; Garnier Villarreal, Mauricio; Dowsett, Chantelle; Little, Todd D.

    2016-01-01

    This study conducted secondary analysis of data from the National Longitudinal Transition Study-2 (NLTS2) to examine the degree to which student, family, and school constructs predicted self-determination outcomes. Multi-group structural equation modeling was used to examine predictive relationships between 5 student, 4 family, and 7 school…

  5. How Predictive Analytics and Choice Architecture Can Improve Student Success

    ERIC Educational Resources Information Center

    Denley, Tristan

    2014-01-01

    This article explores the challenges that students face in navigating the curricular structure of post-secondary degree programs, and how predictive analytics and choice architecture can play a role. It examines Degree Compass, a course recommendation system that successfully pairs current students with the courses that best fit their talents and…

  6. FPGA accelerator for protein secondary structure prediction based on the GOR algorithm

    PubMed Central

    2011-01-01

    Background: Proteins are important molecules that perform a wide range of functions in biological systems. Protein folding has recently attracted much attention, since the function of a protein can generally be derived from its molecular structure. The GOR algorithm is one of the most successful computational methods and has been widely used as an efficient analysis tool to predict secondary structure from protein sequence. However, its execution time is still intolerable given the steep growth of protein databases. Recently, FPGA chips have emerged as a promising accelerator for bioinformatics algorithms by exploiting fine-grained custom design. Results: In this paper, we propose a complete fine-grained parallel hardware implementation on FPGA to accelerate the GOR-IV package for 2D protein structure prediction. To improve computing efficiency, we partition the parameter table into small segments and access them in parallel. We aggressively exploit data reuse schemes to minimize the need for loading data from external memory. The whole computation structure is carefully pipelined to overlap the sequence loading, computing and back-writing operations as much as possible. We implemented a complete GOR desktop system based on an FPGA chip XC5VLX330. Conclusions: The experimental results show a speedup factor of more than 430x over the original GOR-IV version and 110x speedup over the optimized version with multi-thread SIMD implementation running on a PC platform with AMD Phenom 9650 Quad CPU for 2D protein structure prediction. Moreover, the power consumption is only about 30% of that of current general-purpose CPUs. PMID:21342582

  7. Web-Beagle: a web server for the alignment of RNA secondary structures.

    PubMed

    Mattei, Eugenio; Pietrosanto, Marco; Ferrè, Fabrizio; Helmer-Citterich, Manuela

    2015-07-01

    Web-Beagle (http://beagle.bio.uniroma2.it) is a web server for the pairwise global or local alignment of RNA secondary structures. The server exploits a new encoding for RNA secondary structure and a substitution matrix of RNA structural elements to perform RNA structural alignments. The web server allows the user to compute up to 10 000 alignments in a single run, taking as input sets of RNA sequences and structures or primary sequences alone. In the latter case, the server computes the secondary structure prediction for the RNAs on-the-fly using RNAfold (free energy minimization). The user can also compare a set of input RNAs to one of five pre-compiled RNA datasets including lncRNAs and 3' UTRs. All types of comparison produce in output the pairwise alignments along with structural similarity and statistical significance measures for each resulting alignment. A graphical color-coded representation of the alignments allows the user to easily identify structural similarities between RNAs. Web-Beagle can be used for finding structurally related regions in two or more RNAs, for the identification of homologous regions or for functional annotation. Benchmark tests show that Web-Beagle has lower computational complexity, running time and better performances than other available methods. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  8. RNA-Puzzles Round III: 3D RNA structure prediction of five riboswitches and one ribozyme

    PubMed Central

    Biesiada, Marcin; Boniecki, Michał J.; Chou, Fang-Chieh; Ferré-D'Amaré, Adrian R.; Das, Rhiju; Dunin-Horkawicz, Stanisław; Geniesse, Caleb; Kappel, Kalli; Kladwang, Wipapat; Krokhotin, Andrey; Łach, Grzegorz E.; Major, François; Mann, Thomas H.; Pachulska-Wieczorek, Katarzyna; Patel, Dinshaw J.; Piccirilli, Joseph A.; Popenda, Mariusz; Purzycka, Katarzyna J.; Ren, Aiming; Rice, Greggory M.; Santalucia, John; Tandon, Arpit; Trausch, Jeremiah J.; Wang, Jian; Weeks, Kevin M.; Williams, Benfeard; Xiao, Yi; Zhang, Dong; Zok, Tomasz

    2017-01-01

    RNA-Puzzles is a collective experiment in blind 3D RNA structure prediction. We report here a third round of RNA-Puzzles. Five puzzles, 4, 8, 12, 13, 14, all structures of riboswitch aptamers and puzzle 7, a ribozyme structure, are included in this round of the experiment. The riboswitch structures include biological binding sites for small molecules (S-adenosyl methionine, cyclic diadenosine monophosphate, 5-amino 4-imidazole carboxamide riboside 5′-triphosphate, glutamine) and proteins (YbxF), and one set describes large conformational changes between ligand-free and ligand-bound states. The Varkud satellite ribozyme is the most recently solved structure of a known large ribozyme. All puzzles have established biological functions and require structural understanding to appreciate their molecular mechanisms. Through the use of fast-track experimental data, including multidimensional chemical mapping, and accurate prediction of RNA secondary structure, a large portion of the contacts in 3D have been predicted correctly leading to similar topologies for the top ranking predictions. Template-based and homology-derived predictions could predict structures to particularly high accuracies. However, achieving biological insights from de novo prediction of RNA 3D structures still depends on the size and complexity of the RNA. Blind computational predictions of RNA structures already appear to provide useful structural information in many cases. Similar to the previous RNA-Puzzles Round II experiment, the prediction of non-Watson–Crick interactions and the observed high atomic clash scores reveal a notable need for algorithmic improvement. All prediction models and assessment results are available at http://ahsoka.u-strasbg.fr/rnapuzzles/. PMID:28138060

  9. RNApdbee--a webserver to derive secondary structures from pdb files of knotted and unknotted RNAs.

    PubMed

    Antczak, Maciej; Zok, Tomasz; Popenda, Mariusz; Lukasiak, Piotr; Adamiak, Ryszard W; Blazewicz, Jacek; Szachniuk, Marta

    2014-07-01

    In RNA structural biology and bioinformatics, access to a correct RNA secondary structure and its proper representation is of crucial importance. This is especially true in the field of secondary and 3D RNA structure prediction. Here, we introduce RNApdbee, a new tool that extracts RNA secondary structure from a pdb file and presents it in both textual and graphical form. RNApdbee supports processing of knotted and unknotted structures of large RNAs, also within protein complexes. The method works not only for first-order but also for higher-order pseudoknots, and gives information about canonical and non-canonical base pairs. A combination of these features is unique among existing applications for RNA structure analysis. Additionally, a function for converting between the text notations, i.e. BPSEQ, CT and extended dot-bracket, is provided. In order to facilitate a more comprehensive study, the webserver integrates the functionality of RNAView, MC-Annotate and 3DNA/DSSR, the most common tools used for automated identification and classification of RNA base pairs. RNApdbee is implemented as a publicly available webserver with an intuitive interface and can be freely accessed at http://rnapdbee.cs.put.poznan.pl/. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
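
    The notation conversion mentioned in the abstract can be sketched in Python as follows. This is an illustrative example only, not RNApdbee's code: the function names are hypothetical, and it assumes an extended dot-bracket string that uses the bracket types (), [], {} and <> for pseudoknotted pairs and '.' for unpaired positions. The output follows the usual BPSEQ layout of one "index base partner" line per nucleotide, with 0 for an unpaired base.

```python
# Closing bracket -> matching opening bracket (assumed bracket alphabet).
CLOSERS = {")": "(", "]": "[", "}": "{", ">": "<"}


def dot_bracket_to_partners(structure):
    """Return 1-based partner indices (0 = unpaired) for each position."""
    partners = [0] * len(structure)
    stacks = {opener: [] for opener in CLOSERS.values()}
    for i, symbol in enumerate(structure):
        if symbol in stacks:
            stacks[symbol].append(i)
        elif symbol in CLOSERS:
            stack = stacks[CLOSERS[symbol]]
            if not stack:
                raise ValueError(f"unmatched '{symbol}' at position {i + 1}")
            j = stack.pop()
            partners[i] = j + 1
            partners[j] = i + 1
        elif symbol != ".":
            raise ValueError(f"unexpected symbol '{symbol}'")
    if any(stacks.values()):
        raise ValueError("unmatched opening bracket")
    return partners


def to_bpseq(sequence, structure):
    """Format a sequence and its dot-bracket structure as BPSEQ text."""
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure lengths differ")
    partners = dot_bracket_to_partners(structure)
    lines = [
        f"{i} {base} {partner}"
        for i, (base, partner) in enumerate(zip(sequence, partners), start=1)
    ]
    return "\n".join(lines)


# Toy hairpin (hypothetical sequence).
print(to_bpseq("GGGAAACCC", "(((...)))"))
```

    Keeping one stack per bracket type is what lets crossing pairs such as "([)]" resolve correctly, since each bracket family is matched independently of the others.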

  10. Inferences from structural comparison: flexibility, secondary structure wobble and sequence alignment optimization.

    PubMed

    Zhang, Gaihua; Su, Zhen

    2012-01-01

    Work on protein structure prediction is very useful in biological research. To evaluate their accuracy, experimental protein structures or their derived data are used as the 'gold standard'. However, as proteins are dynamic molecular machines with structural flexibility such a standard may be unreliable. To investigate the influence of the structure flexibility, we analysed 3,652 protein structures of 137 unique sequences from 24 protein families. The results showed that (1) the three-dimensional (3D) protein structures were not rigid: the root-mean-square deviation (RMSD) of the backbone Cα of structures with identical sequences was relatively large, with the average of the maximum RMSD from each of the 137 sequences being 1.06 Å; (2) the derived data of the 3D structure was not constant, e.g. the highest ratio of the secondary structure wobble site was 60.69%, with the sequence alignments from structural comparisons of two proteins in the same family sometimes being completely different. Proteins may have several stable conformations and the data derived from resolved structures as a 'gold standard' should be optimized before being utilized as criteria to evaluate the prediction methods, e.g. sequence alignment from structural comparison. Helix/β-sheet transition exists in normal free proteins. The coil ratio of the 3D structure could affect its resolution as determined by X-ray crystallography.

  11. Prediction of β-turns in proteins from multiple alignment using neural network

    PubMed Central

    Kaur, Harpreet; Raghava, Gajendra Pal Singh

    2003-01-01

    A neural network-based method has been developed for the prediction of β-turns in proteins by using multiple sequence alignment. Two feed-forward back-propagation networks with a single hidden layer are used where the first-sequence structure network is trained with the multiple sequence alignment in the form of PSI-BLAST–generated position-specific scoring matrices. The initial predictions from the first network and PSIPRED-predicted secondary structure are used as input to the second structure-structure network to refine the predictions obtained from the first net. A significant improvement in prediction accuracy has been achieved by using evolutionary information contained in the multiple sequence alignment. The final network yields an overall prediction accuracy of 75.5% when tested by sevenfold cross-validation on a set of 426 nonhomologous protein chains. The corresponding Qpred, Qobs, and Matthews correlation coefficient values are 49.8%, 72.3%, and 0.43, respectively, and are the best among all the previously published β-turn prediction methods. The Web server BetaTPred2 (http://www.imtech.res.in/raghava/betatpred2/) has been developed based on this approach. PMID:12592033
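
    The four figures quoted in the abstract can all be derived from a per-residue confusion matrix. The Python sketch below assumes the definitions commonly used in the β-turn prediction literature (Qpred as the percentage of predicted turn residues that are correct, Qobs as the percentage of observed turn residues that are predicted); the function name and the counts are invented for illustration and are not taken from the paper.

```python
import math


def turn_metrics(tp, fp, tn, fn):
    """Accuracy measures from per-residue counts.

    tp: turn residues predicted as turn      fp: non-turn predicted as turn
    tn: non-turn predicted as non-turn       fn: turn predicted as non-turn
    """
    total = tp + fp + tn + fn
    q_total = 100.0 * (tp + tn) / total   # overall prediction accuracy
    q_pred = 100.0 * tp / (tp + fp)       # share of predictions that are correct
    q_obs = 100.0 * tp / (tp + fn)        # share of observed turns recovered
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {"Qtotal": q_total, "Qpred": q_pred, "Qobs": q_obs, "MCC": mcc}


# Made-up counts, for illustration only.
metrics = turn_metrics(tp=40, fp=10, tn=130, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

    Because β-turn residues are a minority class, overall accuracy alone can look high for a predictor that rarely predicts turns, which is one reason Qpred, Qobs and the Matthews correlation coefficient are reported alongside it.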

  12. Structural variant of the intergenic internal ribosome entry site elements in dicistroviruses and computational search for their counterparts

    PubMed Central

    HATAKEYAMA, YOSHINORI; SHIBUYA, NORIHIRO; NISHIYAMA, TAKASHI; NAKASHIMA, NOBUHIKO

    2004-01-01

    The intergenic region (IGR) located upstream of the capsid protein gene in dicistroviruses contains an internal ribosome entry site (IRES). Translation initiation mediated by the IRES does not require initiator methionine tRNA. Comparison of the IGRs among dicistroviruses suggested that Taura syndrome virus (TSV) and acute bee paralysis virus have an extra side stem loop in the predicted IRES. We examined whether the side stem is responsible for translation activity mediated by the IGR using constructs with compensatory mutations. In vitro translation analysis showed that TSV has an IGR-IRES that is structurally distinct from those previously described. Because IGR-IRES elements determine the translation initiation site by virtue of their own tertiary structure formation, the discovery of this initiation mechanism suggests the possibility that eukaryotic mRNAs might have more extensive coding regions than previously predicted. To test this hypothesis, we searched full-length cDNA databases and whole genome sequences of eukaryotes using the pattern matching program, Scan For Matches, with parameters that can extract sequences containing secondary structure elements resembling those of IGR-IRES. Our search yielded several sequences, but their predicted secondary structures were suggested to be unstable in comparison to those of dicistroviruses. These results suggest that RNAs structurally similar to dicistroviruses are not common. If some eukaryotic mRNAs are translated independently of an initiator methionine tRNA, their structures are likely to be significantly distinct from those of dicistroviruses. PMID:15100433

  13. Rose spring dwarf-associated virus has RNA structural and gene-expression features like those of Barley yellow dwarf virus

    PubMed Central

    Salem, Nida’ M.; Miller, W. Allen; Rowhani, Adib; Golino, Deborah A.; Moyne, Anne-Laure; Falk, Bryce W.

    2015-01-01

    We determined the complete nucleotide sequence of the Rose spring dwarf-associated virus (RSDaV) genomic RNA (GenBank accession no. EU024678) and compared its predicted RNA structural characteristics affecting gene expression. A cDNA library was derived from RSDaV double-stranded RNAs (dsRNAs) purified from infected tissue. Nucleotide sequence analysis of the cloned cDNAs, plus for clones generated by 5′- and 3′-RACE showed the RSDaV genomic RNA to be 5,808 nucleotides. The genomic RNA contains five major open reading frames (ORFs), and three small ORFs in the 3′-terminal 800 nucleotides, typical for viruses of genus Luteovirus in the family Luteoviridae. Northern blot hybridization analysis revealed the genomic RNA and two prominent subgenomic RNAs of approximately 3 kb and 1 kb. Putative 5′ ends of the sgRNAs were predicted by identification of conserved sequences and secondary structures which resembled the Barley yellow dwarf virus (BYDV) genomic RNA 5′ end and subgenomic RNA promoter sequences. Secondary structures of the BYDV-like ribosomal frameshift elements and cap-independent translation elements, including long-distance base pairing spanning four kb were identified. These contain similarities but also informative differences with the BYDV structures, including a strikingly different structure predicted for the 3′ cap-independent translation element. These analyses of the RSDaV genomic RNA show more complexity for the RNA structural elements for members of the Luteoviridae. PMID:18329064

  14. Rose spring dwarf-associated virus has RNA structural and gene-expression features like those of Barley yellow dwarf virus.

    PubMed

    Salem, Nida' M; Miller, W Allen; Rowhani, Adib; Golino, Deborah A; Moyne, Anne-Laure; Falk, Bryce W

    2008-06-05

    We determined the complete nucleotide sequence of the Rose spring dwarf-associated virus (RSDaV) genomic RNA (GenBank accession no. EU024678) and compared its predicted RNA structural characteristics affecting gene expression. A cDNA library was derived from RSDaV double-stranded RNAs (dsRNAs) purified from infected tissue. Nucleotide sequence analysis of the cloned cDNAs, plus for clones generated by 5'- and 3'-RACE showed the RSDaV genomic RNA to be 5808 nucleotides. The genomic RNA contains five major open reading frames (ORFs), and three small ORFs in the 3'-terminal 800 nucleotides, typical for viruses of genus Luteovirus in the family Luteoviridae. Northern blot hybridization analysis revealed the genomic RNA and two prominent subgenomic RNAs of approximately 3 kb and 1 kb. Putative 5' ends of the sgRNAs were predicted by identification of conserved sequences and secondary structures which resembled the Barley yellow dwarf virus (BYDV) genomic RNA 5' end and subgenomic RNA promoter sequences. Secondary structures of the BYDV-like ribosomal frameshift elements and cap-independent translation elements, including long-distance base pairing spanning four kb were identified. These contain similarities but also informative differences with the BYDV structures, including a strikingly different structure predicted for the 3' cap-independent translation element. These analyses of the RSDaV genomic RNA show more complexity for the RNA structural elements for members of the Luteoviridae.

  15. Discrete Molecular Dynamics Can Predict Helical Prestructured Motifs in Disordered Proteins

    PubMed Central

    Han, Kyou-Hoon; Dokholyan, Nikolay V.; Tompa, Péter; Kalmár, Lajos; Hegedűs, Tamás

    2014-01-01

    Intrinsically disordered proteins (IDPs) lack a stable tertiary structure, but their short binding regions termed Pre-Structured Motifs (PreSMo) can form transient secondary structure elements in solution. Although disordered proteins are crucial in many biological processes and designing strategies to modulate their function is highly important, both experimental and computational tools to describe their conformational ensembles and the initial steps of folding are sparse. Here we report that discrete molecular dynamics (DMD) simulations combined with replica exchange (RX) method efficiently samples the conformational space and detects regions populating α-helical conformational states in disordered protein regions. While the available computational methods predict secondary structural propensities in IDPs based on the observation of protein-protein interactions, our ab initio method rests on physical principles of protein folding and dynamics. We show that RX-DMD predicts α-PreSMos with high confidence confirmed by comparison to experimental NMR data. Moreover, the method also can dissect α-PreSMos in close vicinity to each other and indicate helix stability. Importantly, simulations with disordered regions forming helices in X-ray structures of complexes indicate that a preformed helix is frequently the binding element itself, while in other cases it may have a role in initiating the binding process. Our results indicate that RX-DMD provides a breakthrough in the structural and dynamical characterization of disordered proteins by generating the structural ensembles of IDPs even when experimental data are not available. PMID:24763499

  16. Modeling of Dendritic Evolution of Continuously Cast Steel Billet with Cellular Automaton

    NASA Astrophysics Data System (ADS)

    Wang, Weiling; Ji, Cheng; Luo, Sen; Zhu, Miaoyong

    2018-02-01

    In order to predict the dendritic evolution during the continuous steel casting process, a simple mechanism to connect the heat transfer at the macroscopic scale and the dendritic growth at the microscopic scale was proposed in the present work. As the core of the across-scale simulation, a two-dimensional cell automaton (CA) model with a decentered square algorithm was developed and parallelized. Apart from nucleation undercooling and probability, a temperature gradient was introduced to deal with the columnar-to-equiaxed transition (CET) by considering its variation during continuous casting. Based on the thermal history, the dendritic evolution in a 4 mm × 40 mm region near the centerline of a SWRH82B steel billet was predicted. The influences of the secondary cooling intensity, superheat, and casting speed on the dendritic structure of the billet were investigated in detail. The results show that the predicted equiaxed dendritic solidification of Fe-5.3Si alloy and columnar dendritic solidification of Fe-0.45C alloy are consistent with in situ experimental results [Yasuda et al. Int J Cast Metals Res 22:15-21 (2009); Yasuda et al. ISIJ Int 51:402-408 (2011)]. Moreover, the predicted dendritic arm spacing and CET location agree well with the actual results in the billet. The primary dendrite arm spacing of columnar dendrites decreases with increasing secondary cooling intensity, or decreasing superheat and casting speed. Meanwhile, the CET is promoted as the secondary cooling intensity and superheat decrease. However, the CET is not influenced by the casting speed, owing to the adjusting of the flow rate of secondary spray water. Compared with the superheat and casting speed, the secondary cooling intensity can influence the cooling rate and temperature gradient in deeper locations, and accordingly exerts a more significant influence on the equiaxed dendritic structure.

  17. Protein Structure Prediction Using Gas Phase Molecular Dynamics Simulation: EOTAXIN-3 Cytokine as a Case Study

    NASA Astrophysics Data System (ADS)

    Khairudin, Nurul Bahiyah Ahmad; Wahab, Habibah A.

    In the current work, the structure of the enzyme CC chemokine eotaxin-3 (1G2S) was chosen as a case study to investigate the effects of gas phase on the predicted protein conformation using molecular dynamics simulation. Generally, simulating proteins in the gas phase tends to suffer from various drawbacks, among them excessive numbers of protein-protein hydrogen bonds. However, current results showed that the effects of gas phase simulation on 1G2S did not amplify the protein-protein hydrogen bonds. It was also found that some of the hydrogen bonds which were crucial in maintaining the secondary structural elements were disrupted. The predicted models showed high values of RMSD, 11.5 Å and 13.5 Å for the vacuum and explicit solvent simulations, respectively, indicating that the conformers were very different from the native conformation. Even though the RMSD value for the in vacuo model was slightly lower, it nevertheless suffered from a lower fraction of native contacts, poor hydrogen bonding networks and fewer occurrences of secondary structural elements compared to the solvated model. This finding supports the notion that water plays a dominant role in guiding the protein to fold along the correct path.

  18. RAG-3D: A search tool for RNA 3D substructures

    DOE PAGES

    Zahran, Mai; Sevim Bayrak, Cigdem; Elmetwaly, Shereef; ...

    2015-08-24

    In this study, to address many challenges in RNA structure/function prediction, the characterization of RNA's modular architectural units is required. Using the RNA-As-Graphs (RAG) database, we have previously explored the existence of secondary structure (2D) submotifs within larger RNA structures. Here we present RAG-3D—a dataset of RNA tertiary (3D) structures and substructures plus a web-based search tool—designed to exploit graph representations of RNAs for the goal of searching for similar 3D structural fragments. The objects in RAG-3D consist of 3D structures translated into 3D graphs, cataloged based on the connectivity between their secondary structure elements. Each graph is additionally described in terms of its subgraph building blocks. The RAG-3D search tool then compares a query RNA 3D structure to those in the database to obtain structurally similar structures and substructures. This comparison reveals conserved 3D RNA features and thus may suggest functional connections. Though RNA search programs based on similarity in sequence, 2D, and/or 3D structural elements are available, our graph-based search tool may be advantageous for illuminating similarities that are not obvious; using motifs rather than sequence space also reduces search times considerably. Ultimately, such substructuring could be useful for RNA 3D structure prediction, structure/function inference and inverse folding.

  19. RAG-3D: a search tool for RNA 3D substructures

    PubMed Central

    Zahran, Mai; Sevim Bayrak, Cigdem; Elmetwaly, Shereef; Schlick, Tamar

    2015-01-01

    To address many challenges in RNA structure/function prediction, the characterization of RNA's modular architectural units is required. Using the RNA-As-Graphs (RAG) database, we have previously explored the existence of secondary structure (2D) submotifs within larger RNA structures. Here we present RAG-3D—a dataset of RNA tertiary (3D) structures and substructures plus a web-based search tool—designed to exploit graph representations of RNAs for the goal of searching for similar 3D structural fragments. The objects in RAG-3D consist of 3D structures translated into 3D graphs, cataloged based on the connectivity between their secondary structure elements. Each graph is additionally described in terms of its subgraph building blocks. The RAG-3D search tool then compares a query RNA 3D structure to those in the database to obtain structurally similar structures and substructures. This comparison reveals conserved 3D RNA features and thus may suggest functional connections. Though RNA search programs based on similarity in sequence, 2D, and/or 3D structural elements are available, our graph-based search tool may be advantageous for illuminating similarities that are not obvious; using motifs rather than sequence space also reduces search times considerably. Ultimately, such substructuring could be useful for RNA 3D structure prediction, structure/function inference and inverse folding. PMID:26304547

  20. RAG-3D: A search tool for RNA 3D substructures

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zahran, Mai; Sevim Bayrak, Cigdem; Elmetwaly, Shereef

    In this study, to address many challenges in RNA structure/function prediction, the characterization of RNA's modular architectural units is required. Using the RNA-As-Graphs (RAG) database, we have previously explored the existence of secondary structure (2D) submotifs within larger RNA structures. Here we present RAG-3D—a dataset of RNA tertiary (3D) structures and substructures plus a web-based search tool—designed to exploit graph representations of RNAs for the goal of searching for similar 3D structural fragments. The objects in RAG-3D consist of 3D structures translated into 3D graphs, cataloged based on the connectivity between their secondary structure elements. Each graph is additionally described in terms of its subgraph building blocks. The RAG-3D search tool then compares a query RNA 3D structure to those in the database to obtain structurally similar structures and substructures. This comparison reveals conserved 3D RNA features and thus may suggest functional connections. Though RNA search programs based on similarity in sequence, 2D, and/or 3D structural elements are available, our graph-based search tool may be advantageous for illuminating similarities that are not obvious; using motifs rather than sequence space also reduces search times considerably. Ultimately, such substructuring could be useful for RNA 3D structure prediction, structure/function inference and inverse folding.

  1. ProbFold: a probabilistic method for integration of probing data in RNA secondary structure prediction.

    PubMed

    Sahoo, Sudhakar; Świtnicki, Michał P; Pedersen, Jakob Skou

    2016-09-01

    Recently, new RNA secondary structure probing techniques have been developed, including Next Generation Sequencing based methods capable of probing transcriptome-wide. These techniques hold great promise for improving structure prediction accuracy. However, each new data type comes with its own signal properties and biases, which may even be experiment specific. There is therefore a growing need for RNA structure prediction methods that can be automatically trained on new data types and readily extended to integrate and fully exploit multiple types of data. Here, we develop and explore a modular probabilistic approach for integrating probing data in RNA structure prediction. It can be automatically trained given a set of known structures with probing data. The approach is demonstrated on SHAPE datasets, where we evaluate and selectively model specific correlations. The approach often makes superior use of the probing data signal compared to other methods. We illustrate the use of ProbFold on multiple data types using both simulations and a small set of structures with SHAPE, DMS and CMCT data. Technically, the approach combines stochastic context-free grammars (SCFGs) with probabilistic graphical models. This approach allows rapid adaptation and integration of new probing data types. ProbFold is implemented in C++. Models are specified using simple textual formats. Data reformatting is done using separate C++ programs. Source code, statically compiled binaries for x86 Linux machines, C++ programs, example datasets and a tutorial are available from http://moma.ki.au.dk/prj/probfold/. Contact: jakob.skou@clin.au.dk. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  2. Examining the applicability of the IMB model in predicting condom use among sexually active secondary school students in Mbarara, Uganda

    PubMed Central

    Ybarra, Michele L.; Korchmaros, Josephine; Kiwanuka, Julius; Bangsberg, David R.; Bull, Sheana

    2012-01-01

    We tested the applicability of the IMB model in predicting condom use among sexually active secondary school students in Mbarara, Uganda. Three hundred and ninety adolescents across five secondary schools completed a self-report survey about their health and sexual experiences. Based upon results from structural equation modeling, the IMB model partially predicts condom use. Condom use was directly predicted by HIV prevention information and behavioral skills regarding having and using condoms. It was indirectly predicted (through behavioral skills regarding having and using condoms) by behavioral intentions regarding using condoms and talking to one's partner about safer sex. Aspects of one's first sexual experience (i.e., age at first sex, having discussed using condoms with first sex partner, willingness at first sex) strongly influence current condom use; this is especially true for discussing condoms with one's first partner. Findings highlight the importance of providing clear and comprehensive condom use training in HIV prevention programs aimed at Ugandan adolescents. They also underscore the importance of targeting abstinent youth before they become sexually active to positively affect their HIV preventive behavior at their first sexual experience. PMID:22350827

  3. RNA structural constraints in the evolution of the influenza A virus genome NP segment

    PubMed Central

    Gultyaev, Alexander P; Tsyganov-Bodounov, Anton; Spronken, Monique IJ; van der Kooij, Sander; Fouchier, Ron AM; Olsthoorn, René CL

    2014-01-01

    Conserved RNA secondary structures were predicted in the nucleoprotein (NP) segment of the influenza A virus genome using comparative sequence and structure analysis. A number of structural elements exhibiting nucleotide covariations were identified over the whole segment length, including protein-coding regions. Calculations of mutual information values at the paired nucleotide positions demonstrate that these structures impose considerable constraints on the virus genome evolution. Functional importance of a pseudoknot structure, predicted in the NP packaging signal region, was confirmed by plaque assays of the mutant viruses with disrupted structure and those with restored folding using compensatory substitutions. Possible functions of the conserved RNA folding patterns in the influenza A virus genome are discussed. PMID:25180940
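The mutual information calculation at paired positions mentioned in this abstract can be illustrated with a minimal sketch. It assumes the standard definition of mutual information between two alignment columns, measured in bits; the function name and the toy columns are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import log2


def mutual_information(col_i, col_j):
    """Mutual information (bits) between two alignment columns.

    Each column is a string with one nucleotide per aligned sequence.
    """
    if len(col_i) != len(col_j) or not col_i:
        raise ValueError("columns must be non-empty and of equal length")
    n = len(col_i)
    freq_i = Counter(col_i)              # single-column nucleotide counts
    freq_j = Counter(col_j)
    freq_ij = Counter(zip(col_i, col_j))  # joint counts of nucleotide pairs
    mi = 0.0
    for (a, b), count in freq_ij.items():
        p_ab = count / n
        mi += p_ab * log2(p_ab / ((freq_i[a] / n) * (freq_j[b] / n)))
    return mi


# Toy columns (hypothetical): the first pair covaries so that base pairing
# is preserved; the second pair is fully conserved and carries no signal.
covarying = mutual_information("GGCCAU", "CCGGUA")
conserved = mutual_information("GGGGGG", "CCCCCC")
```

Covarying columns give a high value, while fully conserved columns give zero even though they may well be paired, which is why mutual information is usually read alongside the predicted structure rather than on its own.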

  4. BEAM web server: a tool for structural RNA motif discovery.

    PubMed

    Pietrosanto, Marco; Adinolfi, Marta; Casula, Riccardo; Ausiello, Gabriele; Ferrè, Fabrizio; Helmer-Citterich, Manuela

    2018-03-15

    RNA structural motif finding is a relevant problem that becomes computationally hard when working on high-throughput data (e.g. eCLIP, PAR-CLIP), often represented by thousands of RNA molecules. Currently, the BEAM server is the only web tool capable of handling tens of thousands of input RNAs, with a motif discovery procedure that is limited only by the accuracy of current secondary structure prediction. The recently developed method BEAM (BEAr Motifs finder) can analyze tens of thousands of RNA molecules and identify RNA secondary structure motifs, each associated with a measure of its statistical significance. BEAM is extremely fast thanks to the BEAR encoding, which transforms each RNA secondary structure into a string of characters. BEAM also exploits the evolutionary knowledge contained in a substitution matrix of secondary structure elements, extracted from the RFAM database of families of homologous RNAs. The BEAM web server has been designed to streamline data pre-processing by automatically handling folding and encoding of RNA sequences, giving users a choice of folding program. The server provides an intuitive and informative results page with the list of secondary structure motifs identified, the logo of each motif, its significance, a graphic representation and information about its position in the RNA molecules sharing it. The web server is freely available at http://beam.uniroma2.it/ and is implemented in NodeJS and Python, with all major browsers supported. Contact: marco.pietrosanto@uniroma2.it. Supplementary data are available at Bioinformatics online.

  5. Counter ion induced irreversible denaturation of hen egg white lysozyme upon electrostatic interaction with iron oxide nanoparticles: a predicted model.

    PubMed

    Ghosh, Goutam; Panicker, Lata; Ningthoujam, R S; Barick, K C; Tewari, R

    2013-03-01

    The effects of electrostatic interaction between hen egg white lysozyme (HEWL) and functionalized iron oxide nanoparticles (IONPs) have been investigated using several techniques, e.g., CD, DSC, ζ-potential, UV-visible spectroscopy, DLS and TEM. The IONPs were functionalized with three hydrophilic ligands, viz., poly(ethylene glycol) (PEG), trisodium citrate (TSC) and sodium triphosphate (STP), of which both TSC and STP contain Na+ counter ions. It has been observed that the secondary structure of HEWL was not affected by PEG functionalized IONPs, but was partially and almost completely perturbed by TSC and STP functionalized IONPs, respectively. The perturbation of the secondary structure was irreversible. We have predicted an interaction model to explain the origin of the perturbation of HEWL structure. We have also investigated the stability of nanoparticle dispersions after interaction with HEWL and used the DLVO theory to explain the results. Copyright © 2012 Elsevier B.V. All rights reserved.

  6. Improving RNA nearest neighbor parameters for helices by going beyond the two-state model.

    PubMed

    Spasic, Aleksandar; Berger, Kyle D; Chen, Jonathan L; Seetin, Matthew G; Turner, Douglas H; Mathews, David H

    2018-06-01

    RNA folding free energy change nearest neighbor parameters are widely used to predict folding stabilities of secondary structures. They were determined by linear regression to datasets of optical melting experiments on small model systems. Traditionally, the optical melting experiments are analyzed assuming a two-state model, i.e. a structure is either complete or denatured. Experimental evidence, however, shows that structures exist in an ensemble of conformations. Partition functions calculated with existing nearest neighbor parameters predict that secondary structures can be partially denatured, which also directly conflicts with the two-state model. Here, a new approach for determining RNA nearest neighbor parameters is presented. Available optical melting data for 34 Watson-Crick helices were fit directly to a partition function model that allows an ensemble of conformations. Fitting parameters were the enthalpy and entropy changes for helix initiation, terminal AU pairs, stacks of Watson-Crick pairs and disordered internal loops. The resulting set of nearest neighbor parameters shows a 38.5% improvement in the sum of residuals in fitting the experimental melting curves compared to the current literature set.
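The two-state model that this abstract contrasts with an ensemble treatment can be sketched in a few lines. This is a minimal illustration only, assuming a unimolecular transition (for example a hairpin) rather than the bimolecular helices fitted in the paper; the enthalpy and entropy values are hypothetical.

```python
from math import exp

R = 1.987e-3  # gas constant in kcal/(mol*K)


def fraction_folded(delta_h, delta_s, temperature):
    """Folded fraction under a unimolecular two-state model.

    delta_h in kcal/mol, delta_s in kcal/(mol*K), temperature in K.
    """
    delta_g = delta_h - temperature * delta_s      # free energy change of folding
    k_fold = exp(-delta_g / (R * temperature))     # equilibrium constant
    return k_fold / (1.0 + k_fold)


def melting_temperature(delta_h, delta_s):
    """Temperature (K) at which folded and unfolded states are equally populated."""
    return delta_h / delta_s


# Hypothetical folding parameters, chosen only for illustration.
DELTA_H = -30.0   # kcal/mol
DELTA_S = -0.09   # kcal/(mol*K)
tm = melting_temperature(DELTA_H, DELTA_S)
```

In this picture every molecule is either fully folded or fully denatured. A partition function model instead sums Boltzmann weights over many partially paired conformations, which is the extension the abstract describes.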

  7. Evaluating the effect of disturbed ensemble distributions on SCFG based statistical sampling of RNA secondary structures.

    PubMed

    Scheid, Anika; Nebel, Markus E

    2012-07-09

    Over the past years, statistical and Bayesian approaches have become increasingly appreciated to address the long-standing problem of computational RNA structure prediction. Recently, a novel probabilistic method for the prediction of RNA secondary structures from a single sequence has been studied which is based on generating statistically representative and reproducible samples of the entire ensemble of feasible structures for a particular input sequence. This method samples the possible foldings from a distribution implied by a sophisticated (traditional or length-dependent) stochastic context-free grammar (SCFG) that mirrors the standard thermodynamic model applied in modern physics-based prediction algorithms. Specifically, that grammar represents an exact probabilistic counterpart to the energy model underlying the Sfold software, which employs a sampling extension of the partition function (PF) approach to produce statistically representative subsets of the Boltzmann-weighted ensemble. Although both sampling approaches have the same worst-case time and space complexities, it has been indicated that they differ in performance (both with respect to prediction accuracy and quality of generated samples), where neither of these two competing approaches generally outperforms the other. In this work, we will consider the SCFG based approach in order to perform an analysis on how the quality of generated sample sets and the corresponding prediction accuracy changes when different degrees of disturbances are incorporated into the needed sampling probabilities. This is motivated by the fact that if the results prove to be resistant to large errors on the distinct sampling probabilities (compared to the exact ones), then it will be an indication that these probabilities do not need to be computed exactly, but it may be sufficient and more efficient to approximate them. 
Thus, it might then be possible to decrease the worst-case time requirements of such an SCFG based sampling method without significant accuracy losses. If, on the other hand, the quality of sampled structures can be observed to strongly react to slight disturbances, there is little hope for improving the complexity by heuristic procedures. We hence provide a reliable test for the hypothesis that a heuristic method could be implemented to improve the time scaling of RNA secondary structure prediction in the worst-case - without sacrificing much of the accuracy of the results. Our experiments indicate that absolute errors generally lead to the generation of useless sample sets, whereas relative errors seem to have only small negative impact on both the predictive accuracy and the overall quality of resulting structure samples. Based on these observations, we present some useful ideas for developing a time-reduced sampling method guaranteeing an acceptable predictive accuracy. We also discuss some inherent drawbacks that arise in the context of approximation. The key results of this paper are crucial for the design of an efficient and competitive heuristic prediction method based on the increasingly accepted and attractive statistical sampling approach. This has indeed been indicated by the construction of prototype algorithms.
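The distinction between absolute and relative errors on sampling probabilities can be made concrete with a small sketch. The abstract does not state how the disturbances were generated, so the uniform noise model, the renormalization step, and all names below are assumptions made for illustration.

```python
import random


def normalize(weights):
    """Rescale non-negative weights so that they sum to one."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return [w / total for w in weights]


def disturb_relative(probs, max_rel, rng):
    """Multiply each probability by a factor in [1 - max_rel, 1 + max_rel]."""
    return normalize([p * (1.0 + rng.uniform(-max_rel, max_rel)) for p in probs])


def disturb_absolute(probs, max_abs, rng):
    """Add noise in [-max_abs, max_abs] to each probability, clipping at zero."""
    return normalize([max(p + rng.uniform(-max_abs, max_abs), 0.0) for p in probs])


# Hypothetical probabilities for four alternative derivation steps.
exact = [0.90, 0.09, 0.009, 0.001]
rng = random.Random(0)
rel = disturb_relative(exact, 0.10, rng)
absolute = disturb_absolute(exact, 0.05, rng)
```

Under this noise model a 10% relative error changes every probability by a bounded factor, so rare alternatives stay rare. An absolute error of 0.05 can raise a probability of 0.001 by more than an order of magnitude or remove it entirely, which is one plausible reading of why absolute errors degrade the sampled structures far more.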

  8. Evaluating the effect of disturbed ensemble distributions on SCFG based statistical sampling of RNA secondary structures

    PubMed Central

    2012-01-01

    Background Over the past years, statistical and Bayesian approaches have become increasingly appreciated to address the long-standing problem of computational RNA structure prediction. Recently, a novel probabilistic method for the prediction of RNA secondary structures from a single sequence has been studied which is based on generating statistically representative and reproducible samples of the entire ensemble of feasible structures for a particular input sequence. This method samples the possible foldings from a distribution implied by a sophisticated (traditional or length-dependent) stochastic context-free grammar (SCFG) that mirrors the standard thermodynamic model applied in modern physics-based prediction algorithms. Specifically, that grammar represents an exact probabilistic counterpart to the energy model underlying the Sfold software, which employs a sampling extension of the partition function (PF) approach to produce statistically representative subsets of the Boltzmann-weighted ensemble. Although both sampling approaches have the same worst-case time and space complexities, it has been indicated that they differ in performance (both with respect to prediction accuracy and quality of generated samples), where neither of these two competing approaches generally outperforms the other. Results In this work, we will consider the SCFG based approach in order to perform an analysis on how the quality of generated sample sets and the corresponding prediction accuracy changes when different degrees of disturbances are incorporated into the needed sampling probabilities. This is motivated by the fact that if the results prove to be resistant to large errors on the distinct sampling probabilities (compared to the exact ones), then it will be an indication that these probabilities do not need to be computed exactly, but it may be sufficient and more efficient to approximate them. 
Thus, it might then be possible to decrease the worst-case time requirements of such an SCFG based sampling method without significant accuracy losses. If, on the other hand, the quality of sampled structures can be observed to strongly react to slight disturbances, there is little hope for improving the complexity by heuristic procedures. We hence provide a reliable test for the hypothesis that a heuristic method could be implemented to improve the time scaling of RNA secondary structure prediction in the worst-case – without sacrificing much of the accuracy of the results. Conclusions Our experiments indicate that absolute errors generally lead to the generation of useless sample sets, whereas relative errors seem to have only small negative impact on both the predictive accuracy and the overall quality of resulting structure samples. Based on these observations, we present some useful ideas for developing a time-reduced sampling method guaranteeing an acceptable predictive accuracy. We also discuss some inherent drawbacks that arise in the context of approximation. The key results of this paper are crucial for the design of an efficient and competitive heuristic prediction method based on the increasingly accepted and attractive statistical sampling approach. This has indeed been indicated by the construction of prototype algorithms. PMID:22776037

  9. RNA-TVcurve: a Web server for RNA secondary structure comparison based on a multi-scale similarity of its triple vector curve representation.

    PubMed

    Li, Ying; Shi, Xiaohu; Liang, Yanchun; Xie, Juan; Zhang, Yu; Ma, Qin

    2017-01-21

    RNAs have been found to carry diverse functionalities in nature. Inferring the similarity between two given RNAs is a fundamental step to understand and interpret their functional relationship. The majority of functional RNAs show conserved secondary structures, rather than sequence conservation. Those algorithms relying on sequence-based features usually have limitations in their prediction performance. Hence, integrating RNA structure features is very critical for RNA analysis. Existing algorithms mainly fall into two categories: alignment-based and alignment-free. The alignment-free algorithms of RNA comparison usually have lower time complexity than alignment-based algorithms. An alignment-free RNA comparison algorithm was proposed, in which novel numerical representations RNA-TVcurve (triple vector curve representation) of RNA sequence and corresponding secondary structure features are provided. Then a multi-scale similarity score of two given RNAs was designed based on wavelet decomposition of their numerical representation. In support of RNA mutation and phylogenetic analysis, a web server (RNA-TVcurve) was designed based on this alignment-free RNA comparison algorithm. It provides three functional modules: 1) visualization of numerical representation of RNA secondary structure; 2) detection of single-point mutation based on secondary structure; and 3) comparison of pairwise and multiple RNA secondary structures. The inputs of the web server require RNA primary sequences, while corresponding secondary structures are optional. For the primary sequences alone, the web server can compute the secondary structures using free energy minimization algorithm in terms of RNAfold tool from Vienna RNA package. RNA-TVcurve is the first integrated web server, based on an alignment-free method, to deliver a suite of RNA analysis functions, including visualization, mutation analysis and multiple RNAs structure comparison. 
The comparison results with two popular RNA comparison tools, RNApdist and RNAdistance, showcased that RNA-TVcurve can efficiently capture subtle relationships among RNAs for mutation detection and non-coding RNA classification. All the relevant results were shown in an intuitive graphical manner, and can be freely downloaded from this server. RNA-TVcurve, along with test examples and detailed documents, are available at: http://ml.jlu.edu.cn/tvcurve/ .

  10. Transcriptome-Wide Analysis of UTRs in Non-Small Cell Lung Cancer Reveals Cancer-Related Genes with SNV-Induced Changes on RNA Secondary Structure and miRNA Target Sites

    PubMed Central

    Sabarinathan, Radhakrishnan; Wenzel, Anne; Novotny, Peter; Tang, Xiaojia; Kalari, Krishna R.; Gorodkin, Jan

    2014-01-01

    Traditional mutation assessment methods generally focus on predicting disruptive changes in protein-coding regions rather than non-coding regulatory regions like untranslated regions (UTRs) of mRNAs. The UTRs, however, are known to have many sequence and structural motifs that can regulate translational and transcriptional efficiency and stability of mRNAs through interaction with RNA-binding proteins and other non-coding RNAs like microRNAs (miRNAs). In a recent study, transcriptomes of tumor cells harboring mutant and wild-type KRAS (V-Ki-ras2 Kirsten rat sarcoma viral oncogene homolog) genes in patients with non-small cell lung cancer (NSCLC) have been sequenced to identify single nucleotide variations (SNVs). About 40% of the total SNVs (73,717) identified were mapped to UTRs, but omitted in the previous analysis. To meet this obvious demand for analysis of the UTRs, we designed a comprehensive pipeline to predict the effect of SNVs on two major regulatory elements, secondary structure and miRNA target sites. Out of 29,290 SNVs in 6462 genes, we predict 472 SNVs (in 408 genes) affecting local RNA secondary structure, 490 SNVs (in 447 genes) affecting miRNA target sites and 48 that do both. Together these disruptive SNVs were present in 803 different genes, out of which 188 (23.4%) were previously known to be cancer-associated. Notably, this ratio is significantly higher (one-sided Fisher's exact test p-value = 0.032) than the ratio (20.8%) of known cancer-associated genes (n = 1347) in our initial data set (n = 6462). Network analysis shows that the genes harboring disruptive SNVs were involved in molecular mechanisms of cancer, and the signaling pathways of LPS-stimulated MAPK, IL-6, iNOS, EIF2 and mTOR. In conclusion, we have found hundreds of SNVs which are highly disruptive with respect to changes in the secondary structure and miRNA target sites within UTRs. 
These changes hold the potential to alter the expression of known cancer genes or genes linked to cancer-associated pathways. PMID:24416147
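The enrichment figures quoted in this abstract can be checked with a short calculation. The percentages follow directly from the reported counts. The one-sided test is sketched here as a hypergeometric upper tail, which assumes that the 803 genes are treated as a draw from the 6462 genes of the initial data set; the abstract does not give the exact contingency table, so the resulting tail probability need not match the published p-value.

```python
from math import comb


def hypergeom_upper_tail(population, successes, draws, observed):
    """P(X >= observed) for a hypergeometric random variable X."""
    total = comb(population, draws)
    tail = sum(
        comb(successes, x) * comb(population - successes, draws - x)
        for x in range(observed, min(successes, draws) + 1)
    )
    return tail / total  # exact integer arithmetic until the final division


# Counts reported in the abstract.
ALL_GENES = 6462          # genes in the initial data set
CANCER_GENES = 1347       # known cancer-associated genes among them
DISRUPTED_GENES = 803     # genes harboring disruptive SNVs
DISRUPTED_CANCER = 188    # of those, known cancer-associated

pct_disrupted = round(100 * DISRUPTED_CANCER / DISRUPTED_GENES, 1)
pct_background = round(100 * CANCER_GENES / ALL_GENES, 1)
p_upper = hypergeom_upper_tail(ALL_GENES, CANCER_GENES, DISRUPTED_GENES, DISRUPTED_CANCER)
```

The two rounded percentages reproduce the 23.4% and 20.8% given in the text.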

  11. Transcriptome-wide analysis of UTRs in non-small cell lung cancer reveals cancer-related genes with SNV-induced changes on RNA secondary structure and miRNA target sites.

    PubMed

    Sabarinathan, Radhakrishnan; Wenzel, Anne; Novotny, Peter; Tang, Xiaojia; Kalari, Krishna R; Gorodkin, Jan

    2014-01-01

    Traditional mutation assessment methods generally focus on predicting disruptive changes in protein-coding regions rather than non-coding regulatory regions like untranslated regions (UTRs) of mRNAs. The UTRs, however, are known to have many sequence and structural motifs that can regulate translational and transcriptional efficiency and stability of mRNAs through interaction with RNA-binding proteins and other non-coding RNAs like microRNAs (miRNAs). In a recent study, transcriptomes of tumor cells harboring mutant and wild-type KRAS (V-Ki-ras2 Kirsten rat sarcoma viral oncogene homolog) genes in patients with non-small cell lung cancer (NSCLC) have been sequenced to identify single nucleotide variations (SNVs). About 40% of the total SNVs (73,717) identified were mapped to UTRs, but omitted in the previous analysis. To meet this obvious demand for analysis of the UTRs, we designed a comprehensive pipeline to predict the effect of SNVs on two major regulatory elements, secondary structure and miRNA target sites. Out of 29,290 SNVs in 6462 genes, we predict 472 SNVs (in 408 genes) affecting local RNA secondary structure, 490 SNVs (in 447 genes) affecting miRNA target sites and 48 that do both. Together these disruptive SNVs were present in 803 different genes, out of which 188 (23.4%) were previously known to be cancer-associated. Notably, this ratio is significantly higher (one-sided Fisher's exact test p-value = 0.032) than the ratio (20.8%) of known cancer-associated genes (n = 1347) in our initial data set (n = 6462). Network analysis shows that the genes harboring disruptive SNVs were involved in molecular mechanisms of cancer, and the signaling pathways of LPS-stimulated MAPK, IL-6, iNOS, EIF2 and mTOR. In conclusion, we have found hundreds of SNVs which are highly disruptive with respect to changes in the secondary structure and miRNA target sites within UTRs. 
These changes hold the potential to alter the expression of known cancer genes or genes linked to cancer-associated pathways.

  12. Protein Structure Prediction by Protein Threading

    NASA Astrophysics Data System (ADS)

    Xu, Ying; Liu, Zhijie; Cai, Liming; Xu, Dong

    The seminal work of Bowie, Lüthy, and Eisenberg (Bowie et al., 1991) on "the inverse protein folding problem" laid the foundation of protein structure prediction by protein threading. By using simple measures for fitness of different amino acid types to local structural environments defined in terms of solvent accessibility and protein secondary structure, the authors derived a simple and yet profoundly novel approach to assessing if a protein sequence fits well with a given protein structural fold. Their follow-up work (Elofsson et al., 1996; Fischer and Eisenberg, 1996; Fischer et al., 1996a,b) and the work by Jones, Taylor, and Thornton (Jones et al., 1992) on protein fold recognition led to the development of a new brand of powerful tools for protein structure prediction, which we now term "protein threading." These computational tools have played a key role in extending the utility of all the experimentally solved structures by X-ray crystallography and nuclear magnetic resonance (NMR), providing structural models and functional predictions for many of the proteins encoded in the hundreds of genomes that have been sequenced up to now.

  13. Kassiopeia: a database and web application for the analysis of mutually exclusive exomes of eukaryotes

    PubMed Central

    2014-01-01

    Background: Alternative splicing is an important process in higher eukaryotes that allows obtaining several transcripts from one gene. A specific case of alternative splicing is mutually exclusive splicing, in which exactly one exon out of a cluster of neighbouring exons is spliced into the mature transcript. Recently, a new algorithm for the prediction of these exons has been developed based on the preconditions that the exons of the cluster have similar lengths, sequence homology, and conserved splice sites, and that they are translated in the same reading frame. Description: In this contribution we introduce Kassiopeia, a database and web application for the generation, storage, and presentation of genome-wide analyses of mutually exclusive exomes. Currently, Kassiopeia provides access to the mutually exclusive exomes of twelve Drosophila species, the thale cress Arabidopsis thaliana, the nematode Caenorhabditis elegans, and human. Mutually exclusive spliced exons (MXEs) were predicted based on gene reconstructions from Scipio. Based on the standard prediction values, with which 83.5% of the annotated MXEs of Drosophila melanogaster were reconstructed, the exomes contain surprisingly more MXEs than previously supposed and identified. The user can search Kassiopeia using BLAST or browse the genes of each species, optionally adjusting the parameters used for the prediction to reveal more divergent or only very similar exon candidates. Conclusions: We developed a pipeline to predict MXEs in the genomes of several model organisms and a web interface, Kassiopeia, for their visualization. For each gene Kassiopeia provides a comprehensive gene structure scheme, the sequences and predicted secondary structures of the MXEs, and, if available, further evidence for MXE candidates from cDNA/EST data, predictions of MXEs in homologous genes of closely related species, and RNA secondary structure predictions. 
Kassiopeia can be accessed at http://www.motorprotein.de/kassiopeia. PMID:24507667

  14. Repeat-swap homology modeling of secondary active transporters: updated protocol and prediction of elevator-type mechanisms

    PubMed Central

    Vergara-Jaque, Ariela; Fenollar-Ferrer, Cristina; Kaufmann, Desirée; Forrest, Lucy R.

    2015-01-01

    Secondary active transporters are critical for neurotransmitter clearance and recycling during synaptic transmission and uptake of nutrients. These proteins mediate the movement of solutes against their concentration gradients, by using the energy released in the movement of ions down pre-existing concentration gradients. To achieve this, transporters conform to the so-called alternating-access hypothesis, whereby the protein adopts at least two conformations in which the substrate binding sites are exposed to one or other side of the membrane, but not both simultaneously. Structures of a bacterial homolog of neuronal glutamate transporters, GltPh, in several different conformational states have revealed that the protein structure is asymmetric in the outward- and inward-open states, and that the conformational change connecting them involves an elevator-like movement of a substrate binding domain across the membrane. The structural asymmetry is created by inverted-topology repeats, i.e., structural repeats with similar overall folds whose transmembrane topologies are related to each other by two-fold pseudo-symmetry around an axis parallel to the membrane plane. Inverted repeats have been found in around three-quarters of secondary transporter folds. Moreover, the (a)symmetry of these systems has been successfully used as a bioinformatic tool, called “repeat-swap modeling”, to predict structural models of a transporter in one conformation using the known structure of the transporter in the complementary conformation as a template. Here, we describe an updated repeat-swap homology modeling protocol, and calibrate the accuracy of the method using GltPh, for which both inward- and outward-facing conformations are known. We then apply this repeat-swap homology modeling procedure to a concentrative nucleoside transporter, VcCNT, which has a three-dimensional arrangement related to that of GltPh. 
The repeat-swapped model of VcCNT predicts that nucleoside transport also occurs via an elevator-like mechanism. PMID:26388773

  15. Transcripts with in silico predicted RNA structure are enriched everywhere in the mouse brain

    PubMed Central

    2012-01-01

    Background Post-transcriptional control of gene expression is mostly conducted by specific elements in untranslated regions (UTRs) of mRNAs, in collaboration with specific binding proteins and RNAs. In several well characterized cases, these RNA elements are known to form stable secondary structures. RNA secondary structures also may have major functional implications for long noncoding RNAs (lncRNAs). Recent transcriptional data has indicated the importance of lncRNAs in brain development and function. However, no methodical efforts to investigate this have been undertaken. Here, we aim to systematically analyze the potential for RNA structure in brain-expressed transcripts. Results By comprehensive spatial expression analysis of the adult mouse in situ hybridization data of the Allen Mouse Brain Atlas, we show that transcripts (coding as well as non-coding) associated with in silico predicted structured probes are highly and significantly enriched in almost all analyzed brain regions. Functional implications of these RNA structures and their role in the brain are discussed in detail along with specific examples. We observe that mRNAs with a structure prediction in their UTRs are enriched for binding, transport and localization gene ontology categories. In addition, after manual examination we observe agreement between RNA binding protein interaction sites near the 3’ UTR structures and correlated expression patterns. Conclusions Our results show a potential use for RNA structures in expressed coding as well as noncoding transcripts in the adult mouse brain, and describe the role of structured RNAs in the context of intracellular signaling pathways and regulatory networks. Based on this data we hypothesize that RNA structure is widely involved in transcriptional and translational regulatory mechanisms in the brain and ultimately plays a role in brain function. PMID:22651826

  16. The influence of ignoring secondary structure on divergence time estimates from ribosomal RNA genes.

    PubMed

    Dohrmann, Martin

    2014-02-01

    Genes coding for ribosomal RNA molecules (rDNA) are among the most popular markers in molecular phylogenetics and evolution. However, coevolution of sites that code for pairing regions (stems) in the RNA secondary structure can make it challenging to obtain accurate results from such loci. While the influence of ignoring secondary structure on multiple sequence alignment and tree topology has been investigated in numerous studies, its effect on molecular divergence time estimates is still poorly known. Here, I investigate this issue in Bayesian Markov Chain Monte Carlo (BMCMC) and penalized likelihood (PL) frameworks, using empirical datasets from dragonflies (Odonata: Anisoptera) and glass sponges (Porifera: Hexactinellida). My results indicate that highly biased inferences under substitution models that ignore secondary structure only occur if maximum-likelihood estimates of branch lengths are used as input to PL dating, whereas in a BMCMC framework and in PL dating based on Bayesian consensus branch lengths, the effect is far less severe. I conclude that accounting for coevolution of paired sites in molecular dating studies is not as important as previously suggested, as long as the estimates are based on Bayesian consensus branch lengths instead of ML point estimates. This finding is especially relevant for studies where computational limitations do not allow the use of secondary-structure specific substitution models, or where accurate consensus structures cannot be predicted. I also found that the magnitude and direction (over- vs. underestimating node ages) of bias in age estimates when secondary structure is ignored was not distributed randomly across the nodes of the phylogenies, a phenomenon that requires further investigation. Copyright © 2013 Elsevier Inc. All rights reserved.

  17. Predicting the transmembrane secondary structure of ligand-gated ion channels.

    PubMed

    Bertaccini, E; Trudell, J R

    2002-06-01

    Recent mutational analyses of ligand-gated ion channels (LGICs) have demonstrated a plausible site of anesthetic action within their transmembrane domains. Although there is a consensus that the transmembrane domain is formed from four membrane-spanning segments, the secondary structure of these segments is not known. We utilized 10 state-of-the-art bioinformatics techniques to predict the transmembrane topology of the tetrameric regions within six members of the LGIC family that are relevant to anesthetic action. They are the human forms of the GABA alpha 1 receptor, the glycine alpha 1 receptor, the 5HT3 serotonin receptor, the nicotinic AChR alpha 4 and alpha 7 receptors and the Torpedo nAChR alpha 1 receptor. The algorithms utilized were HMMTOP, TMHMM, TMPred, PHDhtm, DAS, TMFinder, SOSUI, TMAP, MEMSAT and TOPPred2. The resulting predictions were superimposed on to a multiple sequence alignment of the six amino acid sequences created using the CLUSTAL W algorithm. There was a clear statistical consensus for the presence of four alpha helices in those regions experimentally thought to span the membrane. The consensus of 10 topology prediction techniques supports the hypothesis that the transmembrane subunits of the LGICs are tetrameric bundles of alpha helices.
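The idea of superimposing several topology predictions and reading off a consensus can be sketched as a per-position vote. The abstract does not say how the consensus was scored, so the voting threshold, the "H"/"-" string encoding, and the toy predictions below are assumptions for illustration and do not reproduce the output of any of the ten named programs.

```python
def consensus_topology(predictions, min_votes):
    """Per-position vote over equal-length prediction strings.

    Each string marks a residue as 'H' (predicted membrane helix) or '-'.
    A position is 'H' in the consensus when at least min_votes predictors agree.
    """
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all predictions must have the same length")
    consensus = []
    for column in zip(*predictions):
        votes = sum(1 for state in column if state == "H")
        consensus.append("H" if votes >= min_votes else "-")
    return "".join(consensus)


def count_segments(topology):
    """Count maximal runs of 'H' in a topology string."""
    segments = 0
    previous = "-"
    for state in topology:
        if state == "H" and previous != "H":
            segments += 1
        previous = state
    return segments


# Three hypothetical predictors over a 12-residue toy sequence.
toy_predictions = [
    "--HHHH--HHH-",
    "-HHHH---HHHH",
    "--HHH----HH-",
]
toy_consensus = consensus_topology(toy_predictions, min_votes=2)
```

With real data the strings would first be mapped onto a shared multiple sequence alignment, as the abstract describes, so that each column refers to the same aligned position in every receptor.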

  18. RaptorX-Property: a web server for protein structure property prediction.

    PubMed

    Wang, Sheng; Li, Wei; Liu, Shiwang; Xu, Jinbo

    2016-07-08

    RaptorX Property (http://raptorx2.uchicago.edu/StructurePropertyPred/predict/) is a web server predicting structure property of a protein sequence without using any templates. It outperforms other servers, especially for proteins without close homologs in PDB or with very sparse sequence profile (i.e. carries little evolutionary information). This server employs a powerful in-house deep learning model DeepCNF (Deep Convolutional Neural Fields) to predict secondary structure (SS), solvent accessibility (ACC) and disorder regions (DISO). DeepCNF not only models complex sequence-structure relationship by a deep hierarchical architecture, but also interdependency between adjacent property labels. Our experimental results show that, tested on CASP10, CASP11 and the other benchmarks, this server can obtain ∼84% Q3 accuracy for 3-state SS, ∼72% Q8 accuracy for 8-state SS, ∼66% Q3 accuracy for 3-state solvent accessibility, and ∼0.89 area under the ROC curve (AUC) for disorder prediction. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
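    The Q3 figure quoted above is the fraction of residues whose 3-state label is predicted correctly. The sketch below shows that calculation; it is not RaptorX code. The 8-to-3 state reduction is one common convention (others exist), and the function names and example strings are hypothetical.

```python
# One common reduction of DSSP 8-state labels to 3 states (an assumption;
# other reductions are also in use): H, G, I -> H; E, B -> E; the rest -> C.
DSSP8_TO_3 = {"H": "H", "G": "H", "I": "H", "E": "E", "B": "E"}


def reduce_to_three_states(dssp: str) -> str:
    """Map an 8-state DSSP string onto helix (H), strand (E) and coil (C)."""
    return "".join(DSSP8_TO_3.get(label, "C") for label in dssp)


def q3_accuracy(predicted: str, observed: str) -> float:
    """Fraction of positions where the predicted 3-state label matches the observed one."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed strings must have equal length")
    if not observed:
        raise ValueError("cannot score an empty sequence")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# Hypothetical 17-residue example with two mislabeled positions.
observed = "CCHHHHHHCCEEEEECC"
predicted = "CCHHHHHCCCEEEECCC"
print(f"Q3 = {q3_accuracy(predicted, observed):.3f}")
```

    Q8 accuracy is computed the same way, but on the unreduced 8-state labels.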

  19. Au13(8e): A secondary block for describing a special group of liganded gold clusters containing icosahedral Au13 motifs

    NASA Astrophysics Data System (ADS)

    Xu, Wen Wu; Zeng, Xiao Cheng; Gao, Yi

    2017-05-01

    A grand unified model (GUM) has been proposed recently to understand structure anatomy and evolution of liganded gold clusters. In this work, besides the two types of elementary blocks (triangular Au3(2e) and tetrahedral Au4(2e)), we introduce a secondary block, namely, the icosahedral Au13 with 8e valence electrons, noted as Au13(8e). Using this secondary block, structural anatomy and evolution of a special group of liganded gold nanoclusters containing icosahedral Au13 motifs can be conveniently analyzed. In addition, a new ligand-protected cluster Au49(PR3)10(SR)15Cl2 is predicted to exhibit high chemical and thermal stability, suggesting likelihood of its synthesis in the laboratory.

  20. GeneSilico protein structure prediction meta-server.

    PubMed

    Kurowski, Michal A; Bujnicki, Janusz M

    2003-07-01

    Rigorous assessments of protein structure prediction have demonstrated that fold recognition methods can identify remote similarities between proteins when standard sequence search methods fail. It has been shown that the accuracy of predictions is improved when refined multiple sequence alignments are used instead of single sequences and if different methods are combined to generate a consensus model. There are several meta-servers available that integrate protein structure predictions performed by various methods, but they do not allow for submission of user-defined multiple sequence alignments and they seldom offer confidentiality of the results. We developed a novel WWW gateway for protein structure prediction, which combines the useful features of other meta-servers available, but with much greater flexibility of the input. The user may submit an amino acid sequence or a multiple sequence alignment to a set of methods for primary, secondary and tertiary structure prediction. Fold-recognition results (target-template alignments) are converted into full-atom 3D models and the quality of these models is uniformly assessed. A consensus between different FR methods is also inferred. The results are conveniently presented on-line on a single web page over a secure, password-protected connection. The GeneSilico protein structure prediction meta-server is freely available for academic users at http://genesilico.pl/meta.

  1. GeneSilico protein structure prediction meta-server

    PubMed Central

    Kurowski, Michal A.; Bujnicki, Janusz M.

    2003-01-01

    Rigorous assessments of protein structure prediction have demonstrated that fold recognition methods can identify remote similarities between proteins when standard sequence search methods fail. It has been shown that the accuracy of predictions is improved when refined multiple sequence alignments are used instead of single sequences and if different methods are combined to generate a consensus model. There are several meta-servers available that integrate protein structure predictions performed by various methods, but they do not allow for submission of user-defined multiple sequence alignments and they seldom offer confidentiality of the results. We developed a novel WWW gateway for protein structure prediction, which combines the useful features of other meta-servers available, but with much greater flexibility of the input. The user may submit an amino acid sequence or a multiple sequence alignment to a set of methods for primary, secondary and tertiary structure prediction. Fold-recognition results (target-template alignments) are converted into full-atom 3D models and the quality of these models is uniformly assessed. A consensus between different FR methods is also inferred. The results are conveniently presented on-line on a single web page over a secure, password-protected connection. The GeneSilico protein structure prediction meta-server is freely available for academic users at http://genesilico.pl/meta. PMID:12824313

  2. A sampling-based method for ranking protein structural models by integrating multiple scores and features.

    PubMed

    Shi, Xiaohu; Zhang, Jingfen; He, Zhiquan; Shang, Yi; Xu, Dong

    2011-09-01

    One of the major challenges in protein tertiary structure prediction is structure quality assessment. In many cases, protein structure prediction tools generate good structural models, but fail to select the best models from a huge number of candidates as the final output. In this study, we developed a sampling-based machine-learning method to rank protein structural models by integrating multiple scores and features. First, features such as predicted secondary structure, solvent accessibility and residue-residue contact information are integrated by two Radial Basis Function (RBF) models trained from different datasets. Then, the two RBF scores and five selected scoring functions developed by others, i.e., Opus-CA, Opus-PSP, DFIRE, RAPDF, and Cheng Score are synthesized by a sampling method. At last, another integrated RBF model ranks the structural models according to the features of sampling distribution. We tested the proposed method by using two different datasets, including the CASP server prediction models of all CASP8 targets and a set of models generated by our in-house software MUFOLD. The test result shows that our method outperforms any individual scoring function on both best model selection, and overall correlation between the predicted ranking and the actual ranking of structural quality.

  3. Building a Better Fragment Library for De Novo Protein Structure Prediction

    PubMed Central

    de Oliveira, Saulo H. P.; Shi, Jiye; Deane, Charlotte M.

    2015-01-01

    Fragment-based approaches are the current standard for de novo protein structure prediction. These approaches rely on accurate and reliable fragment libraries to generate good structural models. In this work, we describe a novel method for structure fragment library generation and its application in fragment-based de novo protein structure prediction. The importance of correct testing procedures in assessing the quality of fragment libraries is demonstrated. In particular, homologs to the target must be excluded from the libraries to correctly simulate a de novo protein structure prediction scenario, something which surprisingly is not always done. We demonstrate that fragments presenting different predominant predicted secondary structures should be treated differently during the fragment library generation step and that exhaustive and random search strategies should both be used. This information was used to develop a novel method, Flib. On a validation set of 41 structurally diverse proteins, Flib libraries present both a higher precision and coverage than two of the state-of-the-art methods, NNMake and HHFrag. Flib also achieves better precision and coverage on the set of 275 protein domains used in the two previous experiments of the Critical Assessment of Structure Prediction (CASP9 and CASP10). We compared Flib libraries against NNMake libraries in a structure prediction context. Of the 13 cases in which a correct answer was generated, Flib models were more accurate than NNMake models for 10. Flib is available for download at: http://www.stats.ox.ac.uk/research/proteins/resources. PMID:25901595

  4. Structure-activity relationships to estimate the effective Henry's law coefficients of organics of atmospheric interest

    NASA Astrophysics Data System (ADS)

    Raventos-Duran, Teresa; Valorso, Richard; Aumont, Bernard; Camredon, Marie

    2010-05-01

    The oxidation of volatile organic compounds emitted in the atmosphere involves complex reaction mechanisms which lead to the formation of oxygenated organic intermediates, usually denoted as secondary organics. The fate of these secondary organics remains poorly quantified due to a lack of information about their speciation, distribution and evolution in the gas and condensed phases. A significant fraction of secondary organics may dissolve into the tropospheric aqueous phase owing to the presence of polar moieties generated during the oxidation processes. The partitioning of organics between the gas and the aqueous atmospheric phases is usually described on the basis of Henry's law. Atmospheric models require a knowledge of the Henry's law coefficient (H) for every water soluble organic species described in the chemical mechanism. Methods that can predict reliable H values for the vast number of organic compounds are therefore required. We have compiled a data set of experimental Henry's law constants for compounds bearing functional groups of atmospheric relevance. This data set was then used to develop GROMHE, a structure activity relationship to predict H values based on a group contribution approach. We assessed its performance against two other available estimation methods. The results show that for all these methods the reliability of the estimates decreases with increasing solubility. We discuss differences between the methods and find that GROMHE has greater prediction ability.

  5. Multiphase flow predictions from carbonate pore space images using extracted network models

    NASA Astrophysics Data System (ADS)

    Al-Kharusi, Anwar S.; Blunt, Martin J.

    2008-06-01

    A methodology to extract networks from pore space images is used to make predictions of multiphase transport properties for subsurface carbonate samples. The extraction of the network model is based on the computation of the location and sizes of pores and throats to create a topological representation of the void space of three-dimensional (3-D) rock images, using the concept of maximal balls. In this work, we follow a multistaged workflow. We start with a 2-D thin-section image; convert it statistically into a 3-D representation of the pore space; extract a network model from this image; and finally, simulate primary drainage, waterflooding, and secondary drainage flow processes using a pore-scale simulator. We test this workflow for a reservoir carbonate rock. The network-predicted absolute permeability is similar to the core plug measured value and the value computed on the 3-D void space image using the lattice Boltzmann method. The predicted capillary pressure during primary drainage agrees well with a mercury-air experiment on a core sample, indicating that we have an adequate representation of the rock's pore structure. We adjust the contact angles in the network to match the measured waterflood and secondary drainage capillary pressures. We infer a significant degree of contact angle hysteresis. We then predict relative permeabilities for primary drainage, waterflooding, and secondary drainage that agree well with laboratory measured values. This approach can be used to predict multiphase transport properties when wettability and pore structure vary in a reservoir, where experimental data is scant or missing. There are shortfalls to this approach, however. We compare results from three networks, one of which was derived from a section of the rock containing vugs. Our method fails to predict properties reliably when an unrepresentative image is processed to construct the 3-D network model. This occurs when the image volume is not sufficient to represent the geological variations observed in a core plug sample.

  6. Structural changes and fluctuations of proteins. I. A statistical thermodynamic model.

    PubMed

    Ikegami, A

    1977-01-01

    A general theory of the structural changes and fluctuations of proteins has been proposed based on statistical thermodynamic considerations at the chain level. The "structure" of a protein was assumed to be characterized by the state of secondary bonds between unique pairs of specific sites on peptide chains. Every secondary bond changes between the bonded and unbonded states by thermal agitation and the "structure" is continuously fluctuating. The free energy of the "structural state" that is defined by the fraction of secondary bonds in the bonded state has been expressed by the bond energy, the cooperative interaction between bonds, the mixing entropy of bonds, and the entropy of polypeptide chains. The most probable "structural state" can be simply determined by graphical analysis and the effect of temperature or solvent composition on it is discussed. The temperature dependence of the free energy, the probability distribution of structural states and the specific heat have been calculated for two examples of structural change. The theory predicts two different types of structural changes from the ordered to disordered state, a "structural transition" and a "gradual structural change" with rising temperature. In the "structural transition", the probability distribution has two maxima in the temperature range of transition. In the "gradual structural change", the probability distribution has only one maximum during the change. A considerable fraction of secondary bonds is in the unbonded state and is always fluctuating even in the ordered state at room temperature. Such structural fluctuations in a single protein molecule have been discussed quantitatively. The theory is extended to include small molecules which bind to the protein molecule and affect the structural state. The changes of structural state caused by specific and non-specific binding and allosteric effects are explained in a unified manner.

  7. Phylogenetic Reconstruction of the Calosphaeriales and Togniniales Using Five Genes and Predicted RNA Secondary Structures of ITS, and Flabellascus tenuirostris gen. et sp. nov.

    PubMed Central

    Réblová, Martina; Jaklitsch, Walter M.; Réblová, Kamila; Štěpánek, Václav

    2015-01-01

    The Calosphaeriales is revisited with new collection data, living cultures, morphological studies of ascoma centrum, secondary structures of the internal transcribed spacer (ITS) rDNA and phylogeny based on novel DNA sequences of five nuclear ribosomal and protein-coding loci. Morphological features, molecular evidence and information from predicted RNA secondary structures of ITS converged upon robust phylogenies of the Calosphaeriales and Togniniales. The current concept of the Calosphaeriales includes the Calosphaeriaceae and Pleurostomataceae encompassing five monophyletic genera, Calosphaeria, Flabellascus gen. nov., Jattaea, Pleurostoma and Togniniella, strongly supported by Bayesian and Maximum Likelihood methods. The structural elements of ITS1 form characteristic patterns that are phylogenetically conserved, corroborate observations based on morphology and have a high predictive value at the generic level. Three major clades containing 44 species of Phaeoacremonium were recovered in the closely related Togniniales based on ITS, actin and β-tubulin sequences. They are newly characterized by sexual and RNA structural characters and ecology. This approach is a first step towards understanding of the molecular systematics of Phaeoacremonium and possibly its new classification. In the Calosphaeriales, Jattaea aphanospora sp. nov. and J. ribicola sp. nov. are introduced, Calosphaeria taediosa is combined in Jattaea and epitypified. The sexual morph of Phaeoacremonium cinereum was encountered for the first time on decaying wood and obtained in vitro. In order to achieve a single nomenclature, the genera of asexual morphs linked with the Calosphaeriales are transferred to synonymy of their sexual morphs following the principle of priority, i.e. Calosphaeriophora to Calosphaeria, Phaeocrella to Togniniella and Pleurostomophora to Pleurostoma. Three new combinations are proposed, i.e. Pleurostoma ochraceum comb. nov., P. repens comb. nov. and P. richardsiae comb. nov. The morphology-based key is provided to facilitate identification of genera accepted in the Calosphaeriales. PMID:26699541

  8. Virtual Screening of Receptor Sites for Molecularly Imprinted Polymers.

    PubMed

    Bates, Ferdia; Cela-Pérez, María Concepción; Karim, Kal; Piletsky, Sergey; López-Vilariño, José Manuel

    2016-08-01

    Molecularly Imprinted Polymers (MIPs) are highly advantageous in the field of analytical chemistry. However, interference from secondary molecules can also impede capture of a target by a MIP receptor. This greatly complicates the design process and often requires extensive laboratory screening which is time consuming, costly, and creates substantial waste products. Herein is presented a new technique for screening of "virtually imprinted receptors" for rebinding of the molecular template as well as secondary structures, correlating the virtual predictions with experimentally acquired data in three case studies. This novel technique is particularly applicable to the evaluation and prediction of MIP receptor specificity and efficiency in complex aqueous systems. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  9. Rational Design and Tuning of Functional RNA Switch to Control an Allosteric Intermolecular Interaction.

    PubMed

    Endoh, Tamaki; Sugimoto, Naoki

    2015-08-04

    Conformational transitions of biomolecules in response to specific stimuli control many biological processes. In natural functional RNA switches, often called riboswitches, a particular RNA structure that has a suppressive or facilitative effect on gene expression transitions to an alternative structure with the opposite effect upon binding of a specific metabolite to the aptamer region. Stability of RNA secondary structure (-ΔG°) can be predicted based on thermodynamic parameters and is easily tuned by changes in nucleobases. We envisioned that tuning of a functional RNA switch that causes an allosteric interaction between an RNA and a peptide would be possible based on a predicted switching energy (ΔΔG°) that corresponds to the energy difference between the RNA secondary structure before (-ΔG°before) and after (-ΔG°after) the RNA conformational transition. We first selected functional RNA switches responsive to neomycin with predicted ΔΔG° values ranging from 5.6 to 12.2 kcal mol(-1). We then demonstrated a simple strategy to rationally convert the functional RNA switch to switches responsive to natural metabolites thiamine pyrophosphate, S-adenosyl methionine, and adenine based on the predicted ΔΔG° values. The ΔΔG° values of the designed RNA switches proportionally correlated with interaction energy (ΔG°interaction) between the RNA and peptide, and we were able to tune the sensitivity of the RNA switches for the trigger molecule. The strategy demonstrated here will be generally applicable for construction of functional RNA switches and biosensors in which mechanisms are based on conformational transition of nucleic acids.

  10. SwiSpot: modeling riboswitches by spotting out switching sequences.

    PubMed

    Barsacchi, Marco; Novoa, Eva Maria; Kellis, Manolis; Bechini, Alessio

    2016-11-01

    Riboswitches are cis-regulatory elements in mRNA, mostly found in Bacteria, which exhibit two main secondary structure conformations. Although one of them prevents the gene from being expressed, the other conformation allows its expression, and this switching process is typically driven by the presence of a specific ligand. Although there are a handful of known riboswitches, our knowledge in this field has been greatly limited due to our inability to identify their alternate structures from their sequences. Indeed, current methods are not able to predict the presence of the two functionally distinct conformations just from the knowledge of the plain RNA nucleotide sequence. Whether this would be possible, for which cases, and what prediction accuracy can be achieved, are currently open questions. Here we show that the two alternate secondary structures of riboswitches can be accurately predicted once the 'switching sequence' of the riboswitch has been properly identified. The proposed SwiSpot approach is capable of identifying the switching sequence inside a putative, complete riboswitch sequence, on the basis of pairing behaviors, which are evaluated on proper sets of configurations. Moreover, it is able to model the switching behavior of riboswitches whose generated ensemble covers both alternate configurations. Beyond structural predictions, the approach can also be paired to homology-based riboswitch searches. SwiSpot software, along with the reference dataset files, is available at: http://www.iet.unipi.it/a.bechini/swispot/. Supplementary information: Supplementary data are available at Bioinformatics online. Contact: a.bechini@ing.unipi.it. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  11. Prediction of Protein Structural Classes for Low-Similarity Sequences Based on Consensus Sequence and Segmented PSSM.

    PubMed

    Liang, Yunyun; Liu, Sanyang; Zhang, Shengli

    2015-01-01

    Prediction of protein structural classes for low-similarity sequences is useful for understanding fold patterns, regulation, functions, and interactions of proteins. It is well known that feature extraction is significant to prediction of protein structural class and it mainly uses protein primary sequence, predicted secondary structure sequence, and position-specific scoring matrix (PSSM). Currently, prediction solely based on the PSSM has played a key role in improving the prediction accuracy. In this paper, we propose a novel method called CSP-SegPseP-SegACP by fusing consensus sequence (CS), segmented PsePSSM, and segmented autocovariance transformation (ACT) based on PSSM. Three widely used low-similarity datasets (1189, 25PDB, and 640) are adopted in this paper. Then a 700-dimensional (700D) feature vector is constructed and the dimension is decreased to 224D by using principal component analysis (PCA). To verify the performance of our method, rigorous jackknife cross-validation tests are performed on the 1189, 25PDB, and 640 datasets. Comparison of our results with the existing PSSM-based methods demonstrates that our method achieves favorable and competitive performance. This will offer an important complement to other PSSM-based methods for prediction of protein structural classes for low-similarity sequences.

  12. Novel Approach to Classify Plants Based on Metabolite-Content Similarity.

    PubMed

    Liu, Kang; Abdullah, Azian Azamimi; Huang, Ming; Nishioka, Takaaki; Altaf-Ul-Amin, Md; Kanaya, Shigehiko

    2017-01-01

    Secondary metabolites are bioactive substances with diverse chemical structures. Depending on the ecological environment within which they are living, higher plants use different combinations of secondary metabolites for adaptation (e.g., defense against attacks by herbivores or pathogenic microbes). This suggests that the similarity in metabolite content is applicable to assess phylogenic similarity of higher plants. However, such a chemical taxonomic approach has limitations of incomplete metabolomics data. We propose an approach for successfully classifying 216 plants based on their known incomplete metabolite content. Structurally similar metabolites have been clustered using the network clustering algorithm DPClus. Plants have been represented as binary vectors, implying relations with structurally similar metabolite groups, and classified using Ward's method of hierarchical clustering. Despite incomplete data, the resulting plant clusters are consistent with the known evolutional relations of plants. This finding reveals the significance of metabolite content as a taxonomic marker. We also discuss the predictive power of metabolite content in exploring nutritional and medicinal properties in plants. As a byproduct of our analysis, we could predict some currently unknown species-metabolite relations.

  13. Novel Approach to Classify Plants Based on Metabolite-Content Similarity

    PubMed Central

    Abdullah, Azian Azamimi; Huang, Ming; Nishioka, Takaaki

    2017-01-01

    Secondary metabolites are bioactive substances with diverse chemical structures. Depending on the ecological environment within which they are living, higher plants use different combinations of secondary metabolites for adaptation (e.g., defense against attacks by herbivores or pathogenic microbes). This suggests that the similarity in metabolite content is applicable to assess phylogenic similarity of higher plants. However, such a chemical taxonomic approach has limitations of incomplete metabolomics data. We propose an approach for successfully classifying 216 plants based on their known incomplete metabolite content. Structurally similar metabolites have been clustered using the network clustering algorithm DPClus. Plants have been represented as binary vectors, implying relations with structurally similar metabolite groups, and classified using Ward's method of hierarchical clustering. Despite incomplete data, the resulting plant clusters are consistent with the known evolutional relations of plants. This finding reveals the significance of metabolite content as a taxonomic marker. We also discuss the predictive power of metabolite content in exploring nutritional and medicinal properties in plants. As a byproduct of our analysis, we could predict some currently unknown species-metabolite relations. PMID:28164123

  14. The ViennaRNA web services.

    PubMed

    Gruber, Andreas R; Bernhart, Stephan H; Lorenz, Ronny

    2015-01-01

    The ViennaRNA package is a widely used collection of programs for thermodynamic RNA secondary structure prediction. Over the years, many additional tools have been developed building on the core programs of the package to also address issues related to noncoding RNA detection, RNA folding kinetics, or efficient sequence design considering RNA-RNA hybridizations. The ViennaRNA web services provide easy and user-friendly web access to these tools. This chapter describes how to use this online platform to perform tasks such as prediction of minimum free energy structures, prediction of RNA-RNA hybrids, or noncoding RNA detection. The ViennaRNA web services can be used free of charge and can be accessed via http://rna.tbi.univie.ac.at.

  15. Amino acid sequence analysis of the annexin super-gene family of proteins.

    PubMed

    Barton, G J; Newman, R H; Freemont, P S; Crumpton, M J

    1991-06-15

    The annexins are a widespread family of calcium-dependent membrane-binding proteins. No common function has been identified for the family and, until recently, no crystallographic data existed for an annexin. In this paper we draw together 22 available annexin sequences consisting of 88 similar repeat units, and apply the techniques of multiple sequence alignment, pattern matching, secondary structure prediction and conservation analysis to the characterisation of the molecules. The analysis clearly shows that the repeats cluster into four distinct families and that greatest variation occurs within the repeat 3 units. Multiple alignment of the 88 repeats shows amino acids with conserved physicochemical properties at 22 positions, with only Gly at position 23 being absolutely conserved in all repeats. Secondary structure prediction techniques identify five conserved helices in each repeat unit and patterns of conserved hydrophobic amino acids are consistent with one face of a helix packing against the protein core in predicted helices a, c, d, e. Helix b is generally hydrophobic in all repeats, but contains a striking pattern of repeat-specific residue conservation at position 31, with Arg in repeats 4 and Glu in repeats 2, but unconserved amino acids in repeats 1 and 3. This suggests repeats 2 and 4 may interact via a buried salt bridge. The loop between predicted helices a and b of repeat 3 shows features distinct from the equivalent loop in repeats 1, 2 and 4, suggesting an important structural and/or functional role for this region. No compelling evidence emerges from this study for uteroglobin and the annexins sharing similar tertiary structures, or for uteroglobin representing a derivative of a primordial one-repeat structure that underwent duplication to give the present day annexins. The analyses performed in this paper are re-evaluated in the Appendix, in the light of the recently published X-ray structure for human annexin V. The structure confirms most of the predictions and shows the power of techniques for the determination of tertiary structural information from the amino acid sequences of an aligned protein family.

  16. Design, implementation and evaluation of a practical pseudoknot folding algorithm based on thermodynamics

    PubMed Central

    Reeder, Jens; Giegerich, Robert

    2004-01-01

    Background: The general problem of RNA secondary structure prediction under the widely used thermodynamic model is known to be NP-complete when the structures considered include arbitrary pseudoknots. For restricted classes of pseudoknots, several polynomial time algorithms have been designed, where the O(n^6) time and O(n^4) space algorithm by Rivas and Eddy is currently the best available program. Results: We introduce the class of canonical simple recursive pseudoknots and present an algorithm that requires O(n^4) time and O(n^2) space to predict the energetically optimal structure of an RNA sequence, possibly containing such pseudoknots. Evaluation against a large collection of known pseudoknotted structures shows the adequacy of the canonization approach and our algorithm. Conclusions: RNA pseudoknots of medium size can now be predicted reliably as well as efficiently by the new algorithm. PMID:15294028
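    What separates a pseudoknotted structure from an ordinary nested one is that at least two base pairs cross: for pairs (i, j) and (k, l), i < k < j < l. The sketch below only checks that condition on a list of base pairs. It is not the folding algorithm of the paper, and the function name and example pair lists are hypothetical.

```python
from typing import Iterable, Tuple


def has_pseudoknot(pairs: Iterable[Tuple[int, int]]) -> bool:
    """Return True if any two base pairs cross, i.e. i < k < j < l
    for pairs (i, j) and (k, l) given as sequence positions."""
    # Normalize each pair to (5' index, 3' index) and sort by the 5' index,
    # so every later pair opens at or after the current one.
    ordered = sorted((min(i, j), max(i, j)) for i, j in pairs)
    for idx, (i, j) in enumerate(ordered):
        for k, l in ordered[idx + 1:]:
            if i < k < j < l:  # (k, l) opens inside (i, j) but closes outside it
                return True
    return False


# A plain hairpin: all pairs are nested, so nothing crosses.
hairpin = [(0, 9), (1, 8), (2, 7)]
# An H-type arrangement: the second stem opens inside the first and closes after it.
h_type = [(0, 10), (1, 9), (5, 15), (6, 14)]

print(has_pseudoknot(hairpin))
print(has_pseudoknot(h_type))
```

    The pairwise check is quadratic in the number of base pairs, which is adequate for inspecting a single predicted structure.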

  17. IMG-ABC: A Knowledge Base To Fuel Discovery of Biosynthetic Gene Clusters and Novel Secondary Metabolites.

    PubMed

    Hadjithomas, Michalis; Chen, I-Min Amy; Chu, Ken; Ratner, Anna; Palaniappan, Krishna; Szeto, Ernest; Huang, Jinghua; Reddy, T B K; Cimermančič, Peter; Fischbach, Michael A; Ivanova, Natalia N; Markowitz, Victor M; Kyrpides, Nikos C; Pati, Amrita

    2015-07-14

    In the discovery of secondary metabolites, analysis of sequence data is a promising exploration path that remains largely underutilized due to the lack of computational platforms that enable such a systematic approach on a large scale. In this work, we present IMG-ABC (https://img.jgi.doe.gov/abc), an atlas of biosynthetic gene clusters within the Integrated Microbial Genomes (IMG) system, which is aimed at harnessing the power of "big" genomic data for discovering small molecules. IMG-ABC relies on IMG's comprehensive integrated structural and functional genomic data for the analysis of biosynthetic gene clusters (BCs) and associated secondary metabolites (SMs). SMs and BCs serve as the two main classes of objects in IMG-ABC, each with a rich collection of attributes. A unique feature of IMG-ABC is the incorporation of both experimentally validated and computationally predicted BCs in genomes as well as metagenomes, thus identifying BCs in uncultured populations and rare taxa. We demonstrate the strength of IMG-ABC's focused integrated analysis tools in enabling the exploration of microbial secondary metabolism on a global scale, through the discovery of phenazine-producing clusters for the first time in Alphaproteobacteria. IMG-ABC strives to fill the long-existent void of resources for computational exploration of the secondary metabolism universe; its underlying scalable framework enables traversal of uncovered phylogenetic and chemical structure space, serving as a doorway to a new era in the discovery of novel molecules. IMG-ABC is the largest publicly available database of predicted and experimental biosynthetic gene clusters and the secondary metabolites they produce. The system also includes powerful search and analysis tools that are integrated with IMG's extensive genomic/metagenomic data and analysis tool kits. 
    As new research on biosynthetic gene clusters and secondary metabolites is published and more genomes are sequenced, IMG-ABC will continue to expand, with the goal of becoming an essential component of any bioinformatic exploration of the secondary metabolism world. Copyright © 2015 Hadjithomas et al.

  18. Aromatic claw: A new fold with high aromatic content that evades structural prediction

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sachleben, Joseph R.; Adhikari, Aashish N.; Gawlak, Grzegorz

    2016-11-10

    We determined the NMR structure of a highly aromatic (13%) protein of unknown function, Aq1974 from Aquifex aeolicus (PDB ID: 5SYQ). The unusual sequence of this protein has a tryptophan content five times the normal (six tryptophan residues of 114 or 5.2% while the average tryptophan content is 1.0%) with the tryptophans occurring in a WXW motif. It has no detectable sequence homology with known protein structures. Although its NMR spectrum suggested that the protein was rich in β-sheet, upon resonance assignment and solution structure determination, the protein was found to be primarily α-helical with a small two-stranded β-sheet with a novel fold that we have termed an Aromatic Claw. As this fold was previously unknown and the sequence unique, we submitted the sequence to CASP10 as a target for blind structural prediction. At the end of the competition, the sequence was classified a hard template based model; the structural relationship between the template and the experimental structure was small and the predictions all failed to predict the structure. CSRosetta was found to predict the secondary structure and its packing; however, it was found that there was little correlation between CSRosetta score and the RMSD between the CSRosetta structure and the NMR determined one. This work demonstrates that even in relatively small proteins, we do not yet have the capacity to accurately predict the fold for all primary sequences. The experimental discovery of new folds helps guide the improvement of structural prediction methods.

  19. Identification of novel RNA secondary structures within the hepatitis C virus genome reveals a cooperative involvement in genome packaging

    PubMed Central

    Stewart, H.; Bingham, R.J.; White, S. J.; Dykeman, E. C.; Zothner, C.; Tuplin, A. K.; Stockley, P. G.; Twarock, R.; Harris, M.

    2016-01-01

    The specific packaging of the hepatitis C virus (HCV) genome is hypothesised to be driven by Core-RNA interactions. To identify the regions of the viral genome involved in this process, we used SELEX (systematic evolution of ligands by exponential enrichment) to identify RNA aptamers which bind specifically to Core in vitro. Comparison of these aptamers to multiple HCV genomes revealed the presence of a conserved terminal loop motif within short RNA stem-loop structures. We postulated that interactions of these motifs, as well as sub-motifs which were present in HCV genomes at statistically significant levels, with the Core protein may drive virion assembly. We mutated 8 of these predicted motifs within the HCV infectious molecular clone JFH-1, thereby producing a range of mutant viruses predicted to possess altered RNA secondary structures. RNA replication and viral titre were unaltered in viruses possessing only one mutated structure. However, infectivity titres were decreased in viruses possessing a higher number of mutated regions. This work thus identified multiple novel RNA motifs which appear to contribute to genome packaging. We suggest that these structures act as cooperative packaging signals to drive specific RNA encapsidation during HCV assembly. PMID:26972799

  20. Fuzzy cluster analysis of simple physicochemical properties of amino acids for recognizing secondary structure in proteins.

    PubMed Central

    Mocz, G.

    1995-01-01

    Fuzzy cluster analysis has been applied to the 20 amino acids by using 65 physicochemical properties as a basis for classification. The clustering products, the fuzzy sets (i.e., classical sets with associated membership functions), have provided a new measure of amino acid similarities for use in protein folding studies. This work demonstrates that fuzzy sets of simple molecular attributes, when assigned to amino acid residues in a protein's sequence, can predict the secondary structure of the sequence with reasonable accuracy. An approach is presented for discriminating standard folding states, using near-optimum information splitting in half-overlapping segments of the sequence of assigned membership functions. The method is applied to a nonredundant set of 252 proteins and yields approximately 73% matching for correctly predicted and correctly rejected residues with approximately 60% overall success rate for the correctly recognized ones in three folding states: alpha-helix, beta-strand, and coil. The most useful attributes for discriminating these states appear to be related to size, polarity, and thermodynamic factors. Van der Waals volume, apparent average thickness of surrounding molecular free volume, and a measure of dimensionless surface electron density can explain approximately 95% of prediction results. Hydrogen bonding and hydrophobicity indices do not yet enable clear clustering and prediction. PMID:7549882

  1. In silico study of breast cancer associated gene 3 using LION Target Engine and other tools.

    PubMed

    León, Darryl A; Cànaves, Jaume M

    2003-12-01

    Sequence analysis of individual targets is an important step in annotation and validation. As a test case, we investigated human breast cancer associated gene 3 (BCA3) with LION Target Engine and with other bioinformatics tools. LION Target Engine confirmed that the BCA3 gene is located on 11p15.4 and that the two most likely splice variants (lacking exon 3 and exons 3 and 5, respectively) exist. Based on our manual curation of sequence data, it is proposed that an additional variant (missing only exon 5) published in a public sequence repository, is a prediction artifact. A significant number of new orthologs were also identified, and these were the basis for a high-quality protein secondary structure prediction. Moreover, our research confirmed several distinct functional domains as described in earlier reports. Sequence conservation from multiple sequence alignments, splice variant identification, secondary structure predictions, and predicted phosphorylation sites suggest that the removal of interaction sites through alternative splicing might play a modulatory role in BCA3. This in silico approach shows the depth and relevance of an analysis that can be accomplished by including a variety of publicly available tools with an integrated and customizable life science informatics platform.

  2. UNRES server for physics-based coarse-grained simulations and prediction of protein structure, dynamics and thermodynamics.

    PubMed

    Czaplewski, Cezary; Karczynska, Agnieszka; Sieradzan, Adam K; Liwo, Adam

    2018-04-30

    A server implementation of the UNRES package (http://www.unres.pl) for coarse-grained simulations of protein structures with the physics-based UNRES model, named the UNRES server, is presented. In contrast to most of the protein coarse-grained models, owing to its physics-based origin, the UNRES force field can be used in simulations, including those aimed at protein-structure prediction, without ancillary information from structural databases; however, the implementation includes the possibility of using restraints. Local energy minimization, canonical molecular dynamics simulations, replica exchange and multiplexed replica exchange molecular dynamics simulations can be run with the current UNRES server; the latter are suitable for protein-structure prediction. The user-supplied input includes protein sequence and, optionally, restraints from secondary-structure prediction or small-angle X-ray scattering data, and simulation type and parameters which are selected or typed in. Oligomeric proteins, as well as those containing D-amino-acid residues and disulfide links can be treated. The output is displayed graphically (minimized structures, trajectories, final models, analysis of trajectory/ensembles); however, all output files can be downloaded by the user. The UNRES server can be freely accessed at http://unres-server.chem.ug.edu.pl.

  3. Using RNA Sequence and Structure for the Prediction of Riboswitch Aptamer: A Comprehensive Review of Available Software and Tools

    PubMed Central

    Antunes, Deborah; Jorge, Natasha A. N.; Caffarena, Ernesto R.; Passetti, Fabio

    2018-01-01

    RNA molecules are essential players in many fundamental biological processes. Prokaryotes and eukaryotes have distinct RNA classes with specific structural features and functional roles. Computational prediction of protein structures is a research field in which high confidence three-dimensional protein models can be proposed based on the sequence alignment between target and templates. However, to date, only a few approaches have been developed for the computational prediction of RNA structures. Similar to proteins, RNA structures may be altered due to the interaction with various ligands, including proteins, other RNAs, and metabolites. A riboswitch is a molecular mechanism, found in the three kingdoms of life, in which the RNA structure is modified by the binding of a metabolite. It can regulate multiple gene expression mechanisms, such as transcription, translation initiation, and mRNA splicing and processing. Due to their nature, these entities also act on the regulation of gene expression and detection of small metabolites and have the potential to help in the discovery of new classes of antimicrobial agents. In this review, we describe software and web servers currently available for riboswitch aptamer identification and secondary and tertiary structure prediction, including applications. PMID:29403526

  4. An Evolution-Based Approach to De Novo Protein Design and Case Study on Mycobacterium tuberculosis

    PubMed Central

    Brender, Jeffrey R.; Czajka, Jeff; Marsh, David; Gray, Felicia; Cierpicki, Tomasz; Zhang, Yang

    2013-01-01

    Computational protein design is a reverse procedure of protein folding and structure prediction, where constructing structures from evolutionarily related proteins has been demonstrated to be the most reliable method for protein 3-dimensional structure prediction. Following this spirit, we developed a novel method to design new protein sequences based on evolutionarily related protein families. For a given target structure, a set of proteins having similar fold are identified from the PDB library by structural alignments. A structural profile is then constructed from the protein templates and used to guide the conformational search of amino acid sequence space, where physicochemical packing is accommodated by single-sequence based solvation, torsion angle, and secondary structure predictions. The method was tested on a computational folding experiment based on a large set of 87 protein structures covering different fold classes, which showed that the evolution-based design significantly enhances the foldability and biological functionality of the designed sequences compared to the traditional physics-based force field methods. Without using homologous proteins, the designed sequences can be folded with an average root-mean-square-deviation of 2.1 Å to the target. As a case study, the method is extended to redesign all 243 structurally resolved proteins in the pathogenic bacteria Mycobacterium tuberculosis, which is the second leading cause of death from infectious disease. On a smaller scale, five sequences were randomly selected from the design pool and subjected to experimental validation. The results showed that all the designed proteins are soluble with distinct secondary structure and three have well ordered tertiary structure, as demonstrated by circular dichroism and NMR spectroscopy. 
    Together, these results demonstrate a new avenue in computational protein design that uses knowledge of evolutionary conservation from protein structural families to engineer new protein molecules of improved fold stability and biological functionality. PMID:24204234

  5. Prediction of protein tertiary structure to low resolution: performance for a large and structurally diverse test set.

    PubMed

    Eyrich, V A; Standley, D M; Friesner, R A

    1999-05-14

    We report the tertiary structure predictions for 95 proteins ranging in size from 17 to 160 residues starting from known secondary structure. Predictions are obtained from global minimization of an empirical potential function followed by the application of a refined atomic overlap potential. The minimization strategy employed represents a variant of the Monte Carlo plus minimization scheme of Li and Scheraga applied to a reduced model of the protein chain. For all of the cases except beta-proteins larger than 75 residues, a native-like structure, usually 4-6 Å root-mean-square deviation from the native, is located. For beta-proteins larger than 75 residues, the energy gap between native-like structures and the lowest energy structures produced in the simulation is large, so that low RMSD structures are not generated starting from an unfolded state. This is attributed to the lack of an explicit hydrogen bond term in the potential function, which we hypothesize is necessary to stabilize large assemblies of beta-strands. Copyright 1999 Academic Press.
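
    The "4-6 Å root-mean-square deviation" figure quoted above can be made concrete with a minimal sketch of the RMSD calculation. This is an illustration only, not the authors' code: it assumes the two coordinate sets are already optimally superimposed (no alignment step is performed), and the function name `rmsd` and the toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z) points.

    Assumption: the structures are already superimposed; no fitting is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Accumulate the squared distance between matching atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy example: a three-atom "model" shifted by 3 Å along y relative to the "native".
native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model = [(0.0, 3.0, 0.0), (3.8, 3.0, 0.0), (7.6, 3.0, 0.0)]
print(rmsd(native, model))  # 3.0
```

    In practice an optimal rigid-body superposition would normally precede this step; it is omitted here to keep the sketch short.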

  6. Cryptic tRNAs in chaetognath mitochondrial genomes.

    PubMed

    Barthélémy, Roxane-Marie; Seligmann, Hervé

    2016-06-01

    The chaetognaths constitute a small and enigmatic phylum of little marine invertebrates. Both nuclear and mitochondrial genomes have numerous originalities, some phylum-specific. Until recently, their mitogenomes seemed to contain only one tRNA gene (trnMet), but a recent study found two and four tRNA genes in two chaetognath mitogenomes. Moreover, apparently two conspecific mitogenomes have different tRNA gene numbers (one and two). Reanalyses by the tRNAscan-SE and ARWEN software of the five available complete chaetognath mitogenomes suggest numerous additional tRNA genes of different types. Their total number never reaches the 22 found in most other invertebrates using that genetic code. Predicted error compensation between codon-anticodon mismatch and tRNA misacylation suggests translational activity by tRNAs predicted solely according to secondary structure for tRNAs predicted by tRNAscan-SE, not ARWEN. Numbers of predicted stop-suppressor (antitermination) tRNAs coevolve with predicted overlapping, frameshifted protein coding genes including stop codons. Sequence alignments in secondary structure prediction with non-chaetognath tRNAs suggest that the most likely functional tRNAs are in intergenic regions, as regular mt-tRNAs. Due to usually short intergenic regions, tRNA sequences generally overlap partially with flanking genes. Some tRNA pairs seem templated by sense-antisense strands. Moreover, 16S rRNA genes, but not 12S rRNAs, appear as tRNA nurseries, as previously suggested for multifunctional ribosomal-like protogenomes. Copyright © 2016 Elsevier Ltd. All rights reserved.

  7. Towards Long-Range RNA Structure Prediction in Eukaryotic Genes.

    PubMed

    Pervouchine, Dmitri D

    2018-06-15

    The ability to form an intramolecular structure plays a fundamental role in eukaryotic RNA biogenesis. Proximate regions in the primary transcripts fold into a local secondary structure, which is then hierarchically assembled into a tertiary structure that is stabilized by RNA-binding proteins and long-range intramolecular base pairings. While the local RNA structure can be predicted reasonably well for short sequences, long-range structure at the scale of eukaryotic genes remains problematic from the computational standpoint. The aim of this review is to list functional examples of long-range RNA structures, to summarize current comparative methods of structure prediction, and to highlight their advances and limitations in the context of long-range RNA structures. Most comparative methods implement the “first-align-then-fold” principle, i.e., they operate on multiple sequence alignments, while functional RNA structures often reside in non-conserved parts of the primary transcripts. The opposite “first-fold-then-align” approach is currently explored to a much lesser extent. Developing novel methods in both directions will improve the performance of comparative RNA structure analysis and help discover novel long-range structures, their higher-order organization, and RNA–RNA interactions across the transcriptome.

  8. Core-satellite species hypothesis and native versus exotic species in secondary succession

    USGS Publications Warehouse

    Martinez, Kelsey A.; Gibson, David J.; Middleton, Beth A.

    2015-01-01

    A number of hypotheses exist to explain species’ distributions in a landscape, but these hypotheses are not frequently utilized to explain the differences in native and exotic species distributions. The core-satellite species (CSS) hypothesis predicts species occupancy will be bimodally distributed, i.e., many species will be common and many species will be rare, but does not explicitly consider exotic species distributions. The parallel dynamics (PD) hypothesis predicts that regional occurrence patterns of exotic species will be similar to native species. Together, the CSS and PD hypotheses may increase our understanding of exotic species’ distribution relative to natives. We selected an old field undergoing secondary succession to study the CSS and PD hypotheses in conjunction with each other. The ratio of exotic to native species (richness and abundance) was observed through 17 years of secondary succession. We predicted species would be bimodally distributed and that exotic:native species ratios would remain steady or decrease through time under frequent disturbance. In contrast to the CSS and PD hypotheses, native species occupancies were not bimodally distributed at the site, but exotic species were. The exotic:native species ratios for both richness (E:Nrichness) and abundance (E:Ncover) generally decreased or remained constant throughout, supporting the PD hypothesis. Our results suggest exotic species exhibit metapopulation structure in old field landscapes, but that metapopulation structures of native species are disrupted, perhaps because these species are dispersal limited in the fragmented landscape.

  9. GalaxyGPCRloop: Template-Based and Ab Initio Structure Sampling of the Extracellular Loops of G-Protein-Coupled Receptors.

    PubMed

    Won, Jonghun; Lee, Gyu Rie; Park, Hahnbeom; Seok, Chaok

    2018-06-07

    The second extracellular loops (ECL2s) of G-protein-coupled receptors (GPCRs) are often involved in GPCR functions, and their structures have important implications in drug discovery. However, structure prediction of ECL2 is difficult because of its long length and the structural diversity among different GPCRs. In this study, a new ECL2 conformational sampling method involving both template-based and ab initio sampling was developed. Inspired by the observation of similar ECL2 structures of closely related GPCRs, a template-based sampling method employing loop structure templates selected from the structure database was developed. A new metric for evaluating similarity of the target loop to templates was introduced for template selection. An ab initio loop sampling method was also developed to treat cases without highly similar templates. The ab initio method is based on the previously developed fragment assembly and loop closure method. A new sampling component that takes advantage of secondary structure prediction was added. In addition, a conserved disulfide bridge restraining ECL2 conformation was predicted and analytically incorporated into sampling, reducing the effective dimension of the conformational search space. The sampling method was combined with an existing energy function for comparison with previously reported loop structure prediction methods, and the benchmark test demonstrated outstanding performance.

  10. Dynamic/Jitter Assessment of Multiple Potential HabEx Structural Designs

    NASA Technical Reports Server (NTRS)

    Knight, J. Brent; Stahl, H. Philip; Singleton, Andrew William; Hunt, Ronald A.; Therrell, Melissa F.; Caldwell, Mary Kathryn; Garcia, Jay Clarke

    2017-01-01

    The 2020 Decadal Survey in Astronomy and Astrophysics will assess candidate large missions to follow the James Webb Space Telescope (JWST) and the Wide Field Infrared Survey Telescope (WFIRST). One candidate mission is the Habitable ExoPlanet Imaging Mission (HabEx). This presentation describes two HabEx structural designs and results from structural dynamic analyses performed to predict Primary Mirror (PM) to Secondary Mirror (SM) Line of Sight (LOS) stability (jitter) due to Reaction Wheel Assembly (RWA) vibrations.

  11. Bioinformatics study of the mangrove actin genes

    NASA Astrophysics Data System (ADS)

    Basyuni, M.; Wasilah, M.; Sumardi

    2017-01-01

    This study describes bioinformatics methods used to analyze eight actin genes from mangrove plants deposited in DDBJ/EMBL/GenBank and to predict their structure, composition, subcellular localization, similarity, and phylogeny. The physical and chemical properties of the eight mangrove actin genes varied among the genes. The secondary structure content followed the order α-helix > random coil > extended chain structure for BgActl, KcActl, RsActl, and A. corniculatum Act. In contrast, the remaining actin genes followed the order random coil > extended chain structure > α-helix. The secondary structure prediction therefore provided the necessary structural information. The values for chloroplast, signal peptide, or mitochondrial targets were too small, indicating that the mangrove actin genes carry no chloroplast or mitochondrial transit peptide and no secretory pathway signal peptide. These results suggest the importance of understanding the diversity and functional properties of the different amino acids in mangrove actin genes. To clarify the relationship among the mangrove actin genes, a phylogenetic tree was constructed. Three groups of mangrove actin genes were formed: the first group contains B. gymnorrhiza BgAct and R. stylosa RsActl; the second cluster, the largest, consists of five actin genes; and the last branch consists of one gene, B. sexangula Act. The present study therefore supports previous results that plant actin genes form distinct clusters in the tree.

  12. Efficient algorithms for probing the RNA mutation landscape.

    PubMed

    Waldispühl, Jérôme; Devadas, Srinivas; Berger, Bonnie; Clote, Peter

    2008-08-08

    The diversity and importance of the role played by RNAs in the regulation and development of the cell are now well-known and well-documented. This broad range of functions is achieved through specific structures that have been (presumably) optimized through evolution. State-of-the-art methods, such as McCaskill's algorithm, use a statistical mechanics framework based on the computation of the partition function over the canonical ensemble of all possible secondary structures on a given sequence. Although secondary structure predictions from thermodynamics-based algorithms are not as accurate as methods employing comparative genomics, the former methods are the only available tools to investigate novel RNAs, such as the many RNAs of unknown function recently reported by the ENCODE consortium. In this paper, we generalize the McCaskill partition function algorithm to sum over the grand canonical ensemble of all secondary structures of all mutants of the given sequence. Specifically, our new program, RNAmutants, simultaneously computes for each integer k the minimum free energy structure MFE(k) and the partition function Z(k) over all secondary structures of all k-point mutants, even allowing the user to specify certain positions required not to mutate and certain positions required to base-pair or remain unpaired. This technically important extension allows us to study the resilience of an RNA molecule to pointwise mutations. By computing the mutation profile of a sequence, a novel graphical representation of the mutational tendency of nucleotide positions, we analyze the deleterious nature of mutating specific nucleotide positions or groups of positions. We have successfully applied RNAmutants to investigate deleterious mutations (mutations that radically modify the secondary structure) in the Hepatitis C virus cis-acting replication element and to evaluate the evolutionary pressure applied on different regions of the HIV trans-activation response element. 
    In particular, we show qualitative agreement between published Hepatitis C and HIV experimental mutagenesis studies and our analysis of deleterious mutations using RNAmutants. Our work also predicts other deleterious mutations, which could be verified experimentally. Finally, we provide evidence that the 3' UTR of the GB RNA virus C has been optimized to preserve evolutionarily conserved stem regions from a deleterious effect of pointwise mutations. We hope that there will be long-term potential applications of RNAmutants in de novo RNA design and drug design against RNA viruses. This work also suggests potential applications for large-scale exploration of the RNA sequence-structure network. Binary distributions are available at http://RNAmutants.csail.mit.edu/.
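
    To illustrate what a partition function over all secondary structures of one sequence means, here is a deliberately simplified sketch in the spirit of McCaskill's recursion. It is not RNAmutants and does not sum over mutants. The assumptions are labeled in the code: every base pair contributes the same Boltzmann weight `pair_weight` (instead of a nearest-neighbor energy model), G-U wobble pairs are allowed, and hairpin loops need at least `min_loop` unpaired bases. All names are hypothetical.

```python
from functools import lru_cache

# Assumption: canonical Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def partition_function(seq, pair_weight=1.0, min_loop=3):
    """Sum of Boltzmann weights over all pseudoknot-free structures of seq.

    Toy energy model (an assumption): a structure with m base pairs has
    weight pair_weight ** m. With pair_weight = 1 the result is simply
    the number of admissible structures.
    """

    @lru_cache(maxsize=None)
    def z(i, j):
        # Subsequence seq[i..j] inclusive; too short to hold any pair.
        if j - i <= min_loop:
            return 1.0
        # Case 1: position j is unpaired.
        total = z(i, j - 1)
        # Case 2: position j pairs with some k, leaving >= min_loop bases inside.
        for k in range(i, j - min_loop):
            if (seq[k], seq[j]) in PAIRS:
                total += z(i, k - 1) * pair_weight * z(k + 1, j - 1)
        return total

    return z(0, len(seq) - 1)


print(partition_function("GGGAAACCC"))  # 20.0: with weight 1, a plain structure count
```

    With `pair_weight` set to exp(-E/RT) for a per-pair energy E, the same recursion yields a Boltzmann-weighted sum. A realistic implementation would replace the constant weight with loop-dependent energies.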

  13. Structure of the ordered hydration of amino acids in proteins: analysis of crystal structures

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Biedermannová, Lada, E-mail: lada.biedermannova@ibt.cas.cz; Schneider, Bohdan

    2015-10-27

    The hydration of protein crystal structures was studied at the level of individual amino acids. The dependence of the number of water molecules and their preferred spatial localization on various parameters, such as solvent accessibility, secondary structure and side-chain conformation, was determined. Crystallography provides unique information about the arrangement of water molecules near protein surfaces. Using a nonredundant set of 2818 protein crystal structures with a resolution of better than 1.8 Å, the extent and structure of the hydration shell of all 20 standard amino-acid residues were analyzed as a function of the residue conformation, secondary structure and solvent accessibility. The results show how hydration depends on the amino-acid conformation and the environment in which it occurs. After conformational clustering of individual residues, the density distribution of water molecules was compiled and the preferred hydration sites were determined as maxima in the pseudo-electron-density representation of water distributions. Many hydration sites interact with both main-chain and side-chain amino-acid atoms, and several occurrences of hydration sites with less canonical contacts, such as carbon–donor hydrogen bonds, OH–π interactions and off-plane interactions with aromatic heteroatoms, are also reported. Information about the location and relative importance of the empirically determined preferred hydration sites in proteins has applications in improving the current methods of hydration-site prediction in molecular replacement, ab initio protein structure prediction and the set-up of molecular-dynamics simulations.

  14. A protein block based fold recognition method for the annotation of twilight zone sequences.

    PubMed

    Suresh, V; Ganesan, K; Parthasarathy, S

    2013-03-01

    The description of protein backbone was recently improved with a group of structural fragments called Structural Alphabets instead of the regular three states (Helix, Sheet and Coil) secondary structure description. Protein Blocks is one of the Structural Alphabets used to describe each and every region of protein backbone including the coil. According to de Brevern (2000), Protein Blocks comprises 16 structural fragments, each 5 residues in length. Protein Blocks fragments are highly informative among the available Structural Alphabets and have been used for many applications. Here, we present a protein fold recognition method based on Protein Blocks for the annotation of twilight zone sequences. In our method, we align the predicted Protein Blocks of a query amino acid sequence with a library of assigned Protein Blocks of 953 known folds using the local pair-wise alignment. The alignment results with z-value ≥ 2.5 and P-value ≤ 0.08 are predicted as possible folds. Our method is able to recognize the possible folds for nearly 35.5% of the twilight zone sequences with their predicted Protein Block sequence obtained by pb_prediction, which is available at Protein Block Export server.

  15. Accelerating calculations of RNA secondary structure partition functions using GPUs

    PubMed Central

    2013-01-01

    Background: RNA performs many diverse functions in the cell in addition to its role as a messenger of genetic information. These functions depend on its ability to fold to a unique three-dimensional structure determined by the sequence. The conformation of RNA is in part determined by its secondary structure, or the particular set of contacts between pairs of complementary bases. Prediction of the secondary structure of RNA from its sequence is therefore of great interest, but can be computationally expensive. In this work we accelerate computations of base-pair probabilities using parallel graphics processing units (GPUs). Results: Calculation of the probabilities of base pairs in RNA secondary structures using nearest-neighbor standard free energy change parameters has been implemented using CUDA to run on hardware with multiprocessor GPUs. A modified set of recursions was introduced, which reduces memory usage by about 25%. GPUs are fastest in single precision, and for some hardware, restricted to single precision. This may introduce significant roundoff error. However, deviations in base-pair probabilities calculated using single precision were found to be negligible compared to those resulting from shifting the nearest-neighbor parameters by a random amount of magnitude similar to their experimental uncertainties. For large sequences running on our particular hardware, the GPU implementation reduces execution time by a factor of close to 60 compared with an optimized serial implementation, and by a factor of 116 compared with the original code. Conclusions: Using GPUs can greatly accelerate computation of RNA secondary structure partition functions, allowing calculation of base-pair probabilities for large sequences in a reasonable amount of time, with a negligible compromise in accuracy due to working in single precision. The source code is integrated into the RNAstructure software package and available for download at http://rna.urmc.rochester.edu. PMID:24180434

  16. Protein asparagine deamidation prediction based on structures with machine learning methods.

    PubMed

    Jia, Lei; Sun, Yaxiong

    2017-01-01

    Chemical stability is a major concern in the development of protein therapeutics due to its impact on both efficacy and safety. Protein "hotspots" are amino acid residues that are subject to various chemical modifications, including deamidation, isomerization, glycosylation, oxidation etc. A more accurate prediction method for potential hotspot residues would allow their elimination or reduction as early as possible in the drug discovery process. In this work, we focus on prediction models for asparagine (Asn) deamidation. The sequence-based prediction method simply identifies the NG motif (amino acid asparagine followed by a glycine) as liable to deamidation. It still dominates the deamidation evaluation process in most pharmaceutical settings due to its convenience. However, the simple sequence-based method is less accurate and often leads to over-engineering of a protein. We introduce structure-based prediction models by mining available experimental and structural data of deamidated proteins. Our training set contains 194 Asn residues from 25 proteins that all have available high-resolution crystal structures. Experimentally measured deamidation half-life of Asn in penta-peptides as well as 3D structure-based properties, such as solvent exposure, crystallographic B-factors, local secondary structure and dihedral angles etc., were used to train prediction models with several machine learning algorithms. The prediction tools were cross-validated as well as tested with an external test data set. The random forest model had high enrichment in ranking deamidated residues higher than non-deamidated residues while effectively eliminating false positive predictions. It is possible that such quantitative protein structure-function relationship tools can also be applied to other protein hotspot predictions. In addition, we extensively discuss metrics used to evaluate the performance of predictions on unbalanced data sets such as the deamidation case.
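
    The sequence-based baseline described above (flagging every asparagine that is followed by a glycine) is simple enough to sketch directly. The function name `find_ng_motifs` and the example peptide are hypothetical; the structure-based models from the abstract are not reproduced here.

```python
def find_ng_motifs(sequence):
    """Return 1-based positions of Asn (N) residues immediately followed by Gly (G).

    Input is a protein sequence in one-letter code; case is ignored.
    """
    seq = sequence.upper()
    return [i + 1 for i in range(len(seq) - 1) if seq[i] == "N" and seq[i + 1] == "G"]


# Made-up peptide for illustration: N3-G4 and N8-G9 match; N7 and N12 do not.
example = "MKNGTLNNGAQNPG"
print(find_ng_motifs(example))  # [3, 8]
```

    Because this check looks only at the primary sequence, it flags every NG pair regardless of structural context, which is the limitation the abstract attributes to the sequence-based method.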

  17. Application of the maximum entropy principle to determine ensembles of intrinsically disordered proteins from residual dipolar couplings.

    PubMed

    Sanchez-Martinez, M; Crehuet, R

    2014-12-21

    We present a method based on the maximum entropy principle that can re-weight an ensemble of protein structures based on data from residual dipolar couplings (RDCs). The RDCs of intrinsically disordered proteins (IDPs) provide information on the secondary structure elements present in an ensemble; however even two sets of RDCs are not enough to fully determine the distribution of conformations, and the force field used to generate the structures has a pervasive influence on the refined ensemble. Two physics-based coarse-grained force fields, Profasi and Campari, are able to predict the secondary structure elements present in an IDP, but even after including the RDC data, the re-weighted ensembles differ between both force fields. Thus the spread of IDP ensembles highlights the need for better force fields. We distribute our algorithm in an open-source Python code.
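
    As a rough illustration of maximum-entropy re-weighting, the sketch below handles the simplest case: one observable per structure, uniform prior weights, and a single target average. Under those assumptions the maximum-entropy weights have the form w_i proportional to exp(-lambda * f_i), and lambda is found by bisection. It is not the authors' Python code; it ignores experimental error and multiple RDC sets, and all names and numbers are hypothetical.

```python
import math


def maxent_reweight(values, target, lo=-50.0, hi=50.0, iterations=200):
    """Return weights w_i ~ exp(-lam * values[i]) whose average hits `target`.

    Assumes uniform prior weights and a single observable (a simplification).
    The bracket [lo, hi] for the Lagrange multiplier is an arbitrary choice.
    """
    if not (min(values) < target < max(values)):
        raise ValueError("target must lie strictly inside the observed range")

    def weights(lam):
        exponents = [-lam * v for v in values]
        shift = max(exponents)  # subtract the maximum for numerical stability
        raw = [math.exp(e - shift) for e in exponents]
        total = sum(raw)
        return [r / total for r in raw]

    def average(lam):
        return sum(w * v for w, v in zip(weights(lam), values))

    # The weighted average decreases monotonically as lam increases,
    # so plain bisection converges to the matching multiplier.
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if average(mid) > target:
            lo = mid
        else:
            hi = mid
    return weights(0.5 * (lo + hi))


if __name__ == "__main__":
    back_calculated = [1.0, 2.0, 3.0, 4.0]  # made-up per-structure values
    w = maxent_reweight(back_calculated, target=3.0)
    print(w, sum(wi * v for wi, v in zip(w, back_calculated)))
```

    Because the weights stay as close as possible to the prior, the result still depends on which structures the force field generated, which is consistent with the abstract's point that the refined ensembles differ between force fields.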

  18. General mechanism of two-state protein folding kinetics.

    PubMed

    Rollins, Geoffrey C; Dill, Ken A

    2014-08-13

    We describe here a general model of the kinetic mechanism of protein folding. In the Foldon Funnel Model, proteins fold in units of secondary structures, which form sequentially along the folding pathway, stabilized by tertiary interactions. The model predicts that the free energy landscape has a volcano shape, rather than a simple funnel, that folding is two-state (single-exponential) when secondary structures are intrinsically unstable, and that each structure along the folding path is a transition state for the previous structure. It shows how sequential pathways are consistent with multiple stochastic routes on funnel landscapes, and it gives good agreement with the 9 orders of magnitude dependence of folding rates on protein size for a set of 93 proteins; at the same time, it is consistent with the near independence of the folding equilibrium constant on size. This model gives estimates of folding rates of proteomes, leading to a median folding time in Escherichia coli of about 5 s.

  19. Comparison of red blood cells from gastric cancer patients and healthy persons using FTIR spectroscopy

    NASA Astrophysics Data System (ADS)

    Liu, Hui; Su, Qinglong; Sheng, Daping; Zheng, Wei; Wang, Xin

    2017-02-01

    In this paper, FTIR spectroscopy was used to compare gastric cancer patients' red blood cells (RBCs) with healthy persons' RBCs. IR spectra were acquired with high resolution. The A1653/A1543 (the protein secondary structures), A1543/A2958 (the relative content of proteins and lipids), A1106/A1166 (the structure and content changes of sugars) and A1543/A1106 (the relative content of proteins and sugars) ratios of gastric cancer patients' RBCs were significantly different from those of healthy persons' RBCs. Curve fitting results showed that the protein secondary structures and the sugar structures differed between gastric cancer patients' and healthy persons' RBCs. Additionally, FTIR spectroscopy achieved 95% sensitivity, 70% specificity, 84.2% accuracy and 80.9% positive predictive value in combination with canonical discriminant analysis. The above results indicate FTIR spectroscopy may be useful for diagnosing gastric cancer.
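
    The four performance figures quoted in this abstract are all ratios taken from a 2x2 confusion matrix. The sketch below shows how they are defined. The counts in the example are hypothetical, chosen only so that the ratios land near the reported percentages; they are not taken from the study.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Compute standard diagnostic ratios from confusion-matrix counts.

    tp/fn: patients classified correctly/incorrectly,
    tn/fp: healthy controls classified correctly/incorrectly.
    """
    total = tp + fn + fp + tn
    return {
        "sensitivity": tp / (tp + fn),  # fraction of patients detected
        "specificity": tn / (tn + fp),  # fraction of controls cleared
        "accuracy": (tp + tn) / total,  # fraction of all calls that are right
        "ppv": tp / (tp + fp),          # fraction of positive calls that are right
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not data from the paper).
    for name, value in diagnostic_metrics(tp=38, fn=2, fp=9, tn=21).items():
        print(f"{name}: {value:.3f}")
```

    Accuracy and positive predictive value depend on the ratio of patients to controls in the sample, whereas sensitivity and specificity do not.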

  20. FRAGSION: ultra-fast protein fragment library generation by IOHMM sampling.

    PubMed

    Bhattacharya, Debswapna; Adhikari, Badri; Li, Jilong; Cheng, Jianlin

    2016-07-01

    Speed, accuracy and robustness of building protein fragment libraries have important implications in de novo protein structure prediction, since fragment-based methods are among the most successful approaches in template-free modeling (FM). The majority of existing fragment detection methods rely on database-driven search strategies to identify candidate fragments, which are inherently time-consuming and often hinder the possibility of locating longer fragments due to the limited sizes of databases. Also, it is difficult to alleviate the effect of noisy sequence-based predicted features, such as secondary structures, on fragment quality. Here, we present FRAGSION, a database-free method to efficiently generate a protein fragment library by sampling from an Input-Output Hidden Markov Model. FRAGSION offers some unique features compared to existing approaches in that it (i) is lightning-fast, consuming only a few seconds of CPU time to generate a fragment library for a protein of typical length (300 residues); (ii) can generate dynamic-size fragments of any length (even for the whole protein sequence) and (iii) offers ways to handle noise in predicted secondary structure during fragment sampling. On an FM dataset from the most recent Critical Assessment of Structure Prediction, we demonstrate that FRAGSION provides advantages over the state-of-the-art fragment picking protocol of the ROSETTA suite by speeding up computation by several orders of magnitude while achieving comparable performance in fragment quality. Source code and executable versions of FRAGSION for Linux and MacOS are freely available to non-commercial users at http://sysbio.rnet.missouri.edu/FRAGSION/. It is bundled with a manual and example data. Contact: chengji@missouri.edu. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  1. Modeling of structural uncertainties in Reynolds-averaged Navier-Stokes closures

    NASA Astrophysics Data System (ADS)

    Emory, Michael; Larsson, Johan; Iaccarino, Gianluca

    2013-11-01

    Estimation of the uncertainty in numerical predictions by Reynolds-averaged Navier-Stokes closures is a vital step in building confidence in such predictions. An approach to model-form uncertainty quantification that does not assume the eddy-viscosity hypothesis to be exact is proposed. The methodology for estimation of uncertainty is demonstrated for plane channel flow, for a duct with secondary flows, and for the shock/boundary-layer interaction over a transonic bump.

  2. Thermodynamics of RNA structures by Wang–Landau sampling

    PubMed Central

    Lou, Feng; Clote, Peter

    2010-01-01

    Motivation: Thermodynamics-based dynamic programming RNA secondary structure algorithms have been of immense importance in molecular biology, where applications range from the detection of novel selenoproteins using expressed sequence tag (EST) data, to the determination of microRNA genes and their targets. Dynamic programming algorithms have been developed to compute the minimum free energy secondary structure and partition function of a given RNA sequence, the minimum free energy and partition function for the hybridization of two RNA molecules, etc. However, the applicability of dynamic programming methods depends on disallowing certain types of interactions (pseudoknots, zig-zags, etc.), as their inclusion renders structure prediction a nondeterministic polynomial time (NP)-complete problem. Nevertheless, such interactions have been observed in X-ray structures. Results: A non-Boltzmannian Monte Carlo algorithm was designed by Wang and Landau to estimate the density of states for complex systems, such as the Ising model, that exhibit a phase transition. In this article, we apply the Wang-Landau (WL) method to compute the density of states for secondary structures of a given RNA sequence, and for hybridizations of two RNA sequences. Our method is shown to be much faster than existing software, such as RNAsubopt. From the density of states, we compute the partition function over all secondary structures and over all pseudoknot-free hybridizations. The advantage of the WL method is that by adding a function to evaluate the free energy of arbitrary pseudoknotted structures and of arbitrary hybridizations, we can estimate thermodynamic parameters for situations known to be NP-complete. This extension to pseudoknots will be made in the sequel to this article; in contrast, the current article describes the WL algorithm applied to pseudoknot-free secondary structures and hybridizations. Availability: The WL RNA hybridization web server is under construction at http://bioinformatics.bc.edu/clotelab/. Contact: clote@bc.edu PMID:20529917
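
    The step from a density of states to a partition function mentioned in this abstract is a Boltzmann-weighted sum, Z = sum over E of g(E) * exp(-E / RT). The sketch below shows only that final step, assuming the density of states has already been estimated (for example by Wang-Landau sampling) and binned into a dictionary; the toy values are hypothetical and the function names are illustrative, not the authors' software.

```python
import math

GAS_CONSTANT = 0.0019872  # kcal / (mol * K)


def partition_function(density_of_states, temperature_k=310.15):
    """Sum g(E) * exp(-E / RT) over energy bins.

    density_of_states maps energy in kcal/mol to the number (or estimated
    relative number) of structures at that energy.
    """
    rt = GAS_CONSTANT * temperature_k
    return sum(g * math.exp(-e / rt) for e, g in density_of_states.items())


def boltzmann_probabilities(density_of_states, temperature_k=310.15):
    """Return the equilibrium probability of each energy bin."""
    rt = GAS_CONSTANT * temperature_k
    z = partition_function(density_of_states, temperature_k)
    return {e: g * math.exp(-e / rt) / z for e, g in density_of_states.items()}


if __name__ == "__main__":
    # Hypothetical toy density of states: energy (kcal/mol) -> structure count.
    toy_dos = {0.0: 1, -1.0: 3, -2.5: 2}
    print(partition_function(toy_dos))
    print(boltzmann_probabilities(toy_dos))
```

    Wang-Landau sampling typically yields g(E) only up to a constant factor, so in practice the estimate has to be normalised (for instance against a known count) before Z is meaningful in absolute terms; the bin probabilities are unaffected by that factor.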

  3. Learning to apply models of materials while explaining their properties

    NASA Astrophysics Data System (ADS)

    Karpin, Tiia; Juuti, Kalle; Lavonen, Jari

    2014-09-01

    Background: Applying structural models is important to chemistry education at the upper secondary level, but it is considered one of the most difficult topics to learn. Purpose: This study analyses to what extent students in designed lessons learned to apply structural models in explaining the properties and behaviours of various materials. Sample: The experimental group comprised 27 Finnish upper secondary school students, and the control group included 18 students from the same school. Design and methods: In a quasi-experimental setting, students were guided through predict, observe, explain activities in four practical work situations. It was intended that the structural models would encourage students to learn how to identify and apply appropriate models when predicting and explaining situations. The lessons, organised over a one-week period, began with a teacher's demonstration and continued with student experiments in which they described the properties and behaviours of six household products representing three different materials. Results: Most students in the experimental group learned to apply the models correctly, as demonstrated by post-test scores that were significantly higher than pre-test scores. The control group showed no significant difference between pre- and post-test scores. Conclusions: The findings indicate that the intervention, in which students engage in predict, observe, explain activities while several materials and models are confronted at the same time, had a positive effect on learning outcomes.

  4. Virtual screening on an α-helix to β-strand switchable region of the FGFR2 extracellular domain revealed positive and negative modulators.

    PubMed

    Diaz, Constantino; Corentin, Herbert; Thierry, Vermat; Chantal, Alcouffe; Tanguy, Bozec; David, Sibrac; Jean-Marc, Herbert; Pascual, Ferrara; Françoise, Bono; Edgardo, Ferran

    2014-11-01

    The secondary structure of some protein segments may vary between α-helix and β-strand. To predict these switchable segments, we have developed an algorithm, Switch-P, based solely on the protein sequence. This algorithm was used on the extracellular parts of FGF receptors. For FGFR2, it predicted that the β4 and β5 strands of the third Ig-like domain were highly switchable. These two strands possess a high number of somatic mutations associated with cancer. Analysis of PDB structures of FGF receptors confirmed the switchability prediction for β5. We thus evaluated whether compound-driven α-helix/β-strand switching of β5 could modulate FGFR2 signaling. We performed a virtual screening of a library containing 1.4 million chemical compounds with two models of the third Ig-like domain of FGFR2 showing different secondary structures for β5, and we selected 32 compounds. Experimental testing using proliferation assays with FGF7-stimulated SNU-16 cells and an FGFR2-dependent Erk1/2 phosphorylation assay with FGFR2-transfected L6 cells revealed activators and inhibitors of FGFR2. Our method for the identification of switchable protein regions, combined with our virtual screening approach, provides an opportunity to discover a new generation of drugs with an under-explored mechanism of action. © 2014 Wiley Periodicals, Inc.

  5. False belief and language comprehension in Cantonese-speaking children.

    PubMed

    Cheung, Him

    2006-10-01

    The current research compared two accounts of the relation between language and false belief in children, namely that (a) language is generally related to false belief because both require secondary representation in a social-interactional context and that (b) specific language structures that explicitly code metarepresentation contribute uniquely to the language-false belief relation. In three studies, attempts were made to correlate Cantonese-speaking children's false belief with their general language comprehension and understanding of certain structures that explicitly express metarepresentational knowledge. Results showed that these structures failed to predict false belief after age, nonverbal intelligence, and general language comprehension were considered. In contrast, general language remained predictive of false belief after controlling for age, nonverbal intelligence, and language structures. The current findings are more consistent with a general language account than a language structure account.

  6. Simulation of unsteady state performance of a secondary air system by the 1D-3D-Structure coupled method

    NASA Astrophysics Data System (ADS)

    Wu, Hong; Li, Peng; Li, Yulong

    2016-02-01

    This paper describes the calculation method for unsteady state conditions in the secondary air systems in gas turbines. The 1D-3D-Structure coupled method was applied. A 1D code was used to model the standard components that have typical geometric characteristics. Their flow and heat transfer were described by empirical correlations based on experimental data or CFD calculations. A 3D code was used to model the non-standard components that cannot be described by typical geometric languages, while a finite element analysis was carried out to compute the structural deformation and heat conduction at certain important positions. These codes were coupled through their interfaces. Thus, the changes in heat transfer and structure and their interactions caused by exterior disturbances can be reflected. The results of the coupling method in an unsteady state showed an apparent deviation from the existing data, while the results in the steady state were highly consistent with the existing data. The difference in the results in the unsteady state was caused primarily by structural deformation that cannot be predicted by the 1D method. Thus, in order to obtain the unsteady state performance of a secondary air system more accurately and efficiently, the 1D-3D-Structure coupled method should be used.

  7. Multiobjective evolutionary algorithm with many tables for purely ab initio protein structure prediction.

    PubMed

    Brasil, Christiane Regina Soares; Delbem, Alexandre Claudio Botazzo; da Silva, Fernando Luís Barroso

    2013-07-30

    This article focuses on the development of an approach for ab initio protein structure prediction (PSP) without using any earlier knowledge from similar protein structures, such as fragment-based statistics or inference of secondary structures. Such an approach is called purely ab initio prediction. The article shows that well-designed multiobjective evolutionary algorithms can predict relevant protein structures in a purely ab initio way. One challenge for purely ab initio PSP is the prediction of structures with β-sheets. To work with such proteins, this research has also developed procedures to efficiently estimate hydrogen bond and solvation contribution energies. Considering van der Waals, electrostatic, hydrogen bond, and solvation contribution energies, the PSP is a problem with four energetic terms to be minimized. Each interaction energy term can be considered an objective of an optimization method. Combinatorial problems with four objectives have been considered too complex for the available multiobjective optimization (MOO) methods. The proposed approach, called "Multiobjective evolutionary algorithms with many tables" (MEAMT), can efficiently deal with four objectives through the combination thereof, performing a more adequate sampling of the objective space. Therefore, this method can better map the promising regions in this space, predicting structures in a purely ab initio way. In other words, MEAMT is an efficient optimization method for MOO, which explores simultaneously the search space as well as the objective space. MEAMT can predict structures with one or two domains with RMSDs comparable to values obtained by recently developed ab initio methods (GAPFCG, I-PAES, and Quark) that use different levels of earlier knowledge. Copyright © 2013 Wiley Periodicals, Inc.

  8. Knowledge-based prediction of protein backbone conformation using a structural alphabet.

    PubMed

    Vetrivel, Iyanar; Mahajan, Swapnil; Tyagi, Manoj; Hoffmann, Lionel; Sanejouand, Yves-Henri; Srinivasan, Narayanaswamy; de Brevern, Alexandre G; Cadet, Frédéric; Offmann, Bernard

    2017-01-01

    Libraries of structural prototypes that abstract protein local structures are known as structural alphabets and have proven to be very useful in various aspects of protein structure analyses and predictions. One such library, Protein Blocks, is composed of 16 standard structural prototypes, each 5 residues long. This form of analysis involves drafting a protein's structure as a string of Protein Blocks. Predicting the local structure of a protein in terms of Protein Blocks is the general objective of this work. A new approach, PB-kPRED, is proposed towards this aim. It involves (i) organizing the structural knowledge in the form of a database of pentapeptide fragments extracted from all protein structures in the PDB and (ii) applying a knowledge-based algorithm that does not rely on any secondary structure predictions and/or sequence alignment profiles to scan this database and predict the most probable backbone conformations for the protein local structures. PB-kPRED does, however, preferentially use structural information from homologues when it is available. The predictions were evaluated rigorously on 15,544 query proteins representing a non-redundant subset of the PDB filtered at a 30% sequence identity cut-off. We have shown that the kPRED method was able to achieve mean accuracies ranging from 40.8% to 66.3% depending on the availability of homologues. The impact of the different strategies for scanning the database on the prediction was evaluated and is discussed. Our results highlight the usefulness of the method in the context of proteins without any known structural homologues. A scoring function that gives a good estimate of the accuracy of prediction was further developed. This score estimates the accuracy of the algorithm very well (R² of 0.82). An online version of the tool is provided freely for non-commercial usage at http://www.bo-protscience.fr/kpred/.

  9. Identifying intrinsically disordered protein regions likely to undergo binding-induced helical transitions.

    PubMed

    Glover, Karen; Mei, Yang; Sinha, Sangita C

    2016-10-01

    Many proteins contain intrinsically disordered regions (IDRs) lacking stable secondary and ordered tertiary structure. IDRs are often implicated in macromolecular interactions, and may undergo structural transitions upon binding to interaction partners. However, as binding partners of many protein IDRs are unknown, these structural transitions are difficult to verify and often are poorly understood. In this study we describe a method to identify IDRs that are likely to undergo helical transitions upon binding. This method combines bioinformatics analyses followed by circular dichroism spectroscopy to monitor 2,2,2-trifluoroethanol (TFE)-induced changes in secondary structure content of these IDRs. Our results demonstrate that there is no significant change in the helicity of IDRs that are not predicted to fold upon binding. IDRs that are predicted to fold fall into two groups: one group does not become helical in the presence of TFE and includes examples of IDRs that form β-strands upon binding, while the other group becomes more helical and includes examples that are known to fold into helices upon binding. Therefore, we propose that bioinformatics analyses combined with experimental evaluation using TFE may provide a general method to identify IDRs that undergo binding-induced disorder-to-helix transitions. Copyright © 2016 Elsevier B.V. All rights reserved.

  10. Bridging the gap between structural bioinformatics and receptor research: the membrane-embedded, ligand-gated, P2X glycoprotein receptor.

    PubMed

    Mager, Peter P; Weber, Anje; Illes, Peter

    2004-01-01

    No details on P2X receptor architecture had been known at the atomic resolution level. Using comparative homology-based molecular modelling and threading, it was attempted to predict the three-dimensional structure of P2X receptors. This prediction could not be carried out, however, because important properties of the P2X family differ considerably from those of the potential template proteins. This paper reviews an alternative approach consisting of three research fields: bioinformatics, structural modelling, and a variety of the results of biological experiments. The starting point is the amino acid sequence. Using the sequence data, the first step is a secondary structure prediction. The resulting secondary structure is converted into a three-dimensional geometry. Then, the secondary and tertiary structures are optimized by using the quantum chemistry RHF/3-21G minimal basis set and the all-atom molecular mechanics AMBER96 force field. The fold of the membrane-embedded protein is simulated by a suitable dielectric. The structure is refined using a conjugate gradient minimizer (Fletcher-Reeves modification of the Polak-Ribiere method). The results of the geometry optimization were checked by a Ramachandran plot, rotamer analysis, all-atom contact dots, and the C(beta) deviation. As additional tools for the model building, multiple alignment analysis and comparative sequence-function analysis were used. The approach is exemplified on the membrane-embedded, ligand-gated P2X3 receptor subunit, a monovalent-bivalent cation channel-forming glycoprotein that is activated by extracellular adenosine 5'-triphosphate. From these results, a topology of the pore-forming motif of the P2X3 receptor subunit was proposed. It is believed that a fully functional P2X channel requires a precise coupling between (i) two distinct peptide modules, an extracellularly occurring ATP-binding module and a pore module that includes a long transmembrane and short intracellular part, (ii) an interaction surface with membranes, and (iii) hydrogen bonding forces of the residues and hydrated cations. Furthermore, this paper demonstrates the role of quantitative structure-activity relationships (QSARs) in P2X research (calcium ion permeability of the wild-type and after site-directed mutagenesis of the rat P2X2 receptor protein, KN-62 analogs as competitive antagonists of the human P2X7 receptor). EXPERIMENTAL PROOFS: The predictions are experimentally testable and may provide an additional interpretation of experimental observations published in the literature. In particular, there is good agreement of the geometry-optimized P2X3 structure with experimentally proposed P2X receptor models obtained by neurophysiological, biochemical, pharmacological, and mutation experiments. Although the rat P2X3 receptor subunit is more complex (397 amino acids) than the KcsA protein (160 amino acids), the overall folds of the peptide backbone atoms are similar. To avoid semantic confusion, it should be noted that "prediction" is defined in a probabilistic sense. Matches to generic rules do not mean "this is true" but rather "this might be true". Only biological and chemical knowledge can determine whether or not these predictions are meaningful. Thus, the results from the computational tools are probabilistic predictions and subject to further experimental verification. The geometry-optimized P2X3 receptor subunit is freely available for academic researchers on e-mail request (PDB format).

  11. Exploring the Sequence-based Prediction of Folding Initiation Sites in Proteins.

    PubMed

    Raimondi, Daniele; Orlando, Gabriele; Pancsa, Rita; Khan, Taushif; Vranken, Wim F

    2017-08-18

    Protein folding is a complex process that can lead to disease when it fails. Especially poorly understood are the very early stages of protein folding, which are likely defined by intrinsic local interactions between amino acids close to each other in the protein sequence. We here present EFoldMine, a method that predicts, from the primary amino acid sequence of a protein, which amino acids are likely involved in early folding events. The method is based on early folding data from hydrogen deuterium exchange (HDX) data from NMR pulsed labelling experiments, and uses backbone and sidechain dynamics as well as secondary structure propensities as features. The EFoldMine predictions give insights into the folding process, as illustrated by a qualitative comparison with independent experimental observations. Furthermore, on a quantitative proteome scale, the predicted early folding residues tend to become the residues that interact the most in the folded structure, and they are often residues that display evolutionary covariation. The connection of the EFoldMine predictions with both folding pathway data and the folded protein structure suggests that the initial statistical behavior of the protein chain with respect to local structure formation has a lasting effect on its subsequent states.

  12. Bioinformatics analysis of the predicted polyprenol reductase genes in higher plants

    NASA Astrophysics Data System (ADS)

    Basyuni, M.; Wati, R.

    2018-03-01

    The present study evaluates bioinformatics methods to analyze twenty-four predicted polyprenol reductase genes from higher plants in GenBank and to predict their structure, composition, similarity, subcellular localization, and phylogeny. The physicochemical properties of plant polyprenol showed diversity among the observed genes. The percentage of the secondary structure of plant polyprenol genes followed the ratio order of α helix > random coil > extended chain structure. The values for chloroplast transit peptides, but not for signal peptides, were very low, indicating that few chloroplast transit peptides occur in plant polyprenol reductase genes. The likelihood of a potential transit peptide varied among the plant polyprenol reductases, suggesting the importance of understanding the variety of peptide components of plant polyprenol genes. To clarify this finding, a phylogenetic tree was drawn. The phylogenetic tree shows several branches, suggesting that plant polyprenol reductase genes group into divergent clusters.

  13. Indepth diagnosis of a secondary clarifier by the application of radiotracer technique and numerical modeling.

    PubMed

    Kim, H S; Shin, M S; Jang, D S; Jung, S H

    2006-01-01

    To make an indepth diagnosis of a full-scale rectangular secondary clarifier, an experimental and numerical study has been performed in a wastewater treatment facility. Calculation results by the numerical model with the adoption of the SIMPLE algorithm of Patankar are validated with radiotracer experiments. Emphasis is given to the prediction of residence time distribution (RTD) curves. The predicted RTD profiles are in good agreement with the experimental RTD curves at the upstream and center sections except for the withdrawal zone of the complex effluent weir structure. The simulation results predict successfully the well-known flow characteristics of each stage such as the waterfall phenomenon at the front of the clarifier, the bottom density current and the surface return flow in the settling zone, and the upward flow in the exit zone. The detailed effects of density current are thoroughly investigated in terms of high SS loading and temperature difference between influent and ambient fluid. The program developed in this study shows the high potential to assist in the design and determination of optimal operating conditions to improve effluent quality in a full-scale secondary clarifier.

  14. Fine-grained parallel RNAalifold algorithm for RNA secondary structure prediction on FPGA

    PubMed Central

    Xia, Fei; Dou, Yong; Zhou, Xingming; Yang, Xuejun; Xu, Jiaqing; Zhang, Yang

    2009-01-01

    Background: In the field of RNA secondary structure prediction, the RNAalifold algorithm is one of the most popular methods using free energy minimization. However, general-purpose computers, including parallel computers or multi-core computers, exhibit parallel efficiency of no more than 50%. Field Programmable Gate-Array (FPGA) chips provide a new approach to accelerate RNAalifold by exploiting fine-grained custom design. Results: RNAalifold shows complicated data dependences, in which the dependence distance is variable and the dependence direction spans two dimensions. We propose a systolic array structure including one master Processing Element (PE) and multiple slave PEs for fine-grained hardware implementation on FPGA. We exploit data reuse schemes to reduce the need to load energy matrices from external memory. We also propose several methods to reduce energy table parameter size by 80%. Conclusion: To our knowledge, our implementation with 16 PEs is the only FPGA accelerator implementing the complete RNAalifold algorithm. The experimental results show a factor of 12.2 speedup over the RNAalifold (ViennaPackage – 1.6.5) software running on a Personal Computer (PC) platform with a Pentium 4 2.6 GHz CPU, for a group of aligned RNA sequences of 2981 residues. PMID:19208138

  15. Fatigue as a cause, not a consequence of depression and daytime sleepiness: a cross-lagged analysis.

    PubMed

    Schönberger, Michael; Herrberg, Marlene; Ponsford, Jennie

    2014-01-01

    To examine the temporal relation between fatigue, depression, and daytime sleepiness after traumatic brain injury. Fatigue is a frequent and disabling consequence of traumatic brain injury (TBI). However, it is unclear whether fatigue is a primary consequence of the structural brain injury or a secondary consequence of injury-related sequelae such as depression and daytime sleepiness. Eighty-eight adults with complicated mild-severe TBI (69% male). Fatigue Severity Scale; depression subscale of the Hospital Anxiety and Depression Scale; Epworth Sleepiness scale at baseline and 6-month follow-up. A cross-lagged path analysis computed within a structural equation modeling framework revealed that fatigue was predictive of depression (β = .20, P < .05) and sleepiness (β = .25, P < .05). However, depression and sleepiness did not predict fatigue (P > .05). The results support the view of fatigue after TBI as "primary fatigue"-that is, a consequence of the structural brain injury rather than a secondary consequence of depression or daytime sleepiness. A rehabilitation approach that assists individuals with brain injury in learning to cope with their neuropsychological and physical limitations in everyday life might attenuate their experience with fatigue.

  16. Simulations Using Random-Generated DNA and RNA Sequences

    ERIC Educational Resources Information Center

    Bryce, C. F. A.

    1977-01-01

    Using a very simple computer program written in BASIC, a very large number of random-generated DNA or RNA sequences are obtained. Students use these sequences to predict complementary sequences and translational products, evaluate base compositions, determine frequencies of particular triplet codons, and suggest possible secondary structures.…
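
    The classroom exercise described in this record is easy to reproduce in a modern language. The sketch below is a loose Python analogue, not the original BASIC program: it generates a random DNA sequence, derives the complementary strand and an RNA version, and tallies base composition and triplet frequencies. All function names are hypothetical, and translation into amino acids is left out to keep the example short.

```python
import random
from collections import Counter

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def random_dna(length, seed=None):
    """Return a random DNA string with equal base probabilities."""
    rng = random.Random(seed)  # a seed makes the classroom run repeatable
    return "".join(rng.choice("ACGT") for _ in range(length))


def complement(seq):
    """Return the base-by-base complementary strand (not reversed)."""
    return "".join(COMPLEMENT[base] for base in seq)


def transcribe(seq):
    """Rewrite a DNA string in the RNA alphabet (T becomes U)."""
    return seq.replace("T", "U")


def base_composition(seq):
    """Return the fraction of each base in a non-empty sequence."""
    counts = Counter(seq)
    return {base: counts[base] / len(seq) for base in "ACGT"}


def codon_counts(seq):
    """Count non-overlapping triplets starting at position 0."""
    return Counter(seq[i:i + 3] for i in range(0, len(seq) - 2, 3))


if __name__ == "__main__":
    dna = random_dna(30, seed=1)
    print(dna)
    print(complement(dna))
    print(transcribe(dna))
    print(base_composition(dna))
    print(codon_counts(dna).most_common(3))
```

    Students can change the sequence length or the seed and compare observed base and codon frequencies with the values expected for a uniform random sequence.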

  17. Comparison of Three Ionic Liquid-Tolerant Cellulases by Molecular Dynamics

    PubMed Central

    Jaeger, Vance; Burney, Patrick; Pfaendtner, Jim

    2015-01-01

    We have employed molecular dynamics to investigate the differences in ionic liquid tolerance among three distinct family 5 cellulases from Trichoderma viride, Thermotoga maritima, and Pyrococcus horikoshii. Simulations of the three cellulases were conducted at a range of temperatures in various binary mixtures of the ionic liquid 1-ethyl-3-methyl-imidazolium acetate with water. Our analysis demonstrates that the effects of ionic liquids on the enzymes vary in each individual case, from local structural disturbances to loss of much of the secondary structure of one of the enzymes. Enzymes with more negatively charged surfaces tend to resist destabilization by ionic liquids. Specific and unique structural changes in the enzymes are induced by the presence of ionic liquids. Disruption of the secondary structure, changes in dynamical motion, and local changes in the binding pocket are observed in less tolerant enzymes. Ionic-liquid-induced denaturation of one of the enzymes is indicated over the 500 ns timescale. In contrast, the most tolerant cellulase behaves similarly in water and in ionic-liquid-containing mixtures. Unlike the heuristic approaches that attempt to predict enzyme stability using macroscopic properties, molecular dynamics allows us to predict specific atomic-level structural and dynamical changes in an enzyme’s behavior induced by ionic liquids and other mixed solvents. Using these insights, we propose specific experimentally testable hypotheses regarding the origin of activity loss for each of the systems investigated in this study. PMID:25692593

  18. Deep sequencing of foot-and-mouth disease virus reveals RNA sequences involved in genome packaging.

    PubMed

    Logan, Grace; Newman, Joseph; Wright, Caroline F; Lasecka-Dykes, Lidia; Haydon, Daniel T; Cottam, Eleanor M; Tuthill, Tobias J

    2017-10-18

    Non-enveloped viruses protect their genomes by packaging them into an outer shell or capsid of virus-encoded proteins. Packaging and capsid assembly in RNA viruses can involve interactions between capsid proteins and secondary structures in the viral genome as exemplified by the RNA bacteriophage MS2 and as proposed for other RNA viruses of plants, animals and human. In the picornavirus family of non-enveloped RNA viruses, the requirements for genome packaging remain poorly understood. Here we show a novel and simple approach to identify predicted RNA secondary structures involved in genome packaging in the picornavirus foot-and-mouth disease virus (FMDV). By interrogating deep sequencing data generated from both packaged and unpackaged populations of RNA we have determined multiple regions of the genome with constrained variation in the packaged population. Predicted secondary structures of these regions revealed stem loops with conservation of structure and a common motif at the loop. Disruption of these features resulted in attenuation of virus growth in cell culture due to a reduction in assembly of mature virions. This study provides evidence for the involvement of predicted RNA structures in picornavirus packaging and offers a readily transferable methodology for identifying packaging requirements in many other viruses. Importance In order to transmit their genetic material to a new host, non-enveloped viruses must protect their genomes by packaging them into an outer shell or capsid of virus-encoded proteins. For many non-enveloped RNA viruses the requirements for this critical part of the viral life cycle remain poorly understood. We have identified RNA sequences involved in genome packaging of the picornavirus foot-and-mouth disease virus. This virus causes an economically devastating disease of livestock affecting both the developed and developing world. 
    The experimental methods developed to carry out this work are novel, simple and transferable to the study of packaging signals in other RNA viruses. Improved understanding of RNA packaging may lead to novel vaccine approaches or targets for antiviral drugs with broad spectrum activity. Copyright © 2017 Logan et al.

  19. Evaluating minimalist mimics by exploring key orientations on secondary structures (EKOS)

    PubMed Central

    Xin, Dongyue; Ko, Eunhwa; Perez, Lisa M.; Ioerger, Thomas R.; Burgess, Kevin

    2013-01-01

    Peptide mimics that display amino acid side-chains on semi-rigid scaffolds (not peptide polyamides) can be referred to as minimalist mimics. Accessible conformations of these scaffolds may overlay with secondary structures giving, for example, “minimalist helical mimics”. It is difficult for researchers who want to apply minimalist mimics to decide which one to use because there is no widely accepted protocol for calibrating how closely these compounds mimic secondary structures. Moreover, it is also difficult for potential practitioners to evaluate which ideal minimalist helical mimics are preferred for a particular set of side-chains. For instance, what mimic presents i, i+4, i+7 side-chains in orientations that best resemble an ideal α-helix, and is a different mimic required for a i, i+3, i+7 helical combination? This article describes a protocol for fitting each member of an array of accessible scaffold conformations on secondary structures. The protocol involves: (i) use quenched molecular dynamics (QMD) to generate an ensemble consisting of hundreds of accessible, low energy conformers of the mimics; (ii) representation of each of these as a set of Cα and Cβ coordinates corresponding to three amino acid side-chains displayed by the scaffolds;(iii) similar representation of each combination of three side-chains in each ideal secondary structure as a set of Cα and Cβ coordinates corresponding to three amino acid side-chains displayed by the scaffolds; and, (iv) overlay Cα and Cβ coordinates of all the conformers on all the sets of side-chain “triads” in the ideal secondary structures and express the goodness of fit in terms of root mean squared deviation (RMSD, Å) for each overlay. We refer to this process as Exploring Key Orientations on Secondary structures (EKOS). 
    Application of this procedure reveals the relative bias of a scaffold to overlay on different secondary structures, the “side-chain correspondences” (e.g. i, i+4, i+7 or i, i+3, i+4) of those overlays, and the energy of this state relative to the minimum located. This protocol was tested on some of the most widely cited minimalist α-helical mimics (1 – 8 in the text). The data obtained indicates several of these compounds preferentially exist in conformations that resemble other secondary structures as well as α-helices, and many of the α-helical conformations have unexpected side-chain correspondences. These observations imply the featured minimalist mimics have more scope for disrupting PPI interfaces than previously anticipated. Finally, the same simulation method was used to match preferred conformations of minimalist mimics with actual protein/peptide structures at interfaces providing quantitative comparisons of predicted fits of the test mimics at protein-protein interaction sites. PMID:24121516

  20. Evaluating minimalist mimics by exploring key orientations on secondary structures (EKOS).

    PubMed

    Xin, Dongyue; Ko, Eunhwa; Perez, Lisa M; Ioerger, Thomas R; Burgess, Kevin

    2013-11-28

    Peptide mimics that display amino acid side-chains on semi-rigid scaffolds (not peptide polyamides) can be referred to as minimalist mimics. Accessible conformations of these scaffolds may overlay with secondary structures giving, for example, "minimalist helical mimics". It is difficult for researchers who want to apply minimalist mimics to decide which one to use because there is no widely accepted protocol for calibrating how closely these compounds mimic secondary structures. Moreover, it is also difficult for potential practitioners to evaluate which ideal minimalist helical mimics are preferred for a particular set of side-chains. For instance, what mimic presents i, i + 4, i + 7 side-chains in orientations that best resemble an ideal α-helix, and is a different mimic required for a i, i + 3, i + 7 helical combination? This article describes a protocol for fitting each member of an array of accessible scaffold conformations on secondary structures. The protocol involves: (i) use quenched molecular dynamics (QMD) to generate an ensemble consisting of hundreds of accessible, low energy conformers of the mimics; (ii) representation of each of these as a set of Cα and Cβ coordinates corresponding to three amino acid side-chains displayed by the scaffolds; (iii) similar representation of each combination of three side-chains in each ideal secondary structure as a set of Cα and Cβ coordinates corresponding to three amino acid side-chains displayed by the scaffolds; and, (iv) overlay Cα and Cβ coordinates of all the conformers on all the sets of side-chain "triads" in the ideal secondary structures and express the goodness of fit in terms of root mean squared deviation (RMSD, Å) for each overlay. We refer to this process as Exploring Key Orientations on Secondary structures (EKOS). Application of this procedure reveals the relative bias of a scaffold to overlay on different secondary structures, the "side-chain correspondences" (e.g. 
    i, i + 4, i + 7 or i, i + 3, i + 4) of those overlays, and the energy of this state relative to the minimum located. This protocol was tested on some of the most widely cited minimalist α-helical mimics (1-8 in the text). The data obtained indicates several of these compounds preferentially exist in conformations that resemble other secondary structures as well as α-helices, and many of the α-helical conformations have unexpected side-chain correspondences. These observations imply the featured minimalist mimics have more scope for disrupting PPI interfaces than previously anticipated. Finally, the same simulation method was used to match preferred conformations of minimalist mimics with actual protein/peptide structures at interfaces providing quantitative comparisons of predicted fits of the test mimics at protein-protein interaction sites.

  1. Determination of protein folding kinetic types using sequence and predicted secondary structure and solvent accessibility.

    PubMed

    Zhang, Hua; Zhang, Tuo; Gao, Jianzhao; Ruan, Jishou; Shen, Shiyi; Kurgan, Lukasz

    2012-01-01

    Proteins fold through a two-state (TS) process, with no visible intermediates, or a multi-state (MS) process, via at least one intermediate. We analyze sequence-derived factors that determine folding types by introducing a novel sequence-based folding type predictor called FOKIT. This method implements a logistic regression model with six input features which hybridize information concerning amino acid composition and predicted secondary structure and solvent accessibility. FOKIT provides predictions with average Matthews correlation coefficient (MCC) between 0.58 and 0.91 measured using out-of-sample tests on four benchmark datasets. These results are shown to be competitive with or better than the results of four modern predictors. We also show that FOKIT outperforms these methods when predicting chains that share low similarity with the chains used to build the model, which is an important advantage given the limited number of annotated chains. We demonstrate that inclusion of solvent accessibility helps in discrimination of the folding kinetic types and that three of the features constitute statistically significant markers that differentiate TS and MS folders. We found that the increased content of exposed Trp and buried Leu are indicative of the MS folding, which implies that the exposure/burial of certain hydrophobic residues may play an important role in the formation of the folding intermediates. Our conclusions are supported by two case studies.
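
    For readers unfamiliar with the Matthews correlation coefficient (MCC) quoted in this record, the following is a minimal sketch of how it is computed from a binary confusion matrix using the standard definition. It is not FOKIT code; the function name and the convention of returning 0.0 when the denominator vanishes are assumptions made for the example.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Standard MCC for a binary classifier, from confusion-matrix counts.

    Returns a value in [-1, 1]; 1 is perfect agreement, 0 is no better
    than chance, -1 is total disagreement.
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Common convention (an assumption here): report 0 when a row or
        # column of the confusion matrix is empty.
        return 0.0
    return (tp * tn - fp * fn) / denominator


if __name__ == "__main__":
    # Hypothetical counts, e.g. two-state folders as the positive class.
    print(matthews_corrcoef(tp=40, tn=35, fp=5, fn=10))
```

    Unlike plain accuracy, the MCC uses all four cells of the confusion matrix, which makes it a reasonable choice when the two classes are of unequal size.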

  2. Performance of protein-structure predictions with the physics-based UNRES force field in CASP11.

    PubMed

    Krupa, Paweł; Mozolewska, Magdalena A; Wiśniewska, Marta; Yin, Yanping; He, Yi; Sieradzan, Adam K; Ganzynkowicz, Robert; Lipska, Agnieszka G; Karczyńska, Agnieszka; Ślusarz, Magdalena; Ślusarz, Rafał; Giełdoń, Artur; Czaplewski, Cezary; Jagieła, Dawid; Zaborowski, Bartłomiej; Scheraga, Harold A; Liwo, Adam

    2016-11-01

    Participating as the Cornell-Gdansk group, we have used our physics-based coarse-grained UNited RESidue (UNRES) force field to predict protein structure in the 11th Community Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction (CASP11). Our methodology involved extensive multiplexed replica exchange simulations of the target proteins with a recently improved UNRES force field to provide better reproductions of the local structures of polypeptide chains. All simulations were started from fully extended polypeptide chains, and no external information was included in the simulation process except for weak restraints on secondary structure to enable us to finish each prediction within the allowed 3-week time window. Because of simplified UNRES representation of polypeptide chains, use of enhanced sampling methods, code optimization and parallelization and sufficient computational resources, we were able to treat, for the first time, all 55 human prediction targets with sizes from 44 to 595 amino acid residues, the average size being 251 residues. Complete structures of six single-domain proteins were predicted accurately, with the highest accuracy being attained for the T0769, for which the CαRMSD was 3.8 Å for 97 residues of the experimental structure. Correct structures were also predicted for 13 domains of multi-domain proteins with accuracy comparable to that of the best template-based modeling methods. With further improvements of the UNRES force field that are now underway, our physics-based coarse-grained approach to protein-structure prediction will eventually reach global prediction capacity and, consequently, reliability in simulating protein structure and dynamics that are important in biochemical processes. Freely available on the web at http://www.unres.pl/ CONTACT: has5@cornell.edu. © The Author 2016. Published by Oxford University Press. All rights reserved. 

  3. PRISM 3: expanded prediction of natural product chemical structures from microbial genomes

    PubMed Central

    Skinnider, Michael A.; Merwin, Nishanth J.; Johnston, Chad W.

    2017-01-01

    Abstract Microbial natural products represent a rich resource of pharmaceutically and industrially important compounds. Genome sequencing has revealed that the majority of natural products remain undiscovered, and computational methods to connect biosynthetic gene clusters to their corresponding natural products therefore have the potential to revitalize natural product discovery. Previously, we described PRediction Informatics for Secondary Metabolomes (PRISM), a combinatorial approach to chemical structure prediction for genetically encoded nonribosomal peptides and type I and II polyketides. Here, we present a ground-up rewrite of the PRISM structure prediction algorithm to derive prediction of natural products arising from non-modular biosynthetic paradigms. Within this new version, PRISM 3, natural product scaffolds are modeled as chemical graphs, permitting structure prediction for aminocoumarins, antimetabolites, bisindoles and phosphonate natural products, and building upon the addition of ribosomally synthesized and post-translationally modified peptides. Further, with the addition of cluster detection for 11 new cluster types, PRISM 3 expands to detect 22 distinct natural product cluster types. Other major modifications to PRISM include improved sequence input and ORF detection, user-friendliness and output. Distribution of PRISM 3 over a 300-core server grid improves the speed and capacity of the web application. PRISM 3 is available at http://magarveylab.ca/prism/. PMID:28460067

  4. Imaging of Traumatic Brain Injury.

    PubMed

    Bodanapally, Uttam K; Sours, Chandler; Zhuo, Jiachen; Shanmuganathan, Kathirkamanathan

    2015-07-01

    Imaging plays an important role in the management of patients with traumatic brain injury (TBI). Computed tomography (CT) is the first-line imaging technique allowing rapid detection of primary structural brain lesions that require surgical intervention. CT also detects various deleterious secondary insults allowing early medical and surgical management. Serial imaging is critical to identifying secondary injuries. MR imaging is indicated in patients with acute TBI when CT fails to explain neurologic findings. However, MR imaging is superior in patients with subacute and chronic TBI and also predicts neurocognitive outcome. Copyright © 2015 Elsevier Inc. All rights reserved.

  5. A test of AMBER force fields in predicting the secondary structure of α-helical and β-hairpin peptides

    NASA Astrophysics Data System (ADS)

    Gao, Ya; Zhang, Chaomin; Wang, Xianwei; Zhu, Tong

    2017-07-01

    We tested the ability of some current AMBER force fields, namely, AMBER03, AMBER99SB, AMBER99SB-ildn, AMBER99SB-nmr, AMBER12SB, AMBER14SB, and AMBER14ipq, with an implicit solvent model, to reproduce the folding behavior of two peptides in REMD simulations. The AMBER99SB-nmr force field provides the most reliable performance. After a novel polarized hydrogen bond charge model is considered, the α-helix successfully folded to its native state, while further folding of the β-hairpin is not observed. This study strongly suggests that the polarization effect and a correct torsional term are important for investigating the dynamic and conformational properties of peptides with different secondary structures.

  6. IMG-ABC. A knowledge base to fuel discovery of biosynthetic gene clusters and novel secondary metabolites

    DOE PAGES

    Hadjithomas, Michalis; Chen, I-Min Amy; Chu, Ken; ...

    2015-07-14

    In the discovery of secondary metabolites, analysis of sequence data is a promising exploration path that remains largely underutilized due to the lack of computational platforms that enable such a systematic approach on a large scale. In this work, we present IMG-ABC (https://img.jgi.doe.gov/abc), an atlas of biosynthetic gene clusters within the Integrated Microbial Genomes (IMG) system, which is aimed at harnessing the power of “big” genomic data for discovering small molecules. IMG-ABC relies on IMG’s comprehensive integrated structural and functional genomic data for the analysis of biosynthetic gene clusters (BCs) and associated secondary metabolites (SMs). SMs and BCs serve as the two main classes of objects in IMG-ABC, each with a rich collection of attributes. A unique feature of IMG-ABC is the incorporation of both experimentally validated and computationally predicted BCs in genomes as well as metagenomes, thus identifying BCs in uncultured populations and rare taxa. We demonstrate the strength of IMG-ABC’s focused integrated analysis tools in enabling the exploration of microbial secondary metabolism on a global scale, through the discovery of phenazine-producing clusters for the first time in Alphaproteobacteria. IMG-ABC strives to fill the long-existent void of resources for computational exploration of the secondary metabolism universe; its underlying scalable framework enables traversal of uncovered phylogenetic and chemical structure space, serving as a doorway to a new era in the discovery of novel molecules. IMG-ABC is the largest publicly available database of predicted and experimental biosynthetic gene clusters and the secondary metabolites they produce. The system also includes powerful search and analysis tools that are integrated with IMG’s extensive genomic/metagenomic data and analysis tool kits. As new research on biosynthetic gene clusters and secondary metabolites is published and more genomes are sequenced, IMG-ABC will continue to expand, with the goal of becoming an essential component of any bioinformatic exploration of the secondary metabolism world.

  7. Computational RNomics of Drosophilids

    PubMed Central

    Rose, Dominic; Hackermüller, Jörg; Washietl, Stefan; Reiche, Kristin; Hertel, Jana; Findeiß, Sven; Stadler, Peter F; Prohaska, Sonja J

    2007-01-01

    Background Recent experimental and computational studies have provided overwhelming evidence for a plethora of diverse transcripts that are unrelated to protein-coding genes. One subclass consists of those RNAs that require distinctive secondary structure motifs to exert their biological function and hence exhibit distinctive patterns of sequence conservation characteristic for positive selection on RNA secondary structure. The deep-sequencing of 12 drosophilid species coordinated by the NHGRI provides an ideal data set of comparative computational approaches to determine those genomic loci that code for evolutionarily conserved RNA motifs. This class of loci includes the majority of the known small ncRNAs as well as structured RNA motifs in mRNAs. We report here on a genome-wide survey using RNAz. Results We obtain 16 000 high quality predictions among which we recover the majority of the known ncRNAs. Taking a pessimistically estimated false discovery rate of 40% into account, this implies that at least some ten thousand loci in the Drosophila genome show the hallmarks of stabilizing selection action of RNA structure, and hence are most likely functional at the RNA level. A subset of RNAz predictions overlapping with TRF1 and BRF binding sites [Isogai et al., EMBO J. 26: 79–89 (2007)], which are plausible candidates of Pol III transcripts, have been studied in more detail. Among these sequences we identify several "clusters" of ncRNA candidates with striking structural similarities. Conclusion The statistical evaluation of the RNAz predictions in comparison with a similar analysis of vertebrate genomes [Washietl et al., Nat. Biotech. 23: 1383–1390 (2005)] shows that qualitatively similar fractions of structured RNAs are found in introns, UTRs, and intergenic regions. 
    The intergenic RNA structures, however, are concentrated much more closely around known protein-coding loci, suggesting that flies have a significantly smaller complement of independent structured ncRNAs compared to mammals. PMID:17996037
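
    The "some ten thousand loci" estimate in this record follows from discounting the 16 000 predictions by the quoted 40% false discovery rate. A minimal sketch of that arithmetic, with a hypothetical function name chosen for the example:

```python
def expected_true_positives(n_predictions, false_discovery_rate):
    """Expected number of correct predictions given a false discovery rate."""
    return n_predictions * (1.0 - false_discovery_rate)


if __name__ == "__main__":
    # Figures taken from the abstract: 16 000 predictions, FDR of 40%.
    estimate = expected_true_positives(16000, 0.40)
    print(f"about {estimate:.0f} loci expected to be true positives")
```

    The result is roughly 9 600, which the abstract rounds to about ten thousand.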

  8. Secondary foundation species as drivers of trophic and functional diversity: evidence from a tree-epiphyte system.

    PubMed

    Angelini, Christine; Silliman, Brian R

    2014-01-01

    Facilitation cascades arise where primary foundation species facilitate secondary (dependent) foundation species, and collectively, they increase habitat complexity and quality to enhance biodiversity. Whether such phenomena occur in nonmarine systems and if secondary foundation species enhance food web structure (e.g., support novel feeding guilds) and ecosystem function (e.g., provide nursery for juveniles) remain unclear. Here we report on field experiments designed to test whether trees improve epiphyte survival and epiphytes secondarily increase the number and diversity of adult and juvenile invertebrates in a potential live oak-Tillandsia usneoides (Spanish moss) facilitation cascade. Our results reveal that trees reduce physical stress to facilitate Tillandsia, which, in turn, reduces desiccation and predation stress to facilitate invertebrates. In experimental removals, invertebrate total density, juvenile density, species richness and H' diversity were 16, 60, 1.7, and 1.5 times higher, and feeding guild richness and H' were 5 and 11 times greater in Tillandsia-colonized relative to Tillandsia-removal limb plots. Tillandsia enhanced communities similarly in a survey across the southeastern United States. These findings reveal that a facilitation cascade organizes this widespread terrestrial assemblage and expand the role of secondary foundation species as drivers of trophic structure and ecosystem function. We conceptualize the relationship between foundation species' structural attributes and associated species abundance and composition in a Foundation Species-Biodiversity (FSB) model. Importantly, the FSB predicts that, where secondary foundation species form expansive and functionally distinct structures that increase habitat availability and complexity within primary foundation species, they generate and maintain hot spots of biodiversity and trophic interactions.

  9. Protein Secondary Structures (alpha-helix and beta-sheet) at a Cellular Level and Protein Fractions in Relation to Rumen Degradation Behaviours of Protein: A New Approach

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yu,P.

    2007-01-01

    Studying the secondary structure of proteins leads to an understanding of the components that make up a whole protein, and such an understanding of the structure of the whole protein is often vital to understanding its digestive behaviour and nutritive value in animals. The main protein secondary structures are the α-helix and β-sheet. The percentage of these two structures in protein secondary structures influences protein nutritive value, quality and digestive behaviour. A high percentage of β-sheet structure may partly cause a low access to gastrointestinal digestive enzymes, which results in a low protein value. The objectives of the present study were to use advanced synchrotron-based Fourier transform IR (S-FTIR) microspectroscopy as a new approach to reveal the molecular chemistry of the protein secondary structures of feed tissues affected by heat-processing within intact tissue at a cellular level, and to quantify protein secondary structures using multicomponent peak modelling Gaussian and Lorentzian methods, in relation to protein digestive behaviours and nutritive value in the rumen, which was determined using the Cornell Net Carbohydrate Protein System. The synchrotron-based molecular chemistry research experiment was performed at the National Synchrotron Light Source at Brookhaven National Laboratory, US Department of Energy. The results showed that, with S-FTIR microspectroscopy, the molecular chemistry, ultrastructural chemical make-up and nutritive characteristics could be revealed at a high ultraspatial resolution (~10 μm). S-FTIR microspectroscopy revealed that the secondary structure of protein differed between raw and roasted golden flaxseeds in terms of the percentages and ratio of α-helixes and β-sheets in the mid-IR range at the cellular level. By using multicomponent peak modelling, the results show that the roasting reduced (P < 0.05) the percentage of α-helixes (from 47.1% to 36.1%: S-FTIR absorption intensity), increased the percentage of β-sheets (from 37.2% to 49.8%: S-FTIR absorption intensity) and reduced the α-helix to β-sheet ratio (from 0.3 to 0.7) in the golden flaxseeds, which indicated a negative effect of the roasting on protein values, utilisation and bioavailability. These results were proved by the Cornell Net Carbohydrate Protein System in situ animal trial, which also revealed that roasting increased the amount of protein bound to lignin, as well as of the Maillard reaction protein (both of which are poorly used by ruminants), and increased the level of indigestible and undegradable protein in ruminants. The present results demonstrate the potential of highly spatially resolved synchrotron-based infrared microspectroscopy to locate 'pure' protein in feed tissues, and reveal protein secondary structures and digestive behaviour, making a significant step forward in and an important contribution to protein nutritional research. Further study is needed to determine the sensitivities of protein secondary structures to various heat-processing conditions, and to quantify the relationship between protein secondary structures and the nutrient availability and digestive behaviour of various protein sources. Information from the present study arising from the synchrotron-based IR probing of the protein secondary structures of protein sources at the cellular level will be valuable as a guide to maintaining protein quality and predicting digestive behaviours.
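
    The helix-to-sheet ratio can be recomputed directly from the percentages reported in this record. The sketch below uses a hypothetical helper name and only the four percentages given in the abstract. Computed this way, the ratio falls from about 1.27 (raw) to about 0.72 (roasted), so the printed starting value of 0.3 may be a typographical slip for 1.3; that reading is an inference from the arithmetic, not something the record states.

```python
def helix_to_sheet_ratio(helix_pct, sheet_pct):
    """Ratio of alpha-helix to beta-sheet content, both given as percentages."""
    return helix_pct / sheet_pct


if __name__ == "__main__":
    # Percentages quoted in the abstract (S-FTIR absorption intensity).
    raw = helix_to_sheet_ratio(47.1, 37.2)
    roasted = helix_to_sheet_ratio(36.1, 49.8)
    print(f"raw: {raw:.2f}, roasted: {roasted:.2f}")
```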

  10. Protein backbone and sidechain torsion angles predicted from NMR chemical shifts using artificial neural networks

    PubMed Central

    Shen, Yang; Bax, Ad

    2013-01-01

    A new program, TALOS-N, is introduced for predicting protein backbone torsion angles from NMR chemical shifts. The program relies far more extensively on the use of trained artificial neural networks than its predecessor, TALOS+. Validation on an independent set of proteins indicates that backbone torsion angles can be predicted for a larger, ≥ 90% fraction of the residues, with an error rate smaller than ca 3.5%, using an acceptance criterion that is nearly two-fold tighter than that used previously, and a root mean square difference between predicted and crystallographically observed (φ,ψ) torsion angles of ca 12°. TALOS-N also reports sidechain χ1 rotameric states for about 50% of the residues, and a consistency with reference structures of 89%. The program includes a neural network trained to identify secondary structure from residue sequence and chemical shifts. PMID:23728592

  11. Association of BMPR-1B and GDF9 genes polymorphisms and secondary protein structure changes with reproduction traits in Mehraban ewes.

    PubMed

    Abdoli, R; Zamani, P; Deljou, A; Rezvan, H

    2013-07-25

    BMPR-1B and GDF9 genes are well known for their important effects on litter size and on the mechanisms controlling ovulation rate in sheep. In the present study, polymorphisms of BMPR-1B gene exon 8 and GDF9 gene exon 1 were detected by single strand conformational polymorphism (SSCP) analysis and DNA sequencing methods in 100 Mehraban ewes. PCR was used to amplify 140- and 380-bp fragments of the BMPR-1B and GDF9 genes, respectively. Two single nucleotide polymorphisms (SNPs) were identified in two different SSCP patterns of the BMPR-1B gene (CC and CA genotypes), resulting in one deduced amino acid exchange. Also, two SNPs were identified in three different SSCP patterns of the GDF9 gene (AA, AG and GG genotypes), resulting in one deduced amino acid exchange. Two different secondary protein structures were predicted for BMPR-1B exon 8, but the secondary protein structures predicted for GDF9 exon 1 were similar to each other. The evaluation of the associations of the SSCP patterns and the protein structure changes with reproduction traits showed that BMPR-1B exon 8 genotypes have significant effects on some reproduction traits, but the GDF9 genotypes did not have any significant effect. The CA genotype of BMPR-1B exon 8 had a significant positive effect on reproduction performance and could be considered an important new mutation affecting ewe reproduction performance. Marker-assisted selection using the BMPR-1B gene could be considered to improve reproduction traits in Mehraban sheep. Copyright © 2013 Elsevier B.V. All rights reserved.

  12. Global Organization of a Positive-strand RNA Virus Genome

    PubMed Central

    Wu, Baodong; Grigull, Jörg; Ore, Moriam O.; Morin, Sylvie; White, K. Andrew

    2013-01-01

    The genomes of plus-strand RNA viruses contain many regulatory sequences and structures that direct different viral processes. The traditional view of these RNA elements is as local structures present in non-coding regions. However, this view is changing due to the discovery of regulatory elements in coding regions and functional long-range intra-genomic base pairing interactions. The ∼4.8 kb long RNA genome of the tombusvirus tomato bushy stunt virus (TBSV) contains these types of structural features, including six different functional long-distance interactions. We hypothesized that to achieve these multiple interactions this viral genome must utilize a large-scale organizational strategy and, accordingly, we sought to assess the global conformation of the entire TBSV genome. Atomic force micrographs of the genome indicated a mostly condensed structure composed of interconnected protrusions extending from a central hub. This configuration was consistent with the genomic secondary structure model generated using high-throughput selective 2′-hydroxyl acylation analysed by primer extension (i.e. SHAPE), which predicted different sized RNA domains originating from a central region. Known RNA elements were identified in both domain and inter-domain regions, and novel structural features were predicted and functionally confirmed. Interestingly, only two of the six long-range interactions known to form were present in the structural model. However, for those interactions that did not form, complementary partner sequences were positioned relatively close to each other in the structure, suggesting that the secondary structure level of viral genome structure could provide a basic scaffold for the formation of different long-range interactions. The higher-order structural model for the TBSV RNA genome provides a snapshot of the complex framework that allows multiple functional components to operate in concert within a confined context. PMID:23717202

  13. Enniatin and Beauvericin Biosynthesis in Fusarium Species: Production Profiles and Structural Determinant Prediction.

    PubMed

    Liuzzi, Vania C; Mirabelli, Valentina; Cimmarusti, Maria Teresa; Haidukowski, Miriam; Leslie, John F; Logrieco, Antonio F; Caliandro, Rocco; Fanelli, Francesca; Mulè, Giuseppina

    2017-01-25

    Members of the fungal genus Fusarium can produce numerous secondary metabolites, including the nonribosomal mycotoxins beauvericin (BEA) and enniatins (ENNs). Both mycotoxins are synthesized by the multifunctional enzyme enniatin synthetase (ESYN1) that contains both peptide synthetase and S-adenosyl-l-methionine-dependent N-methyltransferase activities. Several Fusarium species can produce ENNs, BEA or both, but the mechanism(s) enabling these differential metabolic profiles is unknown. In this study, we analyzed the primary structure of ESYN1 by sequencing esyn1 transcripts from different Fusarium species. We measured ENNs and BEA production by ultra-performance liquid chromatography coupled with photodiode array and Acquity QDa mass detector (UPLC-PDA-QDa) analyses. We predicted protein structures, compared the predictions by multivariate analysis methods and found a striking correlation between BEA/ENN-producing profiles and ESYN1 three-dimensional structures. Structural differences in the β strand's Asn789-Ala793 and His797-Asp802 portions of the amino acid adenylation domain can be used to distinguish BEA/ENN-producing Fusarium isolates from those that produce only ENN.

  14. Lost in folding space? Comparing four variants of the thermodynamic model for RNA secondary structure prediction.

    PubMed

    Janssen, Stefan; Schudoma, Christian; Steger, Gerhard; Giegerich, Robert

    2011-11-03

    Many bioinformatics tools for RNA secondary structure analysis are based on a thermodynamic model of RNA folding. They predict a single, "optimal" structure by free energy minimization, they enumerate near-optimal structures, they compute base pair probabilities and dot plots, representative structures of different abstract shapes, or Boltzmann probabilities of structures and shapes. Although all programs refer to the same physical model, they implement it with considerable variation for different tasks, and little is known about the effects of heuristic assumptions and model simplifications used by the programs on the outcome of the analysis. We extract four different models of the thermodynamic folding space which underlie the programs RNAFOLD, RNASHAPES, and RNASUBOPT. Their differences lie within the details of the energy model and the granularity of the folding space. We implement probabilistic shape analysis for all models, and introduce the shape probability shift as a robust measure of model similarity. Using four data sets derived from experimentally solved structures, we provide a quantitative evaluation of the model differences. We find that search space granularity affects the computed shape probabilities less than the over- or underapproximation of free energy by a simplified energy model. Still, the approximations perform similar enough to implementations of the full model to justify their continued use in settings where computational constraints call for simpler algorithms. On the side, we observe that the rarely used level 2 shapes, which predict the complete arrangement of helices, multiloops, internal loops and bulges, include the "true" shape in a rather small number of predicted high probability shapes. This calls for an investigation of new strategies to extract high probability members from the (very large) level 2 shape space of an RNA sequence. 
    We provide implementations of all four models, written in a declarative style that makes them easy to modify. Based on our study, future work on thermodynamic RNA folding may make a choice of model based on our empirical data. It can take our implementations as a starting point for further program development.
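
    The relation between Boltzmann probabilities of structures and of shapes mentioned in this abstract can be illustrated with a minimal sketch. It assumes a toy ensemble in which each structure already carries a shape label and a free energy; the shape strings, the energies and the function names are hypothetical and are not taken from RNAFOLD, RNASHAPES or RNASUBOPT. A shape's probability is taken to be the sum of the Boltzmann probabilities of the structures that map to it.

```python
import math
from collections import defaultdict

# Gas constant (kcal/(mol*K)) times temperature (37 degrees C in kelvin).
RT = 0.0019872 * 310.15

def boltzmann_probabilities(energies):
    """Convert free energies (kcal/mol) into Boltzmann probabilities."""
    weights = [math.exp(-e / RT) for e in energies]
    partition = sum(weights)  # partition function of the toy ensemble
    return [w / partition for w in weights]

def shape_probabilities(structures):
    """Sum structure probabilities per shape label.

    structures: list of (shape_label, free_energy) pairs.
    """
    probs = boltzmann_probabilities([energy for _, energy in structures])
    totals = defaultdict(float)
    for (shape, _), p in zip(structures, probs):
        totals[shape] += p
    return dict(totals)

# Hypothetical ensemble: two hairpin-like structures, one two-hairpin
# structure, and the open chain.
toy_ensemble = [("[]", -3.0), ("[]", -2.5), ("[][]", -2.8), ("_", 0.0)]
shape_probs = shape_probabilities(toy_ensemble)
for shape, p in sorted(shape_probs.items(), key=lambda kv: -kv[1]):
    print(f"{shape:5s} {p:.3f}")
```

    In this toy case the shape "[]" collects probability from two structures, so it outranks "[][]" even though neither of its members is far below the "[][]" structure in energy.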

  15. A Structural Equation Model Explaining 8th Grade Students' Mathematics Achievements

    ERIC Educational Resources Information Center

    Yurt, Eyüp; Sünbül, Ali Murat

    2014-01-01

    The purpose of this study is to investigate, via a model, the explanatory and predictive relationships among the following variables: Mathematical Problem Solving and Reasoning Skills, Sources of Mathematics Self-Efficacy, Spatial Ability, and Mathematics Achievements of Secondary School 8th Grade Students. The sample group of the study, itself…

  16. Genome-Wide Prediction of Intrinsic Disorder; Sequence Alignment of Intrinsically Disordered Proteins

    ERIC Educational Resources Information Center

    Midic, Uros

    2012-01-01

    Intrinsic disorder (ID) is defined as a lack of stable tertiary and/or secondary structure under physiological conditions in vitro. Intrinsically disordered proteins (IDPs) are highly abundant in nature. IDPs possess a number of crucial biological functions, being involved in regulation, recognition, signaling and control, e.g. their functional…

  17. Exact calculation of distributions on integers, with application to sequence alignment.

    PubMed

    Newberg, Lee A; Lawrence, Charles E

    2009-01-01

    Computational biology is replete with high-dimensional discrete prediction and inference problems. Dynamic programming recursions can be applied to several of the most important of these, including sequence alignment, RNA secondary-structure prediction, phylogenetic inference, and motif finding. In these problems, attention is frequently focused on some scalar quantity of interest, a score, such as an alignment score or the free energy of an RNA secondary structure. In many cases, the score is naturally defined on integers, such as a count of the number of pairing differences between two sequence alignments, or else an integer score has been adopted for computational reasons, such as in the test of significance of motif scores. The probability distribution of the score under an appropriate probabilistic model is of interest, such as in tests of significance of motif scores, or in calculation of Bayesian confidence limits around an alignment. Here we present three algorithms for calculating the exact distribution of a score of this type; then, in the context of pairwise local sequence alignments, we apply the approach so as to find the alignment score distribution and Bayesian confidence limits.
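
    As a hedged illustration of what an exact distribution of an integer score can look like, the sketch below handles the simplest case named in the abstract, a motif score. It is not one of the three algorithms of the paper. It assumes a hypothetical integer-valued position weight matrix and an independent, identically distributed background model, and it propagates the score distribution column by column with exact rational arithmetic.

```python
from collections import defaultdict
from fractions import Fraction

def exact_score_distribution(pwm, background):
    """Exact distribution of an integer motif score under an i.i.d. background.

    pwm:        list of dicts, one per motif column, mapping base -> integer score
    background: dict mapping base -> probability (as Fraction)
    Returns a dict mapping each reachable total score to its exact probability.
    """
    dist = {0: Fraction(1)}  # before any column, the score is 0 with certainty
    for column in pwm:
        updated = defaultdict(Fraction)
        for score, p in dist.items():
            for base, q in background.items():
                updated[score + column[base]] += p * q
        dist = dict(updated)
    return dist

# Hypothetical two-column matrix and uniform background.
background = {base: Fraction(1, 4) for base in "ACGT"}
pwm = [
    {"A": 2, "C": -1, "G": -1, "T": 0},
    {"A": 0, "C": 3, "G": -2, "T": -1},
]
dist = exact_score_distribution(pwm, background)

# Exact tail probability (p-value) of observing a score of at least 4.
p_value = sum(p for score, p in dist.items() if score >= 4)
print(sorted(dist.items()))
print("P(score >= 4) =", p_value)
```

    Because the scores are integers, the table stays small (one entry per reachable score), and the tail probability is exact rather than a sampling estimate.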

  18. Absolute comparison of simulated and experimental protein-folding dynamics

    NASA Astrophysics Data System (ADS)

    Snow, Christopher D.; Nguyen, Houbi; Pande, Vijay S.; Gruebele, Martin

    2002-11-01

    Protein folding is difficult to simulate with classical molecular dynamics. Secondary structure motifs such as α-helices and β-hairpins can form in 0.1-10µs (ref. 1), whereas small proteins have been shown to fold completely in tens of microseconds. The longest folding simulation to date is a single 1-µs simulation of the villin headpiece; however, such single runs may miss many features of the folding process as it is a heterogeneous reaction involving an ensemble of transition states. Here, we have used a distributed computing implementation to produce tens of thousands of 5-20-ns trajectories (700µs) to simulate mutants of the designed mini-protein BBA5. The fast relaxation dynamics these predict were compared with the results of laser temperature-jump experiments. Our computational predictions are in excellent agreement with the experimentally determined mean folding times and equilibrium constants. The rapid folding of BBA5 is due to the swift formation of secondary structure. The convergence of experimentally and computationally accessible timescales will allow the comparison of absolute quantities characterizing in vitro and in silico (computed) protein folding.

  19. Systematically frameshifting by deletion of every 4th or 4th and 5th nucleotides during mitochondrial transcription: RNA self-hybridization regulates delRNA expression.

    PubMed

    Seligmann, Hervé

    2016-01-01

    In mitochondria, secondary structures punctuate post-transcriptional RNA processing. Recently described transcripts match the human mitogenome after systematic deletions of every 4th, respectively every 4th and 5th nucleotides, called delRNAs. Here I explore predicted stem-loop hairpin formation by delRNAs, and their associations with delRNA transcription and detected peptides matching their translation. Despite missing 25, respectively 40% of the nucleotides in the original sequence, del-transformed sequences form significantly more secondary structures than corresponding randomly shuffled sequences, indicating biological function, independently of, and in combination with, previously detected delRNA and thereof translated peptides. Self-hybridization decreases delRNA abundances, indicating downregulation. Systematic deletions of the human mitogenome reveal new, unsuspected coding and structural information. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  20. A numerical study of the complex flow structure in a compound meandering channel

    NASA Astrophysics Data System (ADS)

    Moncho-Esteve, Ignacio J.; García-Villalba, Manuel; Muto, Yasu; Shiono, Koji; Palau-Salvador, Guillermo

    2018-06-01

    In this study, we report large eddy simulations of turbulent flow in a periodic compound meandering channel for three different depth conditions: one in-bank and two overbank conditions. The flow configuration corresponds to the experiments of Shiono and Muto (1998). The predicted mean streamwise velocities, mean secondary motions, velocity fluctuations, turbulent kinetic energy as well as mean flood flow angle to meandering channel are in good agreement with the experimental measurements. We have analyzed the flow structure as a function of the inundation level, with particular emphasis on the development of the secondary motions due to the interaction between the main channel and the floodplain flow. Bed shear stresses have been also estimated in the simulations. Floodplain flow has a significant impact on the flow structure leading to significantly different bed shear stress patterns within the main meandering channel. The implications of these results for natural compound meandering channels are also discussed.

  1. Predicting students' physical activity and health-related well-being: a prospective cross-domain investigation of motivation across school physical education and exercise settings.

    PubMed

    Standage, Martyn; Gillison, Fiona B; Ntoumanis, Nikos; Treasure, Darren C

    2012-02-01

    A three-wave prospective design was used to assess a model of motivation guided by self-determination theory (Ryan & Deci, 2008) spanning the contexts of school physical education (PE) and exercise. The outcome variables examined were health-related quality of life (HRQoL), physical self-concept (PSC), and 4 days of objectively assessed estimates of activity. Secondary school students (n = 494) completed questionnaires at three separate time points and were familiarized with how to use a sealed pedometer. Results of structural equation modeling supported a model in which perceptions of autonomy support from a PE teacher positively predicted PE-related need satisfaction (autonomy, competence, and relatedness). Competence predicted PSC, whereas relatedness predicted HRQoL. Autonomy and competence positively predicted autonomous motivation toward PE, which in turn positively predicted autonomous motivation toward exercise (i.e., 4-day pedometer step count). Autonomous motivation toward exercise positively predicted step count, HRQoL, and PSC. Results of multisample structural equation modeling supported gender invariance. Suggestions for future work are discussed.

  2. Deposition and reentrainment of Brownian particles in porous media under unfavorable chemical conditions: some concepts and applications.

    PubMed

    Hahn, Melinda W; O'Meliae, Charles R

    2004-01-01

    The deposition and reentrainment of particles in porous media have been examined theoretically and experimentally. A Brownian Dynamics/Monte Carlo (MC/BD) model has been developed that simulates the movement of Brownian particles near a collector under "unfavorable" chemical conditions and allows deposition in primary and secondary minima. A simple Maxwell approach has been used to estimate particle attachment efficiency by assuming deposition in the secondary minimum and calculating the probability of reentrainment. The MC/BD simulations and the Maxwell calculations support an alternative view of the deposition and reentrainment of Brownian particles under unfavorable chemical conditions. These calculations indicate that deposition into and subsequent release from secondary minima can explain reported discrepancies between classic model predictions that assume irreversible deposition in a primary well and experimentally determined deposition efficiencies that are orders of magnitude larger than Interaction Force Boundary Layer (IFBL) predictions. The commonly used IFBL model, for example, is based on the notion of transport over an energy barrier into the primary well and does not address contributions of secondary minimum deposition. A simple Maxwell model based on deposition into and reentrainment from secondary minima is much more accurate in predicting deposition rates for column experiments at low ionic strengths. It also greatly reduces the substantial particle size effects inherent in IFBL models, wherein particle attachment rates are predicted to decrease significantly with increasing particle size. This view is consistent with recent work by others addressing the composition and structure of the first few nanometers at solid-water interfaces including research on modeling water at solid-liquid interfaces, surface speciation, interfacial force measurements, and the rheological properties of concentrated suspensions. 
    It follows that deposition under these conditions will depend on the depth of the secondary minimum and that some transition between secondary and primary depositions should occur when the height of the energy barrier is on the order of several kT. When deposition in secondary minima predominates, observed deposition should increase with increasing ionic strength, particle size, and Hamaker constant. Since an equilibrium can develop between bound and bulk particles, the collision efficiency α can no longer be considered a constant for a given physical and chemical system. Rather, in many cases it can decrease over time until it eventually reaches zero as equilibrium is established.
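
    The dependence of deposition on the depth of the secondary minimum can be made concrete with a small sketch. The abstract does not give the formula of its Maxwell approach, so the following is an assumption: particles are taken to have kinetic energies following a three-dimensional Maxwell-Boltzmann distribution, and a particle is counted as retained when its kinetic energy is below the well depth. The function name and the sample depths are hypothetical.

```python
import math

def retained_fraction(well_depth_kt):
    """Fraction of a 3D Maxwell-Boltzmann kinetic energy distribution that
    lies below a well of the given depth (depth expressed in units of kT).

    Uses the closed-form cumulative distribution
        F(u) = erf(sqrt(u)) - 2 * sqrt(u / pi) * exp(-u).
    """
    if well_depth_kt <= 0:
        return 0.0
    u = well_depth_kt
    return math.erf(math.sqrt(u)) - 2.0 * math.sqrt(u / math.pi) * math.exp(-u)

# Hypothetical secondary-minimum depths, in kT.
for depth in (0.5, 1.0, 2.0, 5.0, 10.0):
    print(f"depth = {depth:4.1f} kT  retained fraction = {retained_fraction(depth):.4f}")
```

    Under this assumption the retained fraction rises steeply between a fraction of a kT and a few kT, which is in line with the abstract's statement that deposition depends on the depth of the secondary minimum.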

  3. General Mechanism of Two-State Protein Folding Kinetics

    PubMed Central

    Rollins, Geoffrey C.; Dill, Ken A.

    2016-01-01

    We describe here a general model of the kinetic mechanism of protein folding. In the Foldon Funnel Model, proteins fold in units of secondary structures, which form sequentially along the folding pathway, stabilized by tertiary interactions. The model predicts that the free energy landscape has a volcano shape, rather than a simple funnel, that folding is two-state (single-exponential) when secondary structures are intrinsically unstable, and that each structure along the folding path is a transition state for the previous structure. It shows how sequential pathways are consistent with multiple stochastic routes on funnel landscapes, and it gives good agreement with the nine-orders-of-magnitude dependence of folding rates on protein size for a set of 93 proteins; at the same time, it is consistent with the near independence of the folding equilibrium constant from size. This model gives estimates of folding rates of proteomes, leading to a median folding time in Escherichia coli of about 5 s. PMID:25056406

  4. miRCat2: accurate prediction of plant and animal microRNAs from next-generation sequencing datasets

    PubMed Central

    Paicu, Claudia; Mohorianu, Irina; Stocks, Matthew; Xu, Ping; Coince, Aurore; Billmeier, Martina; Dalmay, Tamas; Moulton, Vincent; Moxon, Simon

    2017-01-01

    Motivation: MicroRNAs are a class of ∼21–22 nt small RNAs which are excised from a stable hairpin-like secondary structure. They have important gene regulatory functions and are involved in many pathways including developmental timing, organogenesis and development in eukaryotes. There are several computational tools for miRNA detection from next-generation sequencing datasets. However, many of these tools suffer from high false positive and false negative rates. Here we present a novel miRNA prediction algorithm, miRCat2. miRCat2 incorporates a new entropy-based approach to detect miRNA loci, which is designed to cope with the high sequencing depth of current next-generation sequencing datasets. It has a user-friendly interface and produces graphical representations of the hairpin structure and plots depicting the alignment of sequences on the secondary structure. Results: We test miRCat2 on a number of animal and plant datasets and present a comparative analysis with miRCat, miRDeep2, miRPlant and miReap. We also use mutants in the miRNA biogenesis pathway to evaluate the predictions of these tools. Results indicate that miRCat2 has an improved accuracy compared with other methods tested. Moreover, miRCat2 predicts several new miRNAs that are differentially expressed in wild-type versus mutants in the miRNA biogenesis pathway. Availability and Implementation: miRCat2 is part of the UEA small RNA Workbench and is freely available from http://srna-workbench.cmp.uea.ac.uk/. Contact: v.moulton@uea.ac.uk or s.moxon@uea.ac.uk. Supplementary information: Supplementary data are available at Bioinformatics online. PMID:28407097

  5. The unfoldomics decade: an update on intrinsically disordered proteins.

    PubMed

    Dunker, A Keith; Oldfield, Christopher J; Meng, Jingwei; Romero, Pedro; Yang, Jack Y; Chen, Jessica Walton; Vacic, Vladimir; Obradovic, Zoran; Uversky, Vladimir N

    2008-09-16

    Our first predictor of protein disorder was published just over a decade ago in the Proceedings of the IEEE International Conference on Neural Networks (Romero P, Obradovic Z, Kissinger C, Villafranca JE, Dunker AK (1997) Identifying disordered regions in proteins from amino acid sequence. Proceedings of the IEEE International Conference on Neural Networks, 1: 90-95). By now more than twenty other laboratory groups have joined the efforts to improve the prediction of protein disorder. While the various prediction methodologies used for protein intrinsic disorder resemble those methodologies used for secondary structure prediction, the two types of structures are entirely different. For example, the two structural classes have very different dynamic properties, with the irregular secondary structure class being much less mobile than the disorder class. The prediction of secondary structure has been useful. On the other hand, the prediction of intrinsic disorder has been revolutionary, leading to major modifications of the more than 100 year-old views relating protein structure and function. Experimentalists have been providing evidence over many decades that some proteins lack fixed structure or are disordered (or unfolded) under physiological conditions. In addition, experimentalists are also showing that, for many proteins, their functions depend on the unstructured rather than structured state; such results are in marked contrast to the greater than hundred year old views such as the lock and key hypothesis. Despite extensive data on many important examples, including disease-associated proteins, the importance of disorder for protein function has been largely ignored. Indeed, to our knowledge, current biochemistry books don't present even one acknowledged example of a disorder-dependent function, even though some reports of disorder-dependent functions are more than 50 years old. 
    The results from genome-wide predictions of intrinsic disorder and the results from other bioinformatics studies of intrinsic disorder are demanding attention for these proteins. Disorder prediction has been important for showing that the relatively few experimentally characterized examples are members of a very large collection of related disordered proteins that are widespread over all three domains of life. Many significant biological functions are now known to depend directly on, or are importantly associated with, the unfolded or partially folded state. Here our goal is to review the key discoveries and to weave these discoveries together to support novel approaches for understanding sequence-function relationships. Intrinsically disordered protein is common across the three domains of life, but especially common among the eukaryotic proteomes. Signaling sequences and sites of posttranslational modifications are frequently, or very likely most often, located within regions of intrinsic disorder. Disorder-to-order transitions are coupled with the adoption of different structures with different partners. Also, the flexibility of intrinsic disorder helps different disordered regions to bind to a common binding site on a common partner. Such capacity for binding diversity plays important roles in both protein-protein interaction networks and likely also in gene regulation networks. Such disorder-based signaling is further modulated in multicellular eukaryotes by alternative splicing, for which such splicing events map to regions of disorder much more often than to regions of structure. Associating alternative splicing with disorder rather than structure alleviates theoretical and experimentally observed problems associated with the folding of different length, isomeric amino acid sequences.
    The combination of disorder and alternative splicing is proposed to provide a mechanism for easily "trying out" different signaling pathways, thereby providing the mechanism for generating signaling diversity and enabling the evolution of cell differentiation and multicellularity. Finally, several recent small molecules of interest as potential drugs have been shown to act by blocking protein-protein interactions based on intrinsic disorder of one of the partners. Study of these examples has led to a new approach for drug discovery, and bioinformatics analysis of the human proteome suggests that various disease-associated proteins are very rich in such disorder-based drug discovery targets.

  6. Hydrophobic cluster analysis of G protein-coupled receptors: a powerful tool to derive structural and functional information from 2D-representation of protein sequences.

    PubMed

    Lentes, K U; Mathieu, E; Bischoff, R; Rasmussen, U B; Pavirani, A

    1993-01-01

    Current methods for comparative analyses of protein sequences are 1D-alignments of amino acid sequences based on the maximization of amino acid identity (homology) and the prediction of secondary structure elements. This method has a major drawback once the amino acid identity drops below 20-25%, since maximization of a homology score does not take into account any structural information. A new technique called Hydrophobic Cluster Analysis (HCA) has been developed by Lemesle-Varloot et al. (Biochimie 72, 555-574, 1990). This consists of comparing several sequences simultaneously and combining homology detection with secondary structure analysis. HCA is primarily based on the detection and comparison of structural segments constituting the hydrophobic core of globular protein domains, with or without transmembrane domains. We have applied HCA to the analysis of different families of G-protein coupled receptors, such as catecholamine receptors as well as peptide hormone receptors. Utilizing HCA the thrombin receptor, a new and as yet unique member of the family of G-protein coupled receptors, can be clearly classified as being closely related to the family of neuropeptide receptors rather than to the catecholamine receptors for which the shape of the hydrophobic clusters and the length of their third cytoplasmic loop are very different. Furthermore, the potential of HCA to predict relationships between new putative and already characterized members of this family of receptors will be presented.

  7. Functional genomics of gam56: characterisation of the role of a 56 kilodalton sexual stage antigen in oocyst wall formation in Eimeria maxima.

    PubMed

    Belli, Sabina I; Witcombe, David; Wallach, Michael G; Smith, Nicholas C

    2002-12-19

    Gam56 (M(r) 56,000) is an antigen found in the sexual (macrogametocyte) stage of the intestinal parasite Eimeria maxima that is implicated in protective immunity. The gene (gam56) encoding this protein was cloned and sequenced. It is a single-copy, intronless gene, that localises to a 1,754 bp transcript, and is first detected at 120 h p.i. The gene predicts two distinct protein domains; a tyrosine-serine rich region, composed of amino acids implicated in oocyst wall formation in Eimeria spp., and a proline-methionine rich region often detected in extensins, protein components of plant cell walls. The tyrosine-serine rich region predicts a secondary structure commonly seen in the structural protein fibroin, a component of the cocoon of the caterpillar Bombyx mori. The inference that gam56 is a structural component of the oocyst wall was confirmed when a specific antibody to gam56 recognised the wall forming bodies in macrogametocytes, and the walls of oocysts and sporocysts. Together, these data identify a developmentally regulated, sexual stage gene in E. maxima that shares primary and secondary structure features in common with intrinsic structural proteins in other parasites such as Schistosoma mansoni and Fasciola hepatica, and other organisms across different phyla, including the caterpillar Bombyx mori. In addition, these findings provide evidence for the molecular mechanisms underlying oocyst wall formation in Eimeria and the role of gametocyte antigens in this process.

  8. A Grammatical Approach to RNA-RNA Interaction Prediction

    NASA Astrophysics Data System (ADS)

    Kato, Yuki; Akutsu, Tatsuya; Seki, Hiroyuki

    2007-11-01

    Much attention has been paid to two interacting RNA molecules involved in post-transcriptional control of gene expression. Although there have been a few studies on RNA-RNA interaction prediction based on dynamic programming algorithms, no grammar-based approach has been proposed. The purpose of this paper is to provide a new model of RNA-RNA interaction based on multiple context-free grammar (MCFG). We present a polynomial time parsing algorithm for finding the most likely derivation tree for the stochastic version of MCFG, which is applicable to RNA joint secondary structure prediction including kissing hairpin loops. Also, elementary tests on RNA-RNA interaction prediction have shown that the proposed method is comparable to Alkan et al.'s method.

  9. Oxysulfide LiAlSO: A Lithium Superionic Conductor from First Principles.

    PubMed

    Wang, Xuelong; Xiao, Ruijuan; Li, Hong; Chen, Liquan

    2017-05-12

    Through first-principles calculations and crystal structure prediction techniques, we identify a new layered oxysulfide LiAlSO in orthorhombic structure as a novel lithium superionic conductor. Two kinds of stacking sequences of layers of AlS2O2 are found in different temperature ranges. Phonon and molecular dynamics simulations verify their dynamic stabilities, and wide band gaps up to 5.6 eV are found by electronic structure calculations. The lithium migration energy barrier simulations reveal the collective interstitial-host ion "kick-off" hopping mode with barriers lower than 50 meV as the dominating conduction mechanism for LiAlSO, indicating it to be a promising solid-state electrolyte in lithium secondary batteries with fast ionic conductivity and a wide electrochemical window. This is a first attempt in which the lithium superionic conductors are designed by the crystal structure prediction method and may help explore other mixed-anion battery materials.

  10. Oxysulfide LiAlSO: A Lithium Superionic Conductor from First Principles

    NASA Astrophysics Data System (ADS)

    Wang, Xuelong; Xiao, Ruijuan; Li, Hong; Chen, Liquan

    2017-05-01

    Through first-principles calculations and crystal structure prediction techniques, we identify a new layered oxysulfide LiAlSO in orthorhombic structure as a novel lithium superionic conductor. Two kinds of stacking sequences of layers of AlS2O2 are found in different temperature ranges. Phonon and molecular dynamics simulations verify their dynamic stabilities, and wide band gaps up to 5.6 eV are found by electronic structure calculations. The lithium migration energy barrier simulations reveal the collective interstitial-host ion "kick-off" hopping mode with barriers lower than 50 meV as the dominating conduction mechanism for LiAlSO, indicating it to be a promising solid-state electrolyte in lithium secondary batteries with fast ionic conductivity and a wide electrochemical window. This is a first attempt in which the lithium superionic conductors are designed by the crystal structure prediction method and may help explore other mixed-anion battery materials.

  11. SFESA: a web server for pairwise alignment refinement by secondary structure shifts.

    PubMed

    Tong, Jing; Pei, Jimin; Grishin, Nick V

    2015-09-03

    Protein sequence alignment is essential for a variety of tasks such as homology modeling and active site prediction. Alignment errors remain the main cause of low-quality structure models. A bioinformatics tool to refine alignments is needed to make protein alignments more accurate. We developed the SFESA web server to refine pairwise protein sequence alignments. Compared to the previous version of SFESA, which required a set of 3D coordinates for a protein, the new server will search a sequence database for the closest homolog with an available 3D structure to be used as a template. For each alignment block defined by secondary structure elements in the template, SFESA evaluates alignment variants generated by local shifts and selects the best-scoring alignment variant. A scoring function that combines the sequence score of profile-profile comparison and the structure score of template-derived contact energy is used for evaluation of alignments. PROMALS pairwise alignments refined by SFESA are more accurate than those produced by current advanced alignment methods such as HHpred and CNFpred. In addition, SFESA also improves alignments generated by other software. SFESA is a web-based tool for alignment refinement, designed for researchers to compute, refine, and evaluate pairwise alignments with a combined sequence and structure scoring of alignment blocks. To our knowledge, the SFESA web server is the only tool that refines alignments by evaluating local shifts of secondary structure elements. The SFESA web server is available at http://prodata.swmed.edu/sfesa.

  12. PRISM 3: expanded prediction of natural product chemical structures from microbial genomes.

    PubMed

    Skinnider, Michael A; Merwin, Nishanth J; Johnston, Chad W; Magarvey, Nathan A

    2017-07-03

    Microbial natural products represent a rich resource of pharmaceutically and industrially important compounds. Genome sequencing has revealed that the majority of natural products remain undiscovered, and computational methods to connect biosynthetic gene clusters to their corresponding natural products therefore have the potential to revitalize natural product discovery. Previously, we described PRediction Informatics for Secondary Metabolomes (PRISM), a combinatorial approach to chemical structure prediction for genetically encoded nonribosomal peptides and type I and II polyketides. Here, we present a ground-up rewrite of the PRISM structure prediction algorithm to derive prediction of natural products arising from non-modular biosynthetic paradigms. Within this new version, PRISM 3, natural product scaffolds are modeled as chemical graphs, permitting structure prediction for aminocoumarins, antimetabolites, bisindoles and phosphonate natural products, and building upon the addition of ribosomally synthesized and post-translationally modified peptides. Further, with the addition of cluster detection for 11 new cluster types, PRISM 3 expands to detect 22 distinct natural product cluster types. Other major modifications to PRISM include improved sequence input and ORF detection, user-friendliness and output. Distribution of PRISM 3 over a 300-core server grid improves the speed and capacity of the web application. PRISM 3 is available at http://magarveylab.ca/prism/. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  13. Conditioning and Robustness of RNA Boltzmann Sampling under Thermodynamic Parameter Perturbations.

    PubMed

    Rogers, Emily; Murrugarra, David; Heitsch, Christine

    2017-07-25

    Understanding how RNA secondary structure prediction methods depend on the underlying nearest-neighbor thermodynamic model remains a fundamental challenge in the field. Minimum free energy (MFE) predictions are known to be "ill conditioned" in that small changes to the thermodynamic model can result in significantly different optimal structures. Hence, the best practice is now to sample from the Boltzmann distribution, which generates a set of suboptimal structures. Although the structural signal of this Boltzmann sample is known to be robust to stochastic noise, the conditioning and robustness under thermodynamic perturbations have yet to be addressed. We present here a mathematically rigorous model for conditioning inspired by numerical analysis, and also a biologically inspired definition for robustness under thermodynamic perturbation. We demonstrate the strong correlation between conditioning and robustness and use its tight relationship to define quantitative thresholds for well versus ill conditioning. These resulting thresholds demonstrate that the majority of the sequences are at least sample robust, which verifies the assumption of sampling's improved conditioning over the MFE prediction. Furthermore, because we find no correlation between conditioning and MFE accuracy, the presence of both well- and ill-conditioned sequences indicates the continued need for both thermodynamic model refinements and alternate RNA structure prediction methods beyond the physics-based ones. Copyright © 2017. Published by Elsevier Inc.
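
    For reference, the Boltzmann distribution that this abstract samples from can be written in its standard textbook form. The symbols are assumptions introduced here, not notation from the abstract: $\Delta G(s)$ is the free energy of structure $s$ under the nearest-neighbor model, $R$ the gas constant, $T$ the temperature, and $\mathcal{S}$ the set of feasible secondary structures.

```latex
P(s) = \frac{e^{-\Delta G(s)/RT}}{Z},
\qquad
Z = \sum_{s' \in \mathcal{S}} e^{-\Delta G(s')/RT}
```

    The MFE structure is the single most probable structure under $P$. Perturbing the thermodynamic parameters changes every $\Delta G(s)$ at once, so both the MFE structure and the sampled ensemble can shift.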

  14. Ionizing Radiation Environment on the International Space Station: Performance vs. Expectations for Avionics and Material

    NASA Technical Reports Server (NTRS)

    Koontz, Steven L.; Boeder, Paul A.; Pankop, Courtney; Reddell, Brandon

    2005-01-01

    The role of structural shielding mass in the design, verification, and in-flight performance of the International Space Station (ISS), in both the natural and induced orbital ionizing radiation (IR) environments, is reported. Detailed consideration of the effects of both the natural and induced ionizing radiation environment during ISS design, development, and flight operations has produced a safe, efficient manned space platform that is largely immune to deleterious effects of the LEO ionizing radiation environment. The assumption of a small shielding mass for purposes of design and verification has been shown to be a valid worst-case approximation approach to design for reliability, though predicted dependences of single event effects (SEE) on latitude, longitude, SEP events, and spacecraft structural shielding mass are not observed. The Figure of Merit (FOM) method overpredicts the rate for median shielding masses of about 10 g/cm^2 by only a factor of 3, while the Scott Effective Flux Approach (SEFA) method overestimated by about one order of magnitude, as expected. The Integral Rectangular Parallelepiped (IRPP), SEFA, and FOM methods for estimating on-orbit single event upset (SEU) rates all utilize some version of the CREME-96 treatment of energetic particle interaction with structural shielding, which has been shown to underestimate the production of secondary particles in heavily shielded manned spacecraft. The need for more work directed to development of a practical understanding of secondary particle production in massive structural shielding for SEE design and verification is indicated. In contrast, total dose estimates using CAD-based shielding mass distribution functions and the Shieldose code provided a reasonably accurate estimate of accumulated dose in Grays internal to the ISS pressurized elements, albeit as a result of using worst-on-worst case assumptions (500 km altitude x 2) that compensate for ignoring both GCR and secondary particle production in massive structural shielding.

  15. Framework for Structural Online Health Monitoring of Aging and Degradation of Secondary Systems due to some Aspects of Erosion

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gribok, Andrei; Patnaik, Sobhan; Williams, Christian

    This report describes the current state of research related to critical aspects of erosion and selected aspects of degradation of secondary components in nuclear power plants. The report also proposes a framework for online health monitoring of aging and degradation of secondary components. The framework consists of an integrated multi-sensor modality system which can be used to monitor different piping configurations under different degradation conditions. The report analyses the currently known degradation mechanisms and available predictive models. Based on this analysis, the structural health monitoring framework is proposed. The Light Water Reactor Sustainability Program began to evaluate technologies that could be used to perform online monitoring of piping and other secondary system structural components in commercial NPPs. These online monitoring systems have the potential to identify when a more detailed inspection is needed using real-time measurements, rather than at a pre-determined inspection interval. This transition to condition-based, risk informed automated maintenance will contribute to a significant reduction of operations and maintenance costs that account for the majority of nuclear power generation costs. There is unanimous agreement between industry experts and academic researchers that identifying and prioritizing inspection locations in secondary piping systems (for example, in raw water piping or diesel piping) would eliminate many excessive in-service inspections. The proposed structural health monitoring framework takes aim at answering this challenge by combining long-range guided wave technologies with other monitoring techniques, which can significantly increase the inspection length and pinpoint the locations that degraded the most. More widely, the report suggests research efforts aimed at developing, validating, and deploying online corrosion monitoring techniques for complex geometries, which are pervasive in NPPs.

  16. Molecular dynamics simulation studies and in vitro site directed mutagenesis of avian beta-defensin Apl_AvBD2

    PubMed Central

    2010-01-01

    Background Defensins comprise a group of antimicrobial peptides, widely recognized as important elements of the innate immune system in both animals and plants. Cationicity, rather than the secondary structure, is believed to be the major factor defining the antimicrobial activity of defensins. To test this hypothesis and to improve the activity of the newly identified avian β-defensin Apl_AvBD2 by enhancing the cationicity, we performed in silico site directed mutagenesis, keeping the predicted secondary structure intact. Molecular dynamics (MD) simulation studies were done to predict the activity. Mutant proteins were made by in vitro site directed mutagenesis and recombinant protein expression, and tested for antimicrobial activity to confirm the results obtained in MD simulation analysis. Results MD simulation revealed subtle, but critical, structural variations between the wild type Apl_AvBD2 and the more cationic in silico mutants, which were not detected in the initial structural prediction by homology modelling. The C-terminal cationic 'claw' region, important in antimicrobial activity, which was intact in the wild type, showed changes in shape and orientation in all the mutant peptides. Mutant peptides also showed increased solvent accessible surface area and a greater number of hydrogen bonds with the surrounding water molecules. In functional studies, the Escherichia coli-expressed, purified recombinant mutant proteins showed total loss of antimicrobial activity compared to the wild type protein. Conclusion The study revealed that cationicity alone is not the determining factor in the microbicidal activity of antimicrobial peptides. Factors affecting the molecular dynamics such as hydrophobicity, electrostatic interactions and the potential for oligomerization may also play fundamental roles. It points to the usefulness of MD simulation studies in successful engineering of antimicrobial peptides for improved activity and other desirable functions. PMID:20122244

  17. Social Support and Socioeconomic Status Predict Secondary Students' Grades and Educational Plans Indifferently across Immigrant Group and Gender

    ERIC Educational Resources Information Center

    Ulriksen, Robin; Sagatun, Åse; Zachrisson, Henrik Daae; Waaktaar, Trine; Lervåg, Arne Ola

    2015-01-01

    Social support and socioeconomic status (SES) have received considerable attention in explaining academic achievement and the achievement gap between students with ethnic majority and immigrant background, and between boys and girls. Using a Structural Equation Modeling approach we examine (1) if there exists a gap in school achievements between…

  18. Multitrait, Random Regression, or Simple Repeatability Model in High-Throughput Phenotyping Data Improve Genomic Prediction for Wheat Grain Yield.

    PubMed

    Sun, Jin; Rutkoski, Jessica E; Poland, Jesse A; Crossa, José; Jannink, Jean-Luc; Sorrells, Mark E

    2017-07-01

    High-throughput phenotyping (HTP) platforms can be used to measure traits that are genetically correlated with wheat (Triticum aestivum L.) grain yield across time. Incorporating such secondary traits in the multivariate pedigree and genomic prediction models would be desirable to improve indirect selection for grain yield. In this study, we evaluated three statistical models, simple repeatability (SR), multitrait (MT), and random regression (RR), for the longitudinal data of secondary traits and compared the impact of the proposed models for secondary traits on their predictive abilities for grain yield. Grain yield and secondary traits, canopy temperature (CT) and normalized difference vegetation index (NDVI), were collected in five diverse environments for 557 wheat lines with available pedigree and genomic information. A two-stage analysis was applied for pedigree and genomic selection (GS). First, secondary traits were fitted by SR, MT, or RR models, separately, within each environment. Then, best linear unbiased predictions (BLUPs) of secondary traits from the above models were used in the multivariate prediction models to compare predictive abilities for grain yield. Predictive ability was substantially improved by 70%, on average, from multivariate pedigree and genomic models when including secondary traits in both training and test populations. Additionally, (i) predictive abilities slightly varied for MT, RR, or SR models in this data set, (ii) results indicated that including BLUPs of secondary traits from the MT model was the best in severe drought, and (iii) the RR model was slightly better than SR and MT models under drought environment. Copyright © 2017 Crop Science Society of America.

  19. A computational method for predicting regulation of human microRNAs on the influenza virus genome

    PubMed Central

    2013-01-01

    Background While it has been suggested that host microRNAs (miRNAs) may downregulate viral gene expression as an antiviral defense mechanism, such a mechanism has not been explored in the influenza virus for human flu studies. As it is difficult to conduct related experiments on humans, computational studies can provide some insight. Although many computational tools have been designed for miRNA target prediction, there is a need for cross-species prediction, especially for predicting viral targets of human miRNAs. However, finding putative human miRNAs targeting influenza virus genome is still challenging. Results We developed machine-learning features and conducted comprehensive data training for predicting interactions between H1N1 genome segments and host miRNA. We defined our seed region as the first ten nucleotides from the 5' end of the miRNA to the 3' end of the miRNA and integrated various features including the number of consecutive matching bases in the seed region of 10 bases, a triplet feature in seed regions, thermodynamic energy, penalty of bulges and wobbles at binding sites, and the secondary structure of viral RNA for the prediction. Conclusions Compared to general predictive models, our model fully takes into account the conservation patterns and features of viral RNA secondary structures, and greatly improves the prediction accuracy. Our model identified some key miRNAs including hsa-miR-489, hsa-miR-325, hsa-miR-876-3p and hsa-miR-2117, which target HA, PB2, MP and NS of H1N1, respectively. Our study provided an interesting hypothesis concerning the miRNA-based antiviral defense mechanism against influenza virus in human, i.e., the binding between human miRNA and viral RNAs may not result in gene silencing but rather may block the viral RNA replication. PMID:24565017
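
    One of the features listed in this abstract, the number of consecutive matching bases in a 10-base seed region, could be computed along these lines. This is a hedged sketch: the function name `longest_seed_match`, the restriction to Watson-Crick pairs (wobbles are penalized separately in the abstract), and the example sequences are all assumptions made for illustration and are not taken from the paper.

```python
# Watson-Crick pairs only; G-U wobbles are deliberately excluded in this sketch.
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def longest_seed_match(mirna: str, site: str, seed_len: int = 10) -> int:
    """Longest run of consecutive base pairs between the miRNA seed and a target site.

    `mirna` and `site` are both written 5'->3'. Binding is antiparallel, so the
    site is reversed to make position i of the seed face position i of the site.
    """
    seed = mirna[:seed_len]
    partner = site[::-1][:seed_len]
    best = run = 0
    for m, t in zip(seed, partner):
        if (m, t) in WATSON_CRICK:
            run += 1
            best = max(best, run)
        else:
            run = 0  # a mismatch, bulge, or wobble ends the current run
    return best


# Arbitrary example sequences (assumed, for illustration only).
mirna = "UGAGGUAGUAGG"
perfect_site = "UACUACCUCA"     # fully complementary to the first 10 nt
mismatch_site = "UACUACGUCA"    # one mismatch at seed position 4

print(longest_seed_match(mirna, perfect_site))
print(longest_seed_match(mirna, mismatch_site))
```

    A full feature vector in the spirit of the abstract would add thermodynamic energy, bulge and wobble penalties, and viral RNA secondary structure, none of which are modeled here.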

  20. Tertiary alphabet for the observable protein structural universe.

    PubMed

    Mackenzie, Craig O; Zhou, Jianfu; Grigoryan, Gevorg

    2016-11-22

    Here, we systematically decompose the known protein structural universe into its basic elements, which we dub tertiary structural motifs (TERMs). A TERM is a compact backbone fragment that captures the secondary, tertiary, and quaternary environments around a given residue, comprising one or more disjoint segments (three on average). We seek the set of universal TERMs that capture all structure in the Protein Data Bank (PDB), finding remarkable degeneracy. Only ∼600 TERMs are sufficient to describe 50% of the PDB at sub-Angstrom resolution. However, more rare geometries also exist, and the overall structural coverage grows logarithmically with the number of TERMs. We go on to show that universal TERMs provide an effective mapping between sequence and structure. We demonstrate that TERM-based statistics alone are sufficient to recapitulate close-to-native sequences given either NMR or X-ray backbones. Furthermore, sequence variability predicted from TERM data agrees closely with evolutionary variation. Finally, locations of TERMs in protein chains can be predicted from sequence alone based on sequence signatures emergent from TERM instances in the PDB. For multisegment motifs, this method identifies spatially adjacent fragments that are not contiguous in sequence, a major bottleneck in structure prediction. Although all TERMs recur in diverse proteins, some appear specialized for certain functions, such as interface formation, metal coordination, or even water binding. Structural biology has benefited greatly from previously observed degeneracies in structure. The decomposition of the known structural universe into a finite set of compact TERMs offers exciting opportunities toward better understanding, design, and prediction of protein structure.

  1. Tertiary alphabet for the observable protein structural universe

    PubMed Central

    Mackenzie, Craig O.; Zhou, Jianfu; Grigoryan, Gevorg

    2016-01-01

    Here, we systematically decompose the known protein structural universe into its basic elements, which we dub tertiary structural motifs (TERMs). A TERM is a compact backbone fragment that captures the secondary, tertiary, and quaternary environments around a given residue, comprising one or more disjoint segments (three on average). We seek the set of universal TERMs that capture all structure in the Protein Data Bank (PDB), finding remarkable degeneracy. Only ∼600 TERMs are sufficient to describe 50% of the PDB at sub-Angstrom resolution. However, more rare geometries also exist, and the overall structural coverage grows logarithmically with the number of TERMs. We go on to show that universal TERMs provide an effective mapping between sequence and structure. We demonstrate that TERM-based statistics alone are sufficient to recapitulate close-to-native sequences given either NMR or X-ray backbones. Furthermore, sequence variability predicted from TERM data agrees closely with evolutionary variation. Finally, locations of TERMs in protein chains can be predicted from sequence alone based on sequence signatures emergent from TERM instances in the PDB. For multisegment motifs, this method identifies spatially adjacent fragments that are not contiguous in sequence—a major bottleneck in structure prediction. Although all TERMs recur in diverse proteins, some appear specialized for certain functions, such as interface formation, metal coordination, or even water binding. Structural biology has benefited greatly from previously observed degeneracies in structure. The decomposition of the known structural universe into a finite set of compact TERMs offers exciting opportunities toward better understanding, design, and prediction of protein structure. PMID:27810958

  2. Optimizing physical energy functions for protein folding.

    PubMed

    Fujitsuka, Yoshimi; Takada, Shoji; Luthey-Schulten, Zaida A; Wolynes, Peter G

    2004-01-01

    We optimize a physical energy function for proteins with the use of the available structural database and perform three benchmark tests of the performance: (1) recognition of native structures in the background of predefined decoy sets of Levitt, (2) de novo structure prediction using fragment assembly sampling, and (3) molecular dynamics simulations. The energy parameter optimization is based on the energy landscape theory and uses a Monte Carlo search to find a set of parameters that seeks the largest ratio δE_s/ΔE for all proteins in a training set simultaneously. Here, δE_s is the stability gap between the native and the average in the denatured states and ΔE is the energy fluctuation among these states. Some of the energy parameters optimized are found to show significant correlation with experimentally observed quantities. (1) In the recognition test, the optimized function assigns the lowest energy to either the native or a near-native structure among many decoy structures for all the proteins studied. (2) Structure prediction with the fragment assembly sampling gives structure models with root mean square deviation less than 6 Å in one of the top five cluster centers for five of six proteins studied. (3) Structure prediction using molecular dynamics simulation gives poorer performance, implying the importance of having a more precise description of local structures. The physical energy function solely inferred from a structural database neither utilizes sequence information from the family of the target nor the outcome of the secondary structure prediction but can produce the correct native fold for many small proteins. Copyright 2003 Wiley-Liss, Inc.
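
    The objective in this abstract, the ratio of the stability gap to the energy fluctuation, can be illustrated for a single protein as follows. This is a minimal sketch under assumptions: the function name `gap_ratio` is hypothetical, decoy energies stand in for the denatured states, the gap is taken as mean decoy energy minus native energy so that a more stable native state gives a larger ratio, and the fluctuation is taken as the population standard deviation of the decoy energies. The Monte Carlo search over energy parameters is not shown.

```python
from statistics import mean, pstdev
from typing import Sequence


def gap_ratio(native_energy: float, decoy_energies: Sequence[float]) -> float:
    """Stability gap divided by energy fluctuation for one protein.

    Assumed conventions: gap = <E_decoy> - E_native, fluctuation = population
    standard deviation of the decoy energies.
    """
    fluctuation = pstdev(decoy_energies)
    if fluctuation == 0:
        raise ValueError("decoy energies must not all be identical")
    gap = mean(decoy_energies) - native_energy
    return gap / fluctuation


# Toy energies in arbitrary units (assumed values).
ratio = gap_ratio(-10.0, [-4.0, -2.0, -6.0, -4.0])
print(round(ratio, 4))
```

    An optimizer in this spirit would propose new energy parameters, recompute the energies, and keep changes that raise the ratio across the whole training set.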

  3. Accelerated probabilistic inference of RNA structure evolution

    PubMed Central

    Holmes, Ian

    2005-01-01

    Background Pairwise stochastic context-free grammars (Pair SCFGs) are powerful tools for evolutionary analysis of RNA, including simultaneous RNA sequence alignment and secondary structure prediction, but the associated algorithms are intensive in both CPU and memory usage. The same problem is faced by other RNA alignment-and-folding algorithms based on Sankoff's 1985 algorithm. It is therefore desirable to constrain such algorithms, by pre-processing the sequences and using this first pass to limit the range of structures and/or alignments that can be considered. Results We demonstrate how flexible classes of constraint can be imposed, greatly reducing the computational costs while maintaining a high quality of structural homology prediction. Any score-attributed context-free grammar (e.g. energy-based scoring schemes, or conditionally normalized Pair SCFGs) is amenable to this treatment. It is now possible to combine independent structural and alignment constraints of unprecedented general flexibility in Pair SCFG alignment algorithms. We outline several applications to the bioinformatics of RNA sequence and structure, including Waterman-Eggert N-best alignments and progressive multiple alignment. We evaluate the performance of the algorithm on test examples from the RFAM database. Conclusion A program, Stemloc, that implements these algorithms for efficient RNA sequence alignment and structure prediction is available under the GNU General Public License. PMID:15790387

  4. Immunoinformatic Analysis of Crimean Congo Hemorrhagic Fever Virus Glycoproteins and Epitope Prediction for Synthetic Peptide Vaccine.

    PubMed

    Tipu, Hamid Nawaz

    2016-02-01

    To determine the Crimean Congo Hemorrhagic Fever (CCHF) virus M segment glycoprotein's immunoinformatic parameters, and identify Human Leukocyte Antigen (HLA) class I binders as candidates for synthetic peptide vaccines. Cross-sectional study. Combined Military Hospital, Khuzdar Cantt, in May 2015. Data acquisition, antigenicity prediction, secondary and tertiary structure prediction, and residue analysis were done using immunoinformatics tools. HLA class I binders in the glycoprotein's sequence were identified at nonamer length using NetMHC 3.4 and mapped onto the tertiary structure. Docking was done for the strongest binder against its corresponding allele with CABS-dock. HLA-A*0101, 0201, 0301, 2402, 2601 and B*0702, 0801, 2705, 3901, 4001, 5801, 1501 were analyzed against two glycoprotein components of the virus. A total of 35 nonamers from GP1, and 3 from GP2 were identified. HLA-B*0702 bound the maximum number of peptides (6), while HLA-B*4001 showed the strongest binding affinity. HLA-specific glycoprotein epitope prediction can help identify synthetic peptide vaccine candidates.

  5. SMOQ: a tool for predicting the absolute residue-specific quality of a single protein model with support vector machines

    PubMed Central

    2014-01-01

    Background It is important to predict the quality of a protein structural model before its native structure is known. Methods that can predict the absolute local quality of individual residues in a single protein model are rare, yet particularly needed for using, ranking and refining protein models. Results We developed a machine learning tool (SMOQ) that can predict the distance deviation of each residue in a single protein model. SMOQ uses support vector machines (SVM) with protein sequence and structural features (i.e. basic feature set), including amino acid sequence, secondary structures, solvent accessibilities, and residue-residue contacts to make predictions. We also trained an SVM model with two new additional features (profiles and SOV scores) on 20 CASP8 targets and found that including them can only improve the performance when real deviations between native and model are higher than 5Å. The SMOQ tool finally released uses the basic feature set trained on 85 CASP8 targets. Moreover, SMOQ implemented a way to convert predicted local quality scores into a global quality score. SMOQ was tested on the 84 CASP9 single-domain targets. The average difference between the residue-specific distance deviation predicted by our method and the actual distance deviation on the test data is 2.637Å. The global quality prediction accuracy of the tool is comparable to other good tools on the same benchmark. Conclusion SMOQ is a useful tool for protein single model quality assessment. Its source code and executable are available at: http://sysbio.rnet.missouri.edu/multicom_toolbox/. PMID:24776231

  6. SMOQ: a tool for predicting the absolute residue-specific quality of a single protein model with support vector machines.

    PubMed

    Cao, Renzhi; Wang, Zheng; Wang, Yiheng; Cheng, Jianlin

    2014-04-28

    It is important to predict the quality of a protein structural model before its native structure is known. Methods that can predict the absolute local quality of individual residues in a single protein model are rare, yet particularly needed for using, ranking and refining protein models. We developed a machine learning tool (SMOQ) that can predict the distance deviation of each residue in a single protein model. SMOQ uses support vector machines (SVM) with protein sequence and structural features (i.e. basic feature set), including amino acid sequence, secondary structures, solvent accessibilities, and residue-residue contacts to make predictions. We also trained an SVM model with two new additional features (profiles and SOV scores) on 20 CASP8 targets and found that including them can only improve the performance when real deviations between native and model are higher than 5Å. The SMOQ tool finally released uses the basic feature set trained on 85 CASP8 targets. Moreover, SMOQ implemented a way to convert predicted local quality scores into a global quality score. SMOQ was tested on the 84 CASP9 single-domain targets. The average difference between the residue-specific distance deviation predicted by our method and the actual distance deviation on the test data is 2.637Å. The global quality prediction accuracy of the tool is comparable to other good tools on the same benchmark. SMOQ is a useful tool for protein single model quality assessment. Its source code and executable are available at: http://sysbio.rnet.missouri.edu/multicom_toolbox/.

  7. Protein contact prediction using patterns of correlation.

    PubMed

    Hamilton, Nicholas; Burrage, Kevin; Ragan, Mark A; Huber, Thomas

    2004-09-01

    We describe a new method for using neural networks to predict residue contact pairs in a protein. The main inputs to the neural network are a set of 25 measures of correlated mutation between all pairs of residues in two "windows" of size 5 centered on the residues of interest. While the individual pair-wise correlations are a relatively weak predictor of contact, by training the network on windows of correlation the accuracy of prediction is significantly improved. The neural network is trained on a set of 100 proteins and then tested on a disjoint set of 1033 proteins of known structure. An average predictive accuracy of 21.7% is obtained taking the best L/2 predictions for each protein, where L is the sequence length. Taking the best L/10 predictions gives an average accuracy of 30.7%. The predictor is also tested on a set of 59 proteins from the CASP5 experiment. The accuracy is found to be relatively consistent across different sequence lengths, but to vary widely according to the secondary structure. Predictive accuracy is also found to improve by using multiple sequence alignments containing many sequences to calculate the correlations. Copyright 2004 Wiley-Liss, Inc.
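
    The accuracy figures in this abstract are computed over the best L/2 or L/10 predictions, where L is the sequence length. A small sketch of that evaluation is given below; the function name `top_fraction_accuracy`, the data layout (a list of residue-pair and score tuples), and the toy numbers are assumptions for illustration and do not come from the paper.

```python
from typing import List, Set, Tuple

Pair = Tuple[int, int]


def top_fraction_accuracy(scored_pairs: List[Tuple[Pair, float]],
                          true_contacts: Set[Pair],
                          seq_len: int, divisor: int) -> float:
    """Fraction of the best L/divisor predicted pairs that are true contacts."""
    n = max(1, seq_len // divisor)
    # Rank predictions from most to least confident and keep the top n.
    ranked = sorted(scored_pairs, key=lambda item: item[1], reverse=True)[:n]
    if not ranked:
        return 0.0
    hits = sum(1 for pair, _ in ranked if pair in true_contacts)
    return hits / len(ranked)


# Toy predictions for a hypothetical protein of length 10.
predictions = [((1, 5), 0.9), ((2, 8), 0.8), ((3, 9), 0.7),
               ((0, 4), 0.6), ((2, 6), 0.5), ((1, 7), 0.4)]
contacts = {(1, 5), (3, 9), (1, 7)}

print(top_fraction_accuracy(predictions, contacts, seq_len=10, divisor=2))
print(top_fraction_accuracy(predictions, contacts, seq_len=10, divisor=10))
```

    Shorter lists keep only the most confident predictions, which is consistent with the abstract reporting a higher accuracy for L/10 than for L/2.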

  8. Personality Traits and General Intelligence as Predictors of Academic Performance: A Structural Equation Modelling Approach

    ERIC Educational Resources Information Center

    Rosander, Pia; Backstrom, Martin; Stenberg, Georg

    2011-01-01

    The aim of the present study was to investigate the extent to which personality traits, after controlling for general intelligence, predict academic performance in different school subjects. Upper secondary school students in Sweden (N=315) completed the Wonderlic IQ test (Wonderlic, 1992) and the IPIP-NEO-PI test (Goldberg, 1999). A series of…

  9. A Model of Academic Self-Concept: Perceived Difficulty and Social Comparison among Academically Accelerated Secondary School Students

    ERIC Educational Resources Information Center

    Wilson, Hope E.; Siegle, Del; McCoach, D. Betsy; Little, Catherine A.; Reis, Sally M.

    2014-01-01

    Academic self-concept predicts students' future goals and is affected by a student's relative success compared with his or her peer group. This exploratory study used structural equation modeling to examine the contributions of the perceived level of difficulty of the curriculum, in addition to the contributions of social comparison and…

  10. Secondary impact hazard assessment

    NASA Technical Reports Server (NTRS)

    1986-01-01

    A series of light gas gun shots (4 to 7 km/sec) were performed with 5 mg nylon and aluminum projectiles to determine the size, mass, velocity, and spatial distribution of spall and ejecta from a number of graphite/epoxy targets. Similar determinations were also performed on a few aluminum targets. Target thickness and material were chosen to be representative of proposed Space Station structure. The data from these shots and other information were used to predict the hazard to Space Station elements from secondary particles resulting from impacts of micrometeoroids and orbital debris on the Space Station. This hazard was quantified as an additional flux over and above the primary micrometeoroid and orbital debris flux that must be considered in the design process. In order to simplify the calculations, ejecta and spall mass were assumed to scale directly with the energy of the projectile. Other scaling systems may be closer to reality. The secondary particles considered are only those particles that may impact other structure immediately after the primary impact. The addition to the orbital debris problem from these primary impacts was not addressed. Data from this study should be fed into the orbital debris model to see if Space Station secondaries make a significant contribution to orbital debris. The hazard to a Space Station element from secondary particles above and beyond the micrometeoroid and orbital debris hazard is categorized in terms of two factors: (1) the 'view factor' of the element to other Space Station structure or the geometry of placement of the element, and (2) the sensitivity to damage, stated in terms of energy. Several example cases were chosen: the Space Station module windows, windows of a Shuttle docked to the Space Station, the habitat module walls, and the photovoltaic solar cell arrays. For the examples chosen the secondary flux contributed no more than 10 percent to the total flux (primary and secondary) above a given calculated critical energy. A key assumption in these calculations is that above a certain critical energy, significant damage will be done. This is not true for all structures. Double-walled, bumpered structures are an example for which damage may be reduced as energy goes up. The critical energy assumption is probably conservative, however, in terms of secondary damage. To understand why the secondary impacts seem to, in general, contribute less than 10 percent of the flux above a given critical energy, consider the case of a meteoroid impact of a given energy on a fixed, large surface. This impact results in a variety of secondary particles, all of which have much less energy than the original impact. Conservation of energy prohibits any other situation. Thus if damage is linked to a critical energy of a particle, the primary flux will always deliver particles of much greater energy. Even if all the secondary particles impacted other Space Station structures, none would have a kinetic energy more than a fraction of the primary impact energy.

  11. Support Vector Machine-based classification of protein folds using the structural properties of amino acid residues and amino acid residue pairs.

    PubMed

    Shamim, Mohammad Tabrez Anwar; Anwaruddin, Mohammad; Nagarajaram, H A

    2007-12-15

    Fold recognition is a key step in the protein structure discovery process, especially when traditional sequence comparison methods fail to yield convincing structural homologies. Although many methods have been developed for protein fold recognition, their accuracies remain low. This can be attributed to insufficient exploitation of fold discriminatory features. We have developed a new method for protein fold recognition using structural information of amino acid residues and amino acid residue pairs. Since protein fold recognition can be treated as a protein fold classification problem, we have developed a Support Vector Machine (SVM) based classifier approach that uses secondary structural state and solvent accessibility state frequencies of amino acids and amino acid pairs as feature vectors. Among the individual properties examined, secondary structural state frequencies of amino acids gave an overall accuracy of 65.2% for fold discrimination, which is better than the accuracy by any method reported so far in the literature. Combination of secondary structural state frequencies with solvent accessibility state frequencies of amino acids and amino acid pairs further improved the fold discrimination accuracy to more than 70%, which is approximately 8% higher than the best available method. In this study we have also tested, for the first time, an all-together multi-class method known as the Crammer and Singer method for protein fold classification. Our studies reveal that the three multi-class classification methods, namely one versus all, one versus one and the Crammer and Singer method, yield similar predictions. Dataset and stand-alone program are available upon request.
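
    One plausible reading of the "secondary structural state frequencies of amino acids" feature is sketched below. The encoding is an assumption, not the authors' published definition: three states (H, E, C), one frequency per amino acid and state, normalized by sequence length, giving a 60-dimensional vector. The function name `ss_state_frequencies` is hypothetical.

```python
from itertools import product
from typing import List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
SS_STATES = "HEC"  # helix, strand, coil (assumed three-state alphabet)


def ss_state_frequencies(sequence: str, ss: str) -> List[float]:
    """Frequency of each (amino acid, secondary structure state) combination."""
    if not sequence or len(sequence) != len(ss):
        raise ValueError("sequence and ss must be non-empty and of equal length")
    keys = list(product(AMINO_ACIDS, SS_STATES))
    counts = {key: 0 for key in keys}
    for aa, state in zip(sequence, ss):
        if (aa, state) in counts:  # silently skip non-standard symbols
            counts[(aa, state)] += 1
    n = len(sequence)
    return [counts[key] / n for key in keys]


# Toy input (assumed): four residues with their secondary structure states.
features = ss_state_frequencies("AAGA", "HHCE")
print(len(features), features[0], features[1], features[17])
```

    Vectors of this kind, one per protein, would then be passed to a multi-class SVM. Pair frequencies and solvent accessibility states could be encoded the same way.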

  12. The Phyre2 web portal for protein modelling, prediction and analysis

    PubMed Central

    Kelley, Lawrence A; Mezulis, Stefans; Yates, Christopher M; Wass, Mark N; Sternberg, Michael JE

    2017-01-01

    Summary: Phyre2 is a suite of tools available on the web to predict and analyse protein structure, function and mutations. The focus of Phyre2 is to provide biologists with a simple and intuitive interface to state-of-the-art protein bioinformatics tools. Phyre2 replaces Phyre, the original version of the server for which we previously published a protocol. In this updated protocol, we describe Phyre2, which uses advanced remote homology detection methods to build 3D models, predict ligand binding sites, and analyse the effect of amino-acid variants (e.g. nsSNPs) for a user’s protein sequence. Users are guided through results by a simple interface at a level of detail determined by them. This protocol will guide a user from submitting a protein sequence to interpreting the secondary and tertiary structure of their models, their domain composition and model quality. A range of additional available tools is described to find a protein structure in a genome, to submit a large number of sequences at once and to automatically run weekly searches for proteins difficult to model. The server is available at http://www.sbg.bio.ic.ac.uk/phyre2. A typical structure prediction will be returned between 30 minutes and 2 hours after submission. PMID:25950237

  13. Multicore and GPU algorithms for Nussinov RNA folding

    PubMed Central

    2014-01-01

    Background: One segment of an RNA sequence might be paired with another segment of the same RNA sequence due to the force of hydrogen bonds. This two-dimensional structure is called the RNA sequence's secondary structure. Several algorithms have been proposed to predict an RNA sequence's secondary structure. These algorithms are referred to as RNA folding algorithms. Results: We develop cache efficient, multicore, and GPU algorithms for RNA folding using Nussinov's algorithm. Conclusions: Our cache efficient algorithm provides a speedup between 1.6 and 3.0 relative to a naive straightforward single core code. The multicore version of the cache efficient single core algorithm provides a speedup, relative to the naive single core algorithm, between 7.5 and 14.0 on a 6 core hyperthreaded CPU. Our GPU algorithm for the NVIDIA C2050 is up to 1582 times as fast as the naive single core algorithm and between 5.1 and 11.2 times as fast as the fastest previously known GPU algorithm for Nussinov RNA folding. PMID:25082539
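For orientation, the Nussinov recurrence mentioned in this record can be sketched as a plain single-core dynamic program. This is only an illustrative baseline, not the cache-efficient, multicore, or GPU code the authors describe. The function name `nussinov`, the `min_loop` parameter, and the choice to allow G-U wobble pairs are assumptions made for this sketch.

```python
def nussinov(seq, min_loop=3):
    """Maximise the number of nested base pairs (naive O(n^3) Nussinov DP).

    Hypothetical helper for illustration: returns (pair_count, dot_bracket).
    min_loop is the assumed minimum number of unpaired bases in a hairpin.
    """
    pairs = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    dp = [[0] * n for _ in range(n)]

    # Fill the table by increasing subsequence span.
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: base j stays unpaired
            for k in range(i, j - min_loop):  # case 2: base j pairs with base k
                if (seq[k], seq[j]) in pairs:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best

    # Trace back one optimal structure in dot-bracket notation.
    structure = ["."] * n
    stack = [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= min_loop:
            continue
        if dp[i][j] == dp[i][j - 1]:
            stack.append((i, j - 1))
            continue
        for k in range(i, j - min_loop):
            if (seq[k], seq[j]) in pairs:
                left = dp[i][k - 1] if k > i else 0
                if dp[i][j] == left + 1 + dp[k + 1][j - 1]:
                    structure[k], structure[j] = "(", ")"
                    if k > i:
                        stack.append((i, k - 1))
                    stack.append((k + 1, j - 1))
                    break
    return (dp[0][n - 1] if n else 0), "".join(structure)


count, fold = nussinov("GGGAAAUCC")
print(count, fold)
```

The triple loop over `span`, `i`, and `k` is the part that cache-aware and parallel variants reorganise; the recurrence itself stays the same.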

  14. A method for probing the mutational landscape of amyloid structure.

    PubMed

    O'Donnell, Charles W; Waldispühl, Jérôme; Lis, Mieszko; Halfmann, Randal; Devadas, Srinivas; Lindquist, Susan; Berger, Bonnie

    2011-07-01

    Proteins of all kinds can self-assemble into highly ordered β-sheet aggregates known as amyloid fibrils, important both biologically and clinically. However, the specific molecular structure of a fibril can vary dramatically depending on sequence and environmental conditions, and mutations can drastically alter amyloid function and pathogenicity. Experimental structure determination has proven extremely difficult with only a handful of NMR-based models proposed, suggesting a need for computational methods. We present AmyloidMutants, a statistical mechanics approach for de novo prediction and analysis of wild-type and mutant amyloid structures. Based on the premise of protein mutational landscapes, AmyloidMutants energetically quantifies the effects of sequence mutation on fibril conformation and stability. Tested on non-mutant, full-length amyloid structures with known chemical shift data, AmyloidMutants offers roughly 2-fold improvement in prediction accuracy over existing tools. Moreover, AmyloidMutants is the only method to predict complete super-secondary structures, enabling accurate discrimination of topologically dissimilar amyloid conformations that correspond to the same sequence locations. Applied to mutant prediction, AmyloidMutants identifies a global conformational switch between Aβ and its highly-toxic 'Iowa' mutant in agreement with a recent experimental model based on partial chemical shift data. Predictions on mutant, yeast-toxic strains of HET-s suggest similar alternate folds. When applied to HET-s and a HET-s mutant with core asparagines replaced by glutamines (both highly amyloidogenic chemically similar residues abundant in many amyloids), AmyloidMutants surprisingly predicts a greatly reduced capacity of the glutamine mutant to form amyloid. We confirm this finding by conducting mutagenesis experiments. Our tool is publicly available on the web at http://amyloid.csail.mit.edu/.

  15. Predicting helix orientation for coiled-coil dimers

    PubMed Central

    Apgar, James R.; Gutwin, Karl N.; Keating, Amy E.

    2008-01-01

    The alpha-helical coiled coil is a structurally simple protein oligomerization or interaction motif consisting of two or more alpha helices twisted into a supercoiled bundle. Coiled coils can differ in their stoichiometry, helix orientation and axial alignment. Because of the near degeneracy of many of these variants, coiled coils pose a challenge to fold recognition methods for structure prediction. Whereas distinctions between some protein folds can be discriminated on the basis of hydrophobic/polar patterning or secondary structure propensities, the sequence differences that encode important details of coiled-coil structure can be subtle. This is emblematic of a larger problem in the field of protein structure and interaction prediction: that of establishing specificity between closely similar structures. We tested the behavior of different computational models on the problem of recognizing the correct orientation - parallel vs. antiparallel - of pairs of alpha helices that can form a dimeric coiled coil. For each of 131 examples of known structure, we constructed a large number of both parallel and antiparallel structural models and used these to assess the ability of five energy functions to recognize the correct fold. We also developed and tested three sequence-based approaches that make use of varying degrees of implicit structural information. The best structural methods performed similarly to the best sequence methods, correctly categorizing ∼81% of dimers. Steric compatibility with the fold was important for some coiled coils we investigated. For many examples, the correct orientation was determined by smaller energy differences between parallel and antiparallel structures distributed over many residues and energy components. Prediction methods that used structure but incorporated varying approximations and assumptions showed quite different behaviors when used to investigate energetic contributions to orientation preference. Sequence-based methods were sensitive to the choice of residue-pair interactions scored. PMID:18506779

  16. Ab-initio conformational epitope structure prediction using genetic algorithm and SVM for vaccine design.

    PubMed

    Moghram, Basem Ameen; Nabil, Emad; Badr, Amr

    2018-01-01

    T-cell epitope structure identification is a significant challenging immunoinformatic problem within epitope-based vaccine design. Epitopes or antigenic peptides are a set of amino acids that bind with the Major Histocompatibility Complex (MHC) molecules. The aim of this process is presented by Antigen Presenting Cells to be inspected by T-cells. MHC-molecule-binding epitopes are responsible for triggering the immune response to antigens. The epitope's three-dimensional (3D) molecular structure (i.e., tertiary structure) reflects its proper function. Therefore, the identification of MHC class-II epitopes structure is a significant step towards epitope-based vaccine design and understanding of the immune system. In this paper, we propose a new technique using a Genetic Algorithm for Predicting the Epitope Structure (GAPES), to predict the structure of MHC class-II epitopes based on their sequence. The proposed Elitist-based genetic algorithm for predicting the epitope's tertiary structure is based on Ab-Initio Empirical Conformational Energy Program for Peptides (ECEPP) Force Field Model. The developed secondary structure prediction technique relies on Ramachandran Plot. We used two alignment algorithms: the ROSS alignment and TM-Score alignment. We applied four different alignment approaches to calculate the similarity scores of the dataset under test. We utilized the support vector machine (SVM) classifier as an evaluation of the prediction performance. The prediction accuracy and the Area Under Receiver Operating Characteristic (ROC) Curve (AUC) were calculated as measures of performance. The calculations are performed on twelve similarity-reduced datasets of the Immune Epitope Data Base (IEDB) and a large dataset of peptide-binding affinities to HLA-DRB1*0101. The results showed that GAPES was reliable and very accurate. We achieved an average prediction accuracy of 93.50% and an average AUC of 0.974 in the IEDB dataset. 
Also, we achieved an accuracy of 95.125% and an AUC of 0.987 on the HLA-DRB1*0101 allele of the Wang benchmark dataset. The results indicate that the proposed prediction technique "GAPES" is a promising technique that will help researchers and scientists to predict the protein structure and it will assist them in the intelligent design of new epitope-based vaccines. Copyright © 2017 Elsevier B.V. All rights reserved.

  17. Principles of protein folding--a perspective from simple exact models.

    PubMed Central

    Dill, K. A.; Bromberg, S.; Yue, K.; Fiebig, K. M.; Yee, D. P.; Thomas, P. D.; Chan, H. S.

    1995-01-01

    General principles of protein structure, stability, and folding kinetics have recently been explored in computer simulations of simple exact lattice models. These models represent protein chains at a rudimentary level, but they involve few parameters, approximations, or implicit biases, and they allow complete explorations of conformational and sequence spaces. Such simulations have resulted in testable predictions that are sometimes unanticipated: The folding code is mainly binary and delocalized throughout the amino acid sequence. The secondary and tertiary structures of a protein are specified mainly by the sequence of polar and nonpolar monomers. More specific interactions may refine the structure, rather than dominate the folding code. Simple exact models can account for the properties that characterize protein folding: two-state cooperativity, secondary and tertiary structures, and multistage folding kinetics--fast hydrophobic collapse followed by slower annealing. These studies suggest the possibility of creating "foldable" chain molecules other than proteins. The encoding of a unique compact chain conformation may not require amino acids; it may require only the ability to synthesize specific monomer sequences in which at least one monomer type is solvent-averse. PMID:7613459

  18. Observations of Effective Teacher-Student Interactions in Secondary School Classrooms: Predicting Student Achievement with the Classroom Assessment Scoring System--Secondary

    ERIC Educational Resources Information Center

    Allen, Joseph; Gregory, Anne; Mikami, Amori; Lun, Janetta; Hamre, Bridget; Pianta, Robert

    2013-01-01

    Multilevel modeling techniques were used with a sample of 643 students enrolled in 37 secondary school classrooms to predict future student achievement (controlling for baseline achievement) from observed teacher interactions with students in the classroom, coded using the Classroom Assessment Scoring System--Secondary. After accounting for prior…

  19. Prediction of Protein-Protein Interaction Sites by Random Forest Algorithm with mRMR and IFS

    PubMed Central

    Li, Bi-Qing; Feng, Kai-Yan; Chen, Lei; Huang, Tao; Cai, Yu-Dong

    2012-01-01

    Prediction of protein-protein interaction (PPI) sites is one of the most challenging problems in computational biology. Although great progress has been made by employing various machine learning approaches with numerous characteristic features, the problem is still far from being solved. In this study, we developed a novel predictor based on Random Forest (RF) algorithm with the Minimum Redundancy Maximal Relevance (mRMR) method followed by incremental feature selection (IFS). We incorporated features of physicochemical/biochemical properties, sequence conservation, residual disorder, secondary structure and solvent accessibility. We also included five 3D structural features to predict protein-protein interaction sites and achieved an overall accuracy of 0.672997 and MCC of 0.347977. Feature analysis showed that 3D structural features such as Depth Index (DPX) and surface curvature (SC) contributed most to the prediction of protein-protein interaction sites. It was also shown via site-specific feature analysis that the features of individual residues from PPI sites contribute most to the determination of protein-protein interaction sites. It is anticipated that our prediction method will become a useful tool for identifying PPI sites, and that the feature analysis described in this paper will provide useful insights into the mechanisms of interaction. PMID:22937126

  20. Structure Elucidation of Unknown Metabolites in Metabolomics by Combined NMR and MS/MS Prediction

    DOE PAGES

    Boiteau, Rene M.; Hoyt, David W.; Nicora, Carrie D.; ...

    2018-01-17

    Here, we introduce a cheminformatics approach that combines highly selective and orthogonal structure elucidation parameters; accurate mass, MS/MS (MS 2), and NMR in a single analysis platform to accurately identify unknown metabolites in untargeted studies. The approach starts with an unknown LC-MS feature, and then combines the experimental MS/MS and NMR information of the unknown to effectively filter the false positive candidate structures based on their predicted MS/MS and NMR spectra. We demonstrate the approach on a model mixture and then we identify an uncatalogued secondary metabolite in Arabidopsis thaliana. The NMR/MS 2 approach is well suited for discovery of new metabolites in plant extracts, microbes, soils, dissolved organic matter, food extracts, biofuels, and biomedical samples, facilitating the identification of metabolites that are not present in experimental NMR and MS metabolomics databases.

  1. Structure Elucidation of Unknown Metabolites in Metabolomics by Combined NMR and MS/MS Prediction

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Boiteau, Rene M.; Hoyt, David W.; Nicora, Carrie D.

    Here, we introduce a cheminformatics approach that combines highly selective and orthogonal structure elucidation parameters; accurate mass, MS/MS (MS 2), and NMR in a single analysis platform to accurately identify unknown metabolites in untargeted studies. The approach starts with an unknown LC-MS feature, and then combines the experimental MS/MS and NMR information of the unknown to effectively filter the false positive candidate structures based on their predicted MS/MS and NMR spectra. We demonstrate the approach on a model mixture and then we identify an uncatalogued secondary metabolite in Arabidopsis thaliana. The NMR/MS 2 approach is well suited for discovery of new metabolites in plant extracts, microbes, soils, dissolved organic matter, food extracts, biofuels, and biomedical samples, facilitating the identification of metabolites that are not present in experimental NMR and MS metabolomics databases.

  2. Structure Elucidation of Unknown Metabolites in Metabolomics by Combined NMR and MS/MS Prediction

    PubMed Central

    Hoyt, David W.; Nicora, Carrie D.; Kinmonth-Schultz, Hannah A.; Ward, Joy K.

    2018-01-01

    We introduce a cheminformatics approach that combines highly selective and orthogonal structure elucidation parameters; accurate mass, MS/MS (MS2), and NMR into a single analysis platform to accurately identify unknown metabolites in untargeted studies. The approach starts with an unknown LC-MS feature, and then combines the experimental MS/MS and NMR information of the unknown to effectively filter out the false positive candidate structures based on their predicted MS/MS and NMR spectra. We demonstrate the approach on a model mixture, and then we identify an uncatalogued secondary metabolite in Arabidopsis thaliana. The NMR/MS2 approach is well suited to the discovery of new metabolites in plant extracts, microbes, soils, dissolved organic matter, food extracts, biofuels, and biomedical samples, facilitating the identification of metabolites that are not present in experimental NMR and MS metabolomics databases. PMID:29342073

  3. Design and Performance of the Terrestrial Planet Finder Coronagraph

    NASA Technical Reports Server (NTRS)

    White, Mary L.; Shaklan, Stuart; Lisman, P. Doulas; Ho, Timothy; Mouroulis, Pantazis; Basinger, Scott; Ledeboer, Bill; Kwack, Eug; Kissil, Andy; Mosier, Gary

    2004-01-01

    Terrestrial Planet Finder Coronagraph, one of two potential architectures, is described. The telescope is designed to make a visible wavelength survey of the habitable zones of at least thirty stars in search of earth-like planets. The preliminary system requirements, optical parameters, mechanical and thermal design, operations scenario and predicted performance are presented. The 6-meter aperture telescope has a monolithic primary mirror, which, along with the secondary tower, is being designed to meet the stringent optical tolerances of the planet-finding mission. Performance predictions include dynamic and thermal finite element analysis of the telescope optics and structure, which are used to make predictions of the optical performance of the system.

  4. Prediction of redox-sensitive cysteines using sequential distance and other sequence-based features.

    PubMed

    Sun, Ming-An; Zhang, Qing; Wang, Yejun; Ge, Wei; Guo, Dianjing

    2016-08-24

    Reactive oxygen species can modify the structure and function of proteins and may also act as important signaling molecules in various cellular processes. Cysteine thiol groups of proteins are particularly susceptible to oxidation. Meanwhile, their reversible oxidation is of critical roles for redox regulation and signaling. Recently, several computational tools have been developed for predicting redox-sensitive cysteines; however, those methods either only focus on catalytic redox-sensitive cysteines in thiol oxidoreductases, or heavily depend on protein structural data, thus cannot be widely used. In this study, we analyzed various sequence-based features potentially related to cysteine redox-sensitivity, and identified three types of features for efficient computational prediction of redox-sensitive cysteines. These features are: sequential distance to the nearby cysteines, PSSM profile and predicted secondary structure of flanking residues. After further feature selection using SVM-RFE, we developed Redox-Sensitive Cysteine Predictor (RSCP), a SVM based classifier for redox-sensitive cysteine prediction using primary sequence only. Using 10-fold cross-validation on RSC758 dataset, the accuracy, sensitivity, specificity, MCC and AUC were estimated as 0.679, 0.602, 0.756, 0.362 and 0.727, respectively. When evaluated using 10-fold cross-validation with BALOSCTdb dataset which has structure information, the model achieved performance comparable to current structure-based method. Further validation using an independent dataset indicates it is robust and of relatively better accuracy for predicting redox-sensitive cysteines from non-enzyme proteins. In this study, we developed a sequence-based classifier for predicting redox-sensitive cysteines. The major advantage of this method is that it does not rely on protein structure data, which ensures more extensive application compared to other current implementations. 
Accurate prediction of redox-sensitive cysteines not only enhances our understanding about the redox sensitivity of cysteine, it may also complement the proteomics approach and facilitate further experimental investigation of important redox-sensitive cysteines.
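The five figures reported for this classifier (accuracy, sensitivity, specificity, MCC and AUC) are standard binary-classification metrics. As a minimal sketch, the first four can be derived from a confusion matrix as below; the function name `binary_metrics` and the counts in the usage line are hypothetical and are not taken from the RSC758 evaluation. AUC is omitted because it needs ranked prediction scores rather than counts.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, specificity and MCC from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # true positive rate (recall)
    specificity = tn / (tn + fp)   # true negative rate
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Matthews correlation coefficient; defined as 0 when a margin is empty.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
print(binary_metrics(tp=60, fp=25, tn=75, fn=40))
```

MCC is often reported alongside accuracy because it stays informative when the positive and negative classes differ in size.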

  5. Protein modeling and molecular dynamics simulation of SlWRKY4 protein cloned from drought tolerant tomato (Solanum habrochaites) line EC520061.

    PubMed

    Karkute, Suhas G; Easwaran, Murugesh; Gujjar, Ranjit Singh; Piramanayagam, Shanmughavel; Singh, Major

    2015-10-01

    WRKY genes are members of one of the largest families of plant transcription factors and play an important role in response to biotic and abiotic stresses, and overall growth and development. Understanding the interaction of WRKY proteins with other proteins/ligands in plant cells is of utmost importance to develop plants having tolerance to biotic and abiotic stresses. The SlWRKY4 gene was cloned from a drought tolerant wild species of tomato (Solanum habrochaites) and the secondary structure and 3D modeling of this protein were predicted using Schrödinger Suite-Prime. Predicted structures were also subjected to plot against Ramachandran's conformation, and the modeled structure was minimized using Macromodel. Finally, the minimized structure was simulated in the water environment to check the protein stability. The behavior of the modeled structure was well-simulated and analyzed through RMSD and RMSF of the protein. The present work provides the modeled 3D structure of SlWRKY4 that will help in understanding the mechanism of gene regulation by further in silico interaction studies.

  6. STRUCTURE OF THE INTERSTELLAR BOUNDARY EXPLORER RIBBON FROM SECONDARY CHARGE-EXCHANGE AT THE SOLAR–INTERSTELLAR INTERFACE

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zirnstein, E. J.; McComas, D. J.; Heerikhuisen, J.

    2015-05-01

    In 2009, the Interstellar Boundary Explorer discovered a bright “ribbon” of energetic neutral atom (ENA) flux in the energy range ≤0.4–6 keV, encircling a large portion of the sky. This observation was not previously predicted by any models or theories, and since its discovery, it has been the subject of numerous studies of its origin and properties. One of the most studied mechanisms for its creation is the “secondary ENA” process. Here, solar wind ions, neutralized by charge-exchange with interstellar atoms, propagate outside the heliopause; experience two charge-exchange events in the dense outer heliosheath; and then propagate back inside the heliosphere, preferentially in the direction perpendicular to the local interstellar magnetic field. This process has been extensively analyzed using state-of-the-art modeling and simulation techniques, but it has been difficult to visualize. In this Letter, we show the three-dimensional structure of the source of the ribbon, providing a physical picture of the spatial and energy scales over which the secondary ENA process occurs. These results help us understand how the ribbon is generated and further support a secondary ENA process as the leading ribbon source mechanism.

  7. Bipolar radiofrequency ablation of spinal tumors: predictability, safety and outcome.

    PubMed

    Gazis, Angelos N; Beuing, Oliver; Franke, Jörg; Jöllenbeck, Boris; Skalej, Martin

    2014-04-01

    Bone metastases are often the cause of tumor-associated pain and reduction of quality of life. For patients that cannot be treated by surgery, a local minimally invasive therapy such as radiofrequency ablation can be a useful option. In cases in which tumorous masses are adjacent to vulnerable structures, the monopolar radiofrequency can cause severe neuronal damage because of the unpredictability of current flow. The aim of this study is to show that the bipolar radiofrequency ablation provides an opportunity to safely treat such spinal lesions because of precise predictability of the emerging ablation zone. Prospective cohort study of 36 patients undergoing treatment at a single institution. Thirty-six patients in advanced tumor stage with primary or secondary tumor involvement of spine undergoing radiofrequency ablation. Prediction of emerging ablation zone. Clinical outcome of treated patients. X-ray-controlled treatment of 39 lesions by bipolar radiofrequency ablation. Magnetic resonance imaging was performed pre- and postinterventionally. Patients were observed clinically during their postinterventional stay. The extent of the ablation zones was predictable to the millimeter because it did not cross the peri-interventional planned dorsal and ventral boundaries in any case. No complications were observed. Ablation of tumorous masses adjacent to vulnerable structures is feasible and predictable by using the bipolar radiofrequency ablation. Damage of neuronal structures can be avoided through precise prediction of the ablation area. Copyright © 2014 Elsevier Inc. All rights reserved.

  8. Identification and cloning of four riboswitches from Burkholderia pseudomallei strain K96243

    NASA Astrophysics Data System (ADS)

    Munyati-Othman, Noor; Fatah, Ahmad Luqman Abdul; Piji, Mohd Al Akmarul Fizree Bin Md; Ramlan, Effirul Ikhwan; Raih, Mohd Firdaus

    2015-09-01

    Structured RNAs referred to as riboswitches have been predicted to be present in the genome sequence of Burkholderia pseudomallei strain K96243. Four of the riboswitches were identified and analyzed through BLASTN, Rfam search and multiple sequence alignment. The RNA aptamers belong to the following riboswitch classifications: glycine riboswitch, cobalamin riboswitch, S-adenosyl-(L)-homocysteine (SAH) riboswitch and flavin mononucleotide (FMN) riboswitch. The conserved nucleotides for each aptamer were identified and were marked on the secondary structure generated by RNAfold. These riboswitches were successfully amplified and cloned for further study.

  9. Isolation and in silico analysis of a novel H+-pyrophosphatase gene orthologue from the halophytic grass Leptochloa fusca

    NASA Astrophysics Data System (ADS)

    Rauf, Muhammad; Saeed, Nasir A.; Habib, Imran; Ahmed, Moddassir; Shahzad, Khurram; Mansoor, Shahid; Ali, Rashid

    2017-02-01

    Structure prediction can provide information about function and active sites of protein which helps to design new functional proteins. H+-pyrophosphatase is transmembrane protein involved in establishing proton motive force for active transport of Na+ across membrane by Na+/H+ antiporters. A full length novel H+-pyrophosphatase gene was isolated from halophytic grass Leptochloa fusca using RT-PCR and RACE method. Full length LfVP1 gene sequence of 2292 nucleotides encodes protein of 764 amino acids. DNA and protein sequences were used for characterization using bioinformatics tools. Various important potential sites were predicted by PROSITE webserver. Primary structural analysis showed LfVP1 as stable protein and Grand average hydropathy (GRAVY) indicated that LfVP1 protein has good hydrosolubility. Secondary structure analysis showed that LfVP1 protein sequence contains significant proportion of alpha helix and random coil. Protein membrane topology suggested the presence of 14 transmembrane domains and presence of catalytic domain in TM3. Three dimensional structure from LfVP1 protein sequence also indicated the presence of 14 transmembrane domains and hydrophobicity surface model showed amino acid hydrophobicity. Ramachandran plot showed that 98% amino acid residues were predicted in the favored region.
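The grand average of hydropathy (GRAVY) cited in this record is conventionally the mean Kyte-Doolittle hydropathy value over all residues, with negative values indicating a more hydrophilic sequence. The sketch below assumes that convention; the function name `gravy` and the short test peptides are hypothetical and unrelated to the LfVP1 sequence, which is not given here.

```python
# Kyte-Doolittle hydropathy scale (one-letter amino acid codes).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Mean Kyte-Doolittle hydropathy of a protein sequence."""
    residues = sequence.upper()
    if not residues:
        raise ValueError("sequence must not be empty")
    # A KeyError here means the sequence contains a non-standard residue.
    return sum(KYTE_DOOLITTLE[aa] for aa in residues) / len(residues)


# Hypothetical peptides: a hydrophobic stretch versus a charged one.
print(gravy("LIVLLA"), gravy("RKDEKR"))
```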

  10. Cumulative metal leaching from utilisation of secondary building materials in river engineering.

    PubMed

    Leuven, R S E W; Willems, F H G

    2004-01-01

    The present paper estimates the utilisation of bulky wastes (minestone, steel slag, phosphorus slag and demolition waste) in hydraulic engineering structures in Dutch parts of the rivers Rhine, Meuse and Scheldt over the period 1980-2025. Although they offer several economic, technical and environmental benefits, these secondary building materials contain various metals that may leach into river water. A leaching model was used to predict annual emissions of arsenic, cadmium, copper, chromium, lead, mercury, nickel and zinc. Under the current utilisation and model assumptions, the contribution of secondary building materials to metal pollution in Dutch surface waters is expected to be relatively low compared to other sources (less than 0.1% and 0.2% in the years 2000 and 2025, respectively). However, continued and widespread large-scale applications of secondary building materials will increase pollutant leaching and may require further cuts to be made in emissions from other sources to meet emission reduction targets and water quality standards. It is recommended to validate available leaching models under various field conditions. Complete registration of secondary building materials will be required to improve input data for leaching models.

  11. Dynamics and Predictability of Tropical Cyclone Genesis, Structure and Intensity Change

    DTIC Science & Technology

    2012-09-30

    analyses and forecasts of tropical cyclones, including genesis, intensity change, and extratropical transition. A secondary objective is to understand... storm-centered assimilation algorithm. Basic research in... COMPLETED: For the four storms considered (Nuri, Jangmi, Sinlaku, and Hagupit), an 80-member EnKF has been cycled on observations (surface, rawinsondes, GPS

  12. Implications of Mycobacterium Major Facilitator Superfamily for Novel Measures against Tuberculosis.

    PubMed

    Wang, Rui; Zhang, Zhen; Xie, Longxiang; Xie, Jianping

    2015-01-01

    Major facilitator superfamily (MFS) is an important secondary membrane transport protein superfamily conserved from prokaryotes to eukaryotes. The MFS proteins are widespread among bacteria and are responsible for the transfer of substrates. Pathogenic Mycobacterium MFS transporters, their distribution, function, phylogeny, and predicted crystal structures were studied to better understand the function of MFS and to discover specific inhibitors of MFS for better tuberculosis control.

  13. Probabilistic sampling of protein conformations: new hope for brute force?

    PubMed

    Feldman, Howard J; Hogue, Christopher W V

    2002-01-01

    Protein structure prediction from sequence alone by "brute force" random methods is a computationally expensive problem. Estimates have suggested that it could take all the computers in the world longer than the age of the universe to compute the structure of a single 200-residue protein. Here we investigate the use of a faster version of our FOLDTRAJ probabilistic all-atom protein-structure-sampling algorithm. We have improved the method so that it is now over twenty times faster than originally reported, and capable of rapidly sampling conformational space without lattices. It uses geometrical constraints and a Lennard-Jones type potential for self-avoidance. We have also implemented a novel method to add secondary structure-prediction information to make protein-like amounts of secondary structure in sampled structures. In a set of 100,000 probabilistic conformers of 1VII, 1ENH, and 1PMC generated, the structures with smallest Cα RMSD from native are 3.95, 5.12, and 5.95 Å, respectively. Expanding this test to a set of 17 distinct protein folds, we find that all-helical structures are "hit" by brute force more frequently than beta or mixed structures. For small helical proteins or very small non-helical ones, this approach should have a "hit" close enough to detect with a good scoring function in a pool of several million conformers. By fitting the distribution of RMSDs from the native state of each of the 17 sets of conformers to the extreme value distribution, we are able to estimate the size of conformational space for each. With a 0.5 Å RMSD cutoff, the number of conformers is roughly 2^N where N is the number of residues in the protein. This is smaller than previous estimates, indicating an average of only two possible conformations per residue when sterics are accounted for. 
Our method reduces the effective number of conformations available at each residue by probabilistic bias, without requiring any particular discretization of residue conformational space, and is the fastest method of its kind. With computer speeds doubling every 18 months and parallel and distributed computing becoming more practical, the brute force approach to protein structure prediction may yet have some hope in the near future. Copyright 2001 Wiley-Liss, Inc.
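
To give a sense of scale for the conformational-space estimate in this abstract, the arithmetic can be sketched as follows. The sketch reads the estimate as two conformations per residue, so the count grows as 2 to the power N. The function names are hypothetical and are not part of FOLDTRAJ.

```python
import math

def conformer_count(n_residues: int, per_residue: float = 2.0) -> float:
    """Estimated number of conformers: per_residue ** n_residues.

    per_residue=2.0 reflects the abstract's "two possible conformations
    per residue"; the function itself is only an illustration.
    """
    return per_residue ** n_residues

def log10_conformer_count(n_residues: int, per_residue: float = 2.0) -> float:
    """Same estimate on a base-10 log scale, which stays readable for large N."""
    return n_residues * math.log10(per_residue)

if __name__ == "__main__":
    for n in (20, 36, 60, 200):
        print(n, f"about 10^{log10_conformer_count(n):.1f} conformers")
```

Under this reading, a 200-residue protein has on the order of 10^60 conformers, whereas a chain of about 20 residues has roughly a million, which is consistent with the abstract's remark that only small proteins are within reach of a pool of several million conformers.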

  14. Relative stability of major types of beta-turns as a function of amino acid composition: a study based on Ab initio energetic and natural abundance data.

    PubMed

    Perczel, András; Jákli, Imre; McAllister, Michael A; Csizmadia, Imre G

    2003-06-06

    Folding properties of small globular proteins are determined by their amino acid sequence (primary structure). This holds both for local (secondary structure) and for global conformational features of linear polypeptides and proteins composed from natural amino acid derivatives. It thus provides the rational basis of structure prediction algorithms. The shortest secondary structure element, the beta-turn, most typically adopts either a type I or a type II form, depending on the amino acid composition. Herein we investigate the sequence-dependent folding stability of both major types of beta-turns using simple dipeptide models (-Xxx-Yyy-). Gas-phase ab initio properties of 16 carefully selected and suitably protected dipeptide models (for example Val-Ser, Ala-Gly, Ser-Ser) were studied. For each backbone fold, the most probable side-chain conformers were considered. Fully optimized 3-21G RHF molecular structures were employed in medium level [B3LYP/6-311++G(d,p)//RHF/3-21G] energy calculations to estimate relative populations of the different backbone conformers. Our results show that the preference for beta-turn forms as calculated by quantum mechanics and observed in X-ray-determined proteins correlates significantly.

  15. The complex folding pathways of protein A suggest a multiple-funnelled energy landscape

    NASA Astrophysics Data System (ADS)

    St-Pierre, Jean-Francois; Mousseau, Normand; Derreumaux, Philippe

    2008-01-01

    Folding proteins into their native states requires the formation of both secondary and tertiary structures. Many questions remain, however, as to whether these form into a precise order, and various pictures have been proposed that place the emphasis on the first or the second level of structure in describing folding. One of the favorite test models for studying this question is the B domain of protein A, which has been characterized by numerous experiments and simulations. Using the activation-relaxation technique coupled with a generic energy model (optimized potential for efficient peptide structure prediction), we generate more than 50 folding trajectories for this 60-residue protein. While the folding pathways to the native state are fully consistent with the funnel-like description of the free energy landscape, we find a wide range of mechanisms in which secondary and tertiary structures form in various orders. Our nonbiased simulations also reveal the presence of a significant number of non-native β and α conformations both on and off pathway, including the visit, for a non-negligible fraction of trajectories, of fully ordered structures resembling the native state of nonhomologous proteins.

  16. TANGLE: Two-Level Support Vector Regression Approach for Protein Backbone Torsion Angle Prediction from Primary Sequences

    PubMed Central

    Song, Jiangning; Tan, Hao; Wang, Mingjun; Webb, Geoffrey I.; Akutsu, Tatsuya

    2012-01-01

    Protein backbone torsion angles (Phi) and (Psi) involve two rotation angles rotating around the Cα-N bond (Phi) and the Cα-C bond (Psi). Due to the planarity of the linked rigid peptide bonds, these two angles can essentially determine the backbone geometry of proteins. Accordingly, the accurate prediction of protein backbone torsion angle from sequence information can assist the prediction of protein structures. In this study, we develop a new approach called TANGLE (Torsion ANGLE predictor) to predict the protein backbone torsion angles from amino acid sequences. TANGLE uses a two-level support vector regression approach to perform real-value torsion angle prediction using a variety of features derived from amino acid sequences, including the evolutionary profiles in the form of position-specific scoring matrices, predicted secondary structure, solvent accessibility and natively disordered region as well as other global sequence features. When evaluated based on a large benchmark dataset of 1,526 non-homologous proteins, the mean absolute errors (MAEs) of the Phi and Psi angle prediction are 27.8° and 44.6°, respectively, which are 1% and 3% respectively lower than that using one of the state-of-the-art prediction tools ANGLOR. Moreover, the prediction of TANGLE is significantly better than a random predictor that was built on the amino acid-specific basis, with the p-value<1.46e-147 and 7.97e-150, respectively by the Wilcoxon signed rank test. As a complementary approach to the current torsion angle prediction algorithms, TANGLE should prove useful in predicting protein structural properties and assisting protein fold recognition by applying the predicted torsion angles as useful restraints. TANGLE is freely accessible at http://sunflower.kuicr.kyoto-u.ac.jp/~sjn/TANGLE/. PMID:22319565
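
The mean absolute error quoted in this abstract is measured on angles, which wrap around at ±180°. The following is a minimal sketch of how such an error could be computed. It assumes the error is taken along the shorter arc between the predicted and observed angle; the abstract does not state the exact convention used by TANGLE, and the function names are hypothetical.

```python
def angular_abs_error(pred_deg: float, true_deg: float) -> float:
    """Absolute difference between two angles in degrees, along the shorter arc.

    Assumption: 179 and -179 degrees are treated as 2 degrees apart, not 358.
    """
    diff = abs(pred_deg - true_deg) % 360.0
    return min(diff, 360.0 - diff)

def mean_absolute_error(pred: list[float], true: list[float]) -> float:
    """Mean of the per-residue angular errors."""
    if len(pred) != len(true) or not pred:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(angular_abs_error(p, t) for p, t in zip(pred, true)) / len(pred)

if __name__ == "__main__":
    predicted_phi = [179.0, -60.0]   # made-up values for illustration only
    observed_phi = [-179.0, -50.0]
    print(mean_absolute_error(predicted_phi, observed_phi))
```

Without the wrap-around step, a prediction of 179° for a true angle of -179° would be counted as a 358° error, which would inflate the MAE for residues near the ±180° boundary.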

  17. How Structure Defines Affinity in Protein-Protein Interactions

    PubMed Central

    Erijman, Ariel; Rosenthal, Eran; Shifman, Julia M.

    2014-01-01

    Protein-protein interactions (PPI) in nature are conveyed by a multitude of binding modes involving various surfaces, secondary structure elements and intermolecular interactions. This diversity results in PPI binding affinities that span more than nine orders of magnitude. Several early studies attempted to correlate PPI binding affinities to various structure-derived features with limited success. The growing number of high-resolution structures, the appearance of more precise methods for measuring binding affinities and the development of new computational algorithms enable more thorough investigations in this direction. Here, we use a large dataset of PPI structures with the documented binding affinities to calculate a number of structure-based features that could potentially define binding energetics. We explore how well each calculated biophysical feature alone correlates with binding affinity and determine the features that could be used to distinguish between high-, medium- and low- affinity PPIs. Furthermore, we test how various combinations of features could be applied to predict binding affinity and observe a slow improvement in correlation as more features are incorporated into the equation. In addition, we observe a considerable improvement in predictions if we exclude from our analysis low-resolution and NMR structures, revealing the importance of capturing exact intermolecular interactions in our calculations. Our analysis should facilitate prediction of new interactions on the genome scale, better characterization of signaling networks and design of novel binding partners for various target proteins. PMID:25329579

  18. Limitations of in silico predictability of specificity of co-immobilised cytochromes P450 and mimics in food-bioprocessing.

    PubMed

    Wiseman, Alan

    2003-04-01

    Cytochromes P450 (EC 1.14.14.1) are mixed function oxidases (oxygenases) that can catalyse redox bioconversions of food components. Also, efficacious removal of undesirable components can be achieved using solid-support immobilised enzyme (IME) of a selection from 2700 isoforms of cytochromes P450 (CYP). Cytochromes P450 co-immobilised with other enzymes, or protein receptors, may be used to confer a secondary order of regio- or stereo-specificity of chiral bioconversion: these can be predictable in silico by utilisation of QSARs (quantitative structure/activity relationships).

  19. Reducing the worst case running times of a family of RNA and CFG problems, using Valiant's approach.

    PubMed

    Zakov, Shay; Tsur, Dekel; Ziv-Ukelson, Michal

    2011-08-18

    RNA secondary structure prediction is a mainstream bioinformatic domain, and is key to computational analysis of functional RNA. In more than 30 years, much research has been devoted to defining different variants of RNA structure prediction problems, and to developing techniques for improving prediction quality. Nevertheless, most of the algorithms in this field follow a similar dynamic programming approach as that presented by Nussinov and Jacobson in the late 70's, which typically yields cubic worst case running time algorithms. Recently, some algorithmic approaches were applied to improve the complexity of these algorithms, motivated by new discoveries in the RNA domain and by the need to efficiently analyze the increasing amount of accumulated genome-wide data. We study Valiant's classical algorithm for Context Free Grammar recognition in sub-cubic time, and extract features that are common to problems on which Valiant's approach can be applied. Based on this, we describe several problem templates, and formulate generic algorithms that use Valiant's technique and can be applied to all problems which abide by these templates, including many problems within the world of RNA Secondary Structures and Context Free Grammars. The algorithms presented in this paper improve the theoretical asymptotic worst case running time bounds for a large family of important problems. It is also possible that the suggested techniques could be applied to yield a practical speedup for these problems. For some of the problems (such as computing the RNA partition function and base-pair binding probabilities), the presented techniques are the only ones which are currently known for reducing the asymptotic running time bounds of the standard algorithms.
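
For orientation, the cubic-time dynamic programming scheme that this abstract attributes to Nussinov and Jacobson can be sketched as below. This is a textbook base-pair maximization, not the sub-cubic Valiant-based algorithm the paper presents. The pairing rules (Watson-Crick plus G-U wobble) and the `min_loop` parameter are assumptions of the sketch.

```python
def nussinov_max_pairs(seq: str, min_loop: int = 0) -> int:
    """Maximum number of nested base pairs in seq (Nussinov-style DP).

    min_loop is the minimum number of unpaired bases enclosed by a pair
    (an assumed parameter; 3 is a common choice for hairpin loops).
    Three nested loops over i, j, k give O(n^3) time and O(n^2) space.
    """
    pairs = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"),
             ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]                    # case: j is unpaired
            for k in range(i, j - min_loop):       # case: j pairs with k
                if (seq[k], seq[j]) in pairs:
                    left = dp[i][k - 1] if k > i else 0
                    inner = dp[k + 1][j - 1] if k + 1 <= j - 1 else 0
                    best = max(best, left + inner + 1)
            dp[i][j] = best
    return dp[0][n - 1]

if __name__ == "__main__":
    print(nussinov_max_pairs("GGGAAACCC", min_loop=3))
```

The innermost loop over the split point `k` is what makes the running time cubic; it is this step that matrix-multiplication-based techniques such as Valiant's aim to speed up.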

  20. Reducing the worst case running times of a family of RNA and CFG problems, using Valiant's approach

    PubMed Central

    2011-01-01

    Background RNA secondary structure prediction is a mainstream bioinformatic domain, and is key to computational analysis of functional RNA. In more than 30 years, much research has been devoted to defining different variants of RNA structure prediction problems, and to developing techniques for improving prediction quality. Nevertheless, most of the algorithms in this field follow a similar dynamic programming approach as that presented by Nussinov and Jacobson in the late 70's, which typically yields cubic worst case running time algorithms. Recently, some algorithmic approaches were applied to improve the complexity of these algorithms, motivated by new discoveries in the RNA domain and by the need to efficiently analyze the increasing amount of accumulated genome-wide data. Results We study Valiant's classical algorithm for Context Free Grammar recognition in sub-cubic time, and extract features that are common to problems on which Valiant's approach can be applied. Based on this, we describe several problem templates, and formulate generic algorithms that use Valiant's technique and can be applied to all problems which abide by these templates, including many problems within the world of RNA Secondary Structures and Context Free Grammars. Conclusions The algorithms presented in this paper improve the theoretical asymptotic worst case running time bounds for a large family of important problems. It is also possible that the suggested techniques could be applied to yield a practical speedup for these problems. For some of the problems (such as computing the RNA partition function and base-pair binding probabilities), the presented techniques are the only ones which are currently known for reducing the asymptotic running time bounds of the standard algorithms. PMID:21851589

  1. Bioinformatics analysis of single and multi-hybrid epitopes of GRA-1, GRA-4, GRA-6 and GRA-7 proteins to improve DNA vaccine design against Toxoplasma gondii.

    PubMed

    Shaddel, Minoo; Ebrahimi, Mansour; Tabandeh, Mohammad Reza

    2018-06-01

    Toxoplasma gondii is a causative agent of morbidity and mortality in immunocompromised and congenitally-infected individuals. Attempts to construct DNA vaccines against T. gondii using surface proteins are increasing. The dense granule antigens are highly expressed in the acute and chronic phases of T. gondii infection and considered as suitable DNA vaccine candidates to control toxoplasmosis. In the present study, bioinformatics tools and online software were used to predict, analyze and compare the structural, physical and chemical characteristics and immunogenicity of the GRA-1, GRA-4, GRA-6 and GRA-7 proteins. Sequence alignment results indicated that the GRA-1, GRA-4, GRA-6 and GRA-7 proteins had low similarity. The secondary structure prediction demonstrated that among the four proteins, GRA-1 and GRA-6 had similar secondary structure except for a little discrepancy. Hydrophilicity/hydrophobicity analysis showed multiple hydrophilic regions and some classical high hydrophilic domains for each protein sequence. Immunogenic epitope prediction results demonstrated that the GRA-1 and GRA-4 epitopes were stable and GRA-4 showed the highest degree of antigenicity. Although the GRA-7 epitope had the highest score of immunogenicity, this epitope was unstable and had the lowest degree of antigenicity and half-life in eukaryotic cells. Also, the results indicated that the GRA4-GRA7 and GRA6-GRA7 epitopes had the highest degree of antigenicity and immunogenicity among multi-hybrid epitopes, respectively. Overall, in the present study, single epitopes showed the highest degree of antigenicity compared with multi-hybrid epitopes. Given the results, it can be concluded that GRA-4 and GRA-7 can be powerful DNA vaccine candidates against T. gondii.

  2. SeqRate: sequence-based protein folding type classification and rates prediction

    PubMed Central

    2010-01-01

    Background Protein folding rate is an important property of a protein. Predicting protein folding rate is useful for understanding protein folding process and guiding protein design. Most previous methods of predicting protein folding rate require the tertiary structure of a protein as an input. And most methods do not distinguish the different kinetic nature (two-state folding or multi-state folding) of the proteins. Here we developed a method, SeqRate, to predict both protein folding kinetic type (two-state versus multi-state) and real-value folding rate using sequence length, amino acid composition, contact order, contact number, and secondary structure information predicted from only protein sequence with support vector machines. Results We systematically studied the contributions of individual features to folding rate prediction. On a standard benchmark dataset, the accuracy of folding kinetic type classification is 80%. The Pearson correlation coefficient and the mean absolute difference between predicted and experimental folding rates (s^-1) in the base-10 logarithmic scale are 0.81 and 0.79 for two-state protein folders, and 0.80 and 0.68 for three-state protein folders. SeqRate is the first sequence-based method for protein folding type classification and its accuracy of fold rate prediction is improved over previous sequence-based methods. Its performance can be further enhanced with additional information, such as structure-based geometric contacts, as inputs. Conclusions Both the web server and software of predicting folding rate are publicly available at http://casp.rnet.missouri.edu/fold_rate/index.html. PMID:20438647
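
The two evaluation measures named in this abstract, the Pearson correlation coefficient and the mean absolute difference on a base-10 log scale, can be sketched as follows. This is a generic illustration with hypothetical function names and made-up rates, not SeqRate's own evaluation code.

```python
import math

def log10_rates(rates: list[float]) -> list[float]:
    """Convert folding rates (per second) to the base-10 logarithmic scale."""
    return [math.log10(r) for r in rates]

def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)

def mean_abs_diff(x: list[float], y: list[float]) -> float:
    """Mean absolute difference between paired values."""
    if len(x) != len(y) or not x:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)

if __name__ == "__main__":
    predicted = log10_rates([120.0, 15.0, 3000.0])      # made-up rates
    experimental = log10_rates([100.0, 30.0, 2500.0])   # made-up rates
    print(pearson_r(predicted, experimental))
    print(mean_abs_diff(predicted, experimental))
```

Because both measures are computed on log10 values, a mean absolute difference of 0.79 corresponds to predicted rates that are off by a factor of roughly six on average.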

  3. RSRE: RNA structural robustness evaluator

    PubMed Central

    Shu, Wenjie; Zheng, Zhiqiang; Wang, Shengqi

    2007-01-01

    Biological robustness, defined as the ability to maintain stable functioning in the face of various perturbations, is an important and fundamental topic in current biology, and has become a focus of numerous studies in recent years. Although structural robustness has been explored in several types of RNA molecules, the origins of robustness are still controversial. Computational analysis results are needed to make up for the lack of evidence of robustness in natural biological systems. The RNA structural robustness evaluator (RSRE) web server presented here provides a freely available online tool to quantitatively evaluate the structural robustness of RNA based on the widely accepted definition of neutrality. Several classical structure comparison methods are employed; five randomization methods are implemented to generate control sequences; sub-optimal predicted structures can be optionally utilized to mitigate the uncertainty of secondary structure prediction. With a user-friendly interface, the web application is easy to use. Intuitive illustrations are provided along with the original computational results to facilitate analysis. The RSRE will be helpful in the wide exploration of RNA structural robustness and will catalyze our understanding of RNA evolution. The RSRE web server is freely available at http://biosrv1.bmi.ac.cn/RSRE/ or http://biotech.bmi.ac.cn/RSRE/. PMID:17567615

  4. A constraint logic programming approach to associate 1D and 3D structural components for large protein complexes.

    PubMed

    Dal Palù, Alessandro; Pontelli, Enrico; He, Jing; Lu, Yonggang

    2007-01-01

    The paper describes a novel framework, constructed using Constraint Logic Programming (CLP) and parallelism, to determine the association between parts of the primary sequence of a protein and alpha-helices extracted from 3D low-resolution descriptions of large protein complexes. The association is determined by extracting constraints from the 3D information, regarding length, relative position and connectivity of helices, and solving these constraints with the guidance of a secondary structure prediction algorithm. Parallelism is employed to enhance performance on large proteins. The framework provides a fast, inexpensive alternative to determine the exact tertiary structure of unknown proteins.

  5. Prediction of protein mutant stability using classification and regression tool.

    PubMed

    Huang, Liang-Tsung; Saraboji, K; Ho, Shinn-Ying; Hwang, Shiow-Fen; Ponnuswamy, M N; Gromiha, M Michael

    2007-02-01

    Prediction of protein stability upon amino acid substitutions is an important problem in molecular biology and the solving of which would help for designing stable mutants. In this work, we have analyzed the stability of protein mutants using two different datasets of 1396 and 2204 mutants obtained from ProTherm database, respectively for free energy change due to thermal (ΔΔG) and denaturant denaturations (ΔΔG(H2O)). We have used a set of 48 physical, chemical energetic and conformational properties of amino acid residues and computed the difference of amino acid properties for each mutant in both sets of data. These differences in amino acid properties have been related to protein stability (ΔΔG and ΔΔG(H2O)) and are used to train with classification and regression tool for predicting the stability of protein mutants. Further, we have tested the method with 4 fold, 5 fold and 10 fold cross validation procedures. We found that the physical properties, shape and flexibility are important determinants of protein stability. The classification of mutants based on secondary structure (helix, strand, turn and coil) and solvent accessibility (buried, partially buried, partially exposed and exposed) distinguished the stabilizing/destabilizing mutants at an average accuracy of 81% and 80%, respectively for ΔΔG and ΔΔG(H2O). The correlation between the experimental and predicted stability change is 0.61 for ΔΔG and 0.44 for ΔΔG(H2O). Further, the free energy change due to the replacement of amino acid residue has been predicted within an average error of 1.08 kcal/mol and 1.37 kcal/mol for thermal and chemical denaturation, respectively. The relative importance of secondary structure and solvent accessibility, and the influence of the dataset on prediction of protein mutant stability have been discussed.

  6. Cognitive reappraisal and secondary control coping: associations with working memory, positive and negative affect, and symptoms of anxiety/depression.

    PubMed

    Andreotti, Charissa; Thigpen, Jennifer E; Dunn, Madeleine J; Watson, Kelly; Potts, Jennifer; Reising, Michelle M; Robinson, Kristen E; Rodriguez, Erin M; Roubinov, Danielle; Luecken, Linda; Compas, Bruce E

    2013-01-01

    The current study examined the relations of measures of cognitive reappraisal and secondary control coping with working memory abilities, positive and negative affect, and symptoms of anxiety and depression in young adults (N=124). Results indicate significant relations between working memory abilities and reports of secondary control coping and between reports of secondary control coping and cognitive reappraisal. Associations were also found between measures of secondary control coping and cognitive reappraisal and positive and negative affect and symptoms of depression and anxiety. Further, the findings suggest that reports of cognitive reappraisal may be more strongly predictive of positive affect whereas secondary control coping may be more strongly predictive of negative affect and symptoms of depression and anxiety. Overall, the results suggest that current measures of secondary control coping and cognitive reappraisal capture related but distinct constructs and suggest that the assessment of working memory may be more strongly related to secondary control coping in predicting individual differences in distress.

  7. Constraint Logic Programming approach to protein structure prediction.

    PubMed

    Dal Palù, Alessandro; Dovier, Agostino; Fogolari, Federico

    2004-11-30

    The protein structure prediction problem is one of the most challenging problems in biological sciences. Many approaches have been proposed using database information and/or simplified protein models. The protein structure prediction problem can be cast in the form of an optimization problem. Notwithstanding its importance, the problem has very seldom been tackled by Constraint Logic Programming, a declarative programming paradigm suitable for solving combinatorial optimization problems. Constraint Logic Programming techniques have been applied to the protein structure prediction problem on the face-centered cube lattice model. Molecular dynamics techniques, endowed with the notion of constraint, have been also exploited. Even using a very simplified model, Constraint Logic Programming on the face-centered cube lattice model allowed us to obtain acceptable results for a few small proteins. As a test implementation their (known) secondary structure and the presence of disulfide bridges are used as constraints. Simplified structures obtained in this way have been converted to all atom models with plausible structure. Results have been compared with a similar approach using a well-established technique as molecular dynamics. The results obtained on small proteins show that Constraint Logic Programming techniques can be employed for studying protein simplified models, which can be converted into realistic all atom models. The advantage of Constraint Logic Programming over other, much more explored, methodologies, resides in the rapid software prototyping, in the easy way of encoding heuristics, and in exploiting all the advances made in this research area, e.g. in constraint propagation and its use for pruning the huge search space.

  8. Rapid and reliable protein structure determination via chemical shift threading.

    PubMed

    Hafsa, Noor E; Berjanskii, Mark V; Arndt, David; Wishart, David S

    2018-01-01

    Protein structure determination using nuclear magnetic resonance (NMR) spectroscopy can be both time-consuming and labor intensive. Here we demonstrate how chemical shift threading can permit rapid, robust, and accurate protein structure determination using only chemical shift data. Threading is a relatively old bioinformatics technique that uses a combination of sequence information and predicted (or experimentally acquired) low-resolution structural data to generate high-resolution 3D protein structures. The key motivations behind using NMR chemical shifts for protein threading lie in the fact that they are easy to measure, they are available prior to 3D structure determination, and they contain vital structural information. The method we have developed uses not only sequence and chemical shift similarity but also chemical shift-derived secondary structure, shift-derived super-secondary structure, and shift-derived accessible surface area to generate a high quality protein structure regardless of the sequence similarity (or lack thereof) to a known structure already in the PDB. The method (called E-Thrifty) was found to be very fast (often < 10 min/structure) and to significantly outperform other shift-based or threading-based structure determination methods (in terms of top template model accuracy), with an average TM-score performance of 0.68 (vs. 0.50-0.62 for other methods). Coupled with recent developments in chemical shift refinement, these results suggest that protein structure determination, using only NMR chemical shifts, is becoming increasingly practical and reliable. E-Thrifty is available as a web server at http://ethrifty.ca.

  9. Infrared spectroscopy of the transiting extrasolar planet HD 209458 b during secondary eclipse

    NASA Astrophysics Data System (ADS)

    Richardson, Lee Jeremy

    2003-10-01

    We present spectroscopic observations that place strong limits on the atmospheric structure of the transiting extrasolar planet HD 209458 b. The discovery of the transit has led to several new observations that have provided the most detailed information on the physical properties of a planet outside the solar system. These observations have concentrated on the primary eclipse, the time at which the planet crosses in front of the star as seen from Earth. The measurements have determined the basic physical characteristics of the planet, including radius, mass, average density, and orbital inclination, and have even refined values of the stellar mass and radius. Transmission spectroscopy of the system during primary eclipse resulted in the first detection of the atmosphere of an extrasolar planet, with the measurement of the sodium doublet. The present work discusses the first reported attempts to detect the secondary eclipse, or the disappearance of the planet behind the star, in the infrared. We devise the method of ‘occultation spectroscopy’ to detect the planetary spectrum, by searching in combined light for subtle changes in the shape of the spectrum as the planet passes behind the star. Predicted secondary eclipse events were observed from the Very Large Telescope (VLT) on UT 8 and 15 July 2001 using the Infrared Spectrometer and Array Camera (3.5-3.7 μm). Further observations from the NASA Infrared Telescope Facility (IRTF) using the SpeX instrument (1.9-4.2 μm) included two predicted secondary eclipse events on UT 20 and 27 September 2001. Analysis of these data reveals a statistically significant non-detection of the planetary spectrum. The results place strong limits on the structure of the planetary atmosphere and reject widely-accepted models for the planet that assume the incident stellar radiation is completely absorbed and re-emitted in the substellar hemisphere. Situations that remain consistent with our data include an isothermal atmosphere or the presence of a high absorptive or reflective cloud. The latter case is also consistent with the observed low sodium abundance from transmission spectroscopy. These results represent the strongest limits to date on the temperature structure of the planetary atmosphere.

  10. Protein structure modeling and refinement by global optimization in CASP12.

    PubMed

    Hong, Seung Hwan; Joung, InSuk; Flores-Canales, Jose C; Manavalan, Balachandran; Cheng, Qianyi; Heo, Seungryong; Kim, Jong Yun; Lee, Sun Young; Nam, Mikyung; Joo, Keehyoung; Lee, In-Ho; Lee, Sung Jong; Lee, Jooyoung

    2018-03-01

    For protein structure modeling in the CASP12 experiment, we have developed a new protocol based on our previous CASP11 approach. The global optimization method of conformational space annealing (CSA) was applied to 3 stages of modeling: multiple sequence-structure alignment, three-dimensional (3D) chain building, and side-chain re-modeling. For better template selection and model selection, we updated our model quality assessment (QA) method with the newly developed SVMQA (support vector machine for quality assessment). For 3D chain building, we updated our energy function by including restraints generated from predicted residue-residue contacts. New energy terms for the predicted secondary structure and predicted solvent accessible surface area were also introduced. For difficult targets, we proposed a new method, LEEab, where the template term played a less significant role than it did in LEE, complemented by increased contributions from other terms such as the predicted contact term. For TBM (template-based modeling) targets, LEE performed better than LEEab, but for FM targets, LEEab was better. For model refinement, we modified our CASP11 molecular dynamics (MD) based protocol by using explicit solvents and tuning down restraint weights. Refinement results from MD simulations that used a new augmented statistical energy term in the force field were quite promising. Finally, when using inaccurate information (such as the predicted contacts), it was important to use the Lorentzian function for which the maximal penalty arising from wrong information is always bounded. © 2017 Wiley Periodicals, Inc.
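
The closing remark of this abstract, that a Lorentzian restraint keeps the penalty from wrong information bounded, can be illustrated with a short sketch. The functional form below, d^2 / (d^2 + w^2), is one common Lorentzian-type restraint and is an assumption here; the abstract does not give the exact form or parameters used in the CASP12 protocol. It is compared with a harmonic penalty, which grows without bound.

```python
def lorentzian_penalty(deviation: float, width: float = 1.0, weight: float = 1.0) -> float:
    """Bounded restraint penalty: weight * d^2 / (d^2 + width^2).

    Assumed form for illustration. The penalty is 0 at d = 0, weight / 2
    at d = width, and approaches (but never exceeds) weight as d grows.
    """
    d2 = deviation * deviation
    return weight * d2 / (d2 + width * width)

def harmonic_penalty(deviation: float, weight: float = 1.0) -> float:
    """Unbounded quadratic penalty, shown for comparison."""
    return weight * deviation * deviation

if __name__ == "__main__":
    # deviation = model distance minus the (possibly wrong) predicted distance
    for d in (0.0, 1.0, 5.0, 50.0):
        print(d, round(lorentzian_penalty(d), 4), harmonic_penalty(d))
```

With the harmonic form, a single incorrectly predicted contact that is 50 Å off contributes a penalty of 2500 and can dominate the energy, whereas the Lorentzian form caps its contribution at the chosen weight.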

  11. A multivariate prediction model for Rho-dependent termination of transcription.

    PubMed

    Nadiras, Cédric; Eveno, Eric; Schwartz, Annie; Figueroa-Bossi, Nara; Boudvillain, Marc

    2018-06-21

    Bacterial transcription termination proceeds via two main mechanisms triggered either by simple, well-conserved (intrinsic) nucleic acid motifs or by the motor protein Rho. Although bacterial genomes can harbor hundreds of termination signals of either type, only intrinsic terminators are reliably predicted. Computational tools to detect the more complex and diversiform Rho-dependent terminators are lacking. To tackle this issue, we devised a prediction method based on Orthogonal Projections to Latent Structures Discriminant Analysis [OPLS-DA] of a large set of in vitro termination data. Using previously uncharacterized genomic sequences for biochemical evaluation and OPLS-DA, we identified new Rho-dependent signals and quantitative sequence descriptors with significant predictive value. Most relevant descriptors specify features of transcript C>G skewness, secondary structure, and richness in regularly-spaced 5'CC/UC dinucleotides that are consistent with known principles for Rho-RNA interaction. Descriptors collectively warrant OPLS-DA predictions of Rho-dependent termination with a ∼85% success rate. Scanning of the Escherichia coli genome with the OPLS-DA model identifies significantly more termination-competent regions than anticipated from transcriptomics and predicts that regions intrinsically refractory to Rho are primarily located in open reading frames. Altogether, this work delineates features important for Rho activity and describes the first method able to predict Rho-dependent terminators in bacterial genomes.
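
As a rough illustration of one sequence descriptor mentioned in this abstract, the excess of C over G in a transcript can be quantified with a simple skew score. The definition (C - G) / (C + G) and the function name are assumptions made for this sketch; the abstract does not specify how the authors' C>G skewness descriptors are computed.

```python
def cg_skew(seq: str) -> float:
    """C-over-G skew of a sequence: (C - G) / (C + G), in the range [-1, 1].

    Assumed definition for illustration. Returns 0.0 when the sequence
    contains neither C nor G.
    """
    s = seq.upper()
    c, g = s.count("C"), s.count("G")
    if c + g == 0:
        return 0.0
    return (c - g) / (c + g)

if __name__ == "__main__":
    print(cg_skew("CCUCCACCG"))   # C-rich, G-poor: positive skew
    print(cg_skew("GGGAGGCG"))    # G-rich: negative skew
```

A positive value indicates a C-rich, G-poor stretch, which is the direction the abstract associates with Rho-RNA interaction.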

  12. Primary Versus Secondary Diagnosis of Generalized Anxiety Disorder in Youth: Is the Distinction an Important One?

    PubMed

    Ollendick, Thomas H; Jarrett, Matthew A; White, Bradley A; White, Susan W; Grills, Amie E

    2016-08-01

    Examine whether children with a primary diagnosis of generalized anxiety disorder (GAD) differ from children with a secondary diagnosis of GAD on clinician, parent, teacher, and youth-report measures. Based on consensus diagnoses, 64 youth referred to a general outpatient assessment clinic were categorized as having either a primary or secondary diagnosis of GAD. A semi-structured diagnostic interview was used to guide diagnostic decisions and assign primary versus secondary diagnostic status. We predicted that youth with a primary GAD diagnosis would present with greater anxiety symptomatology and symptom impairment on a variety of anxiety-related measures than youth with a secondary GAD diagnosis. Contrary to our hypotheses, no differences were found between those with primary versus secondary GAD diagnoses on measures of symptom severity and clinical impairment, comorbid diagnoses, or youth and teacher-report measures. Our findings have potential implications for the current practice of requiring primary anxiety diagnostic status as an inclusion criterion in clinical research and treatment outcome studies. Assuming our findings are confirmed in larger samples and with other anxiety disorders, future clinical trials and basic psychopathology research might not exclude youth based on absence of a particular anxiety disorder as the primary disorder but rather include individuals for whom that anxiety disorder is secondary as well.

  13. Sparse RNA folding revisited: space-efficient minimum free energy structure prediction.

    PubMed

    Will, Sebastian; Jabbari, Hosna

    2016-01-01

    RNA secondary structure prediction by energy minimization is the central computational tool for the analysis of structural non-coding RNAs and their interactions. Sparsification has been successfully applied to improve the time efficiency of various structure prediction algorithms while guaranteeing the same result; however, for many such folding problems, space efficiency is of even greater concern, particularly for long RNA sequences. So far, space-efficient sparsified RNA folding with fold reconstruction was solved only for simple base-pair-based pseudo-energy models. Here, we revisit the problem of space-efficient free energy minimization. Whereas the space-efficient minimization of the free energy has been sketched before, the reconstruction of the optimum structure has not even been discussed. We show that this reconstruction is not possible in trivial extension of the method for simple energy models. Then, we present the time- and space-efficient sparsified free energy minimization algorithm SparseMFEFold that guarantees MFE structure prediction. In particular, this novel algorithm provides efficient fold reconstruction based on dynamically garbage-collected trace arrows. The complexity of our algorithm depends on two parameters, the number of candidates Z and the number of trace arrows T; both are bounded by [Formula: see text], but are typically much smaller. The time complexity of RNA folding is reduced from [Formula: see text] to [Formula: see text]; the space complexity, from [Formula: see text] to [Formula: see text]. Our empirical results show more than 80 % space savings over RNAfold [Vienna RNA package] on the long RNAs from the RNA STRAND database (≥2500 bases). The presented technique is intentionally generalizable to complex prediction algorithms; due to their high space demands, algorithms like pseudoknot prediction and RNA-RNA-interaction prediction are expected to profit even stronger than "standard" MFE folding. 
    SparseMFEFold is free software, available at http://www.bioinf.uni-leipzig.de/~will/Software/SparseMFEFold.

  14. The derivation and validation of a simple model for predicting in-hospital mortality of acutely admitted patients to internal medicine wards.

    PubMed

    Sakhnini, Ali; Saliba, Walid; Schwartz, Naama; Bisharat, Naiel

    2017-06-01

    Limited information is available about clinical predictors of in-hospital mortality in acute unselected medical admissions. Such information could assist medical decision-making. The aim was to develop a clinical model for predicting in-hospital mortality in unselected acute medical admissions and to test the impact of secondary conditions on hospital mortality. This is an analysis of the medical records of patients admitted to internal medicine wards at one university-affiliated hospital. Data obtained from the years 2013 to 2014 were used as a derivation dataset for creating a prediction model, while data from 2015 were used as a validation dataset to test the performance of the model. For each admission, a set of clinical and epidemiological variables was obtained. The main diagnosis at hospitalization was recorded, and all additional or secondary conditions that coexisted at hospital admission or that developed during hospital stay were considered secondary conditions. The derivation and validation datasets included 7268 and 7843 patients, respectively. The in-hospital mortality rate averaged 7.2%. The following variables entered the final model: age, body mass index, mean arterial pressure on admission, prior admission within 3 months, background morbidity of heart failure and active malignancy, and chronic use of statins and antiplatelet agents. The c-statistic (ROC-AUC) of the prediction model was 80.5% without adjustment for main or secondary conditions, 84.5% with adjustment for the main diagnosis, and 89.5% with adjustment for the main diagnosis and secondary conditions. The accuracy of the predictive model reached 81% on the validation dataset. A prediction model based on clinical data with adjustment for secondary conditions exhibited a high degree of prediction accuracy. We provide a proof of concept that there is an added value for incorporating secondary conditions while predicting probabilities of in-hospital mortality. Further improvement of the model performance and validation in other cohorts are needed to aid hospitalists in predicting health outcomes.
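
    The c-statistic (ROC-AUC) reported above can be read as the probability that a randomly chosen patient who died received a higher predicted risk than a randomly chosen survivor. The sketch below computes it by direct pair counting; the function name and the toy data are hypothetical and are not taken from the study.

```python
def c_statistic(scores, labels):
    """Concordance statistic (ROC-AUC) by exhaustive pair comparison.

    scores: predicted risks; labels: 1 for the event (e.g. death), 0 otherwise.
    Tied scores count as half-concordant.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Toy example: four admissions with made-up predicted risks.
    print(c_statistic([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
```

    Pair counting is quadratic in the number of patients; for cohorts of several thousand admissions a rank-based formula would normally be preferred, but the value is the same.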

  15. A Novel Method for Sampling Alpha-Helical Protein Backbones

    DOE R&D Accomplishments Database

    Fain, Boris; Levitt, Michael

    2001-01-01

    We present a novel technique of sampling the configurations of helical proteins. Assuming knowledge of native secondary structure, we employ assembly rules gathered from a database of existing structures to enumerate the geometrically possible 3-D arrangements of the constituent helices. We produce a library of possible folds for 25 helical protein cores. In each case the method finds significant numbers of conformations close to the native structure. In addition we assign coordinates to all atoms for 4 of the 25 proteins. In the context of database-driven exhaustive enumeration our method performs extremely well, yielding significant percentages of structures (0.02%-82%) within 6 Å of the native structure. The method's speed and efficiency make it a valuable contribution towards the goal of predicting protein structure.

  16. The organization of irrational beliefs in posttraumatic stress symptomology: testing the predictions of REBT theory using structural equation modelling.

    PubMed

    Hyland, Philip; Shevlin, Mark; Adamson, Gary; Boduszek, Daniel

    2014-01-01

    This study directly tests a central prediction of rational emotive behaviour therapy (REBT) that has received little empirical attention regarding the core and intermediate beliefs in the development of posttraumatic stress symptoms. A theoretically consistent REBT model of posttraumatic stress disorder (PTSD) was examined using structural equation modelling techniques among a sample of 313 trauma-exposed military and law enforcement personnel. The REBT model of PTSD provided a good fit of the data, χ² = 599.173, df = 356, p < .001; standardized root mean square residual = .05 (confidence interval = .04-.05); standardized root mean square residual = .04; comparative fit index = .95; Tucker Lewis index = .95. Results demonstrated that demandingness beliefs indirectly affected the various symptom groups of PTSD through a set of secondary irrational beliefs that include catastrophizing, low frustration tolerance, and depreciation beliefs. Results were consistent with the predictions of REBT theory and provide strong empirical support that the cognitive variables described by REBT theory are critical cognitive constructs in the prediction of PTSD symptomology. © 2013 Wiley Periodicals, Inc.

  17. A comparison of different functions for predicted protein model quality assessment.

    PubMed

    Li, Juan; Fang, Huisheng

    2016-07-01

    In protein structure prediction, a considerable number of models are usually produced by either the Template-Based Method (TBM) or the ab initio prediction. The purpose of this study is to find the critical parameter in assessing the quality of the predicted models. A non-redundant template library was developed and 138 target sequences were modeled. The target sequences were all distant from the proteins in the template library and were aligned with template library proteins on the basis of the transformation matrix. The quality of each model was first assessed with QMEAN and its six parameters, which are C_β interaction energy (C_beta), all-atom pairwise energy (PE), solvation energy (SE), torsion angle energy (TAE), secondary structure agreement (SSA), and solvent accessibility agreement (SAE). Finally, the alignment score (score) was also used to assess the quality of model. Hence, a total of eight parameters (i.e., QMEAN, C_beta, PE, SE, TAE, SSA, SAE, score) were independently used to assess the quality of each model. The results indicate that SSA is the best parameter to estimate the quality of the model.

  18. Analysis of deep learning methods for blind protein contact prediction in CASP12.

    PubMed

    Wang, Sheng; Sun, Siqi; Xu, Jinbo

    2018-03-01

    Here we present the results of protein contact prediction achieved in CASP12 by our RaptorX-Contact server, which is an early implementation of our deep learning method for contact prediction. On a set of 38 free-modeling target domains with a median family size of around 58 effective sequences, our server obtained an average top L/5 long- and medium-range contact accuracy of 47% and 44%, respectively (L = length). A complete implementation has an average accuracy of 59% and 57%, respectively. Our deep learning method formulates contact prediction as a pixel-level image labeling problem and simultaneously predicts all residue pairs of a protein using a combination of two deep residual neural networks, taking as input the residue conservation information, predicted secondary structure and solvent accessibility, contact potential, and coevolution information. Our approach differs from existing methods mainly in (1) formulating contact prediction as a pixel-level image labeling problem instead of an image-level classification problem; (2) simultaneously predicting all contacts of an individual protein to make effective use of contact occurrence patterns; and (3) integrating both one-dimensional and two-dimensional deep convolutional neural networks to effectively learn complex sequence-structure relationship including high-order residue correlation. This paper discusses the RaptorX-Contact pipeline, both contact prediction and contact-based folding results, and finally the strength and weakness of our method. © 2017 Wiley Periodicals, Inc.
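
    The "top L/5" accuracy quoted above can be sketched as follows: keep predicted residue pairs in a given sequence-separation range, rank them by predicted probability, take the top L/5, and report the fraction that are true contacts. The separation cutoffs used here (12 to 23 residues for medium range, 24 or more for long range) follow a common CASP convention and are an assumption, as are the function name and data layout; the abstract itself does not spell them out.

```python
def top_l_over_k_accuracy(predictions, native_contacts, length, k=5,
                          min_sep=24, max_sep=None):
    """Fraction of the top L/k predicted contacts that are native contacts.

    predictions: iterable of (i, j, probability) residue-pair predictions.
    native_contacts: set of (i, j) pairs with i < j that are true contacts.
    length: protein length L.
    min_sep / max_sep: inclusive bounds on sequence separation |i - j|.
    """
    in_range = [
        (i, j, p) for i, j, p in predictions
        if abs(i - j) >= min_sep and (max_sep is None or abs(i - j) <= max_sep)
    ]
    # Rank by predicted probability, highest first.
    in_range.sort(key=lambda item: item[2], reverse=True)
    top = in_range[:max(1, length // k)]
    if not top:
        return 0.0
    hits = sum(1 for i, j, _ in top if (min(i, j), max(i, j)) in native_contacts)
    # Divides by the number of pairs actually selected; some evaluations
    # divide by L/k even when fewer predictions are available.
    return hits / len(top)


if __name__ == "__main__":
    # Made-up predictions for a hypothetical 10-residue-long "L" (top 2 kept).
    preds = [(0, 30, 0.9), (1, 40, 0.8), (2, 50, 0.7), (3, 5, 0.99)]
    native = {(0, 30), (2, 50)}
    print(top_l_over_k_accuracy(preds, native, length=10))
```

    Calling the function with `min_sep=12, max_sep=23` gives the medium-range counterpart under the same assumed convention.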

  19. Novel methods for predicting gas-particle partitioning during the formation of secondary organic aerosol

    NASA Astrophysics Data System (ADS)

    Wania, F.; Lei, Y. D.; Wang, C.; Abbatt, J. P. D.; Goss, K.-U.

    2014-12-01

    Several methods have been presented in the literature to predict an organic chemical's equilibrium partitioning between the water insoluble organic matter (WIOM) component of aerosol and the gas phase, Ki,WIOM, as a function of temperature. They include (i) polyparameter linear free energy relationships calibrated with empirical aerosol sorption data, as well as (ii) the solvation models implemented in SPARC and (iii) the quantum-chemical software COSMOtherm, which predict solvation equilibria from molecular structure alone. We demonstrate that these methods can be used to predict Ki,WIOM for large numbers of individual molecules implicated in secondary organic aerosol (SOA) formation, including those with multiple functional groups. Although very different in their theoretical foundations, these methods give remarkably consistent results for the products of the reaction of normal alkanes with OH, i.e. their partition coefficients Ki,WIOM generally agree within one order of magnitude over a range of more than ten orders of magnitude. This level of agreement is much better than that achieved by different vapour pressure estimation methods that are more commonly used in the SOA community. Also, in contrast to the agreement between vapour pressure estimates, the agreement between the Ki,WIOM estimates does not deteriorate with increasing number of functional groups. Furthermore, these partitioning coefficients Ki,WIOM predicted SOA mass yields in agreement with those measured in chamber experiments of the oxidation of normal alkanes. If a Ki,WIOM prediction method was based on one or more surrogate molecules representing the solvation properties of the mixed OM phase of SOA, the choice of those molecule(s) was found to have a relatively minor effect on the predicted Ki,WIOM, as long as the molecule(s) are not very polar. This suggests that a single surrogate molecule, such as 1-octanol or a hypothetical SOA structure proposed by Kalberer et al. (2004), may often be sufficient to represent the WIOM component of the SOA phase, greatly simplifying the prediction. The presented methods could substitute for vapour-pressure-based methods in studies such as the explicit modelling of SOA formation from single precursor molecules in chamber experiments.

  20. Molecular docking of Glycine max and Medicago truncatula ureases with urea; bioinformatics approaches.

    PubMed

    Filiz, Ertugrul; Vatansever, Recep; Ozyigit, Ibrahim Ilker

    2016-03-01

    Urease (EC 3.5.1.5) is a nickel-dependent metalloenzyme catalyzing the hydrolysis of urea into ammonia and carbon dioxide. It is present in many bacteria, fungi, yeasts and plants. Most species, with few exceptions, use nickel metalloenzyme urease to hydrolyze urea, which is one of the commonly used nitrogen fertilizer in plant growth thus its enzymatic hydrolysis possesses vital importance in agricultural practices. Considering the essentiality and importance of urea and urease activity in most plants, this study aimed to comparatively investigate the ureases of two important legume species such as Glycine max (soybean) and Medicago truncatula (barrel medic) from Fabaceae family. With additional plant species, primary and secondary structures of 37 plant ureases were comparatively analyzed using various bioinformatics tools. A structure based phylogeny was constructed using predicted 3D models of G. max and M. truncatula, whose crystallographic structures are not available, along with three additional solved urease structures from Canavalia ensiformis (PDB: 4GY7), Bacillus pasteurii (PDB: 4UBP) and Klebsiella aerogenes (PDB: 1FWJ). In addition, urease structures of these species were docked with urea to analyze the binding affinities, interacting amino acids and atom distances in urease-urea complexes. Furthermore, mutable amino acids which could potentially affect the protein active site, stability and flexibility as well as overall protein stability were analyzed in urease structures of G. max and M. truncatula. Plant ureases demonstrated similar physico-chemical properties with 833-878 amino acid residues and 89.39-90.91 kDa molecular weight with mainly acidic (5.15-6.10 pI) nature. Four protein domain structures such as urease gamma, urease beta, urease alpha and amidohydro 1 characterized the plant ureases. Secondary structure of plant ureases also demonstrated conserved protein architecture, with predominantly α-helix and random coil structures. 
In structure-based phylogeny, plant ureases from G. max, M. truncatula and C. ensiformis were clearly diverged from bacterial ureases of B. pasteurii and K. aerogenes. Glu, Thr, His and Gly were commonly found as interacting residues in most urease-urea docking complexes while Glu was available in all docked structures. Besides, Ala and Arg residues, which are reported in active-site architecture of plant and bacterial ureases were present in G. max urea-urease complex but not present in others. Moreover, Arg435 and Arg437 in M. truncatula and G. max, respectively were identified as highly mutable hotspot residues residing in amidohydro 1 domain of enzyme. In addition, a number of stabilizing residues were predicted upon mutation of these hotspot residues however Cys and Thr made strong implications since they were also found in codon-aligned sequences as substitutions of hotspot residues. Comparative analyses of primary sequence and secondary structure in 37 different plants demonstrated quite conserved natures of ureases in plant kingdom. Structure-based phylogeny indicated the presence of a possible prokaryote-eukaryote split and implicated the subjection of bacterial ureases to heavy selection in prokaryotic evolution compared to plants. Urea-urease docking complexes suggested that different species could share common interacting residues as well as may have some other uncommon residues at species-dependent way. In silico mutation analyses identified mutable amino acids, which were predicted to reside in catalytic site of enzyme therefore mutagenesis at these sites seemed to have adverse effects on enzyme efficiency or function. This study findings will become valuable preliminary resource for future studies to further understand the primary, secondary and tertiary structures of urease sequences in plants as well as it will provide insights about various binding features of urea-urease complexes.

  1. Secondary Structure Prediction of Protein Constructs Using Random Incremental Truncation and Vacuum-Ultraviolet CD Spectroscopy

    PubMed Central

    Pukáncsik, Mária; Orbán, Ágnes; Nagy, Kinga; Matsuo, Koichi; Gekko, Kunihiko; Maurin, Damien; Hart, Darren; Kézsmárki, István; Vertessy, Beata G.

    2016-01-01

    A novel uracil-DNA degrading protein factor (termed UDE) was identified in Drosophila melanogaster with no significant structural and functional homology to other uracil-DNA binding or processing factors. Determination of the 3D structure of UDE is expected to provide key information on the description of the molecular mechanism of action of UDE catalysis, as well as in general uracil-recognition and nuclease action. Towards this long-term aim, the random library ESPRIT technology was applied to the novel protein UDE to overcome problems in identifying soluble expressing constructs given the absence of precise information on domain content and arrangement. Nine constructs of UDE were chosen to decipher structural and functional relationships. Vacuum ultraviolet circular dichroism (VUVCD) spectroscopy was performed to define the secondary structure content and location within UDE and its truncated variants. The quantitative analysis demonstrated exclusive α-helical content for the full-length protein, which is preserved in the truncated constructs. Arrangement of α-helical bundles within the truncated protein segments suggested new domain boundaries which differ from the conserved motifs determined by sequence-based alignment of UDE homologues. Here we demonstrate that the combination of ESPRIT and VUVCD spectroscopy provides a new structural description of UDE and confirms that the truncated constructs are useful for further detailed functional studies. PMID:27273007

  2. The PYRIN domain: A member of the death domain-fold superfamily

    PubMed Central

    Fairbrother, Wayne J.; Gordon, Nathaniel C.; Humke, Eric W.; O'Rourke, Karen M.; Starovasnik, Melissa A.; Yin, Jian-Ping; Dixit, Vishva M.

    2001-01-01

    PYRIN domains were identified recently as putative protein–protein interaction domains at the N-termini of several proteins thought to function in apoptotic and inflammatory signaling pathways. The ∼95 residue PYRIN domains have no statistically significant sequence homology to proteins with known three-dimensional structure. Using secondary structure prediction and potential-based fold recognition methods, however, the PYRIN domain is predicted to be a member of the six-helix bundle death domain-fold superfamily that includes death domains (DDs), death effector domains (DEDs), and caspase recruitment domains (CARDs). Members of the death domain-fold superfamily are well established mediators of protein–protein interactions found in many proteins involved in apoptosis and inflammation, indicating further that the PYRIN domains serve a similar function. An homology model of the PYRIN domain of CARD7/DEFCAP/NAC/NALP1, a member of the Apaf-1/Ced-4 family of proteins, was constructed using the three-dimensional structures of the FADD and p75 neurotrophin receptor DDs, and of the Apaf-1 and caspase-9 CARDs, as templates. Validation of the model using a variety of computational techniques indicates that the fold prediction is consistent with the sequence. Comparison of a circular dichroism spectrum of the PYRIN domain of CARD7/DEFCAP/NAC/NALP1 with spectra of several proteins known to adopt the death domain-fold provides experimental support for the structure prediction. PMID:11514682

  3. Multi-centre diagnostic classification of individual structural neuroimaging scans from patients with major depressive disorder.

    PubMed

    Mwangi, Benson; Ebmeier, Klaus P; Matthews, Keith; Steele, J Douglas

    2012-05-01

    Quantitative abnormalities of brain structure in patients with major depressive disorder have been reported at a group level for decades. However, these structural differences appear subtle in comparison with conventional radiologically defined abnormalities, with considerable inter-subject variability. Consequently, it has not been possible to readily identify scans from patients with major depressive disorder at an individual level. Recently, machine learning techniques such as relevance vector machines and support vector machines have been applied to predictive classification of individual scans with variable success. Here we describe a novel hybrid method, which combines machine learning with feature selection and characterization, with the latter aimed at maximizing the accuracy of machine learning prediction. The method was tested using a multi-centre dataset of T(1)-weighted 'structural' scans. A total of 62 patients with major depressive disorder and matched controls were recruited from referred secondary care clinical populations in Aberdeen and Edinburgh, UK. The generalization ability and predictive accuracy of the classifiers was tested using data left out of the training process. High prediction accuracy was achieved (~90%). While feature selection was important for maximizing high predictive accuracy with machine learning, feature characterization contributed only a modest improvement to relevance vector machine-based prediction (~5%). Notably, while the only information provided for training the classifiers was T(1)-weighted scans plus a categorical label (major depressive disorder versus controls), both relevance vector machine and support vector machine 'weighting factors' (used for making predictions) correlated strongly with subjective ratings of illness severity. 
These results indicate that machine learning techniques have the potential to inform clinical practice and research, as they can make accurate predictions about brain scan data from individual subjects. Furthermore, machine learning weighting factors may reflect an objective biomarker of major depressive disorder illness severity, based on abnormalities of brain structure.

  4. On the prediction of turbulent secondary flows

    NASA Technical Reports Server (NTRS)

    Speziale, C. G.; So, R. M. C.; Younis, B. A.

    1992-01-01

    The prediction of turbulent secondary flows, with Reynolds stress models, in circular pipes and non-circular ducts is reviewed. Turbulence-driven secondary flows in straight non-circular ducts are considered along with turbulent secondary flows in pipes and ducts that arise from curvature or a system rotation. The physical mechanisms that generate these different kinds of secondary flows are outlined and the level of turbulence closure required to properly compute each type is discussed in detail. Illustrative computations of a variety of different secondary flows obtained from two-equation turbulence models and second-order closures are provided to amplify these points.

  5. Modeling forest disturbance and recovery in secondary subtropical dry forests of Puerto Rico

    NASA Astrophysics Data System (ADS)

    Holm, J. A.; Shugart, H. H., Jr.; Van Bloem, S. J.

    2015-12-01

    Because of human pressures, the need to understand and predict the long-term dynamics of subtropical dry forests is urgent. Through modifications to the ZELIG vegetation demographic model, including the development of species- and site-specific parameters and internal modifications, the capability to predict forest change within the Guanica State Forest in Puerto Rico can now be accomplished. One objective was to test the capability of this new model (i.e. ZELIG-TROP) to predict successional patterns of secondary forests across a gradient of abandoned fields currently being reclaimed as forests. Model simulations found that abandoned fields that are on degraded lands have a delayed response to fully recover and reach a mature forest status during the simulated time period; 200 years. The forest recovery trends matched predictions published in other studies, such that attributes involving early resource acquisition (i.e. canopy height, canopy coverage, density) were the fastest to recover, but attributes used for structural development (i.e. biomass, basal area) were relatively slow in recovery. Biomass and basal area, two attributes that tend to increase during later successional stages, are significantly lower during the first 80-100 years of recovery compared to a mature forest, suggesting that the time scale of resilience in subtropical dry forests needs to be partially redefined. A second objective was to investigate the long and short-term effects of increasing hurricane disturbances on vegetation structure and dynamics, due to hurricanes playing an important role in maintaining dry forest structure in Puerto Rico. Hurricane disturbance simulations within ZELIG-TROP predicted that increasing hurricane intensity (i.e. up to 100% increase) did not lead to a large shift in long-term AGB or NPP. However, increased hurricane frequency did lead to a 5-40% decrease in AGB, and 32-50% increase in NPP, depending on the treatment. 
In addition, the modeling approach used here was able to track changes in litterfall, coarse woody debris, and other forest carbon components under various hurricane regimes, a critical step for understanding the future state of subtropical dry forests.

  6. Structure prediction, expression, and antigenicity of c-terminal of GRP78.

    PubMed

    Aghamollaei, Hossein; Mousavi Gargari, Seyed Latif; Ghanei, Mostafa; Rasaee, Mohamad Javad; Amani, Jafar; Bakherad, Hamid; Farnoosh, Gholamreza

    2017-01-01

    Glucose-regulated protein 78 (GRP78) is a typical endoplasmic reticulum luminal chaperone having a main role in the activation of the unfolded protein response. Because of hypoxia and nutrient deprivation in the tumor microenvironment, expression of GRP78 in these cells becomes higher than in native cells, which makes it a suitable candidate for cancer targeting. Suppression of survival signals by antibody production against the C-terminal domain of GRP78 (CGRP) can induce apoptosis of cancer cells. The aim of this study was in silico analysis, recombinant production, and characterization of CGRP in Escherichia coli. Structural prediction of CGRP by bioinformatics tools was done and the construct containing the optimized sequence was transferred to E. coli T7 shuffle. Expression was induced by isopropyl-β-d-thiogalactoside, and recombinant protein was purified by Ni-NTA agarose resin. The content of secondary structures was obtained by circular dichroism (CD) spectrum. CGRP immunogenicity was evaluated from the immunized mouse sera. SDS-PAGE analysis showed CGRP expression in E. coli. The CD spectrum also confirmed the prediction of structures by bioinformatics tools. The enzyme-linked immunosorbent assay using sera from immunized mice revealed CGRP as a good immunogen. The results obtained in this study showed that the structure of truncated CGRP is very similar to its structure in the whole protein context. This protein can be used in cancer research. © 2015 International Union of Biochemistry and Molecular Biology, Inc.

  7. Double-multiple streamtube model for studying vertical-axis wind turbines

    NASA Astrophysics Data System (ADS)

    Paraschivoiu, Ion

    1988-08-01

    This work describes the present state-of-the-art in double-multiple streamtube method for modeling the Darrieus-type vertical-axis wind turbine (VAWT). Comparisons of the analytical results with the other predictions and available experimental data show a good agreement. This method, which incorporates dynamic-stall and secondary effects, can be used for generating a suitable aerodynamic-load model for structural design analysis of the Darrieus rotor.

  8. Fine-grained parallelism accelerating for RNA secondary structure prediction with pseudoknots based on FPGA.

    PubMed

    Xia, Fei; Jin, Guoqing

    2014-06-01

    PKNOTS is a well-known benchmark program that has been widely used to predict RNA secondary structure including pseudoknots. It adopts the standard four-dimensional (4D) dynamic programming (DP) method and is the basis of many variants and improved algorithms. Unfortunately, the O(N^6) computing requirements and complicated data dependency greatly limit the usefulness of the PKNOTS package given the explosion in gene database size. In this paper, we present a fine-grained parallel PKNOTS package and prototype system for accelerating RNA folding applications based on an FPGA chip. We adopted a series of storage optimization strategies to resolve the "Memory Wall" problem. We aggressively exploit parallel computing strategies to improve computational efficiency. We also propose several methods that collectively reduce the storage requirements for FPGA on-chip memory. To the best of our knowledge, our design is the first FPGA implementation for accelerating the 4D DP problem for RNA folding applications including pseudoknots. The experimental results show an average speedup of more than 50x over the PKNOTS-1.08 software running on a PC platform with an Intel Core2 Q9400 Quad CPU for input RNA sequences. Moreover, the power consumption of our FPGA accelerator is only about 50% of that of general-purpose microprocessors.

  9. Variation analysis of the severe acute respiratory syndrome coronavirus putative non-structural protein 2 gene and construction of three-dimensional model.

    PubMed

    Lu, Jia-hai; Zhang, Ding-mei; Wang, Guo-ling; Guo, Zhong-min; Zhang, Chuan-hai; Tan, Bing-yan; Ouyang, Li-ping; Lin, Li; Liu, Yi-min; Chen, Wei-qing; Ling, Wen-hua; Yu, Xin-bing; Zhong, Nan-shan

    2005-05-05

    The rapid transmission and high mortality rate made severe acute respiratory syndrome (SARS) a global threat for which no efficacious therapy is available now. Without sufficient knowledge about the SARS coronavirus (SARS-CoV), it is impossible to define the candidate for the anti-SARS targets. The putative non-structural protein 2 (nsp2) (3CL(pro), following the nomenclature by Gao et al, also known as nsp5 in Snidjer et al) of SARS-CoV plays an important role in viral transcription and replication, and is an attractive target for anti-SARS drug development, so we carried on this study to have an insight into putative polymerase nsp2 of SARS-CoV Guangdong (GD) strain. The SARS-CoV strain was isolated from a SARS patient in Guangdong, China, and cultured in Vero E6 cells. The nsp2 gene was amplified by reverse transcription-polymerase chain reaction (RT-PCR) and cloned into eukaryotic expression vector pCI-neo (pCI-neo/nsp2). Then the recombinant eukaryotic expression vector pCI-neo/nsp2 was transfected into COS-7 cells using lipofectin reagent to express the nsp2 protein. The expressive protein of SARS-CoV nsp2 was analyzed by 7% sodium dodecylsulfate polyacrylamide gel electrophoresis (SDS-PAGE). The nucleotide sequence and protein sequence of GD nsp2 were compared with that of other SARS-CoV strains by nucleotide-nucleotide basic local alignment search tool (BLASTN) and protein-protein basic local alignment search tool (BLASTP) to investigate its variance trend during the transmission. The secondary structure of GD strain and that of other strains were predicted by Garnier-Osguthorpe-Robson (GOR) Secondary Structure Prediction. Three-dimensional-PSSM Protein Fold Recognition (Threading) Server was employed to construct the three-dimensional model of the nsp2 protein. The putative polymerase nsp2 gene of GD strain was amplified by RT-PCR. The eukaryotic expression vector (pCI-neo/nsp2) was constructed and expressed the protein in COS-7 cells successfully. 
The result of sequencing and sequence comparison with other SARS-CoV strains showed that the nsp2 gene was relatively conserved during transmission: a total of five base sites mutated in the roughly 100 strains investigated, three of which, in the early and middle phases, caused synonymous mutations, while the other two, in the late phase, resulted in amino acid substitutions and secondary structure changes. The three-dimensional structure of the nsp2 protein was successfully constructed. The results suggest that polymerase nsp2 was relatively stable during the epidemic. The amino acid and secondary structure changes may be important for viral infection. The fact that the majority of single nucleotide variations (SNVs) are predicted to be synonymous, together with the low mutation rate of the nsp2 gene across the epidemic, indicates that nsp2 is conserved and could be a target for anti-SARS drugs. The three-dimensional structure result indicates that the nsp2 protein of the GD strain is highly homologous with 3CL(pro) of the SARS-CoV Urbani strain, 3CL(pro) of transmissible gastroenteritis virus and 3CL(pro) of human coronavirus 229E strain, which further suggests that the nsp2 protein of the GD strain possesses the activity of 3CL(pro).
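
    The distinction this abstract draws between synonymous mutations and amino acid substitutions can be illustrated with a minimal sketch. It assumes the standard genetic code and a DNA (cDNA) alphabet; the function name and the example codons are hypothetical and are not taken from the study.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: the
# standard table applies; "*" marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_substitution(codon: str, position: int, new_base: str) -> str:
    """Classify a single-base change inside one codon.

    position is 0, 1 or 2 (index within the codon). Returns "synonymous"
    when the encoded amino acid is unchanged, otherwise "nonsynonymous".
    """
    mutated = codon[:position] + new_base + codon[position + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutated]
    return "synonymous" if before == after else "nonsynonymous"


# Hypothetical examples: CTT -> CTC keeps leucine; GAT -> GAA changes Asp to Glu.
third_base_change = classify_substitution("CTT", 2, "C")
asp_to_glu_change = classify_substitution("GAT", 2, "A")
```

    Only nonsynonymous changes alter the protein sequence, which is why they are the ones that can shift a predicted secondary structure.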

  10. Three-dimensional (3D) structure prediction and function analysis of the chitin-binding domain 3 protein HD73_3189 from Bacillus thuringiensis HD73.

    PubMed

    Zhan, Yiling; Guo, Shuyuan

    2015-01-01

    Bacillus thuringiensis (Bt) is capable of producing a chitin-binding protein believed to be functionally important to bacteria during the stationary phase of its growth cycle. In this paper, the chitin-binding domain 3 protein HD73_3189 from B. thuringiensis has been analyzed computationally. Primary and secondary structural analyses demonstrated that HD73_3189 is negatively charged and contains several α-helices, aperiodical coils and β-strands. Domain and motif analyses revealed that HD73_3189 contains a signal peptide, an N-terminal chitin-binding 3 domain, two copies of a fibronectin-like domain 3 and a C-terminal carbohydrate binding domain classified as CBM_5_12. Moreover, analysis predicted the protein's associated localization site to be the cell wall. Ligand site prediction determined that amino acid residues GLU-312, TRP-334, ILE-341 and VAL-382 exposed on the surface of the target protein exhibit polar interactions with the substrate.

  11. CSI 3.0: a web server for identifying secondary and super-secondary structure in proteins using NMR chemical shifts

    PubMed Central

    Hafsa, Noor E.; Arndt, David; Wishart, David S.

    2015-01-01

    The Chemical Shift Index or CSI 3.0 (http://csi3.wishartlab.com) is a web server designed to accurately identify the location of secondary and super-secondary structures in protein chains using only nuclear magnetic resonance (NMR) backbone chemical shifts and their corresponding protein sequence data. Unlike earlier versions of CSI, which only identified three types of secondary structure (helix, β-strand and coil), CSI 3.0 now identifies a total of 11 types of secondary and super-secondary structures, including helices, β-strands, coil regions, five common β-turns (type I, II, I′, II′ and VIII), β hairpins as well as interior and edge β-strands. CSI 3.0 accepts experimental NMR chemical shift data in multiple formats (NMR Star 2.1, NMR Star 3.1 and SHIFTY) and generates colorful CSI plots (bar graphs) and secondary/super-secondary structure assignments. The output can be readily used as constraints for structure determination and refinement or the images may be used for presentations and publications. CSI 3.0 uses a pipeline of several well-tested, previously published programs to identify the secondary and super-secondary structures in protein chains. Comparisons with secondary and super-secondary structure assignments made via standard coordinate analysis programs such as DSSP, STRIDE and VADAR on high-resolution protein structures solved by X-ray and NMR show >90% agreement with those made with CSI 3.0. PMID:25979265
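
    One way to read the ">90% agreement" figure is as a per-residue comparison of two assignment strings. The abstract does not say how agreement was computed, so the sketch below is an assumption; the function name and the one-letter codes (H, E, C) are illustrative only.

```python
def percent_agreement(assignment_a: str, assignment_b: str) -> float:
    """Percentage of residues given the same label in two assignments.

    Both strings must describe the same protein chain, one character per
    residue (for example H = helix, E = strand, C = coil).
    """
    if len(assignment_a) != len(assignment_b):
        raise ValueError("assignments must cover the same residues")
    if not assignment_a:
        raise ValueError("assignments must not be empty")
    matches = sum(a == b for a, b in zip(assignment_a, assignment_b))
    return 100.0 * matches / len(assignment_a)


# Hypothetical ten-residue example that differs at a single position.
shift_based = "HHHHCCEEEE"
coordinate_based = "HHHCCCEEEE"
agreement = percent_agreement(shift_based, coordinate_based)
```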

  12. Structure and Regulatory Interactions of the Cytoplasmic Terminal Domains of Serotonin Transporter

    PubMed Central

    2014-01-01

    Uptake of neurotransmitters by sodium-coupled monoamine transporters of the NSS family is required for termination of synaptic transmission. Transport is tightly regulated by protein–protein interactions involving the small cytoplasmic segments at the amino- and carboxy-terminal ends of the transporter. Although structures of homologues provide information about the transmembrane regions of these transporters, the structural arrangement of the terminal domains remains largely unknown. Here, we combined molecular modeling, biochemical, and biophysical approaches in an iterative manner to investigate the structure of the 82-residue N-terminal and 30-residue C-terminal domains of human serotonin transporter (SERT). Several secondary structures were predicted in these domains, and structural models were built using the Rosetta fragment-based methodology. One-dimensional 1H nuclear magnetic resonance and circular dichroism spectroscopy supported the presence of helical elements in the isolated SERT N-terminal domain. Moreover, introducing helix-breaking residues within those elements altered the fluorescence resonance energy transfer signal between terminal cyan fluorescent protein and yellow fluorescent protein tags attached to full-length SERT, consistent with the notion that the fold of the terminal domains is relatively well-defined. Full-length models of SERT that are consistent with these and published experimental data were generated. The resultant models predict confined loci for the terminal domains and predict that they move apart during the transport-related conformational cycle, as predicted by structures of homologues and by the “rocking bundle” hypothesis, which is consistent with spectroscopic measurements. The models also suggest the nature of binding to regulatory interaction partners. This study provides a structural context for functional and regulatory mechanisms involving SERT terminal domains. PMID:25093911

  13. LES Modeling with Experimental Validation of a Compound Channel having Converging Floodplain

    NASA Astrophysics Data System (ADS)

    Mohanta, Abinash; Patra, K. C.

    2018-04-01

    Computational fluid dynamics (CFD) is often used to predict flow structures in developing areas of a flow field for the determination of velocity field, pressure, shear stresses, effect of turbulence and others. A two phase three-dimensional CFD model along with the large eddy simulation (LES) model is used to solve the turbulence equation. This study aims to validate CFD simulations of free surface flow or open channel flow by using volume of fluid method by comparing the data observed in hydraulics laboratory of the National Institute of Technology, Rourkela. The finite volume method with a dynamic sub grid scale was carried out for a constant aspect ratio and convergence condition. The results show that the secondary flow and centrifugal force influence flow pattern and show good agreement with experimental data. Within this paper over-bank flows have been numerically simulated using LES in order to predict accurate open channel flow behavior. The LES results are shown to accurately predict the flow features, specifically the distribution of secondary circulations both for in-bank channels as well as over-bank channels at varying depth and width ratios in symmetrically converging flood plain compound sections.

  14. Widespread Secondary Contact and New Glacial Refugia in the Halophilic Rotifer Brachionus plicatilis in the Iberian Peninsula

    PubMed Central

    Campillo, Sergi; Serra, Manuel; Carmona, María José; Gómez, Africa

    2011-01-01

    Small aquatic organisms harbour deep phylogeographic patterns and highly structured populations even at local scales. These patterns indicate restricted gene flow, despite these organisms' high dispersal abilities, and have been explained by a combination of (1) strong founder effects due to rapidly growing populations and very large population sizes, and (2) the development of diapausing egg banks and local adaptation, resulting in low effective gene flow, what is known as the Monopolization hypothesis. In this study, we build up on our understanding of the mitochondrial phylogeography of the halophilic rotifer Brachionus plicatilis in the Iberian Peninsula by both increasing the number of sampled ponds in areas where secondary contact is likely and doubling sample sizes. We analyzed partial mitochondrial sequences of 252 individuals. We found two deep mitochondrial DNA lineages differing in both their genetic diversity and the complexity of their phylogeographic structure. Our analyses suggest that several events of secondary contact between clades occurred after their expansion from glacial refugia. We found a pattern of isolation-by-distance, which we interpret as being the result of historical colonization events. We propose the existence of at least one glacial refugium in the SE of the Iberian Peninsula. Our findings challenge predictions of the Monopolization hypothesis, since coexistence (i.e., secondary contact) of divergent lineages in some ponds in the Iberian Peninsula is common. Our results indicate that phylogeographic structures in small organisms can be very complex and that gene flow between diverse lineages after population establishment can indeed occur. PMID:21698199

  15. Widespread secondary contact and new glacial refugia in the halophilic rotifer Brachionus plicatilis in the Iberian Peninsula.

    PubMed

    Campillo, Sergi; Serra, Manuel; Carmona, María José; Gómez, Africa

    2011-01-01

    Small aquatic organisms harbour deep phylogeographic patterns and highly structured populations even at local scales. These patterns indicate restricted gene flow, despite these organisms' high dispersal abilities, and have been explained by a combination of (1) strong founder effects due to rapidly growing populations and very large population sizes, and (2) the development of diapausing egg banks and local adaptation, resulting in low effective gene flow, what is known as the Monopolization hypothesis. In this study, we build up on our understanding of the mitochondrial phylogeography of the halophilic rotifer Brachionus plicatilis in the Iberian Peninsula by both increasing the number of sampled ponds in areas where secondary contact is likely and doubling sample sizes. We analyzed partial mitochondrial sequences of 252 individuals. We found two deep mitochondrial DNA lineages differing in both their genetic diversity and the complexity of their phylogeographic structure. Our analyses suggest that several events of secondary contact between clades occurred after their expansion from glacial refugia. We found a pattern of isolation-by-distance, which we interpret as being the result of historical colonization events. We propose the existence of at least one glacial refugium in the SE of the Iberian Peninsula. Our findings challenge predictions of the Monopolization hypothesis, since coexistence (i.e., secondary contact) of divergent lineages in some ponds in the Iberian Peninsula is common. Our results indicate that phylogeographic structures in small organisms can be very complex and that gene flow between diverse lineages after population establishment can indeed occur.

  16. Quasi-steady solar wind dynamics

    NASA Technical Reports Server (NTRS)

    Pizzo, V. J.

    1983-01-01

    Progress in understanding the large scale dynamics of quasisteady, corotating solar wind structure was reviewed. To address the nature of the solar wind at large heliocentric distances, preliminary calculations from a 2-D MHD model are used to demonstrate theoretical expectations of corotating structure out to 30 AU. It is found that the forward and reverse shocks from adjacent CIR's begin to interact at about 10 AU, producing new shock pairs flanking secondary CIR's. These sawtooth secondary CIR's interact again at about 20 AU and survive as visible entities to 30 AU. The model predicts the velocity jumps at the leading edge of the secondary CIR's at 30 AU should be very small but there should still be sizable variations in the thermodynamic and magnetic parameters. The driving dynamic mechanism in the distant solar wind is the relaxation of pressure gradients. The second topic is the influence of weak, nonimpulsive time dependence in quasisteady dynamics. It is suggested that modest large scale variations in the coronal flow speed on periods of several hours to a day may be responsible for many of the remaining discrepancies between theory and observation. Such effects offer a ready explanation for the apparent rounding of stream fronts between 0.3 and 1.0 AU discovered by Helios.

  17. Molecular Dynamics of "Fuzzy" Transcriptional Activator-Coactivator Interactions

    PubMed Central

    Scholes, Natalie S.; Weinzierl, Robert O. J.

    2016-01-01

    Transcriptional activation domains (ADs) are generally thought to be intrinsically unstructured, but capable of adopting limited secondary structure upon interaction with a coactivator surface. The indeterminate nature of this interface made it hitherto difficult to study structure/function relationships of such contacts. Here we used atomistic accelerated molecular dynamics (aMD) simulations to study the conformational changes of the GCN4 AD and variants thereof, either free in solution, or bound to the GAL11 coactivator surface. We show that the AD-coactivator interactions are highly dynamic while obeying distinct rules. The data provide insights into the constant and variable aspects of orientation of ADs relative to the coactivator, changes in secondary structure and energetic contributions stabilizing the various conformers at different time points. We also demonstrate that a prediction of α-helical propensity correlates directly with the experimentally measured transactivation potential of a large set of mutagenized ADs. The link between α-helical propensity and the stimulatory activity of ADs has fundamental practical and theoretical implications concerning the recruitment of ADs to coactivators. PMID:27175900

  18. QUASAR--scoring and ranking of sequence-structure alignments.

    PubMed

    Birzele, Fabian; Gewehr, Jan E; Zimmer, Ralf

    2005-12-15

    Sequence-structure alignments are a common means for protein structure prediction in the fields of fold recognition and homology modeling, and there is a broad variety of programs that provide such alignments based on sequence similarity, secondary structure or contact potentials. Nevertheless, finding the best sequence-structure alignment in a pool of alignments remains a difficult problem. QUASAR (quality of sequence-structure alignments ranking) provides a unifying framework for scoring sequence-structure alignments that aids finding well-performing combinations of well-known and custom-made scoring schemes. Those scoring functions can be benchmarked against widely accepted quality scores like MaxSub, TMScore, Touch and APDB, thus enabling users to test their own alignment scores against 'standard-of-truth' structure-based scores. Furthermore, individual score combinations can be optimized with respect to benchmark sets based on known structural relationships using QUASAR's in-built optimization routines.

  19. Smoothness within ruggedness: the role of neutrality in adaptation.

    PubMed Central

    Huynen, M A; Stadler, P F; Fontana, W

    1996-01-01

    RNA secondary structure folding algorithms predict the existence of connected networks of RNA sequences with identical structure. On such networks, evolving populations split into subpopulations, which diffuse independently in sequence space. This demands a distinction between two mutation thresholds: one at which genotypic information is lost and one at which phenotypic information is lost. In between, diffusion enables the search of vast areas in genotype space while still preserving the dominant phenotype. By this dynamic the success of phenotypic adaptation becomes much less sensitive to the initial conditions in genotype space. Images Fig. 2 PMID:8552647

  20. Using the Fast Fourier Transform to Accelerate the Computational Search for RNA Conformational Switches

    PubMed Central

    Senter, Evan; Sheikh, Saad; Dotu, Ivan; Ponty, Yann; Clote, Peter

    2012-01-01

    Using complex roots of unity and the Fast Fourier Transform, we design a new thermodynamics-based algorithm, FFTbor, that computes the Boltzmann probability that secondary structures differ by k base pairs from an arbitrary initial structure of a given RNA sequence. The algorithm, which runs in quartic time O(n^4) and quadratic space O(n^2), is used to determine the correlation between kinetic folding speed and the ruggedness of the energy landscape, and to predict the location of riboswitch expression platform candidates. A web server is available at http://bioinformatics.bc.edu/clotelab/FFTbor/. PMID:23284639
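
    The role of the roots of unity can be sketched on a toy problem. Suppose the Boltzmann weight of all structures at each base-pair distance from the initial structure is the corresponding coefficient of a polynomial. Evaluating that polynomial at the complex roots of unity and applying an inverse discrete Fourier transform recovers the coefficients, and normalising them gives the probabilities. The code below is a generic illustration of that idea and not the FFTbor implementation: the weights are made up, and a plain O(n^2) transform stands in for the FFT.

```python
import cmath


def evaluate_at_roots_of_unity(coeffs):
    """Evaluate sum(coeffs[k] * x**k) at each of the n complex n-th roots of unity."""
    n = len(coeffs)
    roots = [cmath.exp(2j * cmath.pi * r / n) for r in range(n)]
    return [sum(c * w ** k for k, c in enumerate(coeffs)) for w in roots]


def recover_coefficients(values):
    """Inverse discrete Fourier transform: recover real coefficients from the evaluations."""
    n = len(values)
    return [
        (sum(v * cmath.exp(-2j * cmath.pi * r * k / n)
             for r, v in enumerate(values)) / n).real
        for k in range(n)
    ]


def distance_probabilities(weights):
    """Normalise per-distance weights so that they sum to one."""
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical weights for distances 0..3. In a real algorithm the
# evaluations would come from a dynamic program, not from known coefficients.
toy_weights = [4.0, 2.0, 1.0, 1.0]
evaluations = evaluate_at_roots_of_unity(toy_weights)
recovered = recover_coefficients(evaluations)
probabilities = distance_probabilities(recovered)
```

    The point of the detour is that each evaluation needs only ordinary arithmetic on numbers, so the polynomial never has to be carried through the computation explicitly.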

  1. Faraday wave lattice as an elastic metamaterial.

    PubMed

    Domino, L; Tarpin, M; Patinet, S; Eddi, A

    2016-05-01

    Metamaterials enable the emergence of novel physical properties due to the existence of an underlying subwavelength structure. Here, we use the Faraday instability to shape the fluid-air interface with a regular pattern. This pattern undergoes an oscillating secondary instability and exhibits spontaneous vibrations that are analogous to transverse elastic waves. By locally forcing these waves, we fully characterize their dispersion relation and show that a Faraday pattern presents an effective shear elasticity. We propose a physical mechanism combining surface tension with the Faraday structured interface that quantitatively predicts the elastic wave phase speed, revealing that the liquid interface behaves as an elastic metamaterial.

  2. CSI 3.0: a web server for identifying secondary and super-secondary structure in proteins using NMR chemical shifts.

    PubMed

    Hafsa, Noor E; Arndt, David; Wishart, David S

    2015-07-01

    The Chemical Shift Index or CSI 3.0 (http://csi3.wishartlab.com) is a web server designed to accurately identify the location of secondary and super-secondary structures in protein chains using only nuclear magnetic resonance (NMR) backbone chemical shifts and their corresponding protein sequence data. Unlike earlier versions of CSI, which only identified three types of secondary structure (helix, β-strand and coil), CSI 3.0 now identifies a total of 11 types of secondary and super-secondary structures, including helices, β-strands, coil regions, five common β-turns (type I, II, I', II' and VIII), β hairpins as well as interior and edge β-strands. CSI 3.0 accepts experimental NMR chemical shift data in multiple formats (NMR Star 2.1, NMR Star 3.1 and SHIFTY) and generates colorful CSI plots (bar graphs) and secondary/super-secondary structure assignments. The output can be readily used as constraints for structure determination and refinement or the images may be used for presentations and publications. CSI 3.0 uses a pipeline of several well-tested, previously published programs to identify the secondary and super-secondary structures in protein chains. Comparisons with secondary and super-secondary structure assignments made via standard coordinate analysis programs such as DSSP, STRIDE and VADAR on high-resolution protein structures solved by X-ray and NMR show >90% agreement with those made with CSI 3.0. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  3. The conservation and function of RNA secondary structure in plants

    PubMed Central

    Vandivier, Lee E.; Anderson, Stephen J.; Foley, Shawn W.; Gregory, Brian D.

    2016-01-01

    RNA transcripts fold into secondary structures via intricate patterns of base pairing. These secondary structures impart catalytic, ligand binding, and scaffolding functions to a wide array of RNAs, forming a critical node of biological regulation. Among their many functions, RNA structural elements modulate epigenetic marks, alter mRNA stability and translation, regulate alternative splicing, transduce signals, and scaffold large macromolecular complexes. Thus, the study of RNA secondary structure is critical to understanding the function and regulation of RNA transcripts. Here, we review the origins, form, and function of RNA secondary structure, focusing on plants. We then provide an overview of methods for probing secondary structure, from physical methods such as X-ray crystallography and nuclear magnetic resonance imaging (NMR) to chemical and nuclease probing methods. Marriage with high-throughput sequencing has enabled these latter methods to scale across whole transcriptomes, yielding tremendous new insights into the form and function of RNA secondary structure. PMID:26865341

  4. Boosted food web productivity through ocean acidification collapses under warming.

    PubMed

    Goldenberg, Silvan U; Nagelkerken, Ivan; Ferreira, Camilo M; Ullah, Hadayet; Connell, Sean D

    2017-10-01

    Future climate is forecast to drive bottom-up (resource driven) and top-down (consumer driven) change to food web dynamics and community structure. Yet, our predictive understanding of these changes is hampered by an over-reliance on simplified laboratory systems centred on single trophic levels. Using a large mesocosm experiment, we reveal how future ocean acidification and warming modify trophic linkages across a three-level food web: that is, primary (algae), secondary (herbivorous invertebrates) and tertiary (predatory fish) producers. Both elevated CO2 and elevated temperature boosted primary production. Under elevated CO2, the enhanced bottom-up forcing propagated through all trophic levels. Elevated temperature, however, negated the benefits of elevated CO2 by stalling secondary production. This imbalance caused secondary producer populations to decline as elevated temperature drove predators to consume their prey more rapidly in the face of higher metabolic demand. Our findings demonstrate how anthropogenic CO2 can function as a resource that boosts productivity throughout food webs, and how warming can reverse this effect by acting as a stressor to trophic interactions. Understanding the shifting balance between the propagation of resource enrichment and its consumption across trophic levels provides a predictive understanding of future dynamics of stability and collapse in food webs and fisheries production. © 2017 John Wiley & Sons Ltd.

  5. Mussel glue protein has an open conformation.

    PubMed

    Williams, T; Marumo, K; Waite, J H; Henkens, R W

    1989-03-01

    Both native glue protein from marine mussels and a synthetic nonhydroxylated analog were analyzed by far-UV CD under a variety of conditions. Analysis of the CD spectra using various models strongly suggests a primarily random coil structure for both forms of the protein, a conclusion also supported by the absence of spectral change for the glue protein upon dilution into 6 M guanidine hydrochloride. The nonhydroxylated analog, which consists of 20 repeats of the peptide sequence Ala-Lys-Pro-Ser-Tyr-Pro-Pro-Thr-Tyr-Lys, was further characterized by enzyme modification using mushroom tyrosinase. Enzymatic hydroxylation of tyrosines was found to be best fit by a model containing two rate constants, (5.6 ± 0.6) × 10^-3 and (7.2 ± 0.3) × 10^-2 min^-1. At equilibrium, HPLC analysis of digests showed nearly 100% conversion of Tyr-9 and only 15 to 35% conversion of Tyr-5. The Chou and Fasman rules for predicting structure were applied to the repeat sequence listed above. The rules predict the absence of alpha helix and beta pleated sheets in the structure of this peptide. On the other hand, beta turns are predicted to be present with Tyr-5 being in the region of highest probability. These data suggest that the protein in solution has only a small amount of secondary structure.
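
    To put the two reported rate constants on a more intuitive scale, they can be converted to half-lives. This assumes each phase follows first-order kinetics, which the units (per minute) suggest but the abstract does not state outright; the function name is illustrative.

```python
import math


def first_order_half_life(rate_constant_per_min: float) -> float:
    """Half-life in minutes of a first-order process: t_half = ln(2) / k."""
    if rate_constant_per_min <= 0:
        raise ValueError("rate constant must be positive")
    return math.log(2) / rate_constant_per_min


# Central values of the two rate constants reported in the abstract.
slow_half_life = first_order_half_life(5.6e-3)  # roughly 124 minutes
fast_half_life = first_order_half_life(7.2e-2)  # roughly 9.6 minutes
```

    Under that assumption the two phases differ by about a factor of thirteen in speed.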

  6. The persuasion network is modulated by drug-use risk and predicts anti-drug message effectiveness

    PubMed Central

    Mangus, J Michael; Turner, Benjamin O

    2017-01-01

    While a persuasion network has been proposed, little is known about how network connections between brain regions contribute to attitude change. Two possible mechanisms have been advanced. One hypothesis predicts that attitude change results from increased connectivity between structures implicated in affective and executive processing in response to increases in argument strength. A second functional perspective suggests that highly arousing messages reduce connectivity between structures implicated in the encoding of sensory information, which disrupts message processing and thereby inhibits attitude change. However, persuasion is a multi-determined construct that results from both message features and audience characteristics. Therefore, persuasive messages should lead to specific functional connectivity patterns among a priori defined structures within the persuasion network. The present study exposed 28 subjects to anti-drug public service announcements where arousal, argument strength, and subject drug-use risk were systematically varied. Psychophysiological interaction analyses provide support for the affective-executive hypothesis but not for the encoding-disruption hypothesis. Secondary analyses show that video-level connectivity patterns among structures within the persuasion network predict audience responses in independent samples (one college-aged, one nationally representative). We propose that persuasion neuroscience research is best advanced by considering network-level effects while accounting for interactions between message features and target audience characteristics. PMID:29140500

  7. Obsessive-compulsive symptoms in a normative Chinese sample of youth: prevalence, symptom dimensions, and factor structure of the Leyton Obsessional Inventory--Child Version.

    PubMed

    Sun, Jing; Boschen, Mark J; Farrell, Lara J; Buys, Nicholas; Li, Zhan-Jiang

    2014-08-01

    Chinese adolescents face life stresses from multiple sources, with higher levels of stress predictive of adolescent mental health outcomes, including in the area of obsessive-compulsive disorders (OCD). Valid assessment of OCD among this age group is therefore a critical need in China. This study aims to standardise the Chinese version of the Leyton short version scale for adolescents of secondary schools in order to assess this condition. Adolescents were selected by stratified random sampling from four high schools located in Beijing, China. The Chinese version of the Leyton scale was administered to 3221 secondary school students aged between 12 and 18 years. A high response rate was achieved, with 3185 adolescents responding to the survey (98.5 percent). Exploratory factor analysis (EFA) extracted four factors from the scale: compulsive thoughts, concerns of cleanliness, lucky number, repetitiveness and repeated checking. The four-factor structure was confirmed using Confirmatory Factor Analysis (CFA). Overall the four-factor structure had a good model fit and high levels of reliability for each individual dimension and reasonable content validity. Invariance analyses in unconstrained, factor loading, and error variance models demonstrated that the Leyton scale is invariant in relation to the presence or absence of OCD, age and gender. Discriminant validity analysis demonstrated that the four-factor structure scale also had excellent ability to differentiate between OCD and non-OCD students, male and female students, and age groups. The dataset was a non-clinical sample of high school students, rather than a sample of individuals with OCD. Future research may examine symptom structure in clinical populations to assess whether this structure fits both clinical and community populations.
The structure derived from the Leyton short version scale in a non-clinical secondary school sample of adolescents suggests that a four-factor solution can be utilised as a screening tool to assess adolescents' psychopathological symptoms in the area of OCD in mainland Chinese non-clinical secondary school students. Copyright © 2014 Elsevier B.V. All rights reserved.

  8. MolIDE: a homology modeling framework you can click with.

    PubMed

    Canutescu, Adrian A; Dunbrack, Roland L

    2005-06-15

    Molecular Integrated Development Environment (MolIDE) is an integrated application designed to provide homology modeling tools and protocols under a uniform, user-friendly graphical interface. Its main purpose is to combine the most frequent modeling steps in a semi-automatic, interactive way, guiding the user from the target protein sequence to the final three-dimensional protein structure. The typical basic homology modeling process is composed of building sequence profiles of the target sequence family, secondary structure prediction, sequence alignment with PDB structures, assisted alignment editing, side-chain prediction and loop building. All of these steps are available through a graphical user interface. MolIDE's user-friendly and streamlined interactive modeling protocol allows the user to focus on the important modeling questions, hiding from the user the raw data generation and conversion steps. MolIDE was designed from the ground up as an open-source, cross-platform, extensible framework. This allows developers to integrate additional third-party programs to MolIDE. http://dunbrack.fccc.edu/molide/molide.php rl_dunbrack@fccc.edu.

  9. Knotty: Efficient and Accurate Prediction of Complex RNA Pseudoknot Structures.

    PubMed

    Jabbari, Hosna; Wark, Ian; Montemagno, Carlo; Will, Sebastian

    2018-06-01

    The computational prediction of RNA secondary structure by free energy minimization has become an important tool in RNA research. However in practice, energy minimization is mostly limited to pseudoknot-free structures or rather simple pseudoknots, not covering many biologically important structures such as kissing hairpins. Algorithms capable of predicting sufficiently complex pseudoknots (for sequences of length n) used to have extreme complexities, e.g. Pknots (Rivas and Eddy, 1999) has O(n^6) time and O(n^4) space complexity. The algorithm CCJ (Chen et al., 2009) dramatically improves the asymptotic run time for predicting complex pseudoknots (handling almost all relevant pseudoknots, while being slightly less general than Pknots), but this came at the cost of large constant factors in space and time, which strongly limited its practical application (∼200 bases already require 256 GB space). We present a CCJ-type algorithm, Knotty, that handles the same comprehensive pseudoknot class of structures as CCJ with improved space complexity of Θ(n^3 + Z). Due to the applied technique of sparsification, the number of "candidates", Z, appears to grow significantly slower than n^4 on our benchmark set (which includes pseudoknotted RNAs up to 400 nucleotides). In terms of run time over this benchmark, Knotty clearly outperforms Pknots and the original CCJ implementation, CCJ 1.0; Knotty's space consumption fundamentally improves over CCJ 1.0, being on a par with the space-economic Pknots. By comparing to CCJ 2.0, our unsparsified Knotty variant, we demonstrate the isolated effect of sparsification. Moreover, Knotty employs the state-of-the-art energy model of "HotKnots DP09", which results in superior prediction accuracy over Pknots. Our software is available at https://github.com/HosnaJabbari/Knotty. will@tbi.unvie.ac.at. Supplementary data are available at Bioinformatics online.
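
    What makes a structure "pseudoknot-free" can be shown with a short check: a structure contains a pseudoknot when two of its base pairs cross, that is, when pairs (i, j) and (k, l) satisfy i < k < j < l. The sketch below tests for such a crossing by brute force. It is an illustration of the definition only and is unrelated to the Knotty code base; the function name and the example pair lists are hypothetical.

```python
def has_pseudoknot(base_pairs):
    """Return True if any two base pairs cross (i < k < j < l).

    base_pairs is an iterable of (i, j) index pairs; the order within each
    pair does not matter. Nested or side-by-side pairs do not count.
    """
    pairs = sorted((min(i, j), max(i, j)) for i, j in base_pairs)
    for index, (i, j) in enumerate(pairs):
        # Every later pair starts at or after i, so only k < j < l remains to check.
        for k, l in pairs[index + 1:]:
            if i < k < j < l:
                return True
    return False


# Hypothetical examples: a simple hairpin stem, and two crossing helices.
nested_stem = [(0, 8), (1, 7), (2, 6)]
crossing_helices = [(0, 5), (1, 4), (3, 9)]
stem_is_knotted = has_pseudoknot(nested_stem)
helices_are_knotted = has_pseudoknot(crossing_helices)
```

    Allowing crossing pairs breaks the nesting that simpler recursions over subsequences depend on, which is why the complexities quoted in this abstract rise so sharply.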

  10. Integrated Analysis Seismic Inversion and Rockphysics for Determining Secondary Porosity Distribution of Carbonate Reservoir at “FR” Field

    NASA Astrophysics Data System (ADS)

    Rosid, M. S.; Augusta, F. F.; Haidar, M. W.

    2018-05-01

    In general, carbonate secondary pore structure is very complex due to significant diagenesis. Determining carbonate secondary pore types is therefore an important factor in production studies. This paper aims not only to identify the secondary pore types but also to predict their distribution in a carbonate reservoir. We apply the Differential Effective Medium (DEM) model to analyze the pore types of carbonate rocks. The input parameter of the DEM inclusion model is the porosity fraction, and the output parameters are the bulk and shear moduli as functions of porosity, which are used as inputs for Vp and Vs modelling. We also apply a seismic post-stack inversion technique to map the pore type distribution from 3D seismic data. Afterward, we create a porosity cube, for which a geostatistical method is preferable given the complexity of the carbonate reservoir. Thus, the results of this study might show the secondary porosity distribution of the carbonate reservoir at the “FR” field. In this case, the North–Northwest of the study area is dominated by interparticle pores and crack pores. Hence, that area has the highest permeability, so more hydrocarbon can accumulate there.

  11. Ready to use bioinformatics analysis as a tool to predict immobilisation strategies for protein direct electron transfer (DET).

    PubMed

    Cazelles, R; Lalaoui, N; Hartmann, T; Leimkühler, S; Wollenberger, U; Antonietti, M; Cosnier, S

    2016-11-15

    Direct electron transfer (DET) to proteins is of considerable interest for the development of biosensors and bioelectrocatalysts. While protein structure is mainly used as a method of attaching the protein to the electrode surface, we employed bioinformatics analysis to predict the suitable orientation of the enzymes to promote DET. Structure similarity and secondary structure prediction were combined underlying localized amino-acids able to direct one of the enzyme's electron relays toward the electrode surface by creating a suitable bioelectrocatalytic nanostructure. The electro-polymerization of pyrene pyrrole onto a fluorine-doped tin oxide (FTO) electrode allowed the targeted orientation of the formate dehydrogenase enzyme from Rhodobacter capsulatus (RcFDH) by means of hydrophobic interactions. Its electron relays were directed to the FTO surface, thus promoting DET. The reduction of nicotinamide adenine dinucleotide (NAD+) generating a maximum current density of 1 μA cm^-2 with 10 mM NAD+ leads to a turnover number of 0.09 electron/s/mol RcFDH. This work represents a practical approach to evaluate electrode surface modification strategies in order to create valuable bioelectrocatalysts.

  12. Prediction of adolescents doing physical activity after completing secondary education.

    PubMed

    Moreno-Murcia, Juan Antonio; Huéscar, Elisa; Cervelló, Eduardo

    2012-03-01

    The purpose of this study, based on self-determination theory (Ryan & Deci, 2000), was to test the predictive power of students' responsibility, psychological mediators, intrinsic motivation and the importance attached to physical education on the intention to continue to practice some form of physical activity and/or sport, and the possible relationships that exist between these variables. We used a sample of 482 adolescent students in physical education classes, with a mean age of 14.3 years, who were measured for responsibility, psychological mediators, sports motivation, the importance of physical education and intention to be physically active. We carried out a structural equation modelling analysis. The results showed that responsibility positively predicted psychological mediators, and this predicted intrinsic motivation, which positively predicted the importance students attach to physical education, and this, finally, positively predicted the intention of the student to continue doing sport. Results are discussed in relation to the promotion of students' responsibility towards a greater commitment to the practice of physical exercise.

  13. Rapid Shifts in Soil and Forest Floor Microbial Communities with Changes in Vegetation during Secondary Tropical Forest Succession

    NASA Astrophysics Data System (ADS)

    Smith, A.; Marin-Spiotta, E.; Balser, T. C.

    2012-12-01

    Soil microorganisms regulate fundamental biochemical processes in plant litter decomposition and soil organic matter (SOM) transformations. In order to predict how disturbance affects belowground carbon storage, it is important to understand how the forest floor and soil microbial community respond to changes in land cover, and the consequences on SOM formation and stabilization. We are measuring microbial functional diversity and activity across a long-term successional chronosequence of secondary forests regrowing on abandoned pastures in the wet subtropical forest life zone of Puerto Rico. Here we report intra- and interannual data on soil and litter microbial community composition (via phospholipid fatty acid analysis, PLFA) and microbial activity (via extracellular enzyme activity) from active pastures, secondary forests aged 20, 30, 40, 70, and 90 years, and primary forests. Microbial community composition and extracellular enzyme activity differed significantly by season in these wet subtropical ecosystems, even though the difference in mean monthly precipitation between the middle of the dry season (January) and the wet season (July) is only 30 mm. Despite seasonal differences, there was a persistent strong effect of land cover type and forest successional stage, or age, on overall microbial community PLFA structure. Using principal component analysis, we found differences in microbial community structure among active pastures, early, and late successional forests. The separation of soil microbes into early and late successional communities parallels the clustering of tree composition data. While the successional patterns held across seasons, the importance of different microbial groups driving these patterns differed seasonally. Biomarkers for gram-positive and actinobacteria (i15:0 and 16:0 10Me) were associated with early (20, 30 & 40 year old) secondary forests in the dry season. These younger forest communities were identified by the biomarker for anaerobic gram-negative bacteria (c19:0) in the wet season, which suggests the presence of anaerobic microsites in these very clayey Oxisols. Enzymatic activity did not differ with succession but was highest in the dry season. We expect this may be due to decreased turnover of enzymes with low soil moisture. Interannual sampling has revealed a very rapid microbial response to changes in aboveground cover. Within a year following woody biomass encroachment, we detected a shift in the soil microbial community from a pasture-associated community to an early secondary forest community in one of our replicate pasture sites. This very rapid response in the belowground microbial community structure to changes in vegetation has not been strongly documented in the literature. These data support a direct link between aboveground and belowground biotic community structures and highlight the importance of long-term repeated sampling of microbial communities in dynamic ecosystems. Our findings have implications for predicting rapid ecological responses to land-cover change.

  14. Advances in the understanding and use of the genomic base of microbial secondary metabolite biosynthesis for the discovery of new natural products.

    PubMed

    McAlpine, James B

    2009-03-27

    Over the past decade major changes have occurred in the access to genome sequences that encode the enzymes responsible for the biosynthesis of secondary metabolites, knowledge of how those sequences translate into the final structure of the metabolite, and the ability to alter the sequence to obtain predicted products via both homologous and heterologous expression. Novel genera have been discovered leading to new chemotypes, but more surprisingly several instances have been uncovered where the apparently general rules of modular translation have not applied. Several new biosynthetic pathways have been unearthed, and our general knowledge grows rapidly. This review aims to highlight some of the more striking discoveries and advances of the decade.

  15. Predicting Academic Success of Junior Secondary School Students in Mathematics through Cognitive Style and Problem Solving Technique

    ERIC Educational Resources Information Center

    Badru, Ademola K.

    2015-01-01

    This study examined the prediction of academic success of Junior secondary school mathematics students using their cognitive style and problem solving technique. A descriptive survey of correlation type was adopted for this study. A purposive sampling procedure was used to select five Public Junior secondary schools in Ijebu-Ode local government…

  16. Analysis of energy-based algorithms for RNA secondary structure prediction

    PubMed Central

    2012-01-01

    Background: RNA molecules play critical roles in the cells of organisms, including roles in gene regulation, catalysis, and synthesis of proteins. Since RNA function depends in large part on its folded structures, much effort has been invested in developing accurate methods for prediction of RNA secondary structure from the base sequence. Minimum free energy (MFE) predictions are widely used, based on nearest neighbor thermodynamic parameters of Mathews, Turner et al. or those of Andronescu et al. Some recently proposed alternatives that leverage partition function calculations find the structure with maximum expected accuracy (MEA) or pseudo-expected accuracy (pseudo-MEA) methods. Advances in prediction methods are typically benchmarked using sensitivity, positive predictive value and their harmonic mean, namely F-measure, on datasets of known reference structures. Since such benchmarks document progress in improving accuracy of computational prediction methods, it is important to understand how measures of accuracy vary as a function of the reference datasets and whether advances in algorithms or thermodynamic parameters yield statistically significant improvements. Our work advances such understanding for the MFE and (pseudo-)MEA-based methods, with respect to the latest datasets and energy parameters. Results: We present three main findings. First, using the bootstrap percentile method, we show that the average F-measure accuracy of the MFE and (pseudo-)MEA-based algorithms, as measured on our largest datasets with over 2000 RNAs from diverse families, is a reliable estimate (within a 2% range with high confidence) of the accuracy of a population of RNA molecules represented by this set. However, average accuracy on smaller classes of RNAs such as a class of 89 Group I introns used previously in benchmarking algorithm accuracy is not reliable enough to draw meaningful conclusions about the relative merits of the MFE and MEA-based algorithms. Second, on our large datasets, the algorithm with best overall accuracy is a pseudo MEA-based algorithm of Hamada et al. that uses a generalized centroid estimator of base pairs. However, between MFE and other MEA-based methods, there is no clear winner in the sense that the relative accuracy of the MFE versus MEA-based algorithms changes depending on the underlying energy parameters. Third, of the four parameter sets we considered, the best accuracy for the MFE-, MEA-based, and pseudo-MEA-based methods is 0.686, 0.680, and 0.711, respectively (on a scale from 0 to 1 with 1 meaning perfect structure predictions) and is obtained with a thermodynamic parameter set obtained by Andronescu et al. called BL* (named after the Boltzmann likelihood method by which the parameters were derived). Conclusions: Large datasets should be used to obtain reliable measures of the accuracy of RNA structure prediction algorithms, and average accuracies on specific classes (such as Group I introns and Transfer RNAs) should be interpreted with caution, considering the relatively small size of currently available datasets for such classes. The accuracy of the MEA-based methods is significantly higher when using the BL* parameter set of Andronescu et al. than when using the parameters of Mathews and Turner, and there is no significant difference between the accuracy of MEA-based methods and MFE when using the BL* parameters. The pseudo-MEA-based method of Hamada et al. with the BL* parameter set significantly outperforms all other MFE and MEA-based algorithms on our large data sets. PMID:22296803
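
    The bootstrap percentile method mentioned in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the number of resamples, and the synthetic per-RNA F-measures are all assumptions made for the example. It shows why a mean taken over about 2000 RNAs is a tighter estimate than one taken over 89.

```python
import random


def bootstrap_percentile_ci(scores, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for the mean of `scores`."""
    rng = random.Random(seed)
    n = len(scores)
    means = []
    for _ in range(n_resamples):
        # Resample n scores with replacement and record the resample mean.
        resample = [scores[rng.randrange(n)] for _ in range(n)]
        means.append(sum(resample) / n)
    means.sort()
    # Read off the alpha/2 and 1 - alpha/2 percentiles of the resampled means.
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))
    return means[lo_idx], means[hi_idx]


def synthetic_f_measures(count, seed):
    """Made-up per-RNA F-measures in [0, 1]; NOT data from the study."""
    rng = random.Random(seed)
    return [min(1.0, max(0.0, rng.gauss(0.68, 0.2))) for _ in range(count)]


large_set = synthetic_f_measures(2000, seed=1)  # size of a large benchmark set
small_set = synthetic_f_measures(89, seed=2)    # size of a small RNA class

for name, scores in (("large", large_set), ("small", small_set)):
    low, high = bootstrap_percentile_ci(scores)
    print(f"{name}: n={len(scores)}, 95% CI width = {high - low:.3f}")
```

    The interval for the small set comes out several times wider than for the large set, which is the effect the abstract describes for small RNA classes.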

  17. Analysis of energy-based algorithms for RNA secondary structure prediction.

    PubMed

    Hajiaghayi, Monir; Condon, Anne; Hoos, Holger H

    2012-02-01

    RNA molecules play critical roles in the cells of organisms, including roles in gene regulation, catalysis, and synthesis of proteins. Since RNA function depends in large part on its folded structures, much effort has been invested in developing accurate methods for prediction of RNA secondary structure from the base sequence. Minimum free energy (MFE) predictions are widely used, based on nearest neighbor thermodynamic parameters of Mathews, Turner et al. or those of Andronescu et al. Some recently proposed alternatives that leverage partition function calculations find the structure with maximum expected accuracy (MEA) or pseudo-expected accuracy (pseudo-MEA) methods. Advances in prediction methods are typically benchmarked using sensitivity, positive predictive value and their harmonic mean, namely F-measure, on datasets of known reference structures. Since such benchmarks document progress in improving accuracy of computational prediction methods, it is important to understand how measures of accuracy vary as a function of the reference datasets and whether advances in algorithms or thermodynamic parameters yield statistically significant improvements. Our work advances such understanding for the MFE and (pseudo-)MEA-based methods, with respect to the latest datasets and energy parameters. We present three main findings. First, using the bootstrap percentile method, we show that the average F-measure accuracy of the MFE and (pseudo-)MEA-based algorithms, as measured on our largest datasets with over 2000 RNAs from diverse families, is a reliable estimate (within a 2% range with high confidence) of the accuracy of a population of RNA molecules represented by this set. However, average accuracy on smaller classes of RNAs such as a class of 89 Group I introns used previously in benchmarking algorithm accuracy is not reliable enough to draw meaningful conclusions about the relative merits of the MFE and MEA-based algorithms. Second, on our large datasets, the algorithm with best overall accuracy is a pseudo MEA-based algorithm of Hamada et al. that uses a generalized centroid estimator of base pairs. However, between MFE and other MEA-based methods, there is no clear winner in the sense that the relative accuracy of the MFE versus MEA-based algorithms changes depending on the underlying energy parameters. Third, of the four parameter sets we considered, the best accuracy for the MFE-, MEA-based, and pseudo-MEA-based methods is 0.686, 0.680, and 0.711, respectively (on a scale from 0 to 1 with 1 meaning perfect structure predictions) and is obtained with a thermodynamic parameter set obtained by Andronescu et al. called BL* (named after the Boltzmann likelihood method by which the parameters were derived). Large datasets should be used to obtain reliable measures of the accuracy of RNA structure prediction algorithms, and average accuracies on specific classes (such as Group I introns and Transfer RNAs) should be interpreted with caution, considering the relatively small size of currently available datasets for such classes. The accuracy of the MEA-based methods is significantly higher when using the BL* parameter set of Andronescu et al. than when using the parameters of Mathews and Turner, and there is no significant difference between the accuracy of MEA-based methods and MFE when using the BL* parameters. The pseudo-MEA-based method of Hamada et al. with the BL* parameter set significantly outperforms all other MFE and MEA-based algorithms on our large data sets.
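
    The accuracy measures named in the abstract above (sensitivity, positive predictive value, and their harmonic mean, the F-measure) can be computed from sets of base pairs roughly as below. This is a sketch under assumptions: base pairs are represented as (i, j) index tuples, a predicted pair counts as correct only on an exact match, and the function name and toy structures are invented for illustration.

```python
def accuracy_measures(predicted_pairs, reference_pairs):
    """Return (sensitivity, PPV, F-measure) for two collections of base pairs."""
    predicted = set(predicted_pairs)
    reference = set(reference_pairs)
    true_positives = len(predicted & reference)

    # Sensitivity: fraction of reference pairs that were predicted.
    sensitivity = true_positives / len(reference) if reference else 0.0
    # PPV: fraction of predicted pairs that are in the reference.
    ppv = true_positives / len(predicted) if predicted else 0.0
    # F-measure: harmonic mean of sensitivity and PPV.
    total = sensitivity + ppv
    f_measure = 2 * sensitivity * ppv / total if total > 0 else 0.0
    return sensitivity, ppv, f_measure


# Toy example (hypothetical structures, not from any benchmark set).
reference = [(1, 20), (2, 19), (3, 18), (4, 17)]
predicted = [(1, 20), (2, 19), (5, 16)]

sens, ppv, f = accuracy_measures(predicted, reference)
print(f"sensitivity={sens:.3f}  PPV={ppv:.3f}  F={f:.3f}")
```

    Here two of the four reference pairs are recovered (sensitivity 0.5) and two of the three predictions are correct (PPV 2/3), so the F-measure is 4/7.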

  18. Statistical Analysis of the Ionosphere based on Singular Value Decomposition

    NASA Astrophysics Data System (ADS)

    Demir, Uygar; Arikan, Feza; Necat Deviren, M.; Toker, Cenk

    2016-07-01

    The ionosphere is made up of a spatio-temporally varying trend structure and secondary variations due to solar, geomagnetic, gravitational and seismic activities. Hence, it is important to monitor the ionosphere and acquire up-to-date information about its state, both to better understand the physical phenomena that cause the variability and to predict the effect of the ionosphere on HF and satellite communications and satellite-based positioning systems. To characterise the behaviour of the ionosphere, we propose to apply Singular Value Decomposition (SVD) to Total Electron Content (TEC) maps obtained from the TNPGN-Active (Turkish National Permanent GPS Network) CORS network. The TNPGN-Active network consists of 146 GNSS receivers spread over Turkey. IONOLAB-TEC values estimated from each station are spatio-temporally interpolated with very high spatial resolution using a Universal Kriging based algorithm with linear trend, namely IONOLAB-MAP. It is observed that the dominant singular value of the TEC maps is an indicator of the trend structure of the ionosphere. The diurnal, seasonal and annual variability of the most dominant value represents the solar effect on the ionosphere in the midlatitude range. Secondary and smaller singular values are indicators of secondary variation, which can be significant especially during geomagnetic storms or seismic disturbances. The dominant singular values are related to the physical basis vectors, from which the ionosphere can be fully reconstructed. Therefore, the proposed method can be used both for monitoring the current state of a region and for predicting and tracking future states of the ionosphere using singular values and singular basis vectors. This study is supported by TUBITAK 115E915 and Joint TUBITAK 114E092 and AS CR14/001 projects.
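
    The idea of reading a trend from the dominant singular value can be sketched with NumPy as below. This is only an illustration: the grid, the synthetic "TEC map" (a smooth latitude gradient plus small noise), and the function name `svd_summary` are assumptions, and no IONOLAB data or processing steps are reproduced.

```python
import numpy as np


def svd_summary(tec_map, rank=1):
    """Return singular values, the rank-`rank` reconstruction, and the
    fraction of squared magnitude captured by the leading components."""
    u, s, vt = np.linalg.svd(tec_map, full_matrices=False)
    # Rebuild the map from the leading singular values and vectors only.
    approx = (u[:, :rank] * s[:rank]) @ vt[:rank, :]
    energy = float(np.sum(s[:rank] ** 2) / np.sum(s ** 2))
    return s, approx, energy


# Synthetic map on a hypothetical latitude/longitude grid (arbitrary units).
lat = np.linspace(36.0, 42.0, 30)
lon = np.linspace(26.0, 45.0, 60)
trend = np.outer(20.0 - 0.5 * (lat - 36.0), np.ones_like(lon))  # rank-1 trend
rng = np.random.default_rng(0)
tec = trend + 0.1 * rng.standard_normal(trend.shape)  # small secondary variation

s, approx, energy = svd_summary(tec, rank=1)
print("leading singular values:", np.round(s[:3], 2))
print("fraction captured by the dominant component:", round(energy, 5))
```

    In this toy case the first singular value dominates because the map is almost entirely trend; the smaller singular values carry only the added noise, which stands in for the secondary variations described in the abstract.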

  19. Genetic diversity based on 28S rDNA sequences among populations of Culex quinquefasciatus collected at different locations in Tamil Nadu, India.

    PubMed

    Sakthivelkumar, S; Ramaraj, P; Veeramani, V; Janarthanan, S

    2015-09-01

    The basis of the present study was to distinguish the existence of any genetic variability among populations of Culex quinquefasciatus, which would be a valuable tool in the management of mosquito control programmes. In the present study, populations of Cx. quinquefasciatus collected at different locations in Tamil Nadu were analyzed for their genetic variation based on 28S rDNA D2 region nucleotide sequences. A high degree of genetic polymorphism was detected in the sequences of the D2 region of 28S rDNA on the predicted secondary structures in spite of high nucleotide sequence similarity. The findings based on secondary structure using rDNA sequences suggested the existence of a complex genotypic diversity of Cx. quinquefasciatus populations collected at different locations of Tamil Nadu, India. This complexity in genetic diversity in a single mosquito population collected at different locations is considered an important issue towards their influence and nature of vector potential of these mosquitoes.

  20. The useful field of view assessment predicts simulated commercial motor vehicle driving safety.

    PubMed

    McManus, Benjamin; Heaton, Karen; Vance, David E; Stavrinos, Despina

    2016-10-02

    The Useful Field of View (UFOV) assessment, a measure of visual speed of processing, has been shown to be a predictive measure of motor vehicle collision (MVC) involvement in an older adult population, but it remains unknown whether UFOV predicts commercial motor vehicle (CMV) driving safety during secondary task engagement. The purpose of this study is to determine whether the UFOV assessment predicts simulated MVCs in long-haul CMV drivers. Fifty licensed CMV drivers (Mage = 39.80, SD = 8.38, 98% male, 56% Caucasian) were administered the 3-subtest version of the UFOV assessment, where lower scores measured in milliseconds indicated better performance. CMV drivers completed 4 simulated drives, each spanning approximately a 22.50-mile distance. Four secondary tasks were presented to participants in a counterbalanced order during the drives: (a) no secondary task, (b) cell phone conversation, (c) text messaging interaction, and (d) e-mailing interaction with an on-board dispatch device. The selective attention subtest significantly predicted simulated MVCs regardless of secondary task. Each 20 ms slower on subtest 3 was associated with a 25% increase in the risk of an MVC in the simulated drive. The e-mail interaction secondary task significantly predicted simulated MVCs with a 4.14 times greater risk of an MVC compared to the no secondary task condition. Subtest 3, a measure of visual speed of processing, significantly predicted MVCs in the email interaction task. Each 20 ms slower on subtest 3 was associated with a 25% increase in the risk of an MVC during the email interaction task. The UFOV subtest 3 may be a promising measure to identify CMV drivers who may be at risk for MVCs or in need of cognitive training aimed at improving speed of processing. Subtest 3 may also identify CMV drivers who are particularly at risk when engaged in secondary tasks while driving.

  1. Augmented Method to Improve Thermal Data for the Figure Drift Thermal Distortion Predictions of the JWST OTIS Cryogenic Vacuum Test

    NASA Technical Reports Server (NTRS)

    Park, Sang C.; Carnahan, Timothy M.; Cohen, Lester M.; Congedo, Cherie B.; Eisenhower, Michael J.; Ousley, Wes; Weaver, Andrew; Yang, Kan

    2017-01-01

    The JWST Optical Telescope Element (OTE) assembly is the largest optically stable infrared-optimized telescope currently being manufactured and assembled, and is scheduled for launch in 2018. The JWST OTE, including the 18 segment primary mirror, secondary mirror, and the Aft Optics Subsystem (AOS) are designed to be passively cooled and operate near 45K. These optical elements are supported by a complex composite backplane structure. As a part of the structural distortion model validation efforts, a series of tests are planned during the cryogenic vacuum test of the fully integrated flight hardware at NASA JSC Chamber A. The successful ends to the thermal-distortion phases are heavily dependent on the accurate temperature knowledge of the OTE structural members. However, the current temperature sensor allocations during the cryo-vac test may not have sufficient fidelity to provide accurate knowledge of the temperature distributions within the composite structure. A method based on an inverse distance relationship among the sensors and thermal model nodes was developed to improve the thermal data provided for the nanometer scale WaveFront Error (WFE) predictions. The Linear Distance Weighted Interpolation (LDWI) method was developed to augment the thermal model predictions based on the sparse sensor information. This paper will encompass the development of the LDWI method using the test data from the earlier pathfinder cryo-vac tests, and the results of the notional and as tested WFE predictions from the structural finite element model cases to characterize the accuracies of this LDWI method.
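
    The abstract above describes the LDWI method only as being based on an inverse distance relationship among sensors and thermal model nodes. A generic inverse-distance-weighted estimate of a node temperature from sparse sensors might look like the sketch below. The weighting exponent, the function name, and the sensor layout are assumptions for illustration, and the actual LDWI formulation (including how it augments the thermal model predictions) may differ.

```python
import math


def idw_temperature(node_xyz, sensors, power=1.0):
    """Inverse-distance-weighted temperature at `node_xyz`.

    `sensors` is a list of ((x, y, z), temperature) tuples. With power=1.0
    the weights are 1/distance (an assumed, not documented, choice).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for position, temperature in sensors:
        distance = math.dist(node_xyz, position)
        if distance == 0.0:
            # The node coincides with a sensor: use its reading directly.
            return temperature
        weight = 1.0 / distance ** power
        weighted_sum += weight * temperature
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical sensors on a structural member (positions in metres, kelvin).
sensors = [((0.0, 0.0, 0.0), 40.0), ((2.0, 0.0, 0.0), 50.0)]

for node in [(0.5, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]:
    print(node, round(idw_temperature(node, sensors), 2), "K")
```

    A node midway between the two sensors gets the plain average, while nodes closer to one sensor are pulled toward that sensor's reading.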

  2. Computational-based structural, functional and phylogenetic analysis of Enterobacter phytases.

    PubMed

    Pramanik, Krishnendu; Kundu, Shreyasi; Banerjee, Sandipan; Ghosh, Pallab Kumar; Maiti, Tushar Kanti

    2018-06-01

    Myo-inositol hexakisphosphate phosphohydrolases (i.e., phytases) are known to be very important enzymes responsible for solubilization of insoluble phosphates. In the present study, Enterobacter phytases were characterized by different phylogenetic, structural and functional parameters using some standard bio-computational tools. Results showed that the majority of the Enterobacter phytases are acidic in nature, as most of the isoelectric points were under 7.0. The aliphatic indices predicted for the selected proteins were below 40, indicating their thermostable nature. The average molecular weight of the proteins was 48 kDa. The lower values of GRAVY of the said proteins implied that they have better interactions with water. Secondary structure prediction revealed that alpha-helical content was highest among the other forms such as sheets, coils, etc. Moreover, the predicted 3D structure of Enterobacter phytases divulged that the proteins consisted of four monomeric polypeptide chains, i.e., it was a tetrameric protein. The predicted tertiary model of E. aerogenes (A0A0M3HCJ2) was deposited in the Protein Model Database (Acc. No.: PM0080561) for further utilization after a thorough quality check from the QMEAN and SAVES servers. Functional analysis supported their classification as histidine acid phosphatases. Besides, multiple sequence alignment revealed that "DG-DP-LG" were the most highly conserved residues within the Enterobacter phytases. Thus, the present study will be useful in selecting a suitable phytase-producing microbe exclusively for use in the animal food industry as a food additive.

  3. Mfold web server for nucleic acid folding and hybridization prediction.

    PubMed

    Zuker, Michael

    2003-07-01

    The abbreviated name, 'mfold web server', describes a number of closely related software applications available on the World Wide Web (WWW) for the prediction of the secondary structure of single stranded nucleic acids. The objective of this web server is to provide easy access to RNA and DNA folding and hybridization software to the scientific community at large. By making use of universally available web GUIs (Graphical User Interfaces), the server circumvents the problem of portability of this software. Detailed output, in the form of structure plots with or without reliability information, single strand frequency plots and 'energy dot plots', are available for the folding of single sequences. A variety of 'bulk' servers give less information, but in a shorter time and for up to hundreds of sequences at once. The portal for the mfold web server is http://www.bioinfo.rpi.edu/applications/mfold. This URL will be referred to as 'MFOLDROOT'.

  4. A TALE-inspired computational screen for proteins that contain approximate tandem repeats.

    PubMed

    Perycz, Malgorzata; Krwawicz, Joanna; Bochtler, Matthias

    2017-01-01

    TAL (transcription activator-like) effectors (TALEs) are bacterial proteins that are secreted from bacteria to plant cells to act as transcriptional activators. TALEs and related proteins (RipTALs, BurrH, MOrTL1 and MOrTL2) contain approximate tandem repeats that differ in conserved positions that define specificity. Using PERL, we screened ~47 million protein sequences for TALE-like architecture characterized by approximate tandem repeats (between 30 and 43 amino acids in length) and sequence variability in conserved positions, without requiring sequence similarity to TALEs. Candidate proteins were scored according to their propensity for nuclear localization, secondary structure, repeat sequence complexity, as well as covariation and predicted structural proximity of variable residues. Biological context was tentatively inferred from co-occurrence of other domains and interactome predictions. Approximate repeats with TALE-like features that merit experimental characterization were found in a protein of chestnut blight fungus, a eukaryotic plant pathogen.

  5. A TALE-inspired computational screen for proteins that contain approximate tandem repeats

    PubMed Central

    Krwawicz, Joanna

    2017-01-01

    TAL (transcription activator-like) effectors (TALEs) are bacterial proteins that are secreted from bacteria to plant cells to act as transcriptional activators. TALEs and related proteins (RipTALs, BurrH, MOrTL1 and MOrTL2) contain approximate tandem repeats that differ in conserved positions that define specificity. Using PERL, we screened ~47 million protein sequences for TALE-like architecture characterized by approximate tandem repeats (between 30 and 43 amino acids in length) and sequence variability in conserved positions, without requiring sequence similarity to TALEs. Candidate proteins were scored according to their propensity for nuclear localization, secondary structure, repeat sequence complexity, as well as covariation and predicted structural proximity of variable residues. Biological context was tentatively inferred from co-occurrence of other domains and interactome predictions. Approximate repeats with TALE-like features that merit experimental characterization were found in a protein of chestnut blight fungus, a eukaryotic plant pathogen. PMID:28617832

  6. The growth pattern of the human intestine and its mesentery.

    PubMed

    Soffers, Jelly H M; Hikspoors, Jill P J M; Mekonen, Hayelom K; Koehler, S Eleonore; Lamers, Wouter H

    2015-08-22

    It remains unclear to what extent midgut rotation determines human intestinal topography and pathology. We reinvestigated the midgut during its looping and herniation phases of development, using novel 3D visualization techniques. We distinguished 3 generations of midgut loops. The topography of primary and secondary loops was constant, but that of tertiary loops not. The orientation of the primary loop changed from sagittal to transverse due to the descent of ventral structures in a body with a still helical body axis. The 1st secondary loop (duodenum, proximal jejunum) developed intraabdominally towards a left-sided position. The 2nd secondary loop (distal jejunum) assumed a left-sided position inside the hernia before returning, while the 3rd and 4th secondary loops retained near-midline positions. Intestinal return into the abdomen resembled a backward sliding movement. Only after return, the 4th secondary loop (distal ileum, cecum) rapidly "slid" into the right lower abdomen. The seemingly random position of the tertiary small-intestinal loops may have a biomechanical origin. The interpretation of "intestinal rotation" as a mechanistic rather than a descriptive concept underlies much of the confusion accompanying the physiological herniation. We argue, instead, that the concept of "en-bloc rotation" of the developing midgut is a fallacy of schematic drawings. Primary, secondary and tertiary loops arise in a hierarchical fashion. The predictable position and growth of secondary loops is pre-patterned and determines adult intestinal topography. We hypothesize based on published accounts that malrotations result from stunted development of secondary loops.

  7. Comparison of predicted binders in Rhipicephalus (Boophilus) microplus intestine protein variants Bm86 Campo Grande strain, Bm86 and Bm95.

    PubMed

    Andreotti, Renato; Pedroso, Marisela S; Caetano, Alexandre R; Martins, Natália F

    2008-01-01

    This paper reports the sequence analysis of Bm86 Campo Grande strain comparing it with Bm86 and Bm95 antigens from the preparations TickGardPLUS and Gavac, respectively. The PCR product was cloned into pMOSBlue and sequenced. The secondary structure prediction tool PSIPRED was used to calculate alpha helices and beta strand contents of the predicted polypeptide. The hydrophobicity profile was calculated using the algorithms from the Hopp and Woods method, in addition to identification of potential MHC class-I binding regions in the antigens. Pair-wise alignment revealed that the similarity between Bm86 Campo Grande strain and Bm86 is 0.2% higher than that between Bm86 Campo Grande strain and Bm95 antigens. The identities were 96.5% and 96.3% respectively. Major suggestive differences in hydrophobicity were predicted among the sequences in two specific regions.

  8. Biodiversity and functional regeneration during secondary succession in a tropical dry forest: from microorganisms to mammals

    NASA Astrophysics Data System (ADS)

    do Espírito Santo, M. M.; Neves, F. S.; Valério, H. M.; Leite, L. O.; Falcão, L. A.; Borges, M.; Beirão, M.; Reis, R., Jr.; Berbara, R.; Nunes, Y. R.; Silva, A.; Silva, L. F.; Siqueira, P. R.

    2015-12-01

    In this study, we aimed to determine the changes in soil traits, forest structure and species richness and composition of multiple groups of organisms along secondary succession in a tropical dry forest (TDF) in southeastern Brazil. We defined three successional stages based on forest vertical and horizontal structure and age: early (18-25 years), intermediate (50-60 years) and late (no records of clearing). Five plots of 50 x 20 m were established per stage, and the following groups were sampled using specific techniques: rhizobacteria, mycorrhiza, trees and lianas, butterflies, ants, dung beetles, mosquitoes (Culicidae), birds and bats. We also determined soil chemical and physical characteristics and forest structure (tree height, density and basal area). Soil fertility increased along the successional gradient, and the same pattern was observed for all the forest structure variables. However, species richness and composition showed mixed results depending on the organism group. Three groups usually considered as good bioindicators of habitat quality did not differ in species richness and composition between stages: butterflies, ants and dung beetles. On the other hand, rhizobacteria and mycorrhiza differed both in species richness and composition between stages and may be more sensitive to changes in environmental conditions in TDFs. The other five groups differed either in species richness or composition between one or two pairs of successional stages. Although changes in abiotic conditions and forest structure match the predictions of classical successional models, the response of each group of organisms is idiosyncratic in terms of diversity and ecological function, as a consequence of specific resource requirements and life-history traits. In general, diversity increased and functional groups changed mostly from early to intermediate-late stages, strengthening the importance of secondary forests to the maintenance of ecosystem integrity of TDFs.

  9. Meta-ecosystem dynamics and functioning on finite spatial networks

    PubMed Central

    Marleau, Justin N.; Guichard, Frédéric; Loreau, Michel

    2014-01-01

    The addition of spatial structure to ecological concepts and theories has spurred integration between sub-disciplines within ecology, including community and ecosystem ecology. However, the complexity of spatial models limits their implementation to idealized, regular landscapes. We present a model meta-ecosystem with finite and irregular spatial structure consisting of local nutrient–autotrophs–herbivores ecosystems connected through spatial flows of materials and organisms. We study the effect of spatial flows on stability and ecosystem functions, and provide simple metrics of connectivity that can predict these effects. Our results show that high rates of nutrient and herbivore movement can destabilize local ecosystem dynamics, leading to spatially heterogeneous equilibria or oscillations across the meta-ecosystem, with generally increased meta-ecosystem primary and secondary production. However, the onset and the spatial scale of these emergent dynamics depend heavily on the spatial structure of the meta-ecosystem and on the relative movement rate of the autotrophs. We show how this strong dependence on finite spatial structure eludes commonly used metrics of connectivity, but can be predicted by the eigenvalues and eigenvectors of the connectivity matrix that describe the spatial structure and scale. Our study indicates the need to consider finite-size ecosystems in meta-ecosystem theory. PMID:24403323

  10. regSNPs-splicing: a tool for prioritizing synonymous single-nucleotide substitution.

    PubMed

    Zhang, Xinjun; Li, Meng; Lin, Hai; Rao, Xi; Feng, Weixing; Yang, Yuedong; Mort, Matthew; Cooper, David N; Wang, Yue; Wang, Yadong; Wells, Clark; Zhou, Yaoqi; Liu, Yunlong

    2017-09-01

    While synonymous single-nucleotide variants (sSNVs) have largely been unstudied, since they do not alter protein sequence, mounting evidence suggests that they may affect RNA conformation, splicing, and the stability of nascent-mRNAs to promote various diseases. Accurately prioritizing deleterious sSNVs from a pool of neutral ones can significantly improve our ability to select functional genetic variants identified from various genome-sequencing projects, and, therefore, advance our understanding of disease etiology. In this study, we develop a computational algorithm to prioritize sSNVs based on their impact on mRNA splicing and protein function. In addition to genomic features that potentially affect splicing regulation, our proposed algorithm also includes dozens of structural features that characterize the functions of alternatively spliced exons on protein function. Our systematic evaluation on thousands of sSNVs suggests that several structural features, including intrinsic disorder protein scores, solvent accessible surface areas, protein secondary structures, and known and predicted protein family domains, show significant differences between disease-causing and neutral sSNVs. Our results suggest that the protein structure features offer an added dimension of information while distinguishing disease-causing and neutral synonymous variants. The inclusion of structural features increases the predictive accuracy for functional sSNV prioritization.

  11. Absence of residual structure in the intrinsically disordered regulatory protein CP12 in its reduced state

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Launay, Hélène; Barré, Patrick; Puppo, Carine

    2016-08-12

    The redox switch protein CP12 is a key player of the regulation of the Benson–Calvin cycle. Its oxidation state is controlled by the formation/dissociation of two intramolecular disulphide bridges during the day/night cycle. CP12 was known to be globally intrinsically disordered on a large scale in its reduced state, while being partly ordered in the oxidised state. By combining Nuclear Magnetic Resonance and Small Angle X-ray Scattering experiments, we showed that, contrary to secondary structure or disorder predictions, reduced CP12 is fully disordered, with no transient or local residual structure likely to be precursor of the structures identified in the oxidised active state and/or in the bound state with GAPDH or PRK. These results highlight the diversity of the mechanisms of regulation of conditionally disordered redox switches, and question the stability of oxidised CP12 scaffold. - Highlights: • CP12 is predicted to form two helices in its N-terminal sequence. • Reduced CP12 is disordered as a random coil according to SAXS. • Limited or no transient structures are observed in reduced CP12 by NMR.

  12. Deciphering the shape and deformation of secondary structures through local conformation analysis

    PubMed Central

    2011-01-01

    Background: Protein deformation has been extensively analysed through global methods based on RMSD, torsion angles and Principal Components Analysis calculations. Here we use a local approach, able to distinguish among the different backbone conformations within loops, α-helices and β-strands, to address the question of secondary structures' shape variation within proteins and deformation at interface upon complexation. Results: Using a structural alphabet, we translated the 3D structures of large sets of protein-protein complexes into sequences of structural letters. The shape of the secondary structures can be assessed by the structural letters that modeled them in the structural sequences. The distribution analysis of the structural letters in the three protein compartments (surface, core and interface) reveals that secondary structures tend to adopt preferential conformations that differ among the compartments. The local description of secondary structures highlights that curved conformations are preferred on the surface while straight ones are preferred in the core. Interfaces display a mixture of local conformations either preferred in core or surface. The analysis of the structural letters transition occurring between protein-bound and unbound conformations shows that the deformation of secondary structure is tightly linked to the compartment preference of the local conformations. Conclusion: The conformation of secondary structures can be further analysed and detailed thanks to a structural alphabet which allows a better description of protein surface, core and interface in terms of secondary structures' shape and deformation. Induced-fit modification tendencies described here should be valuable information to identify and characterize regions under strong structural constraints for functional reasons. PMID:21284872

  13. Deciphering the shape and deformation of secondary structures through local conformation analysis.

    PubMed

    Baussand, Julie; Camproux, Anne-Claude

    2011-02-01

    Protein deformation has been extensively analysed through global methods based on RMSD, torsion angles and Principal Components Analysis calculations. Here we use a local approach, able to distinguish among the different backbone conformations within loops, α-helices and β-strands, to address the question of secondary structures' shape variation within proteins and deformation at interface upon complexation. Using a structural alphabet, we translated the 3D structures of large sets of protein-protein complexes into sequences of structural letters. The shape of the secondary structures can be assessed by the structural letters that modeled them in the structural sequences. The distribution analysis of the structural letters in the three protein compartments (surface, core and interface) reveals that secondary structures tend to adopt preferential conformations that differ among the compartments. The local description of secondary structures highlights that curved conformations are preferred on the surface while straight ones are preferred in the core. Interfaces display a mixture of local conformations either preferred in core or surface. The analysis of the structural letters transition occurring between protein-bound and unbound conformations shows that the deformation of secondary structure is tightly linked to the compartment preference of the local conformations. The conformation of secondary structures can be further analysed and detailed thanks to a structural alphabet which allows a better description of protein surface, core and interface in terms of secondary structures' shape and deformation. Induced-fit modification tendencies described here should be valuable information to identify and characterize regions under strong structural constraints for functional reasons.

  14. Functional analysis screening for multiple topographies of problem behavior.

    PubMed

    Bell, Marlesha C; Fahmie, Tara A

    2018-04-23

    The current study evaluated a screening procedure for multiple topographies of problem behavior in the context of an ongoing functional analysis. Experimenters analyzed the function of a topography of primary concern while collecting data on topographies of secondary concern. We used visual analysis to predict the function of secondary topographies and a subsequent functional analysis to test those predictions. Results showed that a general function was accurately predicted for five of six (83%) secondary topographies. A specific function was predicted and supported for a subset of these topographies. The experimenters discuss the implication of these results for clinicians who have limited time for functional assessment. © 2018 Society for the Experimental Analysis of Behavior.

  15. Schooling Effects on Degree Performance: A Comparison of the Predictive Validity of Aptitude Testing and Secondary School Grades at Oxford University

    ERIC Educational Resources Information Center

    Ogg, Tom; Zimdars, Anna; Heath, Anthony

    2009-01-01

    This article examines the cause of school type effects upon gaining a first class degree at Oxford University, whereby for a given level of secondary school performance, private school students perform less well at degree level. We compare the predictive power of an aptitude test and secondary school grades (GCSEs) for final examination…

  16. Simplified Model to Predict Deflection and Natural Frequency of Steel Pole Structures

    NASA Astrophysics Data System (ADS)

    Balagopal, R.; Prasad Rao, N.; Rokade, R. P.

    2018-04-01

    Steel pole structures are a suitable alternative to transmission line towers, owing to the difficulty of finding land for a new right of way for the installation of new lattice towers. Steel poles have a tapered cross section and are generally used for communication, power transmission and lighting purposes. Determining the deflection of a steel pole is important for deciding its functionality requirement. Excessive pole deflection may lead to signal attenuation and short-circuiting problems in communication/transmission poles. In this paper, a simplified method is proposed to determine both primary and secondary deflection based on the dummy unit load/moment method. The deflection predicted by the proposed method is validated against full-scale experimental investigations conducted on 8 m and 30 m high lighting masts and on 132 and 400 kV transmission poles, and is found to be in close agreement with them. Determining the natural frequency is important for examining the pole's dynamic sensitivity. A simplified semi-empirical method using the static deflection from the proposed method is formulated to determine the natural frequency. The natural frequency predicted by the proposed method is validated against FE analysis results. Further, the predicted results are validated against experimental results available in the literature.

  17. Molecular Dynamics Simulations Reveal an Interplay between SHAPE Reagent Binding and RNA Flexibility.

    PubMed

    Mlýnský, Vojtěch; Bussi, Giovanni

    2018-01-18

    The function of RNA molecules usually depends on their overall fold and on the presence of specific structural motifs. Chemical probing methods are routinely used in combination with nearest-neighbor models to determine RNA secondary structure. Among the available methods, SHAPE is relevant due to its capability to probe all RNA nucleotides and the possibility to be used in vivo. However, the structural determinants for SHAPE reactivity and its mechanism of reaction are still unclear. Here molecular dynamics simulations and enhanced sampling techniques are used to predict the accessibility of nucleotide analogs and larger RNA structural motifs to SHAPE reagents. We show that local RNA reconformations are crucial in allowing reagents to reach the 2'-OH group of a particular nucleotide and that sugar pucker is a major structural factor influencing SHAPE reactivity.

  18. Nondestructive Evaluation and Monitoring Results from COPV Accelerated Stress Rupture Testing, NASA White Sands Test Facility (WSTF)

    NASA Technical Reports Server (NTRS)

    Saulsberry, Regor

    2010-01-01

    Develop and demonstrate NDE techniques for real-time characterization of CPVs and, where possible, identification of NDE capable of assessing stress rupture related strength degradation and/or making vessel life predictions (structural health monitoring or periodic inspection modes). Secondary: Provide the COPV user and materials community with quality carbon/epoxy (C/Ep) COPV stress rupture progression rate data. Aid in modeling, manufacturing, and application of COPVs for NASA spacecraft.

  19. FRAMEWORK FOR STRUCTURAL ONLINE HEALTH MONITORING OF AGING AND DEGRADATION OF SECONDARY PIPING SYSTEMS DUE TO SOME ASPECTS OF EROSION

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gribok, Andrei V.; Agarwal, Vivek

    This paper describes the current state of research related to critical aspects of erosion and selected aspects of degradation of secondary components in nuclear power plants (NPPs). The paper also proposes a framework for online health monitoring of aging and degradation of secondary components. The framework consists of an integrated multi-sensor modality system, which can be used to monitor different piping configurations under different degradation conditions. The report analyses the currently known degradation mechanisms and available predictive models. Based on this analysis, the structural health monitoring framework is proposed. The Light Water Reactor Sustainability Program began to evaluate technologies that could be used to perform online monitoring of piping and other secondary system structural components in commercial NPPs. These online monitoring systems have the potential to identify when a more detailed inspection is needed using real time measurements, rather than at a pre-determined inspection interval. This transition to condition-based, risk-informed automated maintenance will contribute to a significant reduction of operations and maintenance costs that account for the majority of nuclear power generation costs. Furthermore, of the operations and maintenance costs in U.S. plants, approximately 80% are labor costs. To address the issue of rising operating costs and economic viability, in 2017, companies that operate the national nuclear energy fleet started the Delivering the Nuclear Promise Initiative, which is a 3 year program aimed at maintaining operational focus, increasing value, and improving efficiency. There is unanimous agreement between industry experts and academic researchers that identifying and prioritizing inspection locations in secondary piping systems (for example, in raw water piping or diesel piping) would eliminate many excessive in-service inspections. The proposed structural health monitoring framework takes aim at answering this challenge by combining long range guided wave technologies with other monitoring techniques, which can significantly increase the inspection length and pinpoint the locations that degraded the most. More widely, the report suggests research efforts aimed at developing, validating, and deploying online corrosion monitoring techniques for complex geometries, which are pervasive in NPPs.

  20. Influence of thermodynamically unfavorable secondary structures on DNA hybridization kinetics

    PubMed Central

    Hata, Hiroaki; Kitajima, Tetsuro

    2018-01-01

    Nucleic acid secondary structure plays an important role in nucleic acid–nucleic acid recognition/hybridization processes, and is also a vital consideration in DNA nanotechnology. Although the influence of stable secondary structures on hybridization kinetics has been characterized, unstable secondary structures, which show positive ΔG° with self-folding, can also form, and their effects have not been systematically investigated. Such thermodynamically unfavorable secondary structures should not be ignored in DNA hybridization kinetics, especially under isothermal conditions. Here, we report that positive ΔG° secondary structures can change the hybridization rate by two orders of magnitude, despite the fact that their hybridization obeyed second-order reaction kinetics. The temperature dependence of hybridization rates showed non-Arrhenius behavior; thus, their hybridization is considered to be nucleation limited. We derived a model describing how ΔG° positive secondary structures affect hybridization kinetics in stopped-flow experiments with 47 pairs of oligonucleotides. The calculated hybridization rates, which were based on the model, quantitatively agreed with the experimental rate constant. PMID:29220504
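
    For readers unfamiliar with the terms used in this abstract, the textbook forms of second-order kinetics and of the Arrhenius relation are written out below. This is a generic sketch and not the authors' model, which the abstract does not reproduce. The symbols S_1 and S_2 (the two strands), D (the duplex), c_0 (equal initial strand concentration), k (rate constant), A_0 (pre-exponential factor) and E_a (activation energy) are labels assumed here for illustration.

```latex
% Second-order hybridization of strands S_1 and S_2 into duplex D
\frac{d[D]}{dt} = k\,[S_1][S_2]

% With equal initial strand concentrations c_0, the free-strand concentration decays as
[S_1](t) = [S_2](t) = \frac{c_0}{1 + k\,c_0\,t}

% Arrhenius temperature dependence of the rate constant
k(T) = A_0 \exp\!\left(-\frac{E_a}{R\,T}\right)
\quad\Longleftrightarrow\quad
\ln k = \ln A_0 - \frac{E_a}{R}\,\frac{1}{T}
```

    Under the Arrhenius relation, ln k plotted against 1/T is a straight line with slope -E_a/R; "non-Arrhenius behavior" refers to measured rates that depart from that line.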

  1. Functional analysis of environmental DNA-derived type II polyketide synthases reveals structurally diverse secondary metabolites.

    PubMed

    Feng, Zhiyang; Kallifidas, Dimitris; Brady, Sean F

    2011-08-02

    A single gram of soil is predicted to contain thousands of unique bacterial species. The majority of these species remain recalcitrant to standard culture methods, prohibiting their use as sources of unique bioactive small molecules. The cloning and analysis of DNA extracted directly from environmental samples (environmental DNA, eDNA) provides a means of exploring the biosynthetic capacity of natural bacterial populations. Environmental DNA libraries contain large reservoirs of bacterial genetic diversity from which new secondary metabolite gene clusters can be systematically recovered and studied. The identification and heterologous expression of type II polyketide synthase-containing eDNA clones is reported here. Functional analysis of three soil DNA-derived polyketide synthase systems in Streptomyces albus revealed diverse metabolites belonging to well-known, rare, and previously uncharacterized structural families. The first of these systems is predicted to encode the production of the known antibiotic landomycin E. The second was found to encode the production of a metabolite with a previously uncharacterized pentacyclic ring system. The third was found to encode the production of unique KB-3346-5 derivatives, which show activity against methicillin-resistant Staphylococcus aureus and vancomycin-resistant Enterococcus faecalis. These results, together with those of other small-molecule-directed metagenomic studies, suggest that culture-independent approaches are capable of accessing biosynthetic diversity that has not yet been extensively explored using culture-based methods. The large-scale functional screening of eDNA clones should be a productive strategy for generating structurally previously uncharacterized chemical entities for use in future drug development efforts.

  2. In silico structural analysis of group 3, 6 and 9 allergens from Dermatophagoides farinae.

    PubMed

    Teng, Feixiang; Yu, Lili; Bian, Yonghua; Sun, Jinxia; Wu, Juansong; Ling, Cunbao; Yang, Li; Wang, Yungang; Cui, Yubao

    2015-05-01

    Dermatophagoides farinae (Hughes; Acari: Pyroglyphidae) are the predominant source of dust mite allergens, which provoke allergic diseases, such as rhinitis, asthma and eczema. Of the 30 allergen groups produced by D. farinae, the Der f 3, Der f 6 and Der f 9 allergens are all trypsin‑associated proteins, however little else is currently known about them. The present study used in silico tools to compare the amino acid sequences, and predict the secondary and tertiary structures of Der f 3, Der f 6 and Der f 9 allergens. Protein sequence alignment detected ~46% identity between Der f 3, Der f 6 and Der f 9. Furthermore, each protein was shown to contain three active sites and two highly conserved trypsin functional domains. Predictions of the secondary and tertiary structure identified α‑helices, β‑sheets and random coils. The active sites of the three proteins appeared to fold onto each other in a three‑dimensional model, constituting the active site of the enzyme. Epitope analysis demonstrated that Der f 3, Der f 6 and Der f 9 have 4‑5 potential epitopes located in random coils, and the epitope sequences of Der f 3, Der f 6 and Der f 9 were shown to overlap in two domains (at amino acids 83‑87 and 179‑180); however the residues in these two domains were not identical. The present study aimed to conduct a biochemical and genetic analysis of these three allergens, and to potentially contribute to the development of vaccines for allergen‑specific immunotherapy.

  3. Computational approach to analyze isolated ssDNA aptamers against angiotensin II.

    PubMed

    Heiat, Mohammad; Najafi, Ali; Ranjbar, Reza; Latifi, Ali Mohammad; Rasaee, Mohammad Javad

    2016-07-20

    Aptamers are highly structured oligonucleotides that bind to their targets through a specific 3-D conformation. Commonly, not all nucleotides, such as the primer-binding fixed regions and some other sequences, are vital for aptamer folding and interaction. Elimination of unnecessary regions needs trustworthy prediction tools to reduce experimental efforts and errors. Here we introduce a manipulated in-silico approach to predict the 3-D structure of aptamers and their target interactions. To design an approach for computational analysis of isolated ssDNA aptamers (FLC112, FLC125 and their truncated core regions, CRC112 and CRC125), their secondary and tertiary structures were modeled by Mfold and RNAComposer, respectively. Output PDB files were modified from RNA to DNA in the Discovery Studio Visualizer software. Using the ZDOCK server, the aptamer-target interactions were predicted. Finally, the interaction scores were compared with the experimental results. In-silico interaction scores and the experimental outcomes were in the same descending order, FLC112>CRC125>CRC112>FLC125, with similar intensity. The consistency of the in-silico results with the experimental outputs affirmed that the present method may be a reliable approach. It also showed that exact in-silico predictions can be used as a credible reference for finding the binding potency of aptameric fragments. Copyright © 2016 Elsevier B.V. All rights reserved.

  4. Residue length and solvation model dependency of elastinlike polypeptides

    NASA Astrophysics Data System (ADS)

    Bilsel, Mustafa; Arkin, Handan

    2010-05-01

    We have performed exhaustive multicanonical Monte Carlo simulations of elastinlike polypeptides with a chain including amino acids (valine-proline-glycine-valine-glycine)n or in short (VPGVG)n , where n changes from 1 to 4, in order to investigate the thermodynamic and structural properties. To predict the characteristic secondary structure motifs of the molecules, Ramachandran plots were prepared and analyzed as well. In these studies, we utilized a realistic model where the interactions between all types of atoms were taken into account. Effects of solvation were also simulated by using an implicit-solvent model with two commonly used solvation parameter sets and compared with the vacuum case.

  5. Structural assessment of a Space Station solar dynamic heat receiver thermal energy storage canister

    NASA Technical Reports Server (NTRS)

    Tong, M. T.; Kerslake, T. W.; Thompson, R. L.

    1988-01-01

    This paper assesses the structural performance of a Space Station thermal energy storage (TES) canister subject to orbital solar flux variation and engine cold start-up operating conditions. The impact of working fluid temperature and salt-void distribution on the canister structure are assessed. Both analytical and experimental studies were conducted to determine the temperature distribution of the canister. Subsequent finite-element structural analyses of the canister were performed using both analytically and experimentally obtained temperatures. The Arrhenius creep law was incorporated into the procedure, using secondary creep data for the canister material, Haynes-188 alloy. The predicted cyclic creep strain accumulations at the hot spot were used to assess the structural performance of the canister. In addition, the structural performance of the canister based on the analytically-determined temperature was compared with that based on the experimentally-measured temperature data.
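
    The abstract refers to an Arrhenius creep law for secondary (steady-state) creep without giving its form. A commonly used power-law version is sketched below as an assumption. The exact expression and the Haynes-188 constants used by the authors are not stated in the abstract, so A, n and Q are left as unspecified material parameters.

```latex
% Steady-state (secondary) creep strain rate, power-law form with Arrhenius temperature dependence
\dot{\varepsilon}_{s} = A\,\sigma^{n}\exp\!\left(-\frac{Q}{R\,T}\right)

% \sigma : applied stress          n : stress exponent
% Q      : creep activation energy R : universal gas constant
% T      : absolute temperature    A : material constant
```

    In a form like this the strain rate rises steeply with temperature, which is consistent with the abstract's focus on creep strain accumulation at the canister hot spot.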

  6. Structural assessment of a space station solar dynamic heat receiver thermal energy storage canister

    NASA Technical Reports Server (NTRS)

    Thompson, R. L.; Kerslake, T. W.; Tong, M. T.

    1988-01-01

    The structural performance of a space station thermal energy storage (TES) canister subject to orbital solar flux variation and engine cold start up operating conditions was assessed. The impact of working fluid temperature and salt-void distribution on the canister structure are assessed. Both analytical and experimental studies were conducted to determine the temperature distribution of the canister. Subsequent finite element structural analyses of the canister were performed using both analytically and experimentally obtained temperatures. The Arrhenius creep law was incorporated into the procedure, using secondary creep data for the canister material, Haynes 188 alloy. The predicted cyclic creep strain accumulations at the hot spot were used to assess the structural performance of the canister. In addition, the structural performance of the canister based on the analytically determined temperature was compared with that based on the experimentally measured temperature data.

  7. Predicting cyanobacterial abundance, microcystin, and geosmin in a eutrophic drinking-water reservoir using a 14-year dataset

    USGS Publications Warehouse

    Harris, Ted D.; Graham, Jennifer L.

    2017-01-01

    Cyanobacterial blooms degrade water quality in drinking water supply reservoirs by producing toxic and taste-and-odor causing secondary metabolites, which ultimately cause public health concerns and lead to increased treatment costs for water utilities. There have been numerous attempts to create models that predict cyanobacteria and their secondary metabolites, most using linear models; however, linear models are limited by assumptions about the data and have had limited success as predictive tools. Thus, lake and reservoir managers need improved modeling techniques that can accurately predict large bloom events that have the highest impact on recreational activities and drinking-water treatment processes. In this study, we compared 12 unique linear and nonlinear regression modeling techniques to predict cyanobacterial abundance and the cyanobacterial secondary metabolites microcystin and geosmin using 14 years of physiochemical water quality data collected from Cheney Reservoir, Kansas. Support vector machine (SVM), random forest (RF), boosted tree (BT), and Cubist modeling techniques were the most predictive of the compared modeling approaches. SVM, RF, and BT modeling techniques were able to successfully predict cyanobacterial abundance, microcystin, and geosmin concentrations <60,000 cells/mL, 2.5 µg/L, and 20 ng/L, respectively. Only Cubist modeling predicted maxima concentrations of cyanobacteria and geosmin; no modeling technique was able to predict maxima microcystin concentrations. Because maxima concentrations are a primary concern for lake and reservoir managers, Cubist modeling may help predict the largest and most noxious concentrations of cyanobacteria and their secondary metabolites.

  8. REPPER—repeats and their periodicities in fibrous proteins

    PubMed Central

    Gruber, Markus; Söding, Johannes; Lupas, Andrei N.

    2005-01-01

    REPPER (REPeats and their PERiodicities) is an integrated server that detects and analyzes regions with short gapless repeats in protein sequences or alignments. It finds periodicities by Fourier Transform (FTwin) and internal similarity analysis (REPwin). FTwin assigns numerical values to amino acids that reflect certain properties, for instance hydrophobicity, and gives information on corresponding periodicities. REPwin uses self-alignments and displays repeats that reveal significant internal similarities. Both programs use a sliding window to ensure that different periodic regions within the same protein are detected independently. FTwin and REPwin are complemented by secondary structure prediction (PSIPRED) and coiled coil prediction (COILS), making the server a versatile analysis tool for sequences of fibrous proteins. REPPER is available at . PMID:15980460
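
    The FTwin idea described above (assign a number to each residue, then look for periodicity by Fourier transform) can be illustrated with a short sketch. This is not the REPPER implementation: the binary hydrophobicity scale, the function names and the list of candidate periods are assumptions chosen for illustration, and the sketch scores one window of sequence rather than sliding a window along the protein.

```python
import cmath
import math

# Assumed binary scale: 1.0 for residues treated as hydrophobic, 0.0 otherwise.
HYDROPHOBIC = set("AVILMFWC")


def periodicity_power(seq: str, period: float) -> float:
    """Fourier power of the mean-centred hydrophobicity signal at one period."""
    values = [1.0 if residue in HYDROPHOBIC else 0.0 for residue in seq]
    mean = sum(values) / len(values)
    total = sum(
        (v - mean) * cmath.exp(-2j * math.pi * k / period)
        for k, v in enumerate(values)
    )
    return abs(total) ** 2 / len(values)


def strongest_period(seq: str, periods):
    """Return the candidate period with the highest Fourier power."""
    return max(periods, key=lambda p: periodicity_power(seq, p))


if __name__ == "__main__":
    # Toy heptad-like window: three hydrophobic positions in every seven.
    window = "LLLSSSS" * 6
    print(strongest_period(window, [2, 3, 3.5, 4, 5, 7]))
```

    A sliding-window version would call `strongest_period` on successive slices of the sequence, so that regions with different periodicities can be reported independently, as the abstract describes.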

  9. Prediction of protein structural classes by Chou's pseudo amino acid composition: approached using continuous wavelet transform and principal component analysis.

    PubMed

    Li, Zhan-Chao; Zhou, Xi-Bin; Dai, Zong; Zou, Xiao-Yong

    2009-07-01

    Prior knowledge of a protein's structural class can provide useful information about its overall structure, so quick and accurate computational determination of protein structural class is very important in protein science. One key to any computational method is an accurate representation of the protein sample. Here, based on the concept of Chou's pseudo-amino acid composition (AAC, Chou, Proteins: structure, function, and genetics, 43:246-255, 2001), a novel method of feature extraction that combines continuous wavelet transform (CWT) with principal component analysis (PCA) is introduced for the prediction of protein structural classes. First, a digital signal was obtained by mapping each amino acid according to various physicochemical properties. Second, CWT was used to extract a new feature vector based on the wavelet power spectrum (WPS), which contains more abundant sequence-order information in the frequency and time domains, and PCA was then used to reorganize the feature vector to decrease information redundancy and computational complexity. Finally, a pseudo-amino acid composition feature vector was formed to represent the primary sequence by coupling the AAC vector with a set of new WPS feature vectors in an orthogonal space obtained by PCA. As a showcase, the rigorous jackknife cross-validation test was performed on the working datasets. The results indicated that prediction quality was improved, and the current approach to protein representation may serve as a useful complementary vehicle for classifying other attributes of proteins, such as enzyme family class, subcellular localization, membrane protein type, and protein secondary structure.

  10. Fast prediction of RNA-RNA interaction using heuristic algorithm.

    PubMed

    Montaseri, Soheila

    2015-01-01

    Interaction between two RNA molecules plays a crucial role in many medical and biological processes, such as the regulation of gene expression. In this process, one RNA molecule inhibits the translation of another by establishing stable interactions with it. Several algorithms have been developed to predict the structure of RNA-RNA interactions, but high computational time is a common challenge for most of them. In this context, a heuristic method is introduced to accurately predict the interaction between two RNAs based on minimum free energy (MFE). The algorithm uses a few dot matrices to find the secondary structure of each RNA and the binding sites between the two RNAs. Furthermore, a parallel version of this method is presented, and the algorithm's concurrency and parallelism on a multicore chip are described. The proposed algorithm has been applied to several datasets, including CopA-CopT, R1inv-R2inv, Tar-Tar*, DIS-DIS, and IncRNA54-RepZ in Escherichia coli bacteria. The method has high validity and efficiency, and it runs in low computational time compared with other approaches.
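
    The dot-matrix idea mentioned in this abstract can be sketched as follows. This is only an illustration under stated assumptions, not the published heuristic: it marks which bases of two RNAs could pair (Watson-Crick plus G-U wobble, an assumed rule set) and reports the longest uninterrupted antiparallel run, with no free-energy model at all. The function names `dot_matrix` and `longest_duplex` are hypothetical.

```python
# Allowed base pairs: Watson-Crick plus G-U wobble (an assumption).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def dot_matrix(rna1, rna2):
    """matrix[i][j] is True when rna1[i] can pair with rna2[j]."""
    return [[(a, b) in PAIRS for b in rna2] for a in rna1]


def longest_duplex(rna1, rna2):
    """Find the longest run of consecutive pairs along an anti-diagonal.

    Returns (length, i, j): the run starts at rna1[i] paired with rna2[j]
    and continues with rna1[i+1] paired with rna2[j-1], and so on.
    """
    matrix = dot_matrix(rna1, rna2)
    n1, n2 = len(rna1), len(rna2)
    run = [[0] * n2 for _ in range(n1)]
    best = (0, -1, -1)
    # Fill from the last row upwards so run[i + 1][j - 1] is already known.
    for i in range(n1 - 1, -1, -1):
        for j in range(n2):
            if matrix[i][j]:
                nxt = run[i + 1][j - 1] if i + 1 < n1 and j - 1 >= 0 else 0
                run[i][j] = 1 + nxt
                if run[i][j] > best[0]:
                    best = (run[i][j], i, j)
    return best


# Hypothetical toy input: the second strand is the reverse complement
# of the first, so the whole strand forms one duplex.
print(longest_duplex("GGGAAA", "UUUCCC"))
```

    A method based on minimum free energy would score such runs with thermodynamic parameters instead of simply counting paired positions.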

  11. [Bioinformatics analysis of mosquito densovirus nonstructural protein NS1].

    PubMed

    Dong, Yun-qiao; Ma, Wen-li; Gu, Jin-bao; Zheng, Wen-ling

    2009-12-01

    To analyze and predict the structure and function of mosquito densovirus (MDV) nonstructural protein 1 (NS1). Bioinformatics software, including the ExPASy ProtParam tool, ClustalX 1.83, BioEdit, MEGA 3.1, ScanProsite, and MotifScan, was used to comparatively analyze and predict the physicochemical parameters, homology, evolutionary relationships, secondary structure, and main functional motifs of NS1. MDV NS1 is an unstable hydrophilic protein with a highly conserved amino acid sequence, and it has a relatively close evolutionary distance to infectious hypodermal and hematopoietic necrosis virus (IHHNV). MDV NS1 has a domain specific to the superfamily 3 helicases of small DNA viruses. This domain contains the NTP-binding region with metal ion-dependent ATPase activity. A rolling-circle replication (RCR) initiation domain was found near the N terminus of the protein. The protein has the biological function of a single-stranded incision enzyme. The bioinformatics prediction results suggest that MDV NS1 protein plays a key role in viral replication, packaging, and other stages of the viral life cycle.

  12. Nanostructure and molecular mechanics of spider dragline silk protein assemblies

    PubMed Central

    Keten, Sinan; Buehler, Markus J.

    2010-01-01

    Spider silk is a self-assembling biopolymer that outperforms most known materials in terms of its mechanical performance, despite its underlying weak chemical bonding based on H-bonds. While experimental studies have shown that the molecular structure of silk proteins has a direct influence on the stiffness, toughness and failure strength of silk, no molecular-level analysis of the nanostructure and associated mechanical properties of silk assemblies have been reported. Here, we report atomic-level structures of MaSp1 and MaSp2 proteins from the Nephila clavipes spider dragline silk sequence, obtained using replica exchange molecular dynamics, and subject these structures to mechanical loading for a detailed nanomechanical analysis. The structural analysis reveals that poly-alanine regions in silk predominantly form distinct and orderly beta-sheet crystal domains, while disorderly regions are formed by glycine-rich repeats that consist of 3₁-helix type structures and beta-turns. Our structural predictions are validated against experimental data based on dihedral angle pair calculations presented in Ramachandran plots, alpha-carbon atomic distances, as well as secondary structure content. Mechanical shearing simulations on selected structures illustrate that the nanoscale behaviour of silk protein assemblies is controlled by the distinctly different secondary structure content and hydrogen bonding in the crystalline and semi-amorphous regions. Both structural and mechanical characterization results show excellent agreement with available experimental evidence. Our findings set the stage for extensive atomistic investigations of silk, which may contribute towards an improved understanding of the source of the strength and toughness of this biological superfibre. PMID:20519206

  13. Nanostructure and molecular mechanics of spider dragline silk protein assemblies.

    PubMed

    Keten, Sinan; Buehler, Markus J

    2010-12-06

    Spider silk is a self-assembling biopolymer that outperforms most known materials in terms of its mechanical performance, despite its underlying weak chemical bonding based on H-bonds. While experimental studies have shown that the molecular structure of silk proteins has a direct influence on the stiffness, toughness and failure strength of silk, no molecular-level analysis of the nanostructure and associated mechanical properties of silk assemblies have been reported. Here, we report atomic-level structures of MaSp1 and MaSp2 proteins from the Nephila clavipes spider dragline silk sequence, obtained using replica exchange molecular dynamics, and subject these structures to mechanical loading for a detailed nanomechanical analysis. The structural analysis reveals that poly-alanine regions in silk predominantly form distinct and orderly beta-sheet crystal domains, while disorderly regions are formed by glycine-rich repeats that consist of 3₁-helix type structures and beta-turns. Our structural predictions are validated against experimental data based on dihedral angle pair calculations presented in Ramachandran plots, alpha-carbon atomic distances, as well as secondary structure content. Mechanical shearing simulations on selected structures illustrate that the nanoscale behaviour of silk protein assemblies is controlled by the distinctly different secondary structure content and hydrogen bonding in the crystalline and semi-amorphous regions. Both structural and mechanical characterization results show excellent agreement with available experimental evidence. Our findings set the stage for extensive atomistic investigations of silk, which may contribute towards an improved understanding of the source of the strength and toughness of this biological superfibre.

  14. Coevolutionary modeling of protein sequences: Predicting structure, function, and mutational landscapes

    NASA Astrophysics Data System (ADS)

    Weigt, Martin

    Over the last years, biological research has been revolutionized by experimental high-throughput techniques, in particular by next-generation sequencing technology. Unprecedented amounts of data are accumulating, and there is a growing demand for computational methods that unveil the information hidden in raw data, thereby increasing our understanding of complex biological systems. Statistical-physics models based on the maximum-entropy principle have, in the last few years, played an important role in this context. To give a specific example, proteins and many non-coding RNAs show a remarkable degree of structural and functional conservation in the course of evolution, despite a large variability in amino acid sequences. We have developed a statistical-mechanics inspired inference approach, called Direct-Coupling Analysis, to link this sequence variability (easy to observe in sequence alignments, which are available in public sequence databases) to bio-molecular structure and function. In my presentation I will show how this methodology can be used (i) to infer contacts between residues and thus to guide tertiary and quaternary protein structure prediction and RNA structure prediction, (ii) to discriminate interacting from non-interacting protein families, and thus to infer conserved protein-protein interaction networks, and (iii) to reconstruct mutational landscapes and thus to predict the phenotypic effect of mutations. References: [1] M. Figliuzzi, H. Jacquier, A. Schug, O. Tenaillon and M. Weigt, "Coevolutionary landscape inference and the context-dependence of mutations in beta-lactamase TEM-1", Mol. Biol. Evol. (2015), doi: 10.1093/molbev/msv211 [2] E. De Leonardis, B. Lutz, S. Ratz, S. Cocco, R. Monasson, A. Schug, M. Weigt, "Direct-Coupling Analysis of nucleotide coevolution facilitates RNA secondary and tertiary structure prediction", Nucleic Acids Research (2015), doi: 10.1093/nar/gkv932 [3] F. Morcos, A. Pagnani, B. Lunt, A. Bertolino, D. Marks, C. Sander, R. Zecchina, J.N. Onuchic, T. Hwa, M. Weigt, "Direct-coupling analysis of residue co-evolution captures native contacts across many protein families", Proc. Natl. Acad. Sci. 108, E1293-E1301 (2011).

  15. Structural and magnetic properties of quaternary Co₂Mn₁₋ₓCrₓSi Heusler alloy thin films

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Aftab, M.; Department of Physics, Quaid-i-Azam University, Islamabad; Hassnain Jaffari, G.

    2011-09-01

    We present the structural, magnetic, and transport properties of quaternary Co₂Mn₁₋ₓCrₓSi (0 ≤ x ≤ 1) Heusler alloy thin films prepared by DC magnetron sputtering on commercially available glass substrates without any buffer layer. Recent theoretical calculations have shown the compositions to be half-metallic. XRD patterns show the presence of the L2₁ structure in the films for x = 0; however, the peak intensities are not in accordance with the literature. High resolution transmission electron microscopy images of films show granular morphologies, crystalline growth, and an ordered L2₁ structure for x ≤ 0.6. For higher Cr concentrations, secondary phases start to appear in the films. Magnetization measurements as a function of applied magnetic field show that the saturation moments for x ≤ 0.2 follow the Slater-Pauling rule; however, for 0.2 < x ≤ 0.6 the saturation moments fall short of the theoretically predicted values. Transport measurements at room temperature show a monotonic increase in resistivity with increasing Cr concentration. These results are explained in terms of texturing effects, Co-Cr antisite disorder, presence of secondary phases, and the amount of disorder present in the films.

  16. The dynamics of interacting salt structures and associated fluid flow in the western Norwegian-Danish Basin

    NASA Astrophysics Data System (ADS)

    Olsen, Mikkel S.; Clausen, Ole R.; Andresen, Katrine J.; Korstgård, John A.

    2015-04-01

    Minor secondary structures observed along the flanks of major salt structures in the Norwegian-Danish Basin appear to be generated mainly during the early stages of halokinesis. Seismic anomalies in the cover sediments at the flanks of the major salt structures and in relation to one of the secondary structures show several circular patterns. The circular patterns are generally interpreted as faults related to collapsing salt, indicating a subtle and dynamic cannibalization relationship between the secondary structure and the main diapir. High-amplitude reflections, interpreted as either entrapped gas along the circular faults or diagenetic changes induced by the fluids originating from the salt-sediment interface, generally enhance the seismic appearance of the circular faults but potentially also disturb the seismic imaging of the faults. Other secondary salt structures with a similar geometry do not show signs of collapse, apparently because they lie at a greater distance from the main salt structures and are therefore beyond the reach of cannibalization. The observations furthermore suggest a trend showing a more advanced development of the main salt structures when the secondary structures are cannibalized. The lateral distribution of the main salt structures thus appears to be controlled not only by the initial thickness of the Zechstein salt, and possible underlying structures, but also by subtle variations in the location and evolution of secondary structures. The secondary structures have a major impact on the drainage of the deep Mesozoic succession, as indicated by the fluid flow pattern also observed in the study, which emphasizes that a detailed mapping of salt structures, including secondary structures at the flanks, is of major importance during evaluation of petroleum systems in areas dominated by halokinesis.

  17. Structure and in vitro activities of a Copper II-chelating anionic peptide from the venom of the scorpion Tityus stigmurus.

    PubMed

    Melo, Menilla M A; Daniele-Silva, Alessandra; Teixeira, Diego G; Estrela, Andréia B; Melo, Karolline R T; Oliveira, Verônica S; Rocha, Hugo A O; Ferreira, Leandro de Santis; Pontes, Daniel L; Lima, João P M S; Silva-Júnior, Arnóbio A; Barbosa, Euzebio G; Carvalho, Eneas; Fernandes-Pedrosa, Matheus F

    2017-08-01

    Anionic peptides are molecules rich in aspartic acid (Asp) and/or glutamic acid (Glu) residues in the primary structure. This work presents, for the first time, structural characterization and biological activity assays of an anionic peptide from the venom of the scorpion Tityus stigmurus, named TanP. The three-dimensional structure of TanP was obtained by computational modeling and refined by molecular dynamics (MD) simulations. Furthermore, we have performed circular dichroism (CD) analysis to predict TanP secondary structure, and UV-vis spectroscopy to evaluate its chelating activity. CD indicated predominance of random coil conformation in aqueous medium, as well as changes in structure depending on pH and temperature. TanP has chelating activity on copper ions, which modified the peptide's secondary structure. These results were corroborated by MD data. The molar ratio of binding (TanP:copper) depends on the concentration of peptide: at lower TanP concentration, the molar ratio was 1:5 (TanP:Cu²⁺), whereas in concentrated TanP solution, the molar ratio was 1:3 (TanP:Cu²⁺). TanP was not cytotoxic to non-neoplastic or cancer cell lines, and showed an ability to inhibit the in vitro release of nitric oxide by LPS-stimulated macrophages. Altogether, the results suggest TanP is a promising peptide for therapeutic application as a chelating agent. Copyright © 2017 Elsevier Inc. All rights reserved.

  18. Hepatitis Delta Antigen Requires a Flexible Quasi-Double-Stranded RNA Structure To Bind and Condense Hepatitis Delta Virus RNA in a Ribonucleoprotein Complex

    PubMed Central

    Griffin, Brittany L.; Chasovskikh, Sergey; Dritschilo, Anatoly

    2014-01-01

    ABSTRACT The circular genome and antigenome RNAs of hepatitis delta virus (HDV) form characteristic unbranched, quasi-double-stranded RNA secondary structures in which short double-stranded helical segments are interspersed with internal loops and bulges. The ribonucleoprotein complexes (RNPs) formed by these RNAs with the virus-encoded protein hepatitis delta antigen (HDAg) perform essential roles in the viral life cycle, including viral replication and virion formation. Little is understood about the formation and structure of these complexes and how they function in these key processes. Here, the specific RNA features required for HDAg binding and the topology of the complexes formed were investigated. Selective 2′OH acylation analyzed by primer extension (SHAPE) applied to free and HDAg-bound HDV RNAs indicated that the characteristic secondary structure of the RNA is preserved when bound to HDAg. Notably, the analysis indicated that predicted unpaired positions in the RNA remained dynamic in the RNP. Analysis of the in vitro binding activity of RNAs in which internal loops and bulges were mutated and of synthetically designed RNAs demonstrated that the distinctive secondary structure, not the primary RNA sequence, is the major determinant of HDAg RNA binding specificity. Atomic force microscopy analysis of RNPs formed in vitro revealed complexes in which the HDV RNA is substantially condensed by bending or wrapping. Our results support a model in which the internal loops and bulges in HDV RNA contribute flexibility to the quasi-double-stranded structure that allows RNA bending and condensing by HDAg. IMPORTANCE RNA-protein complexes (RNPs) formed by the hepatitis delta virus RNAs and protein, HDAg, perform critical roles in virus replication. Neither the structures of these RNPs nor the RNA features required to form them have been characterized. HDV RNA is unusual in that it forms an unbranched quasi-double-stranded structure in which short base-paired segments are interspersed with internal loops and bulges. We analyzed the role of the HDV RNA sequence and secondary structure in the formation of a minimal RNP and visualized the structure of this RNP using atomic force microscopy. Our results indicate that HDAg does not recognize the primary sequence of the RNA; rather, the principal contribution of unpaired bases in HDV RNA to HDAg binding is to allow flexibility in the unbranched quasi-double-stranded RNA structure. Visualization of RNPs by atomic force microscopy indicated that the RNA is significantly bent or condensed in the complex. PMID:24741096

  19. Comparison of the Walz Nomogram and Presence of Secondary Circulating Prostate Cells for Predicting Early Biochemical Failure after Radical Prostatectomy for Prostate Cancer in Chilean Men.

    PubMed

    Murray, Nigel P; Reyes, Eduardo; Orellana, Nelson; Fuentealba, Cynthia; Jacob, Omar

    2015-01-01

    To determine the utility of secondary circulating prostate cells for predicting early biochemical failure after radical prostatectomy for prostate cancer and compare the results with the Walz nomogram. A single centre, prospective study of men with prostate cancer treated with radical prostatectomy between 2004 and 2014 was conducted, with registration of clinical-pathological details, total serum PSA pre-surgery, Gleason score, extracapsular extension, positive surgical margins, infiltration of lymph nodes, seminal vesicles and pathological stage. Secondary circulating prostate cells were obtained using differential gel centrifugation and assessed using standard immunocytochemistry with anti-PSA. Biochemical failure was defined as a PSA >0.2 ng/ml; predictive values were calculated using the Walz nomogram and CPC detection. A total of 326 men participated, with a median follow up of 5 years; 64 had biochemical failure within two years. Extracapsular extension, positive surgical margins, pathological stage, Gleason score ≥ 8, infiltration of seminal vesicles and lymph nodes were all associated with higher risk of biochemical failure. The discriminative value for the nomogram and circulating prostate cells was high (AUC >0.80); predictive values were higher for circulating prostate cell detection, with a negative predictive value of 99%, sensitivity of 96% and specificity of 75%. The nomogram had good predictive power to identify men with a high risk of biochemical failure within two years. The presence of circulating prostate cells had the same predictive power, with a higher sensitivity and negative predictive value. The presence of secondary circulating prostate cells identifies a group of men with a high risk of early biochemical failure. Those negative for secondary CPCs have a very low risk of early biochemical failure.
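
    The predictive values quoted in this abstract follow standard definitions, which a short sketch can make concrete. The function name `diagnostic_metrics` is hypothetical, and the counts in the example are invented for illustration only; they are not the study's 2x2 table, which the abstract does not report.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV and NPV from a 2x2 table.

    tp/fn: patients with the outcome who tested positive/negative.
    fp/tn: patients without the outcome who tested positive/negative.
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of true cases detected
        "specificity": tn / (tn + fp),  # share of non-cases correctly negative
        "ppv": tp / (tp + fp),          # probability of outcome given positive
        "npv": tn / (tn + fn),          # probability of no outcome given negative
    }


# Hypothetical counts, NOT taken from the study.
metrics = diagnostic_metrics(tp=90, fp=50, fn=10, tn=150)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

    Because NPV depends on how common the outcome is in the cohort, the same sensitivity and specificity can give a different NPV in a population with a different failure rate.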

  20. The persuasion network is modulated by drug-use risk and predicts anti-drug message effectiveness.

    PubMed

    Huskey, Richard; Mangus, J Michael; Turner, Benjamin O; Weber, René

    2017-12-01

    While a persuasion network has been proposed, little is known about how network connections between brain regions contribute to attitude change. Two possible mechanisms have been advanced. One hypothesis predicts that attitude change results from increased connectivity between structures implicated in affective and executive processing in response to increases in argument strength. A second functional perspective suggests that highly arousing messages reduce connectivity between structures implicated in the encoding of sensory information, which disrupts message processing and thereby inhibits attitude change. However, persuasion is a multi-determined construct that results from both message features and audience characteristics. Therefore, persuasive messages should lead to specific functional connectivity patterns among a priori defined structures within the persuasion network. The present study exposed 28 subjects to anti-drug public service announcements where arousal, argument strength, and subject drug-use risk were systematically varied. Psychophysiological interaction analyses provide support for the affective-executive hypothesis but not for the encoding-disruption hypothesis. Secondary analyses show that video-level connectivity patterns among structures within the persuasion network predict audience responses in independent samples (one college-aged, one nationally representative). We propose that persuasion neuroscience research is best advanced by considering network-level effects while accounting for interactions between message features and target audience characteristics. © The Author (2017). Published by Oxford University Press.

  1. IMG-ABC: An Atlas of Biosynthetic Gene Clusters to Fuel the Discovery of Novel Secondary Metabolites

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chen, I-Min; Chu, Ken; Ratner, Anna

    2014-10-28

    In the discovery of secondary metabolites (SMs), large-scale analysis of sequence data is a promising exploration path that remains largely underutilized due to the lack of relevant computational resources. We present IMG-ABC (https://img.jgi.doe.gov/abc/) -- An Atlas of Biosynthetic gene Clusters within the Integrated Microbial Genomes (IMG) system. IMG-ABC is a rich repository of both validated and predicted biosynthetic clusters (BCs) in cultured isolates, single-cells and metagenomes linked with the SM chemicals they produce and enhanced with focused analysis tools within IMG. The underlying scalable framework enables traversal of phylogenetic dark matter and chemical structure space -- serving as a doorway to a new era in the discovery of novel molecules.

  2. Computational design of d-peptide inhibitors of hepatitis delta antigen dimerization

    NASA Astrophysics Data System (ADS)

    Elkin, Carl D.; Zuccola, Harmon J.; Hogle, James M.; Joseph-McCarthy, Diane

    2000-11-01

    Hepatitis delta virus (HDV) encodes a single polypeptide called hepatitis delta antigen (DAg). Dimerization of DAg is required for viral replication. The structure of the dimerization region, residues 12 to 60, consists of an anti-parallel coiled coil [Zuccola et al., Structure, 6 (1998) 821]. Multiple Copy Simultaneous Searches (MCSS) of the hydrophobic core region formed by the bend in the helix of one monomer of this structure were carried out for many diverse functional groups. Six critical interaction sites were identified. The Protein Data Bank was searched for backbone templates to use in the subsequent design process by matching to these sites. A 14 residue helix expected to bind to the d-isomer of the target structure was selected as the template. Over 200 000 mutant sequences of this peptide were generated based on the MCSS results. A secondary structure prediction algorithm was used to screen all sequences, and in general only those that were predicted to be highly helical were retained. Approximately 100 of these 14-mers were model built as d-peptides and docked with the l-isomer of the target monomer. Based on calculated interaction energies, predicted helicity, and intrahelical salt bridge patterns, a small number of peptides were selected as the most promising candidates. The ligand design approach presented here is the computational analogue of mirror image phage display. The results have been used to characterize the interactions responsible for formation of this model anti-parallel coiled coil and to suggest potential ligands to disrupt it.

  3. Secondary reconstruction of maxillofacial trauma.

    PubMed

    Castro-Núñez, Jaime; Van Sickels, Joseph E

    2017-08-01

    Craniomaxillofacial trauma is one of the most complex clinical conditions in contemporary maxillofacial surgery. Vital structures and possible functional and esthetic sequelae are important considerations following this type of trauma and intervention. Despite the best efforts of the primary surgery, there is a group of patients who will have poor outcomes requiring secondary reconstruction to restore form and function. The purpose of this study is to review current concepts on secondary reconstruction of the maxillofacial complex. The evaluation of a posttraumatic patient for a secondary reconstruction must include an assessment of the different subunits of the upper face, middle face, and lower face. Virtual surgical planning and surgical guides represent the most important innovations in secondary reconstruction over the past few years. Intraoperative navigational surgery/computer-assisted navigation is used in complex cases. Facial asymmetry can be corrected or significantly improved by segmentation of the computerized tomography dataset and mirroring of the unaffected side by means of virtual surgical planning. Navigational surgery/computer-assisted navigation allows for a more precise surgical correction when secondary reconstruction involves the replacement of extensive anatomical areas. The use of technology can result in custom-made replacements and prebent plates, which are more stable and resistant to fracture because of metal fatigue. Careful perioperative evaluation is the key to positive outcomes of secondary reconstruction after trauma. The advent of technological tools has played a capital role in helping the surgical team perform a given treatment plan in a more precise and predictable manner.

  4. Use of Predictive Text in Text Messaging over the Course of a Year and Its Relationship with Spelling, Orthographic Processing and Grammar

    ERIC Educational Resources Information Center

    Waldron, Sam; Wood, Clare; Kemp, Nenagh

    2017-01-01

    An investigation into the impact of predictive text use upon the literacy skills of primary school, secondary school and university cohorts was conducted over the course of a year. No differences in use of text abbreviations ("textisms") were found between predictive text users and nonusers. However, secondary school children who used…

  5. Emergence of Secondary Trigger Sites after Primary Migraine Surgery.

    PubMed

    Punjabi, Ayesha; Brown, Matthew; Guyuron, Bahman

    2016-04-01

    Surgical decompression of a migraine headache may unmask headaches originating from secondary sites. A retrospective chart review investigated the incidence and characteristics of secondary trigger sites to identify clinical patterns that could aid in predicting and perhaps reducing postoperative migraines. One hundred eighty-five charts for migraine patients who underwent surgery at the senior author's (B.G.) practice were reviewed. Sites from which migraine headaches initiated or occurred independently were considered primary. The sites that were not active at the time of preoperative evaluation but became active after surgery were considered secondary. Bivariate analysis was performed to characterize postoperative migraines. Of 185 patients, 33 (17.8 percent) developed secondary migraine headache trigger sites. Of patients with primary site I (frontal) symptoms, 20.83 percent had site III (septonasal) symptoms unmasked after surgery (versus 7 percent for patients with other primary sites; p = 0.04). Of the patients with site II (temporal) migraines, 17.14 percent had secondary frontal symptoms (versus 5.68 percent; p = 0.04). Primary site II symptoms predicted postoperative site IV (occipital) symptoms (11.43 versus 1.1 percent; p = 0.008), and primary occipital symptoms predicted postoperative temporal symptoms (11.1 versus 2.33 percent; p = 0.04). The authors observed that 17.8 percent of patients develop postoperative migraine headache triggers that are not reported during the initial assessment. Knowledge of secondary migraine emergence patterns, and the presence of some preoperative symptoms, can aid in predicting the migraines that will arise from a new site postoperatively. Therapeutic, IV.

  6. Temperature-dependence of biomass accumulation rates during secondary succession.

    PubMed

    Anderson, Kristina J; Allen, Andrew P; Gillooly, James F; Brown, James H

    2006-06-01

    Rates of ecosystem recovery following disturbance affect many ecological processes, including carbon cycling in the biosphere. Here, we present a model that predicts the temperature dependence of the biomass accumulation rate following disturbances in forests. Model predictions are derived based on allometric and biochemical principles that govern plant energetics and are tested using a global database of 91 studies of secondary succession compiled from the literature. The rate of biomass accumulation during secondary succession increases with average growing season temperature as predicted based on the biochemical kinetics of photosynthesis in chloroplasts. In addition, the rate of biomass accumulation is greater in angiosperm-dominated communities than in gymnosperm-dominated ones and greater in plantations than in naturally regenerating stands. By linking the temperature-dependence of photosynthesis to the rate of whole-ecosystem biomass accumulation during secondary succession, our model and results provide one example of how emergent, ecosystem-level rate processes can be predicted based on the kinetics of individual metabolic rate.

  7. The Role of ABC Proteins in Drug Resistant Breast Cancer Cells

    DTIC Science & Technology

    2008-04-01

    are indicated. (B) Cartoon of the predicted PfMDR1 secondary structure indicating helices (gray), loops (lines), and Walker A (vertical stripes... [Biochemistry, Vol. 46, No. 20, 2007, Amoah et al.] ...to vanadate (Figure 4E) but high sensitivity... gray vs black bars) as were inhibitory effects seen at high dose CQ (Figure 9C, solid bars). DISCUSSION Reproducible high level overexpression of

  8. Special Focus

    PubMed Central

    Nawrocki, Eric P.; Burge, Sarah W.

    2013-01-01

    The development of RNA bioinformatic tools began more than 30 years ago with the description of the Nussinov and Zuker dynamic programming algorithms for single-sequence RNA secondary structure prediction. Since then, many tools have been developed for various RNA sequence analysis problems such as homology search, multiple sequence alignment, de novo RNA discovery, read-mapping, and many more. In this issue, we have collected a sampling of reviews and original research that demonstrate some of the many ways bioinformatics is integrated with current RNA biology research. PMID:23948768
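
    The Nussinov algorithm named in the record above can be illustrated with a minimal sketch of base-pair maximisation by dynamic programming. The function name `nussinov_pairs`, the `min_loop` parameter, and the decision to allow G-U wobble pairs are illustrative assumptions, not details taken from the cited editorial; the sketch maximises the number of nested pairs and ignores free energies, which is what separates it from Zuker-style methods.

```python
def nussinov_pairs(seq, min_loop=3):
    """Nussinov-style base-pair maximisation (illustrative sketch).

    Returns (max_pairs, dot_bracket) for an RNA sequence. `min_loop` is the
    assumed minimum number of unpaired bases enclosed by any pair.
    """
    # Assumed pairing rules: Watson-Crick pairs plus G-U wobble pairs.
    can_pair = {("A", "U"), ("U", "A"), ("G", "C"),
                ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0, ""

    # dp[i][j] = maximum number of nested pairs in seq[i..j] (inclusive).
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: position j is unpaired
            for k in range(i, j - min_loop):  # case 2: j pairs with k
                if (seq[k], seq[j]) in can_pair:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best

    # Traceback: recover one optimal structure (there may be several).
    structure = ["."] * n
    stack = [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= min_loop:
            continue  # too short to contain a pair
        if dp[i][j] == dp[i][j - 1]:
            stack.append((i, j - 1))
            continue
        for k in range(i, j - min_loop):
            if (seq[k], seq[j]) in can_pair:
                left = dp[i][k - 1] if k > i else 0
                if dp[i][j] == left + 1 + dp[k + 1][j - 1]:
                    structure[k], structure[j] = "(", ")"
                    if k > i:
                        stack.append((i, k - 1))
                    stack.append((k + 1, j - 1))
                    break
    return dp[0][n - 1], "".join(structure)


if __name__ == "__main__":
    count, fold = nussinov_pairs("GGGAAAUCCC")
    print(count, fold)
```

    The table fill takes O(n^3) time and O(n^2) memory. Because several structures can reach the same pair count, the dot-bracket string returned is only one of the optimal answers.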

  9. In silico methods for co-transcriptional RNA secondary structure prediction and for investigating alternative RNA structure expression.

    PubMed

    Meyer, Irmtraud M

    2017-05-01

    RNA transcripts are the primary products of active genes in any living organism, including many viruses. Their cellular destiny not only depends on primary sequence signals, but can also be determined by RNA structure. Recent experimental evidence shows that many transcripts can be assigned more than a single functional RNA structure throughout their cellular life and that structure formation happens co-transcriptionally, i.e. as the transcript is synthesised in the cell. Moreover, functional RNA structures are not limited to non-coding transcripts, but can also feature in coding transcripts. The picture that now emerges is that RNA structures constitute an additional layer of information that can be encoded in any RNA transcript (and on top of other layers of information such as protein-context) in order to exert a wide range of functional roles. Moreover, different encoded RNA structures can be expressed at different stages of a transcript's life in order to alter the transcript's behaviour depending on its actual cellular context. Similar to the concept of alternative splicing for protein-coding genes, where a single transcript can yield different proteins depending on cellular context, it is thus appropriate to propose the notion of alternative RNA structure expression for any given transcript. This review introduces several computational strategies that my group developed to detect different aspects of RNA structure expression in vivo. Two aspects are of particular interest to us: (1) RNA secondary structure features that emerge during co-transcriptional folding and (2) functional RNA structure features that are expressed at different times of a transcript's life and potentially mutually exclusive. Copyright © 2017. Published by Elsevier Inc.

  10. Computational analysis of human and mouse CREB3L4 Protein

    PubMed Central

    Velpula, Kiran Kumar; Rehman, Azeem Abdul; Chigurupati, Soumya; Sanam, Ramadevi; Inampudi, Krishna Kishore; Akila, Chandra Sekhar

    2012-01-01

    CREB3L4 is a member of the CREB/ATF transcription factor family, characterized by their regulation of gene expression through the cAMP-responsive element. Previous studies identified this protein in mice and humans. Whereas CREB3L4 in mice (referred to as Tisp40) is found in the testes and functions in spermatogenesis, human CREB3L4 is primarily detected in the prostate and has been implicated in cancer. We conducted computational analyses to compare the structural homology between murine Tisp40α and human CREB3L4. Our results reveal that the primary and secondary structures of the two proteins are highly similar. Additionally, the predicted helical transmembrane structure indicates that the proteins likely have similar structure and function. This study offers preliminary findings that support the translation of mouse Tisp40α findings into human models, based on structural homology. PMID:22829733

  11. Exploring GPCR-Lipid Interactions by Molecular Dynamics Simulations: Excitements, Challenges, and the Way Forward.

    PubMed

    Sengupta, Durba; Prasanna, Xavier; Mohole, Madhura; Chattopadhyay, Amitabha

    2018-06-07

    G protein-coupled receptors (GPCRs) are seven transmembrane receptors that mediate a large number of cellular responses and are important drug targets. One of the current challenges in GPCR biology is to analyze the molecular signatures of receptor-lipid interactions and their subsequent effects on GPCR structure, organization, and function. Molecular dynamics simulation studies have been successful in predicting molecular determinants of receptor-lipid interactions. In particular, predicted cholesterol interaction sites appear to correspond well with experimentally determined binding sites and estimated time scales of association. In spite of several success stories, the methodologies in molecular dynamics simulations are still emerging. In this Feature Article, we provide a comprehensive overview of coarse-grain and atomistic molecular dynamics simulations of GPCR-lipid interaction in the context of experimental observations. In addition, we discuss the effect of secondary and tertiary structural constraints in coarse-grain simulations in the context of functional dynamics and structural plasticity of GPCRs. We envision that this comprehensive overview will help resolve differences in computational studies and provide a way forward.

  12. GeneBee-net: Internet-based server for analyzing biopolymers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Brodsky, L.I.; Ivanov, V.V.; Nikolaev, V.K.

    This work describes a network server for searching databanks of biopolymer structures and performing other biocomputing procedures; it is available via direct Internet connection. Basic server procedures are dedicated to homology (similarity) search of sequence and 3D structure of proteins. The homologies found could be used to build multiple alignments, predict protein and RNA secondary structure, and construct phylogenetic trees. In addition to traditional methods of sequence similarity search, the authors propose "non-matrix" (correlational) search. An analogous approach is used to identify regions of similar tertiary structure of proteins. Algorithm concepts and usage examples are presented for new methods. Service logic is based upon interaction of a client program and server procedures. The client program allows the compilation of queries and the processing of results of an analysis.

  13. Frnakenstein: multiple target inverse RNA folding.

    PubMed

    Lyngsø, Rune B; Anderson, James W J; Sizikova, Elena; Badugu, Amarendra; Hyland, Tomas; Hein, Jotun

    2012-10-09

    RNA secondary structure prediction, or folding, is a classic problem in bioinformatics: given a sequence of nucleotides, the aim is to predict the base pairs formed in its three-dimensional conformation. The inverse problem of designing a sequence folding into a particular target structure has only more recently received notable interest. With a growing appreciation and understanding of the functional and structural properties of RNA motifs, and a growing interest in utilising biomolecules in nano-scale designs, the interest in the inverse RNA folding problem is bound to increase. However, whereas the RNA folding problem from an algorithmic viewpoint has an elegant and efficient solution, the inverse RNA folding problem appears to be hard. In this paper we present a genetic algorithm approach to solve the inverse folding problem. The main aim of the development was to address the hitherto mostly ignored extension of solving the inverse folding problem, the multi-target inverse folding problem, while simultaneously designing a method with superior performance when measured on the quality of designed sequences. The genetic algorithm has been implemented as a Python program called Frnakenstein. It was benchmarked against four existing methods and several data sets totalling 769 real and predicted single structure targets, and on 292 two structure targets. It performed as well as or better at finding sequences which folded in silico into the target structure than all existing methods, without the heavy bias towards CG base pairs that was observed for all other top performing methods. On the two structure targets it also performed well, generating a perfect design for about 80% of the targets. Our method illustrates that successful designs for the inverse RNA folding problem do not necessarily have to rely on heavy biases in base pair and unpaired base distributions. The design problem seems to become more difficult on larger structures when the target structures are real structures, while no deterioration was observed for predicted structures. Design for two structure targets is considerably more difficult, but far from impossible, demonstrating the feasibility of automated design of artificial riboswitches. The Python implementation is available at http://www.stats.ox.ac.uk/research/genome/software/frnakenstein.
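
    The inverse folding problem described in the record above can be made concrete with a small sketch of two checks that design methods of this kind commonly rely on: whether a candidate sequence can form every pair in a target structure, and how far a predicted structure is from the target. This is not Frnakenstein's code. All function names are hypothetical, the pairing rules (Watson-Crick plus G-U wobble) are an assumption, and structures are assumed to be pseudoknot-free dot-bracket strings.

```python
# Assumed pairing rules: Watson-Crick pairs plus G-U wobble pairs.
CAN_PAIR = {("A", "U"), ("U", "A"), ("G", "C"),
            ("C", "G"), ("G", "U"), ("U", "G")}


def parse_dot_bracket(structure):
    """Return the set of (i, j) base pairs encoded by a dot-bracket string."""
    stack, pairs = [], set()
    for pos, ch in enumerate(structure):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % pos)
            pairs.add((stack.pop(), pos))
        elif ch != ".":
            raise ValueError("unexpected character %r" % ch)
    if stack:
        raise ValueError("unbalanced '(' at position %d" % stack[-1])
    return pairs


def is_compatible(seq, target):
    """True if every pair in `target` can be formed by the bases of `seq`."""
    if len(seq) != len(target):
        raise ValueError("sequence and structure lengths differ")
    return all((seq[i], seq[j]) in CAN_PAIR
               for i, j in parse_dot_bracket(target))


def compatible_with_all(seq, targets):
    """Multi-target case: the sequence must be compatible with every target."""
    return all(is_compatible(seq, t) for t in targets)


def base_pair_distance(a, b):
    """Number of base pairs present in exactly one of the two structures."""
    if len(a) != len(b):
        raise ValueError("structures must have equal length")
    return len(parse_dot_bracket(a) ^ parse_dot_bracket(b))


if __name__ == "__main__":
    print(is_compatible("GGAACC", "((..))"))
    print(base_pair_distance("((..))", "(....)"))
```

    Compatibility is necessary but not sufficient: a compatible sequence may still fold into a different structure. A search procedure would therefore also fold each candidate with a prediction tool and score it, for example with a distance such as the one sketched here.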

  14. Frnakenstein: multiple target inverse RNA folding

    PubMed Central

    2012-01-01

    Background: RNA secondary structure prediction, or folding, is a classic problem in bioinformatics: given a sequence of nucleotides, the aim is to predict the base pairs formed in its three-dimensional conformation. The inverse problem of designing a sequence folding into a particular target structure has only more recently received notable interest. With a growing appreciation and understanding of the functional and structural properties of RNA motifs, and a growing interest in utilising biomolecules in nano-scale designs, the interest in the inverse RNA folding problem is bound to increase. However, whereas the RNA folding problem from an algorithmic viewpoint has an elegant and efficient solution, the inverse RNA folding problem appears to be hard. Results: In this paper we present a genetic algorithm approach to solve the inverse folding problem. The main aim of the development was to address the hitherto mostly ignored extension of solving the inverse folding problem, the multi-target inverse folding problem, while simultaneously designing a method with superior performance when measured on the quality of designed sequences. The genetic algorithm has been implemented as a Python program called Frnakenstein. It was benchmarked against four existing methods and several data sets totalling 769 real and predicted single structure targets, and on 292 two structure targets. It performed as well as or better at finding sequences which folded in silico into the target structure than all existing methods, without the heavy bias towards CG base pairs that was observed for all other top performing methods. On the two structure targets it also performed well, generating a perfect design for about 80% of the targets. Conclusions: Our method illustrates that successful designs for the inverse RNA folding problem do not necessarily have to rely on heavy biases in base pair and unpaired base distributions. The design problem seems to become more difficult on larger structures when the target structures are real structures, while no deterioration was observed for predicted structures. Design for two structure targets is considerably more difficult, but far from impossible, demonstrating the feasibility of automated design of artificial riboswitches. The Python implementation is available at http://www.stats.ox.ac.uk/research/genome/software/frnakenstein. PMID:23043260

  15. Metal Cations in G-Quadruplex Folding and Stability

    NASA Astrophysics Data System (ADS)

    Bhattacharyya, Debmalya; Mirihana Arachchilage, Gayan; Basu, Soumitra

    2016-09-01

    This review is focused on the structural and physico-chemical aspects of metal cation coordination to G-Quadruplexes (GQ) and their effects on GQ stability and conformation. G-Quadruplex structures are non-canonical secondary structures formed by both DNA and RNA. G-quadruplexes regulate a wide range of important biochemical processes. Besides the sequence requirements, the coordination of monovalent cations in the GQ is essential for its formation and determines the stability and polymorphism of GQ structures. The nature, location and dynamics of the cation coordination and their impact on the overall GQ stability are dependent on several factors such as the ionic radii, hydration energy and the bonding strength to the O6 of guanines. The intracellular monovalent cation concentration and the localized ion concentrations determine the formation of GQs and can potentially dictate their regulatory roles. A wide range of biochemical and biophysical studies on an array of GQ-enabling sequences have generated, at a minimum, the knowledge base that often allows us to predict the stability of GQs in the presence of physiologically relevant metal ions; however, prediction of the conformation of such GQs is still out of reach.

  16. Gap formation following climatic events in spatially structured plant communities

    PubMed Central

    Liao, Jinbao; De Boeck, Hans J.; Li, Zhenqing; Nijs, Ivan

    2015-01-01

    Gaps play a crucial role in maintaining species diversity, yet how community structure and composition influence gap formation is still poorly understood. We apply a spatially structured community model to predict how species diversity and intraspecific aggregation shape gap patterns emerging after climatic events, based on species-specific mortality responses. In multispecies communities, average gap size and gap-size diversity increased rapidly with increasing mean mortality once a mortality threshold was exceeded, greatly promoting gap recolonization opportunity. This result was observed at all levels of species richness. Increasing interspecific difference likewise enhanced these metrics, which may promote not only diversity maintenance but also community invasibility, since more diverse niches for both local and exotic species are provided. The richness effects on gap size and gap-size diversity were positive, but only expressed when species were sufficiently different. Surprisingly, while intraspecific clumping strongly promoted gap-size diversity, it hardly influenced average gap size. Species evenness generally reduced gap metrics induced by climatic events, so the typical assumption of maximum evenness in many experiments and models may underestimate community diversity and invasibility. Overall, understanding the factors driving gap formation in spatially structured assemblages can help predict community secondary succession after climatic events. PMID:26114803

  17. Structurally coloured secondary particles composed of black and white colloidal particles.

    PubMed

    Takeoka, Yukikazu; Yoshioka, Shinya; Teshima, Midori; Takano, Atsushi; Harun-Ur-Rashid, Mohammad; Seki, Takahiro

    2013-01-01

    This study investigated the colourful secondary particles formed by controlling the aggregation states of colloidal silica particles and the enhancement of the structural colouration of the secondary particles caused by adding black particles. We obtained glossy, partially structurally coloured secondary particles in the absence of NaCl, but matte, whitish secondary particles were obtained in the presence of NaCl. When a small amount of carbon black was incorporated into both types of secondary particles, the incoherent multiple scattering of light from the amorphous region was considerably reduced. However, the peak intensities in the reflection spectra, caused by Bragg reflection and by coherent single wavelength scattering, were only slightly decreased. Consequently, a brighter structural colour of these secondary particles was observed with the naked eye. Furthermore, when magnetite was added as a black particle, the coloured secondary particles could be moved and collected by applying an external magnetic field.

  18. Structurally Coloured Secondary Particles Composed of Black and White Colloidal Particles

    PubMed Central

    Takeoka, Yukikazu; Yoshioka, Shinya; Teshima, Midori; Takano, Atsushi; Harun-Ur-Rashid, Mohammad; Seki, Takahiro

    2013-01-01

    This study investigated the colourful secondary particles formed by controlling the aggregation states of colloidal silica particles and the enhancement of the structural colouration of the secondary particles caused by adding black particles. We obtained glossy, partially structurally coloured secondary particles in the absence of NaCl, but matte, whitish secondary particles were obtained in the presence of NaCl. When a small amount of carbon black was incorporated into both types of secondary particles, the incoherent multiple scattering of light from the amorphous region was considerably reduced. However, the peak intensities in the reflection spectra, caused by Bragg reflection and by coherent single wavelength scattering, were only slightly decreased. Consequently, a brighter structural colour of these secondary particles was observed with the naked eye. Furthermore, when magnetite was added as a black particle, the coloured secondary particles could be moved and collected by applying an external magnetic field. PMID:23917891

  19. Highly sensitive detection of individual HEAT and ARM repeats with HHpred and COACH.

    PubMed

    Kippert, Fred; Gerloff, Dietlind L

    2009-09-24

    HEAT and ARM repeats occur in a large number of eukaryotic proteins. As these repeats are often highly diverged, the prediction of HEAT or ARM domains can be challenging. Except for the most clear-cut cases, identification at the individual repeat level is indispensable, in particular for determining domain boundaries. However, methods using single sequence queries do not have the sensitivity required to deal with more divergent repeats and, when applied to proteins with known structures, in some cases failed to detect a single repeat. Testing algorithms which use multiple sequence alignments as queries, we found two of them, HHpred and COACH, to detect HEAT and ARM repeats with greatly enhanced sensitivity. Calibration against experimentally determined structures suggests the use of three score classes with increasing confidence in the prediction, and prediction thresholds for each method. When we applied a new protocol using both HHpred and COACH to these structures, it detected 82% of HEAT repeats and 90% of ARM repeats, with the minimum for a given protein of 57% for HEAT repeats and 60% for ARM repeats. Application to bona fide HEAT and ARM proteins or domains indicated that similar numbers can be expected for the full complement of HEAT/ARM proteins. A systematic screen of the Protein Data Bank for false positive hits revealed their number to be low, in particular for ARM repeats. Double false positive hits for a given protein were rare for HEAT and not at all observed for ARM repeats. In combination with fold prediction and consistency checking (multiple sequence alignments, secondary structure prediction, and position analysis), repeat prediction with the new HHpred/COACH protocol dramatically improves prediction in the twilight zone of fold prediction methods, as well as the delineation of HEAT/ARM domain boundaries. A protocol is presented for the identification of individual HEAT or ARM repeats which is straightforward to implement. It provides high sensitivity at a low false positive rate and will therefore greatly enhance the accuracy of predictions of HEAT and ARM domains.

  20. Highly Sensitive Detection of Individual HEAT and ARM Repeats with HHpred and COACH

    PubMed Central

    Kippert, Fred; Gerloff, Dietlind L.

    2009-01-01

    Background: HEAT and ARM repeats occur in a large number of eukaryotic proteins. As these repeats are often highly diverged, the prediction of HEAT or ARM domains can be challenging. Except for the most clear-cut cases, identification at the individual repeat level is indispensable, in particular for determining domain boundaries. However, methods using single sequence queries do not have the sensitivity required to deal with more divergent repeats and, when applied to proteins with known structures, in some cases failed to detect a single repeat. Methodology and Principal Findings: Testing algorithms which use multiple sequence alignments as queries, we found two of them, HHpred and COACH, to detect HEAT and ARM repeats with greatly enhanced sensitivity. Calibration against experimentally determined structures suggests the use of three score classes with increasing confidence in the prediction, and prediction thresholds for each method. When we applied a new protocol using both HHpred and COACH to these structures, it detected 82% of HEAT repeats and 90% of ARM repeats, with the minimum for a given protein of 57% for HEAT repeats and 60% for ARM repeats. Application to bona fide HEAT and ARM proteins or domains indicated that similar numbers can be expected for the full complement of HEAT/ARM proteins. A systematic screen of the Protein Data Bank for false positive hits revealed their number to be low, in particular for ARM repeats. Double false positive hits for a given protein were rare for HEAT and not at all observed for ARM repeats. In combination with fold prediction and consistency checking (multiple sequence alignments, secondary structure prediction, and position analysis), repeat prediction with the new HHpred/COACH protocol dramatically improves prediction in the twilight zone of fold prediction methods, as well as the delineation of HEAT/ARM domain boundaries. Significance: A protocol is presented for the identification of individual HEAT or ARM repeats which is straightforward to implement. It provides high sensitivity at a low false positive rate and will therefore greatly enhance the accuracy of predictions of HEAT and ARM domains. PMID:19777061
