Science.gov

Sample records for structure based predictive

  1. Collective prediction based on community structure

    NASA Astrophysics Data System (ADS)

    Jiang, Yasong; Li, Taisong; Zhang, Yan; Yan, Yonghong

    2017-01-01

    Collective prediction algorithms have been used to improve performance when network structures are involved in prediction tasks. The training dataset of such tasks often contains information on content, links and labels, while the testing dataset has only content and link information. Conventional collective prediction algorithms conduct predictions based on the content of a node and the information of its direct neighbors with a base classifier. However, the information of some direct neighbor nodes may not be consistent with that of the target node. In addition, the information of indirect neighbors can be helpful when that of direct neighbors is scant. In this paper, instead of using information of direct neighbors, we propose to apply community structures in networks to prediction tasks. A community detection method is incorporated into the collective prediction process to improve prediction performance. Experimental results show that the proposed algorithm outperforms a number of standard prediction algorithms, especially when labeled training data are limited.

  2. Structure-Based Predictions of Activity Cliffs

    PubMed Central

    Husby, Jarmila; Bottegoni, Giovanni; Kufareva, Irina; Abagyan, Ruben; Cavalli, Andrea

    2015-01-01

    In drug discovery, it is generally accepted that neighboring molecules in a given descriptors' space display similar activities. However, even in regions that provide strong predictability, structurally similar molecules can occasionally display large differences in potency. In QSAR jargon, these discontinuities in the activity landscape are known as ‘activity cliffs’. In this study, we assessed the reliability of ligand docking and virtual ligand screening schemes in predicting activity cliffs. We performed our calculations on a diverse, independently collected database of cliff-forming co-crystals. Starting from ideal situations, which allowed us to establish our baseline, we progressively moved toward simulating more realistic scenarios. Ensemble- and template-docking achieved a significant level of accuracy, suggesting that, despite the well-known limitations of empirical scoring schemes, activity cliffs can be accurately predicted by advanced structure-based methods. PMID:25918827

  3. Ensemble-based prediction of RNA secondary structures.

    PubMed

    Aghaeepour, Nima; Hoos, Holger H

    2013-04-24

    Accurate structure prediction methods play an important role for the understanding of RNA function. Energy-based, pseudoknot-free secondary structure prediction is one of the most widely used and versatile approaches, and improved methods for this task have received much attention over the past five years. Despite the impressive progress that has been achieved in this area, existing evaluations of the prediction accuracy achieved by various algorithms do not provide a comprehensive, statistically sound assessment. Furthermore, while there is increasing evidence that no prediction algorithm consistently outperforms all others, no work has been done to exploit the complementary strengths of multiple approaches. In this work, we present two contributions to the area of RNA secondary structure prediction. Firstly, we use state-of-the-art, resampling-based statistical methods together with a previously published and increasingly widely used dataset of high-quality RNA structures to conduct a comprehensive evaluation of existing RNA secondary structure prediction procedures. The results from this evaluation clarify the performance relationship between ten well-known existing energy-based pseudoknot-free RNA secondary structure prediction methods and clearly demonstrate the progress that has been achieved in recent years. Secondly, we introduce AveRNA, a generic and powerful method for combining a set of existing secondary structure prediction procedures into an ensemble-based method that achieves significantly higher prediction accuracies than obtained from any of its component procedures. Our new, ensemble-based method, AveRNA, improves the state of the art for energy-based, pseudoknot-free RNA secondary structure prediction by exploiting the complementary strengths of multiple existing prediction procedures, as demonstrated using a state-of-the-art statistical resampling approach. In addition, AveRNA allows an intuitive and effective control of the trade-off between

  4. Ensemble-based prediction of RNA secondary structures

    PubMed Central

    2013-01-01

    Background: Accurate structure prediction methods play an important role for the understanding of RNA function. Energy-based, pseudoknot-free secondary structure prediction is one of the most widely used and versatile approaches, and improved methods for this task have received much attention over the past five years. Despite the impressive progress that has been achieved in this area, existing evaluations of the prediction accuracy achieved by various algorithms do not provide a comprehensive, statistically sound assessment. Furthermore, while there is increasing evidence that no prediction algorithm consistently outperforms all others, no work has been done to exploit the complementary strengths of multiple approaches. Results: In this work, we present two contributions to the area of RNA secondary structure prediction. Firstly, we use state-of-the-art, resampling-based statistical methods together with a previously published and increasingly widely used dataset of high-quality RNA structures to conduct a comprehensive evaluation of existing RNA secondary structure prediction procedures. The results from this evaluation clarify the performance relationship between ten well-known existing energy-based pseudoknot-free RNA secondary structure prediction methods and clearly demonstrate the progress that has been achieved in recent years. Secondly, we introduce AveRNA, a generic and powerful method for combining a set of existing secondary structure prediction procedures into an ensemble-based method that achieves significantly higher prediction accuracies than obtained from any of its component procedures. Conclusions: Our new, ensemble-based method, AveRNA, improves the state of the art for energy-based, pseudoknot-free RNA secondary structure prediction by exploiting the complementary strengths of multiple existing prediction procedures, as demonstrated using a state-of-the-art statistical resampling approach. In addition, AveRNA allows an intuitive and effective

  5. Improving structure-based function prediction using molecular dynamics

    PubMed Central

    Glazer, Dariya S.; Radmer, Randall J.; Altman, Russ B.

    2009-01-01

    Summary: The number of molecules with solved three-dimensional structure but unknown function is increasing rapidly. Particularly problematic are novel folds with little detectable similarity to molecules of known function. Experimental assays can determine the functions of such molecules, but are time-consuming and expensive. Computational approaches can identify potential functional sites; however, these approaches generally rely on single static structures and do not use information about dynamics. In fact, structural dynamics can enhance function prediction: we coupled molecular dynamics simulations with structure-based function prediction algorithms that identify Ca²⁺ binding sites. When applied to 11 challenging proteins, both methods showed substantial improvement in performance, revealing 22 more sites in one case and 12 more in the other, with a modest increase in apparent false positives. Thus, we show that treating molecules as dynamic entities improves the performance of structure-based function prediction methods. PMID:19604472

  6. OPTIMIZATION BIAS IN ENERGY-BASED STRUCTURE PREDICTION.

    PubMed

    Petrella, Robert J

    2013-12-01

    Physics-based computational approaches to predicting the structure of macromolecules such as proteins are gaining increased use, but there are remaining challenges. In the current work, it is demonstrated that in energy-based prediction methods, the degree of optimization of the sampled structures can influence the prediction results. In particular, discrepancies in the degree of local sampling can bias the predictions in favor of the oversampled structures by shifting the local probability distributions of the minimum sampled energies. In simple systems, it is shown that the magnitude of the errors can be calculated from the energy surface, and for certain model systems, derived analytically. Further, it is shown that for energy wells whose forms differ only by a randomly assigned energy shift, the optimal accuracy of prediction is achieved when the sampling around each structure is equal. Energy correction terms can be used in cases of unequal sampling to reproduce the total probabilities that would occur under equal sampling, but optimal corrections only partially restore the prediction accuracy lost to unequal sampling. For multiwell systems, the determination of the correction terms is a multibody problem; it is shown that the involved cross-correlation multiple integrals can be reduced to simpler integrals. The possible implications of the current analysis for macromolecular structure prediction are discussed.
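The sampling bias described in this record can be illustrated with a small numerical sketch. It is not the author's analysis: it assumes two wells whose sampled energies follow the same standard Gaussian distribution, and the function name and sample counts are hypothetical. Because the wells are identical, any preference for one of them comes purely from unequal sampling.

```python
import numpy as np

def oversampled_win_rate(n_over=100, n_under=10, n_trials=2000, seed=0):
    """Fraction of trials in which the more heavily sampled well
    yields the lower minimum sampled energy."""
    rng = np.random.default_rng(seed)
    # Assumption: both wells share the same Gaussian energy distribution,
    # so neither is truly preferred.
    over = rng.normal(size=(n_trials, n_over)).min(axis=1)
    under = rng.normal(size=(n_trials, n_under)).min(axis=1)
    return float(np.mean(over < under))

rate = oversampled_win_rate()
print(f"oversampled well has the lower minimum in {rate:.1%} of trials")
```

For identically distributed samples, the overall minimum is equally likely to be any one of the 110 draws, so the oversampled well is expected to win in about 100/110 of the trials. With equal sampling the expected rate would be one half.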

  7. OPTIMIZATION BIAS IN ENERGY-BASED STRUCTURE PREDICTION

    PubMed Central

    Petrella, Robert J.

    2014-01-01

    Physics-based computational approaches to predicting the structure of macromolecules such as proteins are gaining increased use, but there are remaining challenges. In the current work, it is demonstrated that in energy-based prediction methods, the degree of optimization of the sampled structures can influence the prediction results. In particular, discrepancies in the degree of local sampling can bias the predictions in favor of the oversampled structures by shifting the local probability distributions of the minimum sampled energies. In simple systems, it is shown that the magnitude of the errors can be calculated from the energy surface, and for certain model systems, derived analytically. Further, it is shown that for energy wells whose forms differ only by a randomly assigned energy shift, the optimal accuracy of prediction is achieved when the sampling around each structure is equal. Energy correction terms can be used in cases of unequal sampling to reproduce the total probabilities that would occur under equal sampling, but optimal corrections only partially restore the prediction accuracy lost to unequal sampling. For multiwell systems, the determination of the correction terms is a multibody problem; it is shown that the involved cross-correlation multiple integrals can be reduced to simpler integrals. The possible implications of the current analysis for macromolecular structure prediction are discussed. PMID:25552783

  8. Assessing the Accuracy of Template-Based Structure Prediction Metaservers by Comparison with Structural Genomics Structures

    PubMed Central

    Gront, Dominik; Grabowski, Marek; Raynor, John; Tkaczuk, Karolina L.; Minor, Wladek

    2014-01-01

    The explosion of the size of the universe of known protein sequences has stimulated two complementary approaches to structural mapping of these sequences: theoretical structure prediction and experimental determination by structural genomics (SG). In this work, we assess the accuracy of structure prediction by two automated template-based structure prediction metaservers (genesilico.pl and bioinfo.pl) by measuring the structural similarity of the predicted models to corresponding experimental models determined a posteriori. Of 199 targets chosen from SG programs, the metaservers predicted the structures of about a fourth of them “correctly.” (In this case, “correct” was defined as placing more than 70% of the alpha carbon atoms in the model within 2 Å of the experimentally determined positions.) Almost all of the targets that could be modeled to this accuracy were those with an available template in the Protein Data Bank (PDB) with more than 25% sequence identity. The majority of those SG targets with lower sequence identity to structures in the PDB were not predicted by the metaservers with this accuracy. We also compared metaserver results to CASP8 results, finding that the models obtained by participants in the CASP competition were significantly better than those produced by the metaservers. PMID:23086054
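The correctness criterion quoted in this record (more than 70% of alpha carbons within 2 Å of the experimental positions) can be sketched as follows. This is a minimal illustration, assuming the model and experimental coordinates are already superposed and matched residue by residue; the abstract does not say how superposition was done, and the function names are hypothetical.

```python
import numpy as np

def fraction_within(model_ca, exp_ca, cutoff=2.0):
    """Fraction of C-alpha atoms whose model position lies within
    `cutoff` angstroms of the experimental position."""
    model_ca = np.asarray(model_ca, dtype=float)
    exp_ca = np.asarray(exp_ca, dtype=float)
    dists = np.linalg.norm(model_ca - exp_ca, axis=1)
    return float(np.mean(dists <= cutoff))

def is_correct(model_ca, exp_ca, cutoff=2.0, min_fraction=0.70):
    """Apply the 'more than 70% within 2 angstroms' criterion."""
    return fraction_within(model_ca, exp_ca, cutoff) > min_fraction

# Toy example: 10 residues, 8 displaced by 1 angstrom and 2 by 5 angstroms.
exp = np.zeros((10, 3))
model = exp.copy()
model[:8, 0] += 1.0
model[8:, 0] += 5.0
frac = fraction_within(model, exp)
ok = is_correct(model, exp)
```

In the toy example, 8 of 10 atoms fall within the cutoff, so the model counts as correct under this criterion.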

  9. Blind Test of Physics-Based Prediction of Protein Structures

    PubMed Central

    Shell, M. Scott; Ozkan, S. Banu; Voelz, Vincent; Wu, Guohong Albert; Dill, Ken A.

    2009-01-01

    We report here a multiprotein blind test of a computer method to predict native protein structures based solely on an all-atom physics-based force field. We use the AMBER 96 potential function with an implicit (GB/SA) model of solvation, combined with replica-exchange molecular-dynamics simulations. Coarse conformational sampling is performed using the zipping and assembly method (ZAM), an approach that is designed to mimic the putative physical routes of protein folding. ZAM was applied to the folding of six proteins, from 76 to 112 monomers in length, in CASP7, a community-wide blind test of protein structure prediction. Because these predictions have about the same level of accuracy as typical bioinformatics methods, and do not utilize information from databases of known native structures, this work opens up the possibility of predicting the structures of membrane proteins, synthetic peptides, or other foldable polymers, for which there is little prior knowledge of native structures. This approach may also be useful for predicting physical protein folding routes, non-native conformations, and other physical properties from amino acid sequences. PMID:19186130

  10. Blind test of physics-based prediction of protein structures.

    PubMed

    Shell, M Scott; Ozkan, S Banu; Voelz, Vincent; Wu, Guohong Albert; Dill, Ken A

    2009-02-01

    We report here a multiprotein blind test of a computer method to predict native protein structures based solely on an all-atom physics-based force field. We use the AMBER 96 potential function with an implicit (GB/SA) model of solvation, combined with replica-exchange molecular-dynamics simulations. Coarse conformational sampling is performed using the zipping and assembly method (ZAM), an approach that is designed to mimic the putative physical routes of protein folding. ZAM was applied to the folding of six proteins, from 76 to 112 monomers in length, in CASP7, a community-wide blind test of protein structure prediction. Because these predictions have about the same level of accuracy as typical bioinformatics methods, and do not utilize information from databases of known native structures, this work opens up the possibility of predicting the structures of membrane proteins, synthetic peptides, or other foldable polymers, for which there is little prior knowledge of native structures. This approach may also be useful for predicting physical protein folding routes, non-native conformations, and other physical properties from amino acid sequences.

  11. Structure-based Methods for Computational Protein Functional Site Prediction

    PubMed Central

    Dukka, B KC

    2013-01-01

    Due to the advent of high-throughput sequencing techniques and structural genomics projects, the number of gene and protein sequences has been ever increasing. Computational methods to annotate these genes and proteins are therefore even more indispensable. Proteins are important macromolecules, and the study of protein function is an important problem in structural bioinformatics. This paper discusses a number of methods to predict protein functional sites, focusing especially on protein-ligand binding site prediction. Initially, a short overview is presented on recent advances in methods for selection of homologous sequences. Furthermore, a few recent structure-based approaches and sequence-and-structure-based approaches for predicting protein functional sites are discussed in detail. PMID:24688745

  12. Distance matrix-based approach to protein structure prediction.

    PubMed

    Kloczkowski, Andrzej; Jernigan, Robert L; Wu, Zhijun; Song, Guang; Yang, Lei; Kolinski, Andrzej; Pokarowski, Piotr

    2009-03-01

    Much structural information is encoded in the internal distances; a distance matrix-based approach can be used to predict protein structure and dynamics, and for structural refinement. Our approach is based on the square distance matrix $D = [r_{ij}^2]$ containing all square distances between residues in proteins. This distance matrix contains more information than the contact matrix $C$, which has elements of either 0 or 1 depending on whether the distance $r_{ij}$ is greater or less than a cutoff value $r_{\mathrm{cutoff}}$. We have performed spectral decomposition of the distance matrices, $D = \sum_k \lambda_k v_k v_k^T$, in terms of eigenvalues $\lambda_k$ and the corresponding eigenvectors $v_k$, and found that it contains at most five nonzero terms. A dominant eigenvector is proportional to $r^2$, the square distance of points from the center of mass, with the next three being the principal components of the system of points. By predicting $r^2$ from the sequence we can approximate a distance matrix of a protein with an expected RMSD value of about 7.3 Å, and by combining it with the prediction of the first principal component we can improve this approximation to 4.0 Å. We can also explain the role of hydrophobic interactions for the protein structure, because r is highly correlated with the hydrophobic profile of the sequence. Moreover, r is highly correlated with several sequence profiles which are useful in protein structure prediction, such as contact number, the residue-wise contact order (RWCO) or mean square fluctuations (i.e. crystallographic temperature factors). We have also shown that the next three components are related to spatial directionality of the secondary structure elements, and they may be also predicted from the sequence, improving overall structure prediction. We have also shown that the large number of available HIV-1 protease structures provides a remarkable sampling of conformations, which can be viewed as direct structural information about the
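The claim in this record that the squared distance matrix has at most five nonzero spectral terms can be checked numerically. The sketch below uses random 3D points in place of real residue coordinates (an assumption made only for illustration), and the function names are hypothetical.

```python
import numpy as np

def squared_distance_matrix(coords):
    """D[i, j] = squared Euclidean distance between points i and j."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sum(diff ** 2, axis=-1)

def count_nonzero_eigenvalues(D, rel_tol=1e-9):
    """Count eigenvalues that are not numerically zero."""
    eigvals = np.linalg.eigvalsh(D)
    scale = np.max(np.abs(eigvals))
    return int(np.sum(np.abs(eigvals) > rel_tol * scale))

def top_k_reconstruction(D, k=5):
    """Rebuild D from its k largest-magnitude spectral terms."""
    eigvals, eigvecs = np.linalg.eigh(D)
    idx = np.argsort(np.abs(eigvals))[::-1][:k]
    return (eigvecs[:, idx] * eigvals[idx]) @ eigvecs[:, idx].T

rng = np.random.default_rng(0)
coords = rng.normal(size=(50, 3))   # stand-in for residue positions
D = squared_distance_matrix(coords)
n_nonzero = count_nonzero_eigenvalues(D)
D5 = top_k_reconstruction(D, k=5)
```

The bound follows from writing $D = s\mathbf{1}^T + \mathbf{1}s^T - 2XX^T$, where $X$ holds the coordinates and $s$ the squared norms of the points: the first two terms each have rank one and $XX^T$ has rank at most three for points in three dimensions.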

  13. Structure-based prediction of host-pathogen protein interactions.

    PubMed

    Mariano, Rachelle; Wuchty, Stefan

    2017-03-16

    The discovery, validation, and characterization of protein-based interactions from different species are crucial for translational research regarding a variety of pathogens, ranging from the malaria parasite Plasmodium falciparum to HIV-1. Here, we review recent advances in the prediction of host-pathogen protein interfaces using structural information. In particular, we observe that current methods chiefly perform machine learning on sequence and domain information to produce large sets of candidate interactions that are further assessed and pruned to generate final, highly probable sets. Structure-based studies have also emphasized the electrostatic properties and evolutionary transformations of pathogenic interfaces, supplying crucial insight into antigenic determinants and the ways pathogens compete for host protein binding. Advancements in spectroscopic and crystallographic methods complement the aforementioned techniques, permitting the rigorous study of true positives at a molecular level. Together, these approaches illustrate how protein structure on a variety of levels functions coordinately and dynamically to achieve host takeover.

  14. PROTEUS2: a web server for comprehensive protein structure prediction and structure-based annotation.

    PubMed

    Montgomerie, Scott; Cruz, Joseph A; Shrivastava, Savita; Arndt, David; Berjanskii, Mark; Wishart, David S

    2008-07-01

    PROTEUS2 is a web server designed to support comprehensive protein structure prediction and structure-based annotation. PROTEUS2 accepts either single sequences (for directed studies) or multiple sequences (for whole proteome annotation) and predicts the secondary and, if possible, tertiary structure of the query protein(s). Unlike most other tools or servers, PROTEUS2 bundles signal peptide identification, transmembrane helix prediction, transmembrane beta-strand prediction, secondary structure prediction (for soluble proteins) and homology modeling (i.e. 3D structure generation) into a single prediction pipeline. Using a combination of progressive multi-sequence alignment, structure-based mapping, hidden Markov models, multi-component neural nets and up-to-date databases of known secondary structure assignments, PROTEUS is able to achieve among the highest reported levels of predictive accuracy for signal peptides (Q2 = 94%), membrane spanning helices (Q2 = 87%) and secondary structure (Q3 score of 81.3%). PROTEUS2's homology modeling services also provide high quality 3D models that compare favorably with those generated by SWISS-MODEL and 3D JigSaw (within 0.2 Å RMSD). The average PROTEUS2 prediction takes approximately 3 min per query sequence. The PROTEUS2 server along with source code for many of its modules is accessible at http://wishart.biology.ualberta.ca/proteus2.

  15. Gene function prediction based on the Gene Ontology hierarchical structure.

    PubMed

    Cheng, Liangxi; Lin, Hongfei; Hu, Yuncui; Wang, Jian; Yang, Zhihao

    2014-01-01

    Gene Ontology annotation information is helpful in explaining life science phenomena and can provide great support for biomedical research. The use of the Gene Ontology is gradually affecting the way people store and understand bioinformatic data. To facilitate the prediction of gene functions with the aid of text mining methods and existing resources, we transform it into a multi-label top-down classification problem and develop a method that uses the hierarchical relationships in the Gene Ontology structure to relieve the quantitative imbalance of positive and negative training samples. Meanwhile, the method enhances the discriminating ability of classifiers by retaining and highlighting the key training samples. Additionally, the top-down classifier based on a tree structure takes the relationship of target classes into consideration and thus solves the incompatibility between the classification results and the Gene Ontology structure. Our experiment on the Gene Ontology annotation corpus achieves an F-value performance of 50.7% (precision: 52.7%, recall: 48.9%). The experimental results demonstrate that when the training set is small, it can be expanded via topological propagation of associated documents between the parent and child nodes in the tree structure. The top-down classification model applies to sets of texts in an ontology structure or with a hierarchical relationship.
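The reported F-value can be cross-checked against the reported precision and recall. The sketch below assumes the "F-value" is the balanced F-measure (the harmonic mean of precision and recall); the abstract does not state the weighting, but this assumption reproduces the published figure.

```python
def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

f = f_measure(0.527, 0.489)
print(f"F = {f:.1%}")  # prints F = 50.7%
```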

  16. Characterization and Prediction of Protein Flexibility Based on Structural Alphabets

    PubMed Central

    Liu, Bin

    2016-01-01

    Motivation. To assist efforts in determining and exploring the functional properties of proteins, it is desirable to characterize and predict protein flexibilities. Results. In this study, the conformational entropy is used as an indicator of the protein flexibility. We first explore whether the conformational change can capture the protein flexibility. The well-defined decoy structures are converted into one-dimensional series of letters from a structural alphabet. Four different structure alphabets, including the secondary structure in 3-class and 8-class, the PB structure alphabet (16-letter), and the DW structure alphabet (28-letter), are investigated. The conformational entropy is then calculated from the structure alphabet letters. Some of the proteins show high correlation between the conformational entropy and the protein flexibility. We then predict the protein flexibility from the basic amino acid sequence. The local structures are predicted by the dual-layer model and the conformational entropy of the predicted class distribution is then calculated. The results show that the conformational entropy is a good indicator of the protein flexibility, but false positives remain a problem. The DW structure alphabet performs the best, which means that more subtle local structures can be captured by a large number of structure alphabet letters. Overall this study provides a simple and efficient method for the characterization and prediction of the protein flexibility. PMID:27660756
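One way to compute an entropy from structure alphabet letters is sketched below. It assumes a per-position Shannon entropy over the letters observed across a set of decoys; the abstract does not give the exact formula, so this is an illustration rather than the authors' definition. The decoy strings and function names are hypothetical.

```python
import math
from collections import Counter

def positional_entropy(letters):
    """Shannon entropy (in bits) of the letters seen at one position."""
    counts = Counter(letters)
    total = len(letters)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def entropy_profile(encoded_structures):
    """Entropy at each sequence position across equal-length encodings."""
    return [positional_entropy(column) for column in zip(*encoded_structures)]

# Hypothetical decoys encoded in a 3-class secondary structure alphabet.
decoys = ["HHEC", "HHEC", "HEEC", "HCEC"]
profile = entropy_profile(decoys)
```

In this toy input only the second position varies across decoys, so it is the only position with nonzero entropy (1.5 bits); under the record's premise, that position would be flagged as the most flexible.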

  17. Prediction of silicon-based layered structures for optoelectronic applications.

    PubMed

    Luo, Wei; Ma, Yanming; Gong, Xingao; Xiang, Hongjun

    2014-11-12

    A method based on the particle swarm optimization algorithm is presented to design quasi-two-dimensional materials. With this development, various single-layer and bilayer materials of C, Si, Ge, Sn, and Pb were predicted. A new Si bilayer structure is found to have a more favored energy than the previously widely accepted configuration. Both single-layer and bilayer Si materials have small band gaps, limiting their usages in optoelectronic applications. Hydrogenation has therefore been used to tune the electronic and optical properties of Si layers. We discover two hydrogenated materials of layered Si8H2 and Si6H2 possessing quasidirect band gaps of 0.75 and 1.59 eV, respectively. Their potential applications for light-emitting diode and photovoltaics are proposed and discussed. Our study opened up the possibility of hydrogenated Si layered materials as next-generation optoelectronic devices.
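For readers unfamiliar with particle swarm optimization, a generic textbook variant is sketched below. It is not the authors' structure-search code: the toy quadratic objective stands in for a real energy evaluation, and the function names and all parameter values (inertia, acceleration coefficients, swarm size) are assumptions chosen for illustration.

```python
import numpy as np

def pso_minimize(objective, dim, n_particles=20, n_iters=100,
                 inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Minimize `objective` with a basic global-best particle swarm."""
    rng = np.random.default_rng(seed)
    pos = rng.uniform(-5.0, 5.0, size=(n_particles, dim))
    vel = np.zeros_like(pos)
    best_pos = pos.copy()                       # per-particle best positions
    best_val = np.array([objective(p) for p in pos])
    g = int(np.argmin(best_val))
    g_pos, g_val = best_pos[g].copy(), float(best_val[g])
    for _ in range(n_iters):
        r1 = rng.random(pos.shape)
        r2 = rng.random(pos.shape)
        # Velocity blends momentum, pull to own best, pull to swarm best.
        vel = (inertia * vel
               + c_personal * r1 * (best_pos - pos)
               + c_global * r2 * (g_pos - pos))
        pos = pos + vel
        vals = np.array([objective(p) for p in pos])
        improved = vals < best_val
        best_pos[improved] = pos[improved]
        best_val[improved] = vals[improved]
        g = int(np.argmin(best_val))
        if best_val[g] < g_val:
            g_pos, g_val = best_pos[g].copy(), float(best_val[g])
    return g_pos, g_val

def toy_energy(x):
    """Stand-in objective with its minimum at x = (1, 1, 1)."""
    return float(np.sum((x - 1.0) ** 2))

best_x, best_e = pso_minimize(toy_energy, dim=3)
```

In a structure search, each particle would encode candidate structural parameters and the objective would be a computed energy, which is far more expensive than this toy function.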

  18. An Efficient Scheme for Crystal Structure Prediction Based on Structural Motifs

    DOE PAGES

    Zhu, Zizhong; Wu, Ping; Wu, Shunqing; ...

    2017-05-15

    An efficient scheme based on structural motifs is proposed for the crystal structure prediction of materials. The key advantage of the present method is twofold: first, the degrees of freedom of the system are greatly reduced, since each structural motif, regardless of its size, can always be described by a set of parameters (R, θ, φ) with five degrees of freedom; second, the motifs could always appear in the predicted structures when the energies of the structures are relatively low. Both features make the present scheme a very efficient method for predicting desired materials. The method has been applied to the case of LiFePO4, an important cathode material for lithium-ion batteries. Numerous new structures of LiFePO4 have been found, compared to those currently available, demonstrating the reliability of the present methodology and illustrating the promise of the concept of structural motifs.

  19. A new hybrid coding for protein secondary structure prediction based on primary structure similarity.

    PubMed

    Li, Zhong; Wang, Jing; Zhang, Shunpu; Zhang, Qifeng; Wu, Wuming

    2017-03-16

    The coding pattern of protein can greatly affect the prediction accuracy of protein secondary structure. In this paper, a novel hybrid coding method based on the physicochemical properties of amino acids and tendency factors is proposed for the prediction of protein secondary structure. The principal component analysis (PCA) is first applied to the physicochemical properties of amino acids to construct a 3-bit-code, and then the 3 tendency factors of amino acids are calculated to generate another 3-bit-code. Two 3-bit-codes are fused to form a novel hybrid 6-bit-code. Furthermore, we make a geometry-based similarity comparison of the protein primary structure between the reference set and the test set before the secondary structure prediction. We finally use the support vector machine (SVM) to predict those amino acids which are not detected by the primary structure similarity comparison. Experimental results show that our method achieves a satisfactory improvement in accuracy in the prediction of protein secondary structure.

  20. Prediction of Silicon-Based Layered Structures for Optoelectronic Applications

    NASA Astrophysics Data System (ADS)

    Luo, Wei; Ma, Yanming; Gong, Xingao; Xiang, Hongjun; CCMG Team

    2015-03-01

    A method based on the particle swarm optimization (PSO) algorithm is presented to design quasi-two-dimensional (Q2D) materials. With this development, various single-layer and bi-layer materials in C, Si, Ge, Sn, and Pb were predicted. A new Si bi-layer structure is found to have a much more favorable energy than the previously widely accepted configuration. Both single-layer and bi-layer Si materials have small band gaps, limiting their use in optoelectronic applications. Hydrogenation has therefore been used to tune the electronic and optical properties of Si layers. We discover two hydrogenated materials of layered Si8H2 and Si6H2 possessing quasi-direct band gaps of 0.75 eV and 1.59 eV, respectively. Their potential applications for light-emitting diodes and photovoltaics are proposed and discussed. Our study opened up the possibility of hydrogenated Si layered materials as next-generation optoelectronic devices.

  1. Finite Element Based HWB Centerbody Structural Optimization and Weight Prediction

    NASA Technical Reports Server (NTRS)

    Gern, Frank H.

    2012-01-01

    This paper describes a scalable structural model suitable for Hybrid Wing Body (HWB) centerbody analysis and optimization. The geometry of the centerbody and primary wing structure is based on a Vehicle Sketch Pad (VSP) surface model of the aircraft and a FLOPS compatible parameterization of the centerbody. Structural analysis, optimization, and weight calculation are based on a Nastran finite element model of the primary HWB structural components, featuring centerbody, mid section, and outboard wing. Different centerbody designs like single bay or multi-bay options are analyzed and weight calculations are compared to current FLOPS results. For proper structural sizing and weight estimation, internal pressure and maneuver flight loads are applied. Results are presented for aerodynamic loads, deformations, and centerbody weight.

  2. Protein structure prediction provides comparable performance to crystallographic structures in docking-based virtual screening.

    PubMed

    Du, Hongying; Brender, Jeffrey R; Zhang, Jian; Zhang, Yang

    2015-01-01

    Structure-based virtual screening has largely been limited to protein targets for which either an experimental structure is available or a strongly homologous template exists so that a high-resolution model can be constructed. The performance of state-of-the-art protein structure predictions in virtual screening in systems where only weakly homologous templates are available is largely untested. Using the challenging DUD database of structural decoys, we show here that even using templates with only weak sequence homology (<30% sequence identity), structural models can be constructed by I-TASSER which achieve comparable enrichment rates to using the experimental bound crystal structure in the majority of the cases studied. For 65% of the targets, the I-TASSER models, which are constructed essentially in the apo conformations, reached 70% of the virtual screening performance of using the holo-crystal structures. A correlation was observed between the success of I-TASSER in modeling the global fold and local structures in the binding pockets of the proteins versus the relative success in virtual screening. The virtual screening performance can be further improved by the recognition of chemical features of the ligand compounds. These results suggest that the combination of structure-based docking and advanced protein structure modeling methods should be a valuable approach to large-scale drug screening and discovery studies, especially for proteins lacking crystallographic structures.

  3. Structure-Based Predictive model for Coal Char Combustion.

    SciTech Connect

    Hurt, R.; Calo, J.; Essenhigh, R.; Hadad, C.; Stanley, E.

    1997-09-24

    During the third quarter of this project, progress was made on both major technical tasks. Progress was made in the chemistry department at OSU on the calculation of thermodynamic properties for a number of model organic compounds. Modelling work was carried out at Brown to adapt a thermodynamic model of carbonaceous mesophase formation, originally applied to pitch carbonization, to the prediction of coke texture in coal combustion. This latter work makes use of the FG-DVC model of coal pyrolysis developed by Advanced Fuel Research to specify the pool of aromatic clusters that participate in the order/disorder transition. This modelling approach shows promise for the mechanistic prediction of the rank dependence of char structure and will therefore be pursued further. Crystalline ordering phenomena were also observed in a model char prepared from phenol-formaldehyde carbonized at 900 °C and 1300 °C using high-resolution TEM fringe imaging. Dramatic changes occur in the structure between 900 and 1300 °C, making this char a suitable candidate for upcoming in situ work on the hot stage TEM. Work also proceeded on molecular dynamics simulations at Boston University and on equipment modification and testing for the combustion experiments with widely varying flame types at Ohio State.

  4. Towards fully automated structure-based function prediction in structural genomics: a case study.

    PubMed

    Watson, James D; Sanderson, Steve; Ezersky, Alexandra; Savchenko, Alexei; Edwards, Aled; Orengo, Christine; Joachimiak, Andrzej; Laskowski, Roman A; Thornton, Janet M

    2007-04-13

    As the global Structural Genomics projects have picked up pace, the number of structures annotated in the Protein Data Bank as hypothetical protein or unknown function has grown significantly. A major challenge now involves the development of computational methods to assign functions to these proteins accurately and automatically. As part of the Midwest Center for Structural Genomics (MCSG) we have developed a fully automated functional analysis server, ProFunc, which performs a battery of analyses on a submitted structure. The analyses combine a number of sequence-based and structure-based methods to identify functional clues. After the first stage of the Protein Structure Initiative (PSI), we review the success of the pipeline and the importance of structure-based function prediction. As a dataset, we have chosen all structures solved by the MCSG during the 5 years of the first PSI. Our analysis suggests that two of the structure-based methods are particularly successful and provide examples of local similarity that is difficult to identify using current sequence-based methods. No one method is successful in all cases, so, through the use of a number of complementary sequence and structural approaches, the ProFunc server increases the chances that at least one method will find a significant hit that can help elucidate function. Manual assessment of the results is a time-consuming process and subject to individual interpretation and human error. We present a method based on the Gene Ontology (GO) schema using GO-slims that can allow the automated assessment of hits with a success rate approaching that of expert manual assessment.

  5. Structural kinematics based damage zone prediction in gradient structures using vibration database

    NASA Astrophysics Data System (ADS)

    Talha, Mohammad; Ashokkumar, Chimpalthradi R.

    2014-05-01

    To explore the applications of functionally graded materials (FGMs) in dynamic structures, structural kinematics-based health monitoring becomes an important problem. In this paper, the ability of the material to withstand dynamic loads is inferred from the net compressive and tensile displacements, in three dimensions, that each structural degree of freedom undergoes. These net displacements at each finite element node predict damage zones of the FGM where the material is likely to fail due to a vibration response, which is categorized according to loading condition. The damage zone prediction of a dynamically active FGM plate has been accomplished using Reddy's higher-order theory. The constituent material properties are assumed to vary in the thickness direction according to a power law. The proposed C0 finite element model (FEM) is applied to obtain net tensile and compressive displacement distributions across the structure. A plate made of Aluminum/Zirconia is considered to illustrate the structural kinematics-based health monitoring of FGMs.
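
    A power-law variation through the thickness is commonly written as P(z) = (P_c - P_m)(z/h + 1/2)^n + P_m, where z runs from -h/2 to h/2. The sketch below assumes that form, with the ceramic-rich face at z = +h/2 and the metal-rich face at z = -h/2; the abstract does not give the exact expression, and the moduli used in the example are illustrative placeholders rather than values from the paper.

```python
def fgm_property(z, h, p_ceramic, p_metal, n):
    """Effective material property at thickness coordinate z (power-law grading).

    z is measured from the mid-plane, so -h/2 <= z <= h/2.
    n is the volume-fraction exponent (n >= 0).
    """
    if not -h / 2 <= z <= h / 2:
        raise ValueError("z must lie within the plate thickness")
    ceramic_fraction = (z / h + 0.5) ** n
    return (p_ceramic - p_metal) * ceramic_fraction + p_metal


# Illustrative Young's moduli in GPa (assumed values, not from the paper).
E_CERAMIC, E_METAL = 151.0, 70.0
top = fgm_property(1.0, 2.0, E_CERAMIC, E_METAL, n=2.0)      # ceramic-rich face
bottom = fgm_property(-1.0, 2.0, E_CERAMIC, E_METAL, n=2.0)  # metal-rich face
middle = fgm_property(0.0, 2.0, E_CERAMIC, E_METAL, n=1.0)   # linear grading, mid-plane
```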

  6. Structure-Based Predictive model for Coal Char Combustion.

    SciTech Connect

    Hurt, R.; Calo, J.; Essenhigh, R.; Hadad, C.; Stanley, E.

    1997-06-25

    During the second quarter of this project, progress was made on both major technical tasks. Three parallel efforts were initiated on the modeling of carbon structural evolution. Structural ordering during carbonization was studied by a numerical simulation scheme proposed by Alan Kerstein involving molecular weight growth and rotational mobility. Work was also initiated to adapt a model of carbonaceous mesophase formation, originally developed under parallel NSF funding, to the prediction of coke texture. This latter work makes use of the FG-DVC model of coal pyrolysis developed by Advanced Fuel Research to specify the pool of aromatic clusters that participate in the order/disorder transition. Boston University has initiated molecular dynamics simulations of carbonization processes and Ohio State has begun theoretical treatment of surface reactions. Experimental work has also begun on model compound studies at Brown and on pilot-scale combustion systems with widely varying flame types at OSU. The work on mobility/growth models shows great promise and is discussed in detail in the body of the report.

  7. STRUCTURE BASED PREDICTIVE MODEL FOR COAL CHAR COMBUSTION

    SciTech Connect

    Robert Hurt; Joseph Calo; Robert Essenhigh; Christopher Hadad

    2001-06-15

    This report is part of the ongoing effort at Brown University and Ohio State University to develop structure-based models of coal combustion. A very fundamental approach is taken to the description of coal chars and their reaction processes, and the results are therefore expected to have broad applicability to the spectrum of carbon materials of interest in energy technologies. This quarter, the project was in a period of no-cost extension, and discussions were held about the end phase of the project and possible continuations. The technical tasks were essentially dormant this period, but presentations of results were made, and plans were formulated for renewed activity in the fiscal year 2001.

  8. STRUCTURE-BASED PREDICTIVE MODEL FOR COAL CHAR COMBUSTION

    SciTech Connect

    CHRISTOPHER M. HADAD; JOSEPH M. CALO; ROBERT H. ESSENHIGH; ROBERT H. HURT

    1998-06-04

    During the past quarter of this project, significant progress continued to be made on both major technical tasks. Progress was made at OSU on advancing the application of computational chemistry to oxidative attack on model polyaromatic hydrocarbons (PAHs) and graphitic structures. This work is directed at the application of quantitative ab initio molecular orbital theory to address the decomposition products and mechanisms of coal char reactivity. Previously, it was shown that the "hybrid" B3LYP method can be used to provide quantitative information concerning the stability of the corresponding radicals that arise by hydrogen atom abstraction from monocyclic aromatic rings. In the most recent quarter, these approaches have been extended to larger carbocyclic ring systems, such as coronene, in order to compare the properties of a large carbonaceous PAH to those of the smaller, monocyclic aromatic systems. It was concluded that, at least for bond dissociation energy considerations, the properties of the large PAHs can be modeled reasonably well by smaller systems. In addition to the preceding work, investigations were initiated on the interaction of selected radicals in the "radical pool" with the different types of aromatic structures. In particular, the different pathways for addition vs. abstraction to benzene and furan by H and OH radicals were examined. Thus far, the addition channel appears to be significantly favored over abstraction on both kinetic and thermochemical grounds. Experimental work at Brown University in support of the development of predictive structural models of coal char combustion was focused on elucidating the role of coal mineral matter impurities on reactivity. An "inverse" approach was used where a carbon material was doped with coal mineral matter. The carbon material was derived from a high carbon content fly ash (Fly Ash 23 from the Salem Basin Power Plant). The ash was obtained from Pittsburgh #8 coal (PSOC 1451). Doped

  9. Structure Based Predictive Model for Coal Char Combustion

    SciTech Connect

    Robert Hurt; Joseph Calo; Robert Essenhigh; Christopher Hadad

    2000-12-30

    This unique collaborative project has taken a very fundamental look at the origin of the structure and combustion reactivity of coal chars. It was a combined experimental and theoretical effort involving three universities and collaborators from universities outside the U.S. and from U.S. National Laboratories and contract research companies. The project goal was to improve our understanding of char structure and behavior by examining the fundamental chemistry of its polyaromatic building blocks. The project team investigated the elementary oxidative attack on polyaromatic systems, coupled with a study of the assembly processes that convert these polyaromatic clusters to mature carbon materials (or chars). We believe that the work done in this project has defined a powerful new science-based approach to the understanding of char behavior. The work on aromatic oxidation pathways made extensive use of computational chemistry, and was led by Professor Christopher Hadad in the Department of Chemistry at Ohio State University. Laboratory experiments on char structure, properties, and combustion reactivity were carried out at both OSU and Brown, led by Principal Investigators Joseph Calo, Robert Essenhigh, and Robert Hurt. Modeling activities were divided into two parts. First, unique models of crystal structure development were formulated by the team at Brown (PIs Hurt and Calo) with input from Boston University and significant collaboration with Dr. Alan Kerstein at Sandia and with Dr. Zhong-Ying Chen at SAIC. Secondly, new combustion models were developed and tested, led by Professor Essenhigh at OSU, Dieter Foertsch (a collaborator at the University of Stuttgart), and Professor Hurt at Brown. One product of this work is the CBK8 model of carbon burnout, which has already found practical use in CFD codes and in other numerical models of pulverized fuel combustion processes, such as EPRI's NOxLOI Predictor. The remainder of the report consists of detailed technical

  10. Evolutionary Algorithm for RNA Secondary Structure Prediction Based on Simulated SHAPE Data.

    PubMed

    Montaseri, Soheila; Ganjtabesh, Mohammad; Zare-Mirakabad, Fatemeh

    2016-01-01

    Non-coding RNAs perform a wide range of functions inside living cells that are related to their structures. Several algorithms have been proposed to predict RNA secondary structure based on minimum free energy. The low prediction accuracy of these algorithms indicates that free energy alone is not sufficient to predict the functional secondary structure. Recently, information obtained from SHAPE experiments has been shown to greatly improve the accuracy of RNA secondary structure prediction when it is added to the thermodynamic free energy as a pseudo-free energy. In this paper, a new method is proposed to predict RNA secondary structure based on both free energy and SHAPE pseudo-free energy. For each RNA sequence, a population of secondary structures is constructed and their SHAPE data are simulated. Then, an evolutionary algorithm is used to improve each structure based on both free and pseudo-free energies. Finally, the structure with the minimum sum of free and pseudo-free energies is taken as the predicted RNA secondary structure. Computationally simulating the SHAPE data for a given RNA sequence requires its secondary structure. Here, we overcome this limitation by employing a population of secondary structures. This helps us to simulate the SHAPE data for any RNA sequence and consequently improves the accuracy of RNA secondary structure prediction, as confirmed by our experiments. The source code and web server of our proposed method are freely available at http://mostafa.ut.ac.ir/ESD-Fold/.
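
    The final selection step can be sketched as follows. One widely used form of the SHAPE pseudo-free energy is m * ln(reactivity + 1) + b per nucleotide, with m = 2.6 and b = -0.8 kcal/mol often quoted as defaults; whether ESD-Fold uses exactly this form is not stated in the abstract, so treat it as an assumption. As a further simplification, the term is applied once per paired nucleotide, and the candidate dictionaries and all names are hypothetical.

```python
import math


def shape_pseudo_energy(reactivity, m=2.6, b=-0.8):
    """Pseudo-free energy (kcal/mol) for one nucleotide's SHAPE reactivity.

    Negative reactivities are treated as missing data and contribute nothing
    (an assumed convention for this sketch).
    """
    if reactivity < 0:
        return 0.0
    return m * math.log(reactivity + 1.0) + b


def total_energy(free_energy, paired_positions, reactivities):
    """Free energy plus the summed pseudo-free energy over paired positions."""
    pseudo = sum(shape_pseudo_energy(reactivities[i]) for i in paired_positions)
    return free_energy + pseudo


def select_structure(candidates, reactivities):
    """Return the candidate with the minimum sum of free and pseudo-free energy."""
    return min(
        candidates,
        key=lambda c: total_energy(c["free_energy"], c["paired"], reactivities),
    )


# Toy example: B has the lower free energy, but it pairs highly reactive
# (likely unpaired) nucleotides, so A wins once SHAPE data are included.
reactivities = [0.0, 0.0, 2.0, 2.0]
candidates = [
    {"name": "A", "free_energy": -5.0, "paired": [0, 1]},
    {"name": "B", "free_energy": -5.5, "paired": [2, 3]},
]
best = select_structure(candidates, reactivities)
```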

  11. STRUCTURE-BASED PREDICTIVE MODEL FOR COAL CHAR COMBUSTION

    SciTech Connect

    Robert H. Hurt; Eric M. Suuberg

    2000-05-03

    This report is part of the ongoing effort at Brown University and Ohio State University to develop structure-based models of coal combustion. A very fundamental approach is taken to the description of coal chars and their reaction processes, and the results are therefore expected to have broad applicability to the spectrum of carbon materials of interest in energy technologies. This quarter, our work on structure development in carbons continued. A combination of hot stage in situ and ex situ polarized light microscopy was used to identify the preferred orientation of graphene layers at gas interfaces in pitches used as carbon material precursors. The experiments show that edge-on orientation is the equilibrium state of the gas/pitch interface, implying that basal-rich surfaces have higher free energies than edge-rich surfaces in pitch. This result is in agreement with previous molecular modeling studies and TEM observations in the early stages of carbonization. The results may have important implications for the design of tailored carbons with edge-rich or basal-rich surfaces. In the computational chemistry task, we have continued our investigations into the reactivity of large aromatic rings. The role of H-atom abstraction as well as radical addition to monocyclic aromatic rings has been examined, and a manuscript is currently being revised after peer review. We have also shown that the OH radical is more effective than the H atom in the radical addition process with monocyclic rings. We have extended this analysis to H-atom and OH-radical addition to phenanthrene. Work on combustion kinetics focused on the theoretical analysis of the data previously gathered using thermogravimetric analysis.

  12. Predictive model for early math skills based on structural equations.

    PubMed

    Aragón, Estíbaliz; Navarro, José I; Aguilar, Manuel; Cerda, Gamal; García-Sedeño, Manuel

    2016-12-01

    Early math skills are determined by higher cognitive processes that are particularly important for acquiring and developing skills during a child's early education. Such processes could be a critical target for identifying students at risk for math learning difficulties. Few studies have considered the use of a structural equation method to rationalize these relations. Participating in this study were 207 preschool students ages 59 to 72 months, 108 boys and 99 girls. Performance with respect to early math skills, early literacy, general intelligence, working memory, and short-term memory was assessed. A structural equation model explaining 64.3% of the variance in early math skills was applied. Early literacy exhibited the highest statistical significance (β = 0.443, p < 0.05), followed by intelligence (β = 0.286, p < 0.05), working memory (β = 0.220, p < 0.05), and short-term memory (β = 0.213, p < 0.05). Correlations between the independent variables were also significant (p < 0.05). According to the results, cognitive variables should be included in remedial intervention programs. © 2016 Scandinavian Psychological Associations and John Wiley & Sons Ltd.

  13. STRUCTURE-BASED PREDICTIVE MODEL FOR COAL CHAR COMBUSTION

    SciTech Connect

    CHRISTOPHER M. HADAD; JOSEPH M. CALO; ROBERT H. ESSENHIGH; ROBERT H. HURT

    1999-01-13

    Significant progress continued to be made during the past reporting quarter on both major technical tasks. During the reporting period at OSU, computational investigations were conducted of addition vs. abstraction reactions of H, O(³P), and OH with monocyclic aromatic hydrocarbons. The potential energy surfaces for more than 80 unique reactions of H, O(³P), and OH with aromatic hydrocarbons were determined at the B3LYP/6-31G(d) level of theory. The calculated transition state barriers and reaction free energies indicate that the addition channel is preferred at 298 K, but that the abstraction channel becomes dominant at high temperatures. The thermodynamic preference for reactivity with aromatic hydrocarbons increases in the order O(³P) < H < OH. Abstraction from six-membered aromatic rings is more facile than abstraction from five-membered aromatic rings. However, addition to five-membered rings is thermodynamically more favorable than addition to six-membered rings. The free energies for the abstraction and addition reactions of H, O, and OH with aromatic hydrocarbons and the characteristics of the respective transition states can be used to calculate the reaction rate constants for these important combustion reactions. In experimental work at Brown University, the effect of reaction on the structural evolution of different chars (i.e., phenolic resin char and chars produced from three different coals) has been investigated in a TGA/TPD-MS system. It has been found that samples of these chars of different ages appeared to lose their "memory" of their initial structures at high burn-offs. During the reporting period, thermal desorption experiments on selected samples were conducted. These spectra show that the populations of low-temperature oxygen surface complexes, which are primarily responsible for reactivity, are more similar for the high burn-off than for the low burn-off samples of different ages; i.e., the population of active sites are more
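
    Turning a computed activation free energy into a rate constant is typically done with transition state theory. The report does not say which rate expression was used, so the sketch below assumes the standard Eyring equation, k = (k_B T / h) exp(-ΔG‡ / RT), and the barrier values in the example are hypothetical. For bimolecular reactions a standard-state concentration factor would also be needed; it is omitted here.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
PLANCK = 6.62607015e-34  # Planck constant, J*s
GAS_R = 8.314462618      # molar gas constant, J/(mol*K)


def eyring_rate_constant(delta_g_act_kj_mol, temperature_k):
    """Eyring rate constant (s^-1) from an activation free energy in kJ/mol."""
    if temperature_k <= 0:
        raise ValueError("temperature must be positive")
    prefactor = K_B * temperature_k / PLANCK
    exponent = -delta_g_act_kj_mol * 1.0e3 / (GAS_R * temperature_k)
    return prefactor * math.exp(exponent)


# Hypothetical barriers: the lower-barrier channel is faster at 298.15 K.
k_low_barrier = eyring_rate_constant(20.0, 298.15)
k_high_barrier = eyring_rate_constant(40.0, 298.15)
```

    Because the activation free energy itself depends on temperature through its entropic part, the preferred channel can switch between 298 K and combustion temperatures, which is consistent with the trend described above.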

  14. PSRna: Prediction of small RNA secondary structures based on reverse complementary folding method.

    PubMed

    Li, Jin; Xu, Chengzhen; Wang, Lei; Liang, Hong; Feng, Weixing; Cai, Zhongxi; Wang, Ying; Cong, Wang; Liu, Yunlong

    2016-08-01

    Prediction of RNA secondary structures is an important problem in computational biology and bioinformatics, since RNA secondary structures are fundamental for functional analysis of RNA molecules. However, small RNA secondary structures are scarce and few algorithms have been specifically designed for predicting the secondary structures of small RNAs. Here we propose an algorithm named "PSRna" for predicting small-RNA secondary structures using reverse complementary folding and characteristic hairpin loops of small RNAs. Unlike traditional algorithms that usually generate multi-branch loops and 5′ end self-folding, PSRna first estimates the maximum number of base pairs of RNA secondary structures using a dynamic programming algorithm, constructing a path matrix at the same time. Second, the backtracking paths are extracted from the path matrix using a backtracking algorithm, and each backtracking path represents a secondary structure. To improve accuracy, the predicted RNA secondary structures are filtered based on their free energy, and only the secondary structure with the minimum free energy is identified as the candidate secondary structure. Our experiments on real data show that the proposed algorithm is superior to two popular methods, RNAfold and RNAstructure, in terms of sensitivity, specificity and Matthews correlation coefficient (MCC).
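
    To illustrate the first step, the sketch below computes the maximum number of base pairs with a generic Nussinov-style dynamic programming recursion. It is not the PSRna algorithm: the reverse complementary folding, path matrix, and hairpin-loop constraints described above are not reproduced, and the pairing rules (Watson-Crick plus G-U wobble) and minimum loop length of 3 are assumptions.

```python
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_base_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs in `seq` (Nussinov-style DP).

    dp[i][j] holds the best pair count for the subsequence seq[i..j].
    A pair (k, j) is allowed only if at least `min_loop` bases lie between.
    """
    n = len(seq)
    if n == 0:
        return 0
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: base j is unpaired
            for k in range(i, j - min_loop):  # case 2: base j pairs with k
                if (seq[k], seq[j]) in PAIRS:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]


# A simple hairpin: three G-C pairs closing an AAA loop.
hairpin_pairs = max_base_pairs("GGGAAACCC")
```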

  15. Structure-based prediction of DNA target sites by regulatory proteins.

    PubMed

    Kono, H; Sarai, A

    1999-04-01

    Regulatory proteins play a critical role in controlling complex spatial and temporal patterns of gene expression in higher organisms, by recognizing multiple DNA sequences and regulating multiple target genes. Increasing amounts of structural data on protein-DNA complexes provide clues to the mechanism of target recognition by regulatory proteins. The analyses of the propensities of base-amino acid interactions observed in those structural data show that there is no one-to-one correspondence in the interaction, but clear preferences exist. On the other hand, the analysis of the spatial distribution of amino acids around bases shows that even those amino acids with strong base preference, such as Arg with G, are distributed in a wide space around bases. Thus, amino acids with many different geometries can form a similar type of interaction with bases. The redundancy and structural flexibility in the interaction suggest that there are no simple rules in the sequence recognition, and its prediction is not straightforward. However, the spatial distributions of amino acids around bases indicate a possibility that the structural data can be used to derive empirical interaction potentials between amino acids and bases. Such information extracted from structural databases has been successfully used to predict amino acid sequences that fold into particular protein structures. We surmised that the structures of protein-DNA complexes could be used to predict DNA target sites for regulatory proteins, because determining DNA sequences that bind to a particular protein structure should be similar to finding amino acid sequences that fold into a particular structure. Here we demonstrate that the structural data can be used to predict DNA target sequences for regulatory proteins. Pairwise potentials that determine the interaction between bases and amino acids were empirically derived from the structural data. These potentials were then used to examine the compatibility between DNA

  16. Predicting Protein Secondary Structure Using Consensus Data Mining (CDM) Based on Empirical Statistics and Evolutionary Information.

    PubMed

    Kandoi, Gaurav; Leelananda, Sumudu P; Jernigan, Robert L; Sen, Taner Z

    2017-01-01

    Predicting the secondary structure of a protein from its sequence still remains a challenging problem. The prediction accuracies remain around 80%, even across very diverse methods. Using evolutionary information and machine learning algorithms in particular has had the most impact. In this chapter, we will first define secondary structures, then we will review the Consensus Data Mining (CDM) technique based on the robust GOR algorithm and Fragment Database Mining (FDM) approach. GOR V is an empirical method utilizing a sliding window approach to model the secondary structural elements of a protein by making use of generalized evolutionary information. FDM uses data mining from experimental structure fragments, and is able to successfully predict the secondary structure of a protein by combining experimentally determined structural fragments based on sequence similarities of the fragments. The CDM method combines predictions from GOR V and FDM in a hierarchical manner to produce consensus predictions for secondary structure. In other words, if sequence fragments are not available, then it uses GOR V to make the secondary structure prediction. The online server of CDM is available at http://gor.bb.iastate.edu/cdm/ .
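
    The hierarchical fallback described above can be sketched per residue: take the FDM state where a fragment-based prediction exists, otherwise take the GOR V state. This is a simplified illustration only; the actual CDM decision rule is not given in the abstract, and the input representation (a list with None where FDM has no prediction) and the function name are assumptions.

```python
def consensus_prediction(fdm_states, gor_states):
    """Combine per-residue predictions, preferring FDM and falling back to GOR V.

    fdm_states: sequence of 'H'/'E'/'C' or None (None = no usable fragment).
    gor_states: string of 'H'/'E'/'C' with one state per residue.
    """
    if len(fdm_states) != len(gor_states):
        raise ValueError("predictions must cover the same residues")
    return "".join(
        gor if fdm is None else fdm for fdm, gor in zip(fdm_states, gor_states)
    )


# Residues 2 and 4 have no fragment coverage, so GOR V fills them in.
combined = consensus_prediction(["H", None, "E", None], "CCCC")
```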

  17. Secondary Structure Predictions for Long RNA Sequences Based on Inversion Excursions and MapReduce.

    PubMed

    Yehdego, Daniel T; Zhang, Boyu; Kodimala, Vikram K R; Johnson, Kyle L; Taufer, Michela; Leung, Ming-Ying

    2013-05-01

    Secondary structures of ribonucleic acid (RNA) molecules play important roles in many biological processes including gene expression and regulation. Experimental observations and computing limitations suggest that we can approach the secondary structure prediction problem for long RNA sequences by segmenting them into shorter chunks, predicting the secondary structures of each chunk individually using existing prediction programs, and then assembling the results to give the structure of the original sequence. The selection of cutting points is a crucial component of the segmenting step. Noting that stem-loops and pseudoknots always contain an inversion, i.e., a stretch of nucleotides followed closely by its inverse complementary sequence, we developed two cutting methods for segmenting long RNA sequences based on inversion excursions: the centered and optimized method. Each step of searching for inversions, chunking, and predictions can be performed in parallel. In this paper we use a MapReduce framework, i.e., Hadoop, to extensively explore meaningful inversion stem lengths and gap sizes for the segmentation and identify correlations between chunking methods and prediction accuracy. We show that for a set of long RNA sequences in the RFAM database, whose secondary structures are known to contain pseudoknots, our approach predicts secondary structures more accurately than methods that do not segment the sequence, when the latter predictions are possible computationally. We also show that, as sequences exceed certain lengths, some programs cannot computationally predict pseudoknots while our chunking methods can. Overall, our predicted structures still retain the accuracy level of the original prediction programs when compared with known experimental secondary structure.
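
    An inversion as defined above can be located with a direct scan: for each start position, compare a stem of fixed length with the reverse complement expected a short gap downstream. The sketch below is a minimal serial illustration, not the authors' MapReduce implementation; it assumes uppercase A/C/G/U input and strict Watson-Crick complementarity, and the function names are hypothetical.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(segment):
    """Reverse complement of an RNA segment (Watson-Crick pairs only)."""
    return "".join(COMPLEMENT[base] for base in reversed(segment))


def find_inversions(seq, stem_len, max_gap):
    """Return (start, gap) for every stem followed by its reverse complement.

    `start` is the index of the left stem; `gap` is the number of bases
    between the left stem and its reverse-complementary right stem.
    """
    hits = []
    for start in range(len(seq) - 2 * stem_len + 1):
        target = reverse_complement(seq[start:start + stem_len])
        for gap in range(max_gap + 1):
            right_start = start + stem_len + gap
            if right_start + stem_len > len(seq):
                break  # right stem would run past the end of the sequence
            if seq[right_start:right_start + stem_len] == target:
                hits.append((start, gap))
    return hits


# GGG ... CCC separated by a 3-base gap forms one inversion.
inversions = find_inversions("GGGAAACCC", stem_len=3, max_gap=3)
```

    Each start position is independent of the others, which is what makes the search easy to distribute across map tasks.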

  18. RNA secondary structure prediction based on SHAPE data in helix regions.

    PubMed

    Lotfi, Mohadeseh; Zare-Mirakabad, Fatemeh; Montaseri, Soheila

    2015-09-07

    RNA molecules play important and fundamental roles in biological processes. Frequently, the functional form of single-stranded RNA molecules requires a specific tertiary structure. Classically, RNA structure determination has mostly been accomplished by X-ray crystallography or Nuclear Magnetic Resonance approaches. These experimental methods are time consuming and expensive. In the past two decades, some computational methods and algorithms have been developed for RNA secondary structure prediction. In these algorithms, minimum free energy is known as the best criterion. However, the results of these algorithms show that minimum free energy is not a sufficient criterion to predict RNA secondary structure. These algorithms need some additional knowledge about the structure, which has to be added to the methods. Recently, it has been shown that information obtained from experimental data, called SHAPE, can greatly improve the consistency between the native and predicted RNA secondary structure. In this paper, we investigate the influence of SHAPE data on four types of RNA substructures: helices, loops, base pairs from the start and end of helices, and two base pairs from the start and end of helices. The results show that SHAPE data in helix regions can improve the prediction. We present a new method to apply SHAPE data in helix regions for finding RNA secondary structure. Finally, we compare the results of the method on a set of RNAs when predicting the minimum free energy structure using all SHAPE data, using only SHAPE data in helix regions as pseudo free energy, and using no SHAPE data (without any pseudo free energy). The results show that RNA secondary structure prediction based on considering only SHAPE data in helix regions is more successful than not considering SHAPE data and it provides competitive results in comparison with considering all SHAPE data. Copyright © 2015 Elsevier Ltd. All rights reserved.

  19. Structural features based genome-wide characterization and prediction of nucleosome organization

    PubMed Central

    2012-01-01

    Background: Nucleosome distribution along chromatin dictates genomic DNA accessibility and thus profoundly influences gene expression. However, the underlying mechanism of nucleosome formation remains elusive. Here, taking a structural perspective, we systematically explored nucleosome formation potential of genomic sequences and the effect on chromatin organization and gene expression in S. cerevisiae. Results: We analyzed twelve structural features related to flexibility, curvature and energy of DNA sequences. The results showed that some structural features such as DNA denaturation, DNA-bending stiffness, Stacking energy, Z-DNA, Propeller twist and free energy, were highly correlated with in vitro and in vivo nucleosome occupancy. Specifically, they can be classified into two classes, one positively and the other negatively correlated with nucleosome occupancy. These two kinds of structural features facilitated nucleosome binding in centromere regions and repressed nucleosome formation in the promoter regions of protein-coding genes to mediate transcriptional regulation. Based on these analyses, we integrated all twelve structural features in a model to predict more accurately nucleosome occupancy in vivo than the existing methods that mainly depend on sequence compositional features. Furthermore, we developed a novel approach, named DLaNe, that located nucleosomes by detecting peaks of structural profiles, and built a meta predictor to integrate information from different structural features. As a comparison, we also constructed a hidden Markov model (HMM) to locate nucleosomes based on the profiles of these structural features. The result showed that the meta DLaNe and HMM-based method performed better than the existing methods, demonstrating the power of these structural features in predicting nucleosome positions. Conclusions: Our analysis revealed that DNA structures significantly contribute to nucleosome organization and influence chromatin structure and gene

  20. Predicting structure and stability for RNA complexes with intermolecular loop–loop base-pairing

    PubMed Central

    Cao, Song; Xu, Xiaojun; Chen, Shi-Jie

    2014-01-01

    RNA loop–loop interactions are essential for genomic RNA dimerization and regulation of gene expression. In this article, a statistical mechanics-based computational method that predicts the structures and thermodynamic stabilities of RNA complexes with loop–loop kissing interactions is described. The method accounts for the entropy changes for the formation of loop–loop interactions, which is a notable advancement that other computational models have neglected. Benchmark tests with several experimentally validated systems show that the inclusion of the entropy parameters can indeed improve predictions for RNA complexes. Furthermore, the method can predict not only the native structures of RNA/RNA complexes but also alternative metastable structures. For instance, the model predicts that the SL1 domain of HIV-1 RNA can form two different dimer structures with similar stabilities. The prediction is consistent with experimental observation. In addition, the model predicts two different binding sites for hTR dimerization: One binding site has been experimentally proposed, and the other structure, which has a higher stability, is structurally feasible and needs further experimental validation. PMID:24751648

  1. Prediction of protein structural classes for low-similarity sequences using reduced PSSM and position-based secondary structural features.

    PubMed

    Wang, Junru; Wang, Cong; Cao, Jiajia; Liu, Xiaoqing; Yao, Yuhua; Dai, Qi

    2015-01-10

    Many efficient methods have been proposed to advance protein structural class prediction, but there are still some challenges where additional insight or technology is needed for low-similarity sequences. In this work, we devised a new prediction method for low-similarity datasets using reduced PSSM and position-based secondary structural features. We evaluated the proposed method with four experiments and compared it with the available competing prediction methods. The results indicate that the proposed method achieved the best performance among the evaluated methods, with overall accuracy 3-5% higher than the existing best-performing method. This paper also found that reduced alphabets of size 13 simplify PSSM structures efficiently while preserving their maximal information. This understanding can be used to design more powerful prediction methods for protein structural class.

  2. Relative Packing Groups in Template-Based Structure Prediction: Cooperative Effects of True Positive Constraints

    PubMed Central

    Day, Ryan; Qu, Xiaotao; Swanson, Rosemarie; Bohannan, Zach; Bliss, Robert

    2011-01-01

    Most current template-based structure prediction methods concentrate on finding the correct backbone conformation and then packing sidechains within that backbone. Our packing-based method derives distance constraints from conserved relative packing groups (RPGs). In our refinement approach, the RPGs provide a level of resolution that restrains global topology while allowing conformational sampling. In this study, we test our template-based structure prediction method using 51 prediction units from CASP7 experiments. RPG-based constraints are able to substantially improve approximately two-thirds of starting templates. Upon deeper investigation, we find that true positive spatial constraints, especially those non-local in sequence, derived from the RPGs were important to building nearer native models. Surprisingly, the fraction of incorrect or false positive constraints does not strongly influence the quality of the final candidate. This result indicates that our RPG-based true positive constraints sample the self-consistent, cooperative interactions of the native structure. The lack of such reinforcing cooperativity explains the weaker effect of false positive constraints. Generally, these findings are encouraging indications that RPGs will improve template-based structure prediction. PMID:21210729

  3. Relative packing groups in template-based structure prediction: cooperative effects of true positive constraints.

    PubMed

    Day, Ryan; Qu, Xiaotao; Swanson, Rosemarie; Bohannan, Zach; Bliss, Robert; Tsai, Jerry

    2011-01-01

    Most current template-based structure prediction methods concentrate on finding the correct backbone conformation and then packing sidechains within that backbone. Our packing-based method derives distance constraints from conserved relative packing groups (RPGs). In our refinement approach, the RPGs provide a level of resolution that restrains global topology while allowing conformational sampling. In this study, we test our template-based structure prediction method using 51 prediction units from CASP7 experiments. RPG-based constraints are able to substantially improve approximately two-thirds of starting templates. Upon deeper investigation, we find that true positive spatial constraints, especially those non-local in sequence, derived from the RPGs were important to building nearer native models. Surprisingly, the fraction of incorrect or false positive constraints does not strongly influence the quality of the final candidate. This result indicates that our RPG-based true positive constraints sample the self-consistent, cooperative interactions of the native structure. The lack of such reinforcing cooperativity explains the weaker effect of false positive constraints. Generally, these findings are encouraging indications that RPGs will improve template-based structure prediction.

  4. Antibody structure determination using a combination of homology modeling, energy-based refinement, and loop prediction.

    PubMed

    Zhu, Kai; Day, Tyler; Warshaviak, Dora; Murrett, Colleen; Friesner, Richard; Pearlman, David

    2014-08-01

    We present the blinded prediction results in the Second Antibody Modeling Assessment (AMA-II) using a fully automatic antibody structure prediction method implemented in the programs BioLuminate and Prime. We have developed a novel knowledge based approach to model the CDR loops, using a combination of sequence similarity, geometry matching, and the clustering of database structures. The homology models are further optimized with a physics-based energy function (VSGB2.0), which improves the model quality significantly. H3 loop modeling remains the most challenging task. Our ab initio loop prediction performs well for the H3 loop in the crystal structure context, and allows improved results when refining the H3 loops in the context of homology models. For the 10 human and mouse derived antibodies in this assessment, the average RMSDs for the homology model Fv and framework regions are 1.19 Å and 0.74 Å, respectively. The average RMSDs for five non-H3 CDR loops range from 0.61 Å to 1.05 Å, and the H3 loop average RMSD is 2.91 Å using our knowledge-based loop prediction approach. The ab initio H3 loop predictions yield an average RMSD of 1.28 Å when performed in the context of the crystal structure and 2.67 Å in the context of the homology modeled structure. Notably, our method for predicting the H3 loop in the crystal structure environment ranked first among the seven participating groups in AMA-II, and our method made the best prediction among all participants for seven of the ten targets.

  5. Why Is There a Glass Ceiling for Threading Based Protein Structure Prediction Methods?

    PubMed

    Skolnick, Jeffrey; Zhou, Hongyi

    2017-04-20

    Despite their different implementations, comparison of the best threading approaches to the prediction of evolutionary distant protein structures reveals that they tend to succeed or fail on the same protein targets. This is true despite the fact that the structural template library has good templates for all cases. Thus, a key question is why are certain protein structures threadable while others are not. Comparison with threading results on a set of artificial sequences selected for stability further argues that the failure of threading is due to the nature of the protein structures themselves. Using a new contact map based alignment algorithm, we demonstrate that certain folds are highly degenerate in that they can have very similar coarse grained fractions of native contacts aligned and yet differ significantly from the native structure. For threadable proteins, this is not the case. Thus, contemporary threading approaches appear to have reached a plateau, and new approaches to structure prediction are required.

  6. A novel method for structure-based prediction of ion channel conductance properties.

    PubMed Central

    Smart, O S; Breed, J; Smith, G R; Sansom, M S

    1997-01-01

    A rapid and easy-to-use method of predicting the conductance of an ion channel from its three-dimensional structure is presented. The method combines the pore dimensions of the channel as measured in the HOLE program with an Ohmic model of conductance. An empirically based correction factor is then applied. The method yielded good results for six experimental channel structures (none of which were included in the training set) with predictions accurate to within an average factor of 1.62 to the true values. The predictive r² was equal to 0.90, which is indicative of a good predictive ability. The procedure is used to validate model structures of alamethicin and phospholamban. Two genuine predictions for the conductance of channels with known structure but without reported conductances are given. A modification of the procedure that calculates the expected results for the effect of the addition of nonelectrolyte polymers on conductance is set out. Results for a cholera toxin B-subunit crystal structure agree well with the measured values. The difficulty in interpreting such studies is discussed, with the conclusion that measurements on channels of known structure are required. PMID:9138559
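
    The Ohmic model mentioned above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it treats the pore as a stack of thin electrolyte-filled slices, so the resistance is the integral of resistivity / (pi * r(z)^2) along the pore axis and the conductance is its reciprocal. The function names, the trapezoidal integration, the example resistivity, and the way the correction is applied (a simple multiplicative `scale`) are all assumptions made for the example.

```python
import math


def ohmic_conductance(z, radius, resistivity):
    """Conductance (siemens) of a pore with radius profile radius[i] at z[i].

    z and radius are in metres, resistivity in ohm-metres.
    Resistance is integrated slice by slice with the trapezoidal rule.
    """
    if len(z) != len(radius) or len(z) < 2:
        raise ValueError("z and radius must have equal length >= 2")
    resistance = 0.0
    for i in range(len(z) - 1):
        dz = z[i + 1] - z[i]
        # Resistance per unit length at both ends of the slice.
        left = resistivity / (math.pi * radius[i] ** 2)
        right = resistivity / (math.pi * radius[i + 1] ** 2)
        resistance += 0.5 * (left + right) * dz
    return 1.0 / resistance


def apply_correction(g_ohmic, scale):
    """Apply a hypothetical empirical scale factor (assumed multiplicative)."""
    return g_ohmic * scale


if __name__ == "__main__":
    # Hypothetical cylinder: 0.5 nm radius, 5 nm long, resistivity 0.1 ohm-m.
    z = [i * 1e-9 for i in range(6)]
    r = [0.5e-9] * 6
    print(ohmic_conductance(z, r, 0.1))
```

    A radius profile such as the one HOLE reports could be passed in as `z` and `radius`; the uniform cylinder here is only a check case, since its conductance has the closed form pi * r^2 / (resistivity * length).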

  7. Effect of using suboptimal alignments in template-based protein structure prediction.

    PubMed

    Chen, Hao; Kihara, Daisuke

    2011-01-01

    Computational protein structure prediction remains a challenging task in protein bioinformatics. In recent years, the importance of template-based structure prediction has been increasing because of the growing number of protein structures solved by structural genomics projects. To capitalize on the significant effort and investment made in structural genomics projects, it is urgent to establish effective ways to use the solved structures as templates by developing methods for exploiting remotely related proteins that cannot be identified simply by homology. In this work, we examine the effect of using suboptimal alignments in template-based protein structure prediction. We show that suboptimal alignments are often more accurate than the optimal one, and that such accurate suboptimal alignments can occur even at a very low rank of the alignment score. Suboptimal alignments contain a significant number of correct amino acid residue contacts. Moreover, suboptimal alignments can improve template-based models when used as input to Modeller. Finally, we use suboptimal alignments for handling a contact potential in a probabilistic way in a threading program, SUPRB. The probabilistic contacts strategy outperforms the partly thawed approach, which only uses the optimal alignment in defining residue contacts, and also the re-ranking strategy, which uses the contact potential in re-ranking alignments. The comparison with existing methods in the template-recognition test shows that SUPRB is very competitive and outperforms existing methods. © 2010 Wiley-Liss, Inc.

  8. Effect of Using Suboptimal Alignments in Template-Based Protein Structure Prediction

    PubMed Central

    Chen, Hao; Kihara, Daisuke

    2010-01-01

    Computational protein structure prediction remains a challenging task in protein bioinformatics. In recent years, the importance of template-based structure prediction has been increasing due to the growing number of protein structures solved by structural genomics projects. To capitalize on the significant effort and investment made in structural genomics projects, it is urgent to establish effective ways to use the solved structures as templates by developing methods for exploiting remotely related proteins that cannot be identified simply by homology. In this work, we examine the effect of employing suboptimal alignments in template-based protein structure prediction. We show that suboptimal alignments are often more accurate than the optimal one, and that such accurate suboptimal alignments can occur even at a very low rank of the alignment score. Suboptimal alignments contain a significant number of correct amino acid residue contacts. Moreover, suboptimal alignments can improve template-based models when used as input to Modeller. Finally, we employ suboptimal alignments for handling a contact potential in a probabilistic way in a threading program, SUPRB. The probabilistic contacts strategy outperforms the partly thawed approach, which only uses the optimal alignment in defining residue contacts, and also the reranking strategy, which uses the contact potential in reranking alignments. The comparison with existing methods in the template-recognition test shows that SUPRB is very competitive and outperforms existing methods. PMID:21058297

  9. Prediction of protein secondary structure using probability based features and a hybrid system.

    PubMed

    Ghanty, Pradip; Pal, Nikhil R; Mudi, Rajani K

    2013-10-01

    In this paper, we propose some co-occurrence probability-based features for prediction of protein secondary structure. The features are extracted using occurrence/nonoccurrence of secondary structures in the protein sequences. We explore two types of features: position-specific (based on position of amino acid on fragments of protein sequences) as well as position-independent (independent of amino acid position on fragments of protein sequences). We use a hybrid system, NEUROSVM, consisting of neural networks and support vector machines for classification of secondary structures. We propose two schemes NSVMps and NSVM for protein secondary structure prediction. The NSVMps uses position-specific probability-based features and NEUROSVM classifier whereas NSVM uses the same classifier with position-independent probability-based features. The proposed method falls in the single-sequence category of methods because it does not use any sequence profile information such as position specific scoring matrices (PSSM) derived from PSI-BLAST. Two widely used datasets RS126 and CB513 are used in the experiments. The results obtained using the proposed features and NEUROSVM classifier are better than most of the existing single-sequence prediction methods. Most importantly, the results using NSVMps that are obtained using lower dimensional features, are comparable to those by other existing methods. The NSVMps and NSVM are finally tested on target proteins of the critical assessment of protein structure prediction experiment-9 (CASP9). A larger dataset is used to compare the performance of the proposed methods with that of two recent single-sequence prediction methods. We also investigate the impact of presence of different amino acid residues (in protein sequences) that are responsible for the formation of different secondary structures.

  10. An effective structure prediction method for layered materials based on 2D particle swarm optimization algorithm.

    PubMed

    Wang, Yanchao; Miao, Maosheng; Lv, Jian; Zhu, Li; Yin, Ketao; Liu, Hanyu; Ma, Yanming

    2012-12-14

    A structure prediction method for layered materials based on a two-dimensional (2D) particle swarm optimization algorithm is developed. The relaxation of atoms in the perpendicular direction within a given range is allowed. Additional techniques including structural similarity determination, symmetry constraint enforcement, and discretization of structure constructions based on space gridding are implemented and demonstrated to significantly improve the global structural search efficiency. Our method is successful in predicting the structures of known 2D materials, including single-layer and multi-layer graphene, 2D boron nitride (BN) compounds, and some quasi-2D group 6 (VIB) metal chalcogenides. Furthermore, by use of this method, we predict a new family of mono-layered boron nitride structures with different chemical compositions. The first-principles electronic structure calculations reveal that the band gap of these N-rich BN systems can be tuned from 5.40 eV to 2.20 eV by adjusting the composition.
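
    For readers unfamiliar with particle swarm optimization, the generic update rule behind such searches can be sketched as below. This is a textbook PSO on a toy objective, not the 2D structure-search method of the abstract: there is no crystal structure generation, symmetry constraint, or energy calculation here. The function name, parameter values, and the quadratic test objective are assumptions chosen for illustration.

```python
import random


def pso_minimize(objective, dim, bounds, n_particles=20, n_iter=100,
                 inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Minimise `objective` over a box using a basic particle swarm."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                  # best position seen per particle
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # best position seen by the swarm
    history = [gbest_val]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity blends momentum, pull to personal best, pull to global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (pbest[i][d] - pos[i][d])
                             + c_global * r2 * (gbest[d] - pos[i][d]))
                # Move and clamp to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
        history.append(gbest_val)
    return gbest, gbest_val, history


def sphere(x):
    """Toy objective with its minimum at the origin."""
    return sum(v * v for v in x)


if __name__ == "__main__":
    best, best_val, _ = pso_minimize(sphere, dim=2, bounds=(-5.0, 5.0))
    print(best, best_val)
```

    In a structure search the objective would be an energy evaluated for a candidate structure, and each particle would encode lattice and atomic coordinates; the fixed seed here only makes the toy run reproducible.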

  11. High-throughput imaging-based nephrotoxicity prediction for xenobiotics with diverse chemical structures.

    PubMed

    Su, Ran; Xiong, Sijing; Zink, Daniele; Loo, Lit-Hsin

    2016-11-01

    The kidney is a major target for xenobiotics, which include drugs, industrial chemicals, environmental toxicants and other compounds. Accurate methods for screening large numbers of potentially nephrotoxic xenobiotics with diverse chemical structures are currently not available. Here, we describe an approach for nephrotoxicity prediction that combines high-throughput imaging of cultured human renal proximal tubular cells (PTCs), quantitative phenotypic profiling, and machine learning methods. We automatically quantified 129 image-based phenotypic features, and identified chromatin and cytoskeletal features that can predict the human in vivo PTC toxicity of 44 reference compounds with ~82 % (primary PTCs) or 89 % (immortalized PTCs) test balanced accuracies. Surprisingly, our results also revealed that a DNA damage response is commonly induced by different PTC toxicants that have diverse chemical structures and injury mechanisms. Together, our results show that human nephrotoxicity can be predicted with high efficiency and accuracy by combining cell-based and computational methods that are suitable for automation.

  12. FINDSITE: a combined evolution/structure-based approach to protein function prediction

    PubMed Central

    Brylinski, Michal

    2009-01-01

    A key challenge of the post-genomic era is the identification of the function(s) of all the molecules in a given organism. Here, we review the status of sequence and structure-based approaches to protein function inference and ligand screening that can provide functional insights for a significant fraction of the ∼50% of ORFs of unassigned function in an average proteome. We then describe FINDSITE, a recently developed algorithm for ligand binding site prediction, ligand screening and molecular function prediction, which is based on binding site conservation across evolutionary distant proteins identified by threading. Importantly, FINDSITE gives comparable results when high-resolution experimental structures as well as predicted protein models are used. PMID:19324930

  13. Interaction prediction in structure-based virtual screening using deep learning.

    PubMed

    Gonczarek, Adam; Tomczak, Jakub M; Zaręba, Szymon; Kaczmar, Joanna; Dąbrowski, Piotr; Walczak, Michał J

    2017-09-14

    We introduce a deep learning architecture for structure-based virtual screening that generates fixed-sized fingerprints of proteins and small molecules by applying learnable atom convolution and softmax operations to each molecule separately. These fingerprints are further non-linearly transformed, their inner product is calculated and used to predict the binding potential. Moreover, we show that widely used benchmark datasets may be insufficient for testing structure-based virtual screening methods that utilize machine learning. Therefore, we introduce a new benchmark dataset, which we constructed based on DUD-E, MUV and PDBBind databases. Copyright © 2017 Elsevier Ltd. All rights reserved.

  14. SVM-PB-Pred: SVM based protein block prediction method using sequence profiles and secondary structures.

    PubMed

    Suresh, V; Parthasarathy, S

    2014-01-01

    We developed a support vector machine-based web server, SVM-PB-Pred, to predict the Protein Block for any given amino acid sequence. The input features of SVM-PB-Pred include (i) sequence profiles (PSSM) and (ii) actual secondary structures (SS) from the DSSP method or predicted secondary structures from the NPS@ and GOR4 methods. Three combined input features, PSSM+SS(DSSP), PSSM+SS(NPS@) and PSSM+SS(GOR4), were used to train and test the SVM models. Similarly, four datasets, RS90, DB433, LI1264 and SP1577, were used to develop the SVM models. The four SVM models were tested using three benchmarking tests: (i) self-consistency, (ii) seven-fold cross-validation and (iii) an independent case test. The maximum prediction accuracy of ~70% was observed in the self-consistency test for the SVM models of both the LI1264 and SP1577 datasets, where the PSSM+SS(DSSP) input features were used for testing. The prediction accuracies were reduced to ~53% for PSSM+SS(NPS@) and ~43% for PSSM+SS(GOR4) in the independent case test for the SVM models of the same two datasets. Using our method, it is possible to predict the protein block letters for any query protein sequence with ~53% accuracy when the SP1577 dataset and predicted secondary structure from the NPS@ server are used. The SVM-PB-Pred server can be freely accessed through http://bioinfo.bdu.ac.in/~svmpbpred.

  15. A protein structural classes prediction method based on PSI-BLAST profile.

    PubMed

    Ding, Shuyan; Yan, Shoujiang; Qi, Shuhua; Li, Yan; Yao, Yuhua

    2014-07-21

    Knowledge of protein structural classes plays an important role in understanding protein folding patterns. Prediction of protein structural class based solely on sequence data remains a challenging problem. In this study, we extract the long-range correlation information and linear correlation information from the position-specific score matrix (PSSM). A total of 3600 features are extracted, and 278 features are then selected by a filter feature selection method based on the 1189 dataset. To verify the performance of our method (named LCC-PSSM), jackknife tests are performed on three widely used low-similarity benchmark datasets. Comparison of our results with the existing methods shows that our method provides favorable performance for protein structural class prediction. A stand-alone version of the proposed method (LCC-PSSM) is written in MATLAB and can be downloaded from http://bioinfo.zstu.edu.cn/LCC-PSSM/.

  16. Prediction of Protein Structural Class Based on Gapped-Dipeptides and a Recursive Feature Selection Approach.

    PubMed

    Liu, Taigang; Qin, Yufang; Wang, Yongjie; Wang, Chunhua

    2015-12-24

    The prior knowledge of protein structural class may offer useful clues on understanding its functionality as well as its tertiary structure. Though various significant efforts have been made to find a fast and effective computational approach to address this problem, it is still a challenging topic in the field of bioinformatics. The position-specific score matrix (PSSM) profile has been shown to provide a useful source of information for improving the prediction performance of protein structural class. However, this information has not been adequately explored. To this end, in this study, we present a feature extraction technique which is based on gapped-dipeptides composition computed directly from PSSM. Then, a careful feature selection technique is performed based on support vector machine-recursive feature elimination (SVM-RFE). These optimal features are selected to construct a final predictor. The results of jackknife tests on four working datasets show that our method obtains satisfactory prediction accuracies by extracting features solely based on PSSM and could serve as a very promising tool to predict protein structural class.

  17. Optimized distance-dependent atom-pair-based potential DOOP for protein structure prediction.

    PubMed

    Chae, Myong-Ho; Krull, Florian; Knapp, Ernst-Walter

    2015-05-01

    The DOcking decoy-based Optimized Potential (DOOP) energy function for protein structure prediction is based on empirical distance-dependent atom-pair interactions. To optimize the atom-pair interactions, native protein structures are decomposed into polypeptide chain segments that correspond to structural motifs involving complete secondary structure elements. They constitute near-native ligand-receptor systems (or just pairs). Thus, a total of 8609 ligand-receptor systems were prepared from 954 selected proteins. For each of these hypothetical ligand-receptor systems, 1000 evenly sampled docking decoys with 0-10 Å interface root-mean-square-deviation (iRMSD) were generated with a method used before for protein-protein docking. A neural network-based optimization method was applied to derive the optimized energy parameters using these decoys so that the energy function mimics the funnel-like energy landscape for the interaction between these hypothetical ligand-receptor systems. Thus, our method hierarchically models the overall funnel-like energy landscape of native protein structures. The resulting energy function was tested on several commonly used decoy sets for native protein structure recognition and compared with other statistical potentials. In combination with a torsion potential term which describes the local conformational preference, the atom-pair-based potential outperforms other reported statistical energy functions in correct ranking of native protein structures for a variety of decoy sets. This is especially the case for the most challenging ROSETTA decoy set, although it does not take into account side chain orientation-dependence explicitly. The DOOP energy function for protein structure prediction, the underlying database of protein structures with hypothetical ligand-receptor systems and their decoys are freely available at http://agknapp.chemie.fu-berlin.de/doop/. © 2015 Wiley Periodicals, Inc.

  18. Prediction of reversible disulfide based on features from local structural signatures.

    PubMed

    Sun, Ming-An; Wang, Yejun; Zhang, Qing; Xia, Yiji; Ge, Wei; Guo, Dianjing

    2017-04-04

    Disulfide bonds are traditionally considered to play only structural roles. In recent years, increasing evidence suggests that the disulfide proteome is made up of structural disulfides and reversible disulfides. Unlike structural disulfides, reversible disulfides usually have important functional roles and may serve as redox switches. Interestingly, only specific disulfide bonds are reversible while others are not. However, whether reversible disulfides can be predicted based on structural information remains largely unknown. In this study, two datasets with both types of disulfides were compiled using independent approaches. By comparison of various features extracted from the local structural signatures, we identified several features that differ significantly between reversible and structural disulfides, including disulfide bond length, along with the number, amino acid composition, secondary structure and physical-chemical properties of surrounding amino acids. An SVM-based classifier was developed for predicting reversible disulfides. By 10-fold cross-validation, the model achieved accuracy of 0.750, sensitivity of 0.352, specificity of 0.953, MCC of 0.405 and AUC of 0.751 using the RevSS_PDB dataset. The robustness was further validated by using RevSS_RedoxDB as an independent testing dataset. This model was applied to proteins with known structures in the PDB database. The results show that one third of the predicted reversible disulfide-containing proteins are well-known redox enzymes, while the remaining are non-enzyme proteins. Given that reversible disulfides are frequently reported from functionally important non-enzyme proteins such as transcription factors, the predictions may provide valuable candidates of novel reversible disulfides for further experimental investigation. This study provides the first comparative analysis between the reversible and the structural disulfides. Distinct features remarkably different between these two
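
    The accuracy, sensitivity, specificity and MCC figures quoted in this abstract are standard functions of a binary confusion matrix. A minimal sketch of how they are computed is given below; the function name and the example counts are hypothetical and are not taken from the study (AUC is omitted because it needs ranked scores rather than counts).

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return accuracy, sensitivity, specificity and MCC from confusion counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)      # true positive rate (recall)
    specificity = tn / (tn + fp)      # true negative rate
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal count is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(binary_metrics(tp=50, fp=10, tn=90, fn=50))
```

    A low sensitivity paired with a high specificity, as reported above, means the classifier rarely mislabels structural disulfides as reversible but misses many truly reversible ones; MCC summarises both in a single value that is robust to class imbalance.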

  19. Protein subcellular localization prediction based on compartment-specific features and structure conservation

    PubMed Central

    Su, Emily Chia-Yu; Chiu, Hua-Sheng; Lo, Allan; Hwang, Jenn-Kang; Sung, Ting-Yi; Hsu, Wen-Lian

    2007-01-01

    Background: Protein subcellular localization is crucial for genome annotation, protein function prediction, and drug discovery. Determination of subcellular localization using experimental approaches is time-consuming; thus, computational approaches become highly desirable. Extensive studies of localization prediction have led to the development of several methods including composition-based and homology-based methods. However, their performance might be significantly degraded if homologous sequences are not detected. Moreover, methods that integrate various features could suffer from the problem of low coverage in high-throughput proteomic analyses due to the lack of information to characterize unknown proteins. Results: We propose a hybrid prediction method for Gram-negative bacteria that combines a one-versus-one support vector machines (SVM) model and a structural homology approach. The SVM model comprises a number of binary classifiers, in which biological features derived from Gram-negative bacteria translocation pathways are incorporated. In the structural homology approach, we employ secondary structure alignment for structural similarity comparison and assign the known localization of the top-ranked protein as the predicted localization of a query protein. The hybrid method achieves overall accuracy of 93.7% and 93.2% using ten-fold cross-validation on the benchmark data sets. On the evaluation data sets, our method also attains a prediction accuracy of 84.0%, especially when testing on sequences with a low level of homology to the training data. A three-way data split procedure is also incorporated to prevent overestimation of the predictive performance. In addition, we show that the prediction accuracy should be approximately 85% for non-redundant data sets of sequence identity less than 30%. Conclusion: Our results demonstrate that biological features derived from Gram-negative bacteria translocation pathways yield a significant

  20. An Energy Based Fatigue Life Prediction Framework for In-Service Structural Components

    SciTech Connect

    H. Ozaltun; M. H.H. Shen; T. George; C. Cross

    2011-06-01

    An energy-based fatigue life prediction framework has been developed for calculation of the remaining fatigue life of in-service gas turbine materials. The purpose of the life prediction framework is to account for the aging effect caused by cyclic loading on the fatigue strength of gas turbine engine structural components, which are usually designed for very long life. Previous studies indicate that the total strain energy dissipated during a monotonic fracture process and a cyclic process is a material property that can be determined by measuring the area underneath the monotonic true stress-strain curve and the sum of the area within each hysteresis loop in the cyclic process, respectively. The energy-based fatigue life prediction framework consists of the following entities: (1) development of a testing procedure to achieve plastic energy dissipation per life cycle and (2) incorporation of an energy-based fatigue life calculation scheme to determine the remaining fatigue life of in-service gas turbine materials. The accuracy of the remaining fatigue life prediction method was verified by comparison between model approximation and experimental results of Aluminum 6061-T6. The comparison shows promising agreement, thus validating the capability of the framework to produce accurate fatigue life predictions.
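
    The energy balance described above can be illustrated with a short sketch. It assumes, for simplicity, that the energy dissipated per cycle is constant, so that the number of cycles to failure is the monotonic strain energy divided by the hysteresis energy per cycle; the framework in the abstract may treat the per-cycle energy differently. The function names and all numerical values are hypothetical.

```python
def strain_energy_density(strain, stress):
    """Area under a stress-strain curve by the trapezoidal rule.

    With stress in MPa and dimensionless strain, the result is in MJ/m^3.
    """
    if len(strain) != len(stress) or len(strain) < 2:
        raise ValueError("strain and stress must have equal length >= 2")
    area = 0.0
    for i in range(len(strain) - 1):
        area += 0.5 * (stress[i] + stress[i + 1]) * (strain[i + 1] - strain[i])
    return area


def cycles_to_failure(w_monotonic, w_per_cycle):
    """Cycles needed for the hysteresis energy to sum to the monotonic energy."""
    return w_monotonic / w_per_cycle


def remaining_cycles(w_monotonic, w_per_cycle, cycles_elapsed):
    """Remaining life after `cycles_elapsed`, assuming constant energy per cycle."""
    return max(0.0, cycles_to_failure(w_monotonic, w_per_cycle) - cycles_elapsed)


if __name__ == "__main__":
    # Hypothetical monotonic curve and hysteresis energy, for illustration only.
    w_total = strain_energy_density([0.0, 0.1, 0.2], [0.0, 100.0, 100.0])
    print(remaining_cycles(w_total, w_per_cycle=0.003, cycles_elapsed=2000))
```

    In practice the per-cycle energy would itself come from integrating a measured hysteresis loop, which could reuse the same trapezoidal routine on the loading and unloading branches.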

  1. Genetic programming based quantitative structure-retention relationships for the prediction of Kovats retention indices.

    PubMed

    Goel, Purva; Bapat, Sanket; Vyas, Renu; Tambe, Amruta; Tambe, Sanjeev S

    2015-11-13

    The development of quantitative structure-retention relationships (QSRR) aims at constructing an appropriate linear/nonlinear model for the prediction of the retention behavior (such as Kovats retention index) of a solute on a chromatographic column. Commonly, multi-linear regression and artificial neural networks are used in QSRR development for gas chromatography (GC). In this study, an artificial intelligence-based data-driven modeling formalism, namely genetic programming (GP), has been introduced for the development of quantitative structure-based models predicting Kovats retention indices (KRI). The novelty of the GP formalism is that, given an example dataset, it searches and optimizes both the form (structure) and the parameters of an appropriate linear/nonlinear data-fitting model. Thus, it is not necessary to pre-specify the form of the data-fitting model in GP-based modeling. These models are also less complex, simple to understand, and easy to deploy. The effectiveness of GP in constructing QSRRs has been demonstrated by developing models predicting KRIs of light hydrocarbons (case study-I) and adamantane derivatives (case study-II). In each case study, two-, three- and four-descriptor models have been developed using the KRI data available in the literature. The results of these studies clearly indicate that the GP-based models possess an excellent KRI prediction accuracy and generalization capability. Specifically, the best performing four-descriptor models in both case studies have yielded high (>0.9) values of the coefficient of determination (R²) and low values of root mean squared error (RMSE) and mean absolute percent error (MAPE) for training, test and validation set data. The characteristic feature of this study is that it introduces a practical and effective GP-based method for developing QSRRs in gas chromatography that can be gainfully utilized for developing other types of data-driven models in chromatography science.
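
    The three goodness-of-fit measures named in this abstract have standard definitions, sketched below for reference. The function name and the sample retention-index values are hypothetical and are not data from the study; the coefficient of determination is computed here as 1 - SS_res / SS_tot, which is one common convention.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R^2, RMSE and MAPE (percent) for paired observations."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)          # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    # MAPE is undefined when any true value is zero.
    mape = 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / n
    return {"r2": r2, "rmse": rmse, "mape": mape}


if __name__ == "__main__":
    # Hypothetical retention indices, for illustration only.
    print(regression_metrics([100.0, 200.0, 300.0], [110.0, 190.0, 300.0]))
```

    Reporting these separately for training, test and validation sets, as the abstract describes, is what distinguishes fit quality from generalization.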

  2. Machine-learning scoring functions to improve structure-based binding affinity prediction and virtual screening.

    PubMed

    Ain, Qurrat Ul; Aleksandrova, Antoniya; Roessler, Florian D; Ballester, Pedro J

    2015-01-01

    Docking tools to predict whether and how a small molecule binds to a target can be applied if a structural model of such target is available. The reliability of docking depends, however, on the accuracy of the adopted scoring function (SF). Despite intense research over the years, improving the accuracy of SFs for structure-based binding affinity prediction or virtual screening has proven to be a challenging task for any class of method. New SFs based on modern machine-learning regression models, which do not impose a predetermined functional form and thus are able to exploit effectively much larger amounts of experimental data, have recently been introduced. These machine-learning SFs have been shown to outperform a wide range of classical SFs at both binding affinity prediction and virtual screening. The emerging picture from these studies is that the classical approach of using linear regression with a small number of expert-selected structural features can be strongly improved by a machine-learning approach based on nonlinear regression allied with comprehensive data-driven feature selection. Furthermore, the performance of classical SFs does not grow with larger training datasets and hence this performance gap is expected to widen as more training data becomes available in the future. Other topics covered in this review include predicting the reliability of a SF on a particular target class, generating synthetic data to improve predictive performance and modeling guidelines for SF development. WIREs Comput Mol Sci 2015, 5:405-424. doi: 10.1002/wcms.1225

  3. Prediction of atomic structure of Pt-based bimetallic nanoalloys by using genetic algorithm

    NASA Astrophysics Data System (ADS)

    Oh, Jung Soo; Nam, Ho-Seok; Choi, Jung-Hae; Lee, Seung-Cheol

    2013-05-01

    The atom arrangements in Pt-based bimetallic nanoalloys were predicted by the combined use of a genetic algorithm (GA) and molecular dynamics (MD) simulations. The nanoparticles of these nanoalloys were assumed to be 3.5 nm-diameter truncated octahedra with Pt and noble metals at a fixed composition ratio of 1:1. For the GA, a Python code coupled to an MD method that uses embedded atom method inter-atomic potentials was developed for the prediction of the atom arrangements in these bimetallic nanoalloys. The GA calculation successfully predicted core-shell structures for both the Pt-Ag and Pt-Au nanoalloys, but an onion-like multilayered core-shell structure for the Pt-Cu nanoalloy. These structural characteristics were attributed mainly to differences in surface energy and cohesive energy between Pt and the other alloying metal, as well as to their miscibility gap. The prediction performance was also analyzed to show the superior searching ability of the GA.

  4. Automated protein motif generation in the structure-based protein function prediction tool ProMOL.

    PubMed

    Osipovitch, Mikhail; Lambrecht, Mitchell; Baker, Cameron; Madha, Shariq; Mills, Jeffrey L; Craig, Paul A; Bernstein, Herbert J

    2015-12-01

    ProMOL, a plugin for the PyMOL molecular graphics system, is a structure-based protein function prediction tool. ProMOL includes a set of routines for building motif templates that are used for screening query structures for enzyme active sites. Previously, each motif template was generated manually and required supervision in the optimization of parameters for sensitivity and selectivity. We developed an algorithm and workflow for the automation of motif building and testing routines in ProMOL. The algorithm uses a set of empirically derived parameters for optimization and requires little user intervention. The automated motif generation algorithm was first tested in a performance comparison with a set of manually generated motifs based on identical active sites from the same 112 PDB entries. The two sets of motifs were equally effective in identifying alignments with homologs and in rejecting alignments with unrelated structures. A second set of 296 active site motifs were generated automatically, based on Catalytic Site Atlas entries with literature citations, as an expansion of the library of existing manually generated motif templates. The new motif templates exhibited comparable performance to the existing ones in terms of hit rates against native structures, homologs with the same EC and Pfam designations, and randomly selected unrelated structures with a different EC designation at the first EC digit, as well as in terms of RMSD values obtained from local structural alignments of motifs and query structures. This research is supported by NIH grant GM078077.

  5. Toward a unifying strategy for the structure-based prediction of toxicological endpoints.

    PubMed

    Carrió, Pau; Sanz, Ferran; Pastor, Manuel

    2016-10-01

    Most computational methods used for the prediction of toxicity endpoints are based on the assumption that similar compounds have similar biological properties. This principle can be exploited using computational methods like read across or quantitative structure-activity relationships. However, there is no general agreement about which method is the most appropriate for quantifying compound similarity, nor for exploiting the similarity principle in order to obtain reliable estimations of the compound properties. Moreover, optimal similarity metrics and modeling methods might depend on the characteristics of the endpoints and training series used in each case. This study describes a comparative analysis of the predictive performance of diverse similarity metrics and modeling methods in toxicological applications. A collection of two quantitative (n = 660, n = 1114) and three qualitative (n = 447, n = 905, n = 1220) datasets representing very different endpoints of interest in drug safety evaluation and rigorous methods were used to estimate the external predictive ability in each case. The results confirm that no single approach produces the best results in all instances, and the best predictions were obtained using different tools in different situations. The trends observed in this study were exploited to propose a unifying strategy allowing the use of the most suitable method for every compound. A comparison of the quality of the predictions obtained by the unifying strategy with those obtained by standard prediction methods confirmed the usefulness of the proposed approach.

  6. Predicting protein structural classes based on complex networks and recurrence analysis.

    PubMed

    Olyaee, Mohammad H; Yaghoubi, Ali; Yaghoobi, Mahdi

    2016-09-07

    Protein sequences are divided into four structural classes. The determination of class is a challenging and beneficial task in the bioinformatics field. Several methods have been proposed to this end, but most utilize too many features and produce unsuitable results. In the present study, features are extracted based on the predicted secondary structures. First, predicted secondary structure sequences are mapped into two time series by the chaos game representation. Then, a recurrence matrix is calculated from each of the time series. The recurrence matrix is identified with the adjacency matrix of a complex network, and measures for the characterization of complex networks are applied to these recurrence matrices. For a given protein sequence, a total of 24 characteristic features can be calculated, and these are fed into Fisher's discriminant analysis algorithm for classification. To examine the proposed method, two widely used low-similarity benchmark datasets are used to test its performance. A comparison with the results of existing methods shows that the current study's approach provides a satisfactory performance for protein structural class prediction.
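
    The step from a time series to a recurrence matrix, and from there to network measures, can be sketched as follows. This is a minimal illustration under stated assumptions: the series is scalar, two points recur when they lie within a threshold `eps`, and only degree, edge count and density are computed. The chaos game mapping and the 24 features used in the study are not described in the abstract and are not reproduced here; the function names are hypothetical.

```python
def recurrence_matrix(series, eps):
    """R[i][j] = 1 when points i and j of the series lie within eps of each other."""
    n = len(series)
    return [
        [1 if abs(series[i] - series[j]) <= eps else 0 for j in range(n)]
        for i in range(n)
    ]


def network_measures(matrix):
    """Read the recurrence matrix as an adjacency matrix, ignoring self-loops."""
    n = len(matrix)
    # The diagonal is always 1 (a point recurs with itself), so subtract it.
    degrees = [sum(row) - row[i] for i, row in enumerate(matrix)]
    edges = sum(degrees) // 2  # each undirected edge is counted twice
    density = edges / (n * (n - 1) / 2) if n > 1 else 0.0
    return {"degrees": degrees, "edges": edges, "density": density}


# Made-up series for illustration only.
series = [0.1, 0.5, 0.15, 0.9, 0.55]
R = recurrence_matrix(series, eps=0.1)
print(network_measures(R))
```

    Because the recurrence condition is symmetric, the resulting network is undirected, which is why each edge is counted once in `edges`.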

  7. Fast reconstruction and prediction of frozen flow turbulence based on structured Kalman filtering.

    PubMed

    Fraanje, Rufus; Rice, Justin; Verhaegen, Michel; Doelman, Niek

    2010-11-01

    Efficient and optimal prediction of frozen flow turbulence using the complete observation history of the wavefront sensor is an important issue in adaptive optics for large ground-based telescopes. At least for the sake of error budgeting and algorithm performance, the evaluation of an accurate estimate of the optimal performance of a particular adaptive optics configuration is important. However, due to the large number of grid points, high sampling rates, and the non-rationality of the turbulence power spectral density, the computational complexity of the optimal predictor is huge. This paper shows how a structure in the frozen flow propagation can be exploited to obtain a state-space innovation model with a particular sparsity structure. This sparsity structure enables one to efficiently compute a structured Kalman filter. By simulation it is shown that the performance can be improved and the computational complexity can be reduced in comparison with auto-regressive predictors of low order.

  8. Highly Accurate Structure-Based Prediction of HIV-1 Coreceptor Usage Suggests Intermolecular Interactions Driving Tropism.

    PubMed

    Kieslich, Chris A; Tamamis, Phanourios; Guzman, Yannis A; Onel, Melis; Floudas, Christodoulos A

    2016-01-01

    HIV-1 entry into host cells is mediated by interactions between the V3-loop of viral glycoprotein gp120 and chemokine receptor CCR5 or CXCR4, collectively known as HIV-1 coreceptors. Accurate genotypic prediction of coreceptor usage is of significant clinical interest and determination of the factors driving tropism has been the focus of extensive study. We have developed a method based on nonlinear support vector machines to elucidate the interacting residue pairs driving coreceptor usage and provide highly accurate coreceptor usage predictions. Our models utilize centroid-centroid interaction energies from computationally derived structures of the V3-loop:coreceptor complexes as primary features, while additional features based on established rules regarding V3-loop sequences are also investigated. We tested our method on 2455 V3-loop sequences of various lengths and subtypes, and obtained a median area under the receiver operating characteristic curve of 0.977 based on 500 runs of 10-fold cross validation. Our study is the first to elucidate a small set of specific interacting residue pairs between the V3-loop and coreceptors capable of predicting coreceptor usage with high accuracy across major HIV-1 subtypes. The developed method has been implemented as a web tool named CRUSH, CoReceptor USage prediction for HIV-1, which is available at http://ares.tamu.edu/CRUSH/.
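
    The reported figure is a median of per-run areas under the ROC curve. As a minimal sketch of how such a number is assembled, the code below computes one AUC from labels and classifier scores using the rank interpretation (the probability that a randomly chosen positive outscores a randomly chosen negative, with ties counted as one half) and then takes the median over several runs. The labels, scores and per-run values are invented for illustration; the SVM models and cross-validation splits of the study are not reproduced.

```python
from statistics import median


def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties = 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Invented labels (1 = one coreceptor class, 0 = the other) and scores.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
print("single-run AUC:", roc_auc(labels, scores))

# Invented per-run AUCs; in a repeated cross-validation there is one per run.
run_aucs = [0.97, 0.98, 0.99]
print("median AUC over runs:", median(run_aucs))
```

    The pairwise loop is quadratic in the number of examples, which is fine for a sketch; a sort-based rank sum gives the same value more efficiently on large datasets.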

  9. Structure-based prediction of the effects of a missense variant on protein stability.

    PubMed

    Yang, Yang; Chen, Biao; Tan, Ge; Vihinen, Mauno; Shen, Bairong

    2013-03-01

    Predicting the effects of amino acid substitutions on protein stability provides invaluable information for protein design, the assignment of biological function, and for understanding disease-associated variations. To understand the effects of substitutions, computational models are preferred to time-consuming and expensive experimental methods. Several methods have been proposed for this task including machine learning-based approaches. However, models trained using limited data have performance problems and many model parameters tend to be over-fitted. To decrease the number of model parameters and to improve the generalization potential, we calculated the amino acid contact energy change for point variations using a structure-based coarse-grained model. Based on the structural properties including contact energy (CE) and further physicochemical properties of the amino acids as input features, we developed two support vector machine classifiers. M47 predicted the stability of variant proteins with an accuracy of 87 % and a Matthews correlation coefficient of 0.68 for a large dataset of 1925 variants, whereas M8 performed better when a relatively small dataset of 388 variants was used for 20-fold cross-validation. The performance of the M47 classifier on all six tested contingency table evaluation parameters is better than that of existing machine learning-based models or energy function-based protein stability classifiers.
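
    Accuracy and the Matthews correlation coefficient, the two measures quoted for the M47 classifier, are both functions of the binary confusion matrix. The sketch below shows the standard formulas; the counts are hypothetical and are not the confusion matrix of the study.

```python
import math


def accuracy_and_mcc(tp, tn, fp, fn):
    """Return (accuracy, Matthews correlation coefficient) from confusion counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, mcc


# Hypothetical counts for illustration only.
acc, mcc = accuracy_and_mcc(tp=40, tn=45, fp=5, fn=10)
print(f"accuracy={acc:.2f}  MCC={mcc:.3f}")
```

    Unlike accuracy, MCC uses all four cells of the table, so it stays informative when the stabilizing and destabilizing classes are unbalanced.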

  10. Profiles and majority voting-based ensemble method for protein secondary structure prediction.

    PubMed

    Bouziane, Hafida; Messabih, Belhadri; Chouarfia, Abdallah

    2011-01-01

    Machine learning techniques have been widely applied to solve the problem of predicting protein secondary structure from the amino acid sequence. They have gained substantial success in this research area. Many methods have been used including k-Nearest Neighbors (k-NNs), Hidden Markov Models (HMMs), Artificial Neural Networks (ANNs) and Support Vector Machines (SVMs), which have attracted attention recently. Today, the main goal remains to improve the prediction quality of the secondary structure elements. The prediction accuracy has been continuously improved over the years, especially by using hybrid or ensemble methods and incorporating evolutionary information in the form of profiles extracted from alignments of multiple homologous sequences. In this paper, we investigate how best to combine k-NNs, ANNs and Multi-class SVMs (M-SVMs) to improve secondary structure prediction of globular proteins. An ensemble method which combines the outputs of two feed-forward ANNs, k-NN and three M-SVM classifiers has been applied. Ensemble members are combined using two variants of the majority voting rule. A heuristic-based filter has also been applied to refine the prediction. To investigate how much improvement the general ensemble method can give over the individual classifiers that make up the ensemble, we have experimented with the proposed system on the two widely used benchmark datasets RS126 and CB513 using cross-validation tests by including PSI-BLAST position-specific scoring matrix (PSSM) profiles as inputs. The experimental results reveal that the proposed system yields significant performance gains when compared with the best individual classifier.
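
    Per-residue majority voting over several classifiers can be sketched as below. The abstract does not specify the two voting variants used, so this is plain plurality voting over three-state strings (H, E, C), with ties resolved to coil as an assumption of this sketch; the function name and the example predictions are hypothetical.

```python
from collections import Counter


def majority_vote(predictions, tie_break="C"):
    """Combine equal-length H/E/C strings, one per classifier, position by position."""
    if len({len(p) for p in predictions}) != 1:
        raise ValueError("need at least one prediction, all of equal length")
    consensus = []
    for column in zip(*predictions):
        counts = Counter(column).most_common()
        top = counts[0][1]
        winners = [state for state, c in counts if c == top]
        # A unique winner is kept; a tie falls back to the tie_break state.
        consensus.append(winners[0] if len(winners) == 1 else tie_break)
    return "".join(consensus)


# Invented outputs of three classifiers for a six-residue fragment.
predictions = ["HHHEEC", "HHCEEC", "HCCEHC"]
print(majority_vote(predictions))
```

    A weighted variant would replace the raw counts with sums of per-classifier weights, which is one plausible reading of a second voting rule, though the abstract does not confirm it.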

  11. Disulfide Connectivity Prediction Based on Modelled Protein 3D Structural Information and Random Forest Regression.

    PubMed

    Yu, Dong-Jun; Li, Yang; Hu, Jun; Yang, Xibei; Yang, Jing-Yu; Shen, Hong-Bin

    2015-01-01

    Disulfide connectivity is an important protein structural characteristic. Accurately predicting disulfide connectivity solely from protein sequence helps to improve the intrinsic understanding of protein structure and function, especially in the post-genome era, in which large volumes of sequenced proteins without functional annotation are quickly accumulating. In this study, a new feature extracted from the predicted protein 3D structural information is proposed and integrated with traditional features to form discriminative features. Based on the extracted features, a random forest regression model is used to predict protein disulfide connectivity. We compare the proposed method with popular existing predictors by performing both cross-validation and independent validation tests on benchmark datasets. The experimental results demonstrate the superiority of the proposed method over existing predictors. We believe the superiority of the proposed method benefits from both the good discriminative capability of the newly developed features and the powerful modelling capability of the random forest. The web server implementation, called TargetDisulfide, and the benchmark datasets are freely available at: http://csbio.njust.edu.cn/bioinf/TargetDisulfide for academic use.

  12. Structure Based Thermostability Prediction Models for Protein Single Point Mutations with Machine Learning Tools.

    PubMed

    Jia, Lei; Yarlagadda, Ramya; Reed, Charles C

    2015-01-01

    Thermostability issues arising from protein point mutations are common in protein engineering. An application that predicts the thermostability of mutants can help guide decision making in protein design via mutagenesis. An in silico point mutation scanning method is frequently used to find "hot spots" in proteins for focused mutagenesis. ProTherm (http://gibk26.bio.kyutech.ac.jp/jouhou/Protherm/protherm.html) is a public database containing experimentally measured thermostability data for thousands of protein mutants. Two data sets based on two differently measured thermostability properties of protein single point mutations, namely the unfolding free energy change (ddG) and melting temperature change (dTm), were obtained from this database. Folding free energy changes calculated with Rosetta, structural information on the point mutations, and amino acid physical properties were used to build thermostability prediction models with informatics modeling tools. Five supervised machine learning methods (support vector machine, random forests, artificial neural network, naïve Bayes classifier, K nearest neighbor) and partial least squares regression were used for building the prediction models. Binary and ternary classifications as well as regression models were built and evaluated. Data set redundancy and balancing, the reverse mutations technique, feature selection, and comparison to other published methods are discussed. The Rosetta-calculated folding free energy change ranked as the most influential feature in all prediction models. Other descriptors also made significant contributions to increasing the accuracy of the prediction models.

  13. Structure-based predictions broadly link transcription factor mutations to gene expression changes in cancers.

    PubMed

    Ashworth, Justin; Bernard, Brady; Reynolds, Sheila; Plaisier, Christopher L; Shmulevich, Ilya; Baliga, Nitin S

    2014-12-01

    Thousands of unique mutations in transcription factors (TFs) arise in cancers, and the functional and biological roles of relatively few of these have been characterized. Here, we used structure-based methods developed specifically for DNA-binding proteins to systematically predict the consequences of mutations in several TFs that are frequently mutated in cancers. The explicit consideration of protein-DNA interactions was crucial to explain the roles and prevalence of mutations in TP53 and RUNX1 in cancers, and resulted in a higher specificity of detection for known p53-regulated genes among genetic associations between TP53 genotypes and genome-wide expression in The Cancer Genome Atlas, compared to existing methods of mutation assessment. Biophysical predictions also indicated that the relative prevalence of TP53 missense mutations in cancer is proportional to their thermodynamic impacts on protein stability and DNA binding, which is consistent with the selection for the loss of p53 transcriptional function in cancers. Structure and thermodynamics-based predictions of the impacts of missense mutations that focus on specific molecular functions may be increasingly useful for the precise and large-scale inference of aberrant molecular phenotypes in cancer and other complex diseases.

  14. CASP11--An Evaluation of a Modular BCL::Fold-Based Protein Structure Prediction Pipeline.

    PubMed

    Fischer, Axel W; Heinze, Sten; Putnam, Daniel K; Li, Bian; Pino, James C; Xia, Yan; Lopez, Carlos F; Meiler, Jens

    2016-01-01

    In silico prediction of a protein's tertiary structure remains an unsolved problem. The community-wide Critical Assessment of Protein Structure Prediction (CASP) experiment provides a double-blind study to evaluate improvements in protein structure prediction algorithms. We developed a protein structure prediction pipeline employing a three-stage approach, consisting of low-resolution topology search, high-resolution refinement, and molecular dynamics simulation to predict the tertiary structure of proteins from the primary structure alone or including distance restraints either from predicted residue-residue contacts, nuclear magnetic resonance (NMR) nuclear overhauser effect (NOE) experiments, or mass spectroscopy (MS) cross-linking (XL) data. The protein structure prediction pipeline was evaluated in the CASP11 experiment on twenty regular protein targets as well as thirty-three 'assisted' protein targets, which also had distance restraints available. Although the low-resolution topology search module was able to sample models with a global distance test total score (GDT_TS) value greater than 30% for twelve out of twenty proteins, frequently it was not possible to select the most accurate models for refinement, resulting in a general decay of model quality over the course of the prediction pipeline. In this study, we provide a detailed overall analysis, study one target protein in more detail as it travels through the protein structure prediction pipeline, and evaluate the impact of limited experimental data.
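
    The GDT_TS score mentioned in this abstract averages the percentages of residues whose model position lies within 1, 2, 4 and 8 Å of the reference position. The sketch below is a simplification: it assumes the per-residue distances are already available from a single superposition, whereas the full GDT procedure searches over many superpositions to maximize each percentage. The function name and the distances are illustrative assumptions.

```python
def gdt_ts(distances, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS (in percent) from per-residue model-to-reference distances.

    Assumes a single, fixed superposition; the real score optimizes the
    superposition separately for each cutoff.
    """
    if not distances:
        raise ValueError("need at least one residue distance")
    n = len(distances)
    fractions = [sum(1 for d in distances if d <= c) / n for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)


# Invented C-alpha distances in angstroms for a five-residue toy model.
distances = [0.5, 1.5, 3.0, 6.0, 10.0]
print(f"GDT_TS = {gdt_ts(distances):.1f}%")
```

    Under this definition the 30% threshold quoted in the abstract corresponds to a model in which, averaged over the four cutoffs, fewer than a third of the residues are placed close to their reference positions.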

  15. FPGA accelerator for protein secondary structure prediction based on the GOR algorithm

    PubMed Central

    2011-01-01

    Background: Proteins are important molecules that perform a wide range of functions in biological systems. Protein folding has recently attracted growing attention, since the function of a protein can generally be derived from its molecular structure. The GOR algorithm is one of the most successful computational methods and has been widely used as an efficient analysis tool to predict secondary structure from protein sequence. However, its execution time is still intolerable given the steep growth of protein databases. Recently, FPGA chips have emerged as a promising application accelerator for bioinformatics algorithms by exploiting fine-grained custom design. Results: In this paper, we propose a complete fine-grained parallel hardware implementation on FPGA to accelerate the GOR-IV package for 2D protein structure prediction. To improve computing efficiency, we partition the parameter table into small segments and access them in parallel. We aggressively exploit data reuse schemes to minimize the need for loading data from external memory. The whole computation structure is carefully pipelined to overlap the sequence loading, computing and back-writing operations as much as possible. We implemented a complete GOR desktop system based on an FPGA chip XC5VLX330. Conclusions: The experimental results show a speedup factor of more than 430x over the original GOR-IV version and a 110x speedup over the optimized version with a multi-thread SIMD implementation running on a PC platform with an AMD Phenom 9650 Quad CPU for 2D protein structure prediction. Moreover, the power consumption is only about 30% of that of current general-purpose CPUs. PMID:21342582

  16. Molecular Simulation-Based Structural Prediction of Protein Complexes in Mass Spectrometry: The Human Insulin Dimer

    PubMed Central

    Li, Jinyu; Rossetti, Giulia; Dreyer, Jens; Raugei, Simone; Ippoliti, Emiliano; Lüscher, Bernhard; Carloni, Paolo

    2014-01-01

    Protein electrospray ionization (ESI) mass spectrometry (MS)-based techniques are widely used to provide insight into structural proteomics under the assumption that non-covalent protein complexes being transferred into the gas phase preserve basically the same intermolecular interactions as in solution. Here we investigate the applicability of this assumption by extending our previous structural prediction protocol for single proteins in ESI-MS to protein complexes. We apply our protocol to the human insulin dimer (hIns2) as a test case. Our calculations reproduce the main charge and the collision cross section (CCS) measured in ESI-MS experiments. Molecular dynamics simulations for 0.075 ms show that the complex maximizes intermolecular non-bonded interactions relative to the structure in water, without affecting the cross section. The overall gas-phase structure of hIns2 does exhibit differences from the one in aqueous solution that are not inferable from a comparison with calculated CCS. Hence, care should be exercised when interpreting ESI-MS proteomics data based solely on NMR and/or X-ray structural information. PMID:25210764

  17. Computational Analysis of structure-based interactions and ligand properties can predict efflux effects on antibiotics

    PubMed Central

    Sarkar, Aurijit; Anderson, Kelcey C.; Kellogg, Glen E.

    2012-01-01

    AcrA-AcrB-TolC efflux pumps extrude drugs of multiple classes from bacterial cells and are a leading cause of antimicrobial resistance. Thus, they are of paramount interest to those engaged in antibiotic discovery. Accurate prediction of antibiotic efflux has been elusive, despite several studies aimed at this purpose. Minimum inhibitory concentration (MIC) ratios of 32 β-lactam antibiotics were collected from the literature. 3-Dimensional Quantitative Structure Activity Relationship on the β-lactam antibiotic structures revealed seemingly predictive models (q² = 0.53), but the lack of a general superposition rule does not allow its use on antibiotics that lack the β-lactam moiety. Since MIC ratios must depend on interactions of antibiotics with lipid membranes and transport proteins during influx, capture and extrusion of antibiotics from the bacterial cell, descriptors representing these factors were calculated and used in building mathematical models that quantitatively classify antibiotics as having high/low efflux (>93% accuracy). Our models provide preliminary evidence that it is possible to predict the effects of antibiotic efflux if the passage of antibiotics into, and out of, bacterial cells is taken into account – something descriptor and field-based QSAR models cannot do. While the paucity of data in the public domain remains the limiting factor in such studies, these models show significant improvements in predictions over simple LogP-based regression models and should pave the path towards further work in this field. This method should also be extensible to other pharmacologically and biologically relevant transport proteins. PMID:22483632

  18. Predicting adsorption of aromatic compounds by carbon nanotubes based on quantitative structure property relationship principles

    NASA Astrophysics Data System (ADS)

    Rahimi-Nasrabadi, Mehdi; Akhoondi, Reza; Pourmortazavi, Seied Mahdi; Ahmadi, Farhad

    2015-11-01

    Quantitative structure property relationship (QSPR) models were developed to predict the adsorption of aromatic compounds by carbon nanotubes (CNTs). Five descriptors chosen by combining self-organizing map and stepwise multiple linear regression (MLR) techniques were used to connect the structure of the studied chemicals with their adsorption descriptor (K∞) using linear and nonlinear modeling techniques. A correlation coefficient (R²) of 0.99 and a root-mean-square error (RMSE) of 0.29 for the multilayered perceptron neural network (MLP-NN) model indicate the superiority of the developed nonlinear model over the MLR model, which gave an R² of 0.93 and an RMSE of 0.36. The results of the cross-validation test showed the reliability of MLP-NN in predicting the K∞ values for the aromatic contaminants. Molar volume and hydrogen bond accepting ability were found to be the factors most strongly influencing the adsorption of the compounds. The developed QSPR, as a neural network based model, could be used to predict the adsorption of organic compounds by CNTs.

  19. Crystal structure and prediction.

    PubMed

    Thakur, Tejender S; Dubey, Ritesh; Desiraju, Gautam R

    2015-04-01

    The notion of structure is central to the subject of chemistry. This review traces the development of the idea of crystal structure since the time when a crystal structure could be determined from a three-dimensional diffraction pattern and assesses the feasibility of computationally predicting an unknown crystal structure of a given molecule. Crystal structure prediction is of considerable fundamental and applied importance, and its successful execution is by no means a solved problem. The ease of crystal structure determination today has resulted in the availability of large numbers of crystal structures of higher-energy polymorphs and pseudopolymorphs. These structural libraries lead to the concept of a crystal structure landscape. A crystal structure of a compound may accordingly be taken as a data point in such a landscape.

  20. Crystal Structure and Prediction

    NASA Astrophysics Data System (ADS)

    Thakur, Tejender S.; Dubey, Ritesh; Desiraju, Gautam R.

    2015-04-01

    The notion of structure is central to the subject of chemistry. This review traces the development of the idea of crystal structure since the time when a crystal structure could be determined from a three-dimensional diffraction pattern and assesses the feasibility of computationally predicting an unknown crystal structure of a given molecule. Crystal structure prediction is of considerable fundamental and applied importance, and its successful execution is by no means a solved problem. The ease of crystal structure determination today has resulted in the availability of large numbers of crystal structures of higher-energy polymorphs and pseudopolymorphs. These structural libraries lead to the concept of a crystal structure landscape. A crystal structure of a compound may accordingly be taken as a data point in such a landscape.

  1. Prediction of molecular properties including symmetry from quantum-based molecular structural formulas, VIF.

    PubMed

    Alia, Joseph D; Vlaisavljevich, Bess; Abbot, Matthew; Warneke, Hallie; Mastin, Tyson

    2008-10-09

    Structurally covariant valency interaction formulas, VIF, gain chemical significance by comparison with resonance structures and natural bond orbital, NBO, bonding schemes and at the same time allow for additional prediction such as symmetry of ring systems and destabilization of electron pairs with respect to reference energy of -1/2 Eh. Comparisons are based on three chemical interpretations of Sinanoğlu's theory of structural covariance: (1) sets of structurally covariant quantum structural formulas, VIF, are interpreted as the same quantum operator represented in linearly related basis frames; (2) structurally covariant VIF pictures are interpreted as sets of molecular species with similar energy; and (3) the same VIF picture can be interpreted as different quantum operators, one-electron density or Hamiltonian, for example. According to these three interpretations, bond pair, lone pair, and free radical electrons understood in terms of a localized orbital representation are recognized as having energies above, below, or equal to a predetermined reference, frequently -1/2 Eh. The probable position of electron pairs and radical electrons is predicted. The selectivity of concerted ring closures in allyl anion and cation is described. Symmetries of conjugated ring systems are predicted according to their numbers of pi-electrons and spin-multiplicity. The pi-distortivity of benzene is predicted. The 3c/2e- H-bridging bonds in diborane are derived in a natural way according to the notion that the bridging bonds will have delocalizing interactions between them consistent with results of the NBO method. Key chemical bonding motifs are described using VIF. These include 2c/1e-, 2c/2e-, 2c/3e-, 3c/2e-, 3c/3e-, 3c/4e-, 4n antiaromatic, and 4n+2 aromatic bonding systems. Some common organic functional groups are represented as VIF pictures and because these pictures can be interpreted simultaneously as one-electron density and Hamiltonian operators, the valence shell

  2. Can a structured, behavior-based interview predict future resident success?

    PubMed

    Strand, Eric A; Moore, Elizabeth; Laube, Douglas W

    2011-05-01

    To determine whether a structured, behavior-based applicant interview predicts future success in an obstetrics and gynecology residency program. Using a modified pre-post study design, we compared behavior-based interview scores of our residency applicants to a postmatch evaluation completed by the applicant's current residency program director. Applicants were evaluated on the following areas: academic record, professionalism, leadership, trainability/suitability for the specialty, and fit for the program. Information was obtained for 45 (63%) applicants. The overall interview score did not correlate with overall resident performance. Applicant leadership subscore was predictive of leadership performance as a resident (P = .042). Academic record was associated with patient care performance as a resident (P = .014), but only for graduates of US medical schools. Five residents changed programs; these residents had significantly lower scores for trainability/suitability for the specialty (P = .020). Behavioral interviewing can provide predictive information regarding success in an obstetrics and gynecology training program. Copyright © 2011 Mosby, Inc. All rights reserved.

  3. Chemical structure-based predictive model for the oxidation of trace organic contaminants by sulfate radical.

    PubMed

    Ye, Tiantian; Wei, Zongsu; Spinney, Richard; Tang, Chong-Jian; Luo, Shuang; Xiao, Ruiyang; Dionysiou, Dionysios D

    2017-06-01

    Second-order rate constants [Formula: see text] for the reaction of sulfate radical anion (SO4(•-)) with trace organic contaminants (TrOCs) are of scientific and practical importance for assessing their environmental fate and removal efficiency in water treatment systems. Here, we developed a chemical structure-based model for predicting [Formula: see text] using 32 molecular fragment descriptors, as this type of model provides a quick estimate at low computational cost. The model was constructed using the multiple linear regression (MLR) and artificial neural network (ANN) methods. The MLR method yielded an adequate fit for the training set (training R² = 0.88, n = 75) and reasonable predictability for the validation set (validation R² = 0.62, n = 38). In contrast, the ANN method produced a statistically more robust fit but rather poor predictability (training R² = 0.99 and validation R² = 0.42). The reaction mechanisms of SO4(•-) reactivity with TrOCs were elucidated. Our results show that the coefficients of functional groups reflect their electron donating/withdrawing characters. For example, electron donating groups typically exhibit positive coefficients, indicating enhanced SO4(•-) reactivity. Electron withdrawing groups exhibit negative values, indicating reduced reactivity. Because the model is quick and accurate, we applied it to 55 discrete TrOCs culled from the Contaminant Candidate List 4, and quantitatively compared their removal efficiency with SO4(•-) and OH in the presence of environmental matrices. This high-throughput model helps prioritize TrOCs that are persistent to SO4(•-) based oxidation technologies at the screening level, and provides diagnostics of SO4(•-) reaction mechanisms. Copyright © 2017 Elsevier Ltd. All rights reserved.
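
    As a rough illustration of the multiple linear regression step described above, the following Python sketch fits a linear model to fragment-count descriptors and reports R² on a training and a validation split. The data, the four descriptors, and the coefficient values are invented for the example and are not taken from the study; only the general technique (ordinary least squares on fragment counts) follows the abstract.

```python
import numpy as np

def fit_mlr(X, y):
    """Ordinary least squares with an intercept; returns [intercept, coefficients...]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict(coef, X):
    """Apply the fitted linear model to a descriptor matrix."""
    return coef[0] + X @ coef[1:]

def r_squared(y_true, y_pred):
    """Coefficient of determination."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

# Hypothetical data: 60 compounds, 4 fragment descriptors (counts 0..3).
# Positive coefficients mimic electron donating groups, negative ones withdrawing groups.
rng = np.random.default_rng(0)
assumed_coef = np.array([0.5, -0.8, 0.3, -0.2])  # invented values
X = rng.integers(0, 4, size=(60, 4)).astype(float)
y = 9.0 + X @ assumed_coef + rng.normal(0.0, 0.1, size=60)  # stand-in for log k

X_train, y_train = X[:40], y[:40]
X_val, y_val = X[40:], y[40:]

coef = fit_mlr(X_train, y_train)
r2_train = r_squared(y_train, predict(coef, X_train))
r2_val = r_squared(y_val, predict(coef, X_val))
print(coef, r2_train, r2_val)
```

    In a sketch like this, the sign of each fitted coefficient plays the role the abstract assigns to functional-group coefficients, and a large gap between training and validation R² would indicate overfitting.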

  4. A Novel Wavelet-Based Approach for Predicting Nucleosome Positions Using DNA Structural Information.

    PubMed

    Gan, Yanglan; Zou, Guobing; Guan, Jihong; Xu, Guangwei

    2014-01-01

    Nucleosomes are basic elements of chromatin structure. The positioning of nucleosomes along a genome is very important in dictating eukaryotic DNA compaction and access. Current computational methods have focused on the analysis of nucleosome occupancy and the positioning of well-positioned nucleosomes. However, fuzzy nucleosomes have more complex configurations, and their positions are more difficult to predict. We analyzed the positioning of well-positioned and fuzzy nucleosomes from a novel structural perspective, and proposed WaveNuc, a computational approach for inferring their positions based on continuous wavelet transformation. The comparative analysis demonstrates that these two kinds of nucleosomes exhibit different propeller twist structural characteristics. Well-positioned nucleosomes tend to locate at sharp peaks of the propeller twist profile, whereas fuzzy nucleosomes correspond to broader peaks. The sharpness of these peaks shows that the propeller twist profile may contain nucleosome positioning information. Exploiting this knowledge, we applied WaveNuc to detect the two different kinds of peaks of the propeller twist profile along the genome. We compared the performance of our method with existing methods on real data sets. The results show that the proposed method can accurately resolve complex configurations of fuzzy nucleosomes, which leads to better performance of nucleosome positioning prediction on the whole genome.
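
    The idea of separating sharp from broad peaks with a continuous wavelet transform can be sketched in a few lines of Python. This is not the WaveNuc implementation: the Ricker wavelet, the synthetic profile, the peak positions, and all function names are assumptions chosen for illustration. The sketch only shows that the wavelet scale with the strongest response grows with the width of the peak.

```python
import numpy as np

def ricker(n_points, a):
    """Ricker (Mexican hat) wavelet of width a, sampled on n_points centered points."""
    t = np.arange(n_points) - (n_points - 1) / 2.0
    norm = 2.0 / (np.sqrt(3.0 * a) * np.pi ** 0.25)
    return norm * (1.0 - (t / a) ** 2) * np.exp(-(t ** 2) / (2.0 * a ** 2))

def cwt_ricker(signal, widths):
    """Continuous wavelet transform: one row of coefficients per width."""
    coeffs = np.empty((len(widths), len(signal)))
    for row, a in enumerate(widths):
        kernel = ricker(10 * int(a) + 1, a)  # odd length keeps the kernel centered
        coeffs[row] = np.convolve(signal, kernel, mode="same")
    return coeffs

def best_scale(coeffs, widths, position):
    """Width whose wavelet responds most strongly at the given position."""
    return widths[int(np.argmax(coeffs[:, position]))]

# Hypothetical profile: a sharp peak at 250 and a broad peak at 700.
x = np.arange(1000)
profile = (np.exp(-((x - 250) ** 2) / (2 * 3.0 ** 2))
           + np.exp(-((x - 700) ** 2) / (2 * 15.0 ** 2)))

widths = list(range(2, 51, 2))
coeffs = cwt_ricker(profile, widths)
sharp_scale = best_scale(coeffs, widths, 250)
broad_scale = best_scale(coeffs, widths, 700)
print(sharp_scale, broad_scale)
```

    A small best-response scale marks a sharp peak and a large one marks a broad peak, which is the kind of distinction the abstract draws between well-positioned and fuzzy nucleosomes.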

  5. A nonlinear viscoelastic approach to durability predictions for polymer based composite structures

    NASA Technical Reports Server (NTRS)

    Brinson, Hal F.

    1991-01-01

    Current industry approaches for the durability assessment of metallic structures are briefly reviewed. For polymer based composite structures, it is suggested that new approaches must be adopted to include memory or viscoelastic effects which could lead to delayed failures that might not be predicted using current techniques. A durability or accelerated life assessment plan for fiber reinforced plastics (FRP) developed and documented over the last decade or so is reviewed and discussed. Limitations to the plan are outlined and suggestions to remove the limitations are given. These include the development of a finite element code to replace the previously used lamination theory code and the development of new specimen geometries to evaluate delamination failures. The new DCB model is reviewed and results are presented. Finally, it is pointed out that new procedures are needed to determine interfacial properties and current efforts underway to determine such properties are reviewed. Suggestions for additional efforts to develop a consistent and accurate durability predictive approach for FRP structures are outlined.

  6. A nonlinear viscoelastic approach to durability predictions for polymer based composite structures

    NASA Technical Reports Server (NTRS)

    Brinson, Hal F.; Hiel, C. C.

    1990-01-01

    Current industry approaches for the durability assessment of metallic structures are briefly reviewed. For polymer based composite structures, it is suggested that new approaches must be adopted to include memory or viscoelastic effects which could lead to delayed failures that might not be predicted using current techniques. A durability or accelerated life assessment plan for fiber reinforced plastics (FRP) developed and documented over the last decade or so is reviewed and discussed. Limitations to the plan are outlined and suggestions to remove the limitations are given. These include the development of a finite element code to replace the previously used lamination theory code and the development of new specimen geometries to evaluate delamination failures. The new DCB model is reviewed and results are presented. Finally, it is pointed out that new procedures are needed to determine interfacial properties and current efforts underway to determine such properties are reviewed. Suggestions for additional efforts to develop a consistent and accurate durability predictive approach for FRP structures are outlined.

  7. Coupled agent-based and finite-element models for predicting scar structure following myocardial infarction.

    PubMed

    Rouillard, Andrew D; Holmes, Jeffrey W

    2014-08-01

    Following myocardial infarction, damaged muscle is gradually replaced by collagenous scar tissue. The structural and mechanical properties of the scar are critical determinants of heart function, as well as the risk of serious post-infarction complications such as infarct rupture, infarct expansion, and progression to dilated heart failure. A number of therapeutic approaches currently under development aim to alter infarct mechanics in order to reduce complications, such as implantation of mechanical restraint devices, polymer injection, and peri-infarct pacing. Because mechanical stimuli regulate scar remodeling, the long-term consequences of therapies that alter infarct mechanics must be carefully considered. Computational models have the potential to greatly improve our ability to understand and predict how such therapies alter heart structure, mechanics, and function over time. Toward this end, we developed a straightforward method for coupling an agent-based model of scar formation to a finite-element model of tissue mechanics, creating a multi-scale model that captures the dynamic interplay between mechanical loading, scar deformation, and scar material properties. The agent-based component of the coupled model predicts how fibroblasts integrate local chemical, structural, and mechanical cues as they deposit and remodel collagen, while the finite-element component predicts local mechanics at any time point given the current collagen fiber structure and applied loads. We used the coupled model to explore the balance between increasing stiffness due to collagen deposition and increasing wall stress due to infarct thinning and left ventricular dilation during the normal time course of healing in myocardial infarcts, as well as the negative feedback between strain anisotropy and the structural anisotropy it promotes in healing scar. The coupled model reproduced the observed evolution of both collagen fiber structure and regional deformation following coronary

  8. Structured prediction models for RNN based sequence labeling in clinical text

    PubMed Central

    Jagannatha, Abhyuday N; Yu, Hong

    2016-01-01

    Sequence labeling is a widely used method for named entity recognition and information extraction from unstructured natural language data. In the clinical domain, one major application of sequence labeling involves extraction of medical entities such as medication, indication, and side-effects from Electronic Health Record narratives. Sequence labeling in this domain presents its own set of challenges and objectives. In this work we experimented with various CRF based structured learning models with Recurrent Neural Networks. We extend the previously studied LSTM-CRF models with explicit modeling of pairwise potentials. We also propose an approximate version of skip-chain CRF inference with RNN potentials. We use these methodologies for structured prediction in order to improve the exact phrase detection of various medical entities. PMID:28004040
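
    To make the role of pairwise potentials concrete, here is a minimal Python sketch of Viterbi decoding for a linear-chain CRF. The per-token scores stand in for RNN outputs and the transition matrix stands in for pairwise potentials; all numbers and the three-label tag set (O, B, I) are invented for the example, and the sketch does not reproduce the authors' models or their skip-chain approximation.

```python
import numpy as np

def viterbi(emissions, transitions):
    """Best label sequence for a linear-chain CRF.

    emissions:   (T, K) per-token label scores (e.g. produced by an RNN)
    transitions: (K, K) pairwise scores, transitions[i, j] for label i followed by j
    """
    T, K = emissions.shape
    score = emissions[0].copy()
    backptr = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        cand = score[:, None] + transitions          # cand[i, j]: previous i, current j
        backptr[t] = np.argmax(cand, axis=0)         # best previous label for each j
        score = cand[backptr[t], np.arange(K)] + emissions[t]
    path = [int(np.argmax(score))]
    for t in range(T - 1, 0, -1):                    # follow back-pointers
        path.append(int(backptr[t, path[-1]]))
    return path[::-1], float(np.max(score))

# Hypothetical scores for a 3-token phrase; labels: 0 = O, 1 = B, 2 = I.
emissions = np.array([[1.0, 0.8, 0.0],
                      [0.0, 0.2, 1.5],
                      [0.0, 0.0, 1.0]])
transitions = np.zeros((3, 3))
transitions[0, 2] = -5.0   # O followed by I is strongly discouraged
transitions[1, 2] = 0.5    # B followed by I is encouraged
transitions[2, 2] = 0.5    # I followed by I is encouraged

best_path, best_score = viterbi(emissions, transitions)
print(best_path, best_score)
```

    Picking the highest-scoring label per token independently would give O, I, I here, which is not a valid phrase; the pairwise scores steer decoding to B, I, I instead, which is the kind of exact-phrase consistency the abstract is concerned with.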

  9. Structure-based constitutive model can accurately predict planar biaxial properties of aortic wall tissue.

    PubMed

    Polzer, S; Gasser, T C; Novak, K; Man, V; Tichy, M; Skacel, P; Bursa, J

    2015-03-01

    Structure-based constitutive models might help in exploring mechanisms by which arterial wall histology is linked to wall mechanics. This study aims to validate a recently proposed structure-based constitutive model. Specifically, the model's ability to predict mechanical biaxial response of porcine aortic tissue with predefined collagen structure was tested. Histological slices from porcine thoracic aorta wall (n=9) were automatically processed to quantify the collagen fiber organization, and mechanical testing identified the non-linear properties of the wall samples (n=18) over a wide range of biaxial stretches. Histological and mechanical experimental data were used to identify the model parameters of a recently proposed multi-scale constitutive description for arterial layers. The model predictive capability was tested with respect to interpolation and extrapolation. Collagen in the media was predominantly aligned in the circumferential direction (planar von Mises distribution with concentration parameter bM=1.03 ± 0.23), and its coherence decreased gradually from the luminal to the abluminal tissue layers (inner media, b=1.54 ± 0.40; outer media, b=0.72 ± 0.20). In contrast, the collagen in the adventitia was aligned almost isotropically (bA=0.27 ± 0.11), and no features, such as families of coherent fibers, were identified. The applied constitutive model captured the aorta biaxial properties accurately (coefficient of determination R(2)=0.95 ± 0.03) over the entire range of biaxial deformations and with physically meaningful model parameters. Good predictive properties, well outside the parameter identification space, were observed (R(2)=0.92 ± 0.04). Multi-scale constitutive models equipped with realistic micro-histological data can predict macroscopic non-linear aorta wall properties. Collagen largely defines the properties of the media even at low strain, which explains the origin of wall anisotropy seen at this strain level. The structure and mechanical

  10. A simple structure-based model for the prediction of HIV-1 co-receptor tropism

    PubMed Central

    2014-01-01

    Background Human Immunodeficiency Virus 1 enters host cells through interaction of its V3 loop (which is part of the gp120 protein) with the host cell receptor CD4 and one of two co-receptors, namely CCR5 or CXCR4. Entry inhibitors binding the CCR5 co-receptor can prevent viral entry. As these drugs are only available for CCR5-using viruses, accurate prediction of this so-called co-receptor tropism is important in order to ensure an effective personalized therapy. With the development of next-generation sequencing technologies, it is now possible to sequence representative subpopulations of the viral quasispecies. Results Here we present T-CUP 2.0, a model for predicting co-receptor tropism. Based on our recently published T-CUP model, we developed a more accurate and even faster solution. Similarly to its predecessor, T-CUP 2.0 models co-receptor tropism using information of the electrostatic potential and hydrophobicity of V3-loops. However, extracting this information from a simplified structural vacuum-model leads to more accurate and faster predictions. The area-under-the-ROC-curve (AUC) achieved with T-CUP 2.0 on the training set is 0.968±0.005 in a leave-one-patient-out cross-validation. When applied to an independent dataset, T-CUP 2.0 has an improved prediction accuracy of around 3% when compared to the original T-CUP. Conclusions We found that it is possible to model co-receptor tropism in HIV-1 based on a simplified structure-based model of the V3 loop. In this way, genotypic prediction of co-receptor tropism is very accurate, fast and can be applied to large datasets derived from next-generation sequencing technologies. The reduced complexity of the electrostatic modeling makes T-CUP 2.0 independent from third-party software, making it easy to install and use. PMID:25120583
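
    The evaluation scheme mentioned above, an area under the ROC curve computed within a leave-one-patient-out cross-validation, can be sketched as follows. This is a generic illustration, not T-CUP 2.0 code: the function names, the scores, and the patient identifiers are hypothetical, and the AUC is computed with the standard pairwise-ranking (Mann-Whitney) formulation.

```python
import numpy as np

def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    return float((np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / (len(pos) * len(neg)))

def leave_one_patient_out(patient_ids):
    """Yield (train_indices, test_indices), holding out all sequences of one patient at a time."""
    patient_ids = np.asarray(patient_ids)
    for patient in np.unique(patient_ids):
        test_mask = patient_ids == patient
        yield np.where(~test_mask)[0], np.where(test_mask)[0]

# Hypothetical example: four sequences from three patients.
example_auc = auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])   # 3 of 4 pairs ranked correctly: 0.75
splits = list(leave_one_patient_out(["p1", "p1", "p2", "p3"]))
print(example_auc, len(splits))
```

    Grouping the splits by patient keeps sequences from the same patient out of both the training and the test fold at once, which avoids an optimistic bias when several sequences per patient are available.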

  11. Structure-based function prediction of the expanding mollusk tyrosinase family

    NASA Astrophysics Data System (ADS)

    Huang, Ronglian; Li, Li; Zhang, Guofan

    2017-01-01

    Tyrosinase (Ty) is a common enzyme found in many different animal groups. In our previous study, genome sequencing revealed that the Ty family is expanded in the Pacific oyster (Crassostrea gigas). Here, we examine the larger number of Ty family members in the Pacific oyster by high-level structure prediction to obtain more information about their function and evolution, especially the unknown role in biomineralization. We verified 12 Ty gene sequences from the Crassostrea gigas genome and the Pinctada fucata martensii transcriptome. By using phylogenetic analysis of these Tys with functionally known Tys from other molluscan species, eight subgroups were identified (CgTy_s1, CgTy_s2, MolTy_s1, MolTy-s2, MolTy-s3, PinTy-s1, PinTy-s2 and PviTy). Structural data and surface pockets of the dinuclear copper center in the eight subgroups of molluscan Ty were obtained using the latest versions of prediction online servers. Structural comparison with other Ty proteins from the protein databank revealed functionally important residues (HA1, HA2, HA3, HB1, HB2, HB3, Z1-Z9) and their location within these protein structures. The structural and chemical features of these pockets, which may be related to substrate binding, showed considerable variability among mollusks, which undoubtedly defines Ty substrate binding. Finally, we discuss the potential driving forces of Ty family evolution in mollusks. Based on these observations, we conclude that the Ty family has rapidly evolved as a consequence of substrate adaptation in mollusks.

  12. The Prediction of Botulinum Toxin Structure Based on in Silico and in Vitro Analysis

    NASA Astrophysics Data System (ADS)

    Suzuki, Tomonori; Miyazaki, Satoru

    2011-01-01

    Many biological systems are mediated through protein-protein interactions. Knowledge of protein-protein complex structure is required for understanding the function. Determining the structure of large, flexible protein-protein complexes by experimental studies remains difficult, costly and time-consuming; therefore, computational prediction of protein structures by homology modeling and docking studies is a valuable method. In addition, MD simulation is one of the most powerful methods for observing the real dynamics of proteins. Here, we predict the protein-protein complex structure of botulinum toxin to analyze its properties. These bioinformatics methods are useful for reporting the relation between the flexibility of the backbone structure and the activity.

  13. A core competency-based objective structured clinical examination (OSCE) can predict future resident performance.

    PubMed

    Wallenstein, Joshua; Heron, Sheryl; Santen, Sally; Shayne, Philip; Ander, Douglas

    2010-10-01

    This study evaluated the ability of an objective structured clinical examination (OSCE) administered in the first month of residency to predict future resident performance in the Accreditation Council for Graduate Medical Education (ACGME) core competencies. Eighteen Postgraduate Year 1 (PGY-1) residents completed a five-station OSCE in the first month of postgraduate training. Performance was graded in each of the ACGME core competencies. At the end of 18 months of training, faculty evaluations of resident performance in the emergency department (ED) were used to calculate a cumulative clinical evaluation score for each core competency. The correlations between OSCE scores and clinical evaluation scores at 18 months were assessed on an overall level and in each core competency. There was a statistically significant correlation between overall OSCE scores and overall clinical evaluation scores (R = 0.48, p < 0.05) and in the individual competencies of patient care (R = 0.49, p < 0.05), medical knowledge (R = 0.59, p < 0.05), and practice-based learning (R = 0.49, p < 0.05). No correlation was noted in the systems-based practice, interpersonal and communication skills, or professionalism competencies. An early-residency OSCE has the ability to predict future postgraduate performance on a global level and in specific core competencies. Used appropriately, such information can be a valuable tool for program directors in monitoring residents' progress and providing more tailored guidance. © 2010 by the Society for Academic Emergency Medicine.

  14. Quaternary structure predictions of transmembrane proteins starting from the monomer: a docking-based approach

    PubMed Central

    Casciari, D; Seeber, M; Fanelli, F

    2006-01-01

    Background We introduce a computational protocol for effective predictions of the supramolecular organization of integral transmembrane proteins, starting from the monomer. Despite the demonstrated constitutive and functional importance of supramolecular assemblies of transmembrane subunits or proteins, effective tools for structure predictions of such assemblies are still lacking. Our computational approach consists in rigid-body docking samplings, starting from the docking of two identical copies of a given monomer. Each docking run is followed by membrane topology filtering and cluster analysis. Prediction of the native oligomer is therefore accomplished by a number of progressive growing steps, each made of one docking run, filtering and cluster analysis. With this approach, knowledge about the oligomerization status of the protein is required neither for improving sampling nor for the filtering step. Furthermore, there are no size-limitations in the systems under study, which are not limited to the transmembrane domains but include also the water-soluble portions. Results Benchmarks of the approach were done on ten homo-oligomeric membrane proteins with known quaternary structure. For all these systems, predictions led to native-like quaternary structures, i.e. with Cα-RMSDs lower than 2.5 Å from the native oligomer, regardless of the resolution of the structural models. Conclusion Collectively, the results of this study emphasize the effectiveness of the prediction protocol that will be extensively challenged in quaternary structure predictions of other integral membrane proteins. PMID:16836758
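
    The Cα-RMSD criterion quoted above (below 2.5 Å from the native oligomer) requires superimposing the predicted and native coordinates before measuring the deviation. The following Python sketch uses the Kabsch algorithm for the optimal superposition; it is a generic illustration with randomly generated coordinates and does not reproduce the authors' docking, filtering, or clustering steps.

```python
import numpy as np

def kabsch_rmsd(P, Q):
    """RMSD between two (N, 3) coordinate sets after optimal rigid superposition of P onto Q."""
    P = P - P.mean(axis=0)                 # remove translation
    Q = Q - Q.mean(axis=0)
    H = P.T @ Q                            # 3x3 covariance matrix
    U, _, Vt = np.linalg.svd(H)
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    D = np.diag([1.0, 1.0, d])             # forbid reflections
    R = Vt.T @ D @ U.T                     # optimal rotation, applied as R @ p
    P_rot = P @ R.T
    return float(np.sqrt(np.mean(np.sum((P_rot - Q) ** 2, axis=1))))

# Hypothetical C-alpha coordinates: a rotated, shifted, slightly perturbed copy.
rng = np.random.default_rng(1)
native = rng.normal(0.0, 10.0, size=(50, 3))
angle = np.deg2rad(40.0)
rot_z = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                  [np.sin(angle),  np.cos(angle), 0.0],
                  [0.0,            0.0,           1.0]])
exact_copy = native @ rot_z.T + np.array([5.0, -3.0, 2.0])
noisy_copy = exact_copy + rng.normal(0.0, 1.0, size=native.shape)

rmsd_exact = kabsch_rmsd(exact_copy, native)
rmsd_noisy = kabsch_rmsd(noisy_copy, native)
print(rmsd_exact, rmsd_noisy)
```

    A rigidly moved copy gives an RMSD of essentially zero, while coordinate noise of about 1 Å per axis gives a value well under the 2.5 Å threshold used in the abstract.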

  15. Balancing exploration and exploitation in population-based sampling improves fragment-based de novo protein structure prediction.

    PubMed

    Simoncini, David; Schiex, Thomas; Zhang, Kam Y J

    2017-05-01

    Conformational search space exploration remains a major bottleneck for protein structure prediction methods. Population-based meta-heuristics typically make it possible to control the search dynamics and to tune the balance between local energy minimization and search space exploration. EdaFold is a fragment-based approach that can guide search by periodically updating the probability distribution over the fragment libraries used during model assembly. We implement the EdaFold algorithm as a Rosetta protocol and provide two different probability update policies: a cluster-based variation (EdaRosec) and an energy-based one (EdaRoseen). We analyze the search dynamics of our new Rosetta protocols and show that EdaRosec is able to provide predictions with lower CαRMSD to the native structure than EdaRoseen and the Rosetta AbInitio Relax protocol. Our software is freely available as a C++ patch for the Rosetta suite and can be downloaded from http://www.riken.jp/zhangiru/software/. Our protocols can easily be extended in order to create alternative probability update policies and generate new search dynamics. Proteins 2017; 85:852-858. © 2016 Wiley Periodicals, Inc. © 2017 Wiley Periodicals, Inc.
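
    The notion of periodically updating a probability distribution over fragment libraries can be illustrated with a toy estimation-of-distribution loop. The sketch below is not EdaFold or either Rosetta protocol: the additive toy energy, the elite-frequency update rule, and every parameter name and value are assumptions made for the example.

```python
import numpy as np

def eda_search(energy_table, pop_size=200, elite_frac=0.2, alpha=0.5, n_iter=30, seed=0):
    """Toy estimation-of-distribution search over fragment choices.

    energy_table[i, f] is a made-up energy for placing fragment f at position i;
    the total energy of a model is the sum over positions.
    """
    rng = np.random.default_rng(seed)
    n_pos, n_frag = energy_table.shape
    probs = np.full((n_pos, n_frag), 1.0 / n_frag)    # start from a uniform distribution
    n_elite = int(elite_frac * pop_size)
    for _ in range(n_iter):
        # Sample a population: one fragment index per position for each model.
        pop = np.stack([rng.choice(n_frag, size=pop_size, p=probs[i])
                        for i in range(n_pos)], axis=1)
        energies = energy_table[np.arange(n_pos), pop].sum(axis=1)
        elite = pop[np.argsort(energies)[:n_elite]]   # lowest-energy models
        # Fragment frequencies among the elite, per position.
        freq = np.stack([np.bincount(elite[:, i], minlength=n_frag)
                         for i in range(n_pos)]) / n_elite
        # alpha balances exploitation (elite frequencies) against exploration (old distribution).
        probs = (1.0 - alpha) * probs + alpha * freq
    return probs

# Hypothetical problem: 6 positions, 5 fragments each, one preferred fragment per position.
targets = np.array([0, 1, 2, 3, 4, 2])
energy_table = np.abs(np.arange(5)[None, :] - targets[:, None]).astype(float)

final_probs = eda_search(energy_table)
uniform_energy = float(energy_table.mean(axis=1).sum())
final_energy = float((final_probs * energy_table).sum())
print(uniform_energy, final_energy)
```

    A larger alpha makes the distribution follow the current elite more closely, which speeds convergence but reduces exploration; a smaller alpha does the opposite.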

  16. APL: An angle probability list to improve knowledge-based metaheuristics for the three-dimensional protein structure prediction.

    PubMed

    Borguesan, Bruno; Barbachan e Silva, Mariel; Grisci, Bruno; Inostroza-Ponta, Mario; Dorn, Márcio

    2015-12-01

    Tertiary protein structure prediction is one of the most challenging problems in structural bioinformatics. Despite the advances in algorithm development and computational strategies, predicting the folded structure of a protein only from its amino acid sequence remains an unsolved problem. We present a new computational approach to predict the native-like three-dimensional structure of proteins. Conformational preferences of amino acid residues and secondary structure information were obtained from protein templates stored in the Protein Data Bank and represented as an Angle Probability List. Two knowledge-based prediction methods based on Genetic Algorithms and Particle Swarm Optimization were developed using this information. The proposed method has been tested with twenty-six case studies selected to validate our approach with different classes of proteins and folding patterns. Stereochemical and structural analyses were performed for each predicted three-dimensional structure. Results achieved suggest that the Angle Probability List can improve the effectiveness of metaheuristics used to predict the three-dimensional structure of protein molecules by reducing their conformational search space. Copyright © 2015 Elsevier Ltd. All rights reserved.
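
    One plausible way to picture an angle probability list is as a normalized two-dimensional histogram of backbone (phi, psi) angles from which a metaheuristic draws candidate angles. The Python sketch below follows that reading; the bin width, the synthetic helix-like and sheet-like angle clusters, and the function names are assumptions for illustration and do not reproduce the published APL construction.

```python
import numpy as np

def build_apl(phi, psi, bin_deg=10):
    """Normalized 2D histogram of (phi, psi) angles in degrees over [-180, 180]."""
    edges = np.arange(-180, 180 + bin_deg, bin_deg)
    counts, _, _ = np.histogram2d(phi, psi, bins=[edges, edges])
    return counts / counts.sum(), edges

def sample_angles(apl, edges, n, rng):
    """Draw n (phi, psi) pairs: pick a bin by its probability, then a point inside the bin."""
    flat = rng.choice(apl.size, size=n, p=apl.ravel())
    i, j = np.unravel_index(flat, apl.shape)
    width = edges[1] - edges[0]
    phi = edges[i] + rng.uniform(0.0, width, size=n)
    psi = edges[j] + rng.uniform(0.0, width, size=n)
    return phi, psi

# Hypothetical template angles: a helix-like and a sheet-like cluster.
rng = np.random.default_rng(2)
phi_obs = np.concatenate([rng.normal(-60, 10, 700), rng.normal(-120, 15, 300)])
psi_obs = np.concatenate([rng.normal(-45, 10, 700), rng.normal(130, 15, 300)])

apl, edges = build_apl(phi_obs, psi_obs)
phi_new, psi_new = sample_angles(apl, edges, 1000, rng)
print(apl.shape, phi_new.mean(), psi_new.mean())
```

    Sampling from such a list concentrates the search on angle combinations that were actually observed, which is how a probability list can shrink the conformational search space for a genetic algorithm or particle swarm optimizer.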

  17. Structure-based approach to the prediction of disulfide bonds in proteins.

    PubMed

    Salam, Noeris K; Adzhigirey, Matvey; Sherman, Woody; Pearlman, David A

    2014-10-01

    Protein engineering remains an area of growing importance in pharmaceutical and biotechnology research. Stabilization of a folded protein conformation is a frequent goal in projects that deal with affinity optimization, enzyme design, protein construct design, and reducing the size of functional proteins. Indeed, it can be desirable to assess and improve protein stability in order to avoid liabilities such as aggregation, degradation, and immunogenic response that may arise during development. One way to stabilize a protein is through the introduction of disulfide bonds. Here, we describe a method to predict pairs of protein residues that can be mutated to form a disulfide bond. We combine a physics-based approach that incorporates implicit solvent molecular mechanics with a knowledge-based approach. We first assign relative weights to the terms that comprise our scoring function using a genetic algorithm applied to a set of 75 wild-type structures that each contains a disulfide bond. The method is then tested on a separate set of 13 engineered proteins comprising 15 artificial stabilizing disulfides introduced via site-directed mutagenesis. We find that the native disulfide in the wild-type proteins is scored well, on average (within the top 6% of the reasonable pairs of residues that could form a disulfide bond) while 6 out of the 15 artificial stabilizing disulfides scored within the top 13% of ranked predictions. Overall, this suggests that the physics-based approach presented here can be useful for triaging possible pairs of mutations for disulfide bond formation to improve protein stability.

  18. Direct data-based model predictive control with applications to structures, robotic swarms, and aircraft

    NASA Astrophysics Data System (ADS)

    Barlow, Jonathan S.

    A direct method to design data-based model predictive controllers is presented. The design method uses system identification techniques to identify model predictive controller gains directly from a set of excitation input and disturbance-corrupted output. The design is direct in that the controller gains can be designed directly from input and disturbance-corrupted output data without an intermediate identification step. The direct design is simpler than previous two-step designs and reduces computation time for the design of the controller. The direct design also enables an adaptive implementation capable of identifying controller gains online. The direct data-based controllers can be used for vibration suppression, disturbance rejection, and tracking, and are applied to structures, robot swarms and aircraft. For the cases of vibration suppression and disturbance rejection, the data-based controller has the advantage that any disturbances present in the design data are automatically rejected without needing to know the details of the disturbances. For the case of robot swarms, extensions are made for formation control and obstacle avoidance, and the controller can be implemented as a decentralized controller in real time and in parallel on individual vehicles with communication limited to past input and past output data. A formulation for improving the robustness of the controller to parametric variations is also developed. Finally, the adaptive implementation is shown to be useful for the control of linear time-varying systems and has been successfully implemented to control a linear time-varying model of a Cruise Efficient Short Take-Off and Landing (CESTOL) type aircraft.

  19. Performance of protein-structure predictions with the physics-based UNRES force field in CASP11.

    PubMed

    Krupa, Paweł; Mozolewska, Magdalena A; Wiśniewska, Marta; Yin, Yanping; He, Yi; Sieradzan, Adam K; Ganzynkowicz, Robert; Lipska, Agnieszka G; Karczyńska, Agnieszka; Ślusarz, Magdalena; Ślusarz, Rafał; Giełdoń, Artur; Czaplewski, Cezary; Jagieła, Dawid; Zaborowski, Bartłomiej; Scheraga, Harold A; Liwo, Adam

    2016-11-01

    Participating as the Cornell-Gdansk group, we have used our physics-based coarse-grained UNited RESidue (UNRES) force field to predict protein structure in the 11th Community Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction (CASP11). Our methodology involved extensive multiplexed replica exchange simulations of the target proteins with a recently improved UNRES force field to provide better reproductions of the local structures of polypeptide chains. All simulations were started from fully extended polypeptide chains, and no external information was included in the simulation process except for weak restraints on secondary structure to enable us to finish each prediction within the allowed 3-week time window. Because of simplified UNRES representation of polypeptide chains, use of enhanced sampling methods, code optimization and parallelization and sufficient computational resources, we were able to treat, for the first time, all 55 human prediction targets with sizes from 44 to 595 amino acid residues, the average size being 251 residues. Complete structures of six single-domain proteins were predicted accurately, with the highest accuracy being attained for the T0769, for which the CαRMSD was 3.8 Å for 97 residues of the experimental structure. Correct structures were also predicted for 13 domains of multi-domain proteins with accuracy comparable to that of the best template-based modeling methods. With further improvements of the UNRES force field that are now underway, our physics-based coarse-grained approach to protein-structure prediction will eventually reach global prediction capacity and, consequently, reliability in simulating protein structure and dynamics that are important in biochemical processes. Freely available on the web at http://www.unres.pl/ CONTACT: has5@cornell.edu. © The Author 2016. Published by Oxford University Press. All rights reserved. 
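
    The replica exchange simulations mentioned above rely on periodically attempting to swap configurations between replicas at different temperatures. The sketch below shows only the standard Metropolis swap criterion for two replicas; it does not model the multiplexed variant or the UNRES force field, and the energies, temperatures, and function names are hypothetical.

```python
import math
import random

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)

def swap_probability(energy_i, temp_i, energy_j, temp_j):
    """Metropolis probability of exchanging the configurations of replicas i and j."""
    beta_i = 1.0 / (K_B * temp_i)
    beta_j = 1.0 / (K_B * temp_j)
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0 else math.exp(delta)

def attempt_swap(energy_i, temp_i, energy_j, temp_j, rng):
    """Return True if the swap is accepted."""
    return rng.random() < swap_probability(energy_i, temp_i, energy_j, temp_j)

# Hypothetical energies (kcal/mol) for a cold (300 K) and a hot (320 K) replica.
p_favorable = swap_probability(-100.0, 300.0, -105.0, 320.0)    # hot replica found a lower energy
p_unfavorable = swap_probability(-105.0, 300.0, -100.0, 320.0)  # cold replica already lower
accepted = attempt_swap(-100.0, 300.0, -105.0, 320.0, random.Random(0))
print(p_favorable, p_unfavorable, accepted)
```

    Swaps that move a lower-energy configuration to the colder replica are always accepted, while the reverse is accepted only with a probability that shrinks as the energy gap or the temperature gap grows.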

  20. Ligand and structure-based methodologies for the prediction of the activity of G protein-coupled receptor ligands

    NASA Astrophysics Data System (ADS)

    Costanzi, Stefano; Tikhonova, Irina G.; Harden, T. Kendall; Jacobson, Kenneth A.

    2009-11-01

    Accurate in silico models for the quantitative prediction of the activity of G protein-coupled receptor (GPCR) ligands would greatly facilitate the process of drug discovery and development. Several methodologies have been developed based on the properties of the ligands, the direct study of the receptor-ligand interactions, or a combination of both approaches. Ligand-based three-dimensional quantitative structure-activity relationships (3D-QSAR) techniques, not requiring knowledge of the receptor structure, have been historically the first to be applied to the prediction of the activity of GPCR ligands. They are generally endowed with robustness and good ranking ability; however they are highly dependent on training sets. Structure-based techniques generally do not provide the level of accuracy necessary to yield meaningful rankings when applied to GPCR homology models. However, they are essentially independent from training sets and have a sufficient level of accuracy to allow an effective discrimination between binders and nonbinders, thus qualifying as viable lead discovery tools. The combination of ligand and structure-based methodologies in the form of receptor-based 3D-QSAR and ligand and structure-based consensus models results in robust and accurate quantitative predictions. The contribution of the structure-based component to these combined approaches is expected to become more substantial and effective in the future, as more sophisticated scoring functions are developed and more detailed structural information on GPCRs is gathered.

  1. Ligand and Structure-based Methodologies for the Prediction of the Activity of G Protein-Coupled Receptor Ligands

    PubMed Central

    Costanzi, Stefano; Tikhonova, Irina G.; Harden, T. Kendall; Jacobson, Kenneth A.

    2008-01-01

    Accurate in silico models for the quantitative prediction of the activity of G protein-coupled receptor (GPCR) ligands would greatly facilitate the process of drug discovery and development. Several methodologies have been developed based on the properties of the ligands, the direct study of the receptor-ligand interactions, or a combination of both approaches. Ligand-based three-dimensional quantitative structure-activity relationships (3D-QSAR) techniques, not requiring knowledge of the receptor structure, have been historically the first to be applied to the prediction of the activity of GPCR ligands. They are generally endowed with robustness and good ranking ability; however they are highly dependent on training sets. Structure-based techniques generally do not provide the level of accuracy necessary to yield meaningful rankings when applied to GPCR homology models. However, they are essentially independent from training sets and have a sufficient level of accuracy to allow an effective discrimination between binders and nonbinders, thus qualifying as viable lead discovery tools. The combination of ligand and structure-based methodologies in the form of receptor-based 3D-QSAR and ligand and structure-based consensus models results in robust and accurate quantitative predictions. The contribution of the structure-based component to these combined approaches is expected to become more substantial and effective in the future, as more sophisticated scoring functions are developed and more detailed structural information on GPCRs is gathered. PMID:18483766

  2. Thermodynamic Properties of Asphaltenes: A Predictive Approach Based On Computer Assisted Structure Elucidation and Atomistic Simulations

    SciTech Connect

    Diallo, Mamadou S.; Cagin, Tahir; Faulon, Jean Loup; Goddard, William A.

    2000-08-01

    The authors describe a new methodology for predicting the thermodynamic properties of petroleum geomacromolecules (asphaltenes and resins). This methodology combines computer assisted structure elucidation (CASE) with atomistic simulations (molecular mechanics and molecular dynamics and statistical mechanics). They use quantitative and qualitative structural data as input to a CASE program (SIGNATURE) to generate a sample of ten asphaltene model structures for a Saudi crude oil (Arab Berri). MM calculations and MD simulations are used to estimate selected volumetric and thermal properties of the model structures.

  3. Fast assessment of structural models of ion channels based on their predicted current-voltage characteristics.

    PubMed

    Dyrka, Witold; Kurczyńska, Monika; Konopka, Bogumił M; Kotulska, Małgorzata

    2016-02-01

    Computational prediction of protein structures is a difficult task, which involves fast and accurate evaluation of candidate model structures. We propose to enhance single-model quality assessment with a functionality evaluation phase for proteins whose quantitative functional characteristics are known. In particular, this idea can be applied to evaluation of structural models of ion channels, whose main function - conducting ions - can be quantitatively measured with the patch-clamp technique providing the current-voltage characteristics. The study was performed on a set of KcsA channel models obtained from complete and incomplete contact maps. A fast continuous electrodiffusion model was used for calculating the current-voltage characteristics of structural models. We found that the computed charge selectivity and total current were sensitive to structural and electrostatic quality of models. In practical terms, we show that evaluating predicted conductance values is an appropriate method to eliminate models with an occluded pore or with multiple erroneously created pores. Moreover, filtering models on the basis of their predicted charge selectivity results in a substantial enrichment of the candidate set in highly accurate models. Tests on three other ion channels indicate that, in addition to being a proof of the concept, our function-oriented single-model quality assessment method can be directly applied to evaluation of structural models of some classes of protein channels. Finally, our work raises an important question whether a computational validation of functionality should be included in the evaluation process of structural models, whenever possible. © 2015 Wiley Periodicals, Inc.

  4. Combined computational metabolite prediction and automated structure-based analysis of mass spectrometric data.

    PubMed

    Stranz, David D; Miao, Shichang; Campbell, Scott; Maydwell, George; Ekins, Sean

    2008-01-01

    As high-throughput technologies have developed in the pharmaceutical industry, the demand for identification of possible metabolites using predominantly liquid chromatographic/mass spectrometry-mass spectrometry/mass spectrometry (LC/MS-MS/MS) for a large number of molecules in drug discovery has also increased. In parallel, computational technologies have also been developed to generate predictions for metabolites alongside methods to predict MS spectra and score the quality of the match with experimental spectra. The goal of the current study was to generate metabolite predictions from molecular structure with a software product, MetaDrug. In vitro microsomal incubations were used to ultimately produce MS data that could be used to verify the predictions with Apex, which is a new software tool that can predict the molecular ion spectrum and a fragmentation spectrum, automating the detailed examination of both MS and MS/MS spectra. For the test molecule imipramine used to illustrate the combined in vitro/in silico process proposed, MetaDrug predicts 16 metabolites. Following rat microsomal incubations with imipramine and analysis of the MS(n) data using the Apex software, strong evidence was found for imipramine and five metabolites and weaker evidence for five additional metabolites. This study suggests a new approach to streamline MS data analysis using a combination of predictive computational approaches with software capable of comparing the predicted metabolite output with empirical data when looking at drug metabolites.

  5. Prediction of Protein Structural Classes for Low-Similarity Sequences Based on Consensus Sequence and Segmented PSSM.

    PubMed

    Liang, Yunyun; Liu, Sanyang; Zhang, Shengli

    2015-01-01

    Prediction of protein structural classes for low-similarity sequences is useful for understanding fold patterns, regulation, functions, and interactions of proteins. It is well known that feature extraction is important for the prediction of protein structural class, and it mainly draws on the protein primary sequence, the predicted secondary structure sequence, and the position-specific scoring matrix (PSSM). Currently, prediction based solely on the PSSM has played a key role in improving prediction accuracy. In this paper, we propose a novel method called CSP-SegPseP-SegACP that fuses consensus sequence (CS), segmented PsePSSM, and segmented autocovariance transformation (ACT) based on PSSM. Three widely used low-similarity datasets (1189, 25PDB, and 640) are adopted in this paper. A 700-dimensional (700D) feature vector is constructed, and its dimension is reduced to 224D by principal component analysis (PCA). To verify the performance of our method, rigorous jackknife cross-validation tests are performed on the 1189, 25PDB, and 640 datasets. Comparison of our results with existing PSSM-based methods demonstrates that our method achieves favorable and competitive performance. It offers an important complement to other PSSM-based methods for prediction of protein structural classes for low-similarity sequences.
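
    As a rough illustration of the autocovariance transformation (ACT) named in this abstract, the sketch below computes lag-based autocovariance features for each column of a PSSM. The formula is one commonly used definition and is an assumption here: the abstract does not give the exact form, and the segmentation step of the proposed method is not reproduced. The function name and the toy matrix are hypothetical.

```python
from typing import List


def autocovariance_features(pssm: List[List[float]], max_lag: int) -> List[float]:
    """Autocovariance of each PSSM column at lags 1..max_lag.

    pssm is a list of rows (one per residue); each row holds one score per
    column. The result has len(pssm[0]) * max_lag entries, ordered by column
    and then by lag. This is an assumed, commonly used ACT definition.
    """
    length = len(pssm)
    if length == 0 or max_lag < 1 or max_lag >= length:
        raise ValueError("need 1 <= max_lag < number of residues")
    n_cols = len(pssm[0])
    # Column means over the whole sequence.
    means = [sum(row[j] for row in pssm) / length for j in range(n_cols)]
    features: List[float] = []
    for j in range(n_cols):
        for lag in range(1, max_lag + 1):
            total = sum(
                (pssm[i][j] - means[j]) * (pssm[i + lag][j] - means[j])
                for i in range(length - lag)
            )
            features.append(total / (length - lag))
    return features


if __name__ == "__main__":
    # Toy 4-residue, 2-column matrix (a real PSSM has 20 columns).
    toy = [[1.0, 0.0], [2.0, 0.0], [3.0, 0.0], [4.0, 0.0]]
    print(autocovariance_features(toy, max_lag=2))
```

    With a 20-column PSSM, this yields 20 × max_lag features per sequence, which is the kind of fixed-length vector that can then be concatenated with other features and reduced by PCA.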

  6. Prediction of protein secondary structure based on residue pair types and conformational states using dynamic programming algorithm.

    PubMed

    Sadeghi, Mehdi; Parto, Sahar; Arab, Shahriar; Ranjbar, Bijan

    2005-06-20

    We have used a statistical approach for protein secondary structure prediction based on information theory, simultaneously taking into consideration pairwise residue types and conformational states. Because predicting residue secondary structure by sliding a one-residue window creates ambiguity in state prediction, we used a dynamic programming algorithm to find the path with the maximum score. A scoring system for residue pairs in particular conformations is derived for neighboring residues up to ten residues apart in sequence. The three-state overall per-residue accuracy, Q3, of this method in a jackknife test with a dataset created from PDBSELECT is more than 70%.
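
    The maximum-score path search mentioned in this abstract can be sketched as a Viterbi-style dynamic program over the three states. This is a simplified illustration, not the authors' scoring scheme: it assumes only per-residue scores and adjacent-pair scores (the abstract describes pairs up to ten residues apart), and all score values and names are hypothetical.

```python
from typing import Dict, List, Tuple

STATES = ("H", "E", "C")  # helix, strand, coil


def best_state_path(
    site_scores: List[Dict[str, float]],
    pair_scores: Dict[Tuple[str, str], float],
) -> Tuple[float, str]:
    """Return the maximum total score and the state string achieving it.

    site_scores[i][s]   : hypothetical score for residue i being in state s
    pair_scores[(a, b)] : hypothetical score for state a followed by state b
    """
    if not site_scores:
        return 0.0, ""
    # best[s] = best score of any path that ends in state s at this residue.
    best = {s: site_scores[0][s] for s in STATES}
    back: List[Dict[str, str]] = []
    for scores in site_scores[1:]:
        new_best: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for s in STATES:
            prev = max(STATES, key=lambda p: best[p] + pair_scores[(p, s)])
            new_best[s] = best[prev] + pair_scores[(prev, s)] + scores[s]
            pointers[s] = prev
        best = new_best
        back.append(pointers)
    # Trace back from the best final state.
    last = max(STATES, key=lambda s: best[s])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return best[last], "".join(reversed(path))


if __name__ == "__main__":
    # Toy scores: staying in the same state is rewarded, switching is penalized.
    pairs = {(a, b): (1.0 if a == b else -1.0) for a in STATES for b in STATES}
    sites = [
        {"H": 2.0, "E": 0.0, "C": 0.0},
        {"H": 0.5, "E": 1.0, "C": 0.0},
        {"H": 2.0, "E": 0.0, "C": 0.0},
        {"H": 0.0, "E": 0.0, "C": 3.0},
    ]
    print(best_state_path(sites, pairs))
```

    In the toy input, the second residue on its own would favor strand, but the pair scores make the all-helix run ending in coil the best overall path. This is the kind of ambiguity a per-residue sliding window cannot resolve.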

  7. Protein Tertiary Structure Prediction Based on Main Chain Angle Using a Hybrid Bees Colony Optimization Algorithm

    NASA Astrophysics Data System (ADS)

    Mahmood, Zakaria N.; Mahmuddin, Massudi; Mahmood, Mohammed Nooraldeen

    Encoding the amino acid sequences of proteins so that they can be classified into their respective families and subfamilies is an important research area. However, for a given protein, the exact action, whether hormonal, enzymatic, transmembrane or nuclear receptor, does not depend solely on the amino acid sequence but also on the way the amino acid chain folds. This study provides a prototype system that is able to predict a protein's tertiary structure. Several methods are used to develop and evaluate the system in order to improve accuracy in protein 3D structure prediction. The Bees Optimization algorithm, which is inspired by the food-foraging behavior of honey bees, is used in the searching phase. In this study, the experiment is conducted on short-sequence proteins that have been used in previous research with well-known tools. The proposed approach shows a promising result.

  8. Predictions of Crystal Structure Based on Radius Ratio: How Reliable Are They?

    ERIC Educational Resources Information Center

    Nathan, Lawrence C.

    1985-01-01

    Discussion of crystalline solids in undergraduate curricula often includes the use of radius ratio rules as a method for predicting which type of crystal structure is likely to be adopted by a given ionic compound. Examines this topic, establishing more definitive guidelines for the use and reliability of the rules. (JN)
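
    For readers unfamiliar with the radius ratio rules, the sketch below maps a cation/anion radius ratio to the textbook limiting coordination geometry. The thresholds are the usual hard-sphere limiting ratios; the function name and the approximate ionic radii in the example are assumptions added for illustration. As the abstract notes, the reliability of such predictions is limited, so the output is a rule-of-thumb guess only.

```python
from typing import Tuple

# (minimum radius ratio, coordination number, geometry), largest threshold first.
# These are the usual textbook limiting ratios for hard-sphere packing.
RADIUS_RATIO_LIMITS = [
    (0.732, 8, "cubic"),
    (0.414, 6, "octahedral"),
    (0.225, 4, "tetrahedral"),
    (0.155, 3, "trigonal planar"),
    (0.0, 2, "linear"),
]


def predict_coordination(r_cation: float, r_anion: float) -> Tuple[float, int, str]:
    """Return (ratio, coordination number, geometry) from the radius ratio rules."""
    if r_cation <= 0 or r_anion <= 0:
        raise ValueError("ionic radii must be positive")
    ratio = r_cation / r_anion
    for lower, coordination, geometry in RADIUS_RATIO_LIMITS:
        if ratio >= lower:
            return ratio, coordination, geometry
    raise AssertionError("unreachable: the last threshold is 0.0")


if __name__ == "__main__":
    # Approximate ionic radii in picometres, for illustration only.
    for name, rc, ra in [("NaCl", 102, 181), ("CsCl", 167, 181), ("ZnS", 74, 184)]:
        ratio, cn, geometry = predict_coordination(rc, ra)
        print(f"{name}: ratio={ratio:.3f} -> CN {cn} ({geometry})")
```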

  10. Local structure based method for prediction of the biochemical function of proteins: Applications to glycoside hydrolases.

    PubMed

    Parasuram, Ramya; Mills, Caitlyn L; Wang, Zhouxi; Somasundaram, Saroja; Beuning, Penny J; Ondrechen, Mary Jo

    2016-01-15

    Thousands of protein structures of unknown or uncertain function have been reported as a result of high-throughput structure determination techniques developed by Structural Genomics (SG) projects. However, many of the putative functional assignments of these SG proteins in the Protein Data Bank (PDB) are incorrect. While high-throughput biochemical screening techniques have provided valuable functional information for limited sets of SG proteins, the biochemical functions for most SG proteins are still unknown or uncertain. Therefore, computational methods for the reliable prediction of protein function from structure can add tremendous value to the existing SG data. In this article, we show how computational methods may be used to predict the function of SG proteins, using examples from the six-hairpin glycosidase (6-HG) and the concanavalin A-like lectin/glucanase (CAL/G) superfamilies. Using a set of predicted functional residues, obtained from computed electrostatic and chemical properties for each protein structure, it is shown that these superfamilies may be sorted into functional families according to biochemical function. Within these superfamilies, a total of 18 SG proteins were analyzed according to their predicted, local functional sites: 13 from the 6-HG superfamily, five from the CAL/G superfamily. Within the 6-HG superfamily, an uncharacterized protein BACOVA_03626 from Bacteroides ovatus (PDB 3ON6) and a hypothetical protein BT3781 from Bacteroides thetaiotaomicron (PDB 2P0V) are shown to have very strong active site matches with exo-α-1,6-mannosidases, thus likely possessing this function. Also in this superfamily, it is shown that protein BH0842, a putative glycoside hydrolase from Bacillus halodurans (PDB 2RDY), has a predicted active site that matches well with a known α-L-galactosidase. In the CAL/G superfamily, an uncharacterized glycosyl hydrolase family 16 protein from Mycobacterium smegmatis (PDB 3RQ0) is shown to have local structural

  11. Fine-grained parallelism accelerating for RNA secondary structure prediction with pseudoknots based on FPGA.

    PubMed

    Xia, Fei; Jin, Guoqing

    2014-06-01

    PKNOTS is a well-known benchmark program that has been widely used to predict RNA secondary structure including pseudoknots. It adopts the standard four-dimensional (4D) dynamic programming (DP) method and is the basis of many variants and improved algorithms. Unfortunately, its O(N^6) computing requirements and complicated data dependencies greatly limit the usefulness of the PKNOTS package as gene database sizes explode. In this paper, we present a fine-grained parallel PKNOTS package and prototype system for accelerating the RNA folding application on an FPGA chip. We adopted a series of storage optimization strategies to resolve the "Memory Wall" problem. We aggressively exploit parallel computing strategies to improve computational efficiency. We also propose several methods that collectively reduce the storage requirements for FPGA on-chip memory. To the best of our knowledge, our design is the first FPGA implementation for accelerating the 4D DP problem for RNA folding applications including pseudoknots. The experimental results show an average speedup of more than 50x over the PKNOTS-1.08 software running on a PC platform with an Intel Core2 Q9400 Quad CPU for input RNA sequences. Moreover, the power consumption of our FPGA accelerator is only about 50% of that of the general-purpose microprocessors.

  12. Guided macro-mutation in a graded energy based genetic algorithm for protein structure prediction.

    PubMed

    Rashid, Mahmood A; Iqbal, Sumaiya; Khatib, Firas; Hoque, Md Tamjidul; Sattar, Abdul

    2016-04-01

    Protein structure prediction is considered one of the most challenging and computationally intractable combinatorial problems. Thus, the efficient modeling of a convoluted search space, the clever use of energy functions, and more importantly, the use of effective sampling algorithms become crucial to address this problem. For protein structure modeling, an off-lattice model provides limited scope to exercise and evaluate algorithmic developments because of its astronomically large set of data points. In contrast, an on-lattice model widens the scope and permits studying relatively larger proteins because of its finite set of data points. In this work, we took full advantage of an on-lattice model by using a face-centered-cubic lattice, which has the highest packing density with the maximum degree of freedom. We proposed a genetic algorithm (GA) for conformational search based on a graded energy that strategically mixes the Miyazawa-Jernigan (MJ) energy with the hydrophobic-polar (HP) energy. In our application, we introduced a 2 × 2 HP energy guided macro-mutation operator within the GA to explore the best possible local changes exhaustively. Conversely, the 20 × 20 MJ energy model, the ultimate objective function of our GA that needs to be minimized, considers the interactions among the 20 different amino acids and allows searching for globally acceptable conformations. On a set of benchmark proteins, our proposed approach outperformed state-of-the-art approaches in terms of free energy levels and root-mean-square deviations.
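
    To make the on-lattice HP energy concrete, the sketch below scores a conformation on a face-centered-cubic lattice, where each site has 12 nearest neighbors (offsets that are permutations of (±1, ±1, 0)). It uses the standard HP convention of −1 per contact between non-consecutive hydrophobic residues. The MJ energy, the graded mixing, and the genetic algorithm itself are not shown, and the function names and toy conformations are hypothetical.

```python
from itertools import combinations
from typing import List, Tuple

Point = Tuple[int, int, int]


def is_fcc_neighbor(a: Point, b: Point) -> bool:
    """True if b is one of the 12 nearest FCC neighbors of a."""
    return sorted(abs(x - y) for x, y in zip(a, b)) == [0, 1, 1]


def hp_energy(sequence: str, coords: List[Point]) -> int:
    """HP-model energy: -1 for each H-H contact between non-consecutive residues."""
    if len(sequence) != len(coords):
        raise ValueError("sequence and coordinates differ in length")
    if len(set(coords)) != len(coords):
        raise ValueError("conformation is not self-avoiding")
    # Consecutive residues must occupy neighboring lattice sites.
    for a, b in zip(coords, coords[1:]):
        if not is_fcc_neighbor(a, b):
            raise ValueError("chain is not connected on the FCC lattice")
    energy = 0
    for i, j in combinations(range(len(sequence)), 2):
        if (
            j - i > 1
            and sequence[i] == "H"
            and sequence[j] == "H"
            and is_fcc_neighbor(coords[i], coords[j])
        ):
            energy -= 1
    return energy


if __name__ == "__main__":
    # Four sites forming a tetrahedron: every pair of sites is in contact.
    tetra = [(0, 0, 0), (1, 1, 0), (1, 0, 1), (0, 1, 1)]
    print(hp_energy("HHHH", tetra))
    print(hp_energy("HPHP", tetra))
```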

  13. Structural MRI-Based Predictions in Patients with Treatment-Refractory Depression (TRD)

    PubMed Central

    Johnston, Blair A.; Steele, J. Douglas; Tolomeo, Serenella; Christmas, David; Matthews, Keith

    2015-01-01

    The application of machine learning techniques to psychiatric neuroimaging offers the possibility to identify robust, reliable and objective disease biomarkers both within and between contemporary syndromal diagnoses that could guide routine clinical practice. The use of quantitative methods to identify psychiatric biomarkers is consequently important, particularly with a view to making predictions relevant to individual patients, rather than at a group-level. Here, we describe predictions of treatment-refractory depression (TRD) diagnosis using structural T1-weighted brain scans obtained from twenty adult participants with TRD and 21 never depressed controls. We report 85% accuracy of individual subject diagnostic prediction. Using an automated feature selection method, the major brain regions supporting this significant classification were in the caudate, insula, habenula and periventricular grey matter. It was not, however, possible to predict the degree of ‘treatment resistance’ in individual patients, at least as quantified by the Massachusetts General Hospital (MGH-S) clinical staging method; but the insula was again identified as a region of interest. Structural brain imaging data alone can be used to predict diagnostic status, but not MGH-S staging, with a high degree of accuracy in patients with TRD. PMID:26186455

  14. Evaluation of machine learning algorithms for treatment outcome prediction in patients with epilepsy based on structural connectome data.

    PubMed

    Munsell, Brent C; Wee, Chong-Yaw; Keller, Simon S; Weber, Bernd; Elger, Christian; da Silva, Laura Angelica Tomaz; Nesland, Travis; Styner, Martin; Shen, Dinggang; Bonilha, Leonardo

    2015-09-01

    The objective of this study is to evaluate machine learning algorithms aimed at predicting surgical treatment outcomes in groups of patients with temporal lobe epilepsy (TLE) using only the structural brain connectome. Specifically, the brain connectome is reconstructed using white matter fiber tracts from presurgical diffusion tensor imaging. To achieve our objective, a two-stage connectome-based prediction framework is developed that gradually selects a small number of abnormal network connections that contribute to the surgical treatment outcome, and in each stage a linear kernel operation is used to further improve the accuracy of the learned classifier. Using a 10-fold cross validation strategy, the first stage in the connectome-based framework is able to separate patients with TLE from normal controls with 80% accuracy, and second stage in the connectome-based framework is able to correctly predict the surgical treatment outcome of patients with TLE with 70% accuracy. Compared to existing state-of-the-art methods that use VBM data, the proposed two-stage connectome-based prediction framework is a suitable alternative with comparable prediction performance. Our results additionally show that machine learning algorithms that exclusively use structural connectome data can predict treatment outcomes in epilepsy with similar accuracy compared with "expert-based" clinical decision. In summary, using the unprecedented information provided in the brain connectome, machine learning algorithms may uncover pathological changes in brain network organization and improve outcome forecasting in the context of epilepsy.

  15. Struct2Net: a web service to predict protein-protein interactions using a structure-based approach.

    PubMed

    Singh, Rohit; Park, Daniel; Xu, Jinbo; Hosur, Raghavendra; Berger, Bonnie

    2010-07-01

    Struct2Net is a web server for predicting interactions between arbitrary protein pairs using a structure-based approach. Prediction of protein-protein interactions (PPIs) is a central area of interest and successful prediction would provide leads for experiments and drug design; however, the experimental coverage of the PPI interactome remains inadequate. We believe that Struct2Net is the first community-wide resource to provide structure-based PPI predictions that go beyond homology modeling. Also, most web-resources for predicting PPIs currently rely on functional genomic data (e.g. GO annotation, gene expression, cellular localization, etc.). Our structure-based approach is independent of such methods and only requires the sequence information of the proteins being queried. The web service allows multiple querying options, aimed at maximizing flexibility. For the most commonly studied organisms (fly, human and yeast), predictions have been pre-computed and can be retrieved almost instantaneously. For proteins from other species, users have the option of getting a quick-but-approximate result (using orthology over pre-computed results) or having a full-blown computation performed. The web service is freely available at http://struct2net.csail.mit.edu.

  16. Damage Prediction and Estimation in Structural Mechanics Based on Data Mining

    SciTech Connect

    Sandhu, S S; Kanapady, R; Tamma, K K; Kamath, C; Kumar, V

    2001-07-23

    Damage in a material includes localized softening or cracks in a structural component due to high operational loads, or the presence of flaws in a structure due to various manufacturing processes. Methods that identify the presence, the location and the severity of damage in the structure are useful for non-destructive evaluation procedures that are typically employed in agile manufacturing and rapid prototyping systems. The current state-of-the-art techniques for these inverse problems are computationally intensive or ill-conditioned when insufficient data exist. Early work by a number of researchers has shown that data mining techniques can provide a potential solution to this problem. In this paper, they investigate the use of data mining techniques for predicting failure in a variety of 2D and 3D structures using artificial neural networks (ANNs) and decision trees. This work shows that if the correct features are chosen to build the model, and the model is trained on an adequate amount of data, the model can then correctly classify the failure event as well as predict location and severity of the damage in these structures.

  17. Predictive Methods for Dense Polymer Networks: Combating Bias with Bio-Based Structures

    DTIC Science & Technology

    2016-03-16

    Architectural Bias • Comparison of Petroleum-Based and Bio-Based Chemical Architectures • Continuing Research on Structure-Property Relationships using...inexpensive supply of phenol from petroleum refining. Consequently, certain architectures are over-represented in a “random” sample of network-forming...A much wider array of structures can be accessed from these sources than is available from refining petroleum

  18. Evaluation of machine learning algorithms for treatment outcome prediction in patients with epilepsy based on structural connectome data

    PubMed Central

    Munsell, Brent C.; Wee, Chong-Yaw; Keller, Simon S.; Weber, Bernd; Elger, Christian; da Silva, Laura Angelica Tomaz; Nesland, Travis; Styner, Martin; Shen, Dinggang; Bonilha, Leonardo

    2015-01-01

    The objective of this study is to evaluate machine learning algorithms aimed at predicting surgical treatment outcomes in groups of patients with temporal lobe epilepsy (TLE) using only the structural brain connectome. Specifically, the brain connectome is reconstructed using white matter fiber tracts from presurgical diffusion tensor imaging. To achieve our objective, a two-stage connectome-based prediction framework is developed that gradually selects a small number of abnormal network connections that contribute to the surgical treatment outcome, and in each stage a linear kernel operation is used to further improve the accuracy of the learned classifier. Using a 10-fold cross validation strategy, the first stage in the connectome-based framework is able to separate patients with TLE from normal controls with 80% accuracy, and second stage in the connectome-based framework is able to correctly predict the surgical treatment outcome of patients with TLE with 70% accuracy. Compared to existing state-of-the-art methods that use VBM data, the proposed two-stage connectome-based prediction framework is a suitable alternative with comparable prediction performance. Our results additionally show that machine learning algorithms that exclusively use structural connectome data can predict treatment outcomes in epilepsy with similar accuracy compared with “expert-based” clinical decision. In summary, using the unprecedented information provided in the brain connectome, machine learning algorithms may uncover pathological changes in brain network organization and improve outcome forecasting in the context of epilepsy. PMID:26054876

  19. beta Structure of aqueous staphylococcal enterotoxin B by spectropolarimetry and sequence-based conformational predictions.

    PubMed

    Muñoz, P A; Warren, J R; Noelken, M E

    1976-10-19

    Conformations of the globular protein staphylococcal enterotoxin B have been examined experimentally by ultraviolet circular dichroism (CD) and visible optical rotatory dispersion (ORD). Chen-Yang-Chau analysis (Chen, Y.-H., Yang, J.T., and Chau, K. H. (1974), Biochemistry 13, 3350) of the far-ultraviolet CD spectrum of native enterotoxin B revealed (assuming an average helix length of 11 residues) 9% alpha helix, 38% beta structure, and 53% random coil. A fourfold increase in alpha-helix was observed for enterotoxin exposed to 0.2% sodium dodecyl sulfate, behavior typical for globular proteins of low helical content. Values of -40 to -50 for the Moffitt-Yang parameter b0 calculated from visible ORD suggested 6-13% alpha helix in native enterotoxin. Application of a new predictive model (Chou, P. Y., and Fasman, G. D. (1974), Biochemistry 13, 222) to the amino acid sequence of enterotoxin B indicated 11% alpha helix, 34% beta structure, and 55% coil in native enterotoxin. The excellent agreement for the amount of alpha and beta conformation utilizing different optical and predictive methods indicates beta structure as the dominant secondary structure in native enterotoxin B. Most of the beta structure is predicted by Chou-Fasman analysis to reside in two large regions of antiparallel beta sheet involving residues 81-148 and residues 184-217. Such highly cooperative regions of anti-parallel beta sheet account for the slow unfolding of enterotoxin B in concentrated guanidine hydrochloride and rapid folding of guanidine hydrochloride denatured enterotoxin B to native conformation(s) (Warren, J.R., Spero, L., and Metzger, J. F. (1974), Biochemistry 13, 1678). A more than twofold increase in alpha-helix content with a small diminution in beta structure was detected by CD and ORD upon acidification of aqueous enterotoxin to pH 2.5. Thus, the beta structure of enterotoxin B appears to resist isothermal denaturation and constitutes a stable interior core of structure in the

  20. Structure-based prediction of free energy changes of binding of PTP1B inhibitors

    NASA Astrophysics Data System (ADS)

    Wang, Jing; Ling Chan, Shek; Ramnarayan, Kal

    2003-08-01

    The goals were (1) to understand the driving forces in the binding of small molecule inhibitors to the active site of PTP1B and (2) to develop a molecular mechanics-based empirical free energy function for compound potency prediction. A set of compounds with known activities was docked onto the active site. The related energy components and molecular surface areas were calculated. The bridging water molecules were identified and their contributions were considered. Linear relationships were explored between the above terms and the binding free energies of compounds derived based on experimental inhibition constants. We found that minimally three terms are required to give rise to a good correlation (0.86) with predictive power in five-group cross-validation test (q² = 0.70). The dominant terms are the electrostatic energy and non-electrostatic energy stemming from the intra- and intermolecular interactions of solutes and from those of bridging water molecules in complexes.
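
    The cross-validated q² quoted in this abstract can be illustrated with a small sketch. It assumes the common definition q² = 1 − PRESS / SS_total, where PRESS sums squared errors of predictions made for held-out groups. For brevity it fits a single hypothetical descriptor by ordinary least squares rather than the three energy terms of the study, and the group assignment (index modulo the number of groups) and the data are made up.

```python
from typing import List, Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares slope and intercept for one descriptor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("descriptor has no variance in the training set")
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    return slope, mean_y - slope * mean_x


def q2_grouped_cv(x: Sequence[float], y: Sequence[float], n_groups: int = 5) -> float:
    """Cross-validated q^2 with samples assigned to groups by index modulo n_groups."""
    n = len(x)
    press = 0.0  # predictive residual sum of squares over all held-out samples
    for group in range(n_groups):
        train = [i for i in range(n) if i % n_groups != group]
        held_out = [i for i in range(n) if i % n_groups == group]
        slope, intercept = fit_line([x[i] for i in train], [y[i] for i in train])
        press += sum((y[i] - (slope * x[i] + intercept)) ** 2 for i in held_out)
    mean_y = sum(y) / n
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - press / ss_total


if __name__ == "__main__":
    # Made-up descriptor values and "binding free energies" with alternating noise.
    xs: List[float] = [float(i) for i in range(10)]
    ys = [2.0 * xi + 1.0 + (0.5 if i % 2 else -0.5) for i, xi in enumerate(xs)]
    print(round(q2_grouped_cv(xs, ys), 3))
```

    A q² close to the fitted correlation suggests the model generalizes to held-out compounds; a much lower q² would point to overfitting.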

  1. Rigorous assessment and integration of the sequence and structure based features to predict hot spots

    PubMed Central

    2011-01-01

    Background: Systematic mutagenesis studies have shown that only a few interface residues, termed hot spots, contribute significantly to the binding free energy of protein-protein interactions. Therefore, hot spot prediction has become increasingly important for understanding the essence of protein interactions and for narrowing down the search space for drug design. Many computational methods have been developed that propose different features. However, a comparative assessment of these features, as well as more effective and accurate methods, is still urgently needed. Results: In this study, we first comprehensively collect the features used to discriminate hot spots from non-hot spots and analyze their distributions. We find that hot spots have lower relASA and larger relative change in ASA, suggesting that hot spots tend to be protected from bulk solvent. In addition, hot spots have more contacts, including hydrogen bonds, salt bridges, and atomic contacts, which favor complex formation. Interestingly, we find that conservation score and sequence entropy are not significantly different between hot spots and non-hot spots in the Ab+ dataset (all complexes). In the Ab- dataset (antigen-antibody complexes excluded), however, these two features differ significantly between hot spots and non-hot spots. Secondly, we explore the predictive ability of each feature and of combinations of features by support vector machines (SVMs). The results indicate that the sequence-based feature outperforms other combinations of features with reasonable accuracy, with a precision of 0.69, a recall of 0.68, an F1 score of 0.68, and an AUC of 0.68 on an independent test set. Compared with other machine learning methods and two energy-based approaches, our approach achieves the best performance. Moreover, we demonstrate the applicability of our method to predict hot spots of two protein complexes. Conclusion: Experimental results show that support vector machine classifiers are quite
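
    As a quick check on how the reported metrics relate, the sketch below computes precision, recall and F1 from confusion-matrix counts and recomputes F1 from the precision and recall quoted in this abstract. The function names and the example counts are hypothetical; only the 0.69 and 0.68 values come from the abstract.

```python
from typing import Tuple


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Precision, recall and F1 from true-positive, false-positive and false-negative counts."""
    if tp + fp == 0 or tp + fn == 0:
        raise ValueError("need at least one predicted and one actual positive")
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall, f1_score(precision, recall)


if __name__ == "__main__":
    # Hypothetical counts: 8 hot spots found, 2 false alarms, 2 hot spots missed.
    print(precision_recall_f1(tp=8, fp=2, fn=2))
    # F1 recomputed from the precision and recall quoted in the abstract.
    print(round(f1_score(0.69, 0.68), 2))
```

    The recomputed value rounds to 0.68, which agrees with the F1 score stated in the abstract.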

  2. hβ2R-Gαs complex: prediction versus crystal structure--how valuable are predictions based on molecular modeling studies?

    PubMed

    Straßer, Andrea; Wittmann, Hans-Joachim

    2012-07-01

    In 2010, we predicted two models for the hβ2R-Gαs complex by combining the technique of homology modeling with a potential energy surface scan, since a complete crystal structure of the hβ2R-Gαs complex was not available. The crystal structure of opsin co-crystallized with part of the C-terminus of Gα (3DQB) was used as a template to model the hβ2R, whereas the crystal structure of Gα (1AZT) was used as a template to model Gαs. Utilizing a potential energy surface scan between hβ2R and Gαs, a six-dimensional potential energy surface was obtained. Two significant minimum regions were located on this surface, and each was associated with a distinct hβ2R-Gαs complex, namely model I and model II [Straßer A, Wittmann H-J (2010) J Mol Model 16:1307-1318]. The crystal structure of the hβ2R-Gαsβγ complex has recently been published. Thus, the aim of the current study was, on the one hand, to compare our predicted structures with the true crystal structure, and on the other to discuss the question: how valuable are predictions based on molecular modeling studies?

  3. Decomposing cerebral blood flow MRI into functional and structural components: A non-local approach based on prediction

    PubMed Central

    Kandel, Benjamin M.; Wang, Danny JJ; Detre, John A.; Gee, James C.; Avants, Brian B.

    2014-01-01

    We present RIPMMARC (Rotation Invariant Patch-based Multi-Modality Analysis aRChitecture), a flexible and widely applicable method for extracting information unique to a given modality from a multi-modal data set. We use RIPMMARC to improve interpretation of arterial spin labeling (ASL) perfusion images by removing the component of perfusion that is predicted by the underlying anatomy. Using patch-based, rotation invariant descriptors derived from the anatomical image, we learn a predictive relationship between local neuroanatomical structure and the corresponding perfusion image. This relation allows us to produce an image of perfusion that would be predicted given only the underlying anatomy and a residual image that represents perfusion information that cannot be predicted by anatomical features. Our learned structural features are significantly better at predicting brain perfusion than tissue probability maps, which are the input to standard partial volume correction techniques. Studies in test-retest data show that both the anatomically predicted and residual perfusion signal are highly replicable for a given subject. In a pediatric population, both the raw perfusion and structurally predicted images are tightly linked to age throughout adolescence throughout the brain. Interestingly, the residual perfusion also shows a strong correlation with age in select regions including the hippocampi (corr = 0.38, p-value < 10^-6), precuneus (corr = −0.44, p < 10^-5), and combined default mode network regions (corr = −0.45, p < 10^-8) that is independent of global anatomy-perfusion trends. This finding suggests that there is a regionally heterogeneous pattern of functional specialization that is distinct from that of cortical structural development. PMID:25449745

  4. Decomposing cerebral blood flow MRI into functional and structural components: a non-local approach based on prediction.

    PubMed

    Kandel, Benjamin M; Wang, Danny J J; Detre, John A; Gee, James C; Avants, Brian B

    2015-01-15

    We present RIPMMARC (Rotation Invariant Patch-based Multi-Modality Analysis aRChitecture), a flexible and widely applicable method for extracting information unique to a given modality from a multi-modal data set. We use RIPMMARC to improve the interpretation of arterial spin labeling (ASL) perfusion images by removing the component of perfusion that is predicted by the underlying anatomy. Using patch-based, rotation invariant descriptors derived from the anatomical image, we learn a predictive relationship between local neuroanatomical structure and the corresponding perfusion image. This relation allows us to produce an image of perfusion that would be predicted given only the underlying anatomy and a residual image that represents perfusion information that cannot be predicted by anatomical features. Our learned structural features are significantly better at predicting brain perfusion than tissue probability maps, which are the input to standard partial volume correction techniques. Studies in test-retest data show that both the anatomically predicted and residual perfusion signals are highly replicable for a given subject. In a pediatric population, both the raw perfusion and structurally predicted images are tightly linked to age throughout adolescence throughout the brain. Interestingly, the residual perfusion also shows a strong correlation with age in selected regions including the hippocampi (corr = 0.38, p-value <10(-6)), precuneus (corr = -0.44, p < 10(-5)), and combined default mode network regions (corr = -0.45, p < 10(-8)) that is independent of global anatomy-perfusion trends. This finding suggests that there is a regionally heterogeneous pattern of functional specialization that is distinct from that of cortical structural development.

  5. Predictive structural dynamic network analysis.

    PubMed

    Chen, Rong; Herskovits, Edward H

    2015-04-30

    Classifying individuals based on magnetic resonance data is an important task in neuroscience. Existing brain network-based methods to classify subjects analyze data from a cross-sectional study, and these methods cannot classify subjects based on longitudinal data. We propose a network-based predictive modeling method to classify subjects based on longitudinal magnetic resonance data. Our method generates a dynamic Bayesian network model for each group, which represents complex spatiotemporal interactions among brain regions, and then calculates a score representing each subject's deviation from expected network patterns. This network-derived score, along with other candidate predictors, is used to construct predictive models. We validated the proposed method on simulated data and the Alzheimer's Disease Neuroimaging Initiative study. For the Alzheimer's Disease Neuroimaging Initiative study, we built a predictive model based on the baseline biomarker characterizing the baseline state and the network-based score, which was constructed from the state transition probability matrix. We found that this combined model achieved 0.86 accuracy, 0.85 sensitivity, and 0.87 specificity. For the same study, the model based on the baseline biomarkers alone achieved 0.77 accuracy. The accuracy of our model is significantly better than that of the model based on the baseline biomarkers (p-value=0.002). We have presented a method to classify subjects based on scores derived from structural dynamic network models. This method is of great importance for distinguishing subjects based on structural network dynamics and for understanding the network architecture of brain processes and disorders. Copyright © 2015 Elsevier B.V. All rights reserved.
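
    The accuracy, sensitivity, and specificity figures quoted in this abstract are standard confusion-matrix summaries. The following is a minimal sketch of how such figures are computed. The function name and the counts in the usage note are illustrative assumptions and are not taken from the study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return accuracy, sensitivity, and specificity from confusion counts.

    tp/fn count positive subjects classified correctly/incorrectly;
    tn/fp count negative subjects classified correctly/incorrectly.
    """
    total = tp + fp + tn + fn
    if total == 0 or (tp + fn) == 0 or (tn + fp) == 0:
        raise ValueError("need at least one positive and one negative subject")
    return {
        "accuracy": (tp + tn) / total,       # fraction of all subjects classified correctly
        "sensitivity": tp / (tp + fn),       # true positive rate
        "specificity": tn / (tn + fp),       # true negative rate
    }
```

    With hypothetical counts of 40 true positives, 10 false negatives, 45 true negatives, and 5 false positives, the function returns an accuracy of 0.85, a sensitivity of 0.80, and a specificity of 0.90.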

  6. A Structure Based Model for the Prediction of Phospholipidosis Induction Potential of Small Molecules

    PubMed Central

    Sun, Hongmao; Shahane, Sampada; Xia, Menghang; Austin, Christopher P.; Huang, Ruili

    2012-01-01

    Drug-induced phospholipidosis (PLD), characterized by an intracellular accumulation of phospholipids and formation of concentric lamellar bodies, has raised concerns in the drug discovery community, due to its potential adverse effects. To evaluate the PLD induction potential, 4,161 non-redundant drug-like molecules from the National Institutes of Health Chemical Genomics Center (NCGC) Pharmaceutical Collection (NPC), the Library of Pharmacologically Active Compounds (LOPAC) and the Tocris Biosciences collection were screened in a quantitative high-throughput screening (qHTS) format. The potential of drug-lipid complex formation can be linked directly to the structures of drug molecules, and many PLD inducing drugs were found to share common structural features. Support vector machine (SVM) models were constructed by using customized atom types or Molecular Operating Environment (MOE) 2D descriptors as structural descriptors. Either the compounds from LOPAC or randomly selected from the entire dataset were used as the training set. The impact of training data with biased structural features and the impact of molecule descriptors emphasizing whole-molecule properties or detailed functional groups at the atom level on model performance were analyzed and discussed. Rebalancing strategies were applied to improve the predictive power of the SVM models. Using the under-sampling method, the consensus model using one third of the compounds randomly selected from the data set as the training set achieved high accuracy of 0.90 in predicting the remaining two thirds of the compounds constituting the test set, as measured by the area under the receiver operator characteristic curve (AUC-ROC). PMID:22725677
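
    The abstract names under-sampling as the rebalancing strategy but does not describe the exact procedure. The sketch below shows generic random under-sampling, in which every class is reduced to the size of the smallest class before training. The function name, the seed parameter, and the data layout are assumptions made for illustration, and this is not necessarily the authors' scheme.

```python
import random


def undersample(samples, labels, seed=0):
    """Randomly drop items so that every class has as many items as the rarest class."""
    rng = random.Random(seed)  # fixed seed for reproducibility (an assumption)

    # Group the samples by class label.
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)

    # Size of the smallest class sets the target size for all classes.
    n_min = min(len(items) for items in by_class.values())

    balanced = []
    for label, items in by_class.items():
        for sample in rng.sample(items, n_min):  # sampling without replacement
            balanced.append((sample, label))

    rng.shuffle(balanced)  # avoid class-ordered training data
    return balanced
```

    The returned list of (sample, label) pairs can then be passed to any classifier. A consensus model, as mentioned in the abstract, could be built by repeating this with different seeds and combining the resulting models.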

  7. Structure-based comparative analysis and prediction of N-linked glycosylation sites in evolutionarily distant eukaryotes.

    PubMed

    Lam, Phuc Vinh Nguyen; Goldman, Radoslav; Karagiannis, Konstantinos; Narsule, Tejas; Simonyan, Vahan; Soika, Valerii; Mazumder, Raja

    2013-04-01

    The asparagine-X-serine/threonine (NXS/T) motif, where X is any amino acid except proline, is the consensus motif for N-linked glycosylation. Significant numbers of high-resolution crystal structures of glycosylated proteins allow us to carry out structural analysis of the N-linked glycosylation sites (NGS). Our analysis shows that there is enough structural information from diverse glycoproteins to allow the development of rules which can be used to predict NGS. A Python-based tool was developed to investigate asparagines implicated in N-glycosylation in five species: Homo sapiens, Mus musculus, Drosophila melanogaster, Arabidopsis thaliana and Saccharomyces cerevisiae. Our analysis shows that 78% of all asparagines of NXS/T motif involved in N-glycosylation are localized in the loop/turn conformation in the human proteome. Similar distribution was revealed for all the other species examined. Comparative analysis of the occurrence of NXS/T motifs not known to be glycosylated and their reverse sequence (S/TXN) shows a similar distribution across the secondary structural elements, indicating that the NXS/T motif in itself is not biologically relevant. Based on our analysis, we have defined rules to determine NGS. Using machine learning methods based on these rules we can predict with 93% accuracy if a particular site will be glycosylated. If structural information is not available the tool uses structural prediction results resulting in 74% accuracy. The tool was used to identify glycosylation sites in 108 human proteins with structures and 2247 proteins without structures that have acquired NXS/T site/s due to non-synonymous variation. The tool, Structure Feature Analysis Tool (SFAT), is freely available to the public at http://hive.biochemistry.gwu.edu/tools/sfat. Copyright © 2013. Production and hosting by Elsevier Ltd.
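
    The NXS/T consensus motif described above (asparagine, then any residue except proline, then serine or threonine) can be located in a one-letter amino acid sequence with a short regular expression. This is a minimal sketch for illustration only. It is not the SFAT tool, and the function name is an assumption.

```python
import re

# N, then any residue except P, then S or T.
# The lookahead makes overlapping motifs (e.g. "NNTS") all visible.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(sequence):
    """Return (1-based position of the asparagine, motif) for each NXS/T match."""
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(sequence.upper())]
```

    For the made-up sequence "MNGTNPSANNTS", the function reports motifs at positions 2 (NGT), 9 (NNT), and 10 (NTS) and skips "NPS" because proline occupies the X position. As the abstract notes, the presence of the motif alone does not show that a site is glycosylated.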

  8. Predicting the equilibrium protein folding pathway: structure-based analysis of staphylococcal nuclease.

    PubMed

    Hilser, V J; Freire, E

    1997-02-01

    The equilibrium folding pathway of staphylococcal nuclease (SNase) has been approximated using a statistical thermodynamic formalism that utilizes the high-resolution structure of the native state as a template to generate a large ensemble of partially folded states. Close to 400,000 different states ranging from the native to the completely unfolded states were included in the analysis. The probability of each state was estimated using an empirical structural parametrization of the folding energetics. It is shown that this formalism predicts accurately the stability of the protein, the cooperativity of the folding/unfolding transition observed by differential scanning calorimetry (DSC) or urea denaturation and the thermodynamic parameters for unfolding. More importantly, this formalism provides a quantitative account of the experimental hydrogen exchange protection factors measured under native conditions for SNase. These results suggest that the computer-generated distribution of states approximates well the ensemble of conformations existing in solution. Furthermore, this formalism represents the first model capable of quantitatively predicting within a unified framework the probability distribution of states seen under native conditions and its change upon unfolding.
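
    In a statistical thermodynamic treatment of this kind, the probability of each state follows from its Gibbs energy relative to a reference state through Boltzmann weighting. The sketch below shows only that final step. The authors' structural parametrization of the energetics is not reproduced, and the function name, units, and default temperature are assumptions.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol K)


def state_probabilities(delta_g, temperature=298.15):
    """Convert relative Gibbs energies (kJ/mol) into state probabilities.

    Each state i gets the statistical weight exp(-dG_i / RT); dividing by the
    sum of all weights (the partition function) normalizes them to 1.
    """
    weights = [math.exp(-g / (R * temperature)) for g in delta_g]
    partition_function = sum(weights)
    return [w / partition_function for w in weights]
```

    States with lower Gibbs energy receive higher probability, and two states with equal energy are equally populated.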

  9. Altered sphingoid base profiles predict compromised membrane structure and permeability in atopic dermatitis

    PubMed Central

    Loiseau, Nicolas; Obata, Yasuko; Moradian, Sam; Sano, Hiromu; Yoshino, Saeko; Aburai, Kenichi; Takayama, Kozo; Sakamoto, Kazutami; Holleran, Walter M.; Elias, Peter M.; Uchida, Yoshikazu

    2013-01-01

    Background Ceramide hydrolysis by ceramidase in the stratum corneum (SC) yields both sphingoid bases and free fatty acids (FFA). While FFA are key constituents of the lamellar bilayers that mediate the epidermal permeability barrier, whether sphingoid bases influence permeability barrier homeostasis remains unknown. Pertinently, alterations of lipid profile, including ceramide and ceramidase activities occur in atopic dermatitis (AD). Object We investigated alterations in sphingoid base levels and/or profiles (sphingosine to sphinganine ratio) in the SC of normal vs. AD mice, a model that faithfully replicates human AD, and then whether altered sphingoid base levels and/or profiles influence(s) membrane stability and/or structures. Methods Unilamellar vesicles (LV), incorporating the three major SC lipids (ceramides/FFA/cholesterol) and different ratios of sphingosine/sphinganine, encapsulating carboxyfluorescein, were used as the model of SC lipids. Membrane stability was measured as release of carboxyfluorescein. Thermal analysis of LV was conducted by Differential scanning calorimetry (DSC). Results LV containing AD levels of sphingosine/sphinganine (AD-LV) displayed altered membrane permeability vs. normal-LV. DSC analyses revealed decreases in orthorhombic structures that form tightly-packed lamellar structures in AD-LV. Conclusion Sphingoid base composition influences lamellar membrane architecture in SC, suggesting that altered sphingoid base profiles could contribute to the barrier abnormality in AD. PMID:24070864

  10. DFT-based prediction of fission product sorption on carbon structures under O2 ingress conditions

    NASA Astrophysics Data System (ADS)

    Londono-Hurtado, Alejandro; Szlufarska, Izabela; Morgan, Dane

    2013-06-01

    An isotherm based model for the prediction of Cs sorption on the carbon components of a High Temperature Reactor (HTR) under O2 ingress conditions is presented. Isotherms are derived from a thermodynamic model based on binding energies calculated using Density Functional Theory (DFT). The DFT derived isotherms are compared with isotherms obtained from experimental calculations and sources of discrepancies are discussed. A DFT only model and a second model combining DFT and experimental calculations are used to predict fission product inventories in a HTR vessel during O2 ingress conditions. Results suggest that the carbon type (i.e. graphitic vs. amorphous) plays a central role in fission product sorption and release. During normal reactor conditions (T around 1400 K, low P) graphitic carbon will absorb a small percentage of a monolayer of Cs, while amorphous carbon will be approximately saturated at an entire monolayer of Cs. Results also indicate that, for the case of O2 ingress to the reactor's vessel, the Cs will form Cs2O. In the case of graphitic carbon, the Cs2O will bind more weakly than Cs, leading to Cs release in the form of Cs2O during O2 ingress. However, the weak binding of Cs to graphite means that only small release is expected. In the case of amorphous carbon, Cs2O binds almost as strongly as Cs, and so no significant change in Cs absorbed to the amorphous carbon is predicted, although the form of the absorbed Cs is predicted to be Cs2O. For the case of low release conditions, consistent with modern TRISO fuels, the core will adsorb the entire Cs inventory at normal operating temperatures. However, for high Cs release conditions, consistent with older TRISO fuels, the surface sites on the core will be saturated and most of the Cs will remain in gas form or plate out on other surfaces.

  11. Union of geometric constraint-based simulations with molecular dynamics for protein structure prediction.

    PubMed

    Glembo, Tyler J; Ozkan, S Banu

    2010-03-17

    Although proteins are a fundamental unit in biology, the mechanism by which proteins fold into their native state is not well understood. In this work, we explore the assembly of secondary structure units via geometric constraint-based simulations and the effect of refinement of assembled structures using reservoir replica exchange molecular dynamics. Our approach uses two crucial features of these methods: (i) geometric simulations speed up the search for nativelike topologies as there are no energy barriers to overcome; and (ii) molecular dynamics identifies the low free energy structures and further refines these structures toward the actual native conformation. We use eight alpha-, beta-, and alpha/beta-proteins to test our method. The geometric simulations of our test set result in an average RMSD from native of 3.7 Å and this further reduces to 2.7 Å after refinement. We also explore the question of robustness of assembly for inaccurate (shifted and shortened) secondary structure. We find that the RMSD from native is highly dependent on the accuracy of secondary structure input, and even slightly shifting the location of secondary structure along the amino acid sequence can lead to a rapid decrease in RMSD to native due to incorrect packing.
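
    The RMSD values quoted above measure the average distance between equivalent atoms of a model and the native structure. The following minimal sketch computes the RMSD for two coordinate sets that are assumed to be already superimposed. It performs no optimal superposition (for example with the Kabsch algorithm), and the function name and input layout are assumptions.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered lists of (x, y, z) tuples.

    The structures are assumed to be pre-aligned; no rotation or translation
    is applied here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))
```

    If the input coordinates are in ångströms, the result is in ångströms as well. Two structures that differ only by a 3 Å shift along one axis give an RMSD of exactly 3.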

  12. Information theory-based scoring function for the structure-based prediction of protein-ligand binding affinity.

    PubMed

    Kulharia, Mahesh; Goody, Roger S; Jackson, Richard M

    2008-10-01

    The development and validation of a new knowledge based scoring function (SIScoreJE) to predict binding energy between proteins and ligands is presented. SIScoreJE efficiently predicts the binding energy between a small molecule and its protein receptor. Protein-ligand atomic contact information was derived from a Non-Redundant Data set (NRD) of over 3000 X-ray crystal structures of protein-ligand complexes. This information was classified for individual "atom contact pairs" (ACP) which is used to calculate the atomic contact preferences. In addition to the two schemes generated in this study we have assessed a number of other common atom-type classification schemes. The preferences were calculated using an information theoretic relationship of joint entropy. Among 18 different atom-type classification schemes "ScoreJE Atom Type set2" (SATs2) was found to be the most suitable for our approach. To test the sensitivity of the method to the inclusion of solvent, Single-body Solvation Potentials (SSP) were also derived from the atomic contacts between the protein atom types and water molecules modeled using AQUARIUS2. Validation was carried out using an evaluation data set of 100 protein-ligand complexes with known binding energies to test the ability of the scoring functions to reproduce known binding affinities. In summary, it was found that a combined SSP/ScoreJE (SIScoreJE) performed significantly better than ScoreJE alone, and SIScoreJE and ScoreJE performed better than GOLD::GoldScore, GOLD::ChemScore, and XScore.

  13. The social structure and strategies of delphinids: predictions based on an ecological framework.

    PubMed

    Gowans, Shannon; Würsig, Bernd; Karczmarski, Leszek

    2007-01-01

    Dolphins live in complex social groupings with a wide variety of social strategies. In this chapter we investigate the role that differing habitats and ecological conditions have played in the evolution of delphinid social strategies. We propose a conceptual framework for understanding natural patterns of delphinid social structure in which the spatial and temporal predictability of resources influences the ranging patterns of individuals and communities. The framework predicts that when resources are spatially and temporally predictable, dolphins should remain resident in relatively small areas. Predictable resources are often found in complex inshore environments where dolphins may hide from predators or avoid areas with high predator density. Additionally, available food resources may limit group size. Thus, we predict that there are few benefits to forming large groups and potentially many benefits to being solitary or in small groups. Males may be able to sequester solitary females, controlling mating opportunities. Observations of inshore populations of bottlenose dolphins (Tursiops sp.) and island-associated spinner dolphins (Stenella longirostris) seem to fit this pattern well, along with forest-dwelling African antelope and primates such as vervets (Cercopithicus aethiops), baboons (Papio sp.), macaques (Macaca sp.) and chimpanzees (Pan troglodytes). In contrast, the framework predicts that when resources such as food are unpredictable, individuals must range further to find the necessary resources. Forming groups may be the only strategy available to avoid predation, especially in the open ocean. Larger home ranges are likely to support a greater number of individuals; however, prey is often sparsely distributed, which may act to reduce foraging competition. Cooperative foraging and herding of prey schools may be advantageous, potentially facilitating the formation of long-term bonds. Alternately, individuals may display many short-term affiliations

  14. Large-Scale Structure-Based Prediction and Identification of Novel Protease Substrates Using Computational Protein Design.

    PubMed

    Pethe, Manasi A; Rubenstein, Aliza B; Khare, Sagar D

    2017-01-20

    Characterizing the substrate specificity of protease enzymes is critical for illuminating the molecular basis of their diverse and complex roles in a wide array of biological processes. Rapid and accurate prediction of their extended substrate specificity would also aid in the design of custom proteases capable of selectively and controllably cleaving biotechnologically or therapeutically relevant targets. However, current in silico approaches for protease specificity prediction rely on, and are therefore limited by, machine learning of sequence patterns in known experimental data. Here, we describe a general approach for predicting peptidase substrates de novo using protein structure modeling and biophysical evaluation of enzyme-substrate complexes. We construct atomic resolution models of thousands of candidate substrate-enzyme complexes for each of five model proteases belonging to the four major protease mechanistic classes (serine, cysteine, aspartyl, and metallo-proteases) and develop a discriminatory scoring function using enzyme design modules from Rosetta and AMBER's MMPBSA. We rank putative substrates based on calculated interaction energy with a modeled near-attack conformation of the enzyme active site. We show that the energetic patterns obtained from these simulations can be used to robustly rank and classify known cleaved and uncleaved peptides and that these structural-energetic patterns have greater discriminatory power compared to purely sequence-based statistical inference. Combining sequence and energetic patterns using machine-learning algorithms further improves classification performance, and analysis of structural models provides physical insight into the structural basis for the observed specificities. We further tested the predictive capability of the model by designing and experimentally characterizing the cleavage of four novel substrate motifs for the hepatitis C virus NS3/4 protease using an in vivo assay. The presented structure-based

  15. Predicting brain structure in population-based samples with biologically informed genetic scores for schizophrenia.

    PubMed

    Van der Auwera, Sandra; Wittfeld, Katharina; Shumskaya, Elena; Bralten, Janita; Zwiers, Marcel P; Onnink, A Marten H; Usberti, Niccolo; Hertel, Johannes; Völzke, Henry; Völker, Uwe; Hosten, Norbert; Franke, Barbara; Grabe, Hans J

    2017-04-01

    Schizophrenia is associated with brain structural abnormalities including gray and white matter volume reductions. Whether these alterations are caused by genetic risk variants for schizophrenia is unclear. Previous attempts to detect associations between polygenic factors for schizophrenia and structural brain phenotypes in healthy subjects have been negative or remain non-replicated. In this study, we used genetic risk scores that were based on the accumulated effect of selected risk variants for schizophrenia belonging to specific biological systems like synaptic function, neurodevelopment, calcium signaling, and glutamatergic neurotransmission. We hypothesized that this "biologically informed" approach would provide the missing link between genetic risk for schizophrenia and brain structural phenotypes. We applied whole-brain voxel-based morphometry (VBM) analyses in two population-based target samples and subsequent regions of interest (ROIs) analyses in an independent replication sample (total N = 2725). No consistent association between the genetic scores and brain volumes was observed in the investigated samples. These results suggest that in healthy subjects with a higher genetic risk for schizophrenia additional factors apart from common genetic variants (e.g., infection, trauma, rare genetic variants, or gene-gene interactions) are required to induce structural abnormalities of the brain. Further studies are recommended to test for possible gene-gene or gene-environment effects. © 2017 Wiley Periodicals, Inc.

  16. Physics-based protein-structure prediction using a hierarchical protocol based on the UNRES force field: Assessment in two blind tests

    PubMed Central

    Ołdziej, S.; Czaplewski, C.; Liwo, A.; Chinchio, M.; Nanias, M.; Vila, J. A.; Khalili, M.; Arnautova, Y. A.; Jagielska, A.; Makowski, M.; Schafroth, H. D.; Kaźmierkiewicz, R.; Ripoll, D. R.; Pillardy, J.; Saunders, J. A.; Kang, Y. K.; Gibson, K. D.; Scheraga, H. A.

    2005-01-01

    Recent improvements in the protein-structure prediction method developed in our laboratory, based on the thermodynamic hypothesis, are described. The conformational space is searched extensively at the united-residue level by using our physics-based UNRES energy function and the conformational space annealing method of global optimization. The lowest-energy coarse-grained structures are then converted to an all-atom representation and energy-minimized with the ECEPP/3 force field. The procedure was assessed in two recent blind tests of protein-structure prediction. During the first blind test, we predicted large fragments of α and α+β proteins [60–70 residues with Cα rms deviation (rmsd) <6 Å]. However, for α+β proteins, significant topological errors occurred despite low rmsd values. In the second exercise, we predicted whole structures of five proteins (two α and three α+β, with sizes of 53–235 residues) with remarkably good accuracy. In particular, for the genomic target TM0487 (a 102-residue α+β protein from Thermotoga maritima), we predicted the complete, topologically correct structure with 7.3-Å Cα rmsd. So far this protein is the largest α+β protein predicted based solely on the amino acid sequence and a physics-based potential-energy function and search procedure. For target T0198, a phosphate transport system regulator PhoU from T. maritima (a 235-residue mainly α-helical protein), we predicted the topology of the whole six-helix bundle correctly within 8 Å rmsd, except the 32 C-terminal residues, most of which form a β-hairpin. These and other examples described in this work demonstrate significant progress in physics-based protein-structure prediction. PMID:15894609

  17. Ab Initio Based 2D Continuum Mechanics - Sensitivity Prediction for Contact Resonance Atomic Force Microscopy Based Structure Fingerprints

    NASA Astrophysics Data System (ADS)

    Tu, Qing; Lange, Björn; Lopes, J. Marcelo J.; Zauscher, Stefan; Blum, Volker

    Contact resonance AFM is demonstrated as a powerful tool for mapping differences in the mechanical properties of 2D materials and heterostructures, permitting to resolve surface and subsurface structural differences of different domains. Measured contact resonance frequencies are related to the contact stiffness of the combined tip-sample system. Based on first principles predicted elastic properties and a continuum approach to model the mechanical impedance, we find contact stiffness ratios between different domains of few-layer graphene on 3C-SiC(111) in excellent agreement with experiment. We next demonstrate that the approach is able to quantitatively resolve differences between other 2D materials domains, e.g., for h-BN, MoS2 and MoO3 on graphene on SiC. We show that the combined effect of several materials parameters, especially the in-plane elastic properties and the layer thickness, determines the contact stiffness, therefore boosting the sensitivity even if the out-of-plane elastic properties are similar.

  18. Integrated structure- and ligand-based in silico approach to predict inhibition of cytochrome P450 2D6.

    PubMed

    Martiny, Virginie Y; Carbonell, Pablo; Chevillard, Florent; Moroy, Gautier; Nicot, Arnaud B; Vayer, Philippe; Villoutreix, Bruno O; Miteva, Maria A

    2015-12-15

    Cytochrome P450 (CYP) is a superfamily of enzymes responsible for the metabolism of drugs, xenobiotics and endogenous compounds. CYP2D6 metabolizes about 30% of drugs and predicting potential CYP2D6 inhibition is important in early-stage drug discovery. We developed an original in silico approach for the prediction of CYP2D6 inhibition that combines machine learning modeling with knowledge of the protein structure and its dynamic behavior in response to the binding of various ligands. This approach includes structural information for CYP2D6 based on the available crystal structures and molecular dynamics (MD) simulations that we performed to take into account conformational changes of the binding site. We performed modeling using three learning algorithms (support vector machine, RandomForest and NaiveBayesian) and we constructed combined models based on topological information of known CYP2D6 inhibitors and predicted binding energies computed by docking on both X-ray and MD protein conformations. In addition, we identified three MD-derived structures that together discriminate inhibitors from non-inhibitors better than individual CYP2D6 conformations, thus ensuring complementary ligand profiles. Inhibition models based on classical molecular descriptors and predicted binding energies were able to predict CYP2D6 inhibition with an accuracy of 78% on the training set and 75% on the external validation set. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  19. Prediction of Infant MRI Appearance and Anatomical Structure Evolution using Sparse Patch-based Metamorphosis Learning Framework

    PubMed Central

    Rekik, Islem; Li, Gang; Wu, Guorong; Lin, Weili; Shen, Dinggang

    2016-01-01

    Magnetic resonance imaging (MRI) of pediatric brain provides invaluable information for early normal and abnormal brain development. Longitudinal neuroimaging has spanned various research works on examining infant brain development patterns. However, studies on predicting postnatal brain image evolution remain scarce, which is very challenging due to the dynamic tissue contrast change and even inversion in postnatal brains. In this paper, we unprecedentedly propose a dual image intensity and anatomical structure (label) prediction framework that nicely links the geodesic image metamorphosis model with sparse patch-based image representation, thereby defining spatiotemporal metamorphic patches encoding both image photometric and geometric deformation. In the training stage, we learn the 4D metamorphosis trajectories for each training subject. In the prediction stage, we define various strategies to sparsely represent each patch in the testing image using the training metamorphosis patches; as we progressively increment the richness of the patch (from appearance-based to multimodal kinetic patches). We used the proposed framework to predict 6, 9 and 12-month brain MR image intensity and structure (white and gray matter maps) from 3 months in 10 infants. Our seminal work showed promising preliminary prediction results for the spatiotemporally complex, drastically changing brain images.

  20. Prediction of RNA Binding Residues: An Extensive Analysis Based on Structure and Function to Select the Best Predictor

    PubMed Central

    Nagarajan, R.; Gromiha, M. Michael

    2014-01-01

    Protein-RNA complexes play key roles in several cellular processes by the interactions of amino acids with RNA. To understand the recognition mechanism, it is important to identify the specific amino acids involved in RNA binding. Various computational methods have been developed for predicting RNA binding residues from protein sequence. However, their performances mainly depend on the training dataset, feature selection for developing a model and learning capacity of the model. Hence, it is important to reveal the correspondence between the performance of methods and properties of RNA-binding proteins (RBPs). In this work, we have collected all available RNA binding residues prediction methods and revealed their performances on unbiased, stringent and diverse datasets for RBPs with less than 25% sequence identity based on structural class, fold, superfamily, family, protein function, RNA type, RNA strand and RNA conformation. The best methods for each type of RBPs and the type of RBPs, which require further refinement in prediction, have been brought out. We also analyzed the performance of these methods for the disordered regions, structures which are not included in the training dataset and recently solved structures. The reliability of prediction is better than randomly choosing any method or combination of methods. This approach would be a valuable resource for biologists to choose the best method based on the type of RBPs for designing their experiments and the tool is freely accessible online at www.iitm.ac.in/bioinfo/RNA-protein/. PMID:24658593

  1. Prediction of RNA binding residues: an extensive analysis based on structure and function to select the best predictor.

    PubMed

    Nagarajan, R; Gromiha, M Michael

    2014-01-01

    Protein-RNA complexes play key roles in several cellular processes by the interactions of amino acids with RNA. To understand the recognition mechanism, it is important to identify the specific amino acids involved in RNA binding. Various computational methods have been developed for predicting RNA binding residues from protein sequence. However, their performances mainly depend on the training dataset, feature selection for developing a model and learning capacity of the model. Hence, it is important to reveal the correspondence between the performance of methods and properties of RNA-binding proteins (RBPs). In this work, we have collected all available RNA binding residues prediction methods and revealed their performances on unbiased, stringent and diverse datasets for RBPs with less than 25% sequence identity based on structural class, fold, superfamily, family, protein function, RNA type, RNA strand and RNA conformation. The best methods for each type of RBPs and the type of RBPs, which require further refinement in prediction, have been brought out. We also analyzed the performance of these methods for the disordered regions, structures which are not included in the training dataset and recently solved structures. The reliability of prediction is better than randomly choosing any method or combination of methods. This approach would be a valuable resource for biologists to choose the best method based on the type of RBPs for designing their experiments and the tool is freely accessible online at www.iitm.ac.in/bioinfo/RNA-protein/.

  2. On potential energy models for EA-based ab initio protein structure prediction.

    PubMed

    Mijajlovic, Milan; Biggs, Mark J; Djurdjevic, Dusan P

    2010-01-01

    Ab initio protein structure prediction involves determination of the three-dimensional (3D) conformation of proteins on the basis of their amino acid sequence, a potential energy (PE) model that captures the physics of the interatomic interactions, and a method to search for and identify the global minimum in the PE (or free energy) surface such as an evolutionary algorithm (EA). Many PE models have been proposed over the past three decades and more. There is currently no understanding of how the behavior of an EA is affected by the PE model used. The study reported here shows that the EA behavior can be profoundly affected: the EA performance obtained when using the ECEPP PE model is significantly worse than that obtained when using the Amber, OPLS, and CVFF PE models, and the optimal EA control parameter values for the ECEPP model also differ significantly from those associated with the other models.
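    The interplay described above (a potential energy model plugged into an evolutionary search) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `toy_energy` is a hypothetical stand-in for a real PE model such as ECEPP or Amber, and the elitist mutation-only scheme and its control parameters (`pop_size`, `sigma`) are not taken from the study.

```python
import math
import random


def toy_energy(angles):
    """Hypothetical stand-in for a PE model: a sum of torsion-like terms, each >= 0."""
    return sum(1.0 + math.cos(3.0 * a) for a in angles)


def evolve(energy, n_angles=6, pop_size=20, generations=50, sigma=0.3, seed=0):
    """Elitist evolutionary search over torsion angles (illustrative only).

    `energy` is any callable mapping a list of angles to a float, so a
    different PE model can be swapped in without touching the search.
    """
    rng = random.Random(seed)
    population = [
        [rng.uniform(-math.pi, math.pi) for _ in range(n_angles)]
        for _ in range(pop_size)
    ]
    history = []
    for _ in range(generations):
        # Mutation: each parent produces one child by Gaussian perturbation.
        children = [[a + rng.gauss(0.0, sigma) for a in parent] for parent in population]
        # Elitist selection: keep the lowest-energy individuals of parents + children.
        population = sorted(population + children, key=energy)[:pop_size]
        history.append(energy(population[0]))
    return population[0], history


best, history = evolve(toy_energy)
print(f"best energy after {len(history)} generations: {history[-1]:.4f}")
```

    Because the energy function is passed in as an argument, rerunning `evolve` with a different function and different `pop_size` or `sigma` values is one simple way to see how the search behavior depends on the model, which is the kind of dependence the abstract reports.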

  3. 3D Structure Prediction of Human β1-Adrenergic Receptor via Threading-Based Homology Modeling for Implications in Structure-Based Drug Designing

    PubMed Central

    Ul-Haq, Zaheer; Saeed, Maria; Halim, Sobia Ahsan; Khan, Waqasuddin

    2015-01-01

    Dilated cardiomyopathy is a disease of left ventricular dysfunction accompanied by impairment of the β1-adrenergic receptor (β1-AR) signal cascade. The disturbed β1-AR function may be based on an elevated sympathetic tone observed in patients with heart failure. Prolonged adrenergic stimulation may induce metabolic and electrophysiological disturbances in the myocardium, resulting in tachyarrhythmia that leads to the development of heart failure in humans and sudden death. Hence, β1-AR is considered a promising drug target, but attempts to develop effective and specific drugs against this tempting pharmaceutical target are slowed by the lack of a 3D structure of Homo sapiens β1-AR (hsβADR1). This study encompasses elucidation of 3D structural and physicochemical properties of hsβADR1 via threading-based homology modeling. Furthermore, the docking performance of several docking programs including Surflex-Dock, FRED, and GOLD was validated by re-docking and cross-docking experiments. GOLD and Surflex-Dock performed best in re-docking and cross-docking experiments, respectively. Consequently, Surflex-Dock was used to predict the binding modes of four hsβADR1 agonists. This study provides a clear understanding of hsβADR1 structure and its binding mechanism, thus helping to provide remedies for cardiovascular disease, effective treatment of asthma and other diseases caused by malfunctioning of the target protein. PMID:25860348

  4. 3D structure prediction of human β1-adrenergic receptor via threading-based homology modeling for implications in structure-based drug designing.

    PubMed

    Ul-Haq, Zaheer; Saeed, Maria; Halim, Sobia Ahsan; Khan, Waqasuddin

    2015-01-01

    Dilated cardiomyopathy is a disease of left ventricular dysfunction accompanied by impairment of the β1-adrenergic receptor (β1-AR) signal cascade. The disturbed β1-AR function may be based on an elevated sympathetic tone observed in patients with heart failure. Prolonged adrenergic stimulation may induce metabolic and electrophysiological disturbances in the myocardium, resulting in tachyarrhythmia that leads to the development of heart failure in humans and sudden death. Hence, β1-AR is considered a promising drug target, but attempts to develop effective and specific drugs against this tempting pharmaceutical target are slowed by the lack of a 3D structure of Homo sapiens β1-AR (hsβADR1). This study encompasses elucidation of 3D structural and physicochemical properties of hsβADR1 via threading-based homology modeling. Furthermore, the docking performance of several docking programs including Surflex-Dock, FRED, and GOLD was validated by re-docking and cross-docking experiments. GOLD and Surflex-Dock performed best in re-docking and cross-docking experiments, respectively. Consequently, Surflex-Dock was used to predict the binding modes of four hsβADR1 agonists. This study provides a clear understanding of hsβADR1 structure and its binding mechanism, thus helping to provide remedies for cardiovascular disease, effective treatment of asthma and other diseases caused by malfunctioning of the target protein.

  5. Physical scoring function based on AMBER force field and Poisson-Boltzmann implicit solvent for protein structure prediction.

    PubMed

    Hsieh, Meng-Juei; Luo, Ray

    2004-08-15

    A well-behaved physics-based all-atom scoring function for protein structure prediction is analyzed with several widely used all-atom decoy sets. The scoring function, termed AMBER/Poisson-Boltzmann (PB), is based on a refined AMBER force field for intramolecular interactions and an efficient PB model for solvation interactions. Testing on the chosen decoy sets shows that the scoring function, which is designed to consider detailed chemical environments, is able to consistently discriminate all 62 native crystal structures after considering the heteroatom groups, disulfide bonds, and crystal packing effects that are not included in the decoy structures. When NMR structures are considered in the testing, the scoring function is able to discriminate 8 out of 10 targets. In the more challenging test of selecting near-native structures, the scoring function also performs very well: for the majority of the targets studied, the scoring function is able to select decoys that are close to the corresponding native structures as evaluated by ranking numbers and backbone Calpha root mean square deviations. Various important components of the scoring function are also studied to understand their discriminative contributions toward the rankings of native and near-native structures. It is found that neither the nonpolar solvation energy as modeled by the surface area model nor a higher protein dielectric constant improves its discriminative power. The terms remaining to be improved are related to 1-4 interactions. The most troublesome term is found to be the large and highly fluctuating 1-4 electrostatics term, not the dihedral-angle term. These data support ongoing efforts in the community to develop protein structure prediction methods with physics-based potentials that are competitive with knowledge-based potentials.

  6. Structural Bioinformatics-Based Prediction of Exceptional Selectivity of p38 MAP Kinase Inhibitor PH-797804

    SciTech Connect

    Xing, Li; Shieh, Huey S.; Selness, Shaun R.; Devraj, Rajesh V.; Walker, John K.; Devadas, Balekudru; Hope, Heidi R.; Compton, Robert P.; Schindler, John F.; Hirsch, Jeffrey L.; Benson, Alan G.; Kurumbail, Ravi G.; Stegeman, Roderick A.; Williams, Jennifer M.; Broadus, Richard M.; Walden, Zara; Monahan, Joseph B.; Pfizer

    2009-07-24

    PH-797804 is a diarylpyridinone inhibitor of p38α mitogen-activated protein (MAP) kinase derived from a racemic mixture as the more potent atropisomer (aS), first proposed by molecular modeling and subsequently confirmed by experiments. On the basis of structural comparison with a different biaryl pyrazole template and supported by dozens of high-resolution crystal structures of p38α inhibitor complexes, PH-797804 is predicted to possess a high level of specificity across the broad human kinase genome. We used a structural bioinformatics approach to identify two selectivity elements encoded by the TXXXG sequence motif on the p38α kinase hinge: (i) Thr106 that serves as the gatekeeper to the buried hydrophobic pocket occupied by 2,4-difluorophenyl of PH-797804 and (ii) the bidentate hydrogen bonds formed by the pyridinone moiety with the kinase hinge requiring an induced 180° rotation of the Met109-Gly110 peptide bond. The peptide flip occurs in p38α kinase due to the critical glycine residue marked by its conformational flexibility. Kinome-wide sequence mining revealed rare presentation of the selectivity motif. Corroboratively, PH-797804 exhibited exceptionally high specificity against MAP kinases and the related kinases. No cross-reactivity was observed in large panels of kinase screens (selectivity ratio of >500-fold). In cellular assays, PH-797804 demonstrated superior potency and selectivity consistent with the biochemical measurements. PH-797804 has met safety criteria in human phase I studies and is under clinical development for several inflammatory conditions. Understanding the rationale for selectivity at the molecular level helps elucidate the biological function and design of specific p38α kinase inhibitors.

  7. Predicting Surgery Targets in Temporal Lobe Epilepsy through Structural Connectome Based Simulations

    PubMed Central

    Hutchings, Frances; Han, Cheol E.; Keller, Simon S.; Weber, Bernd; Taylor, Peter N.; Kaiser, Marcus

    2015-01-01

    Temporal lobe epilepsy (TLE) is a prevalent neurological disorder resulting in disruptive seizures. In the case of drug resistant epilepsy resective surgery is often considered. This is a procedure hampered by unpredictable success rates, with many patients continuing to have seizures even after surgery. In this study we apply a computational model of epilepsy to patient specific structural connectivity derived from diffusion tensor imaging (DTI) of 22 individuals with left TLE and 39 healthy controls. We validate the model by examining patient-control differences in simulated seizure onset time and network location. We then investigate the potential of the model for surgery prediction by performing in silico surgical resections, removing nodes from patient networks and comparing seizure likelihood post-surgery to pre-surgery simulations. We find that, first, patients tend to transit from non-epileptic to epileptic states more often than controls in the model. Second, regions in the left hemisphere (particularly within temporal and subcortical regions) that are known to be involved in TLE are the most frequent starting points for seizures in patients in the model. In addition, our analysis also implicates regions in the contralateral and frontal locations which may play a role in seizure spreading or surgery resistance. Finally, the model predicts that patient-specific surgery (resection areas chosen on an individual, model-prompted, basis and not following a predefined procedure) may lead to better outcomes than the currently used routine clinical procedure. Taken together this work provides a first step towards patient specific computational modelling of epilepsy surgery in order to inform treatment strategies in individuals. PMID:26657566
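    The in silico resection step described above (removing nodes from a patient network and comparing a seizure-likelihood score before and after) can be sketched as follows. This is a minimal illustration under stated assumptions: the network is a plain weighted adjacency dictionary, and `mean_strength` is a hypothetical placeholder score, not the computational epilepsy model used in the study.

```python
def resect(adjacency, removed):
    """Return a copy of a weighted adjacency dict with the given nodes removed."""
    removed = set(removed)
    return {
        node: {nbr: w for nbr, w in nbrs.items() if nbr not in removed}
        for node, nbrs in adjacency.items()
        if node not in removed
    }


def compare_resection(adjacency, removed, seizure_likelihood):
    """Score the network before and after a simulated resection.

    `seizure_likelihood` is any callable mapping a network to a number;
    in a real study it would be the output of a dynamical simulation.
    """
    before = seizure_likelihood(adjacency)
    after = seizure_likelihood(resect(adjacency, removed))
    return before, after


def mean_strength(adjacency):
    """Placeholder score (assumption): mean summed connection weight per node."""
    if not adjacency:
        return 0.0
    return sum(sum(nbrs.values()) for nbrs in adjacency.values()) / len(adjacency)


# Toy three-region network; region names and weights are made up for illustration.
network = {
    "hippocampus": {"amygdala": 1.0, "thalamus": 2.0},
    "amygdala": {"hippocampus": 1.0},
    "thalamus": {"hippocampus": 2.0},
}
before, after = compare_resection(network, ["hippocampus"], mean_strength)
print(before, after)
```

    Keeping the scoring function as a parameter separates the resection bookkeeping from the seizure model, so candidate resection sets can be compared with whatever simulation is available.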

  8. An Analysis of the Revere, Quincy and Stamford Structure Data Bases for Predicting Building Material Distribution.

    DTIC Science & Technology

    1985-05-01

    AD-A±57 58. An Analysis of the Revere, Quincy and Stamford Structure Data Bases for Pr... (U). Cold Regions Research and Engineering Lab, Hanover, NH. C... suggest a rejection of the null hypothesis that the variables are statistically independent (Walpole and Myers 1978). This does not imply a cause-and... 0.1180 0.0296 0.0077 Stamford ... hypothesis of equality of means (Walpole and Myers 1978). By separating building surface area by building type

  9. Structure-Based Prediction of Unstable Regions in Proteins: Applications to Protein Misfolding Diseases

    NASA Astrophysics Data System (ADS)

    Guest, Will; Cashman, Neil; Plotkin, Steven

    2009-03-01

    Protein misfolding is a necessary step in the pathogenesis of many diseases, including Creutzfeldt-Jakob disease (CJD) and familial amyotrophic lateral sclerosis (fALS). Identifying unstable structural elements in their causative proteins elucidates the early events of misfolding and presents targets for inhibition of the disease process. An algorithm was developed to calculate the Gibbs free energy of unfolding for all sequence-contiguous regions of a protein using three methods to parameterize energy changes: a modified Gō model, changes in solvent-accessible surface area, and solution of the Poisson-Boltzmann equation. The entropic effects of disulfide bonds and post-translational modifications are treated analytically. It incorporates a novel method for finding local dielectric constants inside a protein to accurately handle charge effects. We have predicted the unstable parts of prion protein and superoxide dismutase 1, the proteins involved in CJD and fALS respectively, and have used these regions as epitopes to prepare antibodies that are specific to the misfolded conformation and show promise as therapeutic agents.

  10. Template-based protein structure prediction in CASP11 and retrospect of I-TASSER in the last decade.

    PubMed

    Yang, Jianyi; Zhang, Wenxuan; He, Baoji; Walker, Sara Elizabeth; Zhang, Hongjiu; Govindarajoo, Brandon; Virtanen, Jouko; Xue, Zhidong; Shen, Hong-Bin; Zhang, Yang

    2016-09-01

    We report the structure prediction results of a new composite pipeline for template-based modeling (TBM) in the 11th CASP experiment. Starting from multiple structure templates identified by LOMETS based meta-threading programs, the QUARK ab initio folding program is extended to generate initial full-length models under strong constraints from template alignments. The final atomic models are then constructed by I-TASSER based fragment reassembly simulations, followed by the fragment-guided molecular dynamic simulation and the MQAP-based model selection. It was found that the inclusion of QUARK-TBM simulations as an intermediate modeling step could help improve the quality of the I-TASSER models for both Easy and Hard TBM targets. Overall, the average TM-score of the first I-TASSER model is 12% higher than that of the best LOMETS templates, with the RMSD in the same threading-aligned regions reduced from 5.8 to 4.7 Å. Nevertheless, there are nearly 18% of TBM domains with the templates deteriorated by the structure assembly pipeline, which may be attributed to the errors of secondary structure and domain orientation predictions that propagate through and degrade the procedures of template identification and final model selections. To examine the record of progress, we made a retrospective report of the I-TASSER pipeline in the last five CASP experiments (CASP7-11). The data show no clear progress of the LOMETS threading programs over PSI-BLAST; but obvious progress on structural improvement relative to threading templates was witnessed in recent CASP experiments, which is probably attributed to the integration of the extended ab initio folding simulation with the threading assembly pipeline and the introduction of atomic-level structure refinements following the reduced modeling simulations. Proteins 2016; 84(Suppl 1):233-246. © 2015 Wiley Periodicals, Inc.
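    The two quality measures quoted above, RMSD and TM-score, can be computed for an already superimposed pair of structures with the short sketch below. Two caveats are assumptions of this illustration rather than statements from the abstract: the coordinates are taken to be pre-aligned (no optimal superposition is performed), and the TM-score is evaluated for that single fixed superposition using the commonly cited length-dependent scale d0 = 1.24 (L - 15)^(1/3) - 1.8, whereas the full TM-score takes the maximum over superpositions.

```python
import math


def pair_distances(coords_a, coords_b):
    """Euclidean distances between corresponding atoms of two coordinate lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    return [math.dist(a, b) for a, b in zip(coords_a, coords_b)]


def rmsd(coords_a, coords_b):
    """Root mean square deviation of pre-superimposed coordinates."""
    dists = pair_distances(coords_a, coords_b)
    return math.sqrt(sum(d * d for d in dists) / len(dists))


def tm_score_fixed(distances, target_length):
    """TM-score term for one fixed superposition (no search over alignments).

    `distances` holds the aligned residue-pair distances in angstroms;
    `target_length` is the length of the target protein and must exceed 15.
    """
    if target_length <= 15:
        raise ValueError("this sketch assumes target_length > 15")
    d0 = max(0.5, 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length


# Two toy C-alpha traces offset by 1 angstrom along z.
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
native = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(model, native))
```

    Unlike RMSD, the TM-score is normalized by target length and down-weights large deviations, which is why the two measures can move differently for the same model.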

  11. Template-based protein structure prediction in CASP11 and retrospect of I-TASSER in the last decade

    PubMed Central

    Yang, Jianyi; Zhang, Wenxuan; He, Baoji; Walker, Sara Elizabeth; Zhang, Hongjiu; Govindarajoo, Brandon; Virtanen, Jouko; Xue, Zhidong; Shen, Hong-Bin; Zhang, Yang

    2015-01-01

    We report the structure prediction results of a new composite pipeline for template-based modeling (TBM) in the 11th CASP experiment. Starting from multiple structure templates identified by LOMETS based meta-threading programs, the QUARK ab initio folding program is extended to generate initial full-length models under strong constraints from template alignments. The final atomic models are then constructed by I-TASSER based fragment reassembly simulations, followed by the fragment-guided molecular dynamic simulation and the MQAP-based model selection. It was found that the inclusion of QUARK-TBM simulations as an intermediate modeling step could help improve the quality of the I-TASSER models for both Easy and Hard TBM targets. Overall, the average TM-score of the first I-TASSER model is 12% higher than that of the best LOMETS templates, with the RMSD in the same threading-aligned regions reduced from 5.8 to 4.7 Å. Nevertheless, there are nearly 18% of TBM domains with the templates deteriorated by the structure assembly pipeline, which may be attributed to the errors of secondary structure and domain orientation predictions that propagate through and degrade the procedures of template identification and final model selections. To examine the record of progress, we made a retrospective report of the I-TASSER pipeline in the last five CASP experiments (CASP7-11). The data show no clear progress of the LOMETS threading programs over PSI-BLAST; but obvious progress on structural improvement relative to threading templates was witnessed in recent CASP experiments, which is probably attributed to the integration of the extended ab initio folding simulation with the threading assembly pipeline and the introduction of atomic-level structure refinements following the reduced modeling simulations. PMID:26343917

  12. The PSIPRED protein structure prediction server.

    PubMed

    McGuffin, L J; Bryson, K; Jones, D T

    2000-04-01

    The PSIPRED protein structure prediction server allows users to submit a protein sequence, perform a prediction of their choice and receive the results of the prediction both textually via e-mail and graphically via the web. The user may select one of three prediction methods to apply to their sequence: PSIPRED, a highly accurate secondary structure prediction method; MEMSAT 2, a new version of a widely used transmembrane topology prediction method; or GenTHREADER, a sequence profile based fold recognition method. Freely available to non-commercial users at http://globin.bio.warwick.ac.uk/psipred/

  13. Ecotoxicity quantitative structure-activity relationships for alcohol ethoxylate mixtures based on substance-specific toxicity predictions.

    PubMed

    Boeije, G M; Cano, M L; Marshall, S J; Belanger, S E; Van Compernolle, R; Dorn, P B; Gümbel, H; Toy, R; Wind, T

    2006-05-01

    Traditionally, ecotoxicity quantitative structure-activity relationships (QSARs) for alcohol ethoxylate (AE) surfactants have been developed by assigning the measured ecotoxicity for commercial products to the average structures (alkyl chain length and ethoxylate chain length) of these materials. Acute Daphnia magna toxicity tests for binary mixtures indicate that mixtures are more toxic than the individual AE substances corresponding with their average structures (due to the nonlinear relation of toxicity with structure). Consequently, the ecotoxicity value (expressed as effects concentration) attributed to the average structures that are used to develop the existing QSARs is expected to be too low. A new QSAR technique for complex substances, which interprets the mixture toxicity with regard to the "ethoxymers" distribution (i.e., the individual AE components) rather than the average structure, was developed. This new technique was then applied to develop new AE ecotoxicity QSARs for invertebrates, fish, and mesocosms. Despite the higher complexity, the fit and accuracy of the new QSARs are at least as good as those for the existing QSARs based on the same data set. As expected from typical ethoxymer distributions of commercial AEs, the new QSAR generally predicts less toxicity than the QSARs based on average structure.
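    The argument above, that a mixture is more toxic than the single substance matching its average structure when toxicity depends nonlinearly on structure, can be checked numerically. The sketch below is an illustration under assumptions that are not taken from the study: a hypothetical log-linear QSAR, ln(EC50) = a - b·x with made-up coefficients, and concentration addition (toxic-unit summation) as the mixture rule.

```python
import math


def ec50_from_structure(x, a=2.0, b=0.5):
    """Hypothetical log-linear QSAR: ln(EC50) = a - b * x (coefficients are made up)."""
    return math.exp(a - b * x)


def mixture_ec50(fractions, structures):
    """EC50 of a mixture under concentration addition: 1/EC50_mix = sum(f_i / EC50_i)."""
    total = sum(fractions)
    return 1.0 / sum(
        (f / total) / ec50_from_structure(x) for f, x in zip(fractions, structures)
    )


def average_structure_ec50(fractions, structures):
    """EC50 predicted for the single substance with the fraction-weighted mean structure."""
    total = sum(fractions)
    mean_x = sum(f * x for f, x in zip(fractions, structures)) / total
    return ec50_from_structure(mean_x)


# Binary mixture of two hypothetical components with structure descriptors 10 and 14.
fractions = [0.5, 0.5]
structures = [10, 14]
print("mixture EC50:          ", mixture_ec50(fractions, structures))
print("average-structure EC50:", average_structure_ec50(fractions, structures))
```

    Under these assumptions the mixture EC50 is a weighted harmonic mean of the component EC50 values, while the average-structure EC50 is their weighted geometric mean, so the mixture value is never the larger of the two. Assigning the measured mixture EC50 to the average structure therefore gives a value that is too low, in line with the abstract's reasoning.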

  14. Defect Prediction and Control for Ultra-high-strength Steel Complex Structure in Hot Forming Based on FEM

    NASA Astrophysics Data System (ADS)

    Shang, Xin; Zhou, Jie; Zhuo, Fang; Luo, Yan; Li, Yang

    2015-06-01

    Cracking is the main defect in ultra-high-strength steel (UHSS) forming products. In order to avoid cracking, either adjusting process parameters or changing the die design is usually applied. However, when the part itself has an unreasonable structural design, the traditional approach of modifying process parameters makes little difference. In this paper, true stress-strain curves under different strain rates and temperatures are obtained via hot tensile tests. Then, the material constitutive model of UHSS is introduced into CAE software to analyze and predict defects of UHSS hot-formed complex structural parts based on FEM. In addition, simulation results for the changed structure (open end) are compared with those for the original structure (closed end). The results show that both the maximum reduction ratio and the stress in all directions are sharply reduced, i.e., the forming quality is improved significantly after changing the end structure. Finally, the prediction and control methods for forming defects are verified to be feasible in actual production.

  15. Predicting healthy older adult's brain age based on structural connectivity networks using artificial neural networks.

    PubMed

    Lin, Lan; Jin, Cong; Fu, Zhenrong; Zhang, Baiwen; Bin, Guangyu; Wu, Shuicai

    2016-03-01

    Brain ageing is followed by changes of the connectivity of white matter (WM) and changes of the grey matter (GM) concentration. Patients with neurodegenerative disease are more vulnerable to accelerated brain ageing, which is associated with prospective cognitive decline and disease severity. Accurate detection of accelerated ageing based on brain network analysis has great potential for early interventions designed to hinder atypical brain changes. To capture brain ageing, we proposed a novel computational approach for modeling the brain age of 112 normal older subjects (aged 50-79 years) by connectivity analyses of networks of the brain. Our proposed method applied principal component analysis (PCA) to reduce the redundancy in network topological parameters. A back propagation artificial neural network (BPANN) improved by a hybrid genetic algorithm (GA) and Levenberg-Marquardt (LM) algorithm was established to model the relation between principal components (PCs) and brain age. The predicted brain age is strongly correlated with chronological age (r=0.8). The model has a mean absolute error (MAE) of 4.29 years. Therefore, we believe the method can provide a possible way to quantitatively describe the typical and atypical network organization of human brain and serve as a biomarker for presymptomatic detection of neurodegenerative diseases in the future.
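    The two figures of merit reported above, the Pearson correlation between predicted and chronological age and the mean absolute error, are straightforward to compute. The sketch below shows one way to do so; the ages in the usage example are hypothetical values invented for illustration, not data from the study.

```python
import math


def mean_absolute_error(true_values, predicted_values):
    """Average absolute difference between true and predicted values."""
    if len(true_values) != len(predicted_values) or not true_values:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(true_values, predicted_values)) / len(true_values)


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical chronological and predicted brain ages (years).
chronological = [52, 58, 63, 69, 74, 78]
predicted = [55, 57, 66, 65, 70, 80]
print("MAE:", mean_absolute_error(chronological, predicted))
print("r:  ", pearson_r(chronological, predicted))
```

    The two metrics answer different questions: r measures how well predictions track age ordering, while MAE reports the typical size of the error in years, so a model can score well on one and poorly on the other.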

  16. Evaluation of machine learning algorithms and structural features for optimal MRI-based diagnostic prediction in psychosis

    PubMed Central

    Salvador, Raymond; Radua, Joaquim; Canales-Rodríguez, Erick J.; Solanes, Aleix; Sarró, Salvador; Goikolea, José M.; Valiente, Alicia; Monté, Gemma C.; Natividad, María del Carmen; Guerrero-Pedraza, Amalia; Moro, Noemí; Fernández-Corcuera, Paloma; Amann, Benedikt L.; Maristany, Teresa; Vieta, Eduard; McKenna, Peter J.; Pomarol-Clotet, Edith

    2017-01-01

    A relatively large number of studies have investigated the power of structural magnetic resonance imaging (sMRI) data to discriminate patients with schizophrenia from healthy controls. However, very few of them have also included patients with bipolar disorder, allowing the clinically relevant discrimination between both psychotic diagnostics. To assess the efficacy of sMRI data for diagnostic prediction in psychosis we objectively evaluated the discriminative power of a wide range of commonly used machine learning algorithms (ridge, lasso, elastic net and L0 norm regularized logistic regressions, a support vector classifier, regularized discriminant analysis, random forests and a Gaussian process classifier) on main sMRI features including grey and white matter voxel-based morphometry (VBM), vertex-based cortical thickness and volume, region of interest volumetric measures and wavelet-based morphometry (WBM) maps. All possible combinations of algorithms and data features were considered in pairwise classifications of matched samples of healthy controls (N = 127), patients with schizophrenia (N = 128) and patients with bipolar disorder (N = 128). Results show that the selection of feature type is important, with grey matter VBM (without data reduction) delivering the best diagnostic prediction rates (averaging over classifiers: schizophrenia vs. healthy 75%, bipolar disorder vs. healthy 63% and schizophrenia vs. bipolar disorder 62%) whereas algorithms usually yielded very similar results. Indeed, those grey matter VBM accuracy rates were not even improved by combining all feature types in a single prediction model. Further multi-class classifications considering the three groups simultaneously made evident a lack of predictive power for the bipolar group, probably due to its intermediate anatomical features, located between those observed in healthy controls and those found in patients with schizophrenia. Finally, we provide MRIPredict (https

  17. Quantitative Vapor-phase IR Intensities and DFT Computations to Predict Absolute IR Spectra based on Molecular Structure: I. Alkanes

    SciTech Connect

    Williams, Stephen D.; Johnson, Timothy J.; Sharpe, Steven W.; Yavelak, Veronica; Oates, R. P.; Brauer, Carolyn S.

    2013-11-13

    Recently recorded quantitative IR spectra of a variety of gas-phase alkanes are shown to have integrated intensities in both the C-H stretching and C-H bending regions that depend linearly on the molecular size, i.e. the number of C-H bonds. This result is well predicted from CH4 to C15H32 by DFT computations of IR spectra at the B3LYP/6-31+G(d,p) level of DFT theory. A simple model predicting the absolute IR band intensities of alkanes based only on structural formula is proposed: For the C-H stretching band near 2930 cm⁻¹ this is given by (in km/mol): CH_str = (34±3) × CH - (41±60), where CH is the number of C-H bonds in the alkane. The linearity is explained in terms of coordinated motion of methylene groups rather than the summed intensities of autonomous -CH2- units. The effect of alkyl chain length on the intensity of a C-H bending mode is explored and interpreted in terms of conformer distribution. The relative intensity contribution of a methyl mode compared to the total C-H stretch intensity is shown to be linear in the number of terminal methyl groups in the alkane, and can be used to predict quantitative spectra a priori based on structure alone.

  18. Quantitative vapor-phase IR intensities and DFT computations to predict absolute IR spectra based on molecular structure: I. Alkanes

    NASA Astrophysics Data System (ADS)

    Williams, Stephen D.; Johnson, Timothy J.; Sharpe, Steven W.; Yavelak, Veronica; Oates, R. P.; Brauer, Carolyn S.

    2013-11-01

    Recently recorded quantitative IR spectra of a variety of gas-phase alkanes are shown to have integrated intensities in both the C-H stretching and C-H bending regions that depend linearly on the molecular size, i.e. the number of C-H bonds. This result is well predicted from CH4 to C15H32 by density functional theory (DFT) computations of IR spectra using Becke's three parameter functional (B3LYP/6-31+G(d,p)). Using the experimental data, a simple model predicting the absolute IR band intensities of alkanes based only on structural formula is proposed: For the C-H stretching band envelope centered near 2930 cm⁻¹ this is given by (in km/mol): CH_str = (34±1) × CH - (41±23), where CH is the number of C-H bonds in the alkane. The linearity is explained in terms of coordinated motion of methylene groups rather than the summed intensities of autonomous -CH2- units. The effect of alkyl chain length on the intensity of a C-H bending mode is explored and interpreted in terms of conformer distribution. The relative intensity contribution of a methyl mode compared to the total C-H stretch intensity is shown to be linear in the number of methyl groups in the alkane, and can be used to predict quantitative spectra a priori based on structure alone.
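    The linear relation quoted above is easy to evaluate for a given alkane. The sketch below uses the central values of the fit (slope 34 and intercept -41, in km/mol) and ignores the quoted uncertainties; the bond-count helper assumes an acyclic alkane CnH(2n+2), which is an assumption of this illustration, and the function names are hypothetical.

```python
def ch_bond_count(n_carbons):
    """Number of C-H bonds in an acyclic alkane CnH(2n+2) (assumes no rings)."""
    if n_carbons < 1:
        raise ValueError("an alkane needs at least one carbon atom")
    return 2 * n_carbons + 2


def ch_stretch_intensity(n_ch_bonds, slope=34.0, intercept=-41.0):
    """Integrated C-H stretch band intensity in km/mol from the linear fit.

    Defaults are the central fit values quoted in the abstract; the
    uncertainties on slope and intercept are not propagated here.
    """
    return slope * n_ch_bonds + intercept


# Methane (CH4), n-pentane (C5H12) and pentadecane (C15H32).
for n_carbons in (1, 5, 15):
    bonds = ch_bond_count(n_carbons)
    print(f"C{n_carbons}: {bonds} C-H bonds, about {ch_stretch_intensity(bonds):.0f} km/mol")
```

    For methane (4 C-H bonds) this gives 34 × 4 - 41 = 95 km/mol, and for pentadecane (32 C-H bonds) 34 × 32 - 41 = 1047 km/mol, spanning the CH4 to C15H32 range covered by the abstract.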

  19. Multiscale modeling of interwoven Kevlar fibers based on random walk to predict yarn structural response

    NASA Astrophysics Data System (ADS)

    Recchia, Stephen

    Kevlar is the most common high-end plastic filament yarn used in body armor, tire reinforcement, and wear resistant applications. Kevlar is a trade name for an aramid fiber. These are fibers in which the chain molecules are highly oriented along the fiber axis, so the strength of the chemical bond can be exploited. The bulk material is extruded into filaments that are bound together into yarn, which may be corded with other materials as in car tires, woven into a fabric, or layered in an epoxy to make composite panels. The high ratio of tensile strength to weight makes this material ideal for designs that decrease weight and inertia, such as automobile tires, body panels, and body armor. For designs that use Kevlar, increasing the strength, or tenacity, to weight ratio would improve performance or reduce cost of all products that are based on this material. This thesis computationally and experimentally investigates the tenacity and stiffness of Kevlar yarns with varying twist ratios. The test boundary conditions were replicated with a geometrically accurate finite element model, for which a customized code that can reproduce tortuous filaments in a yarn was developed. The solid model geometry capturing filament tortuosity was implemented through a random walk method of axial geometry creation. A finite element analysis successfully recreated the yarn strength and stiffness dependency observed during the tests. The physics applied in the finite element model was reproduced in an analytical equation that was able to predict the dependency of failure strength and strain on twist ratio. The analytical solution can be employed to optimize yarn design for high strength applications.

  20. Impact of the subtle differences in MMP-12 structure on Glide-based molecular docking for pose prediction of inhibitors

    NASA Astrophysics Data System (ADS)

    Zhang, Huan; Wang, Yajing; Xu, Feng

    2014-11-01

    Human MMP-12 is involved in many aspects of disease pathology. Substantial efforts have been made to develop MMP-12 inhibitors. However, the mechanism of some MMP-12 inhibitors is still unclear. Recently, the method of molecular modeling was used to explore the mechanism, but selecting the best candidate among the wealth of MMP-12 structures poses a challenge. In this study, we attempted to identify several criteria to predict the most appropriate MMP-12 PDB ID for enzyme-ligand interaction studies based on cross-docking by Glide. Furthermore, the parameters from PDB files such as R-free, resolution, B factor, and the molecular volume of the ligand in the complex can provide useful clues for choosing a suitable approximate initial model for pose prediction for MMP-12 inhibitors. This work might also provide a useful reference for other drug targets.

  1. Impact of computational structure-based predictive toxicology in drug discovery.

    PubMed

    Mohan, Chethampadi Gopi

    2011-06-01

    Computational tools for predicting toxicity have been envisioned to have the potential to broadly impact the attrition rate of compounds in pre-clinical drug discovery and development. An integrated approach combining computer-assisted prediction and the physico-chemical properties of a compound with its in vitro and in vivo analysis needs to be routinely exercised in the lead identification and lead optimization processes. Starting with a good lead can save a lot of money and can significantly shorten the entire drug discovery process. The journey towards the three R's (reduce, replace and refine) further proves to be successful in predicting adverse drug reactions in patients (or animals) enrolled in clinical trials. However, the impact of predictive toxicity analysis has been modest and relatively narrow in scope, due to the limited domain knowledge in this field. It is important to note that advances within medical science and newer approaches in drug development will require predictive toxicology applications to be viable. The field of computational toxicology has been heading in a direction more relevant to human diseases by reducing adverse drug reactions. Therefore, efforts must be directed to integrating these tools toward the goal of preventing undesired toxicity in pre-clinical trials followed by different phases of clinical trials.

  2. Structure-based prediction of subtype-selectivity of Histamine H3 receptor selective antagonists in clinical trials

    PubMed Central

    Kim, Soo-Kyung; Fristrup, Peter; Abrol, Ravinder; Goddard, William A.

    2011-01-01

    Histamine receptors (HRs) are excellent drug targets for the treatment of diseases such as schizophrenia, psychosis, depression, migraine, allergies, asthma, ulcers, and hypertension. Among them, the human H3 histamine receptor (hH3HR) antagonists have been proposed for specific therapeutic applications, including treatment of Alzheimer's disease, attention deficit hyperactivity disorder (ADHD), epilepsy, and obesity. However, many of these drug candidates cause undesired side effects through cross-reactivity with other histamine receptor subtypes. In order to develop improved selectivity and activity for such treatments it would be useful to have the three-dimensional structures for all four HRs. We report here the predicted structures of four HR subtypes (H1, H2, H3, and H4) using the GEnSeMBLE (GPCR Ensemble of Structures in Membrane BiLayer Environment) Monte Carlo protocol, sampling ~35 million combinations of helix packings to predict the 10 most stable packings for each of the four subtypes. Then we used these best 10 protein structures with the DarwinDock Monte Carlo protocol to sample ~50,000 × 20 poses to predict the optimum ligand-protein structures for various agonists and antagonists. We find that E206(5.46) contributes most in binding H3 selective agonists (5, 6, 7), in agreement with experimental mutation studies. We also find that the conserved E5.46/S5.43 in both hH3HR and hH4HR are involved in H3/H4 subtype selectivity. In addition, we find that M378(6.55) in hH3HR provides additional hydrophobic interactions different from hH4HR (the corresponding amino acid is T323(6.55) in hH4HR) to provide additional subtype bias. From these studies we developed a pharmacophore model based on our predictions for known hH3HR selective antagonists in clinical study [ABT-239 1, GSK-189,254 2, PF-3654746 3, and BF2.649 (Tiprolisant) 4] that suggests critical selectivity directing elements are: the basic proton interacting with D114(3.32), the spacer, the aromatic

  3. Finite element prediction of seismic response modification of monumental structures utilizing base isolation

    NASA Astrophysics Data System (ADS)

    Spanos, Konstantinos; Anifantis, Nikolaos; Kakavas, Panayiotis

    2015-05-01

    The analysis of the mechanical behavior of ancient structures is an essential engineering task concerning the preservation of architectural heritage. As many monuments of classical antiquity are located in regions of earthquake activity, the safety assessment of these structures, as well as the selection of possible restoration interventions, requires numerical models capable of correctly representing their seismic response. The work presented herein was part of a research project in which a better understanding of the dynamics of classical column-architrave structures was sought by means of numerical techniques. In this paper, the seismic behavior of ancient monumental structures with multi-drum classical columns is investigated. In particular, the column-architrave classical structure under strong ground excitations was represented by a finite element method. This approach simulates the individual rock blocks as distinct rigid blocks interconnected with slidelines and incorporates seismic isolation dampers under the basement of the structure. Sliding and rocking motions of individual stone blocks and drums are modeled utilizing non-linear frictional contact conditions. The seismic isolation is modeled through the application of pad bearings under the basement of the structure. These pads are interpreted by appropriate rubber and steel layers. Time domain analyses were performed, considering the geometric and material non-linear behavior at the joints and the characteristics of pad bearings. The deformation and failure modes of drum columns subject to seismic excitations of various types and intensities were analyzed. The adverse influence of drum imperfections on structural safety was also examined.

  4. Real-time prediction of atmospheric Lagrangian coherent structures based on forecast data: An application and error analysis

    NASA Astrophysics Data System (ADS)

    BozorgMagham, Amir E.; Ross, Shane D.; Schmale, David G.

    2013-09-01

    The language of Lagrangian coherent structures (LCSs) provides a new means for studying transport and mixing of passive particles advected by an atmospheric flow field. Recent observations suggest that LCSs govern the large-scale atmospheric motion of airborne microorganisms, paving the way for more efficient models and management strategies for the spread of infectious diseases affecting plants, domestic animals, and humans. In addition, having reliable predictions of the timing of hyperbolic LCSs may contribute to improved aerobiological sampling of microorganisms with unmanned aerial vehicles and LCS-based early warning systems. Chaotic atmospheric dynamics lead to unavoidable forecasting errors in the wind velocity field, which compounds errors in LCS forecasting. In this study, we reveal the cumulative effects of errors of (short-term) wind field forecasts on the finite-time Lyapunov exponent (FTLE) fields and the associated LCSs when realistic forecast plans impose certain limits on the forecasting parameters. Objectives of this paper are to (a) quantify the accuracy of prediction of FTLE-LCS features and (b) determine the sensitivity of such predictions to forecasting parameters. Results indicate that forecasts of attracting LCSs exhibit less divergence from the archive-based LCSs than the repelling features. This result is important since attracting LCSs are the backbone of long-lived features in moving fluids. We also show under what circumstances one can trust the forecast results if one merely wants to know if an LCS passed over a region and does not need to precisely know the passage time.
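
    For readers unfamiliar with the FTLE field mentioned in this abstract, the standard definition is σ = (1/|T|) ln √λ_max(C), where C = FᵀF is the Cauchy-Green tensor built from the flow-map gradient F over integration time T. The following is a minimal two-dimensional sketch of that definition, assuming F is already known; the function name and the analytic test flow are illustrative and are not taken from the paper, which works with forecast wind fields.

```python
import math


def ftle_2d(F, T):
    """Finite-time Lyapunov exponent from a 2x2 flow-map gradient F.

    F is a nested list [[a, b], [c, d]]; T is the integration time.
    """
    (a, b), (c, d) = F
    # Cauchy-Green tensor C = F^T F (symmetric 2x2).
    c11 = a * a + c * c
    c12 = a * b + c * d
    c22 = b * b + d * d
    # Largest eigenvalue of a symmetric 2x2 matrix in closed form.
    mean = 0.5 * (c11 + c22)
    radius = math.sqrt((0.5 * (c11 - c22)) ** 2 + c12 ** 2)
    lam_max = mean + radius
    return math.log(math.sqrt(lam_max)) / abs(T)


# Linear saddle flow dx/dt = x, dy/dt = -y: the flow-map gradient is
# diag(e^T, e^-T), so the FTLE equals the stretching rate of 1.
T = 2.0
F = [[math.exp(T), 0.0], [0.0, math.exp(-T)]]
print(ftle_2d(F, T))
```

    In practice F would be estimated by finite differences of advected particle positions on a grid, which is where errors in the forecast wind field enter.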

  5. Protein structure prediction using hybrid AI methods

    SciTech Connect

    Guan, X.; Mural, R.J.; Uberbacher, E.C.

    1993-11-01

    This paper describes a new approach for predicting protein structures based on Artificial Intelligence methods and genetic algorithms. We combine nearest neighbor searching algorithms, neural networks, heuristic rules and genetic algorithms to form an integrated system to predict protein structures from their primary amino acid sequences. First we describe our methods and how they are integrated, and then apply our methods to several protein sequences. The results are very close to the real structures obtained by crystallography. Parallel genetic algorithms are also implemented.

  6. Protein Structure Prediction with Visuospatial Analogy

    NASA Astrophysics Data System (ADS)

    Davies, Jim; Glasgow, Janice; Kuo, Tony

    We show that visuospatial representations and reasoning techniques can be used as a similarity metric for analogical protein structure prediction. Our system retrieves pairs of α-helices based on contact map similarity, then transfers and adapts the structure information to an unknown helix pair, showing that similar protein contact maps predict similar 3D protein structure. The success of this method provides support for the notion that changing representations can enable similarity metrics in analogy.
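
    As a rough illustration of what comparing contact maps involves, the sketch below builds binary contact maps from coordinates and scores two maps by the fraction of matching entries. This is a deliberately simple stand-in, not the visuospatial similarity metric of the paper; the 8 Å cutoff, the function names, and the coordinates are all assumptions made for the example.

```python
import math


def contact_map(coords, cutoff=8.0):
    """Binary contact map: 1 where two points lie within `cutoff` of each other."""
    n = len(coords)
    return [
        [1 if math.dist(coords[i], coords[j]) <= cutoff else 0 for j in range(n)]
        for i in range(n)
    ]


def map_similarity(map_a, map_b):
    """Fraction of entries on which two equally sized contact maps agree."""
    n = len(map_a)
    matches = sum(map_a[i][j] == map_b[i][j] for i in range(n) for j in range(n))
    return matches / (n * n)


# Hypothetical three-residue coordinates (angstroms), for illustration only.
coords_a = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
coords_b = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
print(map_similarity(contact_map(coords_a), contact_map(coords_b)))
```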

  7. Interplay of I-TASSER and QUARK for template-based and ab initio protein structure prediction in CASP10

    PubMed Central

    Zhang, Yang

    2014-01-01

    We develop and test a new pipeline in CASP10 to predict protein structures based on an interplay of I-TASSER and QUARK for both free-modeling (FM) and template-based modeling (TBM) targets. The most noteworthy observation is that sorting through the threading template pool using the QUARK-based ab initio models as probes allows the detection of distant-homology templates which might be ignored by the traditional sequence profile-based threading alignment algorithms. Further template assembly refinement by I-TASSER resulted in successful folding of two medium-sized FM targets with >150 residues. For TBM, the multiple threading alignments from LOMETS are, for the first time, incorporated into the ab initio QUARK simulations, which were further refined by I-TASSER assembly refinement. Compared with the traditional threading assembly refinement procedures, the inclusion of the threading-constrained ab initio folding models can consistently improve the quality of the full-length models as assessed by the GDT-HA and hydrogen-bonding scores. Despite the success, significant challenges still exist in domain boundary prediction and consistent folding of medium-size proteins (especially beta-proteins) for nonhomologous targets. Further developments of sensitive fold-recognition and ab initio folding methods are critical for solving these problems. PMID:23760925
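
    The GDT-HA score used for assessment here averages, over the distance cutoffs 0.5, 1, 2 and 4 Å, the percentage of residues whose modeled position lies within the cutoff of the native position (GDT-TS uses 1, 2, 4 and 8 Å). The sketch below is a simplified version that assumes the per-residue distances from a single superposition are already available; the full procedure searches over many superpositions to maximize each percentage, which is omitted. Names and distances are illustrative.

```python
GDT_HA_CUTOFFS = (0.5, 1.0, 2.0, 4.0)
GDT_TS_CUTOFFS = (1.0, 2.0, 4.0, 8.0)


def gdt_score(distances, cutoffs):
    """Average percentage of residues within each cutoff (single superposition)."""
    n = len(distances)
    fractions = [sum(d <= c for d in distances) / n for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)


# Hypothetical per-residue C-alpha deviations in angstroms.
ca_distances = [0.3, 0.8, 1.5, 3.0, 6.0]
print(gdt_score(ca_distances, GDT_HA_CUTOFFS))
print(gdt_score(ca_distances, GDT_TS_CUTOFFS))
```

    The tighter cutoffs are why GDT-HA is the more demanding of the two scores for the same model.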

  8. Interplay of I-TASSER and QUARK for template-based and ab initio protein structure prediction in CASP10.

    PubMed

    Zhang, Yang

    2014-02-01

    We develop and test a new pipeline in CASP10 to predict protein structures based on an interplay of I-TASSER and QUARK for both free-modeling (FM) and template-based modeling (TBM) targets. The most noteworthy observation is that sorting through the threading template pool using the QUARK-based ab initio models as probes allows the detection of distant-homology templates which might be ignored by the traditional sequence profile-based threading alignment algorithms. Further template assembly refinement by I-TASSER resulted in successful folding of two medium-sized FM targets with >150 residues. For TBM, the multiple threading alignments from LOMETS are, for the first time, incorporated into the ab initio QUARK simulations, which were further refined by I-TASSER assembly refinement. Compared with the traditional threading assembly refinement procedures, the inclusion of the threading-constrained ab initio folding models can consistently improve the quality of the full-length models as assessed by the GDT-HA and hydrogen-bonding scores. Despite the success, significant challenges still exist in domain boundary prediction and consistent folding of medium-size proteins (especially beta-proteins) for nonhomologous targets. Further developments of sensitive fold-recognition and ab initio folding methods are critical for solving these problems. Copyright © 2013 Wiley Periodicals, Inc.

  9. Prediction of deleterious functional effects of amino acid mutations using a library of structure-based function descriptors.

    PubMed

    Herrgard, Sanna; Cammer, Stephen A; Hoffman, Brian T; Knutson, Stacy; Gallina, Marijo; Speir, Jeffrey A; Fetrow, Jacquelyn S; Baxter, Susan M

    2003-12-01

    An automated, active site-focused, computational method is described for use in predicting the effects of engineered amino acid mutations on enzyme catalytic activity. The method uses structure-based function descriptors (Fuzzy Functional Forms™ or FFFs™) to automatically identify enzyme functional sites in proteins. Three-dimensional sequence profiles are created from the surrounding active site structure. The computationally derived active site profile is used to analyze the effect of each amino acid change by defining three key features: proximity of the change to the active site, degree of amino acid conservation at the position in related proteins, and compatibility of the change with residues observed at that position in similar proteins. The features were analyzed using a data set of individual amino acid mutations occurring at 128 residue positions in 14 different enzymes. The results show that changes at key active site residues and at highly conserved positions are likely to have deleterious effects on the catalytic activity, and that non-conservative mutations at highly conserved residues are even more likely to be deleterious. Interestingly, the study revealed that amino acid substitutions at residues in close contact with the key active site residues are not more likely to have deleterious effects than mutations more distant from the active site. Utilization of the FFF-derived structural information yields a prediction method that is accurate in 79-83% of the test cases. The success of this method across all six EC classes suggests that it can be used generally to predict the effects of mutations and nsSNPs for enzymes. Future applications of the approach include automated, large-scale identification of deleterious nsSNPs in clinical populations and in large sets of disease-associated nsSNPs, and identification of deleterious nsSNPs in drug targets and drug metabolizing enzymes. Copyright 2003 Wiley-Liss, Inc.

  10. The MULTICOM toolbox for protein structure prediction.

    PubMed

    Cheng, Jianlin; Li, Jilong; Wang, Zheng; Eickholt, Jesse; Deng, Xin

    2012-04-30

    As genome sequencing is becoming routine in biomedical research, the total number of protein sequences is increasing exponentially, recently reaching over 108 million. However, only a tiny portion of these proteins (i.e. ~75,000 or < 0.07%) have solved tertiary structures determined by experimental techniques. The gap between protein sequence and structure continues to enlarge rapidly as the throughput of genome sequencing techniques is much higher than that of protein structure determination techniques. Computational software tools for predicting protein structure and structural features from protein sequences are crucial to make use of this vast repository of protein resources. To meet the need, we have developed a comprehensive MULTICOM toolbox consisting of a set of protein structure and structural feature prediction tools. These tools include secondary structure prediction, solvent accessibility prediction, disorder region prediction, domain boundary prediction, contact map prediction, disulfide bond prediction, beta-sheet topology prediction, fold recognition, multiple template combination and alignment, template-based tertiary structure modeling, protein model quality assessment, and mutation stability prediction. These tools have been rigorously tested by many users in the last several years and/or during the last three rounds of the Critical Assessment of Techniques for Protein Structure Prediction (CASP7-9) from 2006 to 2010, achieving state-of-the-art or near performance. In order to facilitate bioinformatics research and technological development in the field, we have made the MULTICOM toolbox freely available as web services and/or software packages for academic use and scientific research. It is available at http://sysbio.rnet.missouri.edu/multicom_toolbox/.

  11. The MULTICOM toolbox for protein structure prediction

    PubMed Central

    2012-01-01

    Background: As genome sequencing is becoming routine in biomedical research, the total number of protein sequences is increasing exponentially, recently reaching over 108 million. However, only a tiny portion of these proteins (i.e. ~75,000 or < 0.07%) have solved tertiary structures determined by experimental techniques. The gap between protein sequence and structure continues to enlarge rapidly as the throughput of genome sequencing techniques is much higher than that of protein structure determination techniques. Computational software tools for predicting protein structure and structural features from protein sequences are crucial to make use of this vast repository of protein resources. Results: To meet the need, we have developed a comprehensive MULTICOM toolbox consisting of a set of protein structure and structural feature prediction tools. These tools include secondary structure prediction, solvent accessibility prediction, disorder region prediction, domain boundary prediction, contact map prediction, disulfide bond prediction, beta-sheet topology prediction, fold recognition, multiple template combination and alignment, template-based tertiary structure modeling, protein model quality assessment, and mutation stability prediction. Conclusions: These tools have been rigorously tested by many users in the last several years and/or during the last three rounds of the Critical Assessment of Techniques for Protein Structure Prediction (CASP7-9) from 2006 to 2010, achieving state-of-the-art or near performance. In order to facilitate bioinformatics research and technological development in the field, we have made the MULTICOM toolbox freely available as web services and/or software packages for academic use and scientific research. It is available at http://sysbio.rnet.missouri.edu/multicom_toolbox/. PMID:22545707

  12. Predicting the Effect of Mutations on Protein-Protein Binding Interactions through Structure-Based Interface Profiles

    PubMed Central

    Brender, Jeffrey R.; Zhang, Yang

    2015-01-01

    The formation of protein-protein complexes is essential for proteins to perform their physiological functions in the cell. Mutations that prevent the proper formation of the correct complexes can have serious consequences for the associated cellular processes. Since experimental determination of protein-protein binding affinity remains difficult when performed on a large scale, computational methods for predicting the consequences of mutations on binding affinity are highly desirable. We show that a scoring function based on interface structure profiles collected from analogous protein-protein interactions in the PDB is a powerful predictor of protein binding affinity changes upon mutation. As a standalone feature, the difference between the interface profile scores of the mutant and wild-type proteins has an accuracy equivalent to the best all-atom potentials, despite being two orders of magnitude faster once the profile has been constructed. Due to its unique sensitivity in collecting the evolutionary profiles of analogous binding interactions and the high speed of calculation, the interface profile score has additional advantages as a complementary feature to combine with physics-based potentials for improving the accuracy of composite scoring approaches. By incorporating the sequence-derived and residue-level coarse-grained potentials with the interface structure profile score, a composite model was constructed through random forest training, which generates a Pearson correlation coefficient >0.8 between the predicted and observed binding free-energy changes upon mutation. This accuracy is comparable to, or in most cases better than, the current best methods, but does not require high-resolution full-atomic models of the mutant structures. The binding interface profiling approach should find useful application in human-disease mutation recognition and protein interface design studies. PMID:26506533

  13. Static compressive strength prediction of open-hole structure based on non-linear shear behavior and micro-mechanics

    NASA Astrophysics Data System (ADS)

    Li, Wangnan; Cai, Hongneng; Li, Chao

    2014-11-01

    This paper deals with the characterization of the strength of the constituents of carbon fiber reinforced plastic (CFRP) laminate, and a prediction of the static compressive strength of open-hole structures of polymer composites. An approach combining non-linear analysis at the macro level with a linear elastic micromechanical failure analysis at the micro level (non-linear MMF) is proposed to improve the prediction accuracy. A face-centered cubic micromechanics model is constructed to analyze the stresses in fiber and matrix at the micro level. Non-interactive failure criteria are proposed to characterize the strength of fiber and matrix. The non-linear shear behavior of the laminate is studied experimentally, and a novel approach of cubic spline interpolation is used to capture the significant non-linear shear behavior of the laminate. The user-defined material subroutine UMAT for the non-linear shear behavior is developed and combined with the mechanics analysis at the macro level using Abaqus Python codes. The failure mechanism and static strength of open-hole compressive (OHC) structures of polymer composites are studied based on non-linear MMF. The UTS50/E51 CFRP is used to demonstrate the application of the theory of non-linear MMF.

  14. A chromatin structure-based model accurately predicts DNA replication timing in human cells.

    PubMed

    Gindin, Yevgeniy; Valenzuela, Manuel S; Aladjem, Mirit I; Meltzer, Paul S; Bilke, Sven

    2014-03-28

    The metazoan genome is replicated in precise cell lineage-specific temporal order. However, the mechanism controlling this orchestrated process is poorly understood as no molecular mechanisms have been identified that actively regulate the firing sequence of genome replication. Here, we develop a mechanistic model of genome replication capable of predicting, with accuracy rivaling experimental repeats, observed empirical replication timing program in humans. In our model, replication is initiated in an uncoordinated (time-stochastic) manner at well-defined sites. The model contains, in addition to the choice of the genomic landmark that localizes initiation, only a single adjustable parameter of direct biological relevance: the number of replication forks. We find that DNase-hypersensitive sites are optimal and independent determinants of DNA replication initiation. We demonstrate that the DNA replication timing program in human cells is a robust emergent phenomenon that, by its very nature, does not require a regulatory mechanism determining a proper replication initiation firing sequence.

  15. (PS)2: protein structure prediction server

    PubMed Central

    Chen, Chih-Chieh; Hwang, Jenn-Kang; Yang, Jinn-Moon

    2006-01-01

    Protein structure prediction provides valuable insights into function, and comparative modeling is one of the most reliable methods to predict 3D structures directly from amino acid sequences. However, critical problems arise during the selection of the correct templates and the alignment of query sequences therewith. We have developed an automatic protein structure prediction server, (PS)2, which uses an effective consensus strategy both in template selection, which combines PSI-BLAST and IMPALA, and target–template alignment integrating PSI-BLAST, IMPALA and T-Coffee. (PS)2 was evaluated for 47 comparative modeling targets in CASP6 (Critical Assessment of Techniques for Protein Structure Prediction). For the benchmark dataset, the predictive performance of (PS)2, based on the mean GDT_TS score, was superior to 10 other automatic servers. Our method is based solely on the consensus sequence and thus is considerably faster than other methods that rely on the additional structural consensus of templates. Our results show that (PS)2, coupled with suitable consensus strategies and a new similarity score, can significantly improve structure prediction. Our approach should be useful in structure prediction and modeling. The (PS)2 is available through the website at . PMID:16844981

  16. An efficient fragment-based approach for predicting the ground-state energies and structures of large molecules.

    PubMed

    Li, Shuhua; Li, Wei; Fang, Tao

    2005-05-18

    An efficient fragment-based approach for predicting the ground-state energies and structures of large molecules at the Hartree-Fock (HF) and post-HF levels is described. The physical foundation of this approach is attributed to the "quantum locality" of the electron correlation energy and the HF total energy, which is revealed by a new energy decomposition analysis of the HF total energy proposed in this work. This approach is based on the molecular fractionation with conjugated caps (MFCC) scheme (Zhang, D. W.; Zhang, J. Z. H. J. Chem. Phys. 2003, 119, 3599), by which a macromolecule is partitioned into various capped fragments and conjugated caps formed by two adjacent caps. We find that the MFCC scheme, if corrected by the interaction between non-neighboring fragments, can be used to predict the total energy of large molecules only from energy calculations on a series of small subsystems. The approach, named energy-corrected MFCC (EC-MFCC), computationally achieves linear scaling with the molecular size. Our test calculations on a broad range of medium and large molecules demonstrate that this approach is able to reproduce the conventional HF and second-order Møller-Plesset perturbation theory (MP2) energies within a few millihartree in most cases. With the EC-MFCC optimization algorithm described in this work, we have obtained the optimized structures of long oligomers of trans-polyacetylene and BN nanotubes with up to about 400 atoms, which are beyond the reach of traditional computational methods. In addition, the EC-MFCC approach is also applied to estimate the heats of formation for a series of organic compounds. This approach provides an appealing alternative to the traditional additivity rules based on either bond or group contributions for the estimation of thermochemical properties.
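
    The bookkeeping behind the MFCC energy estimate can be sketched as follows: the energies of the capped fragments are summed, the energies of the conjugated caps (which would otherwise be double counted) are subtracted, and correction terms for non-neighboring fragment interactions are added. This is a minimal sketch under the assumption that all subsystem energies have already been computed by some quantum chemistry program; how the paper evaluates the correction terms is not stated in the abstract, so they are simply passed in. The function name and the numbers are hypothetical.

```python
def ec_mfcc_energy(capped_fragment_energies, concap_energies,
                   nonneighbor_corrections=()):
    """Total energy estimate assembled from subsystem energies.

    capped_fragment_energies: energies of the capped fragments
    concap_energies: energies of the conjugated caps (subtracted)
    nonneighbor_corrections: precomputed interaction energies between
        non-neighboring fragments (assumed to be supplied by the caller)
    """
    return (sum(capped_fragment_energies)
            - sum(concap_energies)
            + sum(nonneighbor_corrections))


# Hypothetical energies in hartree, for illustration only.
fragments = [-10.0, -12.0, -11.0]
concaps = [-3.0, -4.0]
corrections = [-0.5]
print(ec_mfcc_energy(fragments, concaps, corrections))
```

    Because each term involves only a small subsystem and the number of terms grows with chain length, the cost of assembling the total grows roughly linearly with molecular size, which is the scaling the abstract reports.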

  17. Lewis base complexes of AlH3: prediction of preferred structure and stoichiometry.

    PubMed

    Humphries, Terry D; Munroe, Keelie T; Decken, Andreas; McGrady, G Sean

    2013-05-21

    The structures adopted by a range of complexes AlH3·nL, (n = 1 or 2), have been explored in detail to identify the factors that determine the value of n, and whether a monomeric or dimeric arrangement is preferred for the 1 : 1 complexes. Single-crystal X-ray diffraction, vibrational and NMR spectroscopies, and thermal analysis data have been collected, DFT calculations have been performed for AlH3·nL species, and pK(a) values have been collated for a series of amine and phosphine ligands L. The pK(a) of the ligand L exerts an important influence on the type of complex formed: as the basicity of L increases, a monomeric structure is favoured over a dimeric arrangement. Dimeric amine complexes form if pK(a) < 9.76, while monomeric complexes are preferred when pK(a) > 9.99. The steric requirements of L also influence the structural preference: bulky ligands with large cone angles (>163°) tend to favour formation of monomers, while smaller cone angles (<125°) encourage the formation of dimeric or 1 : 2 adducts. The steric bulk of the ligand appears to be more important for phosphine complexes, with smaller phosphines being unable to stabilise the complex at ambient temperatures even through dimerisation. Raman spectroscopy and DFT calculations have been particularly helpful in elucidating the stoichiometric preferences of complexes that have been contentious; these include AlH3·NMe2Et, AlH3·NMe3 and AlH3·nEt2O.
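
    The thresholds reported in this abstract can be written as a simple rule of thumb. The sketch below encodes only the numbers given above (pK(a) 9.76 and 9.99 for amines, cone angles 125° and 163°) and returns "undetermined" for values between the thresholds, where the abstract makes no claim. The function names are illustrative, and the rules are trends rather than guarantees.

```python
def amine_complex_preference(pka):
    """Preferred arrangement of a 1:1 AlH3-amine complex from ligand pK(a)."""
    if pka < 9.76:
        return "dimeric"
    if pka > 9.99:
        return "monomeric"
    return "undetermined"


def steric_preference(cone_angle_deg):
    """Structural tendency suggested by the ligand cone angle in degrees."""
    if cone_angle_deg > 163:
        return "monomeric"
    if cone_angle_deg < 125:
        return "dimeric or 1:2 adduct"
    return "undetermined"


# Hypothetical ligand values, for illustration only.
print(amine_complex_preference(10.5), steric_preference(170))
print(amine_complex_preference(9.0), steric_preference(118))
```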

  18. Annotation inconsistencies beyond sequence similarity-based function prediction - phylogeny and genome structure.

    PubMed

    Promponas, Vasilis J; Iliopoulos, Ioannis; Ouzounis, Christos A

    2015-01-01

    The function annotation process in computational biology has increasingly shifted from the traditional characterization of individual biochemical roles of protein molecules to the system-wide detection of entire metabolic pathways and genomic structures. The so-called genome-aware methods broaden misannotation inconsistencies in genome sequences beyond protein function assignments, encompassing phylogenetic anomalies and artifactual genomic regions. We outline three categories of error propagation in databases by providing striking examples - at various levels of appreciation by the community from traditional to emerging, thus raising awareness for future solutions.

  19. Effective Optimization Algorithms for Fragment-Assembly Based Protein Structure Prediction

    DTIC Science & Technology

    2006-03-27

    approach. Our experiments test these algorithms on a diverse set of 276 protein domains derived from SCOP 1.69 [14]. The results of these...using a set of proteins with known structure that was derived from SCOP 1.69 [14] as follows. Starting from the set of domains in SCOP, we first...intervals and SCOP class. Sequence length by SCOP class (< 100, 100–200, > 200, total): alpha 23, 40, 6, 69; beta 23, 27, 18, 69; alpha/beta 4, 26, 39, 69; alpha+beta 15, 36, 17

  20. A physical approach to protein structure prediction.

    PubMed Central

    Crivelli, Silvia; Eskow, Elizabeth; Bader, Brett; Lamberti, Vincent; Byrd, Richard; Schnabel, Robert; Head-Gordon, Teresa

    2002-01-01

    We describe our global optimization method called Stochastic Perturbation with Soft Constraints (SPSC), which uses information from known proteins to predict secondary structure, but not in the tertiary structure predictions or in generating the terms of the physics-based energy function. Our approach is also characterized by the use of an all atom energy function that includes a novel hydrophobic solvation function derived from experiments that shows promising ability for energy discrimination against misfolded structures. We present the results obtained using our SPSC method and energy function for blind prediction in the 4th Critical Assessment of Techniques for Protein Structure Prediction competition, and show that our approach is more effective on targets for which less information from known proteins is available. In fact our SPSC method produced the best prediction for one of the most difficult targets of the competition, a new fold protein of 240 amino acids. PMID:11751294

  1. "Adapted Linear Interaction Energy": A Structure-Based LIE Parametrization for Fast Prediction of Protein-Ligand Affinities.

    PubMed

    Linder, Mats; Ranganathan, Anirudh; Brinck, Tore

    2013-02-12

    We present a structure-based parametrization of the Linear Interaction Energy (LIE) method and show that it allows for the prediction of absolute protein-ligand binding energies. We call the new model "Adapted" LIE (ALIE) because the α and β coefficients are defined by system-dependent descriptors and therefore do not require any empirical γ term. The best formulation attains a mean average deviation of 1.8 kcal/mol for a diverse test set and depends on only one fitted parameter. It is robust with respect to additional fitting and cross-validation. We compare this new approach with standard LIE by Åqvist and co-workers and the LIE + γSASA model (initially suggested by Jorgensen and co-workers) against in-house and external data sets and discuss their applicability.
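
    The standard LIE form that this record compares against can be sketched as below. This is a minimal illustration, assuming the usual expression dG = alpha * d<V_vdW> + beta * d<V_el> + gamma, where each difference is the bound-state average minus the free-state average. The function name, the default coefficients and the sample energies are placeholders, not values from the record, and the descriptor-based ALIE coefficients are not reproduced here.

```python
from statistics import mean


def lie_binding_energy(vdw_bound, vdw_free, el_bound, el_free,
                       alpha=0.18, beta=0.5, gamma=0.0):
    """Estimate a binding free energy (kcal/mol) with the standard LIE form.

    Each argument is a sequence of ligand-surroundings interaction energies
    sampled from a simulation. alpha, beta and gamma are placeholder
    coefficients chosen for illustration only.
    """
    # Difference of ensemble averages: bound state minus free (solvated) state.
    d_vdw = mean(vdw_bound) - mean(vdw_free)
    d_el = mean(el_bound) - mean(el_free)
    return alpha * d_vdw + beta * d_el + gamma


if __name__ == "__main__":
    # Hypothetical energy samples, invented for this example.
    print(lie_binding_energy([-40.0, -42.0], [-20.0, -22.0],
                             [-60.0, -62.0], [-50.0, -52.0]))
```

    Setting gamma to zero mirrors the abstract's point that ALIE needs no empirical γ term, although ALIE obtains α and β from system descriptors rather than from fixed constants as done here.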

  2. Microstructure-Based Fatigue Life Prediction Methods for Naval Steel Structures

    DTIC Science & Technology

    1994-09-12

    Subsequent crack growth models based on the concept of a microstructural unit size where the damage criterion is applied include those by Antolovich ... The present model is quite similar to that of Antolovich et al. The essential difference between the two models is...dislocation cell size is considered in the latter model. The model of Antolovich et al. can be expressed in a form similar to Eq. ..., resulting

  3. Computational Prediction of RNA Tertiary Structure

    NASA Astrophysics Data System (ADS)

    Zhao, Yunjie; Gong, Zhou; Chen, Changjun; Xiao, Yi

    2012-02-01

    RNAs have been found to be involved in biological processes. Large RNAs usually consist of two basic elements: hairpins and duplexes. Because experimental structure determination is difficult, few RNA tertiary structures are available, which limits our understanding of specific regulation mechanisms and functions. RNA tertiary structure prediction is therefore very important for understanding RNA biological functions. Since RNA often folds hierarchically, one possible approach to RNA structure prediction is to proceed through hierarchical steps. Here, we focus on a prediction method for RNA tertiary hairpin and duplex structures that assembles small tertiary structure fragments from well-defined RNA structural motifs. In a benchmark test against known experimental structures, more than half of the predictions agree with the experimental structure to better than 3 Å RMSD over all heavy atoms. The predictions also reproduce the native-like complementary base pairs of the secondary structures. Most importantly, the method achieves atomic accuracy for tertiary structures within several minutes. We expect that the method will be a useful resource for RNA tertiary structure prediction and helpful to the biological research community.
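
    The heavy-atom RMSD used as the benchmark criterion above can be sketched as follows. This is a minimal illustration that assumes the two structures are already superimposed and that atoms are listed in the same order; the function name and coordinates are hypothetical, and no optimal superposition step is included.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched lists of (x, y, z) points.

    Assumes both structures are already superimposed and atom i in coords_a
    corresponds to atom i in coords_b.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))


if __name__ == "__main__":
    predicted = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    experimental = [(0.0, 0.0, 3.0), (1.0, 0.0, 3.0)]
    # Compare against the 3 Å threshold quoted in the abstract.
    print(rmsd(predicted, experimental) < 3.0)
```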

  4. Reliability prediction of large fuel cell stack based on structure stress analysis

    NASA Astrophysics Data System (ADS)

    Liu, L. F.; Liu, B.; Wu, C. W.

    2017-09-01

    The aim of this paper is to improve the reliability of a Proton Electrolyte Membrane Fuel Cell (PEMFC) stack by designing the clamping force and the thickness difference between the membrane electrode assembly (MEA) and the gasket. The stack reliability is directly determined by the component reliability, which is affected by the material properties and contact stress. The component contact stress is a random variable because it is usually affected by many uncertain factors in the production and clamping process. We have investigated the influence of the parameter variation coefficients on the probability distribution of contact stress using the equivalent stiffness model and the first-order second moment method. The optimal contact stress, at which the component reaches its highest reliability, is obtained from the stress-strength interference model. To achieve this optimal contact stress between the contacting components, the component thickness and the stack clamping force are optimized. Finally, a detailed description is given of how to design the MEA and gasket dimensions to obtain the highest stack reliability. This work can provide valuable guidance for designing stack structures with high reliability.
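
    The two statistical steps named above can be sketched generically. This is a minimal illustration, assuming independent, normally distributed stress and strength for the interference model and a forward finite difference for the first-order second moment (FOSM) step. The function names and the force-over-area stress function in the usage lines are hypothetical stand-ins; the paper's equivalent stiffness model is not reproduced.

```python
import math
from statistics import NormalDist


def fosm(g, means, stds, rel_step=1e-6):
    """First-order second moment estimate of the mean and std of g(x).

    g takes a list of parameter values. Derivatives are approximated by
    forward differences around the parameter means; parameters are assumed
    to be uncorrelated.
    """
    g0 = g(list(means))
    variance = 0.0
    for i, (m, s) in enumerate(zip(means, stds)):
        step = rel_step * max(abs(m), 1.0)
        shifted = list(means)
        shifted[i] = m + step
        slope = (g(shifted) - g0) / step
        variance += (slope * s) ** 2
    return g0, math.sqrt(variance)


def interference_reliability(mu_strength, sd_strength, mu_stress, sd_stress):
    """P(strength > stress) for independent normal strength and stress."""
    z = (mu_strength - mu_stress) / math.hypot(sd_strength, sd_stress)
    return NormalDist().cdf(z)


if __name__ == "__main__":
    # Hypothetical contact stress = clamping force / contact area.
    mu, sd = fosm(lambda x: x[0] / x[1], [100.0, 4.0], [10.0, 0.0])
    print(mu, sd, interference_reliability(40.0, 5.0, mu, sd))
```

    With normal variables the interference integral reduces to a single normal CDF evaluation, which is why the sketch needs no numerical integration; other distributions would require a different treatment.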

  5. Predicting binding modes of reversible peptide-based inhibitors of falcipain-2 consistent with structure-activity relationships.

    PubMed

    Hernández González, Jorge Enrique; Hernández Alvarez, Lilian; Pascutti, Pedro Geraldo; Valiente, Pedro A

    2017-09-01

    Falcipain-2 (FP-2) is a major hemoglobinase of Plasmodium falciparum, considered an important drug target for the development of antimalarials. A previous study reported a novel series of 20 reversible peptide-based inhibitors of FP-2. However, the lack of three-dimensional structures of the complexes hinders further optimization strategies to enhance the inhibitory activity of the compounds. Here we report the prediction of the binding modes of the aforementioned inhibitors to FP-2. A computational approach combining previous knowledge on the determinants of binding to the enzyme, docking, and postdocking refinement steps is employed. The latter steps comprise molecular dynamics simulations and free energy calculations. Remarkably, this approach leads to the identification of near-native ligand conformations when applied to a validation set of protein-ligand structures. Overall, we propose substrate-like binding modes of the studied compounds that fulfill the structural requirements for FP-2 binding and yield free energy values that correlate well with the experimental data. Proteins 2017; 85:1666-1683. © 2017 Wiley Periodicals, Inc.

  6. Predicting RNA structure: advances and limitations.

    PubMed

    Hofacker, Ivo L; Lorenz, Ronny

    2014-01-01

    RNA secondary structures can be predicted using efficient algorithms. A widely used software package implementing a large number of computational methods is the ViennaRNA Package. This chapter describes how to use programs from the ViennaRNA Package to perform common tasks such as prediction of minimum free-energy structures, suboptimal structures, or base pairing probabilities, and generating secondary structure plots with reliability annotation. Moreover, we present recent methods to assess the folding kinetics of an RNA via 2D projections of the energy landscape, identification of local minima and energy barriers, or simulation of RNA folding as a Markov process.
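
    ViennaRNA programs report secondary structures in dot-bracket notation, where matching parentheses mark paired positions and dots mark unpaired ones. The sketch below converts such a string into a list of base pairs. It is a stand-alone illustration that does not call the ViennaRNA Package; the function name is hypothetical and only plain, pseudoknot-free notation is handled.

```python
def dot_bracket_to_pairs(structure):
    """Return sorted 0-based (i, j) base pairs encoded by a dot-bracket string."""
    open_positions = []
    pairs = []
    for index, symbol in enumerate(structure):
        if symbol == "(":
            open_positions.append(index)
        elif symbol == ")":
            if not open_positions:
                raise ValueError("unbalanced ')' at position %d" % index)
            # The most recent unmatched '(' pairs with this ')'.
            pairs.append((open_positions.pop(), index))
        elif symbol != ".":
            raise ValueError("unexpected symbol %r" % symbol)
    if open_positions:
        raise ValueError("unbalanced '(' in structure")
    return sorted(pairs)


if __name__ == "__main__":
    print(dot_bracket_to_pairs("((..))"))
```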

  7. An allometry-based approach for understanding forest structure, predicting tree-size distribution and assessing the degree of disturbance

    PubMed Central

    Anfodillo, Tommaso; Carrer, Marco; Simini, Filippo; Popa, Ionel; Banavar, Jayanth R.; Maritan, Amos

    2013-01-01

    Tree-size distribution is one of the most investigated subjects in plant population biology. The forestry literature reports that tree-size distribution trajectories vary across different stands and/or species, whereas the metabolic scaling theory suggests that the tree number scales universally as −2 power of diameter. Here, we propose a simple functional scaling model in which these two opposing results are reconciled. Basic principles related to crown shape, energy optimization and the finite-size scaling approach were used to define a set of relationships based on a single parameter that allows us to predict the slope of the tree-size distributions in a steady-state condition. We tested the model predictions on four temperate mountain forests. Plots (4 ha each, fully mapped) were selected with different degrees of human disturbance (semi-natural stands versus formerly managed). Results showed that the size distribution range successfully fitted by the model is related to the degree of forest disturbance: in semi-natural forests the range is wide, whereas in formerly managed forests, the agreement with the model is confined to a very restricted range. We argue that simple allometric relationships, at an individual level, shape the structure of the whole forest community. PMID:23193128
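
    The scaling claim above (tree number proportional to diameter to the power −2) can be checked on binned data by fitting a straight line in log-log space. The sketch below is a minimal illustration using ordinary least squares; the function name and the sample counts are hypothetical, and the authors' actual fitting procedure is not described in this record.

```python
import math


def loglog_slope(diameters, counts):
    """Least-squares slope of log(count) against log(diameter)."""
    xs = [math.log(d) for d in diameters]
    ys = [math.log(c) for c in counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


if __name__ == "__main__":
    # Synthetic counts that follow an exact inverse-square law.
    diameters = [10.0, 20.0, 40.0, 80.0]
    counts = [1000.0 * d ** -2 for d in diameters]
    print(loglog_slope(diameters, counts))
```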

  8. An allometry-based approach for understanding forest structure, predicting tree-size distribution and assessing the degree of disturbance.

    PubMed

    Anfodillo, Tommaso; Carrer, Marco; Simini, Filippo; Popa, Ionel; Banavar, Jayanth R; Maritan, Amos

    2013-01-22

    Tree-size distribution is one of the most investigated subjects in plant population biology. The forestry literature reports that tree-size distribution trajectories vary across different stands and/or species, whereas the metabolic scaling theory suggests that the tree number scales universally as -2 power of diameter. Here, we propose a simple functional scaling model in which these two opposing results are reconciled. Basic principles related to crown shape, energy optimization and the finite-size scaling approach were used to define a set of relationships based on a single parameter that allows us to predict the slope of the tree-size distributions in a steady-state condition. We tested the model predictions on four temperate mountain forests. Plots (4 ha each, fully mapped) were selected with different degrees of human disturbance (semi-natural stands versus formerly managed). Results showed that the size distribution range successfully fitted by the model is related to the degree of forest disturbance: in semi-natural forests the range is wide, whereas in formerly managed forests, the agreement with the model is confined to a very restricted range. We argue that simple allometric relationships, at an individual level, shape the structure of the whole forest community.

  9. Water-Regulated Self-Assembly Structure Transformation and Gelation Behavior Prediction Based on a Hydrazide Derivative.

    PubMed

    Li, Yajie; Ran, Xia; Li, Qiuyue; Gao, Qiongqiong; Guo, Lijun

    2016-08-05

    Herein, we report the water-regulated supramolecular self-assembly structure transformation and the predictability of the gelation ability based on an azobenzene derivative bearing a hydrazide group, namely, N-(3,4,5-tributoxyphenyl)-N'-4-[(4-hydroxyphenyl)azophenyl] benzohydrazide (BNB-t4). The regulation effects are demonstrated in the morphological transformation from spherical to lamellar particles then back to spherical in different solvent ratios of n-propanol/water. The self-assembly behavior of BNB-t4 was characterized by minimum gelation concentration, microstructure, thermal, and mechanical stabilities. From the spectroscopy studies, it is suggested that gel formation of BNB-t4 is mainly driven by intermolecular hydrogen bonding, accompanied with the contribution from π-π stacking as well as hydrophobic interactions. The successfully established correlation between the self-assembly behavior and solubility parameters yields a facile way to predict the gelation performance of other molecules in other single or mixed solvents.

  10. Predicting pseudoknotted structures across two RNA sequences

    PubMed Central

    Sperschneider, Jana; Datta, Amitava; Wise, Michael J.

    2012-01-01

    Motivation: Laboratory RNA structure determination is demanding and costly; thus, computational structure prediction is an important task. Single sequence methods for RNA secondary structure prediction are limited by the accuracy of the underlying folding model; if a structure is supported by a family of evolutionarily related sequences, one can be more confident that the prediction is accurate. RNA pseudoknots are functional elements, which have highly conserved structures. However, few comparative structure prediction methods can handle pseudoknots due to the computational complexity. Results: A comparative pseudoknot prediction method called DotKnot-PW is introduced based on structural comparison of secondary structure elements and H-type pseudoknot candidates. DotKnot-PW outperforms other methods from the literature on a hand-curated test set of RNA structures with experimental support. Availability: DotKnot-PW and the RNA structure test set are available at the web site http://dotknot.csse.uwa.edu.au/pw. Contact: janaspe@csse.uwa.edu.au Supplementary information: Supplementary data are available at Bioinformatics online. PMID:23044552

  11. Direct prediction of profiles of sequences compatible with a protein structure by neural networks with fragment-based local and energy-based nonlocal profiles.

    PubMed

    Li, Zhixiu; Yang, Yuedong; Faraggi, Eshel; Zhan, Jian; Zhou, Yaoqi

    2014-10-01

    Locating sequences compatible with a protein structural fold is the well-known inverse protein-folding problem. While significant progress has been made, the success rate of protein design remains low. As a result, a library of designed sequences or profile of sequences is currently employed for guiding experimental screening or directed evolution. Sequence profiles can be computationally predicted by iterative mutations of a random sequence to produce energy-optimized sequences, or by combining sequences of structurally similar fragments in a template library. The latter approach is computationally more efficient but yields less accurate profiles than the former because of lacking tertiary structural information. Here we present a method called SPIN that predicts Sequence Profiles by Integrated Neural network based on fragment-derived sequence profiles and structure-derived energy profiles. SPIN improves over the fragment-derived profile by 6.7% (from 23.6 to 30.3%) in sequence identity between predicted and wild-type sequences. The method also reduces the number of residues in low complex regions by 15.7% and has a significantly better balance of hydrophilic and hydrophobic residues at protein surface. The accuracy of sequence profiles obtained is comparable to those generated from the protein design program RosettaDesign 3.5. This highly efficient method for predicting sequence profiles from structures will be useful as a single-body scoring term for improving scoring functions used in protein design and fold recognition. It also complements protein design programs in guiding experimental design of the sequence library for screening and directed evolution of designed sequences. The SPIN server is available at http://sparks-lab.org. © 2014 Wiley Periodicals, Inc.
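
    The sequence identity figures quoted above compare a predicted sequence with the wild-type sequence position by position. A minimal sketch of that measure follows, assuming two sequences of equal length with no alignment gaps; the function name and example sequences are hypothetical.

```python
def sequence_identity(predicted, wild_type):
    """Percentage of positions at which two equal-length sequences agree."""
    if len(predicted) != len(wild_type) or not predicted:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for p, w in zip(predicted, wild_type) if p == w)
    return 100.0 * matches / len(predicted)


if __name__ == "__main__":
    print(sequence_identity("ACDE", "ACDF"))
```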

  12. Direct prediction of profiles of sequences compatible to a protein structure by neural networks with fragment-based local and energy-based nonlocal profiles

    PubMed Central

    Li, Zhixiu; Yang, Yuedong; Faraggi, Eshel; Zhan, Jian; Zhou, Yaoqi

    2014-01-01

    Locating sequences compatible to a protein structural fold is the well-known inverse protein-folding problem. While significant progress has been made, the success rate of protein design remains low. As a result, a library of designed sequences or profile of sequences is currently employed for guiding experimental screening or directed evolution. Sequence profiles can be computationally predicted by iterative mutations of a random sequence to produce energy-optimized sequences, or by combining sequences of structurally similar fragments in a template library. The latter approach is computationally more efficient but yields less accurate profiles than the former because of lacking tertiary structural information. Here we present a method called SPIN that predicts Sequence Profiles by Integrated Neural network based on fragment-derived sequence profiles and structure-derived energy profiles. SPIN improves over the fragment-derived profile by 6.7% (from 23.6% to 30.3%) in sequence identity between predicted and wild-type sequences. The method also reduces the number of residues in low complex regions by 15.7% and has a significantly better balance of hydrophilic and hydrophobic residues at protein surfaces. The accuracy of sequence profiles obtained is comparable to those generated from the protein design program RosettaDesign 3.5. This highly efficient method for predicting sequence profiles from structures will be useful as a single-body scoring term for improving scoring functions used in protein design and fold recognition. It also complements protein design programs in guiding experimental design of the sequence library for screening and directed evolution of designed sequences. The SPIN server is available at http://sparks-lab.org. PMID:24898915

  13. Predicting crystal structures of organic compounds.

    PubMed

    Price, Sarah L

    2014-04-07

    Currently, organic crystal structure prediction (CSP) methods are based on searching for the most thermodynamically stable crystal structure, making various approximations in evaluating the crystal energy. The most stable (global minimum) structure provides a prediction of an experimental crystal structure. However, depending on the specific molecule, there may be other structures which are very close in energy. In this case, the other structures on the crystal energy landscape may be polymorphs, components of static or dynamic disorder in observed structures, or there may be no route to nucleating and growing these structures. A major reason for performing CSP studies is as a complement to solid form screening to see which alternative packings to the known polymorphs are thermodynamically feasible.

  14. Vertical Chlorophyll Canopy Structure Affects the Remote Sensing Based Predictability of LAI, Chlorophyll and Leaf Nitrogen in Agricultural Fields

    NASA Astrophysics Data System (ADS)

    Boegh, E.; Houborg, R.; Bienkowski, J.; Braban, C. F.; Dalgaard, T.; van Dijk, N.; Dragosits, U.; Holmes, E.; Magliulo, V.; Schelde, K.; Di Tommasi, P.; Vitale, L.; Theobald, M.; Cellier, P.; Sutton, M.

    2012-12-01

    Leaf nitrogen and leaf surface area influence the exchange of gases between terrestrial ecosystems and the atmosphere, and they play a significant role in the global cycles of carbon, nitrogen and water. Remote sensing can be used to estimate leaf area index (LAI), chlorophyll content (CHL) and leaf nitrogen (N), but methods are often developed using plot-scale data and not verified over extended regions characterized by variations in environmental boundary conditions (soil, atmosphere) and canopy structures. Estimation of N can be indirect due to its association with CHL; however, N is also included in pigments such as carotenoids and anthocyanin, which have different spectral signatures than CHL. Photosynthesis optimization theory suggests that plants will distribute their N resources in proportion to the light gradient within the canopy. Such vertical variation in CHL and N complicates the evaluation of remote sensing-based methods. Typically remote sensing studies measure CHL of the upper leaf, which is then multiplied by the green LAI to represent canopy chlorophyll content, or random sampling is used. In this study, field measurements and high spatial resolution (10-20 m) remote sensing images acquired from the HRG and HRVIR sensors aboard the SPOT satellites were used to assess the predictability of LAI, CHL and N in five European agricultural landscapes located in Denmark, Scotland (United Kingdom), Poland, The Netherlands and Italy. All satellite images were atmospherically corrected using the 6SV1 model with atmospheric inputs estimated by MODIS and AIRS data. Five spectral vegetation indices (SVIs) were calculated (the Normalized Difference Vegetation index, the Simple Ratio, the Enhanced Vegetation Index-2, the Green Normalized Difference Vegetation Index, and the green Chlorophyll Index), and an image-based inverse canopy radiative transfer modelling system, REGFLEC (REGularized canopy reFLECtance) was applied to each of the five European landscapes. While the
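
    The five spectral vegetation indices named above can be sketched from green, red and near-infrared surface reflectances. The formulas below are the commonly published definitions of these indices, given as an assumption because the record does not state the exact forms used; the function name and the sample reflectances are hypothetical.

```python
def vegetation_indices(green, red, nir):
    """Compute five vegetation indices from band reflectances (0 to 1)."""
    return {
        # Normalized Difference Vegetation Index
        "NDVI": (nir - red) / (nir + red),
        # Simple Ratio
        "SR": nir / red,
        # Enhanced Vegetation Index-2 (two-band form, no blue band)
        "EVI2": 2.5 * (nir - red) / (nir + 2.4 * red + 1.0),
        # Green Normalized Difference Vegetation Index
        "GNDVI": (nir - green) / (nir + green),
        # Green Chlorophyll Index
        "CIgreen": nir / green - 1.0,
    }


if __name__ == "__main__":
    for name, value in vegetation_indices(0.10, 0.05, 0.45).items():
        print(name, round(value, 3))
```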

  15. Predictive Models Based on Support Vector Machines: Whole-Brain versus Regional Analysis of Structural MRI in the Alzheimer's Disease.

    PubMed

    Retico, Alessandra; Bosco, Paolo; Cerello, Piergiorgio; Fiorina, Elisa; Chincarini, Andrea; Fantacci, Maria Evelina

    2015-01-01

    Decision-making systems trained on structural magnetic resonance imaging data of subjects affected by the Alzheimer's disease (AD) and healthy controls (CTRL) are becoming widespread prognostic tools for subjects with mild cognitive impairment (MCI). This study compares the performances of three classification methods based on support vector machines (SVMs), using as initial sets of brain voxels (i.e., features): (1) the segmented grey matter (GM); (2) regions of interest (ROIs) by voxel-wise t-test filtering; (3) parceled ROIs, according to prior knowledge. The recursive feature elimination (RFE) is applied in all cases to investigate whether feature reduction improves the classification accuracy. We analyzed more than 600 AD Neuroimaging Initiative (ADNI) subjects, training the SVMs on the AD/CTRL dataset, and evaluating them on a trial MCI dataset. The classification performance, evaluated as the area under the receiver operating characteristic curve (AUC), reaches AUC = (88.9 ± 0.5)% in 20-fold cross-validation on the AD/CTRL dataset, when the GM is classified as a whole. The highest discrimination accuracy between MCI converters and nonconverters is achieved when the SVM-RFE is applied to the whole GM: with AUC reaching (70.7 ± 0.9)%, it outperforms both ROI-based approaches in predicting the AD conversion.
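
    The AUC used as the performance measure above equals the probability that a randomly chosen positive case receives a higher classifier score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it directly from that definition; the function name and the scores are hypothetical, and the SVM training itself is not shown.

```python
def roc_auc(scores_positive, scores_negative):
    """Area under the ROC curve from pairwise score comparisons."""
    if not scores_positive or not scores_negative:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for p in scores_positive:
        for n in scores_negative:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(scores_positive) * len(scores_negative))


if __name__ == "__main__":
    # Invented decision scores for converters and nonconverters.
    print(roc_auc([0.9, 0.8, 0.4], [0.7, 0.3]))
```

    The pairwise form is quadratic in the number of subjects, which is acceptable for a few hundred cases; a rank-based formulation would be the usual choice for larger sets.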

  16. Servers for protein structure prediction.

    PubMed

    Fischer, Daniel

    2006-04-01

    The 1990s cultivated a generation of protein structure human predictors. As a result of structural genomics and genome sequencing projects, and significant improvements in the performance of protein structure prediction methods, a generation of automated servers has evolved in the past few years. Servers for close and distant homology modeling are now routinely used by many biologists, and have already been applied to the experimental structure determination process itself, and to the interpretation and annotation of genome sequences. Because dozens of servers are currently available, it is hard for a biologist to know which server(s) to use; however, the state of the art of these methods is now assessed through the LiveBench and CAFASP experiments. Meta-servers--servers that use the results of other autonomous servers to produce a consensus prediction--have proven to be the best performers, and are already challenging all but a handful of expert human predictors. The difference in performance of the top ten autonomous (non-meta) servers is small and hard to assess using relatively small test sets. Recent experiments suggest that servers will soon free humans from most of the burden of protein structure prediction.

  17. Water in protein structure prediction

    PubMed Central

    Papoian, Garegin A.; Ulander, Johan; Eastwood, Michael P.; Luthey-Schulten, Zaida; Wolynes, Peter G.

    2004-01-01

    Proteins have evolved to use water to help guide folding. A physically motivated, nonpairwise-additive model of water-mediated interactions added to a protein structure prediction Hamiltonian yields marked improvement in the quality of structure prediction for larger proteins. Free energy profile analysis suggests that long-range water-mediated potentials guide folding and smooth the underlying folding funnel. Analyzing simulation trajectories gives direct evidence that water-mediated interactions facilitate native-like packing of supersecondary structural elements. Long-range pairing of hydrophilic groups is an integral part of protein architecture. Specific water-mediated interactions are a universal feature of biomolecular recognition landscapes in both folding and binding. PMID:14988499

  18. DR_bind: a web server for predicting DNA-binding residues from the protein structure based on electrostatics, evolution and geometry.

    PubMed

    Chen, Yao Chi; Wright, Jon D; Lim, Carmay

    2012-07-01

    DR_bind is a web server that automatically predicts DNA-binding residues, given the respective protein structure based on (i) electrostatics, (ii) evolution and (iii) geometry. In contrast to machine-learning methods, DR_bind does not require a training data set or any parameters. It predicts DNA-binding residues by detecting a cluster of conserved, solvent-accessible residues that are electrostatically stabilized upon mutation to Asp(-)/Glu(-). The server requires as input the DNA-binding protein structure in PDB format and outputs a downloadable text file of the predicted DNA-binding residues, a 3D visualization of the predicted residues highlighted in the given protein structure, and a downloadable PyMol script for visualization of the results. Calibration on 83 and 55 non-redundant DNA-bound and DNA-free protein structures yielded a DNA-binding residue prediction accuracy/precision of 90/47% and 88/42%, respectively. Since DR_bind does not require any training using protein-DNA complex structures, it may predict DNA-binding residues in novel structures of DNA-binding proteins resulting from structural genomics projects with no conservation data. The DR_bind server is freely available with no login requirement at http://dnasite.limlab.ibms.sinica.edu.tw.
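
    The accuracy/precision pairs quoted above (for example 90/47%) can look inconsistent until one recalls that binding residues are a small minority of all residues. The sketch below shows both measures computed from confusion counts; the function name and the counts are hypothetical and are not taken from the DR_bind calibration.

```python
def accuracy_and_precision(tp, fp, tn, fn):
    """Return (accuracy, precision) from confusion-matrix counts."""
    total = tp + fp + tn + fn
    if total == 0 or tp + fp == 0:
        raise ValueError("need at least one residue and one positive prediction")
    accuracy = (tp + tn) / total   # fraction of all residues classified correctly
    precision = tp / (tp + fp)     # fraction of predicted binders that are true binders
    return accuracy, precision


if __name__ == "__main__":
    # Few true binders among many residues: accuracy stays high while
    # precision is moderate.
    print(accuracy_and_precision(tp=8, fp=8, tn=180, fn=4))
```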

  19. Improved Predictions of Secondary Structures for RNA

    NASA Astrophysics Data System (ADS)

    Jaeger, John A.; Turner, Douglas H.; Zuker, Michael

    1989-10-01

    The accuracy of computer predictions of RNA secondary structure from sequence data and free energy parameters has been increased to roughly 70%. Performance is judged by comparison with structures known from phylogenetic analysis. The algorithm also generates suboptimal structures. On average, the best structure within 10% of the lowest free energy contains roughly 90% of phylogenetically known helixes. The algorithm does not include tertiary interactions or pseudoknots and employs a crude model for single-stranded regions. The only favorable interactions are base pairing and stacking of terminal unpaired nucleotides at the ends of helixes. The excellent performance is consistent with these interactions being the primary interactions determining RNA secondary structure.
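
    The dynamic-programming idea behind secondary structure prediction can be sketched with a much simpler stand-in. The code below is not the free-energy algorithm of this record: it swaps in Nussinov-style base-pair maximization, which scores one point per allowed pair and ignores stacking energies. Like the algorithm described above, it excludes pseudoknots. The function name and the minimum hairpin loop length of 3 are assumptions made for illustration.

```python
def nussinov_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs in seq (Nussinov-style recursion).

    Allows Watson-Crick and G-U pairs and requires at least min_loop
    unpaired bases inside every hairpin.
    """
    allowed = {("A", "U"), ("U", "A"), ("G", "C"),
               ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] holds the maximum pair count for the subsequence i..j.
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i + 1][j]  # case: base i stays unpaired
            for k in range(i + min_loop + 1, j + 1):
                if (seq[i], seq[k]) in allowed:
                    inside = best[i + 1][k - 1]
                    outside = best[k + 1][j] if k < j else 0
                    score = max(score, 1 + inside + outside)
            best[i][j] = score
    return best[0][n - 1]


if __name__ == "__main__":
    print(nussinov_pairs("GGGAAAUCCC"))
```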

  20. Genome3D: a UK collaborative project to annotate genomic sequences with predicted 3D structures based on SCOP and CATH domains

    PubMed Central

    Lewis, Tony E.; Sillitoe, Ian; Andreeva, Antonina; Blundell, Tom L.; Buchan, Daniel W.A.; Chothia, Cyrus; Cuff, Alison; Dana, Jose M.; Filippis, Ioannis; Gough, Julian; Hunter, Sarah; Jones, David T.; Kelley, Lawrence A.; Kleywegt, Gerard J.; Minneci, Federico; Mitchell, Alex; Murzin, Alexey G.; Ochoa-Montaño, Bernardo; Rackham, Owen J. L.; Smith, James; Sternberg, Michael J. E.; Velankar, Sameer; Yeats, Corin; Orengo, Christine

    2013-01-01

    Genome3D, available at http://www.genome3d.eu, is a new collaborative project that integrates UK-based structural resources to provide a unique perspective on sequence–structure–function relationships. Leading structure prediction resources (DomSerf, FUGUE, Gene3D, pDomTHREADER, Phyre and SUPERFAMILY) provide annotations for UniProt sequences to indicate the locations of structural domains (structural annotations) and their 3D structures (structural models). Structural annotations and 3D model predictions are currently available for three model genomes (Homo sapiens, E. coli and baker’s yeast), and the project will extend to other genomes in the near future. As these resources exploit different strategies for predicting structures, the main aim of Genome3D is to enable comparisons between all the resources so that biologists can see where predictions agree and are therefore more trusted. Furthermore, as these methods differ in whether they build their predictions using CATH or SCOP, Genome3D also contains the first official mapping between these two databases. This has identified pairs of similar superfamilies from the two resources at various degrees of consensus (532 bronze pairs, 527 silver pairs and 370 gold pairs). PMID:23203986

  1. Prediction of Human Blood: Air Partition Coefficient: A Comparison of Structure-Based and Property-Based Methods

    DTIC Science & Technology

    2007-11-02

    models developed using experimental properties, including saline:air partition coefficient (logP saline:air) and olive oil:air partition coefficient...logP olive oil:air), as independent variables, indicating that the structure-property correlations are comparable to the property-property

  2. TRITIUM RESERVOIR STRUCTURAL PERFORMANCE PREDICTION

    SciTech Connect

    Lam, P.S.; Morgan, M.J.

    2005-11-10

    The burst test is used to assess the material performance of tritium reservoirs in the surveillance program in which reservoirs have been in service for extended periods of time. A materials system model and finite element procedure were developed under a Savannah River Site Plant-Directed Research and Development (PDRD) program to predict the structural response under a full range of loading and aged material conditions of the reservoir. The results show that the predicted burst pressure and volume ductility are in good agreement with the actual burst test results for the unexposed units. The material tensile properties used in the calculations were obtained from a curved tensile specimen harvested from a companion reservoir by Electric Discharge Machining (EDM). In the absence of exposed and aged material tensile data, literature data were used for demonstrating the methodology in terms of the helium-3 concentration in the metal and the depth of penetration in the reservoir sidewall. It can be shown that the volume ductility decreases significantly with the presence of tritium and its decay product, helium-3, in the metal, as was observed in the laboratory-controlled burst tests. The model and analytical procedure provide a predictive tool for reservoir structural integrity under aging conditions. It is recommended that benchmark tests and analysis for aged materials be performed. The methodology can be augmented to predict performance for reservoirs with flaws.

  3. Characteristics and Prediction of RNA Structure

    PubMed Central

    Zhu, Daming; Zhang, Caiming; Han, Huijian; Crandall, Keith A.

    2014-01-01

    RNA secondary structures with pseudoknots are often predicted by minimizing free energy, which is NP-hard. Most RNAs fold during transcription from DNA into RNA through a hierarchical pathway wherein secondary structures form prior to tertiary structures. Real RNA secondary structures often have local instead of global optimization because of kinetic reasons. The performance of RNA structure prediction may be improved by considering dynamic and hierarchical folding mechanisms. This study is a novel report on RNA folding that accords with the golden mean characteristic based on the statistical analysis of the real RNA secondary structures of all 480 sequences from RNA STRAND, which are validated by NMR or X-ray. The length ratios of domains in these sequences are approximately 0.382L, 0.5L, 0.618L, and L, where L is the sequence length. These points are just the important golden sections of sequence. With this characteristic, an algorithm is designed to predict RNA hierarchical structures and simulate RNA folding by dynamically folding RNA structures according to the above golden section points. The sensitivity and number of predicted pseudoknots of our algorithm are better than those of the Mfold, HotKnots, McQfold, ProbKnot, and Lhw-Zhu algorithms. Experimental results reflect the folding rules of RNA from a new angle that is close to natural folding. PMID:25110687
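
    The golden section points mentioned above (approximately 0.382L, 0.5L and 0.618L) can be computed for a given sequence length as sketched below. The function name and the rounding to the nearest nucleotide position are assumptions made for illustration; the record does not say how fractional positions are handled.

```python
import math


def golden_section_points(length):
    """Return the 0.382L, 0.5L and 0.618L positions, rounded to integers."""
    phi = (math.sqrt(5.0) - 1.0) / 2.0  # golden ratio conjugate, about 0.618
    return [round(fraction * length) for fraction in (1.0 - phi, 0.5, phi)]


if __name__ == "__main__":
    print(golden_section_points(100))
```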

  4. Dynameomics: data-driven methods and models for utilizing large-scale protein structure repositories for improving fragment-based loop prediction.

    PubMed

    Rysavy, Steven J; Beck, David A C; Daggett, Valerie

    2014-11-01

    Protein function is intimately linked to protein structure and dynamics yet experimentally determined structures frequently omit regions within a protein due to indeterminate data, which is often due to protein dynamics. We propose that atomistic molecular dynamics simulations provide a diverse sampling of biologically relevant structures for these missing segments (and beyond) to improve structural modeling and structure prediction. Here we make use of the Dynameomics data warehouse, which contains simulations of representatives of essentially all known protein folds. We developed novel computational methods to efficiently identify, rank and retrieve small peptide structures, or fragments, from this database. We also created a novel data model to analyze and compare large repositories of structural data, such as contained within the Protein Data Bank and the Dynameomics data warehouse. Our evaluation compares these structural repositories for improving loop predictions and analyzes the utility of our methods and models. Using a standard set of loop structures, containing 510 loops, 30 for each loop length from 4 to 20 residues, we find that the inclusion of Dynameomics structures in fragment-based methods improves the quality of the loop predictions without being dependent on sequence homology. Depending on loop length, ∼25–75% of the best predictions came from the Dynameomics set, resulting in lower main chain root-mean-square deviations for all fragment lengths using the combined fragment library. We also provide specific cases where Dynameomics fragments provide better predictions for NMR loop structures than fragments from crystal structures. Online access to these fragment libraries is available at http://www.dynameomics.org/fragments. © 2014 The Protein Society.

  5. Dynameomics: Data-driven methods and models for utilizing large-scale protein structure repositories for improving fragment-based loop prediction

    PubMed Central

    Rysavy, Steven J; Beck, David AC; Daggett, Valerie

    2014-01-01

    Protein function is intimately linked to protein structure and dynamics yet experimentally determined structures frequently omit regions within a protein due to indeterminate data, which is often due to protein dynamics. We propose that atomistic molecular dynamics simulations provide a diverse sampling of biologically relevant structures for these missing segments (and beyond) to improve structural modeling and structure prediction. Here we make use of the Dynameomics data warehouse, which contains simulations of representatives of essentially all known protein folds. We developed novel computational methods to efficiently identify, rank and retrieve small peptide structures, or fragments, from this database. We also created a novel data model to analyze and compare large repositories of structural data, such as contained within the Protein Data Bank and the Dynameomics data warehouse. Our evaluation compares these structural repositories for improving loop predictions and analyzes the utility of our methods and models. Using a standard set of loop structures, containing 510 loops, 30 for each loop length from 4 to 20 residues, we find that the inclusion of Dynameomics structures in fragment-based methods improves the quality of the loop predictions without being dependent on sequence homology. Depending on loop length, ∼25–75% of the best predictions came from the Dynameomics set, resulting in lower main chain root-mean-square deviations for all fragment lengths using the combined fragment library. We also provide specific cases where Dynameomics fragments provide better predictions for NMR loop structures than fragments from crystal structures. Online access to these fragment libraries is available at http://www.dynameomics.org/fragments. PMID:25142412

  6. RNA Structure: Advances and Assessment of 3D Structure Prediction.

    PubMed

    Miao, Zhichao; Westhof, Eric

    2017-03-30

    Biological functions of RNA molecules are dependent upon sustained specific three-dimensional (3D) structures of RNA, with or without the help of proteins. Understanding of RNA structure is frequently based on 2D structures, which describe only the Watson-Crick (WC) base pairs. Here, we hierarchically review the structural elements of RNA and how they contribute to RNA 3D structure. We focus our analysis on the non-WC base pairs and on RNA modules. Several computer programs have now been designed to predict RNA modules. We describe the RNA-Puzzles initiative, which is a community-wide, blind assessment of RNA 3D structure prediction programs to determine the capabilities and bottlenecks of current predictions. The assessment metrics used in RNA-Puzzles are briefly described. The detection of RNA 3D modules from sequence data and their automatic implementation belong to the current challenges in RNA 3D structure prediction. Expected final online publication date for the Annual Review of Biophysics Volume 46 is May 20, 2017. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.

  7. Particle-swarm structure prediction on clusters

    NASA Astrophysics Data System (ADS)

    Lv, Jian; Wang, Yanchao; Zhu, Li; Ma, Yanming

    2012-08-01

    We have developed an efficient method for cluster structure prediction based on the generalization of particle swarm optimization (PSO). A local version of the PSO algorithm was implemented to enable a fine exploration of the potential energy surface for a given non-periodic system. We have specifically devised a technique, the so-called bond characterization matrix (BCM), to allow a proper measure of structural similarity. The BCM technique was then employed to eliminate similar structures and define the desirable local search spaces. We find that the introduction of point group symmetries into the generation of cluster structures enables structural diversity and apparently avoids the generation of liquid-like (or disordered) clusters for large systems, thus considerably improving the structural search efficiency. We have incorporated the Metropolis criterion into our method to further enhance the structural evolution towards low-energy regimes of potential energy surfaces. Our method has been extensively benchmarked on Lennard-Jones clusters with different sizes up to 150 atoms and applied to the prediction of new structures of medium-sized Li_n (n = 20, 40, 58) clusters. High search efficiency was achieved, demonstrating the reliability of the current methodology and its promise as a major method for cluster structure prediction.
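
    Two ingredients named in this abstract, the Lennard-Jones benchmark energy and the Metropolis criterion, can be sketched generically in Python. This is a textbook illustration, not the authors' implementation: the reduced units (epsilon = sigma = 1), the function names, and the temperature parameter are all assumptions, and the abstract does not describe how the criterion is coupled to the PSO update.

```python
import math
import random


def lj_energy(coords):
    """Total Lennard-Jones energy of a cluster in reduced units (epsilon = sigma = 1)."""
    energy = 0.0
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            # Squared distance between atoms i and j.
            r2 = sum((a - b) ** 2 for a, b in zip(coords[i], coords[j]))
            inv_r6 = 1.0 / r2 ** 3
            energy += 4.0 * (inv_r6 * inv_r6 - inv_r6)
    return energy


def metropolis_accept(e_old, e_new, temperature, rng=random):
    """Accept downhill moves always; accept uphill moves with Boltzmann probability."""
    if e_new <= e_old:
        return True
    return rng.random() < math.exp(-(e_new - e_old) / temperature)


if __name__ == "__main__":
    # A dimer at the equilibrium separation 2**(1/6) sits at the pair-potential minimum.
    dimer = [(0.0, 0.0, 0.0), (2 ** (1 / 6), 0.0, 0.0)]
    print(lj_energy(dimer))
```

    The dimer check is a convenient sanity test: at separation 2^(1/6) the pair energy equals -1 in reduced units.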

  8. RFDT: A Rotation Forest-based Predictor for Predicting Drug-Target Interactions using Drug Structure and Protein Sequence Information.

    PubMed

    Wang, Lei; You, Zhu-Hong; Chen, Xing; Yan, Xin; Liu, Gang; Zhang, Wei

    2016-11-14

    Identification of interactions between drugs and target proteins plays an important role in discovering new drug candidates. However, identifying drug-target interactions by experimental methods remains extremely time-consuming, expensive and challenging even today. Therefore, it is urgent to develop new computational methods to predict potential drug-target interactions (DTI). In this article, a novel computational model is developed for predicting potential drug-target interactions under the theory that each drug-target interaction pair can be represented by the structural properties of drugs and evolutionary information derived from proteins. Specifically, the protein sequences are encoded as a Position-Specific Scoring Matrix (PSSM) descriptor, which contains biological evolutionary information, and the drug molecules are encoded as a fingerprint feature vector, which represents the existence of certain functional groups or fragments. Four benchmark datasets involving enzymes, ion channels, GPCRs and nuclear receptors are independently used for establishing predictive models with the Rotation Forest (RF) model. The proposed method achieved prediction accuracies of 91.3%, 89.1%, 84.1% and 71.1% for the four datasets, respectively. In order to make our method more persuasive, we compared our classifier with the state-of-the-art Support Vector Machine (SVM) classifier. We also compared the proposed method with other excellent methods. Experimental results demonstrate that the proposed method is effective in the prediction of DTI, and can provide assistance for new drug research and development.

  9. SHAPE-Directed RNA Secondary Structure Prediction

    PubMed Central

    Low, Justin T.; Weeks, Kevin M.

    2010-01-01

    The diverse functional roles of RNA are determined by its underlying structure. Accurate and comprehensive knowledge of RNA structure would inform a broader understanding of RNA biology and facilitate exploiting RNA as a biotechnological tool and therapeutic target. Determining the pattern of base pairing, or secondary structure, of RNA is a first step in these endeavors. Advances in experimental, computational, and comparative analysis approaches for analyzing secondary structure have yielded accurate structures for many small RNAs, but only a few large (>500 nts) RNAs. In addition, most current methods for determining a secondary structure require considerable effort, analytical expertise, and technical ingenuity. In this review, we outline an efficient strategy for developing accurate secondary structure models for RNAs of arbitrary length. This approach melds structural information obtained using SHAPE chemistry with structure prediction using nearest-neighbor rules and the dynamic programming algorithm implemented in the RNAstructure program. Prediction accuracies reach ≥95% for RNAs on the kilobase scale. This approach facilitates both development of new models and refinement of existing RNA structure models, which we illustrate using the Gag-Pol frameshift element in an HIV-1 M-group genome. Most promisingly, integrated experimental and computational refinement brings closer the ultimate goal of efficiently and accurately establishing the secondary structure for any RNA sequence. PMID:20554050

  10. Prediction of P53 Mutants (Multiple Sites) Transcriptional Activity Based on Structural (2D&3D) Properties

    PubMed Central

    Geetha Ramani, R.; Jacob, Shomona Gracia

    2013-01-01

    Prediction of secondary site mutations that reinstate mutated p53 to normalcy has been the focus of intense research in the recent past owing to the fact that p53 mutants have been implicated in more than half of all human cancers and restoration of p53 causes tumor regression. However, laboratory investigations are often laborious and resource intensive, whereas computational techniques could well surmount these drawbacks. In view of this, we formulated a novel approach utilizing computational techniques to predict the transcriptional activity of multiple site (one-site to five-site) p53 mutants. The optimal MCC values obtained by the proposed approach on prediction of one-site, two-site, three-site, four-site and five-site mutants were 0.775, 0.341, 0.784, 0.916 and 0.655, respectively, the highest reported thus far in the literature. We have also demonstrated that 2D and 3D features generate higher prediction accuracy of p53 activity, and our findings revealed the optimal results for prediction of p53 status reported to date. We believe detection of the secondary site mutations that suppress tumor growth may facilitate better understanding of the relationship between p53 structure and function and further knowledge on the molecular mechanisms and biological activity of p53, a targeted source for cancer therapy. We expect that our prediction methods and reported results may provide useful insights on p53 functional mechanisms and generate more avenues for utilizing computational techniques in biological data analysis. PMID:23468845
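
    The abstract reports its results as MCC values without expanding the acronym. Assuming MCC here denotes the Matthews correlation coefficient, the standard score for binary classifiers, a minimal Python sketch of how it is computed from a confusion matrix follows. The function name and the zero-denominator convention are illustrative choices, not taken from the paper.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts.

    Returns a value in [-1, 1]; 1 is perfect prediction, 0 is chance level.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention when a row or column of the matrix is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    print(matthews_corrcoef(tp=5, tn=5, fp=0, fn=0))
```

    Unlike plain accuracy, this score stays informative when active and inactive mutants are unevenly represented.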

  11. Structure-based activity prediction of CYP21A2 stability variants: A survey of available gene variations.

    PubMed

    Bruque, Carlos D; Delea, Marisol; Fernández, Cecilia S; Orza, Juan V; Taboas, Melisa; Buzzalino, Noemí; Espeche, Lucía D; Solari, Andrea; Luccerini, Verónica; Alba, Liliana; Nadra, Alejandro D; Dain, Liliana

    2016-12-14

    Congenital adrenal hyperplasia due to 21-hydroxylase deficiency accounts for 90-95% of CAH cases. In this work we performed an extensive survey of mutations and SNPs modifying the coding sequence of the CYP21A2 gene. Using bioinformatic tools and two plausible CYP21A2 structures as templates, we initially classified all known mutants (n = 343) according to their putative functional impacts, which were either reported in the literature or inferred from structural models. We then performed a detailed analysis on the subset of mutations believed to exclusively impact protein stability. For those mutants, the predicted stability was calculated and correlated with the variant's expected activity. A high concordance was obtained when comparing our predictions with available in vitro residual activities and/or the patient's phenotype. The predicted stability and derived activity of all reported mutations and SNPs lacking functional assays (n = 108) were assessed. As expected, most of the SNPs (52/76) showed no biological implications. Moreover, this approach was applied to evaluate the putative synergy that could emerge when two mutations occurred in cis. In addition, we propose a putative pathogenic effect of five novel mutations, p.L107Q, p.L122R, p.R132H, p.P335L and p.H466fs, found in 21-hydroxylase deficient patients of our cohort.

  12. Structure-based activity prediction of CYP21A2 stability variants: A survey of available gene variations

    PubMed Central

    Bruque, Carlos D.; Delea, Marisol; Fernández, Cecilia S.; Orza, Juan V.; Taboas, Melisa; Buzzalino, Noemí; Espeche, Lucía D.; Solari, Andrea; Luccerini, Verónica; Alba, Liliana; Nadra, Alejandro D.; Dain, Liliana

    2016-01-01

    Congenital adrenal hyperplasia due to 21-hydroxylase deficiency accounts for 90–95% of CAH cases. In this work we performed an extensive survey of mutations and SNPs modifying the coding sequence of the CYP21A2 gene. Using bioinformatic tools and two plausible CYP21A2 structures as templates, we initially classified all known mutants (n = 343) according to their putative functional impacts, which were either reported in the literature or inferred from structural models. We then performed a detailed analysis on the subset of mutations believed to exclusively impact protein stability. For those mutants, the predicted stability was calculated and correlated with the variant’s expected activity. A high concordance was obtained when comparing our predictions with available in vitro residual activities and/or the patient’s phenotype. The predicted stability and derived activity of all reported mutations and SNPs lacking functional assays (n = 108) were assessed. As expected, most of the SNPs (52/76) showed no biological implications. Moreover, this approach was applied to evaluate the putative synergy that could emerge when two mutations occurred in cis. In addition, we propose a putative pathogenic effect of five novel mutations, p.L107Q, p.L122R, p.R132H, p.P335L and p.H466fs, found in 21-hydroxylase deficient patients of our cohort. PMID:27966633

  13. A universal computational model for predicting antigenic variants of influenza A virus based on conserved antigenic structures

    PubMed Central

    Peng, Yousong; Wang, Dayan; Wang, Jianhong; Li, Kenli; Tan, Zhongyang; Shu, Yuelong; Jiang, Taijiao

    2017-01-01

    Rapid determination of the antigenicity of influenza A virus could help identify the antigenic variants in time. Currently, there is a lack of computational models for predicting antigenic variants of some common hemagglutinin (HA) subtypes of influenza A viruses. By means of sequence analysis, we demonstrate here that multiple HA subtypes of influenza A virus undergo similar mutation patterns of HA1 protein (the immunogenic part of HA). Further analysis on the antigenic variation of influenza A virus H1N1, H3N2 and H5N1 showed that the amino acid residues’ contribution to antigenic variation differed greatly among these subtypes, while the regional bands, defined based on their distance to the top of HA1, played conserved roles in antigenic variation of these subtypes. Moreover, the computational models for predicting antigenic variants based on regional bands performed much better in the testing HA subtype than those based on amino acid residues. Therefore, a universal computational model, named PREDAV-FluA, was built based on the regional bands to predict the antigenic variants for all HA subtypes of influenza A viruses. The model achieved an accuracy of 0.77 when tested with avian influenza H9N2 viruses. It may help in the rapid identification of antigenic variants in influenza surveillance. PMID:28165025

  14. Structure-Based Prediction of Drug Distribution Across the Headgroup and Core Strata of a Phospholipid Bilayer Using Surrogate Phases

    PubMed Central

    2015-01-01

    locations for 27 compounds. The resulting structure-based prediction system for intrabilayer distribution will facilitate more realistic modeling of passive transport and drug interactions with those integral membrane proteins, which have the binding sites located in the bilayer, such as some enzymes, influx and efflux transporters, and receptors. If only overall bilayer accumulation is of interest, the 1-octanol/W P values suffice to model the studied set. PMID:25179490

  15. Structure-based prediction of drug distribution across the headgroup and core strata of a phospholipid bilayer using surrogate phases.

    PubMed

    Natesan, Senthil; Lukacova, Viera; Peng, Ming; Subramaniam, Rajesh; Lynch, Sandra; Wang, Zhanbin; Tandlich, Roman; Balaz, Stefan

    2014-10-06

    locations for 27 compounds. The resulting structure-based prediction system for intrabilayer distribution will facilitate more realistic modeling of passive transport and drug interactions with those integral membrane proteins, which have the binding sites located in the bilayer, such as some enzymes, influx and efflux transporters, and receptors. If only overall bilayer accumulation is of interest, the 1-octanol/W P values suffice to model the studied set.

  16. Protein structural domains: definition and prediction.

    PubMed

    Ezkurdia, Iakes; Tress, Michael L

    2011-11-01

    Recognition and prediction of structural domains in proteins is an important part of structure and function prediction. This unit lists the range of tools available for domain prediction, and describes sequence and structural analysis tools that complement domain prediction methods. Also detailed are the basic domain prediction steps, along with suggested strategies for different protein sequences and potential pitfalls in domain boundary prediction. The difficult problem of domain orientation prediction is also discussed. All the resources necessary for domain boundary prediction are accessible via publicly available Web servers and databases and do not require computational expertise.

  17. Kernel-Based, Partial Least Squares Quantitative Structure-Retention Relationship Model for UPLC Retention Time Prediction: A Useful Tool for Metabolite Identification.

    PubMed

    Falchi, Federico; Bertozzi, Sine Mandrup; Ottonello, Giuliana; Ruda, Gian Filippo; Colombano, Giampiero; Fiorelli, Claudio; Martucci, Cataldo; Bertorelli, Rosalia; Scarpelli, Rita; Cavalli, Andrea; Bandiera, Tiziano; Armirotti, Andrea

    2016-10-04

    We propose a new QSRR model based on a kernel-based partial least-squares method for predicting UPLC retention times in reversed phase mode. The model was built using a combination of classical (physicochemical and topological) and nonclassical (fingerprints) molecular descriptors of 1383 compounds, encompassing different chemical classes and structures and their accurately measured retention time values. Following a random splitting of the data set into a training and a test set, we tested the ability of the model to predict the retention time of all the compounds. The best predicted/experimental R² value was higher than 0.86, while the best Q² value we observed was close to 0.84. A comparison of our model with traditional and simpler MLR and PLS regression models shows that KPLS performs better in terms of correlation (R²), prediction (Q²), and support to MetID peak assignment. The KPLS model succeeded in two real-life MetID tasks by correctly predicting elution order of Phase I metabolites, including isomeric monohydroxylated compounds. We also show in this paper that the model's predictive power can be extended to different gradient profiles, by simple mathematical extrapolation using a known equation, thus offering very broad flexibility. Moreover, the current study includes a deep investigation of different types of chemical descriptors used to build the structure-retention relationship.

  18. RNA secondary structure prediction using soft computing.

    PubMed

    Ray, Shubhra Sankar; Pal, Sankar K

    2013-01-01

    Prediction of RNA structure is invaluable in creating new drugs and understanding genetic diseases. Several deterministic algorithms and soft computing-based techniques have been developed for more than a decade to determine the structure from a known RNA sequence. Soft computing gained importance with the need to get approximate solutions for RNA sequences by considering the issues related with kinetic effects, cotranscriptional folding, and estimation of certain energy parameters. A brief description of some of the soft computing-based techniques, developed for RNA secondary structure prediction, is presented along with their relevance. The basic concepts of RNA and its different structural elements like helix, bulge, hairpin loop, internal loop, and multiloop are described. These are followed by different methodologies, employing genetic algorithms, artificial neural networks, and fuzzy logic. The role of various metaheuristics, like simulated annealing, particle swarm optimization, ant colony optimization, and tabu search is also discussed. A relative comparison among different techniques, in predicting 12 known RNA secondary structures, is presented, as an example. Future challenging issues are then mentioned.

  19. In silico prediction of drug-target interaction networks based on drug chemical structure and protein sequences.

    PubMed

    Li, Zhengwei; Han, Pengyong; You, Zhu-Hong; Li, Xiao; Zhang, Yusen; Yu, Haiquan; Nie, Ru; Chen, Xing

    2017-09-11

    Analysis of drug-target interactions (DTIs) is of great importance in developing new drug candidates for known protein targets or discovering new targets for old drugs. However, the experimental approaches for identifying DTIs are expensive, laborious and challenging. In this study, we report a novel computational method for predicting DTIs using the highly discriminative information of drug-target interactions and our newly developed discriminative vector machine (DVM) classifier. More specifically, each target protein sequence is transformed as the position-specific scoring matrix (PSSM), in which the evolutionary information is retained; then the local binary pattern (LBP) operator is used to calculate the LBP histogram descriptor. For a drug molecule, a novel fingerprint representation is utilized to describe its chemical structure information representing existence of certain functional groups or fragments. When applying the proposed method to the four datasets (Enzyme, GPCR, Ion Channel and Nuclear Receptor) for predicting DTIs, we obtained good average accuracies of 93.16%, 89.37%, 91.73% and 92.22%, respectively. Furthermore, we compared the performance of the proposed model with that of the state-of-the-art SVM model and other previous methods. The achieved results demonstrate that our method is effective and robust and can be taken as a useful tool for predicting DTIs.

  20. Ko Displacement Theory for Structural Shape Predictions

    NASA Technical Reports Server (NTRS)

    Ko, William L.

    2010-01-01

    The development of the Ko displacement theory for predictions of structure deformed shapes was motivated in 2003 by the Helios flying wing, which had a 247-ft (75-m) wing span with wingtip deflections reaching 40 ft (12 m). The Helios flying wing failed in midair in June 2003, creating the need to develop new technology to predict in-flight deformed shapes of unmanned aircraft wings for visual display before the ground-based pilots. Strain sensors of any type installed on a structure can only sense the surface strains, and are incapable of sensing the overall deformed shapes of structures. After the invention of the Ko displacement theory, predictions of structure deformed shapes could be achieved by feeding the measured surface strains into the Ko displacement transfer functions for the calculations of out-of-plane deflections and cross sectional rotations at multiple locations for mapping out overall deformed shapes of the structures. The new Ko displacement theory combined with a strain-sensing system thus created a revolutionary new structure-shape-sensing technology.

  1. Rich parameterization improves RNA structure prediction.

    PubMed

    Zakov, Shay; Goldberg, Yoav; Elhadad, Michael; Ziv-Ukelson, Michal

    2011-11-01

    Current approaches to RNA structure prediction range from physics-based methods, which rely on thousands of experimentally measured thermodynamic parameters, to machine-learning (ML) techniques. While the methods for parameter estimation are successfully shifting toward ML-based approaches, the model parameterizations so far remained fairly constant. We study the potential contribution of increasing the amount of information utilized by RNA folding prediction models to the improvement of their prediction quality. This is achieved by proposing novel models, which refine previous ones by examining more types of structural elements, and larger sequential contexts for these elements. Our proposed fine-grained models are made practical thanks to the availability of large training sets, advances in machine-learning, and recent accelerations to RNA folding algorithms. We show that the application of more detailed models indeed improves prediction quality, while the corresponding running time of the folding algorithm remains fast. An additional important outcome of this experiment is a new RNA folding prediction model (coupled with a freely available implementation), which results in a significantly higher prediction quality than that of previous models. This final model has about 70,000 free parameters, several orders of magnitude more than previous models. Being trained and tested over the same comprehensive data sets, our model achieves a score of 84% according to the F₁-measure over correctly-predicted base-pairs (i.e., 16% error rate), compared to the previously best reported score of 70% (i.e., 30% error rate). That is, the new model yields an error reduction of about 50%. Trained models and source code are available at www.cs.bgu.ac.il/~negevcb/contextfold.
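
    To make the scoring in this abstract concrete, the sketch below computes an F₁-measure over base pairs and the relative error reduction between two scores. Representing a structure as a set of (i, j) index pairs and the function names are assumptions for illustration; the abstract does not describe the authors' evaluation code.

```python
def base_pair_f1(predicted, reference):
    """F1-measure between two sets of base pairs, each pair given as (i, j)."""
    true_positives = len(predicted & reference)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(reference)
    return 2 * precision * recall / (precision + recall)


def relative_error_reduction(old_score, new_score):
    """Fraction by which the error (1 - score) shrinks from old to new."""
    old_error = 1 - old_score
    new_error = 1 - new_score
    return (old_error - new_error) / old_error


if __name__ == "__main__":
    predicted = {(1, 10), (2, 9)}
    reference = {(1, 10), (3, 8)}
    print(base_pair_f1(predicted, reference))
    print(relative_error_reduction(0.70, 0.84))
```

    With the scores quoted above (70% and 84%), the relative error reduction works out to (0.30 - 0.16) / 0.30, roughly 0.47, which matches the "about 50%" stated in the abstract.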

  2. Input-based structure-specific proficiency predicts the neural mechanism of adult L2 syntactic processing.

    PubMed

    Deng, Taiping; Zhou, Huixia; Bi, Hong-Yan; Chen, Baoguo

    2015-06-12

    This study used Event-Related Potentials (ERPs) to explore the role of input-based structure-specific proficiency in L2 syntactic processing, using English subject-verb agreement structures as the stimuli. A pre-test/training/post-test paradigm with experimental and control groups was employed, and Chinese speakers who learned English as a second language (L2) participated in the experiment. At pre-test, no ERP component related to violations of the subject-verb agreement structures was observed in either group. During the training sessions, the experimental group learned the subject-verb agreement structures, while the control group learned other syntactic structures. After two consecutive intensive input training sessions, at post-test, a significant P600 component related to violations of the subject-verb agreement structures was elicited in the experimental group, but not in the control group. These findings suggest that input training improves structure-specific proficiency, which is reflected in the neural mechanism of L2 syntactic processing.

  3. Gene structure prediction by linguistic methods

    SciTech Connect

    Dong, S.; Searls, D.B.

    1994-10-01

    The higher-order structure of genes and other features of biological sequences can be described by means of formal grammars. These grammars can then be used by general-purpose parsers to detect and to assemble such structures by means of syntactic pattern recognition. We describe a grammar and parser for eukaryotic protein-encoding genes, which by some measures is as effective as current connectionist and combinatorial algorithms in predicting gene structures for sequence database entries. Parameters of the grammar rules are optimized for several different species, and mixing experiments are performed to determine the degree of species specificity and the relative importance of compositional, signal-based, and syntactic components in gene prediction. 24 refs., 5 figs., 3 tabs.

  4. Protein complex compositions predicted by structural similarity

    PubMed Central

    Davis, Fred P.; Braberg, Hannes; Shen, Min-Yi; Pieper, Ursula; Sali, Andrej; Madhusudhan, M.S.

    2006-01-01

    Proteins function through interactions with other molecules. Thus, the network of physical interactions among proteins is of great interest to both experimental and computational biologists. Here we present structure-based predictions of 3387 binary and 1234 higher order protein complexes in Saccharomyces cerevisiae involving 924 and 195 proteins, respectively. To generate candidate complexes, comparative models of individual proteins were built and combined together using complexes of known structure as templates. These candidate complexes were then assessed using a statistical potential, derived from binary domain interfaces in PIBASE. The statistical potential discriminated a benchmark set of 100 interface structures from a set of sequence-randomized negative examples with a false positive rate of 3% and a true positive rate of 97%. Moreover, the predicted complexes were also filtered using functional annotation and sub-cellular localization data. The ability of the method to select the correct binding mode among alternates is demonstrated for three camelid VHH domain–porcine α-amylase interactions. We also highlight the prediction of co-complexed domain superfamilies that are not present in template complexes. Through integration with MODBASE, the application of the method to proteomes that are less well characterized than that of S. cerevisiae will contribute to expansion of the structural and functional coverage of protein interaction space. The predicted complexes are deposited in MODBASE. PMID:16738133

  5. Predictive models of biohydrogen and biomethane production based on the compositional and structural features of lignocellulosic materials.

    PubMed

    Monlau, Florian; Sambusiti, Cecilia; Barakat, Abdellatif; Guo, Xin Mei; Latrille, Eric; Trably, Eric; Steyer, Jean-Philippe; Carrere, Hélène

    2012-11-06

    In an integrated biorefinery concept, biological hydrogen and methane production from lignocellulosic substrates appears to be one of the most promising alternatives to produce energy from renewable sources. However, lignocellulosic substrates present compositional and structural features that can limit their conversion into biohydrogen and methane. In this study, biohydrogen and methane potentials of 20 lignocellulosic residues were evaluated. Compositional (lignin, cellulose, hemicelluloses, total uronic acids, proteins, and soluble sugars) as well as structural features (crystallinity) were determined for each substrate. Two predictive partial least squares (PLS) models were built to determine which compositional and structural parameters affected biohydrogen or methane production from lignocellulosic substrates, among proteins, total uronic acids, soluble sugars, crystalline cellulose, amorphous holocelluloses, and lignin. Only soluble sugars had a significant positive effect on biohydrogen production. In contrast, methane potentials correlated negatively with lignin content and, to a lesser extent, with crystalline cellulose, whereas soluble sugars, proteins, and amorphous hemicelluloses showed a positive impact. These findings will help to develop further pretreatment strategies for enhancing both biohydrogen and methane production.

  6. In silico exploratory study using structure-activity relationship models and metabolic information for prediction of mutagenicity based on the Ames test and rodent micronucleus assay.

    PubMed

    Kamath, P; Raitano, G; Fernández, A; Rallo, R; Benfenati, E

    2015-12-01

    The mutagenic potential of chemicals is a cause of growing concern, due to the possible impact on human health. In this paper we have developed a knowledge-based approach, combining information from structure-activity relationship (SAR) and metabolic triggers generated from the metabolic fate of chemicals in biological systems for prediction of mutagenicity in vitro based on the Ames test and in vivo based on the rodent micronucleus assay. In the first part of the work, a model was developed, which comprises newly generated SAR rules and a set of metabolic triggers. These SAR rules and metabolic triggers were further externally validated to predict mutagenicity in vitro, with metabolic triggers being used only to predict the mutagenicity of chemicals that SARpy predicted as unknown. Hence, this model has a higher accuracy than the SAR model, with an accuracy of 89% for the training set and 75% for the external validation set. Subsequently, the second part of this work lists a set of metabolic triggers for prediction of mutagenicity in vivo, based on the rodent micronucleus assay. Finally, the third part provides a list of metabolic triggers to find similarities and differences in the mutagenic response of chemicals in vitro and in vivo.

  7. Identification of family-specific residue packing motifs and their use for structure-based protein function prediction: I. Method development.

    PubMed

    Bandyopadhyay, Deepak; Huan, Jun; Prins, Jan; Snoeyink, Jack; Wang, Wei; Tropsha, Alexander

    2009-11-01

    Protein function prediction is one of the central problems in computational biology. We present a novel automated protein structure-based function prediction method using libraries of local residue packing patterns that are common to most proteins in a known functional family. Critical to this approach is the representation of a protein structure as a graph where residue vertices (residue name used as a vertex label) are connected by geometrical proximity edges. The approach employs two steps. First, it uses a fast subgraph mining algorithm to find all occurrences of family-specific labeled subgraphs for all well characterized protein structural and functional families. Second, it queries a new structure for occurrences of a set of motifs characteristic of a known family, using a graph index to speed up Ullman's subgraph isomorphism algorithm. The confidence of function inference from structure depends on the number of family-specific motifs found in the query structure compared with their distribution in a large non-redundant database of proteins. This method can assign a new structure to a specific functional family in cases where sequence alignments, sequence patterns, structural superposition and active site templates fail to provide accurate annotation.
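
    The graph representation described in this abstract (residue-labeled vertices joined by geometrical proximity edges) can be illustrated with a minimal sketch. The distance cutoff of 8.0, the function name `residue_graph`, and the toy coordinates are assumptions made for illustration only; the abstract does not state how proximity edges were defined in the original method.

```python
import math
from itertools import combinations

def residue_graph(residues, cutoff=8.0):
    """Build a labeled proximity graph.

    residues: list of (residue_name, (x, y, z)) tuples.
    cutoff:   hypothetical distance threshold for drawing an edge.
    Returns (labels, edges), where edges holds index pairs (i, j) with i < j.
    """
    labels = [name for name, _ in residues]
    edges = set()
    for (i, (_, p)), (j, (_, q)) in combinations(enumerate(residues), 2):
        # Connect two residues when their representative points are close.
        if math.dist(p, q) <= cutoff:
            edges.add((i, j))
    return labels, edges

# Toy coordinates, not taken from any real structure.
toy = [("GLY", (0.0, 0.0, 0.0)), ("ALA", (3.8, 0.0, 0.0)), ("SER", (20.0, 0.0, 0.0))]
labels, edges = residue_graph(toy)
```

    Subgraph mining and motif queries would then operate on `labels` and `edges`; those steps are not shown here.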

  8. Prediction of RNA secondary structures with pseudoknots

    NASA Astrophysics Data System (ADS)

    Bon, M.; Orland, H.

    2010-08-01

    We present a new algorithm to predict RNA secondary structures with pseudoknots. The method is based on a classification of RNA structures according to their topological genus. The algorithm utilizes a simplified parametrization of the free energies for pair stacking, loop penalties, etc. and in addition a free energy penalty proportional to the topological genus of the pairing graph. Our method can take into account all pseudoknot topologies and achieves high success rates compared to state-of-the-art methods. This shows that the genus is a promising concept to classify pseudoknots.

  9. Predicting structured metadata from unstructured metadata

    PubMed Central

    Posch, Lisa; Panahiazar, Maryam; Dumontier, Michel; Gevaert, Olivier

    2016-01-01

    Enormous amounts of biomedical data have been and are being produced by investigators all over the world. However, one crucial and limiting factor in data reuse is accurate, structured and complete description of the data or data about the data, defined as metadata. We propose a framework to predict structured metadata terms from unstructured metadata for improving quality and quantity of metadata, using the Gene Expression Omnibus (GEO) microarray database. Our framework consists of classifiers trained using term frequency-inverse document frequency (TF-IDF) features and a second approach based on topics modeled using a Latent Dirichlet Allocation model (LDA) to reduce the dimensionality of the unstructured data. Our results on the GEO database show that structured metadata terms are most accurately predicted using the TF-IDF approach, followed by LDA, with both outperforming the majority vote baseline. While some accuracy is lost by the dimensionality reduction of LDA, the difference is small for elements with few possible values, and there is a large improvement over the majority classifier baseline. Overall this is a promising approach for metadata prediction that is likely to be applicable to other datasets and has implications for researchers interested in biomedical metadata curation and metadata prediction. Database URL: http://www.yeastgenome.org/ PMID:28637268
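
    As an illustration of the TF-IDF features mentioned in this abstract, the following minimal sketch computes term weights as tf × log(N/df). The whitespace tokenizer, the unsmoothed weighting variant, and the sample descriptions are assumptions for illustration; the abstract does not specify which variant or preprocessing the authors used.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return one {term: weight} dict per document, with weight = tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append(
            {term: (count / total) * math.log(n_docs / df[term])
             for term, count in counts.items()}
        )
    return vectors

# Hypothetical free-text sample descriptions, not taken from GEO.
docs = ["breast tumor tissue", "liver tumor tissue", "healthy liver biopsy"]
vecs = tfidf(docs)
```

    Terms that occur in fewer documents receive higher weights, and a term present in every document gets weight zero under this variant. The resulting vectors could then be passed to any classifier.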

  10. Predicting structured metadata from unstructured metadata.

    PubMed

    Posch, Lisa; Panahiazar, Maryam; Dumontier, Michel; Gevaert, Olivier

    2016-01-01

    Enormous amounts of biomedical data have been and are being produced by investigators all over the world. However, one crucial and limiting factor in data reuse is accurate, structured and complete description of the data or data about the data, defined as metadata. We propose a framework to predict structured metadata terms from unstructured metadata for improving quality and quantity of metadata, using the Gene Expression Omnibus (GEO) microarray database. Our framework consists of classifiers trained using term frequency-inverse document frequency (TF-IDF) features and a second approach based on topics modeled using a Latent Dirichlet Allocation model (LDA) to reduce the dimensionality of the unstructured data. Our results on the GEO database show that structured metadata terms are most accurately predicted using the TF-IDF approach, followed by LDA, with both outperforming the majority vote baseline. While some accuracy is lost by the dimensionality reduction of LDA, the difference is small for elements with few possible values, and there is a large improvement over the majority classifier baseline. Overall this is a promising approach for metadata prediction that is likely to be applicable to other datasets and has implications for researchers interested in biomedical metadata curation and metadata prediction. © The Author(s) 2016. Published by Oxford University Press.

  11. Reliable resonance assignments of selected residues of proteins with known structure based on empirical NMR chemical shift prediction

    NASA Astrophysics Data System (ADS)

    Li, Da-Wei; Meng, Dan; Brüschweiler, Rafael

    2015-05-01

    A robust NMR resonance assignment method is introduced for proteins whose 3D structure has previously been determined by X-ray crystallography. The goal of the method is to obtain a subset of correct assignments from a parsimonious set of 3D NMR experiments of 15N, 13C labeled proteins. Chemical shifts of sequential residue pairs are predicted from static protein structures using PPM_One, which are then compared with the corresponding experimental shifts. Globally optimized weighted matching identifies the assignments that are robust with respect to small changes in NMR cross-peak positions. The method, termed PASSPORT, is demonstrated for 4 proteins with 100-250 amino acids using 3D NHCA and 3D CBCA(CO)NH experiments as input, producing correct assignments with high reliability for 22% of the residues. The method, which works best for Gly, Ala, Ser, and Thr residues, provides assignments that serve as anchor points for additional assignments by both manual and semi-automated methods or they can be directly used for further studies, e.g. on ligand binding, protein dynamics, or post-translational modification, such as phosphorylation.

  12. Reliable Resonance Assignments of Selected Residues of Proteins with Known Structure Based on Empirical NMR Chemical Shift Prediction

    PubMed Central

    Li, Da-Wei; Meng, Dan; Brüschweiler, Rafael

    2015-01-01

    A robust NMR resonance assignment method is introduced for proteins whose 3D structure has previously been determined by X-ray crystallography. The goal of the method is to obtain a subset of correct assignments from a parsimonious set of 3D NMR experiments of 15N, 13C labeled proteins. Chemical shifts of sequential residue pairs are predicted from static protein structures using PPM_One, which are then compared with the corresponding experimental shifts. Globally optimized weighted matching identifies the assignments that are robust with respect to small changes in NMR cross-peak positions. The method, termed PASSPORT, is demonstrated for 4 proteins with 100-250 amino acids using 3D NHCA and 3D CBCA(CO)NH experiments as input, producing correct assignments with high reliability for 22% of the residues. The method, which works best for Gly, Ala, Ser, and Thr residues, provides assignments that serve as anchor points for additional assignments by both manual and semi-automated methods or they can be directly used for further studies, e.g. on ligand binding, protein dynamics, or post-translational modification, such as phosphorylation. PMID:25863893

  13. A New Structure-Activity Relationship (SAR) Model for Predicting Drug-Induced Liver Injury, Based on Statistical and Expert-Based Structural Alerts

    PubMed Central

    Pizzo, Fabiola; Lombardo, Anna; Manganaro, Alberto; Benfenati, Emilio

    2016-01-01

    The prompt identification of chemical molecules with potential effects on liver may help in drug discovery and in raising the levels of protection for human health. Besides in vitro approaches, computational methods in toxicology are drawing attention. We built a structure-activity relationship (SAR) model for evaluating hepatotoxicity. After compiling a data set of 950 compounds using data from the literature, we randomly split it into training (80%) and test sets (20%). We also compiled an external validation set (101 compounds) for evaluating the performance of the model. To extract structural alerts (SAs) related to hepatotoxicity and non-hepatotoxicity we used SARpy, a statistical application that automatically identifies and extracts chemical fragments related to a specific activity. We also applied the chemical grouping approach for manually identifying other SAs. We calculated accuracy, specificity, sensitivity and Matthews correlation coefficient (MCC) on the training, test and external validation sets. Considering the complexity of the endpoint, the model performed well. In the training, test and external validation sets the accuracy was respectively 81, 63, and 68%, specificity 89, 33, and 33%, sensitivity 93, 88, and 80% and MCC 0.63, 0.27, and 0.13. Since it is preferable to overestimate hepatotoxicity rather than not to recognize unsafe compounds, the model's architecture followed a conservative approach. As it was built using human data, it might be applied without any need for extrapolation from other species. This model will be freely available in the VEGA platform. PMID:27920722
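
    The four performance measures reported in this abstract (accuracy, specificity, sensitivity, and MCC) follow from the confusion-matrix counts. The sketch below uses the standard definitions; the function name and the example counts are hypothetical and are not the study's data.

```python
import math

def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, sensitivity, specificity and MCC from confusion counts.

    Assumes at least one positive and one negative example, so that
    sensitivity and specificity are defined.
    """
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is set to 0 when the denominator vanishes.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }

# Hypothetical counts chosen to mimic a conservative (over-predicting) model.
m = classification_metrics(tp=80, tn=10, fp=20, fn=10)
```

    With counts like these, sensitivity stays high while specificity and MCC are low, which is the pattern expected from a model designed to overestimate toxicity rather than miss unsafe compounds.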

  14. Multithreaded parsing for predicting RNA secondary structures.

    PubMed

    Al-Mulhem, Muhammed S

    2010-01-01

    Many computational approaches have been developed for modelling and analysing the RNA secondary structure. These approaches are based on diverse methods such as grammars, dynamic programming, matching and evolutionary algorithms. This paper proposes a new parsing algorithm for the prediction of RNA secondary structures. The proposed algorithm is based on the shift-reduce LR parsing algorithm for programming languages. It has two main contributions: it extends the LR parsing algorithm by using a Stochastic Context-Free Grammar (SCFG) instead of Context-Free Grammar (CFG) for parsing RNA secondary structures; it extends the LR parsing algorithm by using a multithreaded approach to handle the LR parsing conflicts resulting from the use of ambiguous grammars.

  15. Practical lessons from protein structure prediction

    PubMed Central

    Ginalski, Krzysztof; Grishin, Nick V.; Godzik, Adam; Rychlewski, Leszek

    2005-01-01

    Despite recent efforts to develop automated protein structure determination protocols, structural genomics projects are slow in generating fold assignments for complete proteomes, and spatial structures remain unknown for many protein families. Alternative cheap and fast methods to assign folds using prediction algorithms continue to provide valuable structural information for many proteins. The development of high-quality prediction methods has been boosted in the last years by objective community-wide assessment experiments. This paper gives an overview of the currently available practical approaches to protein structure prediction capable of generating accurate fold assignment. Recent advances in assessment of the prediction quality are also discussed. PMID:15805122

  16. Prediction of binding affinity and efficacy of thyroid hormone receptor ligands using QSAR and structure based modeling methods

    PubMed Central

    Politi, Regina; Rusyn, Ivan; Tropsha, Alexander

    2016-01-01

    The thyroid hormone receptor (THR) is an important member of the nuclear receptor family that can be activated by endocrine disrupting chemicals (EDC). Quantitative Structure-Activity Relationship (QSAR) models have been developed to facilitate the prioritization of THR-mediated EDC for the experimental validation. The largest database of binding affinities available at the time of the study for ligand binding domain (LBD) of THRβ was assembled to generate both continuous and classification QSAR models with an external accuracy of R² = 0.55 and CCR = 0.76, respectively. In addition, for the first time a QSAR model was developed to predict binding affinities of antagonists inhibiting the interaction of coactivators with the AF-2 domain of THRβ (R² = 0.70). Furthermore, molecular docking studies were performed for a set of THRβ ligands (57 agonists and 15 antagonists of LBD, 210 antagonists of the AF-2 domain, supplemented by putative decoys/non-binders) using several THRβ structures retrieved from the Protein Data Bank. We found that two agonist-bound THRβ conformations could effectively discriminate their corresponding ligands from presumed non-binders. Moreover, one of the agonist conformations could discriminate agonists from antagonists. Finally, we have conducted virtual screening of a chemical library compiled by the EPA as part of the Tox21 program to identify potential THRβ-mediated EDCs using both QSAR models and docking. We concluded that the library is unlikely to have any EDC that would bind to the THRβ. Models developed in this study can be employed either to identify environmental chemicals interacting with the THR or, conversely, to eliminate the THR-mediated mechanism of action for chemicals of concern. PMID:25058446

  17. Predicting organic food consumption: A meta-analytic structural equation model based on the theory of planned behavior.

    PubMed

    Scalco, Andrea; Noventa, Stefano; Sartori, Riccardo; Ceschi, Andrea

    2017-05-01

    During the last decade, the purchase of organic food within a sustainable consumption context has gained momentum. Consequently, the amount of research in the field has increased, leading in some cases to discrepancies regarding both methods and results. The present review examines those works that applied the theory of planned behavior (TPB; Ajzen, 1991) as a theoretical framework in order to understand and predict consumers' motivation to buy organic food. A meta-analysis has been conducted to assess the strength of the relationships between attitude, subjective norms, perceived behavioral control, and intention, as well as between intention and behavior. Results confirm the major role played by individual attitude in shaping buying intention, followed by subjective norms and perceived behavioral control. The intention-behavior relationship shows a large effect size; however, few studies explicitly reported such an association. Furthermore, starting from a pooled correlation matrix, a meta-analytic structural equation model has been applied to jointly evaluate the strength of the relationships among the factors of the original model. Results suggest the robustness of the TPB model. In addition, mediation analysis indicates a potential direct effect from subjective norms to individual attitude in the present context. Finally, some issues regarding methodological aspects of the application of the TPB within the context of organic food are discussed for further research developments. Copyright © 2017 Elsevier Ltd. All rights reserved.

  18. Mechanics based model for predicting structure-induced rolling resistance (SRR) of the tire-pavement system

    NASA Astrophysics Data System (ADS)

    Shakiba, Maryam; Ozer, Hasan; Ziyadi, Mojtaba; Al-Qadi, Imad L.

    2016-11-01

    The structure-induced rolling resistance of pavements, and its impact on vehicle fuel consumption, is investigated in this study. The structural response of pavement causes additional rolling resistance and fuel consumption of vehicles through deformation of pavement and various dissipation mechanisms associated with inelastic material properties and damping. Accurate and computationally efficient models are required to capture these mechanisms and obtain realistic estimates of changes in vehicle fuel consumption. Two mechanistic-based approaches are currently used to calculate vehicle fuel consumption as related to structural rolling resistance: dissipation-induced and deflection-induced methods. The deflection-induced approach is adopted in this study, and realistic representation of pavement-vehicle interactions (PVIs) is incorporated. In addition to considering viscoelastic behavior of asphalt concrete layers, the realistic representation of PVIs in this study includes non-uniform three-dimensional tire contact stresses and dynamic analysis in pavement simulations. The effects of analysis type, tire contact stresses, pavement viscoelastic properties, pavement damping coefficients, vehicle speed, and pavement temperature are then investigated.

  19. Implementation of pseudoreceptor-based pharmacophore queries in the prediction of probable protein targets: explorations in the protein structural profile of Zea mays.

    PubMed

    Kumar, Sivakumar Prasanth; Jha, Prakash C; Pandya, Himanshu A; Jasrai, Yogesh T

    2014-07-01

    Molecular docking plays an important role in the protein target identification by prioritizing probable druggable proteins using docking energies. Due to the limitations of docking scoring schemes, there arises a need for structure-based approaches to acquire confidence in theoretical binding affinities. In this direction, we present here a receptor (protein)-based approach to predict probable protein targets using a small molecule of interest. We adopted a reverse approach wherein the ligand pharmacophore features were used to decipher interaction complementary amino acids of protein cavities (a pseudoreceptor) and expressed as queries to match the cavities or binding sites of the protein dataset. These pseudoreceptor-based pharmacophore queries were used to estimate total probabilities of each protein cavity thereby representing the ligand binding efficiency of the protein. We applied this approach to predict 3 experimental protein targets among 28 Zea mays structural data using 3 co-crystallized ligands as inputs and compared its effectiveness using conventional docking results. We suggest that the combination of total probabilities and docking energies increases the confidence in prioritizing probable protein targets using docking methods. These prediction hypotheses were further supported by DrugScoreX (DSX) pair potential calculations and molecular dynamic simulations.

  20. Web-based toolkits for topology prediction of transmembrane helical proteins, fold recognition, structure and binding scoring, folding-kinetics analysis and comparative analysis of domain combinations.

    PubMed

    Zhou, Hongyi; Zhang, Chi; Liu, Song; Zhou, Yaoqi

    2005-07-01

    We have developed the following web servers for protein structural modeling and analysis at http://theory.med.buffalo.edu: THUMBUP, UMDHMM(TMHP) and TUPS, predictors of transmembrane helical protein topology based on a mean-burial-propensity scale of amino acid residues (THUMBUP), hidden Markov model (UMDHMM(TMHP)) and their combinations (TUPS); SPARKS 2.0 and SP3, two profile-profile alignment methods, that match input query sequence(s) to structural templates by integrating sequence profile with knowledge-based structural score (SPARKS 2.0) and structure-derived profile (SP3); DFIRE, a knowledge-based potential for scoring free energy of monomers (DMONOMER), loop conformations (DLOOP), mutant stability (DMUTANT) and binding affinity of protein-protein/peptide/DNA complexes (DCOMPLEX & DDNA); TCD, a program for protein-folding rate and transition-state analysis of small globular proteins; and DOGMA, a web-server that allows comparative analysis of domain combinations between plant and other 55 organisms. These servers provide tools for prediction and/or analysis of proteins on the secondary structure, tertiary structure and interaction levels, respectively.

  1. Prediction of binding affinity and efficacy of thyroid hormone receptor ligands using QSAR and structure-based modeling methods

    SciTech Connect

    Politi, Regina; Rusyn, Ivan; Tropsha, Alexander

    2014-10-01

    The thyroid hormone receptor (THR) is an important member of the nuclear receptor family that can be activated by endocrine disrupting chemicals (EDC). Quantitative Structure–Activity Relationship (QSAR) models have been developed to facilitate the prioritization of THR-mediated EDC for the experimental validation. The largest database of binding affinities available at the time of the study for ligand binding domain (LBD) of THRβ was assembled to generate both continuous and classification QSAR models with an external accuracy of R² = 0.55 and CCR = 0.76, respectively. In addition, for the first time a QSAR model was developed to predict binding affinities of antagonists inhibiting the interaction of coactivators with the AF-2 domain of THRβ (R² = 0.70). Furthermore, molecular docking studies were performed for a set of THRβ ligands (57 agonists and 15 antagonists of LBD, 210 antagonists of the AF-2 domain, supplemented by putative decoys/non-binders) using several THRβ structures retrieved from the Protein Data Bank. We found that two agonist-bound THRβ conformations could effectively discriminate their corresponding ligands from presumed non-binders. Moreover, one of the agonist conformations could discriminate agonists from antagonists. Finally, we have conducted virtual screening of a chemical library compiled by the EPA as part of the Tox21 program to identify potential THRβ-mediated EDCs using both QSAR models and docking. We concluded that the library is unlikely to have any EDC that would bind to the THRβ. Models developed in this study can be employed either to identify environmental chemicals interacting with the THR or, conversely, to eliminate the THR-mediated mechanism of action for chemicals of concern. - Highlights: • This is the largest curated dataset for ligand binding domain (LBD) of the THRβ. • We report the first QSAR model for antagonists of AF-2 domain of THRβ. • A combination of QSAR and docking enables

  2. Topology-symmetry law of structure of natural titanosilicate micas and related heterophyllosilicates based on the extended OD theory: Structure prediction

    NASA Astrophysics Data System (ADS)

    Belokoneva, E. L.; Topnikova, A. P.; Aksenov, S. M.

    2015-01-01

    A topology-symmetry analysis of the structures in the family of titanosilicate micas and related heterophyllosilicates based on the extended OD theory reveals their kinship with the family of rhodezite, delhayelite, and other minerals that had been analyzed earlier by distinguishing sheets common for all the structures. Like in the family studied earlier, the structural variety of a more complex titanosilicate family is determined by different local symmetries of sheets. Sheets consist of central O layers of edge-sharing octahedra and H layers formed by tetrahedra connected into diortho groups and Ti(Nb,Fe) semioctahedra (octahedra). Three patterns of connection of O and H layers correspond to sheet symmetry P2/m, P2₁/m, and . Various symmetry modes of sheet connection in the structures are analyzed. Hypothetical structures, including structures with a higher degree of disorder, which can be found in nature or obtained by crystal synthesis, are deduced. Factors responsible for structural variety, including the existence of two main sheet varieties (with P2/m and P2₁/m symmetry) are considered a consequence of the difference in the chemism of the mineral formation medium.

  3. Structure prediction of LDLR-HNP1 complex based on docking enhanced by LDLR binding 3D motif.

    PubMed

    Esmaielbeiki, Reyhaneh; Naughton, Declan P; Nebel, Jean-Christophe

    2012-04-01

    Human antimicrobial peptides (AMPs), including defensins, have come under intense scrutiny owing to their key multiple roles as antimicrobial agents. Not only do they display direct action on microbes, but also recently they have been shown to interact with the immune system to increase antimicrobial activity. Unfortunately, since mechanisms involved in the binding of AMPs to mammalian cells are largely unknown, their potential as novel anti-infective agents cannot be exploited yet. Following the reported interaction of Human Neutrophil Peptide 1 dimer (HNP1) with a low density lipoprotein receptor (LDLR), a computational study was conducted to discover their putative mode of interaction. State-of-the-art docking software produced a set of LDLR-HNP1 complex 3D models. Creation of a 3D motif capturing atomic interactions of the LDLR binding interface allowed selection of the most plausible configurations. Eventually, only two models were in agreement with the literature. Binding energy estimations revealed that only one of them is particularly stable and that interaction with LDLR significantly weakens bonds within the HNP1 dimer. This may be significant since it suggests a mechanism for internalisation of HNP1 in mammalian cells. In addition to a novel approach for complex structure prediction, this study proposes a 3D model of the LDLR-HNP1 complex which highlights the key residues which are involved in the interactions. The putative identification of the receptor binding mechanism should inform the future design of synthetic HNPs to afford maximum internalisation, which could lead to novel anti-infective drugs.

  4. Inference of Expanded Lrp-Like Feast/Famine Transcription Factor Targets in a Non-Model Organism Using Protein Structure-Based Prediction

    PubMed Central

    Ashworth, Justin; Plaisier, Christopher L.; Lo, Fang Yin; Reiss, David J.; Baliga, Nitin S.

    2014-01-01

    Widespread microbial genome sequencing presents an opportunity to understand the gene regulatory networks of non-model organisms. This requires knowledge of the binding sites for transcription factors whose DNA-binding properties are unknown or difficult to infer. We adapted a protein structure-based method to predict the specificities and putative regulons of homologous transcription factors across diverse species. As a proof-of-concept we predicted the specificities and transcriptional target genes of divergent archaeal feast/famine regulatory proteins, several of which are encoded in the genome of Halobacterium salinarum. This was validated by comparison to experimentally determined specificities for transcription factors in distantly related extremophiles, chromatin immunoprecipitation experiments, and cis-regulatory sequence conservation across eighteen related species of halobacteria. Through this analysis we were able to infer that Halobacterium salinarum employs a divergent local trans-regulatory strategy to regulate genes (carA and carB) involved in arginine and pyrimidine metabolism, whereas Escherichia coli employs an operon. The prediction of gene regulatory binding sites using structure-based methods is useful for the inference of gene regulatory relationships in new species that are otherwise difficult to infer. PMID:25255272

  5. Retention prediction of low molecular weight anions in ion chromatography based on quantitative structure-retention relationships applied to the linear solvent strength model.

    PubMed

    Park, Soo Hyun; Haddad, Paul R; Talebi, Mohammad; Tyteca, Eva; Amos, Ruth I J; Szucs, Roman; Dolan, John W; Pohl, Christopher A

    2017-02-24

    Quantitative Structure-Retention Relationships (QSRRs) represent a popular technique to predict the retention times of analytes, based on molecular descriptors encoding the chemical structures of the analytes. The linear solvent strength (LSS) model relating the retention factor, k, to the eluent concentration (log k = a − b log[eluent]) is a well-known and accurate retention model in ion chromatography (IC). In this work, QSRRs for inorganic and small organic anions were used to predict the regression parameters a and b in the LSS model (and hence retention times) for these analytes under a wide range of eluent conditions, based solely on their chemical structures. This approach was performed on retention data of inorganic and small organic anions from the "Virtual Column" software (Thermo Fisher Scientific). These retention data were recalibrated via a "porting" methodology on three columns (AS20, AS19, and AS11HC), prior to the QSRR modeling. This provided retention data more applicable on recently produced columns which may exhibit changes of column behavior due to batch-to-batch variability. Molecular descriptors for the analytes were calculated with Dragon software using the geometry-optimized molecular structures, employing the AM1 semi-empirical method. An optimal subset of molecular descriptors was then selected using an evolutionary algorithm (EA). Finally, the QSRR models were generated by multiple linear regression (MLR). As a result, six QSRR models with good predictive performance were successfully derived for a- and b-values on three columns (R² > 0.98 and RMSE < 0.11). External validation showed the possibility of using the developed QSRR models as predictive tools in IC (Q²ext(F3) > 0.7 and RMSEP < 0.4). Moreover, it was demonstrated that the obtained QSRR models for the a- and b-values can predict the retention times for new analytes with good accuracy and predictability (R² of 0.98, RMSE of 0.89 min, Q²ext(F3) of 0.96 and RMSEP of 1.18 min).
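
    The step from predicted a and b values to a retention time can be sketched as follows. The code assumes a base-10 logarithm in the LSS equation and uses the standard chromatographic relation t_R = t0 (1 + k), where t0 is the column void time; the function names, the parameter values, and t0 are hypothetical and are not taken from the study.

```python
import math

def retention_factor(a, b, eluent_conc):
    """LSS model: log10(k) = a - b * log10([eluent])."""
    return 10 ** (a - b * math.log10(eluent_conc))

def retention_time(a, b, eluent_conc, t0):
    """Convert the retention factor to a retention time via t_R = t0 * (1 + k)."""
    return t0 * (1 + retention_factor(a, b, eluent_conc))

# Hypothetical values: a and b would come from the QSRR models,
# eluent_conc from the method conditions, and t0 from the column.
k = retention_factor(a=2.0, b=1.0, eluent_conc=10.0)
t_r = retention_time(a=2.0, b=1.0, eluent_conc=10.0, t0=2.0)
```

    Because k falls as the eluent concentration rises (for positive b), the same pair of a and b values yields retention times across a range of eluent conditions.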

  6. RNA-SSPT: RNA Secondary Structure Prediction Tools

    PubMed Central

    Ahmad, Freed; Mahboob, Shahid; Gulzar, Tahsin; din, Salah U; Hanif, Tanzeela; Ahmad, Hifza; Afzal, Muhammad

    2013-01-01

    The prediction of RNA structure is useful for understanding evolution for both in silico and in vitro studies. Physical methods like NMR studies to predict RNA secondary structure are expensive and difficult. Computational RNA secondary structure prediction is easier. Comparative sequence analysis provides the best solution. But secondary structure prediction of a single RNA sequence is challenging. RNA-SSPT is a tool that computationally predicts secondary structure of a single RNA sequence. Most of the RNA secondary structure prediction tools do not allow pseudoknots in the structure or are unable to locate them. The Nussinov dynamic programming algorithm has been implemented in RNA-SSPT. Current studies show that only the energetically most favorable secondary structure is required, and a modification of the algorithm is also available that produces base pairs to lower the total free energy of the secondary structure. For visualization of RNA secondary structure, NAVIEW in C language is used and modified in C# for tool requirement. RNA-SSPT is built in C# using Dot Net 2.0 in Microsoft Visual Studio 2005 Professional edition. The accuracy of RNA-SSPT is tested in terms of Sensitivity and Positive Predicted Value. It is a tool which serves both secondary structure prediction and secondary structure visualization purposes. PMID:24250115
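
    For readers unfamiliar with the Nussinov algorithm named in this abstract, a minimal textbook-style sketch is given below. It maximizes the number of nested base pairs and is not the RNA-SSPT implementation (which is written in C#); the allowed pair set including G-U wobble pairs and the minimum hairpin loop length of 3 are assumptions made for illustration.

```python
def nussinov(seq, min_loop=3):
    """Return the maximum number of nested base pairs in seq.

    min_loop is the assumed minimum number of unpaired bases
    enclosed by any pair.
    """
    pairs = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # dp[i][j] holds the best pair count for the subsequence seq[i..j].
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: position j stays unpaired
            # case 2: position j pairs with some k, splitting the interval
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in pairs:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]

# Small hairpin example: three stacked pairs around an AAA loop.
n_pairs = nussinov("GGGAAAUCC")
```

    The recursion runs in O(n³) time and, as the abstract notes for most such tools, cannot represent pseudoknots because it only considers nested pairs.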

  7. RNA-SSPT: RNA Secondary Structure Prediction Tools.

    PubMed

    Ahmad, Freed; Mahboob, Shahid; Gulzar, Tahsin; Din, Salah U; Hanif, Tanzeela; Ahmad, Hifza; Afzal, Muhammad

    2013-01-01

    The prediction of RNA structure is useful for understanding evolution in both in silico and in vitro studies. Physical methods such as NMR studies for determining RNA secondary structure are expensive and difficult. Computational RNA secondary structure prediction is easier. Comparative sequence analysis provides the best solution, but secondary structure prediction of a single RNA sequence is challenging. RNA-SSPT is a tool that computationally predicts the secondary structure of a single RNA sequence. Most RNA secondary structure prediction tools do not allow pseudoknots in the structure or are unable to locate them. The Nussinov dynamic programming algorithm has been implemented in RNA-SSPT. Current studies show that only the energetically most favorable secondary structure is required, and a modification of the algorithm is also available that produces base pairs so as to lower the total free energy of the secondary structure. For visualization of RNA secondary structure, NAVIEW, written in C, is used and was modified in C# to meet the tool's requirements. RNA-SSPT is built in C# using .NET 2.0 in Microsoft Visual Studio 2005 Professional edition. The accuracy of RNA-SSPT is tested in terms of Sensitivity and Positive Predictive Value. It is a tool that serves both secondary structure prediction and secondary structure visualization purposes.

  8. Vascular endothelial growth factor receptor-2 (VEGFR-2) inhibitors: development and validation of predictive 3-D QSAR models through extensive ligand- and structure-based approaches

    NASA Astrophysics Data System (ADS)

    Ragno, Rino; Ballante, Flavio; Pirolli, Adele; Wickersham, Richard B.; Patsilinakos, Alexandros; Hesse, Stéphanie; Perspicace, Enrico; Kirsch, Gilbert

    2015-08-01

    Vascular endothelial growth factor receptor-2 (VEGFR-2) is a key element in angiogenesis, the process by which new blood vessels are formed, and is thus an important pharmaceutical target. Here, 3-D quantitative structure-activity relationship (3-D QSAR) methods were used to build a quantitative screening and pharmacophore model of the VEGFR-2 receptor for the design of inhibitors with improved activities. Most of the available experimental data were used as a training set to derive eight optimized, fully cross-validated mono-probe models and one multi-probe quantitative model. Notable is the use of 262 molecules, aligned following both structure-based and ligand-based protocols, as an external test set, confirming the 3-D QSAR models' predictive capability and their usefulness in designing new VEGFR-2 inhibitors. According to a survey of the literature, this is the first wide-ranging computational medicinal chemistry application to VEGFR-2 inhibitors.

  9. Quantitative structure-activity relationship models for predicting biological properties, developed by combining structure- and ligand-based approaches: an application to the human ether-a-go-go-related gene potassium channel inhibition.

    PubMed

    Coi, Alessio; Massarelli, Ilaria; Saraceno, Marilena; Carli, Niccolò; Testai, Lara; Calderone, Vincenzo; Bianucci, Anna Maria

    2009-10-01

    A strategy for developing accurate quantitative structure-activity relationship models enabling predictions of biological properties, when suitable knowledge concerning both ligands and biological target is available, was tested on a data set where molecules are characterized by high structural diversity. The strategy was applied to human ether-a-go-go-related gene K⁺ channel inhibition and consists of a combination of ligand- and structure-based approaches, which can be carried out whenever the three-dimensional structure of the target macromolecule is known or may be modeled with good accuracy. Molecular conformations of ligands were obtained by means of molecular docking, performed in a previously built theoretical model of the channel pore, so that descriptors depending upon the three-dimensional molecular structure were properly computed. A modification of the directed sphere-exclusion algorithm was developed and used to properly split the whole dataset into Training/Test set pairs. Molecular descriptors, computed by means of the CODESSA program, were used to search for reliable quantitative structure-activity relationship models, which were subsequently identified through a rigorous validation analysis. Finally, pIC50 values of a prediction set, external to the initial dataset, were predicted, and the results confirmed the high predictive power of the model within a quite wide chemical space.

  10. Prediction of the electronic structures, thermodynamic and mechanical properties in manganese doped magnesium-based alloys and their saturated hydrides based on density functional theory

    NASA Astrophysics Data System (ADS)

    Zhang, Ziying; Zhang, Huizhen; Zhao, Hui; Yu, Zhishui; He, Liang; Li, Jin

    2015-04-01

    The crystal structures, electronic structures, thermodynamic and mechanical properties of Mg2Ni alloy and its saturated hydride with different Mn-doping contents are investigated using first-principles density functional theory. The lattice parameters for the Mn-doped Mg2Ni alloys and their saturated hydrides decreased with increasing Mn-doping content because of the smaller atomic size of Mn compared with that of Mg. Analysis of the formation enthalpies and electronic structures reveals that the partial substitution of Mg with Mn reduces the stability of Mg2Ni alloy and its saturated hydride. The calculated elastic constants indicate that, although the partial substitution of Mg with Mn lowers the toughness of the hexagonal Mg2Ni alloy, the charge/discharge cycles are elevated when the Mn-doping content is high enough to form the predicted intermetallic compound Mg3MnNi2.

  11. Predicting missing links via structural similarity

    NASA Astrophysics Data System (ADS)

    Lyu, Guo-Dong; Fan, Chang-Jun; Yu, Lian-Fei; Xiu, Bao-Xin; Zhang, Wei-Ming

    2015-04-01

    Predicting missing links in networks plays a significant role in modern science. On the basis of structural similarity, our paper proposes a new node-similarity-based measure called biased resource allocation (BRA), which is motivated by the resource allocation (RA) measure. Comparisons between BRA and nine well-known node-similarity-based measures on five real networks indicate that BRA performs no worse than RA, which was the best node-similarity-based index in previous research. Afterwards, based on the LocalPath (LP) and Katz measures, we propose two further improved measures, named Im-LocalPath (Im-LP) and Im-Katz respectively. Numerical results show that the prediction accuracy of both the Im-LP and Im-Katz measures improves on that of the original LP and Katz measures. Finally, a new path-similarity-based measure and its improved version, called the LYU and Im-LYU measures, are proposed, and the Im-LYU measure in particular is shown to outperform the other measures mentioned.
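
    The abstract does not define BRA, but the RA measure it builds on is commonly defined as the sum, over the common neighbors z of two nodes, of 1/degree(z). The following sketch illustrates that baseline only; the helper names and the toy edge list are hypothetical and not from the paper.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)} from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def resource_allocation(adj, x, y):
    """RA score: each common neighbor z contributes 1 / degree(z)."""
    return sum(1.0 / len(adj[z]) for z in adj[x] & adj[y])


def rank_missing_links(adj):
    """Score every non-adjacent node pair and sort by descending RA score."""
    scored = [
        ((x, y), resource_allocation(adj, x, y))
        for x, y in combinations(sorted(adj), 2)
        if y not in adj[x]
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Toy network (hypothetical): a and b share neighbors c and d; e hangs off d.
    adj = build_adjacency([("a", "c"), ("b", "c"), ("a", "d"), ("b", "d"), ("d", "e")])
    for pair, score in rank_missing_links(adj):
        print(pair, round(score, 3))
```

    Low-degree common neighbors count for more than hubs under this scoring, which is the intuition that distinguishes RA from a plain common-neighbor count.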

  12. Protein structural motifs in prediction and design.

    PubMed

    Mackenzie, Craig O; Grigoryan, Gevorg

    2017-06-01

    The Protein Data Bank (PDB) has been an integral resource for shaping our fundamental understanding of protein structure and for the advancement of such applications as protein design and structure prediction. Over the years, information from the PDB has been used to generate models ranging from specific structural mechanisms to general statistical potentials. With accumulating structural data, it has become possible to mine for more complete and complex structural observations, deducing more accurate generalizations. Motif libraries, which capture recurring structural features along with their sequence preferences, have exposed modularity in the structural universe and found successful application in various problems of structural biology. Here we summarize recent achievements in this arena, focusing on subdomain level structural patterns and their applications to protein design and structure prediction, and suggest promising future directions as the structural database continues to grow. Copyright © 2017 Elsevier Ltd. All rights reserved.

  13. Predicting road accidents: Structural time series approach

    NASA Astrophysics Data System (ADS)

    Junus, Noor Wahida Md; Ismail, Mohd Tahir

    2014-07-01

    In this paper, a model for the occurrence of road accidents in Malaysia from 1970 to 2010 is developed, and the number of road accidents is predicted with this model using the structural time series approach. The models are developed using a stepwise method, and the residuals of each step are analyzed. The accuracy of the model is assessed using the mean absolute percentage error (MAPE), and the best model is chosen based on the smallest Akaike information criterion (AIC) value. The structural time series approach found that the local linear trend model best represents the road accidents. This model allows the level and slope components to vary over time. In addition, this approach provides useful information for improving the conventional time series method.
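
    The two criteria named in this abstract have standard definitions: MAPE is the mean of |actual - predicted| / |actual| expressed as a percentage, and AIC is 2k - 2 ln L for a model with k estimated parameters and maximized likelihood L. The sketch below shows how they might be computed and used for model selection; the function names, candidate models, log-likelihoods and data values are all hypothetical and are not results from the paper.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Actual values must be non-zero."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 ln L (smaller is better)."""
    return 2.0 * n_params - 2.0 * log_likelihood


def select_by_aic(candidates):
    """Pick the model name with the smallest AIC.

    candidates maps a model name to (maximized log-likelihood, number of parameters).
    """
    return min(candidates, key=lambda name: aic(*candidates[name]))


if __name__ == "__main__":
    # Hypothetical fitted models; the numbers are illustrative only.
    candidates = {
        "local level": (-120.0, 2),
        "local linear trend": (-112.0, 3),
    }
    print(select_by_aic(candidates))
    print(mape([100.0, 200.0, 400.0], [110.0, 190.0, 400.0]))
```

    AIC penalizes each extra parameter by 2, so a richer model is preferred only when it raises the log-likelihood by more than 1 per added parameter.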

  14. Toward structure prediction of cyclic peptides.

    PubMed

    Yu, Hongtao; Lin, Yu-Shan

    2015-02-14

    Cyclic peptides are a promising class of molecules that can be used to target specific protein-protein interactions. A computational method to accurately predict their structures would substantially advance the development of cyclic peptides as modulators of protein-protein interactions. Here, we develop a computational method that integrates bias-exchange metadynamics simulations, a Boltzmann reweighting scheme, dihedral principal component analysis and a modified density peak-based cluster analysis to provide a converged structural description for cyclic peptides. Using this method, we evaluate the performance of a number of popular protein force fields on a model cyclic peptide. All the tested force fields seem to over-stabilize the α-helix and PPII/β regions in the Ramachandran plot, commonly populated by linear peptides and proteins. Our findings suggest that re-parameterization of a force field that well describes the full Ramachandran plot is necessary to accurately model cyclic peptides.

  15. Structural network efficiency predicts conversion to dementia

    PubMed Central

    Tuladhar, Anil M.; van Uden, Ingeborg W.M.; Rutten-Jacobs, Loes C.A.; Lawrence, Andrew; van der Holst, Helena; van Norden, Anouk; de Laat, Karlijn; van Dijk, Ewoud; Claassen, Jurgen A.H.R.; Kessels, Roy P.C.; Markus, Hugh S.; Norris, David G.

    2016-01-01

    Objective: To examine whether structural network connectivity at baseline predicts incident all-cause dementia in a prospective hospital-based cohort of elderly participants with MRI evidence of small vessel disease (SVD). Methods: A total of 436 participants from the Radboud University Nijmegen Diffusion Tensor and Magnetic Resonance Cohort (RUN DMC), a prospective hospital-based cohort of elderly participants without dementia with cerebral SVD, were included in 2006. During follow-up (2011–2012), dementia was diagnosed. The structural network was constructed from baseline diffusion tensor imaging followed by deterministic tractography, and measures of efficiency were calculated using graph theory. Cox proportional hazards regression analyses were conducted. Results: During 5 years of follow-up, 32 patients developed dementia. MRI markers for SVD were strongly associated with network measures. Patients with dementia showed lower total network strength and global and local efficiency at baseline as compared with the group without dementia. Lower global network efficiency was independently associated with increased risk of incident all-cause dementia (hazard ratio 0.63, 95% confidence interval 0.42–0.96, p = 0.032); in contrast, individual SVD markers including lacunes, white matter hyperintensities volume, and atrophy were not independently associated. Conclusions: These results support a pivotal role of network disruption in the genesis of dementia in SVD, and suggest that network analysis of white matter connectivity has potential as a predictive marker in the disease. PMID:26888983
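
    Global efficiency, the graph measure highlighted in this abstract, is commonly defined as the average of 1/d(i, j) over all ordered pairs of distinct nodes, where d is the shortest-path length. The sketch below computes it for an unweighted, undirected graph; this is a simplifying assumption, since tractography-derived networks are typically weighted, and the function names and toy graphs are hypothetical.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest-path hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def global_efficiency(adj):
    """Mean inverse shortest-path length over ordered node pairs.

    Unreachable pairs contribute 0, which is the usual convention.
    """
    nodes = list(adj)
    n = len(nodes)
    if n < 2:
        return 0.0
    total = 0.0
    for u in nodes:
        dist = bfs_distances(adj, u)
        total += sum(1.0 / d for v, d in dist.items() if v != u)
    return total / (n * (n - 1))


if __name__ == "__main__":
    path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
    triangle = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}
    print(global_efficiency(path), global_efficiency(triangle))
```

    Removing the a-c edge from the triangle lengthens one path and lowers the efficiency, a small-scale picture of how loss of connections reduces the measure.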

  16. Structure prediction of the EcoRV DNA methyltransferase based on mutant profiling, secondary structure analysis, comparison with known structures of methyltransferases and isolation of catalytically inactive single mutants.

    PubMed

    Jeltsch, A; Sobotta, T; Pingoud, A

    1996-05-01

    The EcoRV DNA methyltransferase (M.EcoRV) is an alpha-adenine methyltransferase. We have used two different programs to predict the secondary structure of M.EcoRV. The resulting consensus prediction was tested by a mutant profiling analysis. 29 neutral mutations of M.EcoRV were generated by five cycles of random mutagenesis and selection for active variants to increase the reliability of the prediction and to get a secondary structure prediction for some ambiguously predicted regions. The predicted consensus secondary structure elements could be aligned to the common topology of the structures of the catalytic domains of M.HhaI and M.TaqI. In a complementary approach we have isolated nine catalytically inactive single mutants. Five of these mutants contain an amino acid exchange within the catalytic domain of M.EcoRV (Val2-Ala, Lys81Arg, Cys192Arg, Asp193Gly, Trp231Arg). The Trp231Arg mutant binds DNA similarly to wild-type M.EcoRV, but is catalytically inactive. Hence this mutant behaves like a bona fide active site mutant. According to the structure prediction, Trp231 is located in a loop at the putative active site of M.EcoRV. The other inactive mutants were insoluble. They contain amino acid exchanges within the conserved amino acid motifs X, III or IV in M.EcoRV confirming the importance of these regions.

  17. Structure-based methods for predicting target mutation-induced drug resistance and rational drug design to overcome the problem.

    PubMed

    Hao, Ge-Fei; Yang, Guang-Fu; Zhan, Chang-Guo

    2012-10-01

    Drug resistance has become one of the biggest challenges in drug discovery and/or development and has attracted great research interests worldwide. During the past decade, computational strategies have been developed to predict target mutation-induced drug resistance. Meanwhile, various molecular design strategies, including targeting protein backbone, targeting highly conserved residues and dual/multiple targeting, have been used to design novel inhibitors for combating the drug resistance. In this article we review recent advances in development of computational methods for target mutation-induced drug resistance prediction and strategies for rational design of novel inhibitors that could be effective against the possible drug-resistant mutants of the target.

  18. Energy-directed RNA structure prediction.

    PubMed

    Hofacker, Ivo L

    2014-01-01

    In this chapter we present the classic dynamic programming algorithms for RNA structure prediction by energy minimization, as well as variations of this approach that allow the computation of suboptimal foldings, or even the partition function over all possible secondary structures. The latter are essential in order to deal with the inaccuracy of minimum free energy (MFE) structure prediction, and can be used, for example, to derive reliability measures that assign a confidence value to all or part of a predicted structure. In addition, we discuss recently proposed alternatives to the MFE criterion such as the use of maximum expected accuracy (MEA) or centroid structures. The dynamic programming algorithms implicitly assume that the RNA molecule is in thermodynamic equilibrium. However, especially for long RNAs, this need not be the case. In the last section we therefore discuss approaches for predicting RNA folding kinetics and co-transcriptional folding.

  19. SCRATCH: a protein structure and structural feature prediction server

    PubMed Central

    Cheng, J.; Randall, A. Z.; Sweredoski, M. J.; Baldi, P.

    2005-01-01

    SCRATCH is a server for predicting protein tertiary structure and structural features. The SCRATCH software suite includes predictors for secondary structure, relative solvent accessibility, disordered regions, domains, disulfide bridges, single mutation stability, residue contacts versus average, individual residue contacts and tertiary structure. The user simply provides an amino acid sequence and selects the desired predictions, then submits to the server. Results are emailed to the user. The server is available at . PMID:15980571

  20. Prediction of Protein Structure Using Surface Accessibility Data

    PubMed Central

    Hartlmüller, Christoph; Göbl, Christoph

    2016-01-01

    An approach to the de novo structure prediction of proteins is described that relies on surface accessibility data from NMR paramagnetic relaxation enhancements by a soluble paramagnetic compound (sPRE). This method exploits the distance‐to‐surface information encoded in the sPRE data in the chemical shift‐based CS‐Rosetta de novo structure prediction framework to generate reliable structural models. For several proteins, it is demonstrated that surface accessibility data is an excellent measure of the correct protein fold in the early stages of the computational folding algorithm and significantly improves accuracy and convergence of the standard Rosetta structure prediction approach. PMID:27560616

  1. Comparative analysis of QSAR models for predicting pK(a) of organic oxygen acids and nitrogen bases from molecular structure.

    PubMed

    Yu, Haiying; Kühne, Ralph; Ebert, Ralf-Uwe; Schüürmann, Gerrit

    2010-11-22

    For 1143 organic compounds comprising 580 oxygen acids and 563 nitrogen bases that cover more than 17 orders of experimental pK(a) (from -5.00 to 12.23), the pK(a) prediction performances of ACD, SPARC, and two calibrations of a semiempirical quantum chemical (QC) AM1 approach have been analyzed. The overall root-mean-square errors (rms) for the acids are 0.41, 0.58 (0.42 without ortho-substituted phenols with intramolecular H-bonding), and 0.55 and for the bases are 0.65, 0.70, 1.17, and 1.27 for ACD, SPARC, and both QC methods, respectively. Method-specific performances are discussed in detail for six acid subsets (phenols and aromatic and aliphatic carboxylic acids with different substitution patterns) and nine base subsets (anilines, primary, secondary and tertiary amines, meta/para-substituted and ortho-substituted pyridines, pyrimidines, imidazoles, and quinolines). The results demonstrate an overall better performance for acids than for bases but also a substantial variation across subsets. For the overall best-performing ACD, rms ranges from 0.12 to 1.11 and 0.40 to 1.21 pK(a) units for the acid and base subsets, respectively. With regard to the squared correlation coefficient r², the results are 0.86 to 0.96 (acids) and 0.79 to 0.95 (bases) for ACD, 0.77 to 0.95 (acids) and 0.85 to 0.97 (bases) for SPARC, and 0.64 to 0.87 (acids) and 0.43 to 0.83 (bases) for the QC methods, respectively. Attention is paid to structural and method-specific causes for observed pitfalls. The significant subset dependence of the prediction performances suggests a consensus modeling approach.

  2. PSPP: A Protein Structure Prediction Pipeline for Computing Clusters

    PubMed Central

    Lee, Michael S.; Bondugula, Rajkumar; Desai, Valmik; Zavaljevski, Nela; Yeh, In-Chul; Wallqvist, Anders; Reifman, Jaques

    2009-01-01

    Background: Protein structures are critical for understanding the mechanisms of biological systems and, subsequently, for drug and vaccine design. Unfortunately, protein sequence data exceed structural data by a factor of more than 200 to 1. This gap can be partially filled by using computational protein structure prediction. While structure prediction Web servers are a notable option, they often restrict the number of sequence queries and/or provide a limited set of prediction methodologies. Therefore, we present a standalone protein structure prediction software package suitable for high-throughput structural genomic applications that performs all three classes of prediction methodologies: comparative modeling, fold recognition, and ab initio. This software can be deployed on a user's own high-performance computing cluster. Methodology/Principal Findings: The pipeline consists of a Perl core that integrates more than 20 individual software packages and databases, most of which are freely available from other research laboratories. The query protein sequences are first divided into domains either by domain boundary recognition or Bayesian statistics. The structures of the individual domains are then predicted using template-based modeling or ab initio modeling. The predicted models are scored with a statistical potential and an all-atom force field. The top-scoring ab initio models are annotated by structural comparison against the Structural Classification of Proteins (SCOP) fold database. Furthermore, secondary structure, solvent accessibility, transmembrane helices, and structural disorder are predicted. The results are generated in text, tab-delimited, and hypertext markup language (HTML) formats. So far, the pipeline has been used to study viral and bacterial proteomes. Conclusions: The standalone pipeline that we introduce here, unlike protein structure prediction Web servers, allows users to devote their own computing assets to process a potentially unlimited

  3. Predictive Models of Primary Tropical Forest Structure from Geomorphometric Variables Based on SRTM in the Tapajós Region, Brazilian Amazon

    PubMed Central

    Bispo, Polyanna da Conceição; dos Santos, João Roberto; Valeriano, Márcio de Morisson; Graça, Paulo Maurício Lima de Alencastro; Balzter, Heiko; França, Helena; Bispo, Pitágoras da Conceição

    2016-01-01

    Surveying primary tropical forest over large regions is challenging. Indirect methods of relating terrain information or other external spatial datasets to forest biophysical parameters can provide forest structural maps at large scales, but the inherent uncertainties need to be evaluated fully. The goal of the present study was to evaluate relief characteristics, measured through geomorphometric variables, as predictors of forest structural characteristics such as average tree basal area (BA) and height (H) and average percentage canopy openness (CO). Our hypothesis is that geomorphometric variables are good predictors of the structure of primary tropical forest, even in areas with low altitude variation. The study was performed at the Tapajós National Forest, located in the western part of the State of Pará, Brazil. Forty-three plots were sampled. Predictive models for BA, H and CO were parameterized based on geomorphometric variables using multiple linear regression. Validation of the models with nine independent sample plots revealed a Root Mean Square Error (RMSE) of 3.73 m²/ha (20%) for BA, 1.70 m (12%) for H, and 1.78% (21%) for CO. The coefficients of determination between observed and predicted values were r² = 0.32 for CO, r² = 0.26 for H and r² = 0.52 for BA. The models obtained were able to adequately estimate BA and CO. In summary, it can be concluded that relief variables are good predictors of vegetation structure and enable the creation of forest structure maps in primary tropical rainforest with an acceptable uncertainty. PMID:27089013

  4. Phylogenetic approaches to natural product structure prediction.

    PubMed

    Ziemert, Nadine; Jensen, Paul R

    2012-01-01

    Phylogenetics is the study of the evolutionary relatedness among groups of organisms. Molecular phylogenetics uses sequence data to infer these relationships for both organisms and the genes they maintain. With the large amount of publicly available sequence data, phylogenetic inference has become increasingly important in all fields of biology. In the case of natural product research, phylogenetic relationships are proving to be highly informative in terms of delineating the architecture and function of the genes involved in secondary metabolite biosynthesis. Polyketide synthases and nonribosomal peptide synthetases provide model examples in which individual domain phylogenies display different predictive capacities, resolving features ranging from substrate specificity to structural motifs associated with the final metabolic product. This chapter provides examples in which phylogeny has proven effective in terms of predicting functional or structural aspects of secondary metabolism. The basics of how to build a reliable phylogenetic tree are explained along with information about programs and tools that can be used for this purpose. Furthermore, it introduces the Natural Product Domain Seeker, a recently developed Web tool that employs phylogenetic logic to classify ketosynthase and condensation domains based on established enzyme architecture and biochemical function. Copyright © 2012 Elsevier Inc. All rights reserved.

  5. Transmembrane beta-barrel protein structure prediction

    NASA Astrophysics Data System (ADS)

    Randall, Arlo; Baldi, Pierre

    Transmembrane β-barrel (TMB) proteins are embedded in the outer membranes of mitochondria, Gram-negative bacteria, and chloroplasts. These proteins perform critical functions, including active ion-transport and passive nutrient intake. Therefore, there is a need for accurate prediction of secondary and tertiary structures of TMB proteins. A variety of methods have been developed for predicting the secondary structure, and these predictions are very useful for constructing a coarse topology of TMB structure; however, they do not provide enough information to construct a low-resolution tertiary structure for a TMB protein. In addition, while the overall structural architecture is well conserved among TMB proteins, the amino acid sequences are highly divergent. Thus, traditional homology modeling methods cannot be applied to many putative TMB proteins. Here, we describe TMBpro, a pipeline of methods for predicting TMB secondary structure, β-residue contacts, and finally tertiary structure. The tertiary prediction method relies on the specific construction rules that TMB proteins adhere to and on the predicted β-residue contacts to dramatically reduce the search space for the model building procedure.

  6. Protein Structure and Function Prediction Using I-TASSER.

    PubMed

    Yang, Jianyi; Zhang, Yang

    2015-12-17

    I-TASSER is a hierarchical protocol for automated protein structure prediction and structure-based function annotation. Starting from the amino acid sequence of target proteins, I-TASSER first generates full-length atomic structural models from multiple threading alignments and iterative structural assembly simulations followed by atomic-level structure refinement. The biological functions of the protein, including ligand-binding sites, enzyme commission number, and gene ontology terms, are then inferred from known protein function databases based on sequence and structure profile comparisons. I-TASSER is freely available as both an on-line server and a stand-alone package. This unit describes how to use the I-TASSER protocol to generate structure and function prediction and how to interpret the prediction results, as well as alternative approaches for further improving the I-TASSER modeling quality for distant-homologous and multi-domain protein targets. Copyright © 2015 John Wiley & Sons, Inc.

  7. Predicting and prioritizing maintenance for concrete structures

    SciTech Connect

    Hertlein, B.H.

    1991-06-01

    Using nondestructive testing of concrete structures to predict maintenance needs can help schedule maintenance work in advance and prevent unexpected shutdowns. Nondestructive testing methods are described and development of a testing program is discussed.

  8. Critical Features of Fragment Libraries for Protein Structure Prediction.

    PubMed

    Trevizani, Raphael; Custódio, Fábio Lima; Dos Santos, Karina Baptista; Dardenne, Laurent Emmanuel

    2017-01-01

    The use of fragment libraries is a popular approach among protein structure prediction methods and has proven to substantially improve the quality of predicted structures. However, some vital aspects of a fragment library that influence the accuracy of modeling a native structure remain to be determined. This study investigates some of these features. In particular, we analyze the effect of using secondary structure prediction to guide fragment selection, of different fragment sizes, and of structural clustering of fragments within libraries. To have a clearer view of how these factors affect protein structure prediction, we isolated the process of model building by fragment assembly from some common limitations associated with prediction methods, e.g., imprecise energy functions and optimization algorithms, by employing an exact structure-based objective function under a greedy algorithm. Our results indicate that shorter fragments reproduce the native structure more accurately than longer ones. Libraries composed of multiple fragment lengths generate even better structures, with longer fragments proving more useful at the beginning of the simulations. The use of many different fragment sizes shows little improvement when compared to predictions carried out with libraries that comprise only three different fragment sizes. Models obtained from libraries built using only sequence similarity are, on average, better than those built with a secondary structure prediction bias. However, we found that the use of secondary structure prediction allows greater reduction of the search space, which is invaluable for prediction methods. The results of this study can serve as critical guidelines for the use of fragment libraries in protein structure prediction.

  9. Critical Features of Fragment Libraries for Protein Structure Prediction

    PubMed Central

    dos Santos, Karina Baptista

    2017-01-01

    The use of fragment libraries is a popular approach among protein structure prediction methods and has proven to substantially improve the quality of predicted structures. However, some vital aspects of a fragment library that influence the accuracy of modeling a native structure remain to be determined. This study investigates some of these features. In particular, we analyze the effect of using secondary structure prediction to guide fragment selection, of different fragment sizes, and of structural clustering of fragments within libraries. To have a clearer view of how these factors affect protein structure prediction, we isolated the process of model building by fragment assembly from some common limitations associated with prediction methods, e.g., imprecise energy functions and optimization algorithms, by employing an exact structure-based objective function under a greedy algorithm. Our results indicate that shorter fragments reproduce the native structure more accurately than longer ones. Libraries composed of multiple fragment lengths generate even better structures, with longer fragments proving more useful at the beginning of the simulations. The use of many different fragment sizes shows little improvement when compared to predictions carried out with libraries that comprise only three different fragment sizes. Models obtained from libraries built using only sequence similarity are, on average, better than those built with a secondary structure prediction bias. However, we found that the use of secondary structure prediction allows greater reduction of the search space, which is invaluable for prediction methods. The results of this study can serve as critical guidelines for the use of fragment libraries in protein structure prediction. PMID:28085928

  10. Enhanced multi-view prediction structure

    NASA Astrophysics Data System (ADS)

    Liu, Da; Li, Yi; Xiong, Yazhou; Wang, Li; Li, Chunyan; Yin, Fang

    2016-11-01

    In this paper, an extended DFMC structure is first proposed, and then the HQF jump period in the extended DFMC is presented. Considering the temporal-view and inter-view prediction structure, the HQF location is determined. From the HQF, an enhanced LQF is proposed. Then, considering the HQF and the enhanced LQF, an improved inter-view prediction is proposed. Finally, bit allocation for the proposed multi-view structure is proposed. Experimental results show that the proposed method can achieve better performance than the previous schemes.

  11. Predicting Protein Structure Using Parallel Genetic Algorithms.

    DTIC Science & Technology

    1994-12-01

    34 IEEE Transactions on Systems, Man and Cybernetics, 10(9) (September 1980). 16. De Jong, Kenneth A. "On Using Genetic Algorithms to Search Program... Predicting Protein Structure Using Parallel Genetic Algorithms. Thesis. George H... Approved for public release; distribution unlimited. AFIT/GCS/ENG/94D-03.

  12. Development of a Support Vector Machine-Based System to Predict Whether a Compound Is a Substrate of a Given Drug Transporter Using Its Chemical Structure.

    PubMed

    Ose, Atsushi; Toshimoto, Kota; Ikeda, Kazushi; Maeda, Kazuya; Yoshida, Shuya; Yamashita, Fumiyoshi; Hashida, Mitsuru; Ishida, Takashi; Akiyama, Yutaka; Sugiyama, Yuichi

    2016-07-01

    The aim of this study was to develop an in silico prediction system to assess which of 7 categories of drug transporters (organic anion transporting polypeptide [OATP] 1B1/1B3, multidrug resistance-associated protein [MRP] 2/3/4, organic anion transporter [OAT] 1, OAT3, organic cation transporter [OCT] 1/2/multidrug and toxin extrusion [MATE] 1/2-K, multidrug resistance protein 1 [MDR1], and breast cancer resistance protein [BCRP]) can recognize compounds as substrates using their chemical structures alone. We compiled an internal data set consisting of 260 compounds that are substrates for at least 1 of the 7 categories of drug transporters. Four physicochemical parameters (charge, molecular weight, lipophilicity, and plasma unbound fraction) of each compound were used as the basic descriptors. Furthermore, a greedy algorithm was used to select 3 additional physicochemical descriptors from 731 available descriptors. In addition, because transporter nonsubstrates tend not to be in the public domain, we tried to compile an expert-curated data set of putative nonsubstrates for each transporter using the personal opinions of 11 researchers in the field of drug transporters. The best prediction was finally achieved by a support vector machine based on 4 basic and 3 additional descriptors. The model correctly judged that 364 of 412 compounds (internal data set) and 111 of 136 compounds (external data set) were substrates, indicating that this model performs well enough to predict the specificity of transporter substrates.

  13. A physical approach to protein structure prediction: CASP4 results

    SciTech Connect

    Crivelli, Silvia; Eskow, Elizabeth; Bader, Brett; Lamberti, Vincent; Byrd, Richard; Schnabel, Robert; Head-Gordon, Teresa

    2001-02-27

    We describe our global optimization method called Stochastic Perturbation with Soft Constraints (SPSC), which uses information from known proteins to predict secondary structure, but not in the tertiary structure predictions or in generating the terms of the physics-based energy function. Our approach is also characterized by the use of an all atom energy function that includes a novel hydrophobic solvation function derived from experiments that shows promising ability for energy discrimination against misfolded structures. We present the results obtained using our SPSC method and energy function for blind prediction in the 4th Critical Assessment of Techniques for Protein Structure Prediction (CASP4) competition, and show that our approach is more effective on targets for which less information from known proteins is available. In fact our SPSC method produced the best prediction for one of the most difficult targets of the competition, a new fold protein of 240 amino acids.

  14. Predicting complex mineral structures using genetic algorithms.

    PubMed

    Mohn, Chris E; Kob, Walter

    2015-10-28

    We show that symmetry-adapted genetic algorithms are capable of finding the ground state of a range of complex crystalline phases including layered- and incommensurate super-structures. This opens the way for the atomistic prediction of complex crystal structures of functional materials and mineral phases.

  15. Improving RNA secondary structure prediction with structure mapping data.

    PubMed

    Sloma, Michael F; Mathews, David H

    2015-01-01

    Methods to probe RNA secondary structure, such as small molecule modifying agents, secondary structure-specific nucleases, inline probing, and SHAPE chemistry, are widely used to study the structure of functional RNA. Computational secondary structure prediction programs can incorporate probing data to predict structure with high accuracy. In this chapter, an overview of current methods for probing RNA secondary structure is provided, including modern high-throughput methods. Methods for guiding secondary structure prediction algorithms using these data are explained, and best practices for using these data are provided. This chapter concludes by listing a number of open questions about how to best use probing data, and what these data can provide. © 2015 Elsevier Inc. All rights reserved.

  16. Predicting structure in nonsymmetric sparse matrix factorizations

    SciTech Connect

    Gilbert, J.R.; Ng, E.G.

    1992-10-01

    Many computations on sparse matrices have a phase that predicts the nonzero structure of the output, followed by a phase that actually performs the numerical computation. We study structure prediction for computations that involve nonsymmetric row and column permutations and nonsymmetric or non-square matrices. Our tools are bipartite graphs, matchings, and alternating paths. Our main new result concerns LU factorization with partial pivoting. We show that if a square matrix A has the strong Hall property (i.e., is fully indecomposable) then an upper bound due to George and Ng on the nonzero structure of L + U is as tight as possible. To show this, we prove a crucial result about alternating paths in strong Hall graphs. The alternating-paths theorem seems to be of independent interest: it can also be used to prove related results about structure prediction for QR factorization that are due to Coleman, Edenbrandt, Gilbert, Hare, Johnson, Olesky, Pothen, and van den Driessche.
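
    The two-phase idea in this abstract (predict the nonzero structure first, compute numbers second) can be illustrated with a minimal Python sketch of the symbolic phase. It makes simplifying assumptions that the paper does not: elimination runs without pivoting, every diagonal entry is structurally nonzero, and no numerical cancellation occurs. It is therefore not the George and Ng bound for partial pivoting; the function name `symbolic_lu_pattern` is hypothetical.

```python
def symbolic_lu_pattern(pattern, n):
    """Predict the nonzero positions of L + U for an n-by-n matrix.

    pattern: set of (row, col) pairs marking structural nonzeros.
    Assumptions (illustrative only): no pivoting, a structurally
    nonzero diagonal, and no numerical cancellation.
    """
    # Store the column indices of the nonzeros in each row.
    rows = [set() for _ in range(n)]
    for i, j in pattern:
        rows[i].add(j)

    for k in range(n):
        # Entries of pivot row k that lie to the right of the diagonal.
        pivot_tail = {j for j in rows[k] if j > k}
        for i in range(k + 1, n):
            if k in rows[i]:
                # Eliminating entry (i, k) adds a multiple of row k to
                # row i, so row i inherits the pivot row's tail (fill-in).
                rows[i] |= pivot_tail

    return {(i, j) for i in range(n) for j in rows[i]}


def arrow_pattern(n, dense_index):
    """Diagonal plus one dense row and column at position dense_index."""
    pattern = {(i, i) for i in range(n)}
    pattern |= {(dense_index, j) for j in range(n)}
    pattern |= {(i, dense_index) for i in range(n)}
    return pattern
```

    With these assumptions, an arrow matrix whose dense row and column come first fills in completely, while the same matrix with the dense row and column ordered last produces no fill, which shows why the structure is worth predicting before any arithmetic is done.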

  17. Predicting structure in nonsymmetric sparse matrix factorizations

    SciTech Connect

    Gilbert, J.R.; Ng, E.

    1991-12-31

    Many computations on sparse matrices have a phase that predicts the nonzero structure of the output, followed by a phase that actually performs the numerical computation. We study structure prediction for computations that involve nonsymmetric row and column permutations and nonsymmetric or non-square matrices. Our tools are bipartite graphs, matchings, and alternating paths. Our main new result concerns LU factorization with partial pivoting. We show that if a square matrix A has the strong Hall property (i.e., is fully indecomposable) then an upper bound due to George and Ng on the nonzero structure of L + U is as tight as possible. To show this, we prove a crucial result about alternating paths in strong Hall graphs. The alternating-paths theorem seems to be of independent interest: it can also be used to prove related results about structure prediction for QR factorization that are due to Coleman, Edenbrandt, Gilbert, Hare, Johnson, Olesky, Pothen, and van den Driessche.

  18. Predicting structure in nonsymmetric sparse matrix factorizations

    SciTech Connect

    Gilbert, J.R.; Ng, E.

    1991-01-01

    Many computations on sparse matrices have a phase that predicts the nonzero structure of the output, followed by a phase that actually performs the numerical computation. We study structure prediction for computations that involve nonsymmetric row and column permutations and nonsymmetric or non-square matrices. Our tools are bipartite graphs, matchings, and alternating paths. Our main new result concerns LU factorization with partial pivoting. We show that if a square matrix A has the strong Hall property (i.e., is fully indecomposable) then an upper bound due to George and Ng on the nonzero structure of L + U is as tight as possible. To show this, we prove a crucial result about alternating paths in strong Hall graphs. The alternating-paths theorem seems to be of independent interest: it can also be used to prove related results about structure prediction for QR factorization that are due to Coleman, Edenbrandt, Gilbert, Hare, Johnson, Olesky, Pothen, and van den Driessche.

  19. Predicting structure in nonsymmetric sparse matrix factorizations

    SciTech Connect

    Gilbert, J.R.; Ng, E.G.

    1992-10-01

    Many computations on sparse matrices have a phase that predicts the nonzero structure of the output, followed by a phase that actually performs the numerical computation. We study structure prediction for computations that involve nonsymmetric row and column permutations and nonsymmetric or non-square matrices. Our tools are bipartite graphs, matchings, and alternating paths. Our main new result concerns LU factorization with partial pivoting. We show that if a square matrix A has the strong Hall property (i.e., is fully indecomposable) then an upper bound due to George and Ng on the nonzero structure of L + U is as tight as possible. To show this, we prove a crucial result about alternating paths in strong Hall graphs. The alternating-paths theorem seems to be of independent interest: it can also be used to prove related results about structure prediction for QR factorization that are due to Coleman, Edenbrandt, Gilbert, Hare, Johnson, Olesky, Pothen, and van den Driessche.

  20. MSACompro: improving multiple protein sequence alignment by predicted structural features.

    PubMed

    Deng, Xin; Cheng, Jianlin

    2014-01-01

    Multiple Sequence Alignment (MSA) is an essential tool in protein structure modeling, gene and protein function prediction, DNA motif recognition, phylogenetic analysis, and many other bioinformatics tasks. Therefore, improving the accuracy of multiple sequence alignment is an important long-term objective in bioinformatics. We designed and developed a new method MSACompro to incorporate predicted secondary structure, relative solvent accessibility, and residue-residue contact information into the currently most accurate posterior probability-based MSA methods to improve the accuracy of multiple sequence alignments. Different from the multiple sequence alignment methods that use the tertiary structure information of some sequences, our method uses the structural information purely predicted from sequences. In this chapter, we first introduce some background and related techniques in the field of multiple sequence alignment. Then, we describe the detailed algorithm of MSACompro. Finally, we show that integrating predicted protein structural information improved the multiple sequence alignment accuracy.

  1. Tetrahedron-tiling method for crystal structure prediction

    NASA Astrophysics Data System (ADS)

    Hong, Qi-Jun; Yasi, Joseph; van de Walle, Axel

    2017-07-01

    Reliable and robust methods of predicting the crystal structure of a compound, based only on its chemical composition, are crucial to the study of materials and their applications. Despite considerable ongoing research efforts, crystal structure prediction remains a challenging problem that demands large computational resources. Here we propose an efficient approach for first-principles crystal structure prediction. The new method explores and finds crystal structures by tiling together elementary tetrahedra that are energetically favorable and geometrically matching each other. This approach has three distinguishing features: a favorable building unit, an efficient calculation of local energy, and a stochastic Monte Carlo simulation of crystal growth. By applying the method to the crystal structure prediction of various materials, we demonstrate its validity and potential as a promising alternative to current methods.

  2. Predicting Odor Perceptual Similarity from Odor Structure

    PubMed Central

    Weiss, Tali; Frumin, Idan; Khan, Rehan M.; Sobel, Noam

    2013-01-01

    To understand the brain mechanisms of olfaction we must understand the rules that govern the link between odorant structure and odorant perception. Natural odors are in fact mixtures made of many molecules, and there is currently no method to look at the molecular structure of such odorant-mixtures and predict their smell. In three separate experiments, we asked 139 subjects to rate the pairwise perceptual similarity of 64 odorant-mixtures ranging in size from 4 to 43 mono-molecular components. We then tested alternative models to link odorant-mixture structure to odorant-mixture perceptual similarity. Whereas a model that considered each mono-molecular component of a mixture separately provided a poor prediction of mixture similarity, a model that represented the mixture as a single structural vector provided consistent correlations between predicted and actual perceptual similarity (r≥0.49, p<0.001). An optimized version of this model yielded a correlation of r = 0.85 (p<0.001) between predicted and actual mixture similarity. In other words, we developed an algorithm that can look at the molecular structure of two novel odorant-mixtures, and predict their ensuing perceptual similarity. That this goal was attained using a model that considers the mixtures as a single vector is consistent with a synthetic rather than analytical brain processing mechanism in olfaction. PMID:24068899

  3. Lewis Structures Are Models for Predicting Molecular Structure, Not Electronic Structure

    NASA Astrophysics Data System (ADS)

    Purser, Gordon H.

    1999-07-01

    This article argues against a close relationship between Lewis dot structures and electron structure obtained from quantum mechanical calculations. Lewis structures are a powerful tool for structure prediction, though they are classical models of bonding and do not predict electronic structure. The "best" Lewis structures are those that, when combined with the VSEPR model, allow the accurate prediction of molecular properties, such as polarity, bond length, bond angle, and bond strength. These structures are achieved by minimizing formal charges within the molecule, even if it requires an expanded octet on atoms beyond the second period. Lewis structures that show an expanded octet do not imply full d-orbital involvement in the bonding. They suggest that the presence of low-lying d-orbitals is important in producing observed molecular structures. Based on this work, the presence of electron density, not a large separation in charge, is responsible for the short bond lengths and large angles in species containing nonmetal atoms from beyond the second period. This result contradicts results obtained from natural population analysis, a method that attempts to derive Lewis structures from molecular orbital calculations.

  4. Predicting polymeric crystal structures by evolutionary algorithms.

    PubMed

    Zhu, Qiang; Sharma, Vinit; Oganov, Artem R; Ramprasad, Ramamurthy

    2014-10-21

    The recently developed evolutionary algorithm USPEX proved to be a tool that enables accurate and reliable prediction of structures. Here we extend this method to predict the crystal structure of polymers by constrained evolutionary search, where each monomeric unit is treated as a building block with fixed connectivity. This greatly reduces the search space and allows the initial structure generation with different sequences and packings of these blocks. The new constrained evolutionary algorithm is successfully tested and validated on a diverse range of experimentally known polymers, namely, polyethylene, polyacetylene, poly(glycolic acid), poly(vinyl chloride), poly(oxymethylene), poly(phenylene oxide), and poly (p-phenylene sulfide). By fixing the orientation of polymeric chains, this method can be further extended to predict the structures of complex linear polymers, such as all polymorphs of poly(vinylidene fluoride), nylon-6 and cellulose. The excellent agreement between predicted crystal structures and experimentally known structures assures a major role of this approach in the efficient design of the future polymeric materials.

  5. Crystal structure prediction of rigid molecules.

    PubMed

    Elking, Dennis M; Fusti-Molnar, Laszlo; Nichols, Anthony

    2016-08-01

    A non-polarizable force field based on atomic multipoles fit to reproduce experimental crystal properties and ab initio gas-phase dimers is described. The Ewald method is used to calculate both long-range electrostatic and 1/r(6) dispersion energies of crystals. The dispersion energy of a crystal calculated by a cutoff method is shown to converge slowly to the exact Ewald result. A method for constraining space-group symmetry during unit-cell optimization is derived. Results for locally optimizing 4427 unit cells including volume, cell parameters, unit-cell r.m.s.d. and CPU timings are given for both flexible and rigid molecule optimization. An algorithm for randomly generating rigid molecule crystals is described. Using the correct experimentally determined space group, the average and maximum number of random crystals needed to find the correct experimental structure is given for 2440 rigid single component crystals. The force field energy rank of the correct experimental structure is presented for the same set of 2440 rigid single component crystals assuming the correct space group. A complete crystal prediction is performed for two rigid molecules by searching over the 32 most probable space groups.

  6. A proposed architecture for the central domain of the bacterial enhancer-binding proteins based on secondary structure prediction and fold recognition.

    PubMed Central

    Osuna, J.; Soberón, X.; Morett, E.

    1997-01-01

    The expression of genes transcribed by the RNA polymerase with the alternative sigma factor sigma 54 (E sigma 54) is absolutely dependent on activator proteins that bind to enhancer-like sites, located far upstream from the promoter. These unique prokaryotic proteins, known as enhancer-binding proteins (EBP), mediate open promoter complex formation in a reaction dependent on NTP hydrolysis. The best characterized proteins of this family of regulators are NtrC and NifA, which activate genes required for ammonia assimilation and nitrogen fixation, respectively. In a recent IRBM course ("Frontiers of protein structure prediction," IRBM, Pomezia, Italy, 1995; see web site http://www.mrc-cpe.cam.uk/irbm-course95/), one of us (J.O.) participated in the elaboration of the proposal that the Central domain of the EBPs might adopt the classical mononucleotide-binding fold. This suggestion was based on the results of a new protein fold recognition algorithm (Map) and on the mapping of correlated mutations calculated for the sequence family on the same mononucleotide-binding fold topology. In this work, we present new data that support the previous conclusion. The results from a number of different secondary structure prediction programs suggest that the Central domain could adopt an alpha/beta topology. The fold recognition programs ProFIT 0.9, 3D PROFILE combined with secondary structure prediction, and 123D suggest a mononucleotide-binding fold topology for the Central domain amino acid sequence. Finally, and most importantly, three of five reported residue alterations that impair the Central domain ATPase activity of the E sigma 54 activators are mapped to polypeptide regions that might be playing equivalent roles as those involved in nucleotide-binding in the mononucleotide-binding proteins. Furthermore, the known residue substitutions that alter the function of the E sigma 54 activators, leaving intact the Central domain ATPase activity, are mapped on region proposed to

  7. Near-edge band structures and band gaps of Cu-based semiconductors predicted by the modified Becke-Johnson potential plus an on-site Coulomb U

    SciTech Connect

    Zhang, Yubo; Zhang, Jiawei; Wang, Youwei; Gao, Weiwei; Abtew, Tesfaye A.; Zhang, Peihong; Zhang, Wenqing

    2013-11-14

    Diamond-like Cu-based multinary semiconductors are a rich family of materials that hold promise in a wide range of applications. Unfortunately, accurate theoretical understanding of the electronic properties of these materials is hindered by the involvement of Cu d electrons. Density functional theory (DFT) based calculations using the local density approximation or generalized gradient approximation often give qualitatively wrong electronic properties of these materials, especially for narrow-gap systems. The modified Becke-Johnson (mBJ) method has been shown to be a promising alternative to more elaborate theory such as the GW approximation for fast materials screening and predictions. However, straightforward applications of the mBJ method to these materials still encounter significant difficulties because of the insufficient treatment of the localized d electrons. We show that combining the promise of mBJ potential and the spirit of the well-established DFT + U method leads to a much improved description of the electronic structures, including the most challenging narrow-gap systems. A survey of the band gaps of about 20 Cu-based semiconductors calculated using the mBJ + U method shows that the results agree with reliable values to within ±0.2 eV.

  8. Protein Structure Prediction with Evolutionary Algorithms

    SciTech Connect

    Hart, W.E.; Krasnogor, N.; Pelta, D.A.; Smith, J.

    1999-02-08

    Evolutionary algorithms have been successfully applied to a variety of molecular structure prediction problems. In this paper we reconsider the design of genetic algorithms that have been applied to a simple protein structure prediction problem. Our analysis considers the impact of several algorithmic factors for this problem: the conformational representation, the energy formulation and the way in which infeasible conformations are penalized. Further, we empirically evaluated the impact of these factors on a small set of polymer sequences. Our analysis leads to specific recommendations for both GAs as well as other heuristic methods for solving PSP on the HP model.
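
    The three factors named in this abstract (conformational representation, energy formulation, and penalization of infeasible conformations) can be made concrete with a minimal Python sketch of an HP-model fitness function on a 2D square lattice. The absolute-move encoding (U, D, L, R), the function name `hp_energy`, and the default `overlap_penalty` value are illustrative assumptions and are not taken from the paper.

```python
# Absolute-move encoding on a 2D square lattice (an assumed representation).
MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}


def hp_energy(sequence, moves, overlap_penalty=10):
    """Energy of a lattice conformation in the HP model.

    sequence: string of 'H' (hydrophobic) and 'P' (polar) residues.
    moves: string of len(sequence) - 1 lattice steps from MOVES.
    Each H-H contact between residues that are not chain neighbours
    contributes -1; each residue placed on an already occupied site
    contributes +overlap_penalty (the infeasibility penalty).
    """
    if len(moves) != len(sequence) - 1:
        raise ValueError("need exactly one move per bond")

    # Decode the move string into lattice coordinates.
    x, y = 0, 0
    coords = [(x, y)]
    for move in moves:
        dx, dy = MOVES[move]
        x, y = x + dx, y + dy
        coords.append((x, y))

    # Record the first residue on each site and count later collisions.
    occupied = {}
    collisions = 0
    for index, site in enumerate(coords):
        if site in occupied:
            collisions += 1
        else:
            occupied[site] = index

    # Count H-H contacts once each by looking only right and up.
    contacts = 0
    for index, (cx, cy) in enumerate(coords):
        if sequence[index] != "H" or occupied[(cx, cy)] != index:
            continue
        for dx, dy in ((1, 0), (0, 1)):
            other = occupied.get((cx + dx, cy + dy))
            if other is not None and sequence[other] == "H" and abs(other - index) > 1:
                contacts += 1

    return overlap_penalty * collisions - contacts
```

    A genetic algorithm would minimize this value over move strings; raising `overlap_penalty` makes self-intersecting conformations less competitive, which is one way to express the penalization choice the abstract discusses.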

  9. Protein Structure Prediction by Protein Threading

    NASA Astrophysics Data System (ADS)

    Xu, Ying; Liu, Zhijie; Cai, Liming; Xu, Dong

    The seminal work of Bowie, Lüthy, and Eisenberg (Bowie et al., 1991) on "the inverse protein folding problem" laid the foundation of protein structure prediction by protein threading. By using simple measures for fitness of different amino acid types to local structural environments defined in terms of solvent accessibility and protein secondary structure, the authors derived a simple and yet profoundly novel approach to assessing if a protein sequence fits well with a given protein structural fold. Their follow-up work (Elofsson et al., 1996; Fischer and Eisenberg, 1996; Fischer et al., 1996a,b) and the work by Jones, Taylor, and Thornton (Jones et al., 1992) on protein fold recognition led to the development of a new brand of powerful tools for protein structure prediction, which we now term "protein threading." These computational tools have played a key role in extending the utility of all the experimentally solved structures by X-ray crystallography and nuclear magnetic resonance (NMR), providing structural models and functional predictions for many of the proteins encoded in the hundreds of genomes that have been sequenced up to now.

  10. Structural class prediction of protein using novel feature extraction method from chaos game representation of predicted secondary structure.

    PubMed

    Zhang, Lichao; Kong, Liang; Han, Xiaodong; Lv, Jinfeng

    2016-07-07

    Protein structural class prediction plays an important role in protein structure and function analysis, drug design and many other biological applications. Extracting good representation from protein sequence is fundamental for this prediction task. In recent years, although several secondary structure based feature extraction strategies have been specially proposed for low-similarity protein sequences, the prediction accuracy still remains limited. To explore the potential of secondary structure information, this study proposed a novel feature extraction method from the chaos game representation of predicted secondary structure to mainly capture sequence order information and secondary structure segments distribution information in a given protein sequence. Several kinds of prediction accuracies obtained by the jackknife test are reported on three widely used low-similarity benchmark datasets (25PDB, 1189 and 640). Compared with the state-of-the-art prediction methods, the proposed method achieves the highest overall accuracies on all the three datasets. The experimental results confirm that the proposed feature extraction method is effective for accurate prediction of protein structural class. Moreover, it is anticipated that the proposed method could be extended to other graphical representations of protein sequence and be helpful in future research. Copyright © 2016 Elsevier Ltd. All rights reserved.
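
    As a rough illustration of what a chaos game representation of a secondary structure string looks like, the following Python sketch maps a three-state string (H, E, C) to a trajectory of 2D points. The triangle vertex coordinates, the starting point, and the name `chaos_game` are assumptions chosen for simplicity; the features the authors extract from such a plot are not reproduced here.

```python
# One triangle vertex per secondary structure state (assumed coordinates).
VERTICES = {"H": (0.0, 0.0), "E": (1.0, 0.0), "C": (0.5, 1.0)}


def chaos_game(secondary_structure, start=(0.5, 0.5)):
    """Return the chaos game trajectory of a string over H, E and C.

    Each new point is the midpoint between the previous point and the
    vertex assigned to the current state, so the position of a point
    encodes the recent history of the sequence.
    """
    x, y = start
    points = []
    for state in secondary_structure:
        vx, vy = VERTICES[state]
        x, y = (x + vx) / 2, (y + vy) / 2
        points.append((x, y))
    return points
```

    Because every point depends on all preceding states, statistics of the point cloud carry both sequence order and segment distribution information, which is the property the abstract highlights.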

  11. WeFold: A Coopetition for Protein Structure Prediction

    PubMed Central

    Khoury, George A.; Liwo, Adam; Khatib, Firas; Zhou, Hongyi; Chopra, Gaurav; Bacardit, Jaume; Bortot, Leandro O.; Faccioli, Rodrigo A.; Deng, Xin; He, Yi; Krupa, Pawel; Li, Jilong; Mozolewska, Magdalena A.; Sieradzan, Adam K.; Smadbeck, James; Wirecki, Tomasz; Cooper, Seth; Flatten, Jeff; Xu, Kefan; Baker, David; Cheng, Jianlin; Delbem, Alexandre C. B.; Floudas, Christodoulos A.; Keasar, Chen; Levitt, Michael; Popović, Zoran; Scheraga, Harold A.; Skolnick, Jeffrey; Crivelli, Silvia N.; Players, Foldit

    2014-01-01

    The protein structure prediction problem continues to elude scientists. Despite the introduction of many methods, only modest gains were made over the last decade for certain classes of prediction targets. To address this challenge, a social-media based worldwide collaborative effort, named WeFold, was undertaken by thirteen labs. During the collaboration, the labs were simultaneously competing with each other. Here, we present the first attempt at “coopetition” in scientific research applied to the protein structure prediction and refinement problems. The coopetition was possible by allowing the participating labs to contribute different components of their protein structure prediction pipelines and create new hybrid pipelines that they tested during CASP10. This manuscript describes both successes and areas needing improvement as identified throughout the first WeFold experiment and discusses the efforts that are underway to advance this initiative. A footprint of all contributions and structures are publicly accessible at http://www.wefold.org. PMID:24677212

  12. WeFold: a coopetition for protein structure prediction.

    PubMed

    Khoury, George A; Liwo, Adam; Khatib, Firas; Zhou, Hongyi; Chopra, Gaurav; Bacardit, Jaume; Bortot, Leandro O; Faccioli, Rodrigo A; Deng, Xin; He, Yi; Krupa, Pawel; Li, Jilong; Mozolewska, Magdalena A; Sieradzan, Adam K; Smadbeck, James; Wirecki, Tomasz; Cooper, Seth; Flatten, Jeff; Xu, Kefan; Baker, David; Cheng, Jianlin; Delbem, Alexandre C B; Floudas, Christodoulos A; Keasar, Chen; Levitt, Michael; Popović, Zoran; Scheraga, Harold A; Skolnick, Jeffrey; Crivelli, Silvia N

    2014-09-01

    The protein structure prediction problem continues to elude scientists. Despite the introduction of many methods, only modest gains were made over the last decade for certain classes of prediction targets. To address this challenge, a social-media based worldwide collaborative effort, named WeFold, was undertaken by 13 labs. During the collaboration, the laboratories were simultaneously competing with each other. Here, we present the first attempt at "coopetition" in scientific research applied to the protein structure prediction and refinement problems. The coopetition was possible by allowing the participating labs to contribute different components of their protein structure prediction pipelines and create new hybrid pipelines that they tested during CASP10. This manuscript describes both successes and areas needing improvement as identified throughout the first WeFold experiment and discusses the efforts that are underway to advance this initiative. A footprint of all contributions and structures are publicly accessible at http://www.wefold.org. © 2014 Wiley Periodicals, Inc.

  13. Status of research aimed at predicting structural integrity

    SciTech Connect

    Reuter, W.G.

    1997-12-31

    Considerable research has been performed throughout the world on measuring the fracture toughness of metals. The existing capability fills the need encountered when selecting materials, thermal-mechanical treatments, welding procedures, etc., but cannot predict the fracture process of structural components containing cracks. The Idaho National Engineering and Environmental Laboratory and the Massachusetts Institute of Technology have been collaborating for a number of years on developing capabilities for using fracture toughness results to predict structural integrity. Because of the high cost of fabricating and testing structural components, these studies have been limited to predicting the fracture process in specimens containing surface cracks. This paper summarizes the present status of the experimental studies of using fracture toughness data to predict crack growth initiation in specimens (structural components) containing surface cracks. These results are limited to homogeneous base materials.

  14. A new protein structure representation for efficient protein function prediction.

    PubMed

    Maghawry, Huda A; Mostafa, Mostafa G M; Gharib, Tarek F

    2014-12-01

    One of the challenging problems in bioinformatics is the prediction of protein function. Protein function is the main key that can be used to classify different proteins. Protein function can be inferred experimentally with very small throughput or computationally with very high throughput. Computational methods are sequence based or structure based. Structure-based methods produce more accurate protein function prediction. In this article, we propose a new protein structure representation for efficient protein function prediction. The representation is based on three-dimensional patterns of protein residues. In the analysis, we used protein function based on enzyme activity through six mechanistically diverse enzyme superfamilies: amidohydrolase, crotonase, haloacid dehalogenase, isoprenoid synthase type I, and vicinal oxygen chelate. We applied three different classification methods, naïve Bayes, k-nearest neighbors, and random forest, to predict the enzyme superfamily of a given protein. The prediction accuracy using the proposed representation outperforms a recently introduced representation method that is based only on the distance patterns. The results show that the proposed representation achieved prediction accuracy up to 98%, with improvement of about 10% on average.

  15. Reduced ceria nanofilms from structure prediction.

    PubMed

    Kozlov, Sergey M; Demiroglu, Ilker; Neyman, Konstantin M; Bromley, Stefan T

    2015-03-14

    Experimentally, Ce2O3 films are used to study cerium oxide in its fully or partially reduced state, as present in many applications. We have explored the space of low energy Ce2O3 nanofilms using structure prediction and density functional calculations, yielding more than 30 distinct nanofilm structures. First, our results help to rationalize the roles of thermodynamics and kinetics in the preparation of reduced ceria nanofilms with different bulk crystalline structures (e.g. A-type or bixbyite) depending on the support used. Second, we predict a novel, as yet experimentally unresolved, nanofilm which has a structure that does not correspond to any previously reported bulk A2B3 phase and which has an energetic stability between that of A-type and bixbyite. To assist identification and fabrication of this new Ce2O3 nanofilm we calculate some observable properties and propose supports for its epitaxial growth.

  16. Cascaded multiple classifiers for secondary structure prediction.

    PubMed Central

    Ouali, M.; King, R. D.

    2000-01-01

    We describe a new classifier for protein secondary structure prediction that is formed by cascading together different types of classifiers using neural networks and linear discrimination. The new classifier achieves an accuracy of 76.7% (assessed by a rigorous full Jack-knife procedure) on a new nonredundant dataset of 496 nonhomologous sequences (obtained from G.J. Barton and J.A. Cuff). This database was especially designed to train and test protein secondary structure prediction methods, and it uses a more stringent definition of homologous sequence than in previous studies. We show that it is possible to design classifiers that can highly discriminate the three classes (H, E, C) with an accuracy of up to 78% for beta-strands, using only a local window and resampling techniques. This indicates that the importance of long-range interactions for the prediction of beta-strands has probably been overestimated in previous work. PMID:10892809

  17. Predictive models based on Support Vector Machines: whole-brain versus regional analysis of structural MRI in the Alzheimer’s disease

    PubMed Central

    Retico, A.; Bosco, P.; Cerello, P.; Fiorina, E.; Chincarini, A.; Fantacci, M.E.

    2014-01-01

    Decision-making systems trained on structural Magnetic Resonance Imaging (MRI) data of subjects affected by the Alzheimer’s disease (AD) and healthy controls (CTRL) are becoming widespread prognostic tools for subjects with Mild Cognitive Impairment (MCI). This study compares the performance of three classification methods based on Support Vector Machines (SVMs), using as initial sets of brain voxels (i.e. features): 1) the segmented grey matter (GM); 2) regions of interest (ROIs) by voxel-wise t-test filtering; 3) parceled ROIs, according to prior knowledge. The recursive feature elimination (RFE) is applied in all cases in order to investigate whether feature reduction improves the classification accuracy. We analyzed more than 600 ADNI subjects, training the SVMs on the AD/CTRL dataset, and evaluating them on a trial MCI dataset. The classification performance, evaluated as the Area Under the Receiver Operating Characteristic (ROC) Curve (AUC), reaches AUC=(88.9±0.5)% in 20-fold cross-validation on the AD/CTRL dataset, when the GM is classified as a whole. The highest discrimination accuracy between MCI converters and non-converters is achieved when the SVM-RFE is applied to the whole GM: with AUC reaching (70.7±0.9)%, it outperforms both ROI-based approaches in predicting the AD conversion. PMID:25291354
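
    The AUC figure of merit quoted in this abstract can be illustrated with a minimal sketch. It assumes binary labels (1 = AD, 0 = CTRL) and uses the rank-sum identity, in which the AUC is the fraction of positive/negative pairs that the classifier orders correctly. The function name and the toy decision values are hypothetical and are not taken from the study.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Count pairs in which the positive outranks the negative; ties count one half.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy decision values (hypothetical, not ADNI data).
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
auc = roc_auc(scores, labels)
```

    In a cross-validated setting such as the one described, this quantity would be computed on each held-out fold and then averaged.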

  18. Predicting continuous local structure and the effect of its substitution for secondary structure in fragment-free protein structure prediction.

    PubMed

    Faraggi, Eshel; Yang, Yuedong; Zhang, Shesheng; Zhou, Yaoqi

    2009-11-11

    Local structures predicted from protein sequences are used extensively in every aspect of modeling and prediction of protein structure and function. For more than 50 years, they have been predicted at a low-resolution coarse-grained level (e.g., three-state secondary structure). Here, we combine a two-state classifier with real-value predictor to predict local structure in continuous representation by backbone torsion angles. The accuracy of the angles predicted by this approach is close to that derived from NMR chemical shifts. Their substitution for predicted secondary structure as restraints for ab initio structure prediction doubles the success rate. This result demonstrates the potential of predicted local structure for fragment-free tertiary-structure prediction. It further implies potentially significant benefits from using predicted real-valued torsion angles as a replacement for or supplement to the secondary-structure prediction tools used almost exclusively in many computational methods ranging from sequence alignment to function prediction.

  19. Three-dimensional quantitative structure-activity relationship analysis for human pregnane X receptor for the prediction of CYP3A4 induction in human hepatocytes: structure-based comparative molecular field analysis.

    PubMed

    Handa, Koichi; Nakagome, Izumi; Yamaotsu, Noriyuki; Gouda, Hiroaki; Hirono, Shuichi

    2015-01-01

    The pregnane X receptor [PXR (NR1I2)] induces the expression of xenobiotic metabolic genes and transporter genes. In this study, we aimed to establish a computational method for quantifying the enzyme-inducing potencies of different compounds via their ability to activate PXR, for the application in drug discovery and development. To achieve this purpose, we developed a three-dimensional quantitative structure-activity relationship (3D-QSAR) model using comparative molecular field analysis (CoMFA) for predicting enzyme-inducing potencies, based on computer-ligand docking to multiple PXR protein structures sampled from the trajectory of a molecular dynamics simulation. Molecular mechanics-generalized born/surface area scores representing the ligand-protein-binding free energies were calculated for each ligand. As a result, the predicted enzyme-inducing potencies for compounds generated by the CoMFA model were in good agreement with the experimental values. Finally, we concluded that this 3D-QSAR model has the potential to predict the enzyme-inducing potencies of novel compounds with high precision and therefore has valuable applications in the early stages of the drug discovery process.

  20. Protein structure prediction enhanced with evolutionary diversity : SPEED.

    SciTech Connect

    DeBartolo, J.; Hocky, G.; Wilde, M.; Xu, J.; Freed, K. F.; Sosnick, T. R.; Univ. of Chicago; Toyota Technological Inst. at Chicago

    2010-03-01

    For naturally occurring proteins, similar sequence implies similar structure. Consequently, multiple sequence alignments (MSAs) often are used in template-based modeling of protein structure and have been incorporated into fragment-based assembly methods. Our previous homology-free structure prediction study introduced an algorithm that mimics the folding pathway by coupling the formation of secondary and tertiary structure. Moves in the Monte Carlo procedure involve only a change in a single pair of φ,ψ backbone dihedral angles that are obtained from a Protein Data Bank-based distribution appropriate for each amino acid, conditional on the type and conformation of the flanking residues. We improve this method by using MSAs to enrich the sampling distribution, but in a manner that does not require structural knowledge of any protein sequence (i.e., not homologous fragment insertion). In combination with other tools, including clustering and refinement, the accuracies of the predicted secondary and tertiary structures are substantially improved and a global and position-resolved measure of confidence is introduced for the accuracy of the predictions. Performance of the method in the Critical Assessment of Structure Prediction (CASP8) is discussed.
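
    The single-pair move described in this abstract can be sketched as follows. The abstract states only that each move changes one pair of backbone dihedral angles drawn from a PDB-based distribution; the Metropolis acceptance rule, the three-entry angle library, and the toy energy below are assumptions made for illustration and are not the authors' method.

```python
import math
import random

# Hypothetical (phi, psi) library in degrees. A real library would be specific to
# each amino acid and conditioned on the flanking residues.
LIBRARY = [(-60.0, -45.0), (-120.0, 130.0), (-75.0, 150.0)]


def toy_energy(angles):
    """Placeholder energy: squared distance of each pair from a helical target."""
    return sum((phi + 60.0) ** 2 + (psi + 45.0) ** 2 for phi, psi in angles) / 1000.0


def mc_step(angles, energy, kT, rng):
    """One Monte Carlo move that resamples the dihedral pair of a single residue."""
    i = rng.randrange(len(angles))
    trial = list(angles)
    trial[i] = rng.choice(LIBRARY)
    delta = energy(trial) - energy(angles)
    # Metropolis criterion (assumed): always accept downhill, sometimes accept uphill.
    if delta <= 0.0 or rng.random() < math.exp(-delta / kT):
        return trial
    return list(angles)


rng = random.Random(0)
chain = [(-120.0, 130.0)] * 5
for _ in range(200):
    chain = mc_step(chain, toy_energy, kT=1.0, rng=rng)
```

    Enriching the sampling distribution with MSA information, as the abstract proposes, would correspond here to changing how the trial pair is drawn rather than changing the move itself.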

  1. Identification of family-specific residue packing motifs and their use for structure-based protein function prediction: II. Case studies and applications.

    PubMed

    Bandyopadhyay, Deepak; Huan, Jun; Prins, Jan; Snoeyink, Jack; Wang, Wei; Tropsha, Alexander

    2009-11-01

    This paper describes several case studies concerning protein function inference from its structure using our novel approach described in the accompanying paper. This approach employs family-specific motifs, i.e. three-dimensional amino acid packing patterns that are statistically prevalent within a protein family. For our case studies we have selected families from the SCOP and EC classifications and analyzed the discriminating power of the motifs in depth. We have devised several benchmarks to compare motifs mined from unweighted topological graph representations of protein structures with those from distance-labeled (weighted) representations, demonstrating the superiority of the latter for function inference in most families. We have tested the robustness of our motif library by inferring the function of new members added to SCOP families, and discriminating between several families that are structurally similar but functionally divergent. Furthermore we have applied our method to predict function for several proteins characterized in structural genomics projects, including orphan structures, and we discuss several selected predictions in depth. Some of our predictions have been corroborated by other computational methods, and some have been validated by independent experimental studies, validating our approach for protein function inference from structure.

  2. Bankruptcy Prediction with Interfirm Network Structure

    NASA Astrophysics Data System (ADS)

    Kamei, Hideto; Takayasu, Hideki; Kabashima, Yoshiyuki; Takayasu, Misako

    This study examines bankruptcy in terms of financial variables as well as interfirm network structure variables. We first binarize the variables by introducing a threshold and then select the appropriate set of variables that minimize the p-value in Fisher's exact test. Here, the financial variables related to borrowing and savings are strongly correlated with bankruptcy, but the variables of trade network and capital network, including chain bankruptcy effect, have weaker yet significant correlations. Finally, we perform a bankruptcy prediction with the selected variables and a second-order Ising model and confirm that the Ising model has relatively higher predictive power than the logit model.
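
    The binarize-then-test step described in this abstract can be sketched as follows. The sketch assumes a one-sided Fisher's exact test on the 2x2 table of (variable above threshold) versus (bankrupt), and it selects the threshold with the smallest p-value. The function names, the test direction, and the toy data are hypothetical.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Probability, under the hypergeometric null with fixed margins, of observing
    at least `a` counts in the top-left cell.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1))
    return tail / comb(n, col1)


def best_threshold(values, bankrupt, candidates):
    """Return the (threshold, p_value) pair that minimizes the p-value."""
    best = None
    for t in candidates:
        a = sum(1 for v, y in zip(values, bankrupt) if v > t and y)
        b = sum(1 for v, y in zip(values, bankrupt) if v > t and not y)
        c = sum(1 for v, y in zip(values, bankrupt) if v <= t and y)
        d = sum(1 for v, y in zip(values, bankrupt) if v <= t and not y)
        p = fisher_exact_greater(a, b, c, d)
        if best is None or p < best[1]:
            best = (t, p)
    return best


# Toy data: a hypothetical borrowing ratio for eight firms and their bankruptcy flags.
values = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
bankrupt = [0, 0, 0, 0, 1, 1, 1, 1]
best = best_threshold(values, bankrupt, [0.25, 0.5, 0.75])
```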

  3. Predicting PDZ domain mediated protein interactions from structure

    PubMed Central

    2013-01-01

    Background: PDZ domains are structural protein domains that recognize simple linear amino acid motifs, often at protein C-termini, and mediate protein-protein interactions (PPIs) in important biological processes, such as ion channel regulation, cell polarity and neural development. PDZ domain-peptide interaction predictors have been developed based on domain and peptide sequence information. Since domain structure is known to influence binding specificity, we hypothesized that structural information could be used to predict new interactions compared to sequence-based predictors. Results: We developed a novel computational predictor of PDZ domain and C-terminal peptide interactions using a support vector machine trained with PDZ domain structure and peptide sequence information. Performance was estimated using extensive cross validation testing. We used the structure-based predictor to scan the human proteome for ligands of 218 PDZ domains and show that the predictions correspond to known PDZ domain-peptide interactions and PPIs in curated databases. The structure-based predictor is complementary to the sequence-based predictor, finding unique known and novel PPIs, and is less dependent on training–testing domain sequence similarity. We used a functional enrichment analysis of our hits to create a predicted map of PDZ domain biology. This map highlights PDZ domain involvement in diverse biological processes, some only found by the structure-based predictor. Based on this analysis, we predict novel PDZ domain involvement in xenobiotic metabolism and suggest new interactions for other processes including wound healing and Wnt signalling. Conclusions: We built a structure-based predictor of PDZ domain-peptide interactions, which can be used to scan C-terminal proteomes for PDZ interactions. We also show that the structure-based predictor finds many known PDZ mediated PPIs in human that were not found by our previous sequence-based predictor and is less dependent on

  4. Towards cheminformatics-based estimation of drug therapeutic index: Predicting the protective index of anticonvulsants using a new quantitative structure-index relationship approach.

    PubMed

    Chen, Shangying; Zhang, Peng; Liu, Xin; Qin, Chu; Tao, Lin; Zhang, Cheng; Yang, Sheng Yong; Chen, Yu Zong; Chui, Wai Keung

    2016-06-01

    The overall efficacy and safety profile of a new drug is partially evaluated by the therapeutic index in clinical studies and by the protective index (PI) in preclinical studies. In-silico predictive methods may facilitate the assessment of these indicators. Although QSAR and QSTR models can be used for predicting PI, their predictive capability has not been evaluated. To test this capability, we developed QSAR and QSTR models for predicting the activity and toxicity of anticonvulsants at accuracy levels above the literature-reported threshold (LT) of good QSAR models as tested by both the internal 5-fold cross validation and external validation method. These models showed significantly compromised PI predictive capability due to the cumulative errors of the QSAR and QSTR models. Therefore, in this investigation a new quantitative structure-index relationship (QSIR) model was devised and it showed improved PI predictive capability that superseded the LT of good QSAR models. The QSAR, QSTR and QSIR models were developed using support vector regression (SVR) method with the parameters optimized by using the greedy search method. The molecular descriptors relevant to the prediction of anticonvulsant activities, toxicities and PIs were analyzed by a recursive feature elimination method. The selected molecular descriptors are primarily associated with the drug-like, pharmacological and toxicological features and those used in the published anticonvulsant QSAR and QSTR models. This study suggested that QSIR is useful for estimating the therapeutic index of drug candidates.

  5. Predicting nucleic acid binding interfaces from structural models of proteins

    PubMed Central

    Dror, Iris; Shazman, Shula; Mukherjee, Srayanta; Zhang, Yang; Glaser, Fabian; Mandel-Gutfreund, Yael

    2011-01-01

    The function of DNA- and RNA-binding proteins can be inferred from the characterization and accurate prediction of their binding interfaces. However, the main pitfall of various structure-based methods for predicting nucleic acid binding function is that they are all limited to a relatively small number of proteins for which high-resolution three-dimensional structures are available. In this study, we developed a pipeline for extracting functional electrostatic patches from surfaces of protein structural models, obtained using the I-TASSER protein structure predictor. The largest positive patches are extracted from the protein surface using the patchfinder algorithm. We show that functional electrostatic patches extracted from an ensemble of structural models highly overlap the patches extracted from high-resolution structures. Furthermore, by testing our pipeline on a set of 55 known nucleic acid binding proteins for which I-TASSER produces high-quality models, we show that the method accurately identifies the nucleic acid binding interface on structural models of proteins. Employing a combined patch approach, we show that patches extracted from an ensemble of models better predict the real nucleic acid binding interfaces compared to patches extracted from independent models. Overall, these results suggest that combining information from a collection of low-resolution structural models could be a valuable approach for functional annotation. We suggest that our method will be further applicable for predicting other functional surfaces of proteins with unknown structure. PMID:22086767

  6. Predicting nucleic acid binding interfaces from structural models of proteins.

    PubMed

    Dror, Iris; Shazman, Shula; Mukherjee, Srayanta; Zhang, Yang; Glaser, Fabian; Mandel-Gutfreund, Yael

    2012-02-01

    The function of DNA- and RNA-binding proteins can be inferred from the characterization and accurate prediction of their binding interfaces. However, the main pitfall of various structure-based methods for predicting nucleic acid binding function is that they are all limited to a relatively small number of proteins for which high-resolution three-dimensional structures are available. In this study, we developed a pipeline for extracting functional electrostatic patches from surfaces of protein structural models, obtained using the I-TASSER protein structure predictor. The largest positive patches are extracted from the protein surface using the patchfinder algorithm. We show that functional electrostatic patches extracted from an ensemble of structural models highly overlap the patches extracted from high-resolution structures. Furthermore, by testing our pipeline on a set of 55 known nucleic acid binding proteins for which I-TASSER produces high-quality models, we show that the method accurately identifies the nucleic acid binding interface on structural models of proteins. Employing a combined patch approach, we show that patches extracted from an ensemble of models better predict the real nucleic acid binding interfaces compared with patches extracted from independent models. Overall, these results suggest that combining information from a collection of low-resolution structural models could be a valuable approach for functional annotation. We suggest that our method will be further applicable for predicting other functional surfaces of proteins with unknown structure. Copyright © 2011 Wiley Periodicals, Inc.

  7. Data-Based Predictive Control with Multirate Prediction Step

    NASA Technical Reports Server (NTRS)

    Barlow, Jonathan S.

    2010-01-01

    Data-based predictive control is an emerging control method that stems from Model Predictive Control (MPC). MPC computes current control action based on a prediction of the system output a number of time steps into the future and is generally derived from a known model of the system. Data-based predictive control has the advantage of deriving predictive models and controller gains from input-output data. Thus, a controller can be designed from the outputs of complex simulation code or a physical system where no explicit model exists. If the output data happens to be corrupted by periodic disturbances, the designed controller will also have the built-in ability to reject these disturbances without the need to know them. When data-based predictive control is implemented online, it becomes a version of adaptive control. One challenge of MPC is computational requirements increasing with prediction horizon length. This paper develops a closed-loop dynamic output feedback controller that minimizes a multi-step-ahead receding-horizon cost function with multirate prediction step. One result is a reduced influence of prediction horizon and the number of system outputs on the computational requirements of the controller. Another result is an emphasis on portions of the prediction window that are sampled more frequently. A third result is the ability to include more outputs in the feedback path than in the cost function.
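
    The receding-horizon idea behind MPC, as summarized in this abstract, can be illustrated with a deliberately crude sketch. It assumes a known scalar model x[k+1] = a*x[k] + b*u[k], holds one candidate input constant over the prediction horizon, and picks the input by grid search. None of this reproduces the paper's data-based or multirate formulation; the model, cost weights, and function names are hypothetical.

```python
def predict_cost(x, u, a, b, horizon, r):
    """Predicted cost of holding input u for `horizon` steps from state x."""
    cost = r * u * u  # input penalty, counted once for the held input
    for _ in range(horizon):
        x = a * x + b * u
        cost += x * x  # penalize predicted deviation from the origin
    return cost


def mpc_action(x, a, b, horizon, r, candidates):
    """Choose the candidate input with the lowest predicted cost."""
    return min(candidates, key=lambda u: predict_cost(x, u, a, b, horizon, r))


def simulate(x0, steps, a=1.2, b=1.0, horizon=3, r=0.1):
    """Closed loop: re-plan at every step and apply only the first input."""
    candidates = [i / 10 for i in range(-20, 21)]
    x = x0
    history = [x]
    for _ in range(steps):
        u = mpc_action(x, a, b, horizon, r, candidates)
        x = a * x + b * u
        history.append(x)
    return history


history = simulate(1.0, 10)
```

    The open-loop model here is unstable (a = 1.2), so the decay of the state in `history` comes entirely from re-planning at each step. The cost of each plan grows with the horizon length, which is the computational burden the abstract refers to.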

  8. Ab initio prediction of the solution structures and populations of a cyclic pentapeptide in DMSO based on an implicit solvation model.

    PubMed

    Baysal, C; Meirovitch, H

    2000-04-15

    Using a recently developed statistical mechanics methodology, the solution structures and populations of the cyclic pentapeptide cyclo(D-Pro¹-Ala²-Ala³-Ala⁴-Ala⁵) in DMSO are obtained ab initio, i.e., without using experimental restraints. An important ingredient of this methodology is a novel optimization of implicit solvation parameters, which in our previous publication [Baysal, C.; Meirovitch, H. J Am Chem Soc 1998, 120, 800-812] has been applied to a cyclic hexapeptide in DMSO. The molecule has been described by the simplified energy function $E_{tot} = E_{GRO} + \sum_k \sigma_k A_k$, where $E_{GRO}$ is the GROMOS force-field energy, and $\sigma_k$ and $A_k$ are the atomic solvation parameter (ASP) and the solvent accessible surface area of atom $k$. This methodology, which relies on an extensive conformational search, Monte Carlo simulations, and free energy calculations, is applied here with $E_{tot}$ based on the ASPs derived in our previous work, and for comparison also with $E_{GRO}$ alone. For both models, entropy effects are found to be significant. For $E_{tot}$, the theoretical values of proton-proton distances and $^3J$ coupling constants agree very well with the NMR results [Mierke, D. F.; Kurz, M.; Kessler, H. J Am Chem Soc 1994, 116, 1042-1049], while the results for $E_{GRO}$ are significantly worse. This suggests that our ASPs might be transferrable to other cyclic peptides in DMSO as well, making our methodology a reliable tool for an ab initio structure prediction; obviously, if necessary, parts of this methodology can also be incorporated in a best-fit analysis where experimental restraints are used.
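
    The simplified energy function in this abstract, a force-field energy plus a sum over atoms of solvation parameter times solvent accessible surface area, can be written as a short sketch. The Boltzmann weighting of conformer free energies is a standard way to obtain populations and is included as an assumption about how such values would be used. All numerical values and function names are hypothetical.

```python
import math


def total_energy(e_gro, sigmas, areas):
    """E_tot = E_GRO + sum_k sigma_k * A_k for one conformation."""
    if len(sigmas) != len(areas):
        raise ValueError("one solvation parameter is required per atom")
    return e_gro + sum(s * a for s, a in zip(sigmas, areas))


def boltzmann_populations(free_energies, kT):
    """Relative populations p_i proportional to exp(-F_i / kT)."""
    f_min = min(free_energies)  # shift by the minimum for numerical stability
    weights = [math.exp(-(f - f_min) / kT) for f in free_energies]
    z = sum(weights)
    return [w / z for w in weights]


# Hypothetical two-atom example: the two solvation terms cancel exactly.
e_tot = total_energy(-10.0, [0.5, -0.2], [4.0, 10.0])
populations = boltzmann_populations([0.0, 0.0, 1.0], kT=0.6)
```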

  9. Prediction of Protein Structure Using Surface Accessibility Data.

    PubMed

    Hartlmüller, Christoph; Göbl, Christoph; Madl, Tobias

    2016-09-19

    An approach to the de novo structure prediction of proteins is described that relies on surface accessibility data from NMR paramagnetic relaxation enhancements by a soluble paramagnetic compound (sPRE). This method exploits the distance-to-surface information encoded in the sPRE data in the chemical shift-based CS-Rosetta de novo structure prediction framework to generate reliable structural models. For several proteins, it is demonstrated that surface accessibility data is an excellent measure of the correct protein fold in the early stages of the computational folding algorithm and significantly improves accuracy and convergence of the standard Rosetta structure prediction approach. © 2016 The Authors. Published by Wiley-VCH Verlag GmbH & Co. KGaA.

  10. Predicting Secondary Structural Folding Kinetics for Nucleic Acids

    PubMed Central

    Zhao, Peinan; Zhang, Wen-Bing; Chen, Shi-Jie

    2010-01-01

    We report a new computational approach to the prediction of RNA secondary structure folding kinetics. In this approach, each elementary kinetic step is represented as the transformation between two secondary structures that differ by a helix. Based on the free energy landscape analysis, we identify three types of dominant pathways and the rate constants for the kinetic steps: (1) formation of a helix stem; (2) disruption of a helix stem; and (3) helix formation with concomitant partial melting of a competing (incompatible) helix. The third pathway, termed the tunneling pathway, is the low-barrier dominant pathway for the conversion between two incompatible helices. Comparisons with experimental data indicate that this new method is quite reliable in predicting the kinetics for RNA secondary structural folding and structural rearrangements. The approach presented here may provide a robust first step for further systematic development of a predictive theory for the folding kinetics for large RNAs. PMID:20409482

  11. Structure Prediction of RNA Loops with a Probabilistic Approach

    PubMed Central

    Li, Jun; Zhang, Jian; Wang, Jun; Li, Wenfei; Wang, Wei

    2016-01-01

    The knowledge of the tertiary structure of RNA loops is important for understanding their functions. In this work we develop an efficient approach named RNApps, specifically designed for predicting the tertiary structure of RNA loops, including hairpin loops, internal loops, and multi-way junction loops. It includes a probabilistic coarse-grained RNA model, an all-atom statistical energy function, a sequential Monte Carlo growth algorithm, and a simulated annealing procedure. The approach is tested with a dataset including nine RNA loops, a 23S ribosomal RNA, and a large dataset containing 876 RNAs. The performance is evaluated and compared with a homology modeling based predictor and an ab initio predictor. It is found that RNApps has comparable performance with the former one and outdoes the latter in terms of structure predictions. The approach holds great promise for accurate and efficient RNA tertiary structure prediction. PMID:27494763

  12. Topological structure prediction in binary nanoparticle superlattices

    DOE PAGES

    Travesset, A.

    2017-04-27

    Systems of spherical nanoparticles with capping ligands have been shown to self-assemble into beautiful superlattices of fascinating structure and complexity. Here, I show that the spherical geometry of the nanoparticle imposes constraints on the nature of the topological defects associated with the capping ligand and that such topological defects control the structure and stability of the superlattices that can be assembled. Furthermore, all of these considerations form the basis for the orbifold topological model (OTM) described in this paper. Finally, the model quantitatively predicts the structure of super-lattices where capping ligands are hydrocarbon chains in excellent agreement with experimental results, explains the appearance of low packing fraction lattices as equilibrium, why certain similar structures are more stable (bccAB6 vs. CaB6, AuCu vs. CsCl, etc.) and many other experimental observations.

  13. Reduced ceria nanofilms from structure prediction

    NASA Astrophysics Data System (ADS)

    Kozlov, Sergey M.; Demiroglu, Ilker; Neyman, Konstantin M.; Bromley, Stefan T.

    2015-02-01

    Experimentally, Ce2O3 films are used to study cerium oxide in its fully or partially reduced state, as present in many applications. We have explored the space of low energy Ce2O3 nanofilms using structure prediction and density functional calculations, yielding more than 30 distinct nanofilm structures. First, our results help to rationalize the roles of thermodynamics and kinetics in the preparation of reduced ceria nanofilms with different bulk crystalline structures (e.g. A-type or bixbyite) depending on the support used. Second, we predict a novel, as yet experimentally unresolved, nanofilm which has a structure that does not correspond to any previously reported bulk A2B3 phase and which has an energetic stability between that of A-type and bixbyite. To assist identification and fabrication of this new Ce2O3 nanofilm we calculate some observable properties and propose supports for its epitaxial growth. Electronic supplementary information (ESI) available: Graph of IP versus DFT relative energies for nanofilms, GGA + U calculated lattice parameters and

  14. Protein Structure Prediction Using String Kernels

    DTIC Science & Technology

    2006-03-03

    evaluated using the sets of sequences obtained from the SCOP database [39]. The SCOP database is a manually curated protein structure database assigning...proteins into hierarchically defined classes. The fold prediction problem in the context of SCOP can be defined as assigning a protein sequence to its...above techniques, remote homology detection is simulated by formulating it as a superfamily classification problem within the context of the SCOP database

  15. A Software Pipeline for Protein Structure Prediction

    DTIC Science & Technology

    2006-11-01

    distant relationships between the domain sequence and a library of thousands of protein fold templates derived from the SCOP 1.69 database (Andreeva...from the SCOP 1.69 database (Andreeva, Howorth et al. 2004) and a list of PDB sequences that have low sequence similarity to every other sequence...protein as defined by SCOP . 4. DISCUSSION Our assessment of the capabilities of the protein structure-prediction suite is consistent with other

  16. Structure-Based Virtual Screening.

    PubMed

    Li, Qingliang; Shah, Salim

    2017-01-01

    Structure-based virtual screening (SBVS) is a computational approach used in the early-stage drug discovery campaign to search a chemical compound library for novel bioactive molecules against a certain drug target. It utilizes the three-dimensional (3D) structure of the biological target, obtained from X-ray, NMR, or computational modeling, to dock a collection of chemical compounds into the binding site and select a subset of these compounds based on the predicted binding scores for further biological evaluation. In the present work, we illustrate the basic process of conducting a SBVS with examples using freely accessible tools and resources.
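
    The final selection step described in this abstract, keeping a subset of docked compounds based on predicted binding scores, can be sketched as a simple ranking. The sketch assumes that lower (more negative) scores indicate stronger predicted binding, which is a common docking convention but is not stated in the abstract. The compound identifiers and scores are hypothetical.

```python
import heapq


def select_hits(scores, top_n):
    """Return the top_n (compound, score) pairs with the lowest docking scores."""
    return heapq.nsmallest(top_n, scores.items(), key=lambda item: item[1])


# Hypothetical docking scores in kcal/mol for four library compounds.
docking_scores = {"cmpd_a": -7.2, "cmpd_b": -9.1, "cmpd_c": -5.4, "cmpd_d": -8.3}
hits = select_hits(docking_scores, top_n=2)
```

    In practice the retained subset would then go forward to biological evaluation, as the abstract notes.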

  17. Toward the Prediction of Organic Hydrate Crystal Structures.

    PubMed

    Hulme, Ashley T; Price, Sarah L

    2007-07-01

    Lattice energy minimization studies on four ordered crystal structures of ice and 22 hydrates of approximately rigid organic molecules (along with 11 corresponding anhydrate structures) were used to establish a model potential scheme, based on the use of a distributed multipole electrostatic model, that can reasonably reproduce the crystal structures. Transferring the empirical repulsion-dispersion potentials for organic oxygen and polar hydrogen atoms to water appears more successful for modeling ice phases than using common water potentials derived from liquid properties. Lattice energy differences are reasonable but quite sensitive to the exact conformation of water and the organic molecule used in the rigid molecule modeling. This potential scheme was used to test a new approach of predicting the crystal structure of 5-azauracil monohydrate (an isolated site hydrate) based on seeking dense crystal packings of 66 5-azauracil···water hydrogen-bonded clusters, derived from an analysis of hydrate hydrogen bond geometries involving the carbonyl- and aza-group acceptors in the Cambridge Structural Database. The known structure was found within 5 kJ mol⁻¹ of the global minimum in static lattice energy and as the third most stable structure, within 1 kJ mol⁻¹, when thermal effects at ambient temperature were considered. Thus, although the computational prediction of whether an organic molecule will crystallize in a hydrated form poses many challenges, the prediction of plausible structures for hydrogen-bonded monohydrates is now possible.

  18. Contingency Table Browser − prediction of early stage protein structure

    PubMed Central

    Kalinowska, Barbara; Krzykalski, Artur; Roterman, Irena

    2015-01-01

    The Early Stage (ES) intermediate represents the starting structure in protein folding simulations based on the Fuzzy Oil Drop (FOD) model. The accuracy of FOD predictions is greatly dependent on the accuracy of the chosen intermediate. A suitable intermediate can be constructed using the sequence-structure relationship information contained in the so-called contingency table; this table expresses the likelihood of encountering various structural motifs for each tetrapeptide fragment in the amino acid sequence. The limited accuracy with which such structures could previously be predicted provided the motivation for a more in-depth study of the contingency table itself. The Contingency Table Browser is a tool which can visualize, search and analyze the table. Our work presents possible applications of Contingency Table Browser, among them the analysis of specific protein sequences from the point of view of their structural ambiguity. PMID:26664034
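
    The idea of a contingency table that maps each tetrapeptide to the likelihood of structural motifs can be illustrated with a small sketch. The input layout (one motif label per four-residue window), the label alphabet and the function name are assumptions made for this example; they are not taken from the Contingency Table Browser itself.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple


def build_contingency_table(
    examples: Iterable[Tuple[str, List[str]]],
) -> Dict[str, Dict[str, float]]:
    """Estimate P(motif | tetrapeptide) from labelled sequences.

    Each example is (sequence, labels), where labels[i] is the
    (hypothetical) structural motif label of the window sequence[i:i+4].
    """
    counts: Dict[str, Counter] = defaultdict(Counter)
    for sequence, labels in examples:
        n_windows = len(sequence) - 3
        if n_windows != len(labels):
            raise ValueError("need exactly one label per tetrapeptide window")
        for i in range(n_windows):
            counts[sequence[i:i + 4]][labels[i]] += 1
    # Normalise the raw counts of each row into relative frequencies.
    return {
        tetra: {motif: n / sum(row.values()) for motif, n in row.items()}
        for tetra, row in counts.items()
    }


table = build_contingency_table([
    ("AAAAA", ["H", "E"]),   # two overlapping AAAA windows
    ("GAAAA", ["C", "H"]),   # windows GAAA and AAAA
])
print(table["AAAA"])
```

    A tetrapeptide whose row spreads its probability over several motifs is structurally ambiguous in the sense discussed in the abstract.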

  19. Contingency Table Browser - prediction of early stage protein structure.

    PubMed

    Kalinowska, Barbara; Krzykalski, Artur; Roterman, Irena

    2015-01-01

    The Early Stage (ES) intermediate represents the starting structure in protein folding simulations based on the Fuzzy Oil Drop (FOD) model. The accuracy of FOD predictions is greatly dependent on the accuracy of the chosen intermediate. A suitable intermediate can be constructed using the sequence-structure relationship information contained in the so-called contingency table; this table expresses the likelihood of encountering various structural motifs for each tetrapeptide fragment in the amino acid sequence. The limited accuracy with which such structures could previously be predicted provided the motivation for a more in-depth study of the contingency table itself. The Contingency Table Browser is a tool which can visualize, search and analyze the table. Our work presents possible applications of Contingency Table Browser, among them the analysis of specific protein sequences from the point of view of their structural ambiguity.

  20. Adaptive modelling of structured molecular representations for toxicity prediction

    NASA Astrophysics Data System (ADS)

    Bertinetto, Carlo; Duce, Celia; Micheli, Alessio; Solaro, Roberto; Tiné, Maria Rosaria

    2012-12-01

    We investigated the possibility of modelling structure-toxicity relationships by direct treatment of the molecular structure (without using descriptors) through an adaptive model able to retain the appropriate structural information. With respect to traditional descriptor-based approaches, this provides a more general and flexible way to tackle prediction problems that is particularly suitable when little or no background knowledge is available. Our method employs a tree-structured molecular representation, which is processed by a recursive neural network (RNN). To explore the realization of RNN modelling in toxicological problems, we employed a data set containing growth impairment concentrations (IGC50) for Tetrahymena pyriformis.

  1. Base Rates, Contingencies, and Prediction Behavior

    ERIC Educational Resources Information Center

    Kareev, Yaakov; Fiedler, Klaus; Avrahami, Judith

    2009-01-01

    A skew in the base rate of upcoming events can often provide a better cue for accurate predictions than a contingency between signals and events. The authors study prediction behavior and test people's sensitivity to both base rate and contingency; they also examine people's ability to compare the benefits of both for prediction. They formalize…

  2. The MEMPACK alpha-helical transmembrane protein structure prediction server

    PubMed Central

    Nugent, Timothy; Ward, Sean; Jones, David T.

    2011-01-01

    Motivation: The experimental difficulties of alpha-helical transmembrane protein structure determination make this class of protein an important target for sequence-based structure prediction tools. The MEMPACK prediction server allows users to submit a transmembrane protein sequence and returns transmembrane topology, lipid exposure, residue contacts, helix–helix interactions and helical packing arrangement predictions in both plain text and graphical formats using a number of novel machine learning-based algorithms. Availability: The server can be accessed as a new component of the PSIPRED portal at http://bioinf.cs.ucl.ac.uk/psipred/. Contact: d.jones@cs.ucl.ac.uk; t.nugent@cs.ucl.ac.uk PMID:21349872

  3. Improved network community structure improves function prediction

    PubMed Central

    Lee, Juyong; Gross, Steven P.; Lee, Jooyoung

    2013-01-01

    We are overwhelmed by experimental data, and need better ways to understand large interaction datasets. While clustering related nodes in such networks—known as community detection—appears a promising approach, detecting such communities is computationally difficult. Further, how to best use such community information has not been determined. Here, within the context of protein function prediction, we address both issues. First, we apply a novel method that generates better modularity solutions than the current state of the art. Second, we develop a better method to use this community information to predict proteins' functions. We discuss when and why this community information is important. Our results should be useful for two distinct scientific communities: first, those using various cost functions to detect community structure, where our new optimization approach will improve solutions, and second, those working to extract novel functional information about individual nodes from large interaction datasets. PMID:23852097
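
    The "modularity" that community detection methods try to maximise can be computed for a given partition as in the sketch below. It uses the standard Newman-Girvan definition for an undirected, unweighted graph without self-loops; it is not the optimization method of the paper, and the function name and toy graph are assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, Tuple


def modularity(edges: Iterable[Tuple[Hashable, Hashable]],
               community_of: Dict[Hashable, Hashable]) -> float:
    """Q = sum over communities c of  L_c / m - (d_c / (2 m))^2

    m   : total number of edges
    L_c : number of edges with both endpoints in community c
    d_c : sum of the degrees of the nodes in community c
    """
    edges = list(edges)
    m = len(edges)
    if m == 0:
        raise ValueError("graph has no edges")
    internal = defaultdict(int)
    degree_sum = defaultdict(int)
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


# Toy graph: two triangles joined by a single bridge edge (2-3).
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
print(modularity(toy_edges, two_groups))  # 5/14, about 0.357
```

    Putting every node into one community gives Q = 0 under this definition, which is why a positive value indicates more intra-community edges than expected by chance.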

  4. Environmental toxicological fate prediction of diverse organic chemicals based on steady-state compartmental chemical mass ratio using quantitative structure-fate relationship (QSFR) models.

    PubMed

    Pramanik, Subrata; Roy, Kunal

    2013-07-01

    Four quantitative prediction models for steady-state compartmental chemical mass concentrations (Wn,g) were obtained from structural information, physicochemical properties, degradation rate and transport coefficients of 455 diverse organic chemicals using chemometric tools in a quantitative structure-fate relationship (QSFR) study. The mass ratio assessment of environmentally prevalent organic chemicals may be helpful to predict their toxicological fate in the ecosystems. Four sets of mass ratios [(1) log(Wair) from water emissions (water to air compartment), (2) log(Wair) from air emissions (within different zones of the air compartment), (3) log(Wwater) from water emissions (within different zones of the water compartment) and (4) log(Wwater) from air emissions (air to water compartment)] have been used. The developed models using genetic function approximation followed by multiple linear regression (GFA-MLR) and subsequent partial least squares (PLS) treatment identify only four descriptors for log(Wair) from water emission, six descriptors for log(Wair) from air emission, five descriptors for log(Wwater) from water emission and seven descriptors for log(Wwater) from air emission for predicting efficiently a large number of test set chemicals (ntest=182). The conclusive models suggest that descriptors such as partition coefficients (Kaw, Kow and Ksw), degradation parameters (Ksoil, Kwater and Kair), vapor pressure (Pv), diffusivity (Dwater), spatial descriptors (Jurs-WNSA-1, Jurs-WNSA-2, Jurs-WPSA-3, Jurs-FNSA-3 and Density), thermodynamic descriptors (MolRef and AlogP98) and electrotopological state indices (S_dsN, S_ssNH and S_dsCH) are important for predicting the chemical mass ratios. The developed models may be applicable in toxicological fate prediction of diverse chemicals in the ecosystems.

  5. Structure prediction and targeted synthesis: a new Na(n)N2 diazenide crystalline structure.

    PubMed

    Zhang, Xiuwen; Zunger, Alex; Trimarchi, Giancarlo

    2010-11-21

    Significant progress in theoretical and computational techniques for predicting stable crystal structures has recently begun to stimulate targeted synthesis of such predicted structures. Using a global space-group optimization (GSGO) approach that locates ground-state structures and stable stoichiometries from first-principles energy functionals by objectively starting from randomly selected lattice vectors and random atomic positions, we predict the first alkali diazenide compound Na(n)N(2), manifesting homopolar N-N bonds. The previously predicted Na(3)N structure manifests only heteropolar Na-N bonds and has positive formation enthalpy. It was calculated based on local Hartree-Fock relaxation of a fixed-structure type (Li(3)P-type) found by searching an electrostatic point-ion model. Synthesis attempts of this positive ΔH compound using activated nitrogen yielded another structure (anti-ReO(3)-type). The currently predicted (negative formation enthalpy) diazenide Na(2)N(2) completes the series of previously known BaN(2) and SrN(2) diazenides where the metal sublattice transfers charge into the empty N(2) Π(g) orbital. This points to a new class of alkali nitrides with fundamentally different bonding, i.e., homopolar rather than heteropolar bonds and, at the same time, illustrates some of the crucial subtleties and pitfalls involved in structure predictions versus planned synthesis. Attempts at synthesis of the stable Na(2)N(2) predicted here will be interesting.

  6. PCI-SS: MISO dynamic nonlinear protein secondary structure prediction.

    PubMed

    Green, James R; Korenberg, Michael J; Aboul-Magd, Mohammed O

    2009-07-17

    Since the function of a protein is largely dictated by its three dimensional configuration, determining a protein's structure is of fundamental importance to biology. Here we report on a novel approach to determining the one dimensional secondary structure of proteins (distinguishing alpha-helices, beta-strands, and non-regular structures) from primary sequence data which makes use of Parallel Cascade Identification (PCI), a powerful technique from the field of nonlinear system identification. Using PSI-BLAST divergent evolutionary profiles as input data, dynamic nonlinear systems are built through a black-box approach to model the process of protein folding. Genetic algorithms (GAs) are applied in order to optimize the architectural parameters of the PCI models. The three-state prediction problem is broken down into a combination of three binary sub-problems and protein structure classifiers are built using 2 layers of PCI classifiers. Careful construction of the optimization, training, and test datasets ensures that no homology exists between any training and testing data. A detailed comparison between PCI and 9 contemporary methods is provided over a set of 125 new protein chains guaranteed to be dissimilar to all training data. Unlike other secondary structure prediction methods, here a web service is developed to provide both human- and machine-readable interfaces to PCI-based protein secondary structure prediction. This server, called PCI-SS, is available at http://bioinf.sce.carleton.ca/PCISS. In addition to a dynamic PHP-generated web interface for humans, a Simple Object Access Protocol (SOAP) interface is added to permit invocation of the PCI-SS service remotely. This machine-readable interface facilitates incorporation of PCI-SS into multi-faceted systems biology analysis pipelines requiring protein secondary structure information, and greatly simplifies high-throughput analyses. XML is used to represent the input protein sequence data and also to encode
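
    Breaking a three-state problem into three binary sub-problems, as the abstract describes, is commonly done in a one-vs-rest fashion. The sketch below shows only that generic recombination step, assuming each binary classifier returns a confidence score per residue and the highest score wins. The abstract does not spell out how PCI-SS merges its classifiers (it mentions a second layer of PCI classifiers), so the arg-max rule and all names here are assumptions.

```python
from typing import Dict, List

# H = alpha-helix, E = beta-strand, C = non-regular structure
STATES = ("H", "E", "C")


def combine_binary_scores(per_residue_scores: List[Dict[str, float]]) -> str:
    """Turn three one-vs-rest scores per residue into a three-state string.

    per_residue_scores[i][s] is the (hypothetical) confidence of the
    binary "state s vs. everything else" classifier for residue i.
    Ties are resolved in the order given by STATES.
    """
    return "".join(
        max(STATES, key=lambda state: scores[state])
        for scores in per_residue_scores
    )


scores = [
    {"H": 0.9, "E": 0.1, "C": 0.2},
    {"H": 0.6, "E": 0.3, "C": 0.4},
    {"H": 0.2, "E": 0.1, "C": 0.7},
    {"H": 0.1, "E": 0.8, "C": 0.3},
]
print(combine_binary_scores(scores))  # HHCE
```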

  7. MUFOLD: A new solution for protein 3D structure prediction.

    PubMed

    Zhang, Jingfen; Wang, Qingguo; Barz, Bogdan; He, Zhiquan; Kosztin, Ioan; Shang, Yi; Xu, Dong

    2010-04-01

    There have been steady improvements in protein structure prediction during the past 2 decades. However, current methods are still far from consistently predicting structural models accurately with computing power accessible to common users. Toward achieving more accurate and efficient structure prediction, we developed a number of novel methods and integrated them into a software package, MUFOLD. First, a systematic protocol was developed to identify useful templates and fragments from Protein Data Bank for a given target protein. Then, an efficient process was applied for iterative coarse-grain model generation and evaluation at the Cα or backbone level. In this process, we construct models using interresidue spatial restraints derived from alignments by multidimensional scaling, evaluate and select models through clustering and static scoring functions, and iteratively improve the selected models by integrating spatial restraints and previous models. Finally, the full-atom models were evaluated using molecular dynamics simulations based on structural changes under simulated heating. We have continuously improved the performance of MUFOLD by using a benchmark of 200 proteins from the Astral database, where no template with >25% sequence identity to any target protein is included. The average root-mean-square deviation of the best models from the native structures is 4.28 Å, which shows significant and systematic improvement over our previous methods. The computing time of MUFOLD is much shorter than many other tools, such as Rosetta. MUFOLD demonstrated some success in the 2008 community-wide experiment for protein structure prediction CASP8.

  8. Template-based modeling of a psychrophilic lipase: conformational changes, novel structural features and its application in predicting the enantioselectivity of lipase catalyzed transesterification of secondary alcohols.

    PubMed

    Xu, Tao; Gao, Bei; Zhang, Lujia; Lin, Jingpin; Wang, Xuedong; Wei, Dongzhi

    2010-12-01

    In order to fully explore the structure-function relationship of a Proteus lipase (LipK107) that was screened from the soil in our previous study, we have modeled the three-dimensional (3-D) structures of the enzyme in its active and inactive conformations on the basis of crystal structures of Burkholderia glumae and Pseudomonas aeruginosa lipases in the present study. Both homology models suggested that LipK107 possessed a catalytic triad (Ser79-Asp232-His254), an oxyanion hole (Leu13 and Gln80) which was used to stabilize the reaction tetrahedral intermediates, and a lid substructure that controlled the access of the substrate to the active site. The existence of the lid was further verified by carrying out the interfacial activation experiment. The conformational change of LipK107 which was caused by lid opening action was predicted by superimposing the two theoretical models for the first time. Finally, both 3-D structures were used to predict the enantioselectivity of LipK107 when the enzyme was used to catalyze the resolution of racemic 1-phenylethanol. The lid-open model of LipK107 identified the R-enantiomer as the preferred enantiomer, while the lid-closed model showed that the S-enantiomer was more favored. However, only the lid-open conformational model could lead to predictions that agreed with the subsequent experimental result of the real biocatalysis reaction of 1-phenylethanol. Crown Copyright © 2010. Published by Elsevier B.V. All rights reserved.

  9. Improving Predictions of Protein-Protein Interfaces by Combining Amino Acid-Specific Classifiers Based on Structural and Physicochemical Descriptors with Their Weighted Neighbor Averages

    PubMed Central

    de Moraes, Fábio R.; Neshich, Izabella A. P.; Mazoni, Ivan; Yano, Inácio H.; Pereira, José G. C.; Salim, José A.; Jardine, José G.; Neshich, Goran

    2014-01-01

    Protein-protein interactions are involved in nearly all regulatory processes in the cell and are considered one of the most important issues in molecular biology and pharmaceutical sciences but are still not fully understood. Structural and computational biology contributed greatly to the elucidation of the mechanism of protein interactions. In this paper, we present a collection of the physicochemical and structural characteristics that distinguish interface-forming residues (IFR) from free surface residues (FSR). We formulated a linear discriminative analysis (LDA) classifier to assess whether chosen descriptors from the BlueStar STING database (http://www.cbi.cnptia.embrapa.br/SMS/) are suitable for such a task. Receiver operating characteristic (ROC) analysis indicates that the particular physicochemical and structural descriptors used for building the linear classifier perform much better than a random classifier and in fact, successfully outperform some of the previously published procedures, whose performance indicators were recently compared by other research groups. The results presented here show that the selected set of descriptors can be utilized to predict IFRs, even when homologue proteins are missing (particularly important for orphan proteins where no homologue is available for comparative analysis/indication) or, when certain conformational changes accompany interface formation. The development of amino acid type specific classifiers is shown to increase IFR classification performance. Also, we found that the addition of an amino acid conservation attribute did not improve the classification prediction. This result indicates that the increase in predictive power associated with amino acid conservation is exhausted by adequate use of an extensive list of independent physicochemical and structural parameters that, by themselves, fully describe the nano-environment at protein-protein interfaces. The IFR classifier developed in this study is now

  10. Improving predictions of protein-protein interfaces by combining amino acid-specific classifiers based on structural and physicochemical descriptors with their weighted neighbor averages.

    PubMed

    de Moraes, Fábio R; Neshich, Izabella A P; Mazoni, Ivan; Yano, Inácio H; Pereira, José G C; Salim, José A; Jardine, José G; Neshich, Goran

    2014-01-01

    Protein-protein interactions are involved in nearly all regulatory processes in the cell and are considered one of the most important issues in molecular biology and pharmaceutical sciences but are still not fully understood. Structural and computational biology contributed greatly to the elucidation of the mechanism of protein interactions. In this paper, we present a collection of the physicochemical and structural characteristics that distinguish interface-forming residues (IFR) from free surface residues (FSR). We formulated a linear discriminative analysis (LDA) classifier to assess whether chosen descriptors from the BlueStar STING database (http://www.cbi.cnptia.embrapa.br/SMS/) are suitable for such a task. Receiver operating characteristic (ROC) analysis indicates that the particular physicochemical and structural descriptors used for building the linear classifier perform much better than a random classifier and in fact, successfully outperform some of the previously published procedures, whose performance indicators were recently compared by other research groups. The results presented here show that the selected set of descriptors can be utilized to predict IFRs, even when homologue proteins are missing (particularly important for orphan proteins where no homologue is available for comparative analysis/indication) or, when certain conformational changes accompany interface formation. The development of amino acid type specific classifiers is shown to increase IFR classification performance. Also, we found that the addition of an amino acid conservation attribute did not improve the classification prediction. This result indicates that the increase in predictive power associated with amino acid conservation is exhausted by adequate use of an extensive list of independent physicochemical and structural parameters that, by themselves, fully describe the nano-environment at protein-protein interfaces. The IFR classifier developed in this study is now

  11. Predicting DNA-binding proteins and binding residues by complex structure prediction and application to human proteome.

    PubMed

    Zhao, Huiying; Wang, Jihua; Zhou, Yaoqi; Yang, Yuedong

    2014-01-01

    As more and more protein sequences are uncovered from increasingly inexpensive sequencing techniques, an urgent task is to find their functions. This work presents a highly reliable computational technique for predicting DNA-binding function at the level of protein-DNA complex structures, rather than low-resolution two-state prediction of DNA-binding as most existing techniques do. The method first predicts protein-DNA complex structure by utilizing the template-based structure prediction technique HHblits, followed by binding affinity prediction based on a knowledge-based energy function (Distance-scaled finite ideal-gas reference state for protein-DNA interactions). A leave-one-out cross validation of the method based on 179 DNA-binding and 3797 non-binding protein domains achieves a Matthews correlation coefficient (MCC) of 0.77 with high precision (94%) and high sensitivity (65%). We further found 51% sensitivity for 82 newly determined structures of DNA-binding proteins and 56% sensitivity for the human proteome. In addition, the method provides a reasonably accurate prediction of DNA-binding residues in proteins based on predicted DNA-binding complex structures. Its application to human proteome leads to more than 300 novel DNA-binding proteins; some of these predicted structures were validated by known structures of homologous proteins in APO forms. The method [SPOT-Seq (DNA)] is available as an on-line server at http://sparks-lab.org.
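
    The reported Matthews correlation coefficient can be cross-checked against the other figures in the abstract. The sketch below rebuilds an approximate confusion matrix from the rounded sensitivity (65%) and precision (94%) for 179 binding and 3797 non-binding domains; the reconstructed counts are therefore an assumption, not the authors' actual table, and the function names are illustrative.

```python
import math


def matthews_cc(tp: int, fp: int, fn: int, tn: int) -> float:
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # common convention when a row or column is empty
    return (tp * tn - fp * fn) / denom


def counts_from_rates(n_pos: int, n_neg: int,
                      sensitivity: float, precision: float):
    """Approximate confusion counts from rounded summary rates."""
    tp = round(sensitivity * n_pos)
    fn = n_pos - tp
    # precision = TP / (TP + FP)  =>  FP = TP * (1 - precision) / precision
    fp = round(tp * (1 - precision) / precision)
    tn = n_neg - fp
    return tp, fp, fn, tn


tp, fp, fn, tn = counts_from_rates(179, 3797, sensitivity=0.65, precision=0.94)
print(tp, fp, fn, tn)                         # 116 7 63 3790
print(round(matthews_cc(tp, fp, fn, tn), 2))  # 0.77
```

    With these reconstructed counts the MCC comes out at about 0.77, consistent with the value quoted in the abstract.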

  12. (PS)2: protein structure prediction server version 3.0.

    PubMed

    Huang, Tsun-Tsao; Hwang, Jenn-Kang; Chen, Chu-Huang; Chu, Chih-Sheng; Lee, Chi-Wen; Chen, Chih-Chieh

    2015-07-01

    Protein complexes are involved in many biological processes. Examining coupling between subunits of a complex would be useful to understand the molecular basis of protein function. Here, our updated (PS)(2) web server predicts the three-dimensional structures of protein complexes based on comparative modeling; furthermore, this server examines the coupling between subunits of the predicted complex by combining structural and evolutionary considerations. The predicted complex structure could be indicated and visualized by Java-based 3D graphics viewers and the structural and evolutionary profiles are shown and compared chain-by-chain. For each subunit, considerations with or without the packing contribution of other subunits cause the differences in similarities between structural and evolutionary profiles, and these differences imply which form, complex or monomeric, is preferred in the biological condition for the subunit. We believe that the (PS)(2) server would be a useful tool for biologists who are interested not only in the structures of protein complexes but also in the coupling between subunits of the complexes. The (PS)(2) is freely available at http://ps2v3.life.nctu.edu.tw/. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  13. PredictProtein--an open resource for online prediction of protein structural and functional features.

    PubMed

    Yachdav, Guy; Kloppmann, Edda; Kajan, Laszlo; Hecht, Maximilian; Goldberg, Tatyana; Hamp, Tobias; Hönigschmid, Peter; Schafferhans, Andrea; Roos, Manfred; Bernhofer, Michael; Richter, Lothar; Ashkenazy, Haim; Punta, Marco; Schlessinger, Avner; Bromberg, Yana; Schneider, Reinhard; Vriend, Gerrit; Sander, Chris; Ben-Tal, Nir; Rost, Burkhard

    2014-07-01

    PredictProtein is a meta-service for sequence analysis that has been predicting structural and functional features of proteins since 1992. Queried with a protein sequence it returns: multiple sequence alignments, predicted aspects of structure (secondary structure, solvent accessibility, transmembrane helices (TMSEG) and strands, coiled-coil regions, disulfide bonds and disordered regions) and function. The service incorporates analysis methods for the identification of functional regions (ConSurf), homology-based inference of Gene Ontology terms (metastudent), comprehensive subcellular localization prediction (LocTree3), protein-protein binding sites (ISIS2), protein-polynucleotide binding sites (SomeNA) and predictions of the effect of point mutations (non-synonymous SNPs) on protein function (SNAP2). Our goal has always been to develop a system optimized to meet the demands of experimentalists not highly experienced in bioinformatics. To this end, the PredictProtein results are presented as both text and a series of intuitive, interactive and visually appealing figures. The web server and sources are available at http://ppopen.rostlab.org. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  14. Predicted and experimental structures of integrins and beta-propellers.

    PubMed

    Springer, Timothy A

    2002-12-01

    Integrins and other cell surface receptors have been fertile grounds for structure prediction experiments. Recently determined structures show remarkable successes, especially with beta-propeller domain predictions, and also reveal how ligand binding by integrins is conformationally regulated.

  15. Frequency domain and full waveform time domain inversion of ground based magnetometer, electrometer and incoherent scattering radar arrays to image strongly heterogenous 3-D Earth structure, ionospheric structure, and to predict the intensity of GICs in the power grid

    NASA Astrophysics Data System (ADS)

    Schultz, A.; Imamura, N.; Bonner, L. R., IV; Cosgrove, R. B.

    2016-12-01

    Ground-based magnetometer and electrometer arrays provide the means to probe the structure of the Earth's interior, the interactions of space weather with the ionosphere, and to anticipate the intensity of geomagnetically induced currents (GICs) in power grids. We present a local-to-continental scale view of a heterogeneous 3-D crust and mantle as determined from magnetotelluric (MT) observations across arrays of ground-based electric and magnetic field sensors. MT impedance tensors describe the relationship between electric and magnetic fields at a given site, thus implicitly they contain all known information on the 3-D electrical resistivity structure beneath and surrounding that site. By using multivariate transfer functions to project real-time magnetic observatory network data to areas surrounding electric power grids, and by projecting those magnetic fields through MT impedance tensors, the projected magnetic field can be transformed into predictions of electric fields along the path of the transmission lines, an essential element of predicting the intensity of GICs in the grid. Finally, we explore GICs, i.e. Earth-ionosphere coupling directly in the time-domain. We consider the fully coupled EM system, where we allow for a non-stationary ionospheric source field of arbitrary complexity above a 3-D Earth. We solve the simultaneous inverse problem for 3-D Earth conductivity and source field structure directly in the time domain. In the present work, we apply this method to magnetotelluric data obtained from a synchronously operating array of 25 MT stations that collected continuous MT waveform data in the interior of Alaska during the autumn and winter of 2015 under the footprint of the Poker Flat (Alaska) Incoherent Scattering Radar (PFISR). PFISR data yield functionals of the ionospheric electric field and ionospheric conductivity that constrain the MT source field. We show that in this region conventional robust MT processing methods struggle to produce

  16. Predicting loop-helix tertiary structural contacts in RNA pseudoknots.

    PubMed

    Cao, Song; Giedroc, David P; Chen, Shi-Jie

    2010-03-01

    Tertiary interactions between loops and helical stems play critical roles in the biological function of many RNA pseudoknots. However, quantitative predictions for RNA tertiary interactions remain elusive. Here we report a statistical mechanical model for the prediction of noncanonical loop-stem base-pairing interactions in RNA pseudoknots. Central to the model is the evaluation of the conformational entropy for the pseudoknotted folds with defined loop-stem tertiary structural contacts. We develop an RNA virtual bond-based conformational model (Vfold model), which permits a rigorous computation of the conformational entropy for a given fold that contains loop-stem tertiary contacts. With the entropy parameters predicted from the Vfold model and the energy parameters for the tertiary contacts as inserted parameters, we can then predict the RNA folding thermodynamics, from which we can extract the tertiary contact thermodynamic parameters from theory-experimental comparisons. These comparisons reveal a contact enthalpy (ΔH) of -14 kcal/mol and a contact entropy (ΔS) of -38 cal/mol/K for a protonated C(+)*(G-C) base triple at pH 7.0, and (ΔH = -7 kcal/mol, ΔS = -19 cal/mol/K) for an unprotonated base triple. Tests of the model for a series of pseudoknots show good theory-experiment agreement. Based on the extracted energy parameters for the tertiary structural contacts, the model enables predictions for the structure, stability, and folding pathways for RNA pseudoknots with known or postulated loop-stem tertiary contacts from the nucleotide sequence alone.
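
    The quoted contact enthalpy and entropy can be turned into a free energy with the standard relation ΔG = ΔH - TΔS. The abstract gives no temperature, so the 37 °C (310.15 K) used below is an assumption chosen only for illustration, as is the function name.

```python
def contact_free_energy(delta_h_kcal: float,
                        delta_s_cal_per_k: float,
                        temperature_k: float) -> float:
    """Return dG in kcal/mol from dH (kcal/mol) and dS (cal/mol/K)."""
    delta_s_kcal_per_k = delta_s_cal_per_k / 1000.0  # cal -> kcal
    return delta_h_kcal - temperature_k * delta_s_kcal_per_k


T_ASSUMED = 310.15  # 37 degrees C; not stated in the abstract

# Protonated C(+)*(G-C) base triple: dH = -14 kcal/mol, dS = -38 cal/mol/K
dg_protonated = contact_free_energy(-14.0, -38.0, T_ASSUMED)
# Unprotonated base triple: dH = -7 kcal/mol, dS = -19 cal/mol/K
dg_unprotonated = contact_free_energy(-7.0, -19.0, T_ASSUMED)

print(round(dg_protonated, 2))    # -2.21
print(round(dg_unprotonated, 2))  # -1.11
```

    At this assumed temperature both contacts are stabilizing, and because the unprotonated parameters are exactly half the protonated ones, the protonated triple contributes twice the free energy.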

  17. Evaluation, analysis and prediction of geologic structures

    NASA Astrophysics Data System (ADS)

    Woodward, Nicholas B.

    2012-08-01

    Balanced cross-sections claim to be better because they apply a rigorous set of rules to develop the conceptual model of the structures present in an area. Balanced cross-sections can be further improved and become more useful to understanding real physical problems by collection of additional data such as seismic reflection surveys, collection of additional stratigraphic data, or collection of rock fabric information. The additional information validates the initial model and provides details on deformation conditions and on local rock responses to the deformation. Although individual cross-sections are two dimensional, the objective of evaluation and analysis of deformed regions should be three dimensional whenever possible to recognize the challenges of the real world. Subsurface system analysis derived from the hydrologic community emphasizes conceptual model development through model verification, validation, uncertainty quantification, benchmarking and meta-analysis. Their approach includes many steps informally used by the structural geology community but in a much more explicit way. Newer geological applications of structural geology would benefit from this more rigorous approach for designing and doing performance predictions as technological needs become more socially sensitive such as for carbon storage sites, new areas of energy exploration in higher population density areas, or for nuclear waste storage facilities.

  18. Prediction of chemical carcinogenicity from molecular structure.

    PubMed

    Sun, Hongmao

    2004-01-01

    Carcinogens represent a serious threat to human health. In vivo determination of carcinogenicity is time-consuming and expensive, thus in silico models to predict chemical carcinogenicity are highly desirable for virtual screening of compound libraries of both pharmaceutically and other commercially interesting molecules. In the present study, a PLS-DA (partial least squares discriminant analysis) model was developed to predict carcinogenicities in each of four rodent models: male mouse (MM), female mouse (FM), male rat (MR), and female rat (FR). The data set that was used contained over 520 compounds from both the NTP and the FDA databases. All the models were built from the same molecular descriptor system, which is based on atom typing [Sun, H. J. Chem. Inf. Comput. Sci. 2004, 44, 748-757], enabling the comparison of atomic contributions to carcinogenicity with respect to species and gender. Using four components, the models were able to achieve excellent fitting and prediction, with r(2) = 0.987 and q(2) = 0.944 for MM, r(2) = 0.985 and q(2) = 0.950 for FM, r(2) = 0.989 and q(2) = 0.962 for MR, and r(2) = 0.990 and q(2) = 0.965 for FR. The models were further validated by response permutation testing and external validation, and the results indicated that the models were both statistically significant and predictive. Variable influence on projection (VIP) analysis identified the key atom types and fragments that contributed to carcinogenicities and response differences across species and gender.

  19. A system structure for predictive relations in penetration mechanics

    NASA Astrophysics Data System (ADS)

    Korjack, Thomas A.

    1992-02-01

    The availability of a software system yielding quick numerical models to predict ballistic behavior is a requisite for any research laboratory engaged in material behavior. What is especially true about accessibility of rapid prototyping for terminal impaction is the enhancement of a system structure which will direct the specific material and impact situation towards a specific predictive model. This is of particular importance when the ranges of validity are at stake and the pertinent constraints associated with the impact are unknown. Hence, a compilation of semiempirical predictive penetration relations for various physical phenomena has been organized into a data structure for the purpose of developing a knowledge-based decision aided expert system to predict the terminal ballistic behavior of projectiles and targets. The ranges of validity and constraints of operation of each model were examined and cast into a decision tree structure to include target type, target material, projectile types, projectile materials, attack configuration, and performance or damage measures. This decision system implements many penetration relations, identifies formulas that match user-given conditions, and displays the predictive relation coincident with the match in addition to a numerical solution. The physical regimes under consideration encompass the hydrodynamic, transitional, and solid; the targets are either semi-infinite or plate, and the projectiles include kinetic and chemical energy. A preliminary database has been constructed to allow further development of inductive and deductive reasoning techniques applied to ballistic situations involving terminal mechanics.

  20. Quantitative structure-activity relationship models for predicting drug-induced liver injury based on FDA-approved drug labeling annotation and using a large collection of drugs.

    PubMed

    Chen, Minjun; Hong, Huixiao; Fang, Hong; Kelly, Reagan; Zhou, Guangxu; Borlak, Jürgen; Tong, Weida

    2013-11-01

    Drug-induced liver injury (DILI) is one of the leading causes of the termination of drug development programs. Consequently, identifying the risk of DILI in humans for drug candidates during the early stages of the development process would greatly reduce the drug attrition rate in the pharmaceutical industry but would require the implementation of new research and development strategies. In this regard, several in silico models have been proposed as alternative means of prioritizing drug candidates. Because the accuracy and utility of a predictive model rest largely on how the potential of a drug to cause DILI is annotated in a reliable and consistent way, the Food and Drug Administration-approved drug labeling was given prominence. Out of 387 drugs annotated, 197 drugs were used to develop a quantitative structure-activity relationship (QSAR) model, and the model was subsequently challenged by the remaining drugs, which served as an external validation set, with an overall prediction accuracy of 68.9%. The performance of the model was further assessed by the use of 2 additional independent validation sets, and the 3 validation data sets have a total of 483 unique drugs. We observed that the QSAR model's performance varied for drugs with different therapeutic uses; however, it achieved a better estimated accuracy (73.6%) as well as negative predictive value (77.0%) when focusing only on those therapeutic categories with high prediction confidence. Thus, the model's applicability domain was defined. Taken collectively, the developed QSAR model has the potential utility to prioritize compounds by their risk for DILI in humans, particularly for the high-confidence therapeutic subgroups like analgesics, antibacterial agents, and antihistamines.
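
    For readers unfamiliar with the two metrics quoted above, the sketch below shows how accuracy and negative predictive value follow from a binary confusion matrix. The counts in the example are made up for illustration and are not taken from the study.

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (tp + tn) / total


def negative_predictive_value(tn: int, fn: int) -> float:
    """Fraction of predicted negatives that are truly negative."""
    if tn + fn == 0:
        raise ValueError("no negative predictions")
    return tn / (tn + fn)


# Hypothetical counts (NOT from the paper): 40 true positives,
# 30 true negatives, 20 false positives, 10 false negatives.
acc = accuracy(tp=40, tn=30, fp=20, fn=10)
npv = negative_predictive_value(tn=30, fn=10)
```

    With these invented counts the accuracy is 0.70 and the negative predictive value is 0.75; a high negative predictive value means that a compound predicted as safe is rarely a missed DILI case.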

  1. CENTROIDFOLD: a web server for RNA secondary structure prediction.

    PubMed

    Sato, Kengo; Hamada, Michiaki; Asai, Kiyoshi; Mituyama, Toutai

    2009-07-01

    The CENTROIDFOLD web server (http://www.ncrna.org/centroidfold/) is a web application for RNA secondary structure prediction powered by one of the most accurate prediction engines. The server accepts two kinds of sequence data: a single RNA sequence and a multiple alignment of RNA sequences. It responds with a prediction result shown in a popular base-pair notation and as a graph representation. A PDF version of the graph representation is also available. For a multiple alignment, the server predicts a common secondary structure. Usage of the server is quite simple. You can paste a single RNA sequence (FASTA or plain sequence text) or a multiple alignment (CLUSTAL-W format) into the textarea and then click on the 'execute CentroidFold' button. The server quickly responds with a prediction result. The major advantage of this server is that it employs our original CentroidFold software as its prediction engine, which scores the best accuracy in our benchmark results. Our web server is freely available with no login requirement.
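
    The abstract does not name the base-pair notation the server returns. Assuming it is the widely used dot-bracket string, a result could be turned into a list of paired positions roughly as follows; the function name is hypothetical and this is not part of the CentroidFold software.

```python
def parse_dot_bracket(structure: str) -> list[tuple[int, int]]:
    """Convert a dot-bracket string into sorted 0-based (i, j) base pairs.

    '(' opens a pair, ')' closes the most recent open pair, '.' is unpaired.
    """
    open_positions = []
    pairs = []
    for index, symbol in enumerate(structure):
        if symbol == "(":
            open_positions.append(index)
        elif symbol == ")":
            if not open_positions:
                raise ValueError(f"unmatched ')' at position {index}")
            pairs.append((open_positions.pop(), index))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {index}")
    if open_positions:
        raise ValueError(f"unmatched '(' at position {open_positions[-1]}")
    return sorted(pairs)


# A hairpin with a two-base-pair stem and a two-nucleotide loop.
hairpin_pairs = parse_dot_bracket("((..))")
```

    Plain parentheses can only describe nested pairs, so this parser deliberately rejects any other symbol.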

  2. Blind protein structure prediction using accelerated free-energy simulations

    PubMed Central

    Perez, Alberto; Morrone, Joseph A.; Brini, Emiliano; MacCallum, Justin L.; Dill, Ken A.

    2016-01-01

    We report a key proof of principle of a new acceleration method [Modeling Employing Limited Data (MELD)] for predicting protein structures by molecular dynamics simulation. It shows that such Boltzmann-satisfying techniques are now sufficiently fast and accurate to predict native protein structures in a limited test within the Critical Assessment of Structure Prediction (CASP) community-wide blind competition. PMID:27847872

  3. Optimizing nondecomposable loss functions in structured prediction.

    PubMed

    Ranjbar, Mani; Lan, Tian; Wang, Yang; Robinovitch, Steven N; Li, Ze-Nian; Mori, Greg

    2013-04-01

    We develop an algorithm for structured prediction with nondecomposable performance measures. The algorithm learns parameters of Markov Random Fields (MRFs) and can be applied to multivariate performance measures. Examples include performance measures such as Fβ score (natural language processing), intersection over union (object category segmentation), Precision/Recall at k (search engines), and ROC area (binary classifiers). We attack this optimization problem by approximating the loss function with a piecewise linear function. The loss augmented inference forms a Quadratic Program (QP), which we solve using LP relaxation. We apply this approach to two tasks: object class-specific segmentation and human action retrieval from videos. We show significant improvement over baseline approaches that either use simple loss functions or simple scoring functions on the PASCAL VOC and H3D Segmentation datasets, and a nursing home action recognition dataset.
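
    Two of the performance measures named above can be written down in a few lines, which also makes it easy to see why they are called nondecomposable: the score of a whole dataset is not the average of per-subset scores. This is a minimal sketch with hypothetical function names and invented counts; it illustrates the measures only, not the authors' learning algorithm.

```python
def f_beta(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """F-beta score from true-positive, false-positive and false-negative counts."""
    b2 = beta * beta
    denominator = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denominator if denominator else 0.0


def intersection_over_union(predicted: set, truth: set) -> float:
    """Size of the overlap divided by size of the union (1.0 if both are empty)."""
    union = predicted | truth
    return len(predicted & truth) / len(union) if union else 1.0


# Invented counts for two halves of a dataset.
half_a = f_beta(tp=8, fp=0, fn=0)      # perfect on the first half
half_b = f_beta(tp=1, fp=1, fn=1)      # weak on the second half
pooled = f_beta(tp=9, fp=1, fn=1)      # score on the pooled counts
average_of_halves = (half_a + half_b) / 2

# Pixel sets standing in for a predicted and a true segmentation mask.
iou = intersection_over_union({1, 2, 3}, {2, 3, 4})
```

    Here the pooled F1 differs from the average of the two half scores, so the loss cannot be split into independent per-example terms.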

  4. Data-directed RNA secondary structure prediction using probabilistic modeling.

    PubMed

    Deng, Fei; Ledda, Mirko; Vaziri, Sana; Aviran, Sharon

    2016-08-01

    Structure dictates the function of many RNAs, but secondary RNA structure analysis is either labor intensive and costly or relies on computational predictions that are often inaccurate. These limitations are alleviated by integration of structure probing data into prediction algorithms. However, existing algorithms are optimized for a specific type of probing data. Recently, new chemistries combined with advances in sequencing have facilitated structure probing at unprecedented scale and sensitivity. These novel technologies and anticipated wealth of data highlight a need for algorithms that readily accommodate more complex and diverse input sources. We implemented and investigated a recently outlined probabilistic framework for RNA secondary structure prediction and extended it to accommodate further refinement of structural information. This framework utilizes direct likelihood-based calculations of pseudo-energy terms per considered structural context and can readily accommodate diverse data types and complex data dependencies. We use real data in conjunction with simulations to evaluate performances of several implementations and to show that proper integration of structural contexts can lead to improvements. Our tests also reveal discrepancies between real data and simulations, which we show can be alleviated by refined modeling. We then propose statistical preprocessing approaches to standardize data interpretation and integration into such a generic framework. We further systematically quantify the information content of data subsets, demonstrating that high reactivities are major drivers of SHAPE-directed predictions and that better understanding of less informative reactivities is key to further improvements. Finally, we provide evidence for the adaptive capability of our framework using mock probe simulations. © 2016 Deng et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society.

  5. A wave based method to predict the absorption, reflection and transmission coefficient of two-dimensional rigid frame porous structures with periodic inclusions

    SciTech Connect

    Deckers, Elke; Claeys, Claus; Atak, Onur; Groby, Jean-Philippe; Dazel, Olivier; Desmet, Wim

    2016-05-01

    This paper presents an extension to the Wave Based Method to predict the absorption, reflection and transmission coefficients of a porous material with an embedded periodic set of inclusions. The porous unit cell is described using the Multi-Level methodology and by embedding Bloch–Floquet periodicity conditions in the weighted residual scheme. The dynamic pressure field in the semi-infinite acoustic domains is approximated using a novel wave function set that fulfils the Helmholtz equation, the Bloch–Floquet periodicity conditions and the Sommerfeld radiation condition. The method is meshless and computationally efficient, which makes it well suited for optimisation studies.

  6. A wave based method to predict the absorption, reflection and transmission coefficient of two-dimensional rigid frame porous structures with periodic inclusions

    NASA Astrophysics Data System (ADS)

    Deckers, Elke; Claeys, Claus; Atak, Onur; Groby, Jean-Philippe; Dazel, Olivier; Desmet, Wim

    2016-05-01

    This paper presents an extension to the Wave Based Method to predict the absorption, reflection and transmission coefficients of a porous material with an embedded periodic set of inclusions. The porous unit cell is described using the Multi-Level methodology and by embedding Bloch-Floquet periodicity conditions in the weighted residual scheme. The dynamic pressure field in the semi-infinite acoustic domains is approximated using a novel wave function set that fulfils the Helmholtz equation, the Bloch-Floquet periodicity conditions and the Sommerfeld radiation condition. The method is meshless and computationally efficient, which makes it well suited for optimisation studies.

  7. Predictive modeling of neuroanatomic structures for brain atrophy detection

    NASA Astrophysics Data System (ADS)

    Hu, Xintao; Guo, Lei; Nie, Jingxin; Li, Kaiming; Liu, Tianming

    2010-03-01

    In this paper, we present a predictive modeling approach for neuroanatomic structures that detects brain atrophy from cross-sectional MRI images. The underlying premise of applying predictive modeling to atrophy detection is that brain atrophy is defined as a significant deviation of part of the anatomy from what the remaining normal anatomy predicts for that part. The steps of predictive modeling are as follows. The central cortical surface under consideration is reconstructed from a brain tissue map, and Regions of Interest (ROIs) on it are predicted from other reliable anatomies. The vertex pair-wise distance between the predicted vertex and the true one within the abnormal region is expected to be larger than that of a vertex in a normal brain region. The change of the white matter/gray matter ratio within a spherical region is used to identify the direction of vertex displacement. In this way, the severity of brain atrophy can be defined quantitatively by the displacements of those vertices. The proposed predictive modeling method has been evaluated using both simulated atrophies and MRI images of Alzheimer's disease.

  8. Crystal Structure Prediction for Cyclotrimethylene Trinitramine (RDX) from First Principles

    DTIC Science & Technology

    2009-04-01

    of molecular parameters from corresponding values in an ideal RDX crystal; experimental and predicted orientational parameters of symmetry equivalent...small molecules. Thus, this DFT+D method includes a significant degree of empiricism, in contrast to our SAPT(DFT)-based approach which is completely...in the tables denote only the starting configurations. In order to identify duplicate crystal structures produced during the WMIN lattice energy

  9. Numerical Prediction of Buckling in Ship Panel Structures

    DTIC Science & Technology

    2006-01-01

    distortion (LSND) TIG welding of thin-walled structural elements.” The Welding Institute Research Report 374, Abington, Cambridge, U.K. HORNE, M.R and...Dong (V), Randal M. Dull (V), Christopher C. Conrardy (V), and Nancy C. Porter (V) ABSTRACT Q-WELD™, a shell-element-based numerical module was...used to effectively predict welding-induced distortions. The results of Q-WELD™ were validated by comparison with a series of physical test panels

  10. Understanding the Molecular Determinant of Reversible Human Monoamine Oxidase B Inhibitors Containing 2H-chromen-2-One Core: Structure-Based and Ligand-Based Derived 3-D QSAR Predictive Models.

    PubMed

    Mladenovic, Milan; Patsilinakos, Alexandros; Pirolli, Adele; Sabatino, Manuela; Ragno, Rino

    2017-03-14

    Monoamine oxidase B (MAO B) catalyzes the oxidative deamination of arylalkylamine neurotransmitters with concomitant reduction of oxygen to hydrogen peroxide. Consequently, the enzyme's malfunction can induce oxidative damage to mitochondrial DNA and mediate the development of Parkinson's disease. Thus, MAO B emerges as a promising target for developing pharmaceuticals potentially useful to treat this vicious neurodegenerative condition. Aiming to contribute to the development of drugs with the reversible mechanism of MAO B inhibition only, herein, an extended in silico-in vitro procedure for the selection of novel MAO B inhibitors is demonstrated, including: (1) definition of optimized and validated structure-based (SB) 3-D QSAR models derived from available co-crystallized inhibitor-MAO B complexes; (2) elaboration of structure-activity relationships (SAR) features for either irreversible or reversible MAO B inhibitors to characterize and improve coumarin-based inhibitor activity (Protein Data Bank ID: 2V61) as the most potent reversible lead compound; (3) definition of structure-based (SB) and ligand-based (LB) alignment rules assessments by which virtually any untested potential MAO B inhibitor might be evaluated; (4) predictive ability validation of the best 3-D QSAR model through SB/LB modeling of four coumarin-based external test sets (267 compounds); (5) design and SB/LB alignment of novel coumarin-based scaffolds experimentally validated through synthesis and biological evaluation in vitro. Due to the wide range of molecular diversity within the 3-D QSARs training set and derived features, the selected N probe-derived 3-D QSAR model proves to be a valuable tool for virtual screening (VS) of novel MAO B inhibitors and a platform for design, synthesis and evaluation of novel active structures. Accordingly, six highly active and selective MAO B inhibitors (picomolar to low nanomolar range of activity) were disclosed as a result of rational SB/LB 3-D QSAR design.

  11. Structure prediction of magnetosome-associated proteins

    PubMed Central

    Nudelman, Hila; Zarivach, Raz

    2014-01-01

    Magnetotactic bacteria (MTB) are Gram-negative bacteria that can navigate along geomagnetic fields. This ability is a result of a unique intracellular organelle, the magnetosome. These organelles are composed of membrane-enclosed magnetite (Fe3O4) or greigite (Fe3S4) crystals ordered into chains along the cell. Magnetosome formation, assembly, and magnetic nano-crystal biomineralization are controlled by magnetosome-associated proteins (MAPs). Most MAP-encoding genes are located in a conserved genomic region – the magnetosome island (MAI). The MAI appears to be conserved in all MTB that have been analyzed so far, although the MAI size and organization differ between species. It was shown that MAI deletion leads to a non-magnetic phenotype, further highlighting its important role in magnetosome formation. Today, about 28 proteins are known to be involved in magnetosome formation, but the structures and functions of most MAPs are unknown. To reveal the structure–function relationship of MAPs, we used bioinformatics tools to build homology models as a way to understand their possible role in magnetosome formation. Here we present an overview of predicted 3D structural models for all known Magnetospirillum gryphiswaldense strain MSR-1 MAPs. PMID:24523717

  12. PDBalert: automatic, recurrent remote homology tracking and protein structure prediction

    PubMed Central

    Agarwal, Vatsal; Remmert, Michael; Biegert, Andreas; Söding, Johannes

    2008-01-01

    Background: In recent years, methods for remote homology detection have grown more and more sensitive and reliable. Automatic structure prediction servers relying on these methods can generate useful 3D models even below 20% sequence identity between the protein of interest and the known structure (template). When no homologs can be found in the protein structure database (PDB), the user would need to rerun the same search at regular intervals in order to make timely use of a template once it becomes available. Results: PDBalert is a web-based automatic system that sends an email alert as soon as a structure with homology to a protein in the user's watch list is released to the PDB database or appears among the sequences on hold. The mail contains links to the search results and to an automatically generated 3D homology model. The sequence search is performed with the same software as used by the very sensitive and reliable remote homology detection server HHpred, which is based on pairwise comparison of Hidden Markov models. Conclusion: PDBalert will accelerate the information flow from the PDB database to all those who can profit from the newly released protein structures for predicting the 3D structure or function of their proteins of interest. PMID:19025670

  13. THE FUTURE OF COMPUTER-BASED TOXICITY PREDICTION: MECHANISM-BASED MODELS VS. INFORMATION MINING APPROACHES

    EPA Science Inventory


    When we speak of computer-based toxicity prediction, we are generally referring to a broad array of approaches which rely primarily upon chemical structure ...

  15. Prediction of RNA secondary structure, including pseudoknotting, by computer simulation.

    PubMed Central

    Abrahams, J P; van den Berg, M; van Batenburg, E; Pleij, C

    1990-01-01

    A computer program is presented which determines the secondary structure of linear RNA molecules by simulating a hypothetical process of folding. This process implies the concept of 'nucleation centres', regions in RNA which locally trigger the folding. During the simulation, the RNA is allowed to fold into pseudoknotted structures, unlike all other programs predicting RNA secondary structure. The simulation uses published, experimentally determined free energy values for nearest neighbour base pair stackings and loop regions, except for new extrapolated values for loops larger than seven nucleotides. The free energy value for a loop arising from pseudoknot formation is set to a single, estimated value of 4.2 kcal/mole. Especially in the case of long RNA sequences, our program appears superior to other secondary structure prediction programs described so far, as tests on tRNAs, the LSU intron of Tetrahymena thermophila and a number of plant viral RNAs show. In addition, pseudoknotted structures are often predicted successfully. The program is written in mainframe APL and is adapted to run on IBM compatible PCs, Atari ST and Macintosh personal computers. On an 8 MHz 8088 standard PC without coprocessor, using STSC APL, it folds a sequence of 700 nucleotides in one and a half hours. PMID:1693421
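
    What distinguishes a pseudoknotted structure from an ordinary nested one can be stated as a simple test on base-pair indices: two pairs (i, j) and (k, l) cross when i < k < j < l. The sketch below checks a list of pairs for such crossings. It is an illustration of the definition only, with a hypothetical function name, and is unrelated to the APL program described above.

```python
def has_pseudoknot(pairs: list[tuple[int, int]]) -> bool:
    """Return True if any two base pairs cross each other.

    Each pair is (i, j) with i < j, giving the positions of the two
    paired nucleotides. Nested or side-by-side pairs do not cross.
    """
    for index, (i, j) in enumerate(pairs):
        for k, l in pairs[index + 1:]:
            if i < k < j < l or k < i < l < j:
                return True
    return False


nested = [(0, 5), (1, 4)]        # a plain hairpin stem
side_by_side = [(0, 2), (3, 5)]  # two separate hairpins
crossing = [(0, 4), (2, 6)]      # the second pair starts inside the first and ends outside

results = [has_pseudoknot(p) for p in (nested, side_by_side, crossing)]
```

    The pairwise check is quadratic in the number of base pairs, which is adequate for a sequence of a few hundred nucleotides.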

  16. Factor Structure of Self-Regulation in Preschoolers: Testing Models of a Field-Based Assessment for Predicting Early School Readiness

    PubMed Central

    Denham, Susanne A; Warren-Khot, Heather K.; Bassett, Hideko Hamada; Wyatt, Todd; Perna, Alyssa

    2011-01-01

    The importance of early self-regulatory skill has seen increased focus in the applied research literature, given the implications of these skills for early school success. A three-factor latent structure of self-regulation consisting of compliance, cool executive control, and hot executive control, was tested against alternative models, and retained as best fitting. Tests of model equivalence indicated the model held invariant across Head Start and private child care samples. Partial invariance was supported for age and gender. In the validity model, because of substantial amount of shared variance among latent factors, we included a second-order factor explaining the two types of executive control. Higher-Order Executive Control positively predicted teacher report of learning behaviors and social competence in the classroom. These findings are discussed in light of their practical and theoretical significance. PMID:22104321

  17. A graphic approach to evaluate algorithms of secondary structure prediction.

    PubMed

    Zhang, C T; Zhang, R

    2000-04-01

    Algorithms for secondary structure prediction have been under development for nearly 30 years. However, the problem of how to appropriately evaluate and compare algorithms has not yet been completely solved. A graphic method to evaluate algorithms of secondary structure prediction is proposed here. Traditionally, the performance of an algorithm is evaluated by a number, i.e., accuracy of various definitions. Instead of a number, we use a graph to completely evaluate an algorithm, in which the mapping points are distributed in a three-dimensional space. Each point represents the predictive result of the secondary structure of a protein. Because the distribution of mapping points in the 3D space generally contains more information than a number or a set of numbers, it is expected that algorithms may be evaluated and compared by the proposed graphic method more objectively. Based on the point distribution, six evaluation parameters are proposed, which describe the overall performance of the algorithm evaluated. Furthermore, the graphic method is simple and intuitive. As an example of application, two advanced algorithms, i.e., the PHD and NNpredict methods, are evaluated and compared. It is shown that there is still much room for further improvement for both algorithms. It is pointed out that the accuracy for predicting either the alpha-helix or beta-strand in proteins with higher alpha-helix or beta-strand content, respectively, should be greatly improved for both algorithms.

  18. PREDICTING TOXICOLOGICAL ENDPOINTS OF CHEMICALS USING QUANTITATIVE STRUCTURE-ACTIVITY RELATIONSHIPS (QSARS)

    EPA Science Inventory

    Quantitative structure-activity relationships (QSARs) are being developed to predict the toxicological endpoints for untested chemicals similar in structure to chemicals that have known experimental toxicological data. Based on a very large number of predetermined descriptors, a...

  20. RNAex: an RNA secondary structure prediction server enhanced by high-throughput structure-probing data.

    PubMed

    Wu, Yang; Qu, Rihao; Huang, Yiming; Shi, Binbin; Liu, Mengrong; Li, Yang; Lu, Zhi John

    2016-07-08

    Several high-throughput technologies have been developed to probe RNA base pairs and loops at the transcriptome level in multiple species. However, to obtain the final RNA secondary structure, extensive effort and considerable expertise is required to statistically process the probing data and combine them with free energy models. Therefore, we developed an RNA secondary structure prediction server that is enhanced by experimental data (RNAex). RNAex is a web interface that enables non-specialists to easily access cutting-edge structure-probing data and predict RNA secondary structures enhanced by in vivo and in vitro data. RNAex annotates the RNA editing, RNA modification and SNP sites on the predicted structures. It provides four structure-folding methods, restrained MaxExpect, SeqFold, RNAstructure (Fold) and RNAfold that can be selected by the user. The performance of these four folding methods has been verified by previous publications on known structures. We re-mapped the raw sequencing data of the probing experiments to the whole genome for each species. RNAex thus enables users to predict secondary structures for both known and novel RNA transcripts in human, mouse, yeast and Arabidopsis. The RNAex web server is available at http://RNAex.ncrnalab.org/. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  1. RNAex: an RNA secondary structure prediction server enhanced by high-throughput structure-probing data

    PubMed Central

    Wu, Yang; Qu, Rihao; Huang, Yiming; Shi, Binbin; Liu, Mengrong; Li, Yang; Lu, Zhi John

    2016-01-01

    Several high-throughput technologies have been developed to probe RNA base pairs and loops at the transcriptome level in multiple species. However, to obtain the final RNA secondary structure, extensive effort and considerable expertise is required to statistically process the probing data and combine them with free energy models. Therefore, we developed an RNA secondary structure prediction server that is enhanced by experimental data (RNAex). RNAex is a web interface that enables non-specialists to easily access cutting-edge structure-probing data and predict RNA secondary structures enhanced by in vivo and in vitro data. RNAex annotates the RNA editing, RNA modification and SNP sites on the predicted structures. It provides four structure-folding methods, restrained MaxExpect, SeqFold, RNAstructure (Fold) and RNAfold that can be selected by the user. The performance of these four folding methods has been verified by previous publications on known structures. We re-mapped the raw sequencing data of the probing experiments to the whole genome for each species. RNAex thus enables users to predict secondary structures for both known and novel RNA transcripts in human, mouse, yeast and Arabidopsis. The RNAex web server is available at http://RNAex.ncrnalab.org/. PMID:27137891

  2. Structural imaging biomarkers of Alzheimer's disease: predicting disease progression.

    PubMed

    Eskildsen, Simon F; Coupé, Pierrick; Fonov, Vladimir S; Pruessner, Jens C; Collins, D Louis

    2015-01-01

    Optimized magnetic resonance imaging (MRI)-based biomarkers of Alzheimer's disease (AD) may allow earlier detection and refined prediction of the disease. In addition, they could serve as valuable tools when designing therapeutic studies of individuals at risk of AD. In this study, we combine (1) a novel method for grading medial temporal lobe structures with (2) robust cortical thickness measurements to predict AD among subjects with mild cognitive impairment (MCI) from a single T1-weighted MRI scan. Using AD and cognitively normal individuals, we generate a set of features potentially discriminating between MCI subjects who convert to AD and those who remain stable over a period of 3 years. Using mutual information-based feature selection, we identify 5 key features optimizing the classification of MCI converters. These features are the left and right hippocampi gradings and cortical thicknesses of the left precuneus, left superior temporal sulcus, and right anterior part of the parahippocampal gyrus. We show that these features are highly stable in cross-validation and enable a prediction accuracy of 72% using a simple linear discriminant classifier, the highest prediction accuracy obtained on the baseline Alzheimer's Disease Neuroimaging Initiative first phase cohort to date. The proposed structural features are consistent with Braak stages and previously reported atrophic patterns in AD and are easy to transfer to new cohorts and to clinical practice.

  3. Structure of nonevaporating sprays - Measurements and predictions

    NASA Technical Reports Server (NTRS)

    Solomon, A. S. P.; Shuen, J.-S.; Zhang, Q.-F.; Faeth, G. M.

    1984-01-01

    Structure measurements were completed within the dilute portion of axisymmetric nonevaporating sprays (SMD of 30 and 87 microns) injected into a still air environment, including: mean and fluctuating gas velocities and Reynolds stress using laser-Doppler anemometry; mean liquid fluxes using isokinetic sampling; drop sizes using slide impaction; and drop sizes and velocities using multiflash photography. The new measurements were used to evaluate three representative models of sprays: (1) a locally homogeneous flow (LHF) model, where slip between the phases was neglected; (2) a deterministic separated flow (DSF) model, where slip was considered but effects of drop interaction with turbulent fluctuations were ignored; and (3) a stochastic separated flow (SSF) model, where effects of both interphase slip and turbulent fluctuations were considered using random sampling for turbulence properties in conjunction with random-walk computations for drop motion. The LHF and DSF models were unsatisfactory for present test conditions, both underestimating flow widths and the rate of spread of drops. In contrast, the SSF model provided reasonably accurate predictions, including effects of enhanced spreading rates of sprays due to drop dispersion by turbulence, with all empirical parameters fixed from earlier work.

  4. Synthesis, characterization, crystal structure and predicting the second-order optical nonlinearity of a new dicobalt(III) complex with Schiff base ligand

    NASA Astrophysics Data System (ADS)

    Zarei, Seyed Amir; Piltan, Mohammad; Hassanzadeh, Keyumars; Akhtari, Keivan; Cinčić, Dominik

    2015-03-01

    The synthesis and characterization of the dicobalt(III) complex [Co2L2(OMe)2] of the tetradentate Schiff base ligand N,N′-bis(2-hydroxybenzylidene)-2,2-dimethyl-1,3-propanediamine (H2L) are reported. The crystal structure of the complex has been determined and exhibits pseudo-octahedral geometry around both cobalt(III) ions. In the complexation process, H2L acts as a doubly negatively charged tetradentate ligand, L2-, and the methoxy group acts as a bridging ligand. The geometry of the complex is optimized by density functional theory (DFT) using B3LYP/6-311G(d,p). The calculated geometric parameters are in good agreement with the corresponding experimental data. The second-order nonlinear optical (NLO) property of the complex is evaluated by DFT/B3LYP/6-311G(d,p) on the basis of the optimized structure and shows an enhancement relative to the calculated value of H2L. The calculated NLO value of the complex is much greater than the corresponding value of urea.

  5. Generalized Pattern Search Algorithm for Peptide Structure Prediction

    PubMed Central

    Nicosia, Giuseppe; Stracquadanio, Giovanni

    2008-01-01

    Finding the near-native structure of a protein is one of the most important open problems in structural biology and biological physics. The problem becomes dramatically more difficult when a given protein has no regular secondary structure or it does not show a fold similar to structures already known. This situation occurs frequently when we need to predict the tertiary structure of small molecules, called peptides. In this research work, we propose a new ab initio algorithm, the generalized pattern search algorithm, based on the well-known class of Search-and-Poll algorithms. We performed an extensive set of simulations over a well-known set of 44 peptides to investigate the robustness and reliability of the proposed algorithm, and we compared the peptide conformation with a state-of-the-art algorithm for peptide structure prediction known as PEPstr. In particular, we tested the algorithm on the instances proposed by the originators of PEPstr, to validate the proposed algorithm; the experimental results confirm that the generalized pattern search algorithm outperforms PEPstr by 21.17% in terms of average root mean-square deviation, RMSD Cα. PMID:18487293
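
    The poll step that pattern search methods rely on can be illustrated with a minimal coordinate (compass) search. This is only a sketch under stated assumptions, not the authors' generalized pattern search: the function name `pattern_search`, its parameters, and the toy quadratic objective standing in for a peptide energy function are all hypothetical, and the optional search step is omitted.

```python
from typing import Callable, List, Sequence, Tuple


def pattern_search(
    f: Callable[[Sequence[float]], float],
    x0: Sequence[float],
    step: float = 1.0,
    shrink: float = 0.5,
    tol: float = 1e-6,
    max_iter: int = 10_000,
) -> Tuple[List[float], float]:
    """Minimize f by polling +/- step along each coordinate axis (illustrative only)."""
    x = list(x0)
    fx = f(x)
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        # Poll step: evaluate the pattern points around the current incumbent.
        for i in range(len(x)):
            for sign in (1.0, -1.0):
                trial = list(x)
                trial[i] += sign * step
                f_trial = f(trial)
                if f_trial < fx:
                    # Accept any improving point immediately (opportunistic polling).
                    x, fx = trial, f_trial
                    improved = True
        if not improved:
            # Unsuccessful poll: refine the mesh by shrinking the step size.
            step *= shrink
    return x, fx


# Toy objective (hypothetical stand-in for an energy function of torsion angles).
def toy_energy(p: Sequence[float]) -> float:
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2


best_point, best_value = pattern_search(toy_energy, [0.0, 0.0])
```

    The method needs only function values, no gradients, which is why this family of algorithms suits rugged energy landscapes.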

  6. Predictive simulation of guide-wave structural health monitoring

    NASA Astrophysics Data System (ADS)

    Giurgiutiu, Victor

    2017-04-01

    This paper presents an overview of recent developments on predictive simulation of guided wave structural health monitoring (SHM) with piezoelectric wafer active sensor (PWAS) transducers. The predictive simulation methodology is based on the hybrid global local (HGL) concept which allows fast analytical simulation in the undamaged global field and finite element method (FEM) simulation in the local field around and including the damage. The paper reviews the main results obtained in this area by researchers of the Laboratory for Active Materials and Smart Structures (LAMSS) at the University of South Carolina, USA. After thematic introduction and research motivation, the paper covers four main topics: (i) presentation of the HGL analysis; (ii) analytical simulation in 1D and 2D; (iii) scatter field generation; (iv) HGL examples. The paper ends with summary, discussion, and suggestions for future work.

  7. Quantitative structure-property relationships for predicting Henry's law constant from molecular structure.

    PubMed

    Dearden, John C; Schüürmann, Gerrit

    2003-08-01

    Various models are available for the prediction of Henry's law constant (H) or the air-water partition coefficient (Kaw), its dimensionless counterpart. Incremental methods are based on structural features such as atom types, bond types, and local structural environments; other regression models employ physicochemical properties, structural descriptors such as connectivity indices, and descriptors reflecting the electronic structure. There are also methods to calculate H from the ratio of vapor pressure (p(v)) and water solubility (S(w)) that in turn can be estimated from molecular structure, and quantum chemical continuum-solvation models to predict H via the solvation-free energy (deltaG(s)). This review is confined to methods that calculate H from molecular structure without experimental information and covers more than 40 methods published in the last 26 years. For a subset of eight incremental methods and four continuum-solvation models, a comparative analysis of their prediction performance is made using a test set of 700 compounds that includes a significant number of more complex and drug-like chemical structures. The results reveal substantial differences in the application range as well as in the prediction capability, a general decrease in prediction performance with decreasing H, and surprisingly large individual prediction errors, which are particularly striking for some quantum chemical schemes. The overall best-performing method appears to be the bond contribution method as implemented in the HENRYWIN software package, yielding a predictive squared correlation coefficient (q2) of 0.87 and a standard error of 1.03 log units for the test set.
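
    The two relationships named in this abstract, H as the ratio of vapor pressure to water solubility and Kaw as the dimensionless counterpart of H, can be written as a short calculation. This is a sketch assuming SI units (Pa and mol/m^3) and the standard conversion Kaw = H/(RT); the function names and the input values are illustrative, not taken from the review.

```python
R = 8.314462618  # molar gas constant, J/(mol*K), equivalently Pa*m^3/(mol*K)


def henry_constant(vapor_pressure_pa: float, solubility_mol_m3: float) -> float:
    """Estimate H (Pa*m^3/mol) as vapor pressure divided by water solubility."""
    if solubility_mol_m3 <= 0:
        raise ValueError("solubility must be positive")
    return vapor_pressure_pa / solubility_mol_m3


def air_water_partition(h_pa_m3_mol: float, temperature_k: float = 298.15) -> float:
    """Convert H to the dimensionless air-water partition coefficient Kaw = H/(RT)."""
    return h_pa_m3_mol / (R * temperature_k)


# Hypothetical compound: vapor pressure 1000 Pa, solubility 10 mol/m^3.
h_example = henry_constant(1000.0, 10.0)
kaw_example = air_water_partition(h_example)
```

    Because both inputs can themselves be estimated from molecular structure, errors in either one propagate directly into H.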

  8. Entropy-based link prediction in weighted networks

    NASA Astrophysics Data System (ADS)

    Xu, Zhongqi; Pu, Cunlai; Ramiz Sharafat, Rajput; Li, Lunbo; Yang, Jian

    2017-01-01

    Information entropy has been proved to be an effective tool to quantify the structural importance of complex networks. In previous work (Xu et al., 2016), we measured the contribution of a path in link prediction with information entropy. In this paper, we further quantify the contribution of a path with both path entropy and path weight, and propose a weighted prediction index based on the contributions of paths, namely Weighted Path Entropy (WPE), to improve the prediction accuracy in weighted networks. Empirical experiments on six weighted real-world networks show that WPE achieves higher prediction accuracy than three typical weighted indices.

  9. A tool for the prediction of structures of complex sugars.

    PubMed

    Xia, Junchao; Margulis, Claudio

    2008-12-01

    In two recent back-to-back articles (Xia et al., J Chem Theory Comput 3:1620-1628 and 1629-1643, 2007a, b) we started to address the problem of complex oligosaccharide conformation and folding. The scheme previously presented was based on exhaustive searches in configuration space in conjunction with Nuclear Overhauser Effect (NOE) calculations and the use of a complex rotameric library that takes branching into account. NOEs are extremely useful for structural determination but only provide information about short range interactions and ordering. In contrast, the measurement of residual dipolar couplings (RDC) yields information about molecular ordering or folding that is long range in nature. In this article we show the results obtained by incorporating RDC calculations into our prediction scheme. Using this new approach we are able to accurately predict the structure of six human milk sugars: LNF-1, LND-1, LNF-2, LNF-3, LNnT and LNT. Our exhaustive search in dihedral configuration space combined with RDC and NOE calculations allows for highly accurate structural predictions that, because of the non-ergodic nature of these molecules on a time scale compatible with molecular dynamics simulations, are extremely hard to obtain otherwise (Almond et al., Biochemistry 43:5853-5863, 2004). Molecular dynamics simulations in explicit solvent using as initial configurations the structures predicted by our algorithm show that the histo-blood group epitopes in these sugars are relatively rigid and that the whole family of oligosaccharides derives its conformational variability almost exclusively from their common linkage (β-D-GlcNAc-(1→3)-β-D-Gal), which can exist in two distinct conformational states. A population analysis based on the conformational variability of this flexible glycosidic link indicates that the relative population of the two distinct states varies for different human milk oligosaccharides.

  10. Predicting fracture in micron-scale polycrystalline silicon MEMS structures.

    SciTech Connect

    Hazra, Siddharth S.; de Boer, Maarten Pieter; Boyce, Brad Lee; Ohlhausen, James Anthony; Foulk, James W., III; Reedy, Earl David, Jr.

    2010-09-01

    Designing reliable MEMS structures presents numerous challenges. Polycrystalline silicon fractures in a brittle manner with considerable variability in measured strength. Furthermore, it is not clear how to use a measured tensile strength distribution to predict the strength of a complex MEMS structure. To address such issues, two recently developed high throughput MEMS tensile test techniques have been used to measure strength distribution tails. The measured tensile strength distributions enable the definition of a threshold strength as well as an inferred maximum flaw size. The nature of strength-controlling flaws has been identified and sources of the observed variation in strength investigated. A double edge-notched specimen geometry was also tested to study the effect of a severe, micron-scale stress concentration on the measured strength distribution. Strength-based, Weibull-based, and fracture mechanics-based failure analyses were performed and compared with the experimental results.
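
    A threshold strength is commonly represented with a three-parameter Weibull distribution. The abstract does not state which form was fitted, so the following is a sketch of the standard textbook expression, with a hypothetical function name and hypothetical parameter values.

```python
import math


def weibull_failure_probability(
    stress: float, scale: float, modulus: float, threshold: float = 0.0
) -> float:
    """Three-parameter Weibull CDF: 1 - exp(-((stress - threshold) / scale) ** modulus).

    Below the threshold strength the failure probability is taken to be zero.
    """
    if scale <= 0 or modulus <= 0:
        raise ValueError("scale and modulus must be positive")
    if stress <= threshold:
        return 0.0
    return 1.0 - math.exp(-(((stress - threshold) / scale) ** modulus))


# Hypothetical parameters in GPa; not values from the report.
p_low = weibull_failure_probability(stress=1.5, scale=1.0, modulus=10.0, threshold=2.0)
p_ref = weibull_failure_probability(stress=3.0, scale=1.0, modulus=10.0, threshold=2.0)
```

    Setting `threshold=0.0` recovers the two-parameter form, which assigns a nonzero failure probability to any positive stress.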

  11. 3D protein structure prediction using Imperialist Competitive algorithm and half sphere exposure prediction.

    PubMed

    Khaji, Erfan; Karami, Masoumeh; Garkani-Nejad, Zahra

    2016-02-21

    Predicting the native structure of proteins based on half-sphere exposure and contact numbers has been studied deeply within recent years. Online predictors of these vectors and secondary structures of amino acid sequences have made it possible to design a function for the folding process. By choosing variant structures and directs for each secondary structure, a random conformation can be generated, and a potential function can then be assigned. Minimizing the potential function utilizing meta-heuristic algorithms is the final step of finding the native structure of a given amino acid sequence. In this work, the Imperialist Competitive algorithm was used in order to accelerate the process of minimization. Moreover, we applied an adaptive procedure to apply revolutionary changes. Finally, we considered a more accurate tool for prediction of secondary structure. The results of the computational experiments on a standard benchmark show the superiority of the new algorithm over the previous methods with a similar potential function.

  12. Statistical energy analysis response prediction methods for structural systems

    NASA Technical Reports Server (NTRS)

    Davis, R. F.

    1979-01-01

    The results of an effort to document methods for accomplishing response predictions for commonly encountered aerospace structural configurations are presented. Application of these methods to specified aerospace structures to provide sample analyses is included. An applications manual, with the structural analyses appended as example problems, is given. Comparisons of the response predictions with measured data are provided for three of the example problems.

  13. Prediction of protein function from protein sequence and structure.

    PubMed

    Whisstock, James C; Lesk, Arthur M

    2003-08-01

    The sequence of a genome contains the plans of the possible life of an organism, but implementation of genetic information depends on the functions of the proteins and nucleic acids that it encodes. Many individual proteins of known sequence and structure present challenges to the understanding of their function. In particular, a number of genes responsible for diseases have been identified but their specific functions are unknown. Whole-genome sequencing projects are a major source of proteins of unknown function. Annotation of a genome involves assignment of functions to gene products, in most cases on the basis of amino-acid sequence alone. 3D structure can aid the assignment of function, motivating the challenge of structural genomics projects to make structural information available for novel uncharacterized proteins. Structure-based identification of homologues often succeeds where sequence-alone-based methods fail, because in many cases evolution retains the folding pattern long after sequence similarity becomes undetectable. Nevertheless, prediction of protein function from sequence and structure is a difficult problem, because homologous proteins often have different functions. Many methods of function prediction rely on identifying similarity in sequence and/or structure between a protein of unknown function and one or more well-understood proteins. Alternative methods include inferring conservation patterns in members of a functionally uncharacterized family for which many sequences and structures are known. However, these inferences are tenuous. Such methods provide reasonable guesses at function, but are far from foolproof. It is therefore fortunate that the development of whole-organism approaches and comparative genomics permits other approaches to function prediction when the data are available. These include the use of protein-protein interaction patterns, and correlations between occurrences of related proteins in different organisms, as

  14. Predicting aqueous solubility of environmentally relevant compounds from molecular features: a simple but highly effective four-dimensional model based on Project to Latent Structures.

    PubMed

    Xiao, Feng; Gulliver, John S; Simcik, Matt F

    2013-09-15

    The aqueous solubility (log S) of xenobiotic chemicals has been identified as a key characteristic in determining their bioaccessibility/bioavailability and their fate and transport in aquatic environments. We here explore and evaluate the use of a state-of-the-art data analysis technique (Project to Latent Structures, PLS) to estimate log S of environmentally relevant chemicals. A large number (n = 624) of molecular descriptors was computed for over 1400 organic chemicals, and then refined by a feature selection technique. Candidate predictor descriptors were fitted to data by means of PLS, which was optimized by an internal leave-one-out cross-validation technique and validated by an external data set. The final (best) PLS model with only four variables (AlogP, X1sol, Mv, and E) exhibited noteworthy stability and good predictive power. It was able to explain 91% of the data (n = 1400) variance with an average absolute error of 0.5 log units across solubilities spanning more than 12 orders of magnitude. The newly proposed model is transparent, easily portable from one user to another, and robust enough to accurately estimate log S of a wide range of emerging contaminants.

  15. Graphlet kernels for prediction of functional residues in protein structures.

    PubMed

    Vacic, Vladimir; Iakoucheva, Lilia M; Lonardi, Stefano; Radivojac, Predrag

    2010-01-01

    We introduce a novel graph-based kernel method for annotating functional residues in protein structures. A structure is first modeled as a protein contact graph, where nodes correspond to residues and edges connect spatially neighboring residues. Each vertex in the graph is then represented as a vector of counts of labeled non-isomorphic subgraphs (graphlets), centered on the vertex of interest. A similarity measure between two vertices is expressed as the inner product of their respective count vectors and is used in a supervised learning framework to classify protein residues. We evaluated our method on two function prediction problems: identification of catalytic residues in proteins, which is a well-studied problem suitable for benchmarking, and a much less explored problem of predicting phosphorylation sites in protein structures. The performance of the graphlet kernel approach was then compared against two alternative methods, a sequence-based predictor and our implementation of the FEATURE framework. On both tasks, the graphlet kernel performed favorably; however, the margin of difference was considerably higher on the problem of phosphorylation site prediction. While there is data that phosphorylation sites are preferentially positioned in intrinsically disordered regions, we provide evidence that for the sites that are located in structured regions, neither the surface accessibility alone nor the averaged measures calculated from the residue microenvironments utilized by FEATURE were sufficient to achieve high accuracy. The key benefit of the graphlet representation is its ability to capture neighborhood similarities in protein structures via enumerating the patterns of local connectivity in the corresponding labeled graphs.
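
    The core idea, representing each vertex by counts of labeled subgraphs centered on it and comparing two vertices by the inner product of those count vectors, can be sketched as follows. This is a deliberately reduced illustration, not the published method: it counts only 2-node graphlets and 3-node graphlets in which the vertex of interest is the middle node, and all names (`rooted_graphlet_counts`, `graphlet_kernel`, the toy graph and residue labels) are hypothetical.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, Hashable, Set

# Undirected contact graph stored as adjacency sets: residue id -> neighboring residue ids.
Graph = Dict[Hashable, Set[Hashable]]


def rooted_graphlet_counts(graph: Graph, labels: Dict[Hashable, str], root: Hashable) -> Counter:
    """Count small labeled graphlets centered on `root` (simplified subset)."""
    counts: Counter = Counter()
    neighbors = graph[root]
    # 2-node graphlets: the root joined to one labeled neighbor.
    for n in neighbors:
        counts[("edge", labels[root], labels[n])] += 1
    # 3-node graphlets with the root in the middle: open path or closed triangle.
    for a, b in combinations(neighbors, 2):
        shape = "triangle" if b in graph[a] else "path"
        pair = tuple(sorted((labels[a], labels[b])))  # neighbor order must not matter
        counts[(shape, labels[root], pair)] += 1
    return counts


def graphlet_kernel(c1: Counter, c2: Counter) -> int:
    """Inner product of two sparse count vectors."""
    return sum(value * c2[key] for key, value in c1.items())


# Toy contact graph: residues 0..3 with one-letter amino acid labels.
toy_graph: Graph = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
toy_labels = {0: "S", 1: "A", 2: "G", 3: "A"}
counts_0 = rooted_graphlet_counts(toy_graph, toy_labels, 0)
```

    The resulting kernel values can be passed to any kernel-based classifier; the sparse `Counter` representation avoids enumerating graphlet types that never occur.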

  16. Prediction of interface structures and energies via virtual screening

    PubMed Central

    Kiyohara, Shin; Oda, Hiromi; Miyata, Tomohiro; Mizoguchi, Teruyasu

    2016-01-01

    Interfaces markedly affect the properties of materials because of differences in their atomic configurations. Determining the atomic structure of the interface is therefore one of the most significant tasks in materials research. However, determining the interface structure usually requires extensive computation. If the interface structure could be efficiently predicted, our understanding of the mechanisms that give rise to the interface properties would be significantly facilitated, and this would pave the way for the design of material interfaces. Using a virtual screening method based on machine learning, we demonstrate a powerful technique to determine interface energies and structures. On the basis of the results obtained by a nonlinear regression using training data from 4 interfaces, structures and energies for 13 other interfaces were predicted. Our method achieved an efficiency that is several hundred to several tens of thousands of times higher than that of the previously reported methods. Because the present method uses geometrical factors, such as bond length and atomic density, as descriptors for the regression analysis, the method presented here is robust and general and is expected to be beneficial to understanding the nature of any interface. PMID:28138517

  17. Prediction of interface structures and energies via virtual screening.

    PubMed

    Kiyohara, Shin; Oda, Hiromi; Miyata, Tomohiro; Mizoguchi, Teruyasu

    2016-11-01

    Interfaces markedly affect the properties of materials because of differences in their atomic configurations. Determining the atomic structure of the interface is therefore one of the most significant tasks in materials research. However, determining the interface structure usually requires extensive computation. If the interface structure could be efficiently predicted, our understanding of the mechanisms that give rise to the interface properties would be significantly facilitated, and this would pave the way for the design of material interfaces. Using a virtual screening method based on machine learning, we demonstrate a powerful technique to determine interface energies and structures. On the basis of the results obtained by a nonlinear regression using training data from 4 interfaces, structures and energies for 13 other interfaces were predicted. Our method achieved an efficiency that is several hundred to several tens of thousands of times higher than that of the previously reported methods. Because the present method uses geometrical factors, such as bond length and atomic density, as descriptors for the regression analysis, the method presented here is robust and general and is expected to be beneficial to understanding the nature of any interface.

  18. Nanoporous carbon structures based on C20

    NASA Astrophysics Data System (ADS)

    Vehviläinen, T. T.; Ganchenkova, M. G.; Nieminen, R. M.

    2011-09-01

    In this paper, we present computational results for C20 based solids. We propose structures that are shown to be energetically more favorable and stable than previously suggested structures. The so-called quasigraphite phase and base-centered-monoclinic type structures are found to be the energetically most favorable. The molecular-dynamics stability of suggested structures was studied via constant-temperature and constant-pressure techniques and by examining phonon dispersion curves. All the predicted structures demonstrate high stability with respect to temperature and external load. By changing the geometry, the electronic properties can be varied from metallic to insulating.

  19. Improved hybrid optimization algorithm for 3D protein structure prediction.

    PubMed

    Zhou, Changjun; Hou, Caixia; Wei, Xiaopeng; Zhang, Qiang

    2014-07-01

    A new improved hybrid optimization algorithm, the PGATS algorithm, which is based on the toy off-lattice model, is presented for three-dimensional protein structure prediction problems. The algorithm combines the particle swarm optimization (PSO), genetic algorithm (GA), and tabu search (TS) algorithms. In addition, we adopt several improvement strategies. A stochastic disturbance factor is added to the particle swarm optimization to improve its search ability; the crossover and mutation operations of the genetic algorithm are changed to a kind of random linear method; finally, the tabu search algorithm is improved by appending a mutation operator. Through the combination of a variety of strategies and algorithms, protein structure prediction (PSP) in a 3D off-lattice model is achieved. The PSP problem is NP-hard, but it can be treated as a global optimization problem with multiple extrema and multiple parameters. This is the theoretical principle of the hybrid optimization algorithm proposed in this paper. The algorithm combines local search and global search, which overcomes the shortcomings of a single algorithm and gives full play to the advantages of each algorithm. The algorithm is verified on the current universal standard sequences, namely Fibonacci sequences and real protein sequences. Experiments show that the proposed method outperforms single algorithms in the accuracy of the calculated protein sequence energy value, and that it is an effective way to predict the structure of proteins.

  20. Residual Strength Prediction of Fuselage Structures with Multiple Site Damage

    NASA Technical Reports Server (NTRS)

    Chen, Chuin-Shan; Wawrzynek, Paul A.; Ingraffea, Anthony R.

    1999-01-01

    This paper summarizes recent results on simulating full-scale pressure tests of wide body, lap-jointed fuselage panels with multiple site damage (MSD). The crack tip opening angle (CTOA) fracture criterion and the FRANC3D/STAGS software program were used to analyze stable crack growth under conditions of general yielding. The link-up of multiple cracks and residual strength of damaged structures were predicted. Elastic-plastic finite element analysis based on the von Mises yield criterion and incremental flow theory with small strain assumption was used. A global-local modeling procedure was employed in the numerical analyses. Stress distributions from the numerical simulations are compared with strain gage measurements. Analysis results show that accurate representation of the load transfer through the rivets is crucial for the model to predict the stress distribution accurately. Predicted crack growth and residual strength are compared with test data. Observed and predicted results both indicate that the occurrence of small MSD cracks substantially reduces the residual strength. Modeling fatigue closure is essential to capture the fracture behavior during the early stable crack growth. Breakage of a tear strap can have a major influence on residual strength prediction.

  1. Brain structure predicts risk for obesity

    PubMed Central

    Smucny, Jason; Cornier, Marc-Andre; Eichman, Lindsay C.; Thomas, Elizabeth A.; Bechtell, Jamie L.; Tregellas, Jason R.

    2014-01-01

    The neurobiology of obesity is poorly understood. Here we report findings of a study designed to examine the differences in brain regional gray matter volume in adults recruited as either Obese Prone or Obese Resistant based on self-identification, body mass index, and personal/family weight history. Magnetic resonance imaging was performed in 28 Obese Prone (14 male, 14 female) and 25 Obese Resistant (13 male, 12 female) healthy adults. Voxel-based morphometry was used to identify gray matter volume differences between groups. Gray matter volume was found to be lower in the insula, medial orbitofrontal cortex and cerebellum in Obese Prone, as compared to Obese Resistant individuals. Adjusting for body fat mass did not impact these results. Insula gray matter volume was negatively correlated with leptin concentration and measures of hunger. These findings suggest that individuals at risk for weight gain have structural differences in brain regions known to be important in energy intake regulation, and that these differences, particularly in the insula, may be related to leptin. PMID:22963736

  2. Omics-based hybrid prediction in maize.

    PubMed

    Westhues, Matthias; Schrag, Tobias A; Heuer, Claas; Thaller, Georg; Utz, H Friedrich; Schipprack, Wolfgang; Thiemann, Alexander; Seifert, Felix; Ehret, Anita; Schlereth, Armin; Stitt, Mark; Nikoloski, Zoran; Willmitzer, Lothar; Schön, Chris C; Scholten, Stefan; Melchinger, Albrecht E

    2017-06-24

    Complementing genomic data with other "omics" predictors can increase the probability of success for predicting the best hybrid combinations using complex agronomic traits. Accurate prediction of traits with complex genetic architecture is crucial for selecting superior candidates in animal and plant breeding and for guiding decisions in personalized medicine. Whole-genome prediction has revolutionized these areas but has inherent limitations in incorporating intricate epistatic interactions. Downstream "omics" data are expected to integrate interactions within and between different biological strata and provide the opportunity to improve trait prediction. Yet, predicting traits from parents to progeny has not been addressed by a combination of "omics" data. Here, we evaluate several "omics" predictors (genomic, transcriptomic and metabolic data) measured on parent lines at early developmental stages and demonstrate that the integration of transcriptomic with genomic data leads to higher success rates in the correct prediction of untested hybrid combinations in maize. Despite the high predictive ability of genomic data, transcriptomic data alone outperformed them and other predictors for the most complex heterotic trait, dry matter yield. An eQTL analysis revealed that transcriptomic data integrate genomic information from both adjacent and distant sites relative to the expressed genes. Together, these findings suggest that downstream predictors capture physiological epistasis that is transmitted from parents to their hybrid offspring. We conclude that the use of downstream "omics" data in prediction can exploit important information beyond structural genomics for leveraging the efficiency of hybrid breeding.

  3. A comprehensive comparison of comparative RNA structure prediction approaches

    PubMed Central

    Gardner, Paul P; Giegerich, Robert

    2004-01-01

    Background: An increasing number of researchers have released novel RNA structure analysis and prediction algorithms for comparative approaches to structure prediction. Yet, independent benchmarking of these algorithms is rarely performed as is now common practice for protein-folding, gene-finding and multiple-sequence-alignment algorithms. Results: Here we evaluate a number of RNA folding algorithms using reliable RNA data-sets and compare their relative performance. Conclusions: We conclude that comparative data can enhance structure prediction but structure-prediction-algorithms vary widely in terms of both sensitivity and selectivity across different lengths and homologies. Furthermore, we outline some directions for future research. PMID:15458580
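
    Sensitivity and selectivity of a predicted secondary structure can be computed by comparing its base pairs with those of a reference structure. The sketch below uses the plain definitions (true positives over reference pairs, and true positives over predicted pairs); the benchmark itself may define selectivity with additional corrections, so treat the function `base_pair_metrics` and the example pairs as illustrative assumptions.

```python
from typing import Iterable, Set, Tuple

BasePair = Tuple[int, int]


def _normalize(pairs: Iterable[BasePair]) -> Set[BasePair]:
    """Store each pair as (smaller index, larger index) so order does not matter."""
    return {(min(i, j), max(i, j)) for i, j in pairs}


def base_pair_metrics(
    predicted: Iterable[BasePair], reference: Iterable[BasePair]
) -> Tuple[float, float]:
    """Return (sensitivity, selectivity) for predicted versus reference base pairs."""
    pred = _normalize(predicted)
    ref = _normalize(reference)
    true_positives = len(pred & ref)
    sensitivity = true_positives / len(ref) if ref else 0.0
    selectivity = true_positives / len(pred) if pred else 0.0
    return sensitivity, selectivity


# Hypothetical structures given as lists of paired sequence positions.
example_sensitivity, example_selectivity = base_pair_metrics(
    predicted=[(1, 10), (9, 2), (3, 8), (4, 7)],
    reference=[(1, 10), (2, 9), (5, 6)],
)
```

    Reporting both numbers matters because an algorithm that predicts very few pairs can score high selectivity while missing most of the reference structure.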

  4. Structure Prediction and Validation of the ERK8 Kinase Domain

    PubMed Central

    Strambi, Angela; Mori, Mattia; Rossi, Matteo; Colecchia, David; Manetti, Fabrizio; Carlomagno, Francesca; Botta, Maurizio; Chiariello, Mario

    2013-01-01

    Extracellular signal-regulated kinase 8 (ERK8) has been already implicated in cell transformation and in the protection of genomic integrity and, therefore, proposed as a novel potential therapeutic target for cancer. In the absence of a crystal structure, we developed a three-dimensional model for its kinase domain. To validate our model we applied a structure-based virtual screening protocol consisting of pharmacophore screening and molecular docking. Experimental characterization of the hit compounds confirmed that a high percentage of the identified scaffolds was able to inhibit ERK8. We also confirmed an ATP competitive mechanism of action for the two best-performing molecules. Ultimately, we identified an ERK8 drug-resistant “gatekeeper” mutant that corroborated the predicted molecular binding mode, confirming the reliability of the generated structure. We expect that our model will be a valuable tool for the development of specific ERK8 kinase inhibitors. PMID:23326322

  5. Development of advanced structural analysis methodologies for predicting widespread fatigue damage in aircraft structures

    NASA Technical Reports Server (NTRS)

    Harris, Charles E.; Starnes, James H., Jr.; Newman, James C., Jr.

    1995-01-01

    NASA is developing a 'tool box' that includes a number of advanced structural analysis computer codes which, taken together, represent the comprehensive fracture mechanics capability required to predict the onset of widespread fatigue damage. These structural analysis tools have complementary and specialized capabilities ranging from a finite-element-based stress-analysis code for two- and three-dimensional built-up structures with cracks to a fatigue and fracture analysis code that uses stress-intensity factors and material-property data found in 'look-up' tables or from equations. NASA is conducting critical experiments necessary to verify the predictive capabilities of the codes, and these tests represent a first step in the technology-validation and industry-acceptance processes. NASA has established cooperative programs with aircraft manufacturers to facilitate the comprehensive transfer of this technology by making these advanced structural analysis codes available to industry.

  6. 3D Protein structure prediction with genetic tabu search algorithm

    PubMed Central

    2010-01-01

    Background: Protein structure prediction (PSP) has important applications in different fields, such as drug design, disease prediction, and so on. In protein structure prediction, there are two important issues. The first one is the design of the structure model and the second one is the design of the optimization technology. Because of the complexity of the realistic protein structure, the structure model adopted in this paper is a simplified model, which is called off-lattice AB model. After the structure model is assumed, optimization technology is needed for searching the best conformation of a protein sequence based on the assumed structure model. However, PSP is an NP-hard problem even if the simplest model is assumed. Thus, many algorithms have been developed to solve the global optimization problem. In this paper, a hybrid algorithm, which combines genetic algorithm (GA) and tabu search (TS) algorithm, is developed to complete this task. Results: In order to develop an efficient optimization algorithm, several improved strategies are developed for the proposed genetic tabu search algorithm. The combined use of these strategies can improve the efficiency of the algorithm. In these strategies, tabu search introduced into the crossover and mutation operators can improve the local search capability, the adoption of variable population size strategy can maintain the diversity of the population, and the ranking selection strategy can improve the possibility of an individual with low energy value entering into next generation. Experiments are performed with Fibonacci sequences and real protein sequences. Experimental results show that the lowest energy obtained by the proposed GATS algorithm is lower than that obtained by previous methods. Conclusions: The hybrid algorithm has the advantages from both genetic algorithm and tabu search algorithm. It makes use of the advantage of multiple search points in genetic algorithm, and can overcome poor hill

  7. 3D protein structure prediction with genetic tabu search algorithm.

    PubMed

    Zhang, Xiaolong; Wang, Ting; Luo, Huiping; Yang, Jack Y; Deng, Youping; Tang, Jinshan; Yang, Mary Qu

    2010-05-28

    Protein structure prediction (PSP) has important applications in different fields, such as drug design, disease prediction, and so on. In protein structure prediction, there are two important issues. The first one is the design of the structure model and the second one is the design of the optimization technology. Because of the complexity of the realistic protein structure, the structure model adopted in this paper is a simplified model, which is called off-lattice AB model. After the structure model is assumed, optimization technology is needed for searching the best conformation of a protein sequence based on the assumed structure model. However, PSP is an NP-hard problem even if the simplest model is assumed. Thus, many algorithms have been developed to solve the global optimization problem. In this paper, a hybrid algorithm, which combines genetic algorithm (GA) and tabu search (TS) algorithm, is developed to complete this task. In order to develop an efficient optimization algorithm, several improved strategies are developed for the proposed genetic tabu search algorithm. The combined use of these strategies can improve the efficiency of the algorithm. In these strategies, tabu search introduced into the crossover and mutation operators can improve the local search capability, the adoption of variable population size strategy can maintain the diversity of the population, and the ranking selection strategy can improve the possibility of an individual with low energy value entering into next generation. Experiments are performed with Fibonacci sequences and real protein sequences. Experimental results show that the lowest energy obtained by the proposed GATS algorithm is lower than that obtained by previous methods. The hybrid algorithm has the advantages from both genetic algorithm and tabu search algorithm. It makes use of the advantage of multiple search points in genetic algorithm, and can overcome poor hill-climbing capability in the conventional genetic
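
    The abstract does not state the energy function, so the following is only a minimal sketch of the objective such a search would minimize. It assumes the commonly cited Stillinger-type off-lattice AB energy (a bending term plus a species-dependent Lennard-Jones term) and the usual Fibonacci benchmark construction S0 = A, S1 = B, S(i+1) = S(i-1) + S(i). The function names `fibonacci_sequence`, `coupling`, and `ab_energy` are hypothetical, and the paper's exact formulation may differ.

```python
import math


def fibonacci_sequence(k):
    """Return the k-th Fibonacci AB string: S0 = 'A', S1 = 'B', S(i+1) = S(i-1) + S(i)."""
    prev, curr = "A", "B"
    if k == 0:
        return prev
    for _ in range(k - 1):
        prev, curr = curr, prev + curr
    return curr


def coupling(a, b):
    """Species-dependent attraction: AA = 1, BB = 0.5, AB = -0.5 (assumed Stillinger form)."""
    if a == b:
        return 1.0 if a == "A" else 0.5
    return -0.5


def ab_energy(coords, sequence):
    """Energy of a chain conformation under the assumed AB model.

    coords is a list of (x, y, z) tuples, one per residue; the model
    assumes consecutive residues are one unit apart.
    """
    n = len(sequence)
    energy = 0.0
    # Bending term: 0.25 * (1 - cos(theta)) at every interior residue,
    # where theta = 0 means the two adjacent bonds are collinear.
    for i in range(1, n - 1):
        u = [coords[i][k] - coords[i - 1][k] for k in range(3)]
        v = [coords[i + 1][k] - coords[i][k] for k in range(3)]
        cos_theta = sum(a * b for a, b in zip(u, v)) / (math.hypot(*u) * math.hypot(*v))
        energy += 0.25 * (1.0 - cos_theta)
    # Lennard-Jones-like term over all pairs that are not chain neighbours.
    for i in range(n - 2):
        for j in range(i + 2, n):
            r = math.dist(coords[i], coords[j])
            energy += 4.0 * (r ** -12 - coupling(sequence[i], sequence[j]) * r ** -6)
    return energy
```

    A genetic or tabu search step would propose new coordinates (or bond angles) and keep those that lower `ab_energy`; the search operators themselves are not sketched here.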

  8. Comparative modeling: the state of the art and protein drug target structure prediction.

    PubMed

    Liu, Tianyun; Tang, Grace W; Capriotti, Emidio

    2011-07-01

    The goal of computational protein structure prediction is to provide three-dimensional (3D) structures with resolution comparable to experimental results. Comparative modeling, which predicts the 3D structure of a protein based on its sequence similarity to homologous structures, is the most accurate computational method for structure prediction. In the last two decades, significant progress has been made on comparative modeling methods. Using the large number of protein structures deposited in the Protein Data Bank (~65,000), automatic prediction pipelines are generating a tremendous number of models (~1.9 million) for sequences whose structures have not been experimentally determined. Accurate models are suitable for a wide range of applications, such as prediction of protein binding sites, prediction of the effect of protein mutations, and structure-guided virtual screening. In particular, comparative modeling has enabled structure-based drug design against protein targets with unknown structures. In this review, we describe the theoretical basis of comparative modeling, the available automatic methods and databases, and the algorithms to evaluate the accuracy of predicted structures. Finally, we discuss relevant applications in the prediction of important drug target proteins, focusing on the G protein-coupled receptor (GPCR) and protein kinase families.

  9. Ichthyophonus parasite phylogeny based on ITS rDNA structure prediction and alignment identifies six clades, with a single dominant marine type

    USGS Publications Warehouse

    Gregg, Jacob; Thompson, Rachel L.; Purcell, Maureen; Friedman, Carolyn S.; Hershberger, Paul

    2016-01-01

    Despite their widespread, global impact in both wild and cultured fishes, little is known of the diversity, transmission patterns, and phylogeography of parasites generally identified as Ichthyophonus. This study constructed a phylogeny based on the structural alignment of internal transcribed spacer (ITS) rDNA sequences to compare Ichthyophonus isolates from fish hosts in the Atlantic and Pacific oceans, and several rivers and aquaculture sites in North America, Europe, and Japan. Structure of the Ichthyophonus ITS1–5.8S–ITS2 transcript exhibited several homologies with other eukaryotes, and 6 distinct clades were identified within Ichthyophonus. A single clade contained a majority (71 of 98) of parasite isolations. This ubiquitous Ichthyophonus type occurred in 13 marine and anadromous hosts and was associated with epizootics in Atlantic herring, Chinook salmon, and American shad. A second clade contained all isolates from aquaculture, despite great geographic separation of the freshwater hosts. Each of the 4 remaining clades contained isolates from single host species. This study is the first to evaluate the genetic relationships among Ichthyophonus species across a significant portion of their host and geographic range. Additionally, parasite infection prevalence is reported in 16 fish species.

  10. Evaluation of the information content of RNA structure mapping data for secondary structure prediction.

    PubMed

    Quarrier, Scott; Martin, Joshua S; Davis-Neulander, Lauren; Beauregard, Arthur; Laederach, Alain

    2010-06-01

    Structure mapping experiments (using probes such as dimethyl sulfate [DMS], kethoxal, and T1 and V1 RNases) are used to determine the secondary structures of RNA molecules. The process is iterative, combining the results of several probes with constrained minimum free-energy calculations to produce a model of the structure. We aim to evaluate whether particular probes provide more structural information, and specifically, how noise in the data affects the predictions. Our approach involves generating "decoy" RNA structures (using the sFold Boltzmann sampling procedure) and evaluating whether we are able to identify the correct structure from this ensemble of structures. We show that with perfect information, we are always able to identify the optimal structure for five RNAs of known structure. We then collected orthogonal structure mapping data (DMS and RNase T1 digest) under several solution conditions using our high-throughput capillary automated footprinting analysis (CAFA) technique on two group I introns of known structure. Analysis of these data reveals the error rates in the data under optimal (low salt) and suboptimal solution conditions (high MgCl2). We show that despite these errors, our computational approach is less sensitive to experimental noise than traditional constraint-based structure prediction algorithms. Finally, we propose a novel approach for visualizing the interaction of chemical and enzymatic mapping data with RNA structure. We project the data onto the first two dimensions of a multidimensional scaling of the sFold-generated decoy structures. We are able to directly visualize the structural information content of structure mapping data and reconcile multiple data sets.

  11. Protein short loop prediction in terms of a structural alphabet.

    PubMed

    Tyagi, Manoj; Bornot, Aurélie; Offmann, Bernard; de Brevern, Alexandre G

    2009-08-01

    Loops connect regular secondary structures. In many instances, they are known to play crucial biological roles. To bypass the limitation of secondary structure description, we previously defined a structural alphabet composed of 16 structural prototypes, called Protein Blocks (PBs). It leads to an accurate description of every region of 3D protein backbones and has been used in local structure prediction. In the present study, we used our structural alphabet to predict the loops connecting two repetitive structures. We show the value of taking the flanking regions into account, which improves the prediction rate by up to 19.8%, but we also underline the sensitivity of such an approach. This research can be used to propose different structures for the loops and to probe and sample their flexibility. It is a useful tool for ab initio loop prediction and leads to insights into flexible docking approaches.

  12. Prediction of common folding structures of homologous RNAs.

    PubMed Central

    Han, K; Kim, H J

    1993-01-01

    We have developed an algorithm and a computer program for simultaneously folding homologous RNA sequences. Given an alignment of M homologous sequences of length N, the program performs phylogenetic comparative analysis and predicts a common secondary structure conserved in the sequences. When the structure is not uniquely determined, it infers multiple structures which appear most plausible. This method is superior to energy minimization methods in the sense that it is not sensitive to point mutation of a sequence. It is also superior to usual phylogenetic comparative methods in that it does not require manual scrutiny for covariation or secondary structures. The most plausible 1-5 structures are produced in O(MN^2 + N^3) time and O(N^2) space, which are the same requirements as those of widely used dynamic programs based on energy minimization for folding a single sequence. This is the first algorithm probably practical both in terms of time and space for finding secondary structures of homologous RNA sequences. The algorithm has been implemented in C on a Sun SparcStation, and has been verified by testing on tRNAs, 5S rRNAs, 16S rRNAs, TAR RNAs of human immunodeficiency virus type 1 (HIV-1), and RRE RNAs of HIV-1. We have also applied the program to cis-acting packaging sequences of HIV-1, for which no generally accepted structures yet exist, and propose potentially stable structures. Simulation of the program with random sequences with the same base composition and the same degree of similarity as the above sequences shows that structures common to homologous sequences are very unlikely to occur by chance in random sequences. PMID:7681944

  13. On the significance of an RNA tertiary structure prediction

    PubMed Central

    Hajdin, Christine E.; Ding, Feng; Dokholyan, Nikolay V.; Weeks, Kevin M.

    2010-01-01

    Tertiary structure prediction is important for understanding structure–function relationships for RNAs whose structures are unknown and for characterizing RNA states recalcitrant to direct analysis. However, it is unknown what root-mean-square deviation (RMSD) corresponds to a statistically significant RNA tertiary structure prediction. We use discrete molecular dynamics to generate RNA-like folds for structures up to 161 nucleotides (nt) that have complex tertiary interactions and then determine the RMSD distribution between these decoys. These distributions are Gaussian-like. The mean RMSD increases with RNA length and is smaller if secondary structure constraints are imposed while generating decoys. The compactness of RNA molecules with true tertiary folds is intermediate between closely packed spheres and a freely jointed chain. We use this scaling relationship to define an expression relating RMSD with the confidence that a structure prediction is better than that expected by chance. This is the prediction significance, and corresponds to a P-value. For a 100-nt RNA, the RMSD of predicted structures should be within 25 Å of the accepted structure to reach the P ≤ 0.01 level if the secondary structure is predicted de novo and within 14 Å if secondary structure information is used as a constraint. This significance approach should be useful for evaluating diverse RNA structure prediction and molecular modeling algorithms. PMID:20498460

  14. Predicting crystal structure by merging data mining with quantum mechanics.

    PubMed

    Fischer, Christopher C; Tibbetts, Kevin J; Morgan, Dane; Ceder, Gerbrand

    2006-08-01

    Modern methods of quantum mechanics have proved to be effective tools to understand and even predict materials properties. An essential element of the materials design process, relevant to both new materials and the optimization of existing ones, is knowing which crystal structures will form in an alloy system. Crystal structure can only be predicted effectively with quantum mechanics if an algorithm to direct the search through the large space of possible structures is found. We present a new approach to the prediction of structure that rigorously mines correlations embodied within experimental data and uses them to direct quantum mechanical techniques efficiently towards the stable crystal structure of materials.

  15. Link prediction based on local community properties

    NASA Astrophysics Data System (ADS)

    Yang, Xu-Hua; Zhang, Hai-Feng; Ling, Fei; Cheng, Zhi; Weng, Guo-Qing; Huang, Yu-Jiao

    2016-09-01

    The link prediction algorithm is one of the key technologies to reveal the inherent rule of network evolution. This paper proposes a novel link prediction algorithm based on the properties of the local community, which is composed of the common neighbor nodes of any two nodes in the network and the links between these nodes. By referring to the node degree and the condition of assortativity or disassortativity in a network, we comprehensively consider the effect of the shortest path and edge clustering coefficient within the local community on node similarity. We numerically show that the proposed method provides good link prediction results.
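
    The local community defined above (the common neighbors of two nodes plus the links among those neighbors) can be extracted as in the following minimal sketch. It assumes an undirected simple graph stored as a dict of neighbor sets. The function name `local_community` is hypothetical, and the paper's similarity score built on top of this structure is not reproduced here.

```python
def local_community(adjacency, x, y):
    """Return (common_neighbors, internal_links) for the node pair (x, y).

    adjacency maps each node to the set of its neighbours in an
    undirected simple graph (illustrative representation).
    """
    # Nodes adjacent to both x and y, excluding the pair itself.
    common = (adjacency[x] & adjacency[y]) - {x, y}
    # Undirected links whose two endpoints are both common neighbours.
    links = {
        frozenset((u, v))
        for u in common
        for v in adjacency[u]
        if v in common and v != u
    }
    return common, links


# Small illustrative graph: a and b share the neighbours c and d,
# and c-d is the only link inside their local community.
graph = {
    "a": {"c", "d", "e"},
    "b": {"c", "d"},
    "c": {"a", "b", "d"},
    "d": {"a", "b", "c"},
    "e": {"a"},
}
nodes, links = local_community(graph, "a", "b")
```

    The plain common-neighbors index for the pair is `len(nodes)`; the count `len(links)` is the extra information a local-community method can draw on.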

  16. The prediction of EEG signals using a feedback-structured adaptive rational function filter.

    PubMed

    Kim, H S; Kim, T S; Choi, Y H; Park, S H

    2000-08-01

    In this article, we present a feedback-structured adaptive rational function filter based on a recursive modified Gram-Schmidt algorithm and apply it to the prediction of an EEG signal that has nonlinear and nonstationary characteristics. For the evaluation of the prediction performance, the proposed filter is compared with other methods, where a single-step prediction and a multi-step prediction are considered for a short-term prediction, and the prediction performance is assessed in normalized mean square error. The experimental results show that the proposed filter shows better performance than other methods considered for the short-term prediction of EEG signals.

  17. Shadow-based SAR ATR performance prediction

    NASA Astrophysics Data System (ADS)

    Blacknell, D.

    2009-05-01

    The ability to assess potential automatic target recognition (ATR) performance for a given SAR system, target set and clutter environment is a key requirement for system procurement and mission planning. A cost-effective solution is to develop a theoretical model which can provide ATR performance predictions given a parameterisation of the system, targets and environment. In this paper, a classification scheme based on shadow information is analysed. Consideration of the statistical accuracy of shadow-based features allows ATR performance to be predicted. Quantitative comparisons of predicted performance with results obtained via simulation as well as against real data from the MSTAR data set are presented. It is seen that a reasonable level of agreement is obtained which gives confidence in extending the theoretical concepts to more complex feature-based ATR schemes.

  18. Simultaneous prediction of protein secondary structure and transmembrane spans.

    PubMed

    Leman, Julia Koehler; Mueller, Ralf; Karakas, Mert; Woetzel, Nils; Meiler, Jens

    2013-07-01

    Prediction of transmembrane spans and secondary structure from the protein sequence is generally the first step in the structural characterization of (membrane) proteins. Preference of a stretch of amino acids in a protein to form secondary structure and being placed in the membrane are correlated. Nevertheless, current methods predict either secondary structure or individual transmembrane states. We introduce a method that simultaneously predicts the secondary structure and transmembrane spans from the protein sequence. This approach not only eliminates the necessity to create a consensus prediction from possibly contradicting outputs of several predictors but bears the potential to predict conformational switches, i.e., sequence regions that have a high probability to change for example from a coil conformation in solution to an α-helical transmembrane state. An artificial neural network was trained on databases of 177 membrane proteins and 6048 soluble proteins. The output is a 3 × 3 dimensional probability matrix for each residue in the sequence that combines three secondary structure types (helix, strand, coil) and three environment types (membrane core, interface, solution). The prediction accuracies are 70.3% for nine possible states, 73.2% for three-state secondary structure prediction, and 94.8% for three-state transmembrane span prediction. These accuracies are comparable to state-of-the-art predictors of secondary structure (e.g., Psipred) or transmembrane placement (e.g., OCTOPUS). The method is available as web server and for download at www.meilerlab.org. Copyright © 2013 Wiley Periodicals, Inc.
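
    To illustrate how a per-residue 3 × 3 probability matrix can be read, the sketch below marginalizes it into three-state secondary structure and three-state environment probabilities. The row/column layout (rows are secondary structure types, columns are environment types), the example numbers, and the function names are assumptions made for illustration, not details taken from the method itself.

```python
SS_TYPES = ("helix", "strand", "coil")
ENV_TYPES = ("membrane core", "interface", "solution")


def marginals(matrix):
    """Sum a 3x3 joint probability matrix into per-type probabilities.

    Assumed layout: matrix[i][j] is the probability of secondary
    structure type SS_TYPES[i] in environment ENV_TYPES[j].
    """
    ss = {name: sum(matrix[i]) for i, name in enumerate(SS_TYPES)}
    env = {name: sum(row[j] for row in matrix) for j, name in enumerate(ENV_TYPES)}
    return ss, env


def most_likely_state(matrix):
    """Return the (secondary structure, environment) pair with the highest joint probability."""
    cells = [(i, j) for i in range(3) for j in range(3)]
    i, j = max(cells, key=lambda ij: matrix[ij[0]][ij[1]])
    return SS_TYPES[i], ENV_TYPES[j]


# Made-up probabilities for a single residue (they sum to 1).
residue = [
    [0.50, 0.10, 0.05],  # helix
    [0.02, 0.03, 0.05],  # strand
    [0.05, 0.05, 0.15],  # coil
]
ss_probs, env_probs = marginals(residue)
state = most_likely_state(residue)
```

    With this layout, a residue whose coil-in-solution and helix-in-membrane cells are both large would be a candidate for the conformational switches the abstract mentions.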

  19. Weight prediction in complex networks based on neighbor set

    NASA Astrophysics Data System (ADS)

    Zhu, Boyao; Xia, Yongxiang; Zhang, Xue-Jun

    2016-12-01

    Link weights are essential to network functionality, so weight prediction is important for understanding weighted networks given incomplete real-world data. In this work, we develop a novel method for weight prediction based on the local network structure, namely, the set of neighbors of each node. The performance of this method is validated in two cases. In the first case, some links are missing altogether along with their weights, while in the second case all links are known and weight information is missing for some links. Empirical experiments on real-world networks indicate that our method can provide accurate predictions of link weights in both cases.

  20. Weight prediction in complex networks based on neighbor set

    PubMed Central

    Zhu, Boyao; Xia, Yongxiang; Zhang, Xue-Jun

    2016-01-01

    Link weights are essential to network functionality, so weight prediction is important for understanding weighted networks given incomplete real-world data. In this work, we develop a novel method for weight prediction based on the local network structure, namely, the set of neighbors of each node. The performance of this method is validated in two cases. In the first case, some links are missing altogether along with their weights, while in the second case all links are known and weight information is missing for some links. Empirical experiments on real-world networks indicate that our method can provide accurate predictions of link weights in both cases. PMID:27905497

  1. Structure prediction for CASP7 targets using extensive all-atom refinement with Rosetta@home.

    PubMed

    Das, Rhiju; Qian, Bin; Raman, Srivatsan; Vernon, Robert; Thompson, James; Bradley, Philip; Khare, Sagar; Tyka, Michael D; Bhat, Divya; Chivian, Dylan; Kim, David E; Sheffler, William H; Malmström, Lars; Wollacott, Andrew M; Wang, Chu; Andre, Ingemar; Baker, David

    2007-01-01

    We describe predictions made using the Rosetta structure prediction methodology for both template-based modeling and free modeling categories in the Seventh Critical Assessment of Techniques for Protein Structure Prediction. For the first time, aggressive sampling and all-atom refinement could be carried out for the majority of targets, an advance enabled by the Rosetta@home distributed computing network. Template-based modeling predictions using an iterative refinement algorithm improved over the best existing templates for the majority of proteins with less than 200 residues. Free modeling methods gave near-atomic accuracy predictions for several targets under 100 residues from all secondary structure classes. These results indicate that refinement with an all-atom energy function, although computationally expensive, is a powerful method for obtaining accurate structure predictions.

  2. Structure prediction: Encoding evolution of porous solids

    NASA Astrophysics Data System (ADS)

    Mellot-Draznieks, Caroline; Cheetham, Anthony K.

    2017-01-01

    The design and prediction of network topology is challenging, even when the components' principal interactions are strong. Now, frameworks with relatively weak 'chiral recognition' between organic building blocks have been synthesized and rationalized in silico -- an important development in the reticular synthesis of molecular crystals.

  3. Predicting Career Advancement with Structural Equation Modelling

    ERIC Educational Resources Information Center

    Heimler, Ronald; Rosenberg, Stuart; Morote, Elsa-Sofia

    2012-01-01

    Purpose: The purpose of this paper is to use the authors' prior findings concerning basic employability skills in order to determine which skills best predict career advancement potential. Design/methodology/approach: Utilizing survey responses of human resource managers, the employability skills showing the largest relationships to career…

  5. Neural network definitions of highly predictable protein secondary structure classes

    SciTech Connect

    Lapedes, A.; Steeg, E.; Farber, R.

    1994-02-01

    We use two co-evolving neural networks to determine new classes of protein secondary structure which are significantly more predictable from local amino sequence than the conventional secondary structure classification. Accurate prediction of the conventional secondary structure classes: alpha helix, beta strand, and coil, from primary sequence has long been an important problem in computational molecular biology. Neural networks have been a popular method to attempt to predict these conventional secondary structure classes. Accuracy has been disappointingly low. The algorithm presented here uses neural networks to simultaneously examine both sequence and structure data, and to evolve new classes of secondary structure that can be predicted from sequence with significantly higher accuracy than the conventional classes. These new classes have both similarities to, and differences from, the conventional alpha helix, beta strand and coil.

  6. Factors Influencing Progressive Failure Analysis Predictions for Laminated Composite Structure

    NASA Technical Reports Server (NTRS)

    Knight, Norman F., Jr.

    2008-01-01

    Progressive failure material modeling methods used for structural analysis including failure initiation and material degradation are presented. Different failure initiation criteria and material degradation models are described that define progressive failure formulations. These progressive failure formulations are implemented in a user-defined material model for use with a nonlinear finite element analysis tool. The failure initiation criteria include the maximum stress criteria, maximum strain criteria, the Tsai-Wu failure polynomial, and the Hashin criteria. The material degradation model is based on the ply-discounting approach where the local material constitutive coefficients are degraded. Applications and extensions of the progressive failure analysis material model address two-dimensional plate and shell finite elements and three-dimensional solid finite elements. Implementation details are described in the present paper. Parametric studies for laminated composite structures are discussed to illustrate the features of the progressive failure modeling methods that have been implemented and to demonstrate their influence on progressive failure analysis predictions.
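
    Two of the failure initiation criteria named above can be written as scalar indices for a single ply under plane stress, as in the minimal sketch below. It uses the textbook forms of the maximum stress criterion and the Tsai-Wu polynomial (failure when the index reaches 1), with the common assumption F12 = -0.5 * sqrt(F11 * F22) for the interaction term. The strength values, the discount factor, and all names are hypothetical; the report's user-defined material model is not reproduced here.

```python
import math
from collections import namedtuple

# Ply strengths as positive magnitudes: longitudinal tension/compression,
# transverse tension/compression, and in-plane shear.
Strengths = namedtuple("Strengths", "Xt Xc Yt Yc S")


def max_stress_index(s1, s2, t12, st):
    """Largest stress-to-strength ratio; initiation is predicted at >= 1."""
    return max(
        s1 / st.Xt if s1 >= 0 else -s1 / st.Xc,
        s2 / st.Yt if s2 >= 0 else -s2 / st.Yc,
        abs(t12) / st.S,
    )


def tsai_wu_index(s1, s2, t12, st):
    """Tsai-Wu failure polynomial for plane stress; initiation is predicted at >= 1."""
    f1 = 1.0 / st.Xt - 1.0 / st.Xc
    f2 = 1.0 / st.Yt - 1.0 / st.Yc
    f11 = 1.0 / (st.Xt * st.Xc)
    f22 = 1.0 / (st.Yt * st.Yc)
    f66 = 1.0 / st.S ** 2
    f12 = -0.5 * math.sqrt(f11 * f22)  # assumed interaction coefficient
    return (f1 * s1 + f2 * s2 + f11 * s1 ** 2 + f22 * s2 ** 2
            + f66 * t12 ** 2 + 2.0 * f12 * s1 * s2)


def ply_discount(coefficients, factor=1e-3):
    """Scale every constitutive coefficient of a failed ply by a small factor (hypothetical value)."""
    return {name: value * factor for name, value in coefficients.items()}


# Hypothetical carbon/epoxy-like strengths in MPa.
ply = Strengths(Xt=1500.0, Xc=1200.0, Yt=50.0, Yc=200.0, S=70.0)
```

    In a progressive failure loop, an index of 1 or more at a material point would trigger `ply_discount` (or a more selective degradation rule) before the next load increment.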

  7. Discriminative structural approaches for enzyme active-site prediction

    PubMed Central

    2011-01-01

    Background: Predicting enzyme active-sites in proteins is an important issue not only for protein sciences but also for a variety of practical applications such as drug design. Because enzyme reaction mechanisms are based on the local structures of enzyme active-sites, various template-based methods that compare local structures in proteins have been developed to date. In comparing such local sites, a simple measurement, RMSD, has been used so far. Results: This paper introduces new machine learning algorithms that refine the similarity/deviation for comparison of local structures. The similarity/deviation is applied to two types of applications, single template analysis and multiple template analysis. In the single template analysis, a single template is used as a query to search proteins for active sites, whereas a protein structure is examined as a query to discover the possible active-sites using a set of templates in the multiple template analysis. Conclusions: This paper experimentally illustrates that the machine learning algorithms effectively improve the similarity/deviation measurements for both the analyses. PMID:21342581

  8. Fatigue Prediction for Composite Materials and Structures

    DTIC Science & Technology

    2005-10-01

    Eugenio OÑATE CIMNE (International Center for Numerical Methods in Engineering) Building C-1, Campus Nord UPC -C/ Gran Capitán s/n 08034 Barcelona...SPAIN * salomon@cimne.upc.edu ABSTRACT The objective of this paper is to present a new computational methodology for predicting the durability of... methodology is validated using experimental data from tests on CFRR composite material samples. 1.0 INTRODUCTION Fatigue is defined as "the process

  9. Protein structure prediction from sequence variation

    PubMed Central

    Marks, Debora S; Hopf, Thomas A; Sander, Chris

    2015-01-01

    Genomic sequences contain rich evolutionary information about functional constraints on macromolecules such as proteins. This information can be efficiently mined to detect evolutionary couplings between residues in proteins and address the long-standing challenge to compute protein three-dimensional structures from amino acid sequences. Substantial progress has recently been made on this problem owing to the explosive growth in available sequences and the application of global statistical methods. In addition to three-dimensional structure, the improved understanding of covariation may help identify functional residues involved in ligand binding, protein-complex formation and conformational changes. We expect computation of covariation patterns to complement experimental structural biology in elucidating the full spectrum of protein structures, their functional interactions and evolutionary dynamics. PMID:23138306

  10. Predicted novel hydrogen hydrate structures under pressure from first principles

    NASA Astrophysics Data System (ADS)

    Qian, Guangrui; Lyakhov, Andriy; Zhu, Qiang; Oganov, Artem; Dong, Xiao

    2014-03-01

    Gas hydrates are systems of prime importance. In particular, hydrogen hydrates are potential materials of icy satellites and comets, and may be used for hydrogen storage. We explore the H2O-H2 system at pressures in the range 0 to 100 GPa with ab initio variable-composition evolutionary simulations. According to our calculation and previous experiments, the H2O-H2 system undergoes a series of transformations with pressure, and adopts the known open-network clathrate structures (sII, C0), dense "filled ice" structures (C1, C2) and two novel hydrogen hydrate phases. One of these structures is based on the hexagonal ice framework and has the same H2O:H2 ratio (2:1) as the C0 phase at low pressures and similar enthalpy (we name this phase Ih-C0). The other newly predicted hydrate phase has a 1:2 H2O:H2 ratio and structure based on cubic ice. This phase (which we name C3) is predicted to be thermodynamically stable above 38 GPa when including van der Waals interactions and zero-point vibrational energy. This is the hydrogen-richest hydrate and this phase has the highest gravimetric densities (18 wt.%) of extractable hydrogen among all known materials. We thank the DARPA (Grants No. W31P4Q1310005 and No. W31P4Q1210008), National Science Foundation (EAR-1114313, DMR-1231586), AFOSR (FA9550-13-C-0037), DOE (DE-AC02-98CH10886), CRDF Global (UKE2-7034-KV-11) for financial support. We thank Purdue University Teragrid for providing computational resources and technical support for this work (Charge No.: TG-DMR110058).

  11. The sequential structure of brain activation predicts skill.

    PubMed

    Anderson, John R; Bothell, Daniel; Fincham, Jon M; Moon, Jungaa

    2016-01-29

    In an fMRI study, participants were trained to play a complex video game. They were scanned early and then again after substantial practice. While better players showed greater activation in one region (right dorsal striatum) their relative skill was better diagnosed by considering the sequential structure of whole brain activation. Using a cognitive model that played this game, we extracted a characterization of the mental states that are involved in playing a game and the statistical structure of the transitions among these states. There was a strong correspondence between this measure of sequential structure and the skill of different players. Using multi-voxel pattern analysis, it was possible to recognize, with relatively high accuracy, the cognitive states participants were in during particular scans. We used the sequential structure of these activation-recognized states to predict the skill of individual players. These findings indicate that important features about information-processing strategies can be identified from a model-based analysis of the sequential structure of brain activation. Copyright © 2015 Elsevier Ltd. All rights reserved.

  12. Predicting Learned Helplessness Based on Personality

    ERIC Educational Resources Information Center

    Maadikhah, Elham; Erfani, Nasrollah

    2014-01-01

    Learned helplessness as a negative motivational state can latently underlie repeated failures and create negative feelings toward the education as well as depression in students and other members of a society. The purpose of this paper is to predict learned helplessness based on students' personality traits. The research is a predictive…

  13. Optimization-based Dynamic Human Lifting Prediction

    DTIC Science & Technology

    2008-06-01

    Anith Mathai, Steve Beck, Timothy Marler, Jingzhou Yang, Jasbir S. Arora, Karim Abdel-Malek Virtual Soldier Research Program, Center for Computer Aided...Rahmatalla, S., Kim, J., Marler, T., Beck, S., Yang, J., busek, J., Arora, J.S., and Abdel-Malek, K. Optimization-based dynamic human walking prediction

  14. Revealing how network structure affects accuracy of link prediction

    NASA Astrophysics Data System (ADS)

    Yang, Jin-Xuan; Zhang, Xiao-Dong

    2017-08-01

    Link prediction plays an important role in network reconstruction and network evolution. How the network structure affects the accuracy of link prediction is an interesting problem. In this paper we use common neighbors and the Gini coefficient to reveal the relation between them, which can provide a good reference for the choice of a suitable link prediction algorithm according to the network structure. Moreover, the statistical analysis reveals correlation between the common neighbors index, Gini coefficient index and other indices to describe the network structure, such as Laplacian eigenvalues, clustering coefficient, degree heterogeneity, and assortativity of network. Furthermore, a new method to predict missing links is proposed. The experimental results show that the proposed algorithm yields better prediction accuracy and robustness to the network structure than currently used methods for a variety of real-world networks.

  15. IPknot: fast and accurate prediction of RNA secondary structures with pseudoknots using integer programming.

    PubMed

    Sato, Kengo; Kato, Yuki; Hamada, Michiaki; Akutsu, Tatsuya; Asai, Kiyoshi

    2011-07-01

    Pseudoknots found in secondary structures of a number of functional RNAs play various roles in biological processes. Recent methods for predicting RNA secondary structures cover certain classes of pseudoknotted structures, but only a few of them achieve satisfying predictions in terms of both speed and accuracy. We propose IPknot, a novel computational method for predicting RNA secondary structures with pseudoknots based on maximizing expected accuracy of a predicted structure. IPknot decomposes a pseudoknotted structure into a set of pseudoknot-free substructures and approximates a base-pairing probability distribution that considers pseudoknots, leading to the capability of modeling a wide class of pseudoknots and running quite fast. In addition, we propose a heuristic algorithm for refining base-pairing probabilities to improve the prediction accuracy of IPknot. The problem of maximizing expected accuracy is solved by using integer programming with threshold cut. We also extend IPknot so that it can predict the consensus secondary structure with pseudoknots when a multiple sequence alignment is given. IPknot is validated through extensive experiments on various datasets, showing that IPknot achieves better prediction accuracy and faster running time as compared with several competitive prediction methods. The program of IPknot is available at http://www.ncrna.org/software/ipknot/. IPknot is also available as a web server at http://rna.naist.jp/ipknot/. Contact: satoken@k.u-tokyo.ac.jp; ykato@is.naist.jp. Supplementary data are available at Bioinformatics online.

  16. I-TASSER: a unified platform for automated protein structure and function prediction.

    PubMed

    Roy, Ambrish; Kucukural, Alper; Zhang, Yang

    2010-04-01

    The iterative threading assembly refinement (I-TASSER) server is an integrated platform for automated protein structure and function prediction based on the sequence-to-structure-to-function paradigm. Starting from an amino acid sequence, I-TASSER first generates three-dimensional (3D) atomic models from multiple threading alignments and iterative structural assembly simulations. The function of the protein is then inferred by structurally matching the 3D models with other known proteins. The output from a typical server run contains full-length secondary and tertiary structure predictions, and functional annotations on ligand-binding sites, Enzyme Commission numbers and Gene Ontology terms. An estimate of the accuracy of the predictions is provided based on the confidence score of the modeling. This protocol provides new insights and guidelines for designing online server systems for state-of-the-art protein structure and function predictions. The server is available at http://zhanglab.ccmb.med.umich.edu/I-TASSER.

  17. Structural Prediction and Experimental Verification of Three New Ferroelectric Materials

    NASA Astrophysics Data System (ADS)

    Arbogast, D. J.; Foster, M. C.; Nielson, R. M.; Photinos, P. J.; Abrahams, S. C.

    1999-05-01

    Aminoguanidinium hexafluorozirconate, potassium niobyl silicate and fresnoite are typical members of a growing group of materials predicted to be new ferroelectrics by the application of structural criteria. Following prediction, undergraduate chemistry majors prepare and physics majors measure the dielectric properties of each. Experimental results will be presented for the three title materials in verification of their predicted property. In addition to the structural criteria on which the predictions depend, the circuits built locally and other instrumentation used for measuring ac and dc hysteresis, the thermal and frequency dependence of the dielectric permittivity, the thermal dependence of the spontaneous polarization and also the pyroelectric coefficient will be presented.

  18. Simultaneous prediction of RNA secondary structure and helix coaxial stacking.

    PubMed

    Shareghi, Pooya; Wang, Yingfeng; Malmberg, Russell; Cai, Liming

    2012-06-11

    RNA secondary structure plays a scaffolding role for RNA tertiary conformation. Accurate secondary structure prediction can not only identify double-stranded helices and single-stranded loops but also help provide information for potential tertiary interaction motifs critical to the 3D conformation. The average accuracy in ab initio prediction remains 70%; performance improvement has been limited to short RNA sequences. The prediction of tertiary interaction motifs is difficult without multiple, related sequences that are usually not available. This paper presents research that aims to improve the secondary structure prediction performance and to develop a capability to predict coaxial stacking between helices. Coaxial stacking positions two helices on the same axis, a tertiary motif present in almost all junctions that account for a high percentage of RNA tertiary structures. This research identified energetic rules for coaxial stacks and geometric constraints on stack combinations, which were applied to developing an efficient dynamic programming application for simultaneous prediction of secondary structure and coaxial stacking. Results on a number of non-coding RNA data sets, of short and moderately long lengths, show a performance improvement (especially on tRNAs) for secondary structure prediction when compared with existing methods. The program also demonstrates a capability for prediction of coaxial stacking. The significant leap of performance on tRNAs demonstrated in this work suggests that a breakthrough to a higher performance in RNA secondary structure prediction may lie in understanding contributions from tertiary motifs critical to the structure, as such information can be used to constrain geometrically as well as energetically the space of RNA secondary structure.

  19. Simultaneous prediction of RNA secondary structure and helix coaxial stacking

    PubMed Central

    2012-01-01

    Background: RNA secondary structure plays a scaffolding role for RNA tertiary conformation. Accurate secondary structure prediction can not only identify double-stranded helices and single-stranded loops but also help provide information for potential tertiary interaction motifs critical to the 3D conformation. The average accuracy in ab initio prediction remains 70%; performance improvement has been limited to short RNA sequences. The prediction of tertiary interaction motifs is difficult without multiple, related sequences that are usually not available. This paper presents research that aims to improve the secondary structure prediction performance and to develop a capability to predict coaxial stacking between helices. Coaxial stacking positions two helices on the same axis, a tertiary motif present in almost all junctions that account for a high percentage of RNA tertiary structures. Results: This research identified energetic rules for coaxial stacks and geometric constraints on stack combinations, which were applied to developing an efficient dynamic programming application for simultaneous prediction of secondary structure and coaxial stacking. Results on a number of non-coding RNA data sets, of short and moderately long lengths, show a performance improvement (especially on tRNAs) for secondary structure prediction when compared with existing methods. The program also demonstrates a capability for prediction of coaxial stacking. Conclusions: The significant leap of performance on tRNAs demonstrated in this work suggests that a breakthrough to a higher performance in RNA secondary structure prediction may lie in understanding contributions from tertiary motifs critical to the structure, as such information can be used to constrain geometrically as well as energetically the space of RNA secondary structure. PMID:22759616

  20. Prediction of rigid silica based insulation conductivity

    NASA Technical Reports Server (NTRS)

    Williams, Stanley D.; Curry, Donald M.

    1993-01-01

    A method is presented for predicting the thermal conductivity of low density, silica based fibrous insulators. It is shown that the method can be used to extend data values to the upper material temperature limits from those obtained from the test data. It is demonstrated that once the conductivity is accurately determined by the analytical model the conductivity for other atmospheres can be predicted. The method is similar to that presented by previous investigators, but differs significantly in the contribution due to gas and internal radiation.

  1. Information theory provides a comprehensive framework for the evaluation of protein structure predictions

    PubMed Central

    Swanson, Rosemarie; Vannucci, Marina; Tsai, Jerry W.

    2008-01-01

    Protein structure prediction has a number of important ad hoc similarity measures for evaluating predictions, but would benefit from a measure that is able to provide a common framework for a broad range of comparisons. Here we show that a mutual information-like measure can provide a comprehensive framework for evaluating protein structure prediction of all types. We discuss the concept of information, its application to secondary structure, and the obstacle to applying it to 3D structure. Based on insights from the secondary structure case, we present an approach to work around the 3D difficulties, and develop a method to measure the mutual information provided by a 3D structure prediction. We integrate the evaluation of all types of protein structure prediction into a single framework, and compare the amount of information provided by various prediction methods, including secondary structure prediction. Within this broadened framework, the idea that structure is better preserved than sequence during evolution is evaluated quantitatively for the globin family. A nearly perfect sequence match in the globin family corresponds to about 300 bits of information, whereas a nearly perfect structural match for the same two proteins corresponds to about 2500 bits of information, where bits of information describes the probability of obtaining a match of similar closeness by chance. Mutual information provides both a theoretical basis for evaluating structure similarity and an explanatory surround for existing similarity measures. PMID:18704942
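
For the secondary-structure case mentioned in this abstract, the basic idea can be sketched as the plain empirical mutual information between true and predicted per-residue states. This is an assumption-laden illustration: the paper describes a "mutual information-like" measure whose exact form is not given here, and the function name and the H/E/C state alphabet below are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information_bits(true_states, predicted_states):
    """Empirical mutual information (bits per residue) between two label strings."""
    if len(true_states) != len(predicted_states):
        raise ValueError("sequences must have the same length")
    n = len(true_states)
    if n == 0:
        return 0.0
    joint = Counter(zip(true_states, predicted_states))
    true_counts = Counter(true_states)
    pred_counts = Counter(predicted_states)
    mi = 0.0
    for (t, p), count in joint.items():
        # p(t, p) * log2( p(t, p) / (p(t) * p(p)) ), written with raw counts.
        mi += (count / n) * log2(count * n / (true_counts[t] * pred_counts[p]))
    return mi


truth = "HHEECC"  # H = helix, E = strand, C = coil (assumed alphabet)
perfect = mutual_information_bits(truth, "HHEECC")
uninformative = mutual_information_bits(truth, "CCCCCC")
```

A perfect prediction recovers the full entropy of the true states (log2 3 bits per residue in this toy case), while a constant prediction carries no information.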

  2. Computer analysis and structure prediction of nucleic acids and proteins.

    PubMed Central

    Kanehisa, M; Klein, P; Greif, P; DeLisi, C

    1984-01-01

    We have developed an integrated computer system for analysis of nucleic acid and protein sequences, which consists of sequence and structure databases, a relational database, and software for structural analysis. The system is potentially applicable to a number of problems in structural biology including predictive classification of the function and location of oncogene products. PMID:6546426

  3. Structure Prediction and Analysis of Neuraminidase Sequence Variants

    ERIC Educational Resources Information Center

    Thayer, Kelly M.

    2016-01-01

    Analyzing protein structure has become an integral aspect of understanding systems of biochemical import. The laboratory experiment endeavors to introduce protein folding to ascertain structures of proteins for which the structure is unavailable, as well as to critically evaluate the quality of the prediction obtained. The model system used is the…

  5. Practical theories for service life prediction of critical aerospace structural components

    NASA Technical Reports Server (NTRS)

    Ko, William L.; Monaghan, Richard C.; Jackson, Raymond H.

    1992-01-01

    A new second-order theory was developed for predicting the service lives of aerospace structural components. The predictions based on this new theory were compared with those based on the Ko first-order theory and the classical theory of service life predictions. The new theory gives very accurate service life predictions. An equivalent constant-amplitude stress cycle method was proposed for representing the random load spectrum for crack growth calculations. This method predicts the most conservative service life. The proposed use of minimum detectable crack size, instead of proof load established crack size as an initial crack size for crack growth calculations, could give a more realistic service life.

  6. Structure and stability prediction of compounds with evolutionary algorithms.

    PubMed

    Revard, Benjamin C; Tipton, William W; Hennig, Richard G

    2014-01-01

    Crystal structure prediction is a long-standing challenge in the physical sciences. In recent years, much practical success has been had by framing it as a global optimization problem, leveraging the existence of increasingly robust and accurate free energy calculations. This optimization problem has often been solved using evolutionary algorithms (EAs). However, many choices are possible when designing an EA for structure prediction, and innovation in the field is ongoing. We review the current state of evolutionary algorithms for crystal structure and composition prediction and discuss the details of methodological and algorithmic choices. Finally, we review the application of these algorithms to many systems of practical and fundamental scientific interest.

  7. The dominant role of side-chain backbone interactions in structural realization of amino acid code. ChiRotor: a side-chain prediction algorithm based on side-chain backbone interactions.

    PubMed

    Spassov, Velin Z; Yan, Lisa; Flook, Paul K

    2007-03-01

    The basic differences between the 20 natural amino acid residues are due to differences in their side-chain structures. This characteristic design of protein building blocks implies that side-chain-side-chain interactions play an important, even dominant role in 3D-structural realization of amino acid codes. Here we present the results of a comparative analysis of the contributions of side-chain-side-chain (s-s) and side-chain-backbone (s-b) interactions to the stabilization of folded protein structures within the framework of the CHARMm molecular data model. Contrary to intuition, our results suggest that side-chain-backbone interactions play the major role in side-chain packing, in stabilizing the folded structures, and in differentiating the folded structures from the unfolded or misfolded structures, while the interactions between side chains have a secondary effect. An additional analysis of electrostatic energies suggests that combinatorial dominance of the interactions between opposite charges makes the electrostatic interactions act as an unspecific folding force that stabilizes not only native structure, but also compact random conformations. This observation is in agreement with experimental findings that, in the denatured state, the charge-charge interactions stabilize more compact conformations. Taking advantage of the dominant role of side-chain-backbone interactions in side-chain packing to reduce the combinatorial problem, we developed a new algorithm, ChiRotor, for rapid prediction of side-chain conformations. We present the results of a validation study of the method based on a set of high resolution X-ray structures.

  8. Saliency-based gaze prediction based on head direction.

    PubMed

    Nakashima, Ryoichi; Fang, Yu; Hatori, Yasuhiro; Hiratani, Akinori; Matsumiya, Kazumichi; Kuriki, Ichiro; Shioiri, Satoshi

    2015-12-01

    Despite decades of attempts to create a model for predicting gaze locations by using saliency maps, a highly accurate gaze prediction model for general conditions has yet to be devised. In this study, we propose a gaze prediction method based on head direction that can improve the accuracy of any model. We used a probability distribution of eye position based on head direction (static eye-head coordination) and added this information to a model of saliency-based visual attention. Using empirical data on eye and head directions while observers were viewing natural scenes, we estimated a probability distribution of eye position. We then combined the relationship between eye position and head direction with visual saliency to predict gaze locations. The model showed that information on head direction improved the prediction accuracy. Further, there was no difference in the gaze prediction accuracy between the two models using information on head direction with and without eye-head coordination. Therefore, information on head direction is useful for predicting gaze location when it is available. Furthermore, this gaze prediction model can be applied relatively easily to many daily situations such as during walking. Copyright © 2015 Elsevier Ltd. All rights reserved.

  9. Plasma Stabilization Based on Model Predictive Control

    NASA Astrophysics Data System (ADS)

    Sotnikova, Margarita

    Nonlinear model predictive control algorithms for plasma current and shape stabilization are proposed. Such algorithms are well suited to situations in which the plant to be controlled has essentially nonlinear dynamics. In addition, model predictive control algorithms can take into account many requirements and constraints imposed on both the controlled and manipulated variables. A significant drawback of the algorithms is that they require a lot of time to compute the control input at each sampling instant. In this paper the model predictive control algorithms are demonstrated on the example of plasma vertical stabilization for the ITER-FEAT tokamak. The parameters of the algorithms are tuned in order to decrease the computational load.

  10. Cortical structure predicts success in performing musical transformation judgments.

    PubMed

    Foster, Nicholas E V; Zatorre, Robert J

    2010-10-15

    Recognizing melodies by their interval structure, or "relative pitch," is a fundamental aspect of musical perception. By using relative pitch, we are able to recognize tunes regardless of the key in which they are played. We sought to determine the cortical areas important for relative pitch processing using two morphometric techniques. Cortical differences have been reported in musicians within right auditory cortex (AC), a region considered important for pitch-based processing, and we have previously reported a functional correlation between relative pitch processing in the anterior intraparietal sulcus (IPS). We addressed the hypothesis that regional variation of cortical structure within AC and IPS is related to relative pitch ability using two anatomical techniques, cortical thickness (CT) analysis and voxel-based morphometry (VBM) of magnetic resonance imaging data. Persons with variable amounts of formal musical training were tested on a melody transposition task, as well as two musical control tasks and a speech control task. We found that gray matter concentration and cortical thickness in right Heschl's sulcus and bilateral IPS both predicted relative pitch task performance and correlated to a lesser extent with performance on the two musical control tasks. After factoring out variance explained by musical training, only relative pitch performance was predicted by cortical structure in these regions. These results directly demonstrate the functional relevance of previously reported anatomical differences in the auditory cortex of musicians. The findings in the IPS provide further support for the existence of a multimodal network for systematic transformation of stimulus information in this region. Copyright 2010 Elsevier Inc. All rights reserved.

  11. A set of nearest neighbor parameters for predicting the enthalpy change of RNA secondary structure formation

    PubMed Central

    Lu, Zhi John; Turner, Douglas H.; Mathews, David H.

    2006-01-01

    A complete set of nearest neighbor parameters to predict the enthalpy change of RNA secondary structure formation was derived. These parameters can be used with available free energy nearest neighbor parameters to extend the secondary structure prediction of RNA sequences to temperatures other than 37°C. The parameters were tested by predicting the secondary structures of sequences with known secondary structure that are from organisms with known optimal growth temperatures. Compared with the previous set of enthalpy nearest neighbor parameters, the sensitivity of base pair prediction improved from 65.2 to 68.9% at optimal growth temperatures ranging from 10 to 60°C. Base pair probabilities were predicted with a partition function and the positive predictive value of structure prediction is 90.4% when considering the base pairs in the lowest free energy structure with pairing probability of 0.99 or above. Moreover, a strong correlation is found between the predicted melting temperatures of RNA sequences and the optimal growth temperatures of the host organism. This indicates that organisms that live at higher temperatures have evolved RNA sequences with higher melting temperatures. PMID:16982646
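
The temperature extension described in this abstract rests on the standard relation ΔG°(T) = ΔH° − TΔS°, with the entropy change recovered from the 37°C free energy as ΔS° = (ΔH° − ΔG°37)/310.15 K. The sketch below assumes, as that relation does, that ΔH° and ΔS° are independent of temperature. The function names and the numerical values are illustrative and are not taken from the paper's parameter tables.

```python
T37_KELVIN = 310.15  # 37 degrees C expressed in kelvin


def entropy_change(dh, dg37):
    """Entropy change (kcal/mol/K) implied by an enthalpy and a 37 C free energy."""
    return (dh - dg37) / T37_KELVIN


def free_energy_at(dh, dg37, temp_celsius):
    """Free energy change (kcal/mol) extrapolated to another temperature."""
    temp_kelvin = temp_celsius + 273.15
    return dh - temp_kelvin * entropy_change(dh, dg37)


# Hypothetical helix: dH = -10.0 kcal/mol, dG37 = -2.0 kcal/mol.
dg_at_37 = free_energy_at(-10.0, -2.0, 37.0)
dg_at_60 = free_energy_at(-10.0, -2.0, 60.0)
dg_at_10 = free_energy_at(-10.0, -2.0, 10.0)
```

With a negative enthalpy and entropy change, the extrapolated free energy becomes less favorable as temperature rises and more favorable as it falls.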

  12. A set of nearest neighbor parameters for predicting the enthalpy change of RNA secondary structure formation.

    PubMed

    Lu, Zhi John; Turner, Douglas H; Mathews, David H

    2006-01-01

    A complete set of nearest neighbor parameters to predict the enthalpy change of RNA secondary structure formation was derived. These parameters can be used with available free energy nearest neighbor parameters to extend the secondary structure prediction of RNA sequences to temperatures other than 37 degrees C. The parameters were tested by predicting the secondary structures of sequences with known secondary structure that are from organisms with known optimal growth temperatures. Compared with the previous set of enthalpy nearest neighbor parameters, the sensitivity of base pair prediction improved from 65.2 to 68.9% at optimal growth temperatures ranging from 10 to 60 degrees C. Base pair probabilities were predicted with a partition function and the positive predictive value of structure prediction is 90.4% when considering the base pairs in the lowest free energy structure with pairing probability of 0.99 or above. Moreover, a strong correlation is found between the predicted melting temperatures of RNA sequences and the optimal growth temperatures of the host organism. This indicates that organisms that live at higher temperatures have evolved RNA sequences with higher melting temperatures.

  13. A Predictive Structural Model of the Primate Connectome

    PubMed Central

    Beul, Sarah F.; Barbas, Helen; Hilgetag, Claus C.

    2017-01-01

    Anatomical connectivity imposes strong constraints on brain function, but there is no general agreement about principles that govern its organization. Based on extensive quantitative data, we tested the power of three factors to predict connections of the primate cerebral cortex: architectonic similarity (structural model), spatial proximity (distance model) and thickness similarity (thickness model). Architectonic similarity showed the strongest and most consistent influence on connection features. This parameter was strongly associated with the presence or absence of inter-areal connections and when integrated with spatial distance, the factor allowed predicting the existence of projections with very high accuracy. Moreover, architectonic similarity was strongly related to the laminar pattern of projection origins, and the absolute number of cortical connections of an area. By contrast, cortical thickness similarity and distance were not systematically related to connection features. These findings suggest that cortical architecture provides a general organizing principle for connections in the primate brain, providing further support for the well-corroborated structural model. PMID:28256558

  14. Prediction of Alzheimer's disease using individual structural connectivity networks

    PubMed Central

    Shao, Junming; Myers, Nicholas; Yang, Qinli; Feng, Jing; Plant, Claudia; Böhm, Christian; Förstl, Hans; Kurz, Alexander; Zimmer, Claus; Meng, Chun; Riedl, Valentin; Wohlschläger, Afra; Sorg, Christian

    2012-01-01

    Alzheimer's disease (AD) progressively degrades the brain's gray and white matter. Changes in white matter reflect changes in the brain's structural connectivity pattern. Here, we established individual structural connectivity networks (ISCNs) to distinguish predementia and dementia AD from healthy aging in individual scans. Diffusion tractography was used to construct ISCNs with a fully automated procedure for 21 healthy control subjects (HC), 23 patients with mild cognitive impairment and conversion to AD dementia within 3 years (AD-MCI), and 17 patients with mild AD dementia. Three typical pattern classifiers were used for AD prediction. Patients with AD and AD-MCI were separated from HC with accuracies greater than 95% and 90%, respectively, irrespective of prediction approach and specific fiber properties. Most informative connections involved medial prefrontal, posterior parietal, and insular cortex. Patients with mild AD were separated from those with AD-MCI with an accuracy of approximately 85%. Our finding provides evidence that ISCNs are sensitive to the impact of earliest stages of AD. ISCNs may be useful as a white matter-based imaging biomarker to distinguish healthy aging from AD. PMID:22405045

  15. Highway traffic noise prediction based on GIS

    NASA Astrophysics Data System (ADS)

    Zhao, Jianghua; Qin, Qiming

    2014-05-01

    Before building a new road, we need to predict the traffic noise generated by vehicles. Traditional traffic noise prediction methods are based on certain locations; they are time-consuming and costly, and their results cannot be visualized. A Geographical Information System (GIS) can not only solve the problem of manual data processing but also provide noise values at any point. The paper selected a road segment from Wenxi to Heyang. According to the geographical overview of the study area and the comparison between several models, we combine the JTG B03-2006 model and the HJ2.4-2009 model to predict the traffic noise depending on the circumstances. Finally, we interpolate the noise values at each prediction point and then generate contours of noise. By overlaying the village data on the noise contour layer, we can get the thematic maps. The use of GIS for road traffic noise prediction greatly assists decision-makers because of its spatial analysis functions and visualization capabilities. We can clearly see the districts where noise is excessive, which makes it convenient to optimize the road line and take noise reduction measures such as installing sound barriers and relocating villages.

  16. Epitope prediction based on random peptide library screening: benchmark dataset and prediction tools evaluation.

    PubMed

    Sun, Pingping; Chen, Wenhan; Huang, Yanxin; Wang, Hongyan; Ma, Zhiqiang; Lv, Yinghua

    2011-06-16

    Epitope prediction based on random peptide library screening has become a focus as a promising method in immunoinformatics research. Some novel software and web-based servers have been proposed in recent years and have succeeded in given test cases. However, since the number of available mimotopes with the relevant structure of template-target complex is limited, a systematic evaluation of these methods is still absent. In this study, a new benchmark dataset was defined. Using this benchmark dataset and a representative dataset, five examples of the most popular epitope prediction software products which are based on random peptide library screening have been evaluated. Using the benchmark dataset, no method exceeded a precision of 0.42 and a sensitivity of 0.37, and the MCC scores suggest that the epitope prediction results of these software programs are better than random prediction by only about 0.09-0.13; while using the representative dataset, most of the values of these performance measures are slightly improved, but the overall performance is still not satisfactory. Many test cases in the benchmark dataset cannot be applied to these pieces of software due to software limitations. Moreover, chances are that these software products are overfitted to the small dataset and will fail in other cases. Therefore, the problem of finding the correlation between mimotopes and genuine epitope residues is still far from resolved, and a much larger dataset for mimotope-based epitope prediction is desirable.
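
The three performance measures quoted in this abstract (precision, sensitivity and the Matthews correlation coefficient, MCC) follow from per-residue confusion counts. The sketch below uses the standard definitions; the function name and the counts are made up for illustration and are not from the study.

```python
from math import sqrt


def epitope_metrics(tp, fp, fn, tn):
    """Return (precision, sensitivity, MCC) from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is reported as 0 when any marginal count is zero.
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return precision, sensitivity, mcc


# Hypothetical counts: 5 epitope residues found, 5 missed, 5 false alarms,
# and 85 non-epitope residues correctly left unpredicted.
precision, sensitivity, mcc = epitope_metrics(tp=5, fp=5, fn=5, tn=85)
```

An MCC of 0 corresponds to random prediction and 1 to perfect prediction, which is why values of about 0.09-0.13 indicate only a small gain over chance.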

  17. Objective Eulerian Coherent Structures Predict Drifter Motion

    NASA Astrophysics Data System (ADS)

    Serra, Mattia; Haller, George

    2017-04-01

    Recent results show that Objective Eulerian Coherent Structures (OECSs) (Serra, M. and Haller, G., Chaos 26(5), 2016) reveal the correct, frame-independent locations of instantaneous saddle-type material behavior in unsteady flows. Using an unsteady ocean surface velocity field reconstructed from high-frequency-radar measurements, we compute attracting OECSs in a region of the North-East coast of the US, where drifter trajectories are also available. Remarkably, we find that despite their non-passive and inertial dynamics, drifters align rapidly with nearby attracting OECSs. At the same time, the drifter attractors remain completely hidden in instantaneous streamline plots and in the Okubo-Weiss field.

  18. Propagating uncertainties in statistical model based shape prediction

    NASA Astrophysics Data System (ADS)

    Syrkina, Ekaterina; Blanc, Rémi; Székely, Gábor

    2011-03-01

    This paper addresses the question of accuracy assessment and confidence region estimation in statistical model based shape prediction. Shape prediction consists of estimating the shape of an organ based on a partial observation, due e.g. to a limited field of view or poorly contrasted images, and generally requires a statistical model. However, such predictions can be impaired by several sources of uncertainty, in particular the presence of noise in the observation, limited correlations between the predictors and the shape to predict, as well as limitations of the statistical shape model - in particular the number of training samples. We propose a framework which takes these into account and derives confidence regions around the predicted shape. Our method relies on the construction of two separate statistical shape models, for the predictors and for the unseen parts, and exploits the correlations between them assuming a joint Gaussian distribution. Limitations of the models are taken into account by jointly optimizing the prediction and minimizing the shape reconstruction error through cross-validation. An application to the prediction of the shape of the proximal part of the human tibia given the shape of the distal femur is proposed, as well as the evaluation of the reliability of the estimated confidence regions, using a database of 184 samples. Potential applications are reconstructive surgery, e.g. to assess whether an implant fits in a range of acceptable shapes, or functional neurosurgery when the target's position is not directly visible and needs to be inferred from nearby visible structures.
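
The joint Gaussian assumption mentioned in this abstract has a textbook consequence that may help in reading it: the unseen part, conditioned on the observed predictors, is again Gaussian with a closed-form mean and covariance. The symbols below are assumptions introduced for illustration (x for the predictor shape parameters, y for the unseen-part parameters); the paper's full framework also accounts for observation noise and model limitations, which this sketch omits.

```latex
\begin{aligned}
\begin{pmatrix} x \\ y \end{pmatrix}
  &\sim \mathcal{N}\!\left(
     \begin{pmatrix} \mu_x \\ \mu_y \end{pmatrix},
     \begin{pmatrix} \Sigma_{xx} & \Sigma_{xy} \\ \Sigma_{yx} & \Sigma_{yy} \end{pmatrix}
   \right), \\
\mathbb{E}[\,y \mid x\,]
  &= \mu_y + \Sigma_{yx}\,\Sigma_{xx}^{-1}\,(x - \mu_x), \\
\operatorname{Cov}[\,y \mid x\,]
  &= \Sigma_{yy} - \Sigma_{yx}\,\Sigma_{xx}^{-1}\,\Sigma_{xy}.
\end{aligned}
```

The conditional mean serves as the predicted shape, and the conditional covariance defines ellipsoidal confidence regions around it. That covariance shrinks as the correlation between predictors and unseen parts grows.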

  19. Characterization of domain-peptide interaction interface: prediction of SH3 domain-mediated protein-protein interaction network in yeast by generic structure-based models.

    PubMed

    Hou, Tingjun; Li, Nan; Li, Youyong; Wang, Wei

    2012-05-04

    Determination of the binding specificity of the SH3 domain, a peptide recognition module (PRM), is important for understanding its biological functions and reconstructing the SH3-mediated protein-protein interaction network. In the present study, the SH3-peptide interactions for both class I and II SH3 domains were characterized by the intermolecular residue-residue interaction network. We developed generic MIEC-SVM models to infer SH3 domain-peptide recognition specificity that achieved satisfactory prediction accuracy. By investigating the domain-peptide recognition mechanisms at the residue level, we found that the class-I and class-II binding peptides have different binding modes even though they occupy the same binding site of SH3. Furthermore, we predicted the potential binding partners of SH3 domains in the yeast proteome and constructed the SH3-mediated protein-protein interaction network. Comparison with the experimentally determined interactions confirmed the effectiveness of our approach. This study showed that our sophisticated computational approach not only provides a powerful platform to decipher protein recognition code at the molecular level but also allows identification of peptide-mediated protein interactions at a proteomic scale. We believe that such an approach is general enough to be applicable to other domain-peptide interactions.

  20. Characterization of Domain–Peptide Interaction Interface: Prediction of SH3 Domain-Mediated Protein–Protein Interaction Network in Yeast by Generic Structure-Based Models

    PubMed Central

    Hou, Tingjun; Li, Nan; Li, Youyong; Wang, Wei

    2012-01-01

    Determination of the binding specificity of the SH3 domain, a peptide recognition module (PRM), is important for understanding its biological functions and reconstructing the SH3-mediated protein–protein interaction network. In the present study, the SH3-peptide interactions for both class I and II SH3 domains were characterized by the intermolecular residue–residue interaction network. We developed generic MIEC-SVM models to infer SH3 domain-peptide recognition specificity that achieved satisfactory prediction accuracy. By investigating the domain–peptide recognition mechanisms at the residue level, we found that the class-I and class-II binding peptides have different binding modes even though they occupy the same binding site of SH3. Furthermore, we predicted the potential binding partners of SH3 domains in the yeast proteome and constructed the SH3-mediated protein–protein interaction network. Comparison with the experimentally determined interactions confirmed the effectiveness of our approach. This study showed that our sophisticated computational approach not only provides a powerful platform to decipher protein recognition code at the molecular level but also allows identification of peptide-mediated protein interactions at a proteomic scale. We believe that such an approach is general enough to be applicable to other domain–peptide interactions. PMID:22468754

  1. A predictive structural model for bulk metallic glasses

    PubMed Central

    Laws, K. J.; Miracle, D. B.; Ferry, M.

    2015-01-01

    Great progress has been made in understanding the atomic structure of metallic glasses, but there is still no clear connection between atomic structure and glass-forming ability. Here we give new insights into perhaps the most important question in the field of amorphous metals: how can glass-forming ability be predicted from atomic structure? We give a new approach to modelling metallic glass atomic structures by solving three long-standing problems: we discover a new family of structural defects that discourage glass formation; we impose efficient local packing around all atoms simultaneously; and we enforce structural self-consistency. Fewer than a dozen binary structures satisfy these constraints, but extra degrees of freedom in structures with three or more different atom sizes significantly expand the number of relatively stable, ‘bulk’ metallic glasses. The present work gives a new approach towards achieving the long-sought goal of a predictive capability for bulk metallic glasses. PMID:26370667

  2. A predictive structural model for bulk metallic glasses.

    PubMed

    Laws, K J; Miracle, D B; Ferry, M

    2015-09-15

    Great progress has been made in understanding the atomic structure of metallic glasses, but there is still no clear connection between atomic structure and glass-forming ability. Here we give new insights into perhaps the most important question in the field of amorphous metals: how can glass-forming ability be predicted from atomic structure? We give a new approach to modelling metallic glass atomic structures by solving three long-standing problems: we discover a new family of structural defects that discourage glass formation; we impose efficient local packing around all atoms simultaneously; and we enforce structural self-consistency. Fewer than a dozen binary structures satisfy these constraints, but extra degrees of freedom in structures with three or more different atom sizes significantly expand the number of relatively stable, 'bulk' metallic glasses. The present work gives a new approach towards achieving the long-sought goal of a predictive capability for bulk metallic glasses.

  3. Factor Structure of Self-Regulation in Preschoolers: Testing Models of a Field-Based Assessment for Predicting Early School Readiness

    ERIC Educational Resources Information Center

    Denham, Susanne A.; Warren-Khot, Heather K.; Bassett, Hideko Hamada; Wyatt, Todd; Perna, Alyssa

    2012-01-01

    The importance of early self-regulatory skill has seen increased focus in the applied research literature given the implications of these skills for early school success. A three-factor latent structure of self-regulation consisting of compliance, cool executive control, and hot executive control was tested against alternative models and retained…

  4. Factor Structure of Self-Regulation in Preschoolers: Testing Models of a Field-Based Assessment for Predicting Early School Readiness

    ERIC Educational Resources Information Center

    Denham, Susanne A.; Warren-Khot, Heather K.; Bassett, Hideko Hamada; Wyatt, Todd; Perna, Alyssa

    2012-01-01

    The importance of early self-regulatory skill has seen increased focus in the applied research literature given the implications of these skills for early school success. A three-factor latent structure of self-regulation consisting of compliance, cool executive control, and hot executive control was tested against alternative models and retained…

  5. A Micromechanics-Based Method for Multiscale Fatigue Prediction

    NASA Astrophysics Data System (ADS)

    Moore, John Allan

    An estimated 80% of all structural failures are due to mechanical fatigue, often resulting in catastrophic, dangerous and costly failure events. However, an accurate model to predict fatigue remains an elusive goal. One of the major challenges is that fatigue is intrinsically a multiscale process, which is dependent on a structure's geometric design as well as its material's microscale morphology. The following work begins with a microscale study of fatigue nucleation around non-metallic inclusions. Based on this analysis, a novel multiscale method for fatigue predictions is developed. This method simulates macroscale geometries explicitly while concurrently calculating the simplified response of microscale inclusions, thus providing adequate detail on multiple scales for accurate fatigue life predictions. The methods herein provide insight into the multiscale nature of fatigue, while also developing a tool to aid in geometric design and material optimization for fatigue critical devices such as biomedical stents and artificial heart valves.

  6. Automatic measurement of vowel duration via structured prediction.

    PubMed

    Adi, Yossi; Keshet, Joseph; Cibelli, Emily; Gustafson, Erin; Clopper, Cynthia; Goldrick, Matthew

    2016-12-01

    A key barrier to making phonetic studies scalable and replicable is the need to rely on subjective, manual annotation. To help meet this challenge, a machine learning algorithm was developed for automatic measurement of a widely used phonetic measure: vowel duration. Manually-annotated data were used to train a model that takes as input an arbitrary length segment of the acoustic signal containing a single vowel that is preceded and followed by consonants and outputs the duration of the vowel. The model is based on the structured prediction framework. The input signal and a hypothesized set of a vowel's onset and offset are mapped to an abstract vector space by a set of acoustic feature functions. The learning algorithm is trained in this space to minimize the difference in expectations between predicted and manually-measured vowel durations. The trained model can then automatically estimate vowel durations without phonetic or orthographic transcription. Results comparing the model to three sets of manually annotated data suggest it outperformed the current gold standard for duration measurement, a hidden Markov model-based forced aligner (which requires orthographic or phonetic transcription as an input).
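
    The inference step of a structured predictor of this kind can be sketched as a search over hypothesized (onset, offset) pairs, each scored by a weighted sum of feature functions. The sketch below is a simplified illustration under stated assumptions, not the published model: the two features (energy contrast and relative duration), the per-frame energy input, and the weights are all hypothetical.

```python
def features(energy, onset, offset):
    """Map a signal and a hypothesized vowel span to a feature vector.

    energy: list of per-frame energies (an assumed input representation).
    The span covers frames onset..offset-1.
    """
    inside = energy[onset:offset]
    outside = energy[:onset] + energy[offset:]
    mean_in = sum(inside) / len(inside)
    mean_out = sum(outside) / len(outside) if outside else 0.0
    contrast = mean_in - mean_out                  # vowel frames should be louder
    rel_duration = (offset - onset) / len(energy)  # fraction of the segment
    return [contrast, rel_duration]

def predict_span(energy, weights, min_len=1):
    """Return the (onset, offset) pair with the highest linear score."""
    best = None
    for onset in range(len(energy)):
        for offset in range(onset + min_len, len(energy) + 1):
            phi = features(energy, onset, offset)
            score = sum(w * f for w, f in zip(weights, phi))
            if best is None or score > best[0]:
                best = (score, onset, offset)
    return best[1], best[2]

# Toy segment: quiet consonant frames around three loud vowel frames.
toy_energy = [0.1, 0.1, 0.9, 1.0, 0.9, 0.1, 0.1]
onset, offset = predict_span(toy_energy, weights=[1.0, 0.0])
duration_frames = offset - onset
```

    In a trained system the weights would be learned from manually annotated durations; here they are fixed by hand so that the search itself is easy to follow.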

  7. Automatic measurement of vowel duration via structured prediction

    NASA Astrophysics Data System (ADS)

    Adi, Yossi; Keshet, Joseph; Cibelli, Emily; Gustafson, Erin; Clopper, Cynthia; Goldrick, Matthew

    2016-12-01

    A key barrier to making phonetic studies scalable and replicable is the need to rely on subjective, manual annotation. To help meet this challenge, a machine learning algorithm was developed for automatic measurement of a widely used phonetic measure: vowel duration. Manually-annotated data were used to train a model that takes as input an arbitrary length segment of the acoustic signal containing a single vowel that is preceded and followed by consonants and outputs the duration of the vowel. The model is based on the structured prediction framework. The input signal and a hypothesized set of a vowel's onset and offset are mapped to an abstract vector space by a set of acoustic feature functions. The learning algorithm is trained in this space to minimize the difference in expectations between predicted and manually-measured vowel durations. The trained model can then automatically estimate vowel durations without phonetic or orthographic transcription. Results comparing the model to three sets of manually annotated data suggest it out-performed the current gold standard for duration measurement, an HMM-based forced aligner (which requires orthographic or phonetic transcription as an input).

  8. PSPP: A Protein Structure Prediction Pipeline for Computing Clusters

    DTIC Science & Technology

    2009-07-01

    scoring ab initio models are annotated by structural comparison against the Structural Classification of Proteins (SCOP) fold database. Furthermore ... Protein (SCOP) database [11]). Finally, if no matches are made in this search, the 3-D atomic structure of the protein domain must be built ab initio, i.e. ... Fold recognition/threading [34] [60] PSIPRED Jones Secondary structure prediction [21] [61] Rosetta Baker Ab initio folder [41] [62] SCOP/ASTRAL Chothia

  9. PARTS: probabilistic alignment for RNA joinT secondary structure prediction.

    PubMed

    Harmanci, Arif Ozgun; Sharma, Gaurav; Mathews, David H

    2008-04-01

    A novel method is presented for joint prediction of alignment and common secondary structures of two RNA sequences. The joint consideration of common secondary structures and alignment is accomplished by structural alignment over a search space defined by the newly introduced motif called matched helical regions. The matched helical region formulation generalizes previously employed constraints for structural alignment and thereby better accommodates the structural variability within RNA families. A probabilistic model based on pseudo free energies obtained from precomputed base pairing and alignment probabilities is utilized for scoring structural alignments. Maximum a posteriori (MAP) common secondary structures, sequence alignment and joint posterior probabilities of base pairing are obtained from the model via a dynamic programming algorithm called PARTS. The advantage of the more general structural alignment of PARTS is seen in secondary structure predictions for the RNase P family. For this family, the PARTS MAP predictions of secondary structures and alignment perform significantly better than prior methods that utilize a more restrictive structural alignment model. For the tRNA and 5S rRNA families, the richer structural alignment model of PARTS does not offer a benefit and the method therefore performs comparably with existing alternatives. For all RNA families studied, the posterior probability estimates obtained from PARTS offer an improvement over posterior probability estimates from a single sequence prediction. When considering the base pairings predicted over a threshold value of confidence, the combination of sensitivity and positive predictive value is superior for PARTS than for the single sequence prediction. PARTS source code is available for download under the GNU public license at http://rna.urmc.rochester.edu.
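
    The evaluation described at the end of this abstract, scoring the base pairs predicted above a confidence threshold by sensitivity and positive predictive value (PPV), can be sketched as follows. This is a generic illustration, not code from PARTS; the function names and the example probabilities are hypothetical.

```python
def threshold_pairs(pair_probs, threshold):
    """Keep base pairs whose posterior probability exceeds the threshold.

    pair_probs: dict mapping (i, j) nucleotide index pairs to probabilities.
    """
    return {pair for pair, prob in pair_probs.items() if prob > threshold}

def sensitivity_ppv(predicted, reference):
    """Sensitivity = TP / known pairs; PPV = TP / predicted pairs."""
    true_positives = len(predicted & reference)
    sensitivity = true_positives / len(reference) if reference else 0.0
    ppv = true_positives / len(predicted) if predicted else 0.0
    return sensitivity, ppv

# Hypothetical posterior probabilities and a hypothetical known structure.
example_probs = {(1, 20): 0.95, (2, 19): 0.80, (3, 18): 0.40, (5, 12): 0.70}
example_reference = {(1, 20), (2, 19), (3, 18)}
example_predicted = threshold_pairs(example_probs, 0.5)
example_scores = sensitivity_ppv(example_predicted, example_reference)
```

    Raising the threshold generally trades sensitivity for PPV, which is why the abstract compares the two quantities in combination.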

  10. PARTS: Probabilistic Alignment for RNA joinT Secondary structure prediction

    PubMed Central

    Harmanci, Arif Ozgun; Sharma, Gaurav; Mathews, David H.

    2008-01-01

    A novel method is presented for joint prediction of alignment and common secondary structures of two RNA sequences. The joint consideration of common secondary structures and alignment is accomplished by structural alignment over a search space defined by the newly introduced motif called matched helical regions. The matched helical region formulation generalizes previously employed constraints for structural alignment and thereby better accommodates the structural variability within RNA families. A probabilistic model based on pseudo free energies obtained from precomputed base pairing and alignment probabilities is utilized for scoring structural alignments. Maximum a posteriori (MAP) common secondary structures, sequence alignment and joint posterior probabilities of base pairing are obtained from the model via a dynamic programming algorithm called PARTS. The advantage of the more general structural alignment of PARTS is seen in secondary structure predictions for the RNase P family. For this family, the PARTS MAP predictions of secondary structures and alignment perform significantly better than prior methods that utilize a more restrictive structural alignment model. For the tRNA and 5S rRNA families, the richer structural alignment model of PARTS does not offer a benefit and the method therefore performs comparably with existing alternatives. For all RNA families studied, the posterior probability estimates obtained from PARTS offer an improvement over posterior probability estimates from a single sequence prediction. When considering the base pairings predicted over a threshold value of confidence, the combination of sensitivity and positive predictive value is superior for PARTS than for the single sequence prediction. PARTS source code is available for download under the GNU public license at http://rna.urmc.rochester.edu. PMID:18304945

  11. Structure of allergens and structure based epitope predictions

    PubMed Central

    Dall’Antonia, Fabio; Pavkov-Keller, Tea; Zangger, Klaus; Keller, Walter

    2014-01-01

    The structure determination of major allergens is a prerequisite for analyzing surface exposed areas of the allergen and for mapping conformational epitopes. These may be determined by experimental methods including crystallographic and NMR-based approaches or predicted by computational methods. In this review we summarize the existing structural information on allergens and their classification in protein fold families. The currently available allergen-antibody complexes are described and the experimentally obtained epitopes compared. Furthermore we discuss established methods for linear and conformational epitope mapping, putting special emphasis on a recently developed approach, which uses the structural similarity of proteins in combination with the experimental cross-reactivity data for epitope prediction. PMID:23891546

  12. Distributed Prognostics based on Structural Model Decomposition

    NASA Technical Reports Server (NTRS)

    Daigle, Matthew J.; Bregon, Anibal; Roychoudhury, I.

    2014-01-01

    Within systems health management, prognostics focuses on predicting the remaining useful life of a system. In the model-based prognostics paradigm, physics-based models are constructed that describe the operation of a system and how it fails. Such approaches consist of an estimation phase, in which the health state of the system is first identified, and a prediction phase, in which the health state is projected forward in time to determine the end of life. Centralized solutions to these problems are often computationally expensive, do not scale well as the size of the system grows, and introduce a single point of failure. In this paper, we propose a novel distributed model-based prognostics scheme that formally describes how to decompose both the estimation and prediction problems into independent local subproblems whose solutions may be easily composed into a global solution. The decomposition of the prognostics problem is achieved through structural decomposition of the underlying models. The decomposition algorithm creates from the global system model a set of local submodels suitable for prognostics. Independent local estimation and prediction problems are formed based on these local submodels, resulting in a scalable distributed prognostics approach that allows the local subproblems to be solved in parallel, thus offering increases in computational efficiency. Using a centrifugal pump as a case study, we perform a number of simulation-based experiments to demonstrate the distributed approach, compare the performance with a centralized approach, and establish its scalability. Index Terms: model-based prognostics, distributed prognostics, structural model decomposition

  13. Methods for evaluating the predictive accuracy of structural dynamic models

    NASA Technical Reports Server (NTRS)

    Hasselman, Timothy K.; Chrostowski, Jon D.

    1991-01-01

    Modeling uncertainty is defined in terms of the difference between predicted and measured eigenvalues and eigenvectors. Data compiled from 22 sets of analysis/test results was used to create statistical databases for large truss-type space structures and both pretest and posttest models of conventional satellite-type space structures. Modeling uncertainty is propagated through the model to produce intervals of uncertainty on frequency response functions, both amplitude and phase. This methodology was used successfully to evaluate the predictive accuracy of several structures, including the NASA CSI Evolutionary Structure tested at Langley Research Center. Test measurements for this structure were within ± one-sigma intervals of predicted accuracy for the most part, demonstrating the validity of the methodology and computer code.

  14. High-speed prediction of crystal structures for organic molecules

    NASA Astrophysics Data System (ADS)

    Obata, Shigeaki; Goto, Hitoshi

    2015-02-01

    We developed a master-worker type parallel algorithm for allocating crystal structure optimization tasks to distributed compute nodes, in order to improve the performance of simulations for crystal structure prediction. The performance experiments were carried out on the TUT-ADSIM supercomputer system (HITACHI HA8000-tc/HT210). The experimental results show that our parallel algorithm achieved speed-ups of 214 and 179 times using 256 processor cores on crystal structure optimizations in predictions of crystal structures for 3-aza-bicyclo(3.3.1)nonane-2,4-dione and 2-diazo-3,5-cyclohexadiene-1-one, respectively. We expect that this parallel algorithm can reduce the computational cost of any crystal structure prediction.
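
    Two ideas in this abstract lend themselves to a brief sketch: the parallel efficiency implied by the reported speed-ups, and the master-worker pattern of handing independent optimization tasks to a pool of workers. The code below is an illustration under stated assumptions, not the authors' algorithm: `optimize_structure` is a hypothetical placeholder for an expensive optimization, and a thread pool stands in for distributed compute nodes (it shows the task allocation pattern only and gives no real speed-up for CPU-bound Python code).

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_efficiency(speedup, cores):
    """Fraction of ideal linear scaling actually achieved."""
    return speedup / cores

def optimize_structure(candidate):
    """Hypothetical stand-in for one crystal structure optimization.

    Returns the candidate together with a toy 'energy' (sum of squares).
    """
    return candidate, sum(x * x for x in candidate)

def master(candidates, workers=4):
    """Master hands each candidate to a worker and keeps the lowest energy."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(optimize_structure, candidates))
    return min(results, key=lambda result: result[1])

# Efficiencies implied by the speed-ups reported in the abstract.
efficiency_a = parallel_efficiency(214, 256)
efficiency_b = parallel_efficiency(179, 256)
```

    With the figures quoted above, the two test molecules correspond to parallel efficiencies of roughly 84% and 70% on 256 cores.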

  15. Protein folding simulations and structure predictions

    NASA Astrophysics Data System (ADS)

    Okamoto, Yuko

    2001-12-01

    In complex systems such as spin glasses and proteins, conventional simulations in the canonical ensemble tend to get trapped in local energy minima. We employ the simulated annealing method and generalized-ensemble algorithms in order to overcome this multiple-minima problem. Besides simulated annealing, three well-known generalized-ensemble algorithms, namely, the multicanonical algorithm, simulated tempering, and the replica-exchange method, are described. We then present three new generalized-ensemble algorithms based on combinations of the three methods.
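
    Of the methods named above, the replica-exchange method has a particularly compact core: two replicas at inverse temperatures beta_i and beta_j exchange configurations with the standard Metropolis probability min(1, exp[(beta_i - beta_j)(E_i - E_j)]). The sketch below illustrates only that swap step; the dictionary layout of a replica and the function names are assumptions made here, not taken from the paper.

```python
import math
import random

def swap_probability(beta_i, beta_j, energy_i, energy_j):
    """Metropolis acceptance probability for exchanging two replicas."""
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0 else math.exp(delta)

def attempt_swap(replicas, i, j, rng=random):
    """Try to exchange configurations between replicas i and j.

    Each replica is a dict with keys 'beta', 'energy' and 'state'
    (an assumed layout). Temperatures stay fixed; states and energies move.
    """
    a, b = replicas[i], replicas[j]
    prob = swap_probability(a["beta"], b["beta"], a["energy"], b["energy"])
    if rng.random() < prob:
        a["state"], b["state"] = b["state"], a["state"]
        a["energy"], b["energy"] = b["energy"], a["energy"]
        return True
    return False
```

    A swap is always accepted when the colder replica (larger beta) holds the higher-energy configuration, which is how low-energy states migrate toward low temperature while trapped configurations are passed up to higher temperature, where they can escape local minima.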

  16. RNA-Puzzles Round III: 3D RNA structure prediction of five riboswitches and one ribozyme

    PubMed Central

    Biesiada, Marcin; Boniecki, Michał J.; Chou, Fang-Chieh; Ferré-D'Amaré, Adrian R.; Das, Rhiju; Dunin-Horkawicz, Stanisław; Geniesse, Caleb; Kappel, Kalli; Kladwang, Wipapat; Krokhotin, Andrey; Łach, Grzegorz E.; Major, François; Mann, Thomas H.; Pachulska-Wieczorek, Katarzyna; Patel, Dinshaw J.; Piccirilli, Joseph A.; Popenda, Mariusz; Purzycka, Katarzyna J.; Ren, Aiming; Rice, Greggory M.; Santalucia, John; Tandon, Arpit; Trausch, Jeremiah J.; Wang, Jian; Weeks, Kevin M.; Williams, Benfeard; Xiao, Yi; Zhang, Dong; Zok, Tomasz

    2017-01-01

    RNA-Puzzles is a collective experiment in blind 3D RNA structure prediction. We report here a third round of RNA-Puzzles. Five puzzles, 4, 8, 12, 13, 14, all structures of riboswitch aptamers and puzzle 7, a ribozyme structure, are included in this round of the experiment. The riboswitch structures include biological binding sites for small molecules (S-adenosyl methionine, cyclic diadenosine monophosphate, 5-amino 4-imidazole carboxamide riboside 5′-triphosphate, glutamine) and proteins (YbxF), and one set describes large conformational changes between ligand-free and ligand-bound states. The Varkud satellite ribozyme is the most recently solved structure of a known large ribozyme. All puzzles have established biological functions and require structural understanding to appreciate their molecular mechanisms. Through the use of fast-track experimental data, including multidimensional chemical mapping, and accurate prediction of RNA secondary structure, a large portion of the contacts in 3D have been predicted correctly leading to similar topologies for the top ranking predictions. Template-based and homology-derived predictions could predict structures to particularly high accuracies. However, achieving biological insights from de novo prediction of RNA 3D structures still depe