Science.gov

Sample records for predicting b-dna structure

  1. A DFT study of 2-aminopurine-containing dinucleotides: prediction of stacked conformations with B-DNA structure.

    PubMed

    Smith, Darren A; Holroyd, Leo F; van Mourik, Tanja; Jones, Anita C

    2016-05-25

    The fluorescence properties of dinucleotides incorporating 2-aminopurine (2AP) suggest that the simplest oligonucleotides adopt conformations similar to those found in duplex DNA. However, there is a lack of structural data for these systems. We report a density functional theory (DFT) study of the structures of 2AP-containing dinucleotides (deoxydinucleoside monophosphates), including full geometry optimisation of the sugar-phosphate backbone. Our DFT calculations employ the M06-2X functional for reliable treatment of dispersion interactions and include implicit aqueous solvation. Dinucleotides with 2AP in the 5'-position and each of the natural bases in the 3'-position are examined, together with the analogous 5'-adenine-containing systems. Computed structures are compared in detail with typical B-DNA base-step parameters, backbone torsional angles and sugar pucker, derived from crystallographic data. We find that 2AP-containing dinucleotides adopt structures that closely conform to B-DNA in all characteristic parameters. The structures of 2AP-containing dinucleotides closely resemble those of their adenine-containing counterparts, demonstrating the fidelity of 2AP as a mimic of the natural base. As a first step towards exploring the conformational heterogeneity of dinucleotides, we also characterise an imperfectly stacked conformation and one in which the bases are completely unstacked. PMID:27186599

  2. Structural correlations and melting of B-DNA fibers

    SciTech Connect

    Wildes, Andrew; Theodorakopoulos, Nikos; Valle-Orero, Jessica; Cuesta-Lopez, Santiago; Peyrard, Michel; Garden, Jean-Luc

    2011-06-15

    Despite numerous attempts, understanding the thermal denaturation of DNA is still a challenge due to the lack of structural data on the transition since standard experimental approaches to DNA melting are made in solution and do not provide spatial information. We report a measurement using neutron scattering from oriented DNA fibers to determine the size of the regions that stay in the double-helix conformation as the melting temperature is approached from below. A Bragg peak from the B form of DNA is observed as a function of temperature and its width and integrated intensity are measured. These results, complemented by a differential calorimetry study of the melting of B-DNA fibers as well as electrophoresis and optical observation data, are analyzed in terms of a one-dimensional mesoscopic model of DNA.

  3. A molecular mechanical model to predict the helix twist angles of B-DNA.

    PubMed

    Tung, C S; Harvey, S C

    1984-04-11

    We present here a model for the prediction of helix twist angles in B-DNA, a model composed of a collection of torsional springs. Statistically averaged conformational energy calculations show that, for a specified basepair step, the basepair-basepair conformational energy is quadratically dependent on the helix twist angle, so the calculations provide the spring parameters for the basepair-basepair interactions. Torsional springs can also be used to model the effects of the backbone on the helix twist, and the parameters for those springs are derived by fitting the model to experimental data. The model predicts a macroscopic torsional stiffness and a longitudinal compressibility (Young's modulus) which are both in good agreement with experiment. One biological consequence of the model is examined, the sequence specificity of the Eco RI restriction endonuclease, and it is shown that the discriminatory power of the enzyme receives a substantial contribution from the energetic cost of torsional deformations of the DNA when wrong sequences are forced into the enzyme binding site. PMID:6326059
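    The quadratic energy dependence described in this abstract can be illustrated with a minimal sketch. It assumes, as a simplification not taken from the paper, that every spring acts on the same twist angle of a single step; the stiffness values and preferred angles are hypothetical placeholders, not the fitted parameters.

```python
# Minimal sketch: several harmonic torsional springs acting on one twist angle.
# All numbers are hypothetical; they are not parameters from the paper.

def spring_energy(theta_deg, springs):
    """Total energy E = sum(0.5 * k * (theta - theta0)^2) over all springs."""
    return sum(0.5 * k * (theta_deg - theta0) ** 2 for k, theta0 in springs)

def equilibrium_twist(springs):
    """Twist angle minimising the total energy (stiffness-weighted mean)."""
    total_k = sum(k for k, _ in springs)
    if total_k <= 0:
        raise ValueError("total stiffness must be positive")
    return sum(k * theta0 for k, theta0 in springs) / total_k

# (stiffness, preferred twist in degrees): one base-pair spring, one backbone spring
springs = [(2.0, 38.0), (1.0, 35.0)]
theta_star = equilibrium_twist(springs)
```

    Because each term is quadratic, the minimum has a closed form, so no numerical optimisation is needed in this simplified setting.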

  4. B-DNA to Zip-DNA: Simulating a DNA Transition to a Novel Structure with Enhanced Charge-Transport Characteristics

    PubMed Central

    Balaeff, Alexander; Craig, Stephen L.; Beratan, David N.

    2013-01-01

    The forced extension of a DNA segment is studied in a series of steered molecular dynamics simulations, employing a broad range of pulling forces. Throughout the entire force range, the formation of a zipper-like (zip-) DNA structure is observed. In that structure, first predicted by Lohikoski et al., the bases of the DNA strands interdigitate with each other and form a single-base aromatic stack. Similar motifs, albeit only a few base pairs in extent, have been observed in experimental crystal structures. Analysis of the dynamics of structural changes in pulled DNA shows that S-form DNA, thought to be adopted by DNA under applied force, serves as an intermediate between B-DNA and zip-DNA. Therefore, the phase transition plateau observed in force–extension curves of DNA is suggested to reflect the B-DNA to zip-DNA structural transition. Electronic structure analysis of purine bases in zip-DNA indicates a several-fold to order of magnitude increase in the π–π electronic coupling among nearest-neighbor nucleobases, compared to B-DNA. We further observe that zip-DNA does not require base pair complementarity between DNA strands, and we predict that the increased electronic coupling in zip-DNA will result in a much higher rate of charge transfer through an all-purine zip-DNA compared to B-DNA of equal length. PMID:21598926

  5. The genome-wide distribution of non-B DNA motifs is shaped by operon structure and suggests the transcriptional importance of non-B DNA structures in Escherichia coli.

    PubMed

    Du, Xiangjun; Wojtowicz, Damian; Bowers, Albert A; Levens, David; Benham, Craig J; Przytycka, Teresa M

    2013-07-01

    Although the right-handed double helical B-form DNA is most common under physiological conditions, DNA is dynamic and can adopt a number of alternative structures, such as the four-stranded G-quadruplex, left-handed Z-DNA, cruciform and others. Active transcription necessitates strand separation and can induce such non-canonical forms at susceptible genomic sequences. Therefore, it has been speculated that these non-B DNA motifs can play regulatory roles in gene transcription. Such conjecture has been supported in higher eukaryotes by direct studies of several individual genes, as well as a number of large-scale analyses. However, the role of non-B DNA structures in many lower organisms, in particular proteobacteria, remains poorly understood and incompletely documented. In this study, we performed the first comprehensive study of the occurrence of B DNA-non-B DNA transition-susceptible sites (non-B DNA motifs) within the context of the operon structure of the Escherichia coli genome. We compared the distributions of non-B DNA motifs in the regulatory regions of operons with those from internal regions. We found an enrichment of some non-B DNA motifs in regulatory regions, and we show that this enrichment cannot be simply explained by base composition bias in these regions. We also showed that the distribution of several non-B DNA motifs within intergenic regions separating divergently oriented operons differs from the distribution found between convergent ones. In particular, we found a strong enrichment of cruciforms in the termination region of operons; this enrichment was observed for operons with Rho-dependent, as well as Rho-independent terminators. Finally, a preference for some non-B DNA motifs was observed near transcription factor-binding sites. 
    Overall, the conspicuous enrichment of transition-susceptible sites in these specific regulatory regions suggests that non-B DNA structures may have roles in the transcriptional regulation of specific operons within the E. coli genome. PMID:23620297

  6. The genome-wide distribution of non-B DNA motifs is shaped by operon structure and suggests the transcriptional importance of non-B DNA structures in Escherichia coli

    PubMed Central

    Du, Xiangjun; Wojtowicz, Damian; Bowers, Albert A.; Levens, David; Benham, Craig J.; Przytycka, Teresa M.

    2013-01-01

    Although the right-handed double helical B-form DNA is most common under physiological conditions, DNA is dynamic and can adopt a number of alternative structures, such as the four-stranded G-quadruplex, left-handed Z-DNA, cruciform and others. Active transcription necessitates strand separation and can induce such non-canonical forms at susceptible genomic sequences. Therefore, it has been speculated that these non-B DNA motifs can play regulatory roles in gene transcription. Such conjecture has been supported in higher eukaryotes by direct studies of several individual genes, as well as a number of large-scale analyses. However, the role of non-B DNA structures in many lower organisms, in particular proteobacteria, remains poorly understood and incompletely documented. In this study, we performed the first comprehensive study of the occurrence of B DNA–non-B DNA transition-susceptible sites (non-B DNA motifs) within the context of the operon structure of the Escherichia coli genome. We compared the distributions of non-B DNA motifs in the regulatory regions of operons with those from internal regions. We found an enrichment of some non-B DNA motifs in regulatory regions, and we show that this enrichment cannot be simply explained by base composition bias in these regions. We also showed that the distribution of several non-B DNA motifs within intergenic regions separating divergently oriented operons differs from the distribution found between convergent ones. In particular, we found a strong enrichment of cruciforms in the termination region of operons; this enrichment was observed for operons with Rho-dependent, as well as Rho-independent terminators. Finally, a preference for some non-B DNA motifs was observed near transcription factor-binding sites. 
    Overall, the conspicuous enrichment of transition-susceptible sites in these specific regulatory regions suggests that non-B DNA structures may have roles in the transcriptional regulation of specific operons within the E. coli genome. PMID:23620297

  7. B-DNA structure is intrinsically polymorphic: even at the level of base pair positions

    SciTech Connect

    Maehigashi, Tatsuya; Hsiao, Chiaolong; Woods, Kristen Kruger; Moulaei, Tinoush; Hud, Nicholas V.; Williams, Loren Dean

    2012-10-23

    Increasingly exact measurement of single crystal X-ray diffraction data offers detailed characterization of DNA conformation, hydration and electrostatics. However, instead of providing a more clear and unambiguous image of DNA, highly accurate diffraction data reveal polymorphism of the DNA atomic positions and conformation and hydration. Here we describe an accurate X-ray structure of B-DNA, painstakingly fit to a multistate model that contains multiple competing positions of most of the backbone and of entire base pairs. Two of ten base-pairs of CCAGGCCTGG are in multiple states distinguished primarily by differences in slide. Similarly, all the surrounding ions are seen to fractionally occupy discrete competing and overlapping sites. And finally, the vast majority of water molecules show strong evidence of multiple competing sites. Conventional resolution appears to give a false sense of homogeneity in conformation and interactions of DNA. In addition, conventional resolution yields an average structure that is not accurate, in that it is different from any of the multiple discrete structures observed at high resolution. Because base pair positional heterogeneity has not always been incorporated into model-building, even some high and ultrahigh-resolution structures of DNA do not indicate the full extent of conformational polymorphism.

  8. The N(2)-Furfuryl-deoxyguanosine Adduct Does Not Alter the Structure of B-DNA.

    PubMed

    Ghodke, Pratibha P; Gore, Kiran R; Harikrishna, S; Samanta, Biswajit; Kottur, Jithesh; Nair, Deepak T; Pradeepkumar, P I

    2016-01-15

    N(2)-Furfuryl-deoxyguanosine (fdG) is a carcinogenic DNA adduct that originates from furfuryl alcohol. It is also a stable structural mimic of the damage induced by the nitrofurazone family of antibiotics. For the structural and functional studies of this model N(2)-dG adduct, reliable and rapid access to fdG-modified DNAs is warranted. Toward this end, here we report the synthesis of fdG-modified DNAs using phosphoramidite chemistry involving only three steps. The functional integrity of the modified DNA has been verified by primer extension studies with DNA polymerases I and IV from E. coli. Introduction of fdG into a DNA duplex decreases the Tm by ∼1.6 °C/modification. Molecular dynamics simulations of a DNA duplex bearing the fdG adduct revealed that though the overall B-DNA structure is maintained, this lesion can disrupt W-C H-bonding, stacking interactions, and minor groove hydration to some extent at the modified site, and these effects lead to slight variations in the local base pair parameters. Overall, our studies show that fdG is tolerated at the minor groove of the DNA to a better extent compared with other bulky DNA damages, and this property will make it difficult for the DNA repair pathways to detect this adduct. PMID:26650891
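    The reported destabilisation of roughly 1.6 °C per modification can be turned into a back-of-the-envelope estimate. The sketch below assumes the effect is additive across modifications, which the abstract does not state, and the 60 °C starting value is purely hypothetical.

```python
# Rough estimate of duplex Tm after introducing fdG modifications.
# Assumption (not from the abstract): the ~1.6 degC drop is additive per modification.

def predicted_tm(unmodified_tm_c, n_modifications, drop_per_modification_c=1.6):
    """Return the estimated melting temperature in degrees Celsius."""
    if n_modifications < 0:
        raise ValueError("n_modifications must be non-negative")
    return unmodified_tm_c - drop_per_modification_c * n_modifications

# Hypothetical duplex melting at 60 degC, carrying two fdG adducts
estimate = predicted_tm(60.0, 2)
```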

  9. Statistical mechanical treatment of the structural hydration of biological macromolecules: Results for B-DNA

    SciTech Connect

    Hummer, G.; Soumpasis, D.M.

    1994-12-01

    We constructed an efficient and accurate computational tool based on the potentials-of-mean-force approach for computing the detailed hydrophilic hydration of complex molecular structures in aqueous environments. Using the pair and triplet correlation functions database previously obtained from computer simulations of the simple point charge model of water, we computed the detailed structural organization of water around two B-DNA molecules with sequences d(AATT)3·d(AATT)3 and d(CCGG)3·d(CCGG)3, and canonical structure. [A, T, C, and G denote adenine, thymine, cytosine, and guanine, respectively, and d(...) denotes the deoxyribose in the sugar-phosphate backbone.] The results obtained are in agreement with the experimental observations. A·T base-pair stretches are found to support the marked minor-groove "spines of hydration" observed in x-ray crystal structures. The hydrophilic hydration of the minor groove of the molecule d(CCGG)3·d(CCGG)3 exhibits a double ribbon of high water density, which is also in agreement with x-ray crystallography observations of C·G base-pair regions. The major grooves, on the other hand, do not show a comparably strong localization of water molecules. The quantitative results are compared with a computer simulation study of Forester et al. [Mol. Phys. 72, 643 (1991)]. We find good agreement for the hydration of the -NH2 groups, the cylindrically averaged water density distributions, and the overall hydration number. The agreement is less satisfactory for the phosphate groups. However, by refining the treatment of the anionic oxygens on the phosphate groups, almost full quantitative agreement is achieved.

  10. Statistical mechanical treatment of the structural hydration of biological macromolecules: Results for B-DNA

    NASA Astrophysics Data System (ADS)

    Hummer, Gerhard; Soumpasis, Dikeos Mario

    1994-12-01

    We constructed an efficient and accurate computational tool based on the potentials-of-mean-force approach for computing the detailed hydrophilic hydration of complex molecular structures in aqueous environments. Using the pair and triplet correlation functions database previously obtained from computer simulations of the simple point charge model of water, we computed the detailed structural organization of water around two B-DNA molecules with sequences d(AATT)3.d(AATT)3 and d(CCGG)3.d(CCGG)3, and canonical structure. [A, T, C, and G denote adenine, thymine, cytosine, and guanine, respectively, and d(...) denotes the deoxyribose in the sugar-phosphate backbone.] The results obtained are in agreement with the experimental observations. A.T base-pair stretches are found to support the marked minor-groove "spines of hydration" observed in x-ray crystal structures. The hydrophilic hydration of the minor groove of the molecule d(CCGG)3.d(CCGG)3 exhibits a double ribbon of high water density, which is also in agreement with x-ray crystallography observations of C.G base-pair regions. The major grooves, on the other hand, do not show a comparably strong localization of water molecules. The quantitative results are compared with a computer simulation study of Forester et al. [Mol. Phys. 72, 643 (1991)]. We find good agreement for the hydration of the -NH2 groups, the cylindrically averaged water density distributions, and the overall hydration number. The agreement is less satisfactory for the phosphate groups. However, by refining the treatment of the anionic oxygens on the phosphate groups, almost full quantitative agreement is achieved.

  11. Structural insights into VirB-DNA complexes reveal mechanism of transcriptional activation of virulence genes

    PubMed Central

    Gao, Xiaopan; Zou, Tingting; Mu, Zhixia; Qin, Bo; Yang, Jian; Waltersperger, Sandro; Wang, Meitian; Cui, Sheng; Jin, Qi

    2013-01-01

    VirB activates transcription of virulence genes in Shigella flexneri by alleviating heat-stable nucleoid-structuring protein-mediated promoter repression. VirB is unrelated to the conventional transcriptional regulators, but homologous to the plasmid partitioning proteins. We determined the crystal structures of VirB HTH domain bound by the cis-acting site containing the inverted repeat, revealing that the VirB-DNA complex is related to ParB-ParS-like complexes, presenting an example that a ParB-like protein acts exclusively in transcriptional regulation. The HTH domain of VirB docks DNA major groove and provides multiple contacts to backbone and bases, in which the only specific base readout is mediated by R167. VirB only recognizes one half site of the inverted repeats containing the most matches to the consensus for VirB binding. The binding of VirB induces DNA conformational changes and introduces a bend at an invariant A-tract segment in the cis-acting site, suggesting a role of DNA remodeling. VirB exhibits positive cooperativity in DNA binding that is contributed by the C-terminal domain facilitating VirB oligomerization. The isolated HTH domain only confers partial DNA specificity. Additional determinants for sequence specificity may reside in N- or C-terminal domains. Collectively, our findings support and extend a previously proposed model for relieving heat-stable nucleoid-structuring protein-mediated repression by VirB. PMID:23985969

  12. Multistep modeling (MSM) of biomolecular structure application to the A-G mispair in the B-DNA environment

    NASA Technical Reports Server (NTRS)

    Srinivasan, S.; Raghunathan, G.; Shibata, M.; Rein, R.

    1986-01-01

    A multistep modeling procedure has been evolved to study the structural changes introduced by lesions in DNA. We report here the change in the structure of regular B-DNA geometry due to the incorporation of a G(anti)-A(anti) mispair in place of a regular G-C pair, preserving the helix continuity. The energetics of the structure so obtained is compared with the G(anti)-A(syn) configuration under similar constrained conditions. We present the methodology adopted and discuss the results.

  13. Electronic structure, stacking energy, partial charge, and hydrogen bonding in four periodic B-DNA models

    NASA Astrophysics Data System (ADS)

    Poudel, Lokendra; Rulis, Paul; Liang, Lei; Ching, W. Y.

    2014-08-01

    We present a theoretical study of the electronic structure of four periodic B-DNA models labeled (AT)10, (GC)10, (AT)5(GC)5, and (AT-GC)5 where A denotes adenine, T denotes thymine, G denotes guanine, and C denotes cytosine. Each model has ten base pairs with Na counterions to neutralize the negative phosphate group in the backbone. The (AT)5(GC)5 and (AT-GC)5 models contain two and five AT-GC bilayers, respectively. When compared against the average of the two pure models, we estimate the AT-GC bilayer interaction energy to be 19.015 kcal/mol, which is comparable to the hydrogen bonding energy between base pairs obtained from the literature. Our investigation shows that the stacking of base pairs plays a vital role in the electronic structure, relative stability, bonding, and distribution of partial charges in the DNA models. All four models show a highest occupied molecular orbital (HOMO) to lowest unoccupied molecular orbital (LUMO) gap ranging from 2.14 to 3.12 eV with HOMO states residing on the PO4 + Na functional group and LUMO states originating from the bases. Our calculation implies that the electrical conductance of a DNA molecule should increase with increased base-pair mixing. Interatomic bonding effects in these models are investigated in detail by analyzing the distributions of the calculated bond order values for every pair of atoms in the four models including hydrogen bonding. The counterions significantly affect the gap width, the conductivity, and the distribution of partial charge on the DNA backbone. We also evaluate quantitatively the surface partial charge density on each functional group of the DNA models.

  14. Non-B DB v2.0: a database of predicted non-B DNA-forming motifs and its associated tools.

    PubMed

    Cer, Regina Z; Donohue, Duncan E; Mudunuri, Uma S; Temiz, Nuri A; Loss, Michael A; Starner, Nathan J; Halusa, Goran N; Volfovsky, Natalia; Yi, Ming; Luke, Brian T; Bacolla, Albino; Collins, Jack R; Stephens, Robert M

    2013-01-01

    The non-B DB, available at http://nonb.abcc.ncifcrf.gov, catalogs predicted non-B DNA-forming sequence motifs, including Z-DNA, G-quadruplex, A-phased repeats, inverted repeats, mirror repeats, direct repeats and their corresponding subsets: cruciforms, triplexes and slipped structures, in several genomes. Version 2.0 of the database revises and re-implements the motif discovery algorithms to better align with accepted definitions and thresholds for motifs, expands the non-B DNA-forming motifs coverage by including short tandem repeats and adds key visualization tools to compare motif locations relative to other genomic annotations. Non-B DB v2.0 extends the ability for comparative genomics by including re-annotation of the five organisms reported in non-B DB v1.0, human, chimpanzee, dog, macaque and mouse, and adds seven additional organisms: orangutan, rat, cow, pig, horse, platypus and Arabidopsis thaliana. Additionally, the non-B DB v2.0 provides an overall improved graphical user interface and faster query performance. PMID:23125372
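    The three repeat classes named in this abstract differ only in how the second arm relates to the first. The sketch below illustrates that distinction for two equal-length arms; it is not the database's motif discovery algorithm, and it ignores spacer lengths and the thresholds that the real definitions apply.

```python
# Illustrative classification of a pair of repeat arms (not the non-B DB algorithm).
# direct:   second arm equals the first
# mirror:   second arm is the first read backwards
# inverted: second arm is the reverse complement of the first

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def classify_arms(left, right):
    """Return the list of repeat classes that the two arms satisfy."""
    left, right = left.upper(), right.upper()
    kinds = []
    if right == left:
        kinds.append("direct")
    if right == left[::-1]:
        kinds.append("mirror")
    if right == reverse_complement(left):
        kinds.append("inverted")
    return kinds
```

    For example, the arms GATTC and GAATC form an inverted repeat (cruciform candidate), whereas AAGGA and AGGAA form a mirror repeat (triplex candidate).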

  15. Double-strand break formation by the RAG complex at the bcl-2 major breakpoint region and at other non-B DNA structures in vitro.

    PubMed

    Raghavan, Sathees C; Swanson, Patrick C; Ma, Yunmei; Lieber, Michael R

    2005-07-01

    The most common chromosomal translocation in cancer, t(14;18) at the 150-bp bcl-2 major breakpoint region (Mbr), occurs in follicular lymphomas. The bcl-2 Mbr assumes a non-B DNA conformation, thus explaining its distinctive fragility. This non-B DNA structure is a target of the RAG complex in vivo, but not because of its primary sequence. Here we report that the RAG complex generates at least two independent nicks that lead to double-strand breaks in vitro, and this requires the non-B DNA structure at the bcl-2 Mbr. A 3-bp mutation is capable of abolishing the non-B structure formation and the double-strand breaks. The observations on the bcl-2 Mbr reflect more general properties of the RAG complex, which can bind and nick at duplex-single-strand transitions of other non-B DNA structures, resulting in double-strand breaks in vitro. Hence, the present study reveals novel insight into a third mechanism of action of RAGs on DNA, besides the standard heptamer/nonamer-mediated cleavage in V(D)J recombination and the in vitro transposase activity. PMID:15988007

  16. A non-B-DNA structure at the Bcl-2 major breakpoint region is cleaved by the RAG complex.

    PubMed

    Raghavan, Sathees C; Swanson, Patrick C; Wu, Xiantuo; Hsieh, Chih-Lin; Lieber, Michael R

    2004-03-01

    The causes of spontaneous chromosomal translocations in somatic cells of biological organisms are largely unknown, although double-strand DNA breaks are required in all proposed mechanisms. The most common chromosomal abnormality in human cancer is the reciprocal translocation between chromosomes 14 and 18 (t(14;18)), which occurs in follicular lymphomas. The break at the immunoglobulin heavy-chain locus on chromosome 14 is an interruption of the normal V(D)J recombination process. But the breakage on chromosome 18, at the Bcl-2 gene, occurs within a confined 150-base-pair region (the major breakpoint region or Mbr) for reasons that have remained enigmatic. We have reproduced key features of the translocation process on an episome that propagates in human cells. The RAG complex--which is the normal enzyme for DNA cleavage at V, D or J segments--nicks the Bcl-2 Mbr in vitro and in vivo in a manner that reflects the pattern of the chromosomal translocations; however, the Mbr is not a V(D)J recombination signal. Rather the Bcl-2 Mbr assumes a non-B-form DNA structure within the chromosomes of human cells at 20-30% of alleles. Purified DNA assuming this structure contains stable regions of single-strandedness, which correspond well to the translocation regions in patients. Hence, a stable non-B-DNA structure in the human genome appears to be the basis for the fragility of the Bcl-2 Mbr, and the RAG complex is able to cleave this structure. PMID:14999286

  17. Study of Electronic Structures of Nucleobases and Associated Nuclear Quadrupole Interactions for ^14N, ^17O and ^2H in A-DNA and B-DNA

    NASA Astrophysics Data System (ADS)

    Scheicher, R. H.; Mahato, Dip N.; Pink, R. H.; Huang, M. B.; Das, T. P.; Dubey, Archana; Saha, H. P.; Chow, Lee

    2007-03-01

    As part of a research program for first-principles investigation of electronic structures of A-DNA and B-DNA systems we have previously carried out studies of the magnetic hyperfine interactions for the spin-label[1] muonium attached to A-DNA and B-DNA. The present work involves the nuclear quadrupole interactions (NQI) of ^14N, ^17O and ^2H in these two systems. We will present the results of our investigations of the NQI properties using the Hartree-Fock-Roothaan procedure with many-electron correlations included using many-body perturbation theory. For the A-DNA and B-DNA systems we are using available structural data for the four nucleobases. For the free nucleobases, the geometry from the energy optimization procedure is being employed. Comparisons will be made with available experimental NQI data and planned future improvements will be discussed. [1] R.H. Scheicher, E. Torikai, F.L. Pratt, K. Nagamine, and T.P. Das, Hyperfine Interactions, 158, 53 (2004); Physica B, Physics of Condensed Matter, 374, 448 (2006).

  18. Base-pair opening and spermine binding--B-DNA features displayed in the crystal structure of a gal operon fragment: implications for protein-DNA recognition.

    PubMed

    Tari, L W; Secco, A S

    1995-06-11

    A sequence that is represented frequently in functionally important sites involving protein-DNA interactions is GTG/CAC, suggesting that the trimer may play a role in regulatory processes. The 2.5 Å resolution structure of d(CGGTGG)/d(CCACCG), a part of the interior operator (OI, nucleotides +44 to +49) of the gal operon, co-crystallized with spermine, is described herein. The crystal packing arrangement in this structure is unprecedented in a crystal of B-DNA, revealing a close packing of columns of stacked DNA resembling a 5-stranded twisted wire cable. The final structure contains one hexamer duplex, 17 water molecules and 1.5 spermine molecules per crystallographic asymmetric unit. The hexamer exhibits base-pair opening and shearing at T.A resulting in a novel non-Watson-Crick hydrogen-bonding scheme between adenine and thymine in the GTG region. The ability of this sequence to adopt unusual conformations in its GTG region may be a critical factor conferring sequence selectivity on the binding of Gal repressor. In addition, this is the first conclusive example of a crystal structure of spermine with native B-DNA, providing insight into the mechanics of polyamine-DNA binding, as well as possible explanations for the biological action of spermine. PMID:7596838

  19. Searching for non-B DNA-forming motifs using nBMST (non-B DNA Motif Search Tool)

    PubMed Central

    Cer, RZ; Bruce, KH; Donohue, DE; Temiz, NA; Mudunuri, US; Yi, M; Volfovsky, N; Bacolla, A; Luke, BT; Collins; Stephens, RM

    2012-01-01

    This unit describes basic protocols on using the non-B DNA Motif Search Tool (nBMST) to search for sequence motifs predicted to form alternative DNA conformations that differ from the canonical right-handed Watson-Crick double-helix, collectively known as non-B DNA, and on using the associated PolyBrowse, a GBrowse (Stein et al., 2002) based genomic browser. The nBMST is a web-based resource that allows users to submit one or more DNA sequences to search for inverted repeats (cruciform DNA), mirror repeats (triplex DNA), direct/tandem repeats (slipped/hairpin structures), G4 motifs (tetraplex, G-quadruplex DNA), alternating purine-pyrimidine tracts (left-handed Z-DNA), and A-phased repeats (static bending). Basic Protocol 1 illustrates different ways of submitting sequences, the required file input format, results comprising downloadable Generic Feature Format (GFF) files, static Portable Network Graphics (PNG) images, dynamic PolyBrowse link, and accessing documentation through the Help and Frequently Asked Questions (FAQs) pages. Basic Protocol 2 illustrates a brief overview of some of the PolyBrowse functionalities, particularly with reference to possible associations between predicted non-B DNA forming motifs and disease causing effects. The nBMST is versatile, simple to use, does not require bioinformatics skills, and can be applied to any type of DNA sequences, including viral and bacterial genomes, up to 20 megabytes (MB). PMID:22470144
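    To make the idea of a G4 motif search with GFF-style output concrete, here is a small sketch. The pattern (four runs of at least three guanines separated by loops of one to seven bases) is a commonly used heuristic and is an assumption here; the abstract does not give the criteria nBMST applies. The source label "sketch" and the attribute layout are likewise hypothetical, and only the given strand is scanned.

```python
import re

# Commonly used G-quadruplex heuristic (assumed here, not taken from nBMST):
# four G-runs of >= 3 bases separated by loops of 1-7 bases.
G4_PATTERN = re.compile(
    r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}"
)

def find_g4(seq):
    """Return (start, end, motif) tuples using 1-based inclusive coordinates."""
    return [(m.start() + 1, m.end(), m.group())
            for m in G4_PATTERN.finditer(seq.upper())]

def to_gff(seqid, hits):
    """Format hits as tab-separated, GFF-style feature lines."""
    return ["\t".join([seqid, "sketch", "G4_motif", str(start), str(end),
                       ".", "+", ".", f"sequence={motif}"])
            for start, end, motif in hits]

hits = find_g4("TTGGGAGGGTGGGAGGGTT")
lines = to_gff("example_seq", hits)
```

    Scanning the reverse complement with the same pattern would cover motifs on the opposite strand.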

  20. Breakpoints of gross deletions coincide with non-B DNA conformations.

    PubMed

    Bacolla, Albino; Jaworski, Adam; Larson, Jacquelynn E; Jakupciak, John P; Chuzhanova, Nadia; Abeysinghe, Shaun S; O'Connell, Catherine D; Cooper, David N; Wells, Robert D

    2004-09-28

    Genomic rearrangements are a frequent source of instability, but the mechanisms involved are poorly understood. A 2.5-kbp poly(purine.pyrimidine) sequence from the human PKD1 gene, known to form non-B DNA structures, induced long deletions and other instabilities in plasmids that were mediated by mismatch repair and, in some cases, transcription. The breakpoints occurred at predicted non-B DNA structures. Distance measurements also indicated a significant proximity of alternating purine-pyrimidine and oligo(purine.pyrimidine) tracts to breakpoint junctions in 222 gross deletions and translocations, respectively, involved in human diseases. In 11 deletions analyzed, breakpoints were explicable by non-B DNA structure formation. We conclude that alternative DNA conformations trigger genomic rearrangements through recombination-repair activities. PMID:15377784
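    Alternating purine-pyrimidine tracts of the kind mentioned in this abstract can be located with a simple scan. The sketch below is illustrative only: the minimum length is a hypothetical parameter, and it is not the procedure the authors used for their distance measurements.

```python
import re

# Matches maximal runs of strictly alternating purine (R) / pyrimidine (Y) classes.
_ALTERNATING = re.compile(r"(?:RY)+R?|(?:YR)+Y?")

def to_classes(seq):
    """Map each base to R (A/G), Y (C/T) or N (anything else)."""
    return "".join("R" if b in "AG" else "Y" if b in "CT" else "N"
                   for b in seq.upper())

def alternating_tracts(seq, min_len=6):
    """Return 0-based half-open (start, end) spans of alternating tracts."""
    classes = to_classes(seq)
    return [(m.start(), m.end())
            for m in _ALTERNATING.finditer(classes)
            if m.end() - m.start() >= min_len]

tracts = alternating_tracts("AACGCGCGTT")
```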

  1. 1 Å crystal structures of B-DNA reveal sequence-specific binding and groove-specific bending of DNA by magnesium and calcium.

    PubMed

    Chiu, T K; Dickerson, R E

    2000-08-25

    The 1 Å resolution X-ray crystal structures of Mg(2+) and Ca(2+) salts of the B-DNA decamers CCAACGTTGG and CCAGCGCTGG reveal sequence-specific binding of Mg(2+) and Ca(2+) to the major and minor grooves of DNA, as well as non-specific binding to backbone phosphate oxygen atoms. Minor groove binding involves H-bond interactions between cross-strand DNA base atoms of adjacent base-pairs and the cations' water ligands. In the major groove the cations' water ligands can interact through H-bonds with O and N atoms from either one base or adjacent bases, and in addition the softer Ca(2+) can form polar covalent bonds bridging adjacent N7 and O6 atoms at GG bases. For reasons outlined earlier, localized monovalent cations are neither expected nor found. Ultra-high atomic resolution gives an unprecedented view of hydration in both grooves of DNA, permits an analysis of individual anisotropic displacement parameters, and reveals up to 22 divalent cations per DNA duplex. Each DNA helix is quite anisotropic, and alternate conformations, with motion in the direction of opening and closing the minor groove, are observed for the sugar-phosphate backbone. Taking into consideration the variability of experimental parameters and crystal packing environments among these four helices, and 24 other Mg(2+) and Ca(2+) bound B-DNA structures, we conclude that sequence-specific and strand-specific binding of Mg(2+) and Ca(2+) to the major groove causes DNA bending by base-roll compression towards the major groove, while sequence-specific binding of Mg(2+) and Ca(2+) in the minor groove has a negligible effect on helix curvature. The minor groove opens and closes to accommodate Mg(2+) and Ca(2+) without the necessity for significant bending of the overall helix. The program Shelxdna was written to facilitate refinement and analysis of X-ray crystal structures by Shelxl-97 and to plot and analyze one or more Curves and Freehelix output files. PMID:10966796

  2. Genetic and topological analyses of the bop promoter of Halobacterium halobium: stimulation by DNA supercoiling and non-B-DNA structure.

    PubMed Central

    Yang, C F; Kim, J M; Molinari, E; DasSarma, S

    1996-01-01

    The bop gene of wild-type Halobacterium halobium NRC-1 is transcriptionally induced more than 20-fold under microaerobic conditions. bop transcription is inhibited by novobiocin, a DNA gyrase inhibitor, at concentrations subinhibitory for growth. The exposure of NRC-1 cultures to novobiocin concentrations inhibiting bop transcription was found to partially relax plasmid DNA supercoiling, indicating the requirement of high DNA supercoiling for bop transcription. Next, the bop promoter region was cloned on an H. halobium plasmid vector and introduced into NRC-1 and S9, a bop overproducer strain. The cloned promoter was active in both H. halobium strains, but at a higher level in the overproducer than in the wild type. Transcription from the bop promoter on the plasmid was found to be inhibited by novobiocin to a similar extent as was transcription from the chromosome. When the cloned promoter was introduced into S9 mutant strains with insertions in either of two putative regulatory genes, brp and bat, no transcription was detectable, indicating that these genes serve to activate transcription from the bop promoter in trans. Deletion analysis of the cloned bop promoter from a site approximately 480 bp upstream of bop showed that a 53-bp region 5' to the transcription start site is sufficient for transcription, but a 28-bp region is not. An 11-bp alternating purine-pyrimidine sequence within the functional promoter region, centered 23 bp 5' to the transcription start point, was found to display DNA supercoiling-dependent sensitivity to S1 nuclease and OsO4, which is consistent with a non-B-DNA conformation similar to that of left-handed Z-DNA and suggests the involvement of unusual DNA structure in supercoiling-stimulated bop gene transcription. PMID:8550521

  3. B-DNA to Z-DNA structural transitions in the SV40 enhancer: stabilization of Z-DNA in negatively supercoiled DNA minicircles

    NASA Technical Reports Server (NTRS)

    Gruskin, E. A.; Rich, A.

    1993-01-01

    During replication and transcription, the SV40 control region is subjected to significant levels of DNA unwinding. There are three, alternating purine-pyrimidine tracts within this region that can adopt the Z-DNA conformation in response to negative superhelix density: a single copy of ACACACAT and two copies of ATGCATGC. Since the control region is essential for both efficient transcription and replication, B-DNA to Z-DNA transitions in these vital sequence tracts may have significant biological consequences. We have synthesized DNA minicircles to detect B-DNA to Z-DNA transitions in the SV40 enhancer, and to determine the negative superhelix density required to stabilize the Z-DNA. A variety of DNA sequences, including the entire SV40 enhancer and the two segments of the enhancer with alternating purine-pyrimidine tracts, were incorporated into topologically relaxed minicircles. Negative supercoils were generated, and the resulting topoisomers were resolved by electrophoresis. Using an anti-Z-DNA Fab and an electrophoretic mobility shift assay, Z-DNA was detected in the enhancer-containing minicircles at a superhelix density of -0.05. Fab saturation binding experiments demonstrated that three, independent Z-DNA tracts were stabilized in the supercoiled minicircles. Two other minicircles, each with one of the two alternating purine-pyrimidine tracts, also contained single Z-DNA sites. These results confirm the identities of the Z-DNA-forming sequences within the control region. Moreover, the B-DNA to Z-DNA transitions were detected at superhelix densities observed during normal replication and transcription processes in the SV40 life cycle.
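
The superhelix density quoted in this record can be made concrete with the standard definition σ = ΔLk / Lk0, where Lk0 ≈ N / h is the linking number of the relaxed circle. The sketch below is illustrative only: the 420 bp circle size and the helical repeat h = 10.5 bp/turn are assumptions chosen for the arithmetic, not values taken from the record.

```python
def superhelix_density(n_bp, delta_lk, bp_per_turn=10.5):
    """Return sigma = delta_Lk / Lk0 for a closed circular DNA.

    n_bp        -- circle size in base pairs (hypothetical input)
    delta_lk    -- linking number difference relative to the relaxed circle
    bp_per_turn -- assumed helical repeat of relaxed B-DNA
    """
    lk0 = n_bp / bp_per_turn  # linking number of the relaxed circle
    return delta_lk / lk0


# A hypothetical 420 bp minicircle with two negative supercoils.
sigma = superhelix_density(420, -2)
print(f"sigma = {sigma:.3f}")
```

Under these assumptions, removing two helical turns from a circle of this size gives a density comparable to the value at which Z-DNA was detected.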

  4. Searching for non-B DNA-forming motifs using nBMST (non-B DNA motif search tool).

    PubMed

    Cer, R Z; Bruce, K H; Donohue, D E; Temiz, N A; Mudunuri, U S; Yi, M; Volfovsky, N; Bacolla, A; Luke, B T; Collins, J R; Stephens, R M

    2012-04-01

    This unit describes basic protocols on using the non-B DNA Motif Search Tool (nBMST) to search for sequence motifs predicted to form alternative DNA conformations that differ from the canonical right-handed Watson-Crick double-helix, collectively known as non-B DNA, and on using the associated PolyBrowse, a GBrowse-based genomic browser. The nBMST is a Web-based resource that allows users to submit one or more DNA sequences to search for inverted repeats (cruciform DNA), mirror repeats (triplex DNA), direct/tandem repeats (slipped/hairpin structures), G4 motifs (tetraplex, G-quadruplex DNA), alternating purine-pyrimidine tracts (left-handed Z-DNA), and A-phased repeats (static bending). The nBMST is versatile, simple to use, does not require bioinformatics skills, and can be applied to any type of DNA sequences, including viral and bacterial genomes, up to an aggregate of 20 megabasepairs (Mbp). PMID:22470144
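
Two of the motif classes listed above can be illustrated with a short sequence scan. This is a minimal sketch, not the nBMST implementation: the record does not state the tool's search criteria, so the G4 pattern (four runs of at least three G separated by loops of 1 to 7 bases), the minimum tract length, and the function names are all assumptions.

```python
import re

# Assumed G4 definition: four G-runs (>= 3 G) separated by 1-7 base loops.
G4_PATTERN = re.compile(r"(?:G{3,}[ACGT]{1,7}){3}G{3,}")

# Purine (R) / pyrimidine (Y) class of each base.
BASE_CLASS = {"A": "R", "G": "R", "C": "Y", "T": "Y"}


def find_g4_motifs(seq):
    """Return (start, end, text) for each non-overlapping G4-like match."""
    seq = seq.upper()
    return [(m.start(), m.end(), m.group()) for m in G4_PATTERN.finditer(seq)]


def alternating_tracts(seq, min_len=8):
    """Return (start, end, text) for maximal alternating purine-pyrimidine runs."""
    seq = seq.upper()
    tracts = []
    start = 0
    for i in range(1, len(seq) + 1):
        prev = BASE_CLASS.get(seq[i - 1])
        cur = BASE_CLASS.get(seq[i]) if i < len(seq) else None
        # The run ends at the sequence end, at an unknown base,
        # or when two neighbours fall in the same class.
        if cur is None or prev is None or cur == prev:
            if prev is not None and i - start >= min_len:
                tracts.append((start, i, seq[start:i]))
            start = i
    return tracts


print(find_g4_motifs("AAGGGTTAGGGTTAGGGTTAGGGCC"))
print(alternating_tracts("GGACACACATTT"))
```

Both functions report half-open coordinates on the submitted strand only; a fuller search would also scan the reverse complement.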

  5. Intrinsic low temperature paramagnetism in B-DNA.

    PubMed

    Nakamae, S; Cazayous, M; Sacuto, A; Monod, P; Bouchiat, H

    2005-06-24

    We present an experimental study of magnetization in lambda-DNA in conjunction with structural measurements. The results show the surprising interplay between the molecular structures and their magnetic property. In the B-DNA state, lambda-DNA exhibits paramagnetic behavior below 20 K that is nonlinear in an applied magnetic field whereas, in the A-DNA state, it remains diamagnetic down to 2 K. We propose orbital paramagnetism as the origin of the observed phenomena and discuss its relation to the existence of long range coherent transport in B-DNA at low temperature. PMID:16090581

  6. De Novo Protein Structure Prediction

    NASA Astrophysics Data System (ADS)

    Hung, Ling-Hong; Ngan, Shing-Chung; Samudrala, Ram

    An unparalleled amount of sequence data is being made available from large-scale genome sequencing efforts. The data provide a shortcut to the determination of the function of a gene of interest, as long as there is an existing sequenced gene with similar sequence and of known function. This has spurred structural genomic initiatives with the goal of determining as many protein folds as possible (Brenner and Levitt, 2000; Burley, 2000; Brenner, 2001; Heinemann et al., 2001). The purpose of this is twofold: First, the structure of a gene product can often lead to direct inference of its function. Second, since the function of a protein is dependent on its structure, direct comparison of the structures of gene products can be more sensitive than the comparison of sequences of genes for detecting homology. Presently, structural determination by crystallography and NMR techniques is still slow and expensive in terms of manpower and resources, despite attempts to automate the processes. Computer structure prediction algorithms, while not providing the accuracy of the traditional techniques, are extremely quick and inexpensive and can provide useful low-resolution data for structure comparisons (Bonneau and Baker, 2001). Given the immense number of structures which the structural genomic projects are attempting to solve, there would be a considerable gain even if the computer structure prediction approach were applicable to a subset of proteins.

  7. Structure prediction of membrane proteins.

    PubMed

    Zhou, Chunlong; Zheng, Yao; Zhou, Yan

    2004-02-01

    There is a large gap between the number of membrane protein (MP) sequences and that of their decoded 3D structures, especially high-resolution structures, due to difficulties in crystal preparation of MPs. However, detailed knowledge of the 3D structure is required for the fundamental understanding of the function of an MP and the interactions between the protein and its inhibitors or activators. In this paper, some computational approaches that have been used to predict MP structures are discussed and compared. PMID:15629037

  8. Learning to Predict Combinatorial Structures

    NASA Astrophysics Data System (ADS)

    Vembu, Shankar

    2009-12-01

    The major challenge in designing a discriminative learning algorithm for predicting structured data is to address the computational issues arising from the exponential size of the output space. Existing algorithms make different assumptions to ensure efficient, polynomial time estimation of model parameters. For several combinatorial structures, including cycles, partially ordered sets, permutations and other graph classes, these assumptions do not hold. In this thesis, we address the problem of designing learning algorithms for predicting combinatorial structures by introducing two new assumptions: (i) The first assumption is that a particular counting problem can be solved efficiently. The consequence is a generalisation of the classical ridge regression for structured prediction. (ii) The second assumption is that a particular sampling problem can be solved efficiently. The consequence is a new technique for designing and analysing probabilistic structured prediction models. These results can be applied to solve several complex learning problems including but not limited to multi-label classification, multi-category hierarchical classification, and label ranking.

  9. Non-B DNA-forming sequences and WRN deficiency independently increase the frequency of base substitution in human cells.

    PubMed

    Bacolla, Albino; Wang, Guliang; Jain, Aklank; Chuzhanova, Nadia A; Cer, Regina Z; Collins, Jack R; Cooper, David N; Bohr, Vilhelm A; Vasquez, Karen M

    2011-03-25

    Although alternative DNA secondary structures (non-B DNA) can induce genomic rearrangements, their associated mutational spectra remain largely unknown. The helicase activity of WRN, which is absent in the human progeroid Werner syndrome, is thought to counteract this genomic instability. We determined non-B DNA-induced mutation frequencies and spectra in human U2OS osteosarcoma cells and assessed the role of WRN in isogenic knockdown (WRN-KD) cells using a supF gene mutation reporter system flanked by triplex- or Z-DNA-forming sequences. Although both non-B DNA and WRN-KD served to increase the mutation frequency, the increase afforded by WRN-KD was independent of DNA structure despite the fact that purified WRN helicase was found to resolve these structures in vitro. In U2OS cells, ∼70% of mutations comprised single-base substitutions, mostly at G·C base-pairs, with the remaining ∼30% being microdeletions. The number of mutations at G·C base-pairs in the context of NGNN/NNCN sequences correlated well with predicted free energies of base stacking and ionization potentials, suggesting a possible origin via oxidation reactions involving electron loss and subsequent electron transfer (hole migration) between neighboring bases. A set of ∼40,000 somatic mutations at G·C base pairs identified in a lung cancer genome exhibited similar correlations, implying that hole migration may also be involved. We conclude that alternative DNA conformations, WRN deficiency and lung tumorigenesis may all serve to increase the mutation rate by promoting, through diverse pathways, oxidation reactions that perturb the electron orbitals of neighboring bases. It follows that such "hole migration" is likely to play a much more widespread role in mutagenesis than previously anticipated. PMID:21285356

  10. TRITIUM RESERVOIR STRUCTURAL PERFORMANCE PREDICTION

    SciTech Connect

    Lam, P.S.; Morgan, M.J.

    2005-11-10

    The burst test is used to assess the material performance of tritium reservoirs in the surveillance program in which reservoirs have been in service for extended periods of time. A materials system model and finite element procedure were developed under a Savannah River Site Plant-Directed Research and Development (PDRD) program to predict the structural response under a full range of loading and aged material conditions of the reservoir. The results show that the predicted burst pressure and volume ductility are in good agreement with the actual burst test results for the unexposed units. The material tensile properties used in the calculations were obtained from a curved tensile specimen harvested from a companion reservoir by Electric Discharge Machining (EDM). In the absence of exposed and aged material tensile data, literature data were used for demonstrating the methodology in terms of the helium-3 concentration in the metal and the depth of penetration in the reservoir sidewall. It can be shown that the volume ductility decreases significantly with the presence of tritium and its decay product, helium-3, in the metal, as was observed in the laboratory-controlled burst tests. The model and analytical procedure provides a predictive tool for reservoir structural integrity under aging conditions. It is recommended that benchmark tests and analysis for aged materials be performed. The methodology can be augmented to predict performance for reservoir with flaws.

  11. PROTEIN STRUCTURE PREDICTION CENTER IN CASP8

    PubMed Central

    Kryshtafovych, Andriy; Krysko, Oleh; Daniluk, Pawel; Dmytriv, Zinovii; Fidelis, Krzysztof

    2009-01-01

    We present an outline of the Critical Assessment of Protein Structure Prediction (CASP) infrastructure implemented at the University of California, Davis, Protein Structure Prediction Center. The infrastructure supports selection and validation of prediction targets, collection of predictions, standard evaluation of submitted predictions, and presentation of results. The Center also supports information exchange relating to CASP experiments and structure prediction in general. Technical aspects of conducting the CASP8 experiment and relevant statistics are also provided. PMID:19722263

  12. Gene duplications in evolution of archaeal family B DNA polymerases.

    PubMed Central

    Edgell, D R; Klenk, H P; Doolittle, W F

    1997-01-01

    All archaeal DNA-dependent DNA polymerases sequenced to date are homologous to family B DNA polymerases from eukaryotes and eubacteria. Presently, representatives of the euryarchaeote division of archaea appear to have a single family B DNA polymerase, whereas two crenarchaeotes, Pyrodictium occultum and Sulfolobus solfataricus, each possess two family B DNA polymerases. We have found the gene for yet a third family B DNA polymerase, designated B3, in the crenarchaeote S. solfataricus P2. The encoded protein is highly divergent at the amino acid level from the previously characterized family B polymerases in S. solfataricus P2 and contains a number of nonconserved amino acid substitutions in catalytic domains. We have cloned and sequenced the ortholog of this gene from the closely related Sulfolobus shibatae. It is also highly divergent from other archaeal family B DNA polymerases and, surprisingly, from the S. solfataricus B3 ortholog. Phylogenetic analysis using all available archaeal family B DNA polymerases suggests that the S. solfataricus P2 B3 and S. shibatae B3 paralogs are related to one of the two DNA polymerases of P. occultum. These sequences are members of a group which includes all euryarchaeote family B homologs, while the remaining crenarchaeote sequences form another distinct group. Archaeal family B DNA polymerases together constitute a monophyletic subfamily whose evolution has been characterized by a number of gene duplication events. PMID:9098062

  13. De novo design of protein mimics of B-DNA.

    PubMed

    Yüksel, Deniz; Bianco, Piero R; Kumar, Krishna

    2016-01-01

    Structural mimicry of DNA is utilized in nature as a strategy to evade molecular defences mounted by host organisms. One such example is the protein Ocr - the first translation product to be expressed as the bacteriophage T7 infects E. coli. The structure of Ocr reveals an intricate and deliberate arrangement of negative charges that endows it with the ability to mimic ∼24 base pair stretches of B-DNA. This uncanny resemblance to DNA enables Ocr to compete in binding the type I restriction modification (R/M) system, and neutralizes the threat of hydrolytic cleavage of viral genomic material. Here, we report the de novo design and biophysical characterization of DNA mimicking peptides, and describe the inhibitory action of the designed helical bundles on a type I R/M enzyme, EcoR124I. This work validates the use of charge patterning as a design principle for creation of protein mimics of DNA, and serves as a starting point for development of therapeutic peptide inhibitors against human pathogens that employ molecular camouflage as part of their invasion stratagem. PMID:26568416

  14. Practical lessons from protein structure prediction

    PubMed Central

    Ginalski, Krzysztof; Grishin, Nick V.; Godzik, Adam; Rychlewski, Leszek

    2005-01-01

    Despite recent efforts to develop automated protein structure determination protocols, structural genomics projects are slow in generating fold assignments for complete proteomes, and spatial structures remain unknown for many protein families. Alternative cheap and fast methods to assign folds using prediction algorithms continue to provide valuable structural information for many proteins. The development of high-quality prediction methods has been boosted in the last years by objective community-wide assessment experiments. This paper gives an overview of the currently available practical approaches to protein structure prediction capable of generating accurate fold assignment. Recent advances in assessment of the prediction quality are also discussed. PMID:15805122

  15. Local backbone structure prediction of proteins.

    PubMed

    de Brevern, Alexandre G; Benros, Cristina; Gautier, Romain; Valadié, Héléne; Hazout, Serge; Etchebest, Catherine

    2004-01-01

    A statistical analysis of the PDB structures has led us to define a new set of small 3D structural prototypes called Protein Blocks (PBs). This structural alphabet includes 16 PBs, each one is defined by the (phi, psi) dihedral angles of 5 consecutive residues. The amino acid distributions observed in sequence windows encompassing these PBs are used to predict by a Bayesian approach the local 3D structure of proteins from the sole knowledge of their sequences. LocPred is a software which allows the users to submit a protein sequence and performs a prediction in terms of PBs. The prediction results are given both textually and graphically. PMID:15724288

  16. Protein structure prediction using hybrid AI methods

    SciTech Connect

    Guan, X.; Mural, R.J.; Uberbacher, E.C.

    1993-11-01

    This paper describes a new approach for predicting protein structures based on Artificial Intelligence methods and genetic algorithms. We combine nearest neighbor searching algorithms, neural networks, heuristic rules and genetic algorithms to form an integrated system to predict protein structures from their primary amino acid sequences. First we describe our methods and how they are integrated, and then apply our methods to several protein sequences. The results are very close to the real structures obtained by crystallography. Parallel genetic algorithms are also implemented.

  17. Predicting and prioritizing maintenance for concrete structures

    SciTech Connect

    Hertlein, B.H.

    1991-06-01

    Using nondestructive testing of concrete structures to predict maintenance needs can help schedule maintenance work in advance and prevent unexpected shutdowns. Nondestructive testing methods are described and development of a testing program is discussed.

  18. Interface Structure Prediction from First-Principles

    SciTech Connect

    Zhao, Xin; Shu, Qiang; Nguyen, Manh Cuong; Wang, Yangang; Ji, Min; Xiang, Hongjun; Ho, Kai-Ming; Gong, Xingao; Wang, Cai-Zhuang

    2014-05-08

    Information about the atomic structures at solid–solid interfaces is crucial for understanding and predicting the performance of materials. Due to the complexity of the interfaces, it is very challenging to resolve their atomic structures using either experimental techniques or computer simulations. In this paper, we present an efficient first-principles computational method for interface structure prediction based on an adaptive genetic algorithm. This approach significantly reduces the computational cost, while retaining the accuracy of first-principles prediction. The method is applied to the investigation of both stoichiometric and nonstoichiometric SrTiO3 Σ3(112)[1̅10] grain boundaries with unit cell containing up to 200 atoms. Several novel low-energy structures are discovered, which provide fresh insights into the structure and stability of the grain boundaries.

  19. Interval prediction in structural dynamic analysis

    NASA Technical Reports Server (NTRS)

    Hasselman, Timothy K.; Chrostowski, Jon D.; Ross, Timothy J.

    1992-01-01

    Methods for assessing the predictive accuracy of structural dynamic models are examined with attention given to the effects of modal mass, stiffness, and damping uncertainties. The methods are based on a nondeterministic analysis called 'interval prediction' in which interval variables are used to describe parameters and responses that are unknown. Statistical databases for generic modeling uncertainties are derived from experimental data and incorporated analytically to evaluate responses. Covariance matrices of modal mass, stiffness, and damping parameters are propagated numerically in models of large space structures by means of three methods. The test data tend to fall within the predicted intervals of uncertainty determined by the statistical databases. The present findings demonstrate the suitability of using data from previously analyzed and tested space structures for assessing the predictive accuracy of an analytical model.

  20. A structural alphabet for local protein structures: improved prediction methods.

    PubMed

    Etchebest, Catherine; Benros, Cristina; Hazout, Serge; de Brevern, Alexandre G

    2005-06-01

    Three-dimensional protein structures can be described with a library of 3D fragments that define a structural alphabet. We have previously proposed such an alphabet, composed of 16 patterns of five consecutive amino acids, called Protein Blocks (PBs). These PBs have been used to describe protein backbones and to predict local structures from protein sequences. The Q16 prediction rate reaches 40.7% with an optimization procedure. This article examines two aspects of PBs. First, we determine the effect of the enlargement of databanks on their definition. The results show that the geometrical features of the different PBs are preserved (local RMSD value equal to 0.41 Å on average) and sequence-structure specificities reinforced when databanks are enlarged. Second, we improve the methods for optimizing PB predictions from sequences, revisiting the optimization procedure and exploring different local prediction strategies. Use of a statistical optimization procedure for the sequence-local structure relation improves prediction accuracy by 8% (Q16 = 48.7%). Better recognition of repetitive structures occurs without losing the prediction efficiency of the other local folds. Adding secondary structure prediction improved the accuracy of Q16 by only 1%. An entropy index (Neq), strongly related to the RMSD value of the difference between predicted PBs and true local structures, is proposed to estimate prediction quality. The Neq is linearly correlated with the Q16 prediction rate distributions, computed for a large set of proteins. An "expected" prediction rate QE16 is deduced with a mean error of 5%. PMID:15822101
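
The two quality measures named in this record can be sketched in a few lines. Q16 is taken here as the percentage of positions whose predicted PB matches the true PB. For Neq the record says only that it is an entropy index; the sketch assumes the exponential of the Shannon entropy of the 16 predicted PB probabilities, which should be read as an assumption rather than the authors' stated formula. Function names are hypothetical.

```python
import math


def q16(predicted, observed):
    """Percentage of positions where the predicted PB equals the observed PB."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(predicted)


def neq(probabilities):
    """Assumed entropy index: exp of the Shannon entropy of PB probabilities.

    Ranges from 1 (one PB predicted with certainty) to 16 (all PBs equally likely).
    """
    entropy = -sum(p * math.log(p) for p in probabilities if p > 0)
    return math.exp(entropy)


print(q16("mmmmnopa", "mmmmnopb"))
print(neq([1 / 16] * 16))
```

Under this assumed definition, a low Neq marks positions where the prediction is concentrated on a few blocks, which is consistent with its proposed use as a per-position confidence estimate.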

  1. Characteristics and Prediction of RNA Structure

    PubMed Central

    Zhu, Daming; Zhang, Caiming; Han, Huijian; Crandall, Keith A.

    2014-01-01

    RNA secondary structures with pseudoknots are often predicted by minimizing free energy, which is NP-hard. Most RNAs fold during transcription from DNA into RNA through a hierarchical pathway wherein secondary structures form prior to tertiary structures. Real RNA secondary structures often have local instead of global optimization because of kinetic reasons. The performance of RNA structure prediction may be improved by considering dynamic and hierarchical folding mechanisms. This study is a novel report on RNA folding that accords with the golden mean characteristic based on the statistical analysis of the real RNA secondary structures of all 480 sequences from RNA STRAND, which are validated by NMR or X-ray. The length ratios of domains in these sequences are approximately 0.382L, 0.5L, 0.618L, and L, where L is the sequence length. These points are just the important golden sections of sequence. With this characteristic, an algorithm is designed to predict RNA hierarchical structures and simulate RNA folding by dynamically folding RNA structures according to the above golden section points. The sensitivity and number of predicted pseudoknots of our algorithm are better than those of the Mfold, HotKnots, McQfold, ProbKnot, and Lhw-Zhu algorithms. Experimental results reflect the folding rules of RNA from a new angle that is close to natural folding. PMID:25110687
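
The golden section points described above are simple fractions of the sequence length. A minimal sketch of that arithmetic follows; rounding to the nearest nucleotide position and the function name are assumptions, since the record does not say how fractional positions are handled.

```python
# Length ratios reported in the record: 0.382L, 0.5L, 0.618L and L.
GOLDEN_RATIOS = (0.382, 0.5, 0.618, 1.0)


def golden_section_points(length):
    """Return candidate domain boundaries for a sequence of the given length."""
    if length <= 0:
        raise ValueError("length must be positive")
    return [int(round(ratio * length)) for ratio in GOLDEN_RATIOS]


print(golden_section_points(100))
```

A folding algorithm of the kind described would then fold the prefix up to each point in turn; that step is not sketched here.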

  2. Predicting protein dynamics from structural ensembles.

    PubMed

    Copperman, J; Guenza, M G

    2015-12-28

    The biological properties of proteins are uniquely determined by their structure and dynamics. A protein in solution populates a structural ensemble of metastable configurations around the global fold. From overall rotation to local fluctuations, the dynamics of proteins can cover several orders of magnitude in time scales. We propose a simulation-free coarse-grained approach which utilizes knowledge of the important metastable folded states of the protein to predict the protein dynamics. This approach is based upon the Langevin Equation for Protein Dynamics (LE4PD), a Langevin formalism in the coordinates of the protein backbone. The linear modes of this Langevin formalism organize the fluctuations of the protein, so that more extended dynamical cooperativity relates to increasing energy barriers to mode diffusion. The accuracy of the LE4PD is verified by analyzing the predicted dynamics across a set of seven different proteins for which both relaxation data and NMR solution structures are available. Using experimental NMR conformers as the input structural ensembles, LE4PD predicts quantitatively accurate results, with correlation coefficient ρ = 0.93 to NMR backbone relaxation measurements for the seven proteins. The NMR solution structure derived ensemble and predicted dynamical relaxation is compared with molecular dynamics simulation-derived structural ensembles and LE4PD predictions and is consistent in the time scale of the simulations. The use of the experimental NMR conformers frees the approach from computationally demanding simulations. PMID:26723616

  3. Predicting protein dynamics from structural ensembles

    NASA Astrophysics Data System (ADS)

    Copperman, J.; Guenza, M. G.

    2015-12-01

    The biological properties of proteins are uniquely determined by their structure and dynamics. A protein in solution populates a structural ensemble of metastable configurations around the global fold. From overall rotation to local fluctuations, the dynamics of proteins can cover several orders of magnitude in time scales. We propose a simulation-free coarse-grained approach which utilizes knowledge of the important metastable folded states of the protein to predict the protein dynamics. This approach is based upon the Langevin Equation for Protein Dynamics (LE4PD), a Langevin formalism in the coordinates of the protein backbone. The linear modes of this Langevin formalism organize the fluctuations of the protein, so that more extended dynamical cooperativity relates to increasing energy barriers to mode diffusion. The accuracy of the LE4PD is verified by analyzing the predicted dynamics across a set of seven different proteins for which both relaxation data and NMR solution structures are available. Using experimental NMR conformers as the input structural ensembles, LE4PD predicts quantitatively accurate results, with correlation coefficient ρ = 0.93 to NMR backbone relaxation measurements for the seven proteins. The NMR solution structure derived ensemble and predicted dynamical relaxation is compared with molecular dynamics simulation-derived structural ensembles and LE4PD predictions and is consistent in the time scale of the simulations. The use of the experimental NMR conformers frees the approach from computationally demanding simulations.

  4. New approaches in molecular structure prediction.

    PubMed

    Böhm, G

    1996-03-01

    In the past years, much effort has been put on the development of new methodologies and algorithms for the prediction of protein secondary and tertiary structures from (sequence) data; this is reviewed in detail. New approaches for these predictions such as neural network methods, genetic algorithms, machine learning, and graph theoretical methods are discussed. Secondary structure prediction algorithms were improved mostly by considering families of related proteins; however, for the reliable tertiary structure modeling of proteins, knowledge-based techniques are still preferred. Methods and examples with more or less successful results are described. Also, programs and parameterizations for energy minimisations, molecular dynamics, and electrostatic interactions have been improved, especially with respect to their former limits of applicability. Other topics discussed in this review include the use of traditional and on-line databases, the docking problem and surface properties of biomolecules, packing of protein cores, de novo design and protein engineering, prediction of membrane protein structures, the verification and reliability of model structures, and progress made with currently available software and computer hardware. In summary, the prediction of the structure, function, and other properties of a protein is still possible only within limits, but these limits continue to be moved. PMID:8867324

  5. Predicting structure in nonsymmetric sparse matrix factorizations

    SciTech Connect

    Gilbert, J.R.; Ng, E.G.

    1992-10-01

    Many computations on sparse matrices have a phase that predicts the nonzero structure of the output, followed by a phase that actually performs the numerical computation. We study structure prediction for computations that involve nonsymmetric row and column permutations and nonsymmetric or non-square matrices. Our tools are bipartite graphs, matchings, and alternating paths. Our main new result concerns LU factorization with partial pivoting. We show that if a square matrix A has the strong Hall property (i.e., is fully indecomposable) then an upper bound due to George and Ng on the nonzero structure of L + U is as tight as possible. To show this, we prove a crucial result about alternating paths in strong Hall graphs. The alternating-paths theorem seems to be of independent interest: it can also be used to prove related results about structure prediction for QR factorization that are due to Coleman, Edenbrandt, Gilbert, Hare, Johnson, Olesky, Pothen, and van den Driessche.
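
    The two-phase idea described in this abstract (predict the nonzero structure first, compute numbers afterwards) can be illustrated with a minimal sketch. The example below is not the LU or QR analysis of the paper; it uses the much simpler case of a sparse matrix product, and the function name and the row-set representation are assumptions chosen for illustration. It also assumes that no numerical cancellation occurs.

```python
def predict_product_structure(a_rows, b_rows):
    """Predict the nonzero pattern of C = A @ B without doing arithmetic.

    a_rows[i] is the set of column indices holding nonzeros in row i of A;
    b_rows[k] is the same for row k of B (hypothetical representation).
    Row i of C can be nonzero in column j only if some k has
    A[i, k] != 0 and B[k, j] != 0, so its pattern is the union of the
    patterns of the rows of B selected by row i of A.
    """
    return [set().union(*(b_rows[k] for k in cols)) for cols in a_rows]


# Pattern of a 3x3 matrix A and a 3x3 matrix B, one set per row.
A = [{0, 1}, set(), {2}]
B = [{0}, {2}, {1, 2}]
C = predict_product_structure(A, B)
```

    A numerical phase could then allocate storage for exactly these positions before computing any values.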

  6. Predicting Odor Perceptual Similarity from Odor Structure

    PubMed Central

    Weiss, Tali; Frumin, Idan; Khan, Rehan M.; Sobel, Noam

    2013-01-01

    To understand the brain mechanisms of olfaction we must understand the rules that govern the link between odorant structure and odorant perception. Natural odors are in fact mixtures made of many molecules, and there is currently no method to look at the molecular structure of such odorant-mixtures and predict their smell. In three separate experiments, we asked 139 subjects to rate the pairwise perceptual similarity of 64 odorant-mixtures ranging in size from 4 to 43 mono-molecular components. We then tested alternative models to link odorant-mixture structure to odorant-mixture perceptual similarity. Whereas a model that considered each mono-molecular component of a mixture separately provided a poor prediction of mixture similarity, a model that represented the mixture as a single structural vector provided consistent correlations between predicted and actual perceptual similarity (r≥0.49, p<0.001). An optimized version of this model yielded a correlation of r = 0.85 (p<0.001) between predicted and actual mixture similarity. In other words, we developed an algorithm that can look at the molecular structure of two novel odorant-mixtures, and predict their ensuing perceptual similarity. That this goal was attained using a model that considers the mixtures as a single vector is consistent with a synthetic rather than analytical brain processing mechanism in olfaction. PMID:24068899

  7. Predicting polymeric crystal structures by evolutionary algorithms

    NASA Astrophysics Data System (ADS)

    Zhu, Qiang; Sharma, Vinit; Oganov, Artem R.; Ramprasad, Ramamurthy

    2014-10-01

    The recently developed evolutionary algorithm USPEX proved to be a tool that enables accurate and reliable prediction of structures. Here we extend this method to predict the crystal structure of polymers by constrained evolutionary search, where each monomeric unit is treated as a building block with fixed connectivity. This greatly reduces the search space and allows the initial structure generation with different sequences and packings of these blocks. The new constrained evolutionary algorithm is successfully tested and validated on a diverse range of experimentally known polymers, namely, polyethylene, polyacetylene, poly(glycolic acid), poly(vinyl chloride), poly(oxymethylene), poly(phenylene oxide), and poly(p-phenylene sulfide). By fixing the orientation of polymeric chains, this method can be further extended to predict the structures of complex linear polymers, such as all polymorphs of poly(vinylidene fluoride), nylon-6 and cellulose. The excellent agreement between predicted crystal structures and experimentally known structures assures a major role of this approach in the efficient design of the future polymeric materials.

  8. Protein Structure Prediction with Evolutionary Algorithms

    SciTech Connect

    Hart, W.E.; Krasnogor, N.; Pelta, D.A.; Smith, J.

    1999-02-08

    Evolutionary algorithms have been successfully applied to a variety of molecular structure prediction problems. In this paper we reconsider the design of genetic algorithms that have been applied to a simple protein structure prediction problem. Our analysis considers the impact of several algorithmic factors for this problem: the conformational representation, the energy formulation, and the way in which infeasible conformations are penalized. Further, we empirically evaluated the impact of these factors on a small set of polymer sequences. Our analysis leads to specific recommendations for both GAs and other heuristic methods for solving PSP on the HP model.

  9. Protein Structure Prediction by Protein Threading

    NASA Astrophysics Data System (ADS)

    Xu, Ying; Liu, Zhijie; Cai, Liming; Xu, Dong

    The seminal work of Bowie, Lüthy, and Eisenberg (Bowie et al., 1991) on "the inverse protein folding problem" laid the foundation of protein structure prediction by protein threading. By using simple measures for fitness of different amino acid types to local structural environments defined in terms of solvent accessibility and protein secondary structure, the authors derived a simple and yet profoundly novel approach to assessing if a protein sequence fits well with a given protein structural fold. Their follow-up work (Elofsson et al., 1996; Fischer and Eisenberg, 1996; Fischer et al., 1996a,b) and the work by Jones, Taylor, and Thornton (Jones et al., 1992) on protein fold recognition led to the development of a new brand of powerful tools for protein structure prediction, which we now term "protein threading." These computational tools have played a key role in extending the utility of all the experimentally solved structures by X-ray crystallography and nuclear magnetic resonance (NMR), providing structural models and functional predictions for many of the proteins encoded in the hundreds of genomes that have been sequenced up to now.

  10. Potential non-B DNA regions in the human genome are associated with higher rates of nucleotide mutation and expression variation.

    PubMed

    Du, Xiangjun; Gertz, E Michael; Wojtowicz, Damian; Zhabinskaya, Dina; Levens, David; Benham, Craig J; Schäffer, Alejandro A; Przytycka, Teresa M

    2014-11-10

    While individual non-B DNA structures have been shown to impact gene expression, their broad regulatory role remains elusive. We utilized genomic variants and expression quantitative trait loci (eQTL) data to analyze genome-wide variation propensities of potential non-B DNA regions and their relation to gene expression. Independent of genomic location, these regions were enriched in nucleotide variants. Our results are consistent with previously observed mutagenic properties of these regions and counter a previous study concluding that G-quadruplex regions have a reduced frequency of variants. While such mutagenicity might undermine functionality of these elements, we identified in potential non-B DNA regions a signature of negative selection. Yet, we found a depletion of eQTL-associated variants in potential non-B DNA regions, opposite to what might be expected from their proposed regulatory role. However, we also observed that genes downstream of potential non-B DNA regions showed higher expression variation between individuals. This coupling between mutagenicity and tolerance for expression variability of downstream genes may be a result of evolutionary adaptation, which allows reconciling mutagenicity of non-B DNA structures with their location in functionally important regions and their potential regulatory role. PMID:25336616

  11. Potential non-B DNA regions in the human genome are associated with higher rates of nucleotide mutation and expression variation

    PubMed Central

    Du, Xiangjun; Gertz, E. Michael; Wojtowicz, Damian; Zhabinskaya, Dina; Levens, David; Benham, Craig J.; Schäffer, Alejandro A.; Przytycka, Teresa M.

    2014-01-01

    While individual non-B DNA structures have been shown to impact gene expression, their broad regulatory role remains elusive. We utilized genomic variants and expression quantitative trait loci (eQTL) data to analyze genome-wide variation propensities of potential non-B DNA regions and their relation to gene expression. Independent of genomic location, these regions were enriched in nucleotide variants. Our results are consistent with previously observed mutagenic properties of these regions and counter a previous study concluding that G-quadruplex regions have a reduced frequency of variants. While such mutagenicity might undermine functionality of these elements, we identified in potential non-B DNA regions a signature of negative selection. Yet, we found a depletion of eQTL-associated variants in potential non-B DNA regions, opposite to what might be expected from their proposed regulatory role. However, we also observed that genes downstream of potential non-B DNA regions showed higher expression variation between individuals. This coupling between mutagenicity and tolerance for expression variability of downstream genes may be a result of evolutionary adaptation, which allows reconciling mutagenicity of non-B DNA structures with their location in functionally important regions and their potential regulatory role. PMID:25336616

  12. GPstruct: Bayesian Structured Prediction Using Gaussian Processes.

    PubMed

    Bratières, Sébastien; Quadrianto, Novi; Ghahramani, Zoubin

    2015-07-01

    We introduce a conceptually novel structured prediction model, GPstruct, which is kernelized, non-parametric and Bayesian, by design. We motivate the model with respect to existing approaches, among others, conditional random fields (CRFs), maximum margin Markov networks (M3N), and structured support vector machines (SVMstruct), which embody only a subset of its properties. We present an inference procedure based on Markov Chain Monte Carlo. The framework can be instantiated for a wide range of structured objects such as linear chains, trees, grids, and other general graphs. As a proof of concept, the model is benchmarked on several natural language processing tasks and a video gesture segmentation task involving a linear chain structure. We show prediction accuracies for GPstruct which are comparable to or exceeding those of CRFs and SVMstruct. PMID:26352456

  13. Predict7, a program for protein structure prediction.

    PubMed

    Cármenes, R S; Freije, J P; Molina, M M; Martín, J M

    1989-03-15

    We describe a program for protein sequence analysis which runs in IBM PC computers. Protein sequences are loaded from files in Mount-Conrad and Lipman-Pearson format. Seven features are analyzed: hydrophilicity, hydropathy, surface probability, side chain flexibility, antigenicity, secondary structure and N-glycosylation sites. Numeric results can be shown, printed or stored in files exportable to other programs. Graphics of up to four predictions can be displayed on the screen, printed out or plotted, with several definable options. This program has been designed to be fast, user-friendly and to be shared with the scientific community. PMID:2539121

  14. Structure-Based Predictions of Activity Cliffs

    PubMed Central

    Husby, Jarmila; Bottegoni, Giovanni; Kufareva, Irina; Abagyan, Ruben; Cavalli, Andrea

    2015-01-01

    In drug discovery, it is generally accepted that neighboring molecules in a given descriptors' space display similar activities. However, even in regions that provide strong predictability, structurally similar molecules can occasionally display large differences in potency. In QSAR jargon, these discontinuities in the activity landscape are known as ‘activity cliffs’. In this study, we assessed the reliability of ligand docking and virtual ligand screening schemes in predicting activity cliffs. We performed our calculations on a diverse, independently collected database of cliff-forming co-crystals. Starting from ideal situations, which allowed us to establish our baseline, we progressively moved toward simulating more realistic scenarios. Ensemble- and template-docking achieved a significant level of accuracy, suggesting that, despite the well-known limitations of empirical scoring schemes, activity cliffs can be accurately predicted by advanced structure-based methods. PMID:25918827

  15. Protein complex compositions predicted by structural similarity

    PubMed Central

    Davis, Fred P.; Braberg, Hannes; Shen, Min-Yi; Pieper, Ursula; Sali, Andrej; Madhusudhan, M.S.

    2006-01-01

    Proteins function through interactions with other molecules. Thus, the network of physical interactions among proteins is of great interest to both experimental and computational biologists. Here we present structure-based predictions of 3387 binary and 1234 higher order protein complexes in Saccharomyces cerevisiae involving 924 and 195 proteins, respectively. To generate candidate complexes, comparative models of individual proteins were built and combined together using complexes of known structure as templates. These candidate complexes were then assessed using a statistical potential, derived from binary domain interfaces in PIBASE. The statistical potential discriminated a benchmark set of 100 interface structures from a set of sequence-randomized negative examples with a false positive rate of 3% and a true positive rate of 97%. Moreover, the predicted complexes were also filtered using functional annotation and sub-cellular localization data. The ability of the method to select the correct binding mode among alternates is demonstrated for three camelid VHH domain–porcine α-amylase interactions. We also highlight the prediction of co-complexed domain superfamilies that are not present in template complexes. Through integration with MODBASE, the application of the method to proteomes that are less well characterized than that of S. cerevisiae will contribute to expansion of the structural and functional coverage of protein interaction space. The predicted complexes are deposited in MODBASE. PMID:16738133

  16. A-DNA and B-DNA: Comparing Their Historical X-ray Fiber Diffraction Images

    NASA Astrophysics Data System (ADS)

    Lucas, Amand A.

    2008-05-01

    A-DNA and B-DNA are two secondary molecular conformations (among other allomorphs) that double-stranded DNA drawn into a fiber can assume, depending on the relative water content and other chemical parameters of the fiber. They were the first two forms to be observed by X-ray fiber diffraction in the early 1950s, respectively by Wilkins and Gosling and by Franklin and Gosling. Their corresponding historical diffraction diagrams played an equally crucial role in the discovery of the primary double-helical structure of the DNA molecule by Watson and Crick in 1953. This paper provides a comparative explanation of the structural content of the two diagrams treated on the same footing. The analysis of the diagrams is supported by the optical transform method with which both A-DNA and B-DNA X-ray images can be simulated optically. The simulations use a simple laser pointer and a dozen optical diffraction gratings, all held on a single diffraction slide. The gratings have been specially designed to pinpoint just which of the structural elements of the molecule is responsible for each of the revealing features of the fiber diffraction images.

  17. RNA secondary structure prediction using soft computing.

    PubMed

    Ray, Shubhra Sankar; Pal, Sankar K

    2013-01-01

    Prediction of RNA structure is invaluable in creating new drugs and understanding genetic diseases. Several deterministic algorithms and soft computing-based techniques have been developed for more than a decade to determine the structure from a known RNA sequence. Soft computing gained importance with the need to get approximate solutions for RNA sequences by considering the issues related with kinetic effects, cotranscriptional folding, and estimation of certain energy parameters. A brief description of some of the soft computing-based techniques, developed for RNA secondary structure prediction, is presented along with their relevance. The basic concepts of RNA and its different structural elements like helix, bulge, hairpin loop, internal loop, and multiloop are described. These are followed by different methodologies, employing genetic algorithms, artificial neural networks, and fuzzy logic. The role of various metaheuristics, like simulated annealing, particle swarm optimization, ant colony optimization, and tabu search is also discussed. A relative comparison among different techniques, in predicting 12 known RNA secondary structures, is presented, as an example. Future challenging issues are then mentioned. PMID:23702539

  18. Excluded volume and ion-ion correlation effects on the ionic atmosphere around B-DNA: Theory, simulations, and experiments

    NASA Astrophysics Data System (ADS)

    Ovanesyan, Zaven; Medasani, Bharat; Fenley, Marcia O.; Guerrero-García, Guillermo Iván; Olvera de la Cruz, Mónica; Marucho, Marcelo

    2014-12-01

    The ionic atmosphere around a nucleic acid regulates its stability in aqueous salt solutions. One major source of complexity in biological activities involving nucleic acids arises from the strong influence of the surrounding ions and water molecules on their structural and thermodynamic properties. Here, we implement a classical density functional theory for cylindrical polyelectrolytes embedded in aqueous electrolytes containing explicit (neutral hard sphere) water molecules at experimental solvent concentrations. Our approach allows us to include ion correlations as well as solvent and ion excluded volume effects for studying the structural and thermodynamic properties of highly charged cylindrical polyelectrolytes. Several models of size and charge asymmetric mixtures of aqueous electrolytes at physiological concentrations are studied. Our results are in good agreement with Monte Carlo simulations. Our numerical calculations display significant differences in the ion density profiles for the different aqueous electrolyte models studied. However, similar results regarding the excess number of ions adsorbed to the B-DNA molecule are predicted by our theoretical approach for different aqueous electrolyte models. These findings suggest that ion counting experimental data should not be used alone to validate the performance of aqueous DNA-electrolyte models.

  19. Excluded volume and ion-ion correlation effects on the ionic atmosphere around B-DNA: theory, simulations, and experiments.

    PubMed

    Ovanesyan, Zaven; Medasani, Bharat; Fenley, Marcia O; Guerrero-García, Guillermo Iván; de la Cruz, Mónica Olvera; Marucho, Marcelo

    2014-12-14

    The ionic atmosphere around a nucleic acid regulates its stability in aqueous salt solutions. One major source of complexity in biological activities involving nucleic acids arises from the strong influence of the surrounding ions and water molecules on their structural and thermodynamic properties. Here, we implement a classical density functional theory for cylindrical polyelectrolytes embedded in aqueous electrolytes containing explicit (neutral hard sphere) water molecules at experimental solvent concentrations. Our approach allows us to include ion correlations as well as solvent and ion excluded volume effects for studying the structural and thermodynamic properties of highly charged cylindrical polyelectrolytes. Several models of size and charge asymmetric mixtures of aqueous electrolytes at physiological concentrations are studied. Our results are in good agreement with Monte Carlo simulations. Our numerical calculations display significant differences in the ion density profiles for the different aqueous electrolyte models studied. However, similar results regarding the excess number of ions adsorbed to the B-DNA molecule are predicted by our theoretical approach for different aqueous electrolyte models. These findings suggest that ion counting experimental data should not be used alone to validate the performance of aqueous DNA-electrolyte models. PMID:25494770

  20. Excluded volume and ion-ion correlation effects on the ionic atmosphere around B-DNA: Theory, simulations, and experiments

    PubMed Central

    Ovanesyan, Zaven; Fenley, Marcia O.; Guerrero-García, Guillermo Iván; Olvera de la Cruz, Mónica

    2014-01-01

    The ionic atmosphere around a nucleic acid regulates its stability in aqueous salt solutions. One major source of complexity in biological activities involving nucleic acids arises from the strong influence of the surrounding ions and water molecules on their structural and thermodynamic properties. Here, we implement a classical density functional theory for cylindrical polyelectrolytes embedded in aqueous electrolytes containing explicit (neutral hard sphere) water molecules at experimental solvent concentrations. Our approach allows us to include ion correlations as well as solvent and ion excluded volume effects for studying the structural and thermodynamic properties of highly charged cylindrical polyelectrolytes. Several models of size and charge asymmetric mixtures of aqueous electrolytes at physiological concentrations are studied. Our results are in good agreement with Monte Carlo simulations. Our numerical calculations display significant differences in the ion density profiles for the different aqueous electrolyte models studied. However, similar results regarding the excess number of ions adsorbed to the B-DNA molecule are predicted by our theoretical approach for different aqueous electrolyte models. These findings suggest that ion counting experimental data should not be used alone to validate the performance of aqueous DNA-electrolyte models. PMID:25494770

  1. Excluded volume and ion-ion correlation effects on the ionic atmosphere around B-DNA: Theory, simulations, and experiments

    SciTech Connect

    Ovanesyan, Zaven; Marucho, Marcelo; Medasani, Bharat; Fenley, Marcia O.; Guerrero-García, Guillermo Iván; Olvera de la Cruz, Mónica

    2014-12-14

    The ionic atmosphere around a nucleic acid regulates its stability in aqueous salt solutions. One major source of complexity in biological activities involving nucleic acids arises from the strong influence of the surrounding ions and water molecules on their structural and thermodynamic properties. Here, we implement a classical density functional theory for cylindrical polyelectrolytes embedded in aqueous electrolytes containing explicit (neutral hard sphere) water molecules at experimental solvent concentrations. Our approach allows us to include ion correlations as well as solvent and ion excluded volume effects for studying the structural and thermodynamic properties of highly charged cylindrical polyelectrolytes. Several models of size and charge asymmetric mixtures of aqueous electrolytes at physiological concentrations are studied. Our results are in good agreement with Monte Carlo simulations. Our numerical calculations display significant differences in the ion density profiles for the different aqueous electrolyte models studied. However, similar results regarding the excess number of ions adsorbed to the B-DNA molecule are predicted by our theoretical approach for different aqueous electrolyte models. These findings suggest that ion counting experimental data should not be used alone to validate the performance of aqueous DNA-electrolyte models.

  2. The intrinsic mechanics of B-DNA in solution characterized by NMR.

    PubMed

    Imeddourene, Akli Ben; Xu, Xiaoqian; Zargarian, Loussiné; Oguey, Christophe; Foloppe, Nicolas; Mauffret, Olivier; Hartmann, Brigitte

    2016-04-20

    Experimental characterization of the structural couplings in free B-DNA in solution has been elusive, because of subtle effects that are challenging to tackle. Here, the exploitation of the NMR measurements collected on four dodecamers containing a substantial set of dinucleotide sequences provides new, consistent correlations revealing the DNA intrinsic mechanics. The difference between two successive residual dipolar couplings (ΔRDCs) involving C6/8-H6/8, C3'-H3' and C4'-H4' vectors are correlated to the 31P chemical shifts (δP), which reflect the populations of the BI and BII backbone states. The δPs are also correlated to the internucleotide distances (Dinter) involving H6/8, H2' and H2″ protons. Calculations of NMR quantities on high resolution X-ray structures and controlled models of DNA enable to interpret these couplings: the studied ΔRDCs depend mostly on roll, while Dinter are mainly sensitive to twist or slide. Overall, these relations demonstrate how δP measurements inform on key inter base parameters, in addition to probe the BI↔BII backbone equilibrium, and shed new light into coordinated motions of phosphate groups and bases in free B-DNA in solution. Inspection of the 5' and 3' ends of the dodecamers also supplies new information on the fraying events, otherwise neglected. PMID:26883628

  3. The intrinsic mechanics of B-DNA in solution characterized by NMR

    PubMed Central

    Imeddourene, Akli Ben; Xu, Xiaoqian; Zargarian, Loussiné; Oguey, Christophe; Foloppe, Nicolas; Mauffret, Olivier; Hartmann, Brigitte

    2016-01-01

    Experimental characterization of the structural couplings in free B-DNA in solution has been elusive, because of subtle effects that are challenging to tackle. Here, the exploitation of the NMR measurements collected on four dodecamers containing a substantial set of dinucleotide sequences provides new, consistent correlations revealing the DNA intrinsic mechanics. The difference between two successive residual dipolar couplings (ΔRDCs) involving C6/8-H6/8, C3′-H3′ and C4′-H4′ vectors are correlated to the 31P chemical shifts (δP), which reflect the populations of the BI and BII backbone states. The δPs are also correlated to the internucleotide distances (Dinter) involving H6/8, H2′ and H2″ protons. Calculations of NMR quantities on high resolution X-ray structures and controlled models of DNA enable to interpret these couplings: the studied ΔRDCs depend mostly on roll, while Dinter are mainly sensitive to twist or slide. Overall, these relations demonstrate how δP measurements inform on key inter base parameters, in addition to probe the BI↔BII backbone equilibrium, and shed new light into coordinated motions of phosphate groups and bases in free B-DNA in solution. Inspection of the 5′ and 3′ ends of the dodecamers also supplies new information on the fraying events, otherwise neglected. PMID:26883628

  4. Analysis and prediction of protein quaternary structure.

    PubMed

    Poupon, Anne; Janin, Joel

    2010-01-01

    The quaternary structure (QS) of a protein is determined by measuring its molecular weight in solution. The data have to be extracted from the literature, and they may be missing even for proteins that have a crystal structure reported in the Protein Data Bank (PDB). The PDB and other databases derived from it report QS information that either was obtained from the depositors or is based on an analysis of the contacts between polypeptide chains in the crystal, and this frequently differs from the QS determined in solution. The QS of a protein can be predicted from its sequence using either homology or threading methods. However, a majority of the proteins with less than 30% sequence identity have different QSs. A model of the QS can also be derived by docking the subunits when their 3D structure is independently known, but the model is likely to be incorrect if large conformation changes take place when the oligomer assembles. PMID:20221929

  5. On lattice protein structure prediction revisited.

    PubMed

    Dotu, Ivan; Cebrián, Manuel; Van Hentenryck, Pascal; Clote, Peter

    2011-01-01

    Protein structure prediction is regarded as a highly challenging problem both for the biology and for the computational communities. In recent years, many approaches have been developed, moving to increasingly complex lattice models and off-lattice models. This paper presents a Large Neighborhood Search (LNS) to find the native state for the Hydrophobic-Polar (HP) model on the Face-Centered Cubic (FCC) lattice or, in other words, a self-avoiding walk on the FCC lattice having a maximum number of H-H contacts. The algorithm starts with a tabu-search algorithm, whose solution is then improved by a combination of constraint programming and LNS. The flexible framework of this hybrid algorithm allows an adaptation to the Miyazawa-Jernigan contact potential, in place of the HP model, thus suggesting its potential for tertiary structure prediction. Benchmarking statistics are given for our method against the hydrophobic core threading program HPstruct, an exact method which can be viewed as complementary to our method. PMID:21358007
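
    The objective named in this abstract, a self-avoiding walk on the FCC lattice with as many H-H contacts as possible, can be made concrete with a short sketch. The code below only scores a given fold; it is not the Large Neighborhood Search of the paper. The function names and the integer-coordinate representation of the FCC lattice (sites joined by the 12 vectors that are permutations of (±1, ±1, 0)) are assumptions chosen for illustration.

```python
from itertools import product

# The 12 nearest-neighbour vectors of the FCC lattice: exactly two
# coordinates are +1 or -1 and the remaining one is 0.
FCC_NEIGHBOURS = frozenset(
    v for v in product((-1, 0, 1), repeat=3) if sum(abs(c) for c in v) == 2
)


def difference(p, q):
    """Component-wise q - p for two lattice points given as tuples."""
    return tuple(b - a for a, b in zip(p, q))


def is_self_avoiding_walk(coords):
    """True if no site repeats and consecutive sites are FCC neighbours."""
    if len(set(coords)) != len(coords):
        return False
    return all(
        difference(p, q) in FCC_NEIGHBOURS for p, q in zip(coords, coords[1:])
    )


def hh_contacts(sequence, coords):
    """Count H-H pairs that touch on the lattice but are not chain neighbours."""
    if len(sequence) != len(coords) or not is_self_avoiding_walk(coords):
        raise ValueError("coords is not a valid fold for this sequence")
    count = 0
    for i in range(len(coords)):
        for j in range(i + 2, len(coords)):
            if sequence[i] == "H" and sequence[j] == "H":
                if difference(coords[i], coords[j]) in FCC_NEIGHBOURS:
                    count += 1
    return count


bent = [(0, 0, 0), (1, 1, 0), (1, 0, 1)]      # first and last residue touch
straight = [(0, 0, 0), (1, 1, 0), (2, 2, 0)]  # first and last residue apart
```

    Under this scoring, a search method would prefer the bent fold for the sequence HHH, since it has one H-H contact and the straight fold has none.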

  6. Phylogenetic Approaches to Natural Product Structure Prediction

    PubMed Central

    Ziemert, Nadine; Jensen, Paul R.

    2015-01-01

    Phylogenetics is the study of the evolutionary relatedness among groups of organisms. Molecular phylogenetics uses sequence data to infer these relationships for both organisms and the genes they maintain. With the large amount of publicly available sequence data, phylogenetic inference has become increasingly important in all fields of biology. In the case of natural product research, phylogenetic relationships are proving to be highly informative in terms of delineating the architecture and function of the genes involved in secondary metabolite biosynthesis. Polyketide synthases and nonribosomal peptide synthetases provide model examples in which individual domain phylogenies display different predictive capacities, resolving features ranging from substrate specificity to structural motifs associated with the final metabolic product. This chapter provides examples in which phylogeny has proven effective in terms of predicting functional or structural aspects of secondary metabolism. The basics of how to build a reliable phylogenetic tree are explained along with information about programs and tools that can be used for this purpose. Furthermore, it introduces the Natural Product Domain Seeker, a recently developed Web tool that employs phylogenetic logic to classify ketosynthase and condensation domains based on established enzyme architecture and biochemical function. PMID:23084938

  7. Modeling intercalated PAH metabolites: Explanation for the stereochemical and shape selectivity of B-DNA for bay-region carcinogens

    SciTech Connect

    Szentpaly, L.V.; Shamovsky, I.L.

    1996-12-31

    The equilibrium structures of 22 intercalation complexes of different metabolites of polycyclic aromatic hydrocarbons (PAH) with the dG2·dC2 dinucleotide are obtained by AMBER and FLEX molecular modeling. The triol carbocations of highly potent carcinogens are stereochemically compatible with the dinucleotide and B-DNA. Their intercalation complexes are found (1) to be stabilized by two hydrogen bonds between OH groups of the triol cation and the N(3) atoms of the adjacent guanine residues, (2) to be "preorganized" for covalent bonding to the N(2) amino group of guanine, (3) to display only minor conformational changes with respect to the uncomplexed dinucleotide in B-DNA. A new explanation for the stereochemical and shape selectivity in the initiation of cancer by PAHs is presented. The molecular mechanics study is augmented by HF/6-31G* calculations on the conformations of phenanthrene triol carbocation.

  8. Accurate Prediction of Docked Protein Structure Similarity.

    PubMed

    Akbal-Delibas, Bahar; Pomplun, Marc; Haspel, Nurit

    2015-09-01

    One of the major challenges for protein-protein docking methods is to accurately discriminate native-like structures. The protein docking community agrees on the existence of a relationship between various favorable intermolecular interactions (e.g. van der Waals, electrostatic, and desolvation forces) and the similarity of a conformation to its native structure. Different docking algorithms often formulate this relationship as a weighted sum of selected terms and calibrate their weights against specific training data to evaluate and rank candidate structures. However, the exact form of this relationship is unknown, and the accuracy of such methods is impaired by the pervasiveness of false positives. Unlike conventional scoring functions, we propose a novel machine learning approach that not only ranks the candidate structures relative to each other but also indicates how similar each candidate is to the native conformation. We trained the AccuRMSD neural network with an extensive dataset using the back-propagation learning algorithm. Our method predicted the RMSDs of unbound docked complexes with a 0.4 Å error margin. PMID:26335807

  9. Helix Geometry, Hydration, and G·A Mismatch in a B-DNA Decamer

    NASA Astrophysics Data System (ADS)

    Prive, Gilbert G.; Heinemann, Udo; Chandrasegaran, Srinivasan; Kan, Lou-Sing; Kopka, Mary L.; Dickerson, Richard E.

    1987-10-01

    The DNA double helix is not a regular, featureless barberpole molecule. Different base sequences have their own special signature, in the way that they influence groove width, helical twist, bending, and mechanical rigidity or resistance to bending. These special features probably help other molecules such as repressors to read and recognize one base sequence in preference to another. Single crystal x-ray structure analysis is beginning to show us the various structures possible in the B-DNA family. The DNA decamer C-C-A-A-G-A-T-T-G-G appears to be a better model for mixed-sequence B-DNA than was the earlier C-G-C-G-A-A-T-T-C-G-C-G, which is more akin to regions of poly(dA)·poly(dT). The G·A mismatch base pairs at the center of the decamer are in the anti-anti conformation about their bonds from base to sugar, in agreement with nuclear magnetic resonance evidence on this and other sequences, and in contrast to the anti-syn geometry reported for G·A pairs in C-G-C-G-A-A-T-T-A-G-C-G. The ordered spine of hydration seen earlier in the narrow-grooved dodecamer has its counterpart, in this wide-grooved decamer, in two strings of water molecules lining the walls of the minor groove, bridging from purine N3 or pyrimidine O2, to the following sugar O4'. The same strings of hydration are present in the phosphorothioate analog of G-C-G-C-G-C. Unlike the spine, which is broken up by the intrusion of amine groups at guanines, these water strings are found in general, mixed-sequence DNA because they can pass by unimpeded to either side of a guanine N2 amine. The spine and strings are perceived as two extremes of a general pattern of hydration of the minor groove, which probably is the dominant factor in making B-DNA the preferred form at high hydration.

  10. Helix geometry, hydration, and G.A mismatch in a B-DNA decamer.

    PubMed

    Privé, G G; Heinemann, U; Chandrasegaran, S; Kan, L S; Kopka, M L; Dickerson, R E

    1987-10-23

    The DNA double helix is not a regular, featureless barberpole molecule. Different base sequences have their own special signature, in the way that they influence groove width, helical twist, bending, and mechanical rigidity or resistance to bending. These special features probably help other molecules such as repressors to read and recognize one base sequence in preference to another. Single crystal x-ray structure analysis is beginning to show us the various structures possible in the B-DNA family. The DNA decamer C-C-A-A-G-A-T-T-G-G appears to be a better model for mixed-sequence B-DNA than was the earlier C-G-C-G-A-A-T-T-C-G-C-G, which is more akin to regions of poly(dA).poly(dT). The G.A mismatch base pairs at the center of the decamer are in the anti-anti conformation about their bonds from base to sugar, in agreement with nuclear magnetic resonance evidence on this and other sequences, and in contrast to the anti-syn geometry reported for G.A pairs in C-G-C-G-A-A-T-T-A-G-C-G. The ordered spine of hydration seen earlier in the narrow-grooved dodecamer has its counterpart, in this wide-grooved decamer, in two strings of water molecules lining the walls of the minor groove, bridging from purine N3 or pyrimidine O2, to the following sugar O4'. The same strings of hydration are present in the phosphorothioate analog of G-C-G-C-G-C. Unlike the spine, which is broken up by the intrusion of amine groups at guanines, these water strings are found in general, mixed-sequence DNA because they can pass by unimpeded to either side of a guanine N2 amine. The spine and strings are perceived as two extremes of a general pattern of hydration of the minor groove, which probably is the dominant factor in making B-DNA the preferred form at high hydration. PMID:3310237

  11. Protein structure prediction using basin-hopping

    NASA Astrophysics Data System (ADS)

    Prentiss, Michael C.; Wales, David J.; Wolynes, Peter G.

    2008-06-01

    Associative memory Hamiltonian structure prediction potentials are not overly rugged, thereby suggesting their landscapes are like those of actual proteins. In the present contribution we show how basin-hopping global optimization can identify low-lying minima for the corresponding mildly frustrated energy landscapes. For small systems the basin-hopping algorithm succeeds in locating both lower minima and conformations closer to the experimental structure than does molecular dynamics with simulated annealing. For large systems the efficiency of basin-hopping decreases for our initial implementation, where the steps consist of random perturbations to the Cartesian coordinates. We implemented umbrella sampling using basin-hopping to further confirm when the global minima are reached. We have also improved the energy surface by employing bioinformatic techniques for reducing the roughness or variance of the energy surface. Finally, the basin-hopping calculations have guided improvements in the excluded volume of the Hamiltonian, producing better structures. These results suggest a novel and transferable optimization scheme for future energy function development.
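
    The basin-hopping loop described in this abstract (random perturbation of the coordinates, local minimisation, Metropolis acceptance on the minimised energies) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the one-dimensional `double_well` function stands in for the associative memory Hamiltonian, and `local_minimise`, `basin_hopping`, and their parameters are hypothetical names.

```python
import math
import random


def double_well(x):
    """Toy 1D energy with a shallow minimum near x = 1.13 and a deeper one near x = -1.30."""
    return x ** 4 - 3 * x ** 2 + x


def local_minimise(f, x, step=0.1, tol=1e-8):
    """Derivative-free descent: move while a neighbour is lower, else halve the step."""
    fx = f(x)
    while step > tol:
        for candidate in (x - step, x + step):
            fc = f(candidate)
            if fc < fx:
                x, fx = candidate, fc
                break
        else:
            step *= 0.5  # no downhill neighbour at this resolution
    return x, fx


def basin_hopping(f, x0, n_steps=200, max_jump=1.5, temperature=1.0, seed=0):
    """Hop between local minima; return the lowest minimum visited."""
    rng = random.Random(seed)
    x, fx = local_minimise(f, x0)
    best_x, best_f = x, fx
    for _ in range(n_steps):
        # Random perturbation of the coordinate, followed by local minimisation.
        trial_x, trial_f = local_minimise(f, x + rng.uniform(-max_jump, max_jump))
        # Metropolis criterion applied to the minimised energies.
        accept = trial_f <= fx or rng.random() < math.exp(-(trial_f - fx) / temperature)
        if accept:
            x, fx = trial_x, trial_f
            if fx < best_f:
                best_x, best_f = x, fx
    return best_x, best_f


start_x, start_f = local_minimise(double_well, 1.0)
best_x, best_f = basin_hopping(double_well, 1.0)
```

    Starting from x = 1.0, plain local minimisation stays in the shallow well, whereas the hopping loop is able to cross the barrier and settle in the deeper one. For real molecular systems the same loop would act on full Cartesian coordinate vectors with a gradient-based minimiser.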

  12. Structure prediction of magnetosome-associated proteins.

    PubMed

    Nudelman, Hila; Zarivach, Raz

    2014-01-01

    Magnetotactic bacteria (MTB) are Gram-negative bacteria that can navigate along geomagnetic fields. This ability is a result of a unique intracellular organelle, the magnetosome. These organelles are composed of membrane-enclosed magnetite (Fe3O4) or greigite (Fe3S4) crystals ordered into chains along the cell. Magnetosome formation, assembly, and magnetic nano-crystal biomineralization are controlled by magnetosome-associated proteins (MAPs). Most MAP-encoding genes are located in a conserved genomic region, the magnetosome island (MAI). The MAI appears to be conserved in all MTB analyzed so far, although its size and organization differ between species. MAI deletion was shown to lead to a non-magnetic phenotype, further highlighting its important role in magnetosome formation. Today, about 28 proteins are known to be involved in magnetosome formation, but the structures and functions of most MAPs are unknown. To reveal the structure-function relationship of MAPs, we used bioinformatics tools to build homology models as a way to understand their possible role in magnetosome formation. Here we present an overview of predicted 3D structural models for all known Magnetospirillum gryphiswaldense strain MSR-1 MAPs. PMID:24523717

  13. Restriction versus guidance in protein structure prediction.

    PubMed

    Hegler, Joseph A; Lätzer, Joachim; Shehu, Amarda; Clementi, Cecilia; Wolynes, Peter G

    2009-09-01

    Conformational restriction by fragment assembly and guidance in molecular dynamics are alternate conformational search strategies in protein structure prediction. We examine both approaches using a version of the associative memory Hamiltonian that incorporates the influence of water-mediated interactions (AMW). For short proteins (<70 residues), fragment assembly, while searching a restricted space, compares well to molecular dynamics and is often sufficient to fold such proteins to near-native conformations (4 Å) via simulated annealing. Longer proteins encounter kinetic sampling limitations in fragment assembly that are not seen in molecular dynamics, which generally samples more native-like conformations. We also present a fragment-enriched version of the standard AMW energy function, AMW-FME, which incorporates the local-sequence-alignment-derived fragment libraries from fragment assembly directly into the energy function. This energy function, in which fragment information acts as a guide, not a restriction, is found by molecular dynamics to improve on both previous approaches. PMID:19706384

  14. Optimizing nondecomposable loss functions in structured prediction.

    PubMed

    Ranjbar, Mani; Lan, Tian; Wang, Yang; Robinovitch, Steven N; Li, Ze-Nian; Mori, Greg

    2013-04-01

    We develop an algorithm for structured prediction with nondecomposable performance measures. The algorithm learns parameters of Markov Random Fields (MRFs) and can be applied to multivariate performance measures. Examples include performance measures such as Fβ score (natural language processing), intersection over union (object category segmentation), Precision/Recall at k (search engines), and ROC area (binary classifiers). We attack this optimization problem by approximating the loss function with a piecewise linear function. The loss augmented inference forms a Quadratic Program (QP), which we solve using LP relaxation. We apply this approach to two tasks: object class-specific segmentation and human action retrieval from videos. We show significant improvement over baseline approaches that either use simple loss functions or simple scoring functions on the PASCAL VOC and H3D Segmentation datasets, and a nursing home action recognition dataset. PMID:22868650
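
    Two of the performance measures named in this abstract can be written out to show why they are called nondecomposable: each is a ratio of counts taken over the whole prediction, so it cannot be expressed as a sum of independent per-element losses. The sketch below only computes the measures; it does not reproduce the paper's learning algorithm, and the function names are hypothetical.

```python
def f_beta(true_pos, false_pos, false_neg, beta=1.0):
    """F-beta score from confusion counts: (1 + b^2) TP / ((1 + b^2) TP + b^2 FN + FP)."""
    b2 = beta * beta
    denominator = (1 + b2) * true_pos + b2 * false_neg + false_pos
    return (1 + b2) * true_pos / denominator if denominator else 0.0


def intersection_over_union(predicted, truth):
    """IoU of two sets of labelled elements (for example, pixel indices of a segment)."""
    predicted, truth = set(predicted), set(truth)
    union = predicted | truth
    return len(predicted & truth) / len(union) if union else 1.0


f1_example = f_beta(true_pos=8, false_pos=2, false_neg=2)
iou_example = intersection_over_union({1, 2, 3}, {2, 3, 4})
```

    Changing the label of a single element alters both the numerator and the denominator, which is what couples the elements together and motivates the piecewise linear approximation mentioned in the abstract.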

  15. Studies of the B-Z transition of DNA: The temperature dependence of the free-energy difference, the composition of the counterion sheath in mixed salt, and the preparation of a sample of the 5'-d[T-(m5C-G)12-T] duplex in pure B-DNA or Z-DNA form.

    PubMed

    Guéron, Maurice; Plateau, Pierre; Filoche, Marcel

    2016-07-01

    It is often envisioned that cations might coordinate at specific sites of nucleic acids and play an important structural role, for instance in the transition between B-DNA and Z-DNA. However, nucleic acid models explicitly devoid of specific sites may also exhibit features previously considered as evidence for specific binding. Such is the case of the "composite cylinder" (or CC) model which spreads out localized features of DNA structure and charge by cylindrical averaging, while sustaining the main difference between the B and Z structures, namely the better immersion of the B-DNA phosphodiester charges in the solution. Here, we analyze the non-electrostatic component of the free-energy difference between B-DNA and Z-DNA. We also compute the composition of the counterion sheath in a wide range of mixed-salt solutions and of temperatures: in contrast with the large difference of composition between the B-DNA and Z-DNA forms, the temperature dependence of sheath composition, previously unknown, is very weak. In order to validate the model, the mixed-salt predictions should be compared to experiment. We design a procedure for future measurements of the sheath composition based on Anomalous Small-Angle X-ray Scattering and complemented by 31P NMR. With due consideration for the kinetics of the B-Z transition and for the capacity of generating at will the B or Z form in a single sample, the 5'-d[T-(m5C-G)12-T] 26-mer emerges as a most suitable oligonucleotide for this study. Finally, the application of the finite element method to the resolution of the Poisson-Boltzmann equation is described in detail. © 2016 Wiley Periodicals, Inc. Biopolymers 105: 369-384, 2016. PMID:26900058

  16. Structure of nonevaporating sprays - Measurements and predictions

    NASA Technical Reports Server (NTRS)

    Solomon, A. S. P.; Shuen, J.-S.; Zhang, Q.-F.; Faeth, G. M.

    1984-01-01

    Structure measurements were completed within the dilute portion of axisymmetric nonevaporating sprays (SMD of 30 and 87 microns) injected into a still air environment, including: mean and fluctuating gas velocities and Reynolds stress using laser-Doppler anemometry; mean liquid fluxes using isokinetic sampling; drop sizes using slide impaction; and drop sizes and velocities using multiflash photography. The new measurements were used to evaluate three representative models of sprays: (1) a locally homogeneous flow (LHF) model, where slip between the phases was neglected; (2) a deterministic separated flow (DSF) model, where slip was considered but effects of drop interaction with turbulent fluctuations were ignored; and (3) a stochastic separated flow (SSF) model, where effects of both interphase slip and turbulent fluctuations were considered using random sampling for turbulence properties in conjunction with random-walk computations for drop motion. The LHF and DSF models were unsatisfactory for present test conditions, both underestimating flow widths and the rate of spread of drops. In contrast, the SSF model provided reasonably accurate predictions, including effects of enhanced spreading rates of sprays due to drop dispersion by turbulence, with all empirical parameters fixed from earlier work.

  17. RNA-SSPT: RNA Secondary Structure Prediction Tools.

    PubMed

    Ahmad, Freed; Mahboob, Shahid; Gulzar, Tahsin; Din, Salah U; Hanif, Tanzeela; Ahmad, Hifza; Afzal, Muhammad

    2013-01-01

    The prediction of RNA structure is useful for understanding evolution in both in silico and in vitro studies. Physical methods for determining RNA secondary structure, such as NMR studies, are expensive and difficult. Computational RNA secondary structure prediction is easier. Comparative sequence analysis provides the best solution, but secondary structure prediction for a single RNA sequence is challenging. RNA-SSPT is a tool that computationally predicts the secondary structure of a single RNA sequence. Most RNA secondary structure prediction tools do not allow pseudoknots in the structure or are unable to locate them. The Nussinov dynamic programming algorithm has been implemented in RNA-SSPT. Current studies show that only the energetically most favorable secondary structure is required, and a modification of the algorithm is also available that produces base pairs to lower the total free energy of the secondary structure. For visualization of RNA secondary structure, NAVIEW, written in C, was used and ported to C# to meet the requirements of the tool. RNA-SSPT is built in C# using Dot Net 2.0 in Microsoft Visual Studio 2005 Professional edition. The accuracy of RNA-SSPT is tested in terms of sensitivity and positive predictive value. The tool serves both secondary structure prediction and secondary structure visualization purposes. PMID:24250115
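
    As an illustration of the Nussinov dynamic programming algorithm named in this abstract, the following is a minimal sketch of its textbook base-pair-maximisation form. It is not the RNA-SSPT code, which is written in C# and also offers an energy-based modification. The function name `nussinov`, the `min_loop` parameter, and the decision to allow G-U wobble pairs are assumptions made here.

```python
def nussinov(seq, min_loop=0):
    """Return (max_pairs, dot_bracket) for an RNA sequence, without pseudoknots.

    min_loop is the minimum number of unpaired bases enclosed by a pair.
    """
    pairs = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    dp = [[0] * n for _ in range(n)]  # dp[i][j] = max pairs within seq[i..j]

    # Fill by increasing span so every sub-interval is already solved.
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = max(dp[i + 1][j], dp[i][j - 1])      # i or j left unpaired
            if (seq[i], seq[j]) in pairs:
                best = max(best, dp[i + 1][j - 1] + 1)  # i pairs with j
            for k in range(i + 1, j):                   # bifurcation
                best = max(best, dp[i][k] + dp[k + 1][j])
            dp[i][j] = best

    # Trace back one optimal structure with an explicit stack.
    structure = ["."] * n
    stack = [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if i >= j:
            continue
        if dp[i][j] == dp[i + 1][j]:
            stack.append((i + 1, j))
        elif dp[i][j] == dp[i][j - 1]:
            stack.append((i, j - 1))
        elif (seq[i], seq[j]) in pairs and dp[i][j] == dp[i + 1][j - 1] + 1:
            structure[i], structure[j] = "(", ")"
            stack.append((i + 1, j - 1))
        else:
            for k in range(i + 1, j):
                if dp[i][j] == dp[i][k] + dp[k + 1][j]:
                    stack.append((i, k))
                    stack.append((k + 1, j))
                    break
    return (dp[0][n - 1] if n else 0), "".join(structure)


count, fold = nussinov("GGGAAACCC", min_loop=3)
```

    For the hairpin-like input above, the three G-C pairs can only be nested, so the sketch reports three pairs with the bracket string `(((...)))`. Because the recursion only combines nested or adjacent sub-intervals, crossing pairs (pseudoknots) are excluded by construction, which matches the limitation noted in the abstract.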

  18. A comprehensive comparison of comparative RNA structure prediction approaches

    PubMed Central

    Gardner, Paul P; Giegerich, Robert

    2004-01-01

    Background: An increasing number of researchers have released novel RNA structure analysis and prediction algorithms for comparative approaches to structure prediction. Yet, independent benchmarking of these algorithms is rarely performed as is now common practice for protein-folding, gene-finding and multiple-sequence-alignment algorithms. Results: Here we evaluate a number of RNA folding algorithms using reliable RNA data-sets and compare their relative performance. Conclusions: We conclude that comparative data can enhance structure prediction, but structure-prediction algorithms vary widely in terms of both sensitivity and selectivity across different lengths and homologies. Furthermore, we outline some directions for future research. PMID:15458580

  19. Protein short loop prediction in terms of a structural alphabet.

    PubMed

    Tyagi, Manoj; Bornot, Aurélie; Offmann, Bernard; de Brevern, Alexandre G

    2009-08-01

    Loops connect regular secondary structures. In many instances, they are known to play crucial biological roles. To bypass the limitation of secondary structure description, we previously defined a structural alphabet composed of 16 structural prototypes, called Protein Blocks (PBs). It leads to an accurate description of every region of 3D protein backbones and has been used in local structure prediction. In the present study, we used our structural alphabet to predict the loops connecting two repetitive structures. We showed the value of taking the flanking regions into account, which improved the prediction rate by up to 19.8%, but we also underline the sensitivity of such an approach. This research can be used to propose different structures for the loops and to probe and sample their flexibility. It is a useful tool for ab initio loop prediction and leads to insights into flexible docking approaches. PMID:19625218

  20. Effects of Complementary DNA and Salt on the Thermoresponsiveness of Poly(N-isopropylacrylamide)-b-DNA.

    PubMed

    Fujita, Masahiro; Hiramine, Hayato; Pan, Pengju; Hikima, Takaaki; Maeda, Mizuo

    2016-02-01

    The thermoresponsive structural transition of poly(N-isopropylacrylamide) (PNIPAAm)-b-DNA copolymers was explored. Molecular assembly of the block copolymers was facilitated by adding salt, and this assembly was not nucleated by the association between DNA strands but by the coil-globule transition of PNIPAAm blocks. Below the lower critical solution temperature (LCST) of PNIPAAm, the copolymer solution remained transparent even at high salt concentrations, regardless of whether DNA was hybridized with its complementary partner to form a double-strand (or single-strand) structure. At the LCST, the hybridized copolymer assembled in spherical nanoparticles, surrounded by double-stranded DNA; subsequently, the non-cross-linking aggregation occurred, while the nanoparticles were dispersed if the salt concentration was low or DNA blocks were unhybridized. When the DNA duplex was denatured to a single-stranded state by heating, the aggregated nanoparticles redispersed owing to the recovery of the steric repulsion of the DNA strands. The changes in the steric and electrostatic effects by hybridization and the addition of salt did not result in any specific attraction between DNA strands but merely decreased the repulsive interactions. The van der Waals attraction between the nanoparticles overcame such repulsive interactions so that the non-cross-linking aggregation of the micellar particles was mediated. PMID:26750407

  1. Predicting Career Advancement with Structural Equation Modelling

    ERIC Educational Resources Information Center

    Heimler, Ronald; Rosenberg, Stuart; Morote, Elsa-Sofia

    2012-01-01

    Purpose: The purpose of this paper is to use the authors' prior findings concerning basic employability skills in order to determine which skills best predict career advancement potential. Design/methodology/approach: Utilizing survey responses of human resource managers, the employability skills showing the largest relationships to career…

  2. Predicting Career Advancement with Structural Equation Modelling

    ERIC Educational Resources Information Center

    Heimler, Ronald; Rosenberg, Stuart; Morote, Elsa-Sofia

    2012-01-01

    Purpose: The purpose of this paper is to use the authors' prior findings concerning basic employability skills in order to determine which skills best predict career advancement potential. Design/methodology/approach: Utilizing survey responses of human resource managers, the employability skills showing the largest relationships to career…

  3. Prediction of binary hard-sphere crystal structures

    NASA Astrophysics Data System (ADS)

    Filion, Laura; Dijkstra, Marjolein

    2009-04-01

    We present a method based on a combination of a genetic algorithm and Monte Carlo simulations to predict close-packed crystal structures in hard-core systems. We employ this method to predict the binary crystal structures in a mixture of large and small hard spheres with various stoichiometries and diameter ratios between 0.4 and 0.84. In addition to known binary hard-sphere crystal structures similar to NaCl and AlB2, we predict additional crystal structures with the symmetry of CrB, γCuTi, αIrV, HgBr2, AuTe2, Ag2Se, and various structures for which an atomic analog was not found. In order to determine the crystal structures at infinite pressures, we calculate the maximum packing density as a function of size ratio for the crystal structures predicted by our GA using a simulated annealing approach.

  4. RBO Aleph: leveraging novel information sources for protein structure prediction

    PubMed Central

    Mabrouk, Mahmoud; Putz, Ines; Werner, Tim; Schneider, Michael; Neeb, Moritz; Bartels, Philipp; Brock, Oliver

    2015-01-01

    RBO Aleph is a novel protein structure prediction web server for template-based modeling, protein contact prediction and ab initio structure prediction. The server has a strong emphasis on modeling difficult protein targets for which templates cannot be detected. RBO Aleph's unique features are (i) the use of combined evolutionary and physicochemical information to perform residue–residue contact prediction and (ii) leveraging this contact information effectively in conformational space search. RBO Aleph emerged as one of the leading approaches to ab initio protein structure prediction and contact prediction during the most recent Critical Assessment of Protein Structure Prediction experiment (CASP11, 2014). In addition to RBO Aleph's main focus on ab initio modeling, the server also provides state-of-the-art template-based modeling services. Based on template availability, RBO Aleph switches automatically between template-based modeling and ab initio prediction based on the target protein sequence, facilitating use especially for non-expert users. The RBO Aleph web server offers a range of tools for visualization and data analysis, such as the visualization of predicted models, predicted contacts and the estimated prediction error along the model's backbone. The server is accessible at http://compbio.robotics.tu-berlin.de/rbo_aleph/. PMID:25897112

  5. RBO Aleph: leveraging novel information sources for protein structure prediction.

    PubMed

    Mabrouk, Mahmoud; Putz, Ines; Werner, Tim; Schneider, Michael; Neeb, Moritz; Bartels, Philipp; Brock, Oliver

    2015-07-01

    RBO Aleph is a novel protein structure prediction web server for template-based modeling, protein contact prediction and ab initio structure prediction. The server has a strong emphasis on modeling difficult protein targets for which templates cannot be detected. RBO Aleph's unique features are (i) the use of combined evolutionary and physicochemical information to perform residue-residue contact prediction and (ii) leveraging this contact information effectively in conformational space search. RBO Aleph emerged as one of the leading approaches to ab initio protein structure prediction and contact prediction during the most recent Critical Assessment of Protein Structure Prediction experiment (CASP11, 2014). In addition to RBO Aleph's main focus on ab initio modeling, the server also provides state-of-the-art template-based modeling services. Based on template availability, RBO Aleph switches automatically between template-based modeling and ab initio prediction based on the target protein sequence, facilitating use especially for non-expert users. The RBO Aleph web server offers a range of tools for visualization and data analysis, such as the visualization of predicted models, predicted contacts and the estimated prediction error along the model's backbone. The server is accessible at http://compbio.robotics.tu-berlin.de/rbo_aleph/. PMID:25897112

  6. Prediction of incommensurate crystal structure in Ca at high pressure.

    PubMed

    Arapan, Sergiu; Mao, Ho-Kwang; Ahuja, Rajeev

    2008-12-30

    Ca shows an interesting high-pressure phase transformation sequence, but, despite similar physical properties at high pressure and affinity in the electronic structure with its neighbors in the periodic table, no complex phase has been identified for Ca so far. We predict an incommensurate high-pressure phase of Ca from first-principles calculations and describe a procedure for estimating incommensurate structure parameters by means of electronic structure calculations for periodic crystals. Thus, by using the ab initio technique for periodic structures, one can not only obtain reliable information about the electronic structure and structural parameters of an incommensurate phase, but also identify and predict such phases in new elements. PMID:19104037

  7. Genome-wide Membrane Protein Structure Prediction

    PubMed Central

    Piccoli, Stefano; Suku, Eda; Garonzi, Marianna; Giorgetti, Alejandro

    2013-01-01

    Transmembrane proteins allow cells to extensively communicate with the external world in a very accurate and specific way. They form principal nodes in several signaling pathways and attract large interest in therapeutic intervention, as the majority of pharmaceutical compounds target membrane proteins. Thus, according to the current genome annotation methods, a detailed structural/functional characterization at the protein level of each of the elements codified in the genome is also required. The extreme difficulty in obtaining high-resolution three-dimensional structures calls for computational approaches. Here we review to what extent the efforts made in the last few years, combining the structural characterization of membrane proteins with protein bioinformatics techniques, could help describe membrane proteins at a genome-wide scale. In particular, we analyze the use of comparative modeling techniques as a way of overcoming the lack of high-resolution three-dimensional structures in the human membrane proteome. PMID:24403851

  8. Neural network definitions of highly predictable protein secondary structure classes

    SciTech Connect

    Lapedes, A.; Steeg, E.; Farber, R.

    1994-02-01

    We use two co-evolving neural networks to determine new classes of protein secondary structure which are significantly more predictable from local amino acid sequence than the conventional secondary structure classification. Accurate prediction of the conventional secondary structure classes (alpha helix, beta strand, and coil) from primary sequence has long been an important problem in computational molecular biology. Neural networks have been a popular method for attempting to predict these conventional secondary structure classes, but accuracy has been disappointingly low. The algorithm presented here uses neural networks to simultaneously examine both sequence and structure data, and to evolve new classes of secondary structure that can be predicted from sequence with significantly higher accuracy than the conventional classes. These new classes have both similarities to, and differences from, the conventional alpha helix, beta strand and coil.

  9. A New Approach to Predict the Structure of Alloys

    NASA Astrophysics Data System (ADS)

    Curtarolo, Stefano; Morgan, Dane; Persson, Kristin; Ceder, Gerbrand; Rodgers, John

    2003-03-01

    The ability to predict the crystal structure of a material, given its constituent atoms, is one of the most fundamental problems in materials research. Knowledge of the crystal structure is essential to predict or rationalize properties of the material, from mechanical behavior, to optical and electronic properties. Despite its importance, the structure problem remains unsolved and most crystal structure determinations are performed after synthesis, by experimental means. While first principles computations can be used to predict with high accuracy a structural energy, ground state searches are usually limited to calculating the energy of a small number of pre-defined structures. Hence it is difficult to make predictions for completely novel and unknown systems. In order to drastically improve the capability of predicting the ground states of intermetallic alloys, we present an algorithm that can rank a relatively large number of trial structures in terms of the probability that they are ground states. First principles predictions can then be performed on the most likely candidates. With each first principles calculation, the candidate list is improved. This technique makes it possible to predict intermetallic ground states with ~90% accuracy using only ~20 first principles calculations. Unlike previous methods, this approach is not limited to super structures of a given lattice type and extends relatively easily to multi-component systems.

  10. Prediction of protein structural classes using hybrid properties.

    PubMed

    Li, Wenjin; Lin, Kao; Feng, Kaiyan; Cai, Yudong

    2008-01-01

    In this paper, amino acid compositions are combined with some protein sequence properties (physiochemical properties) to predict protein structural classes. We are able to predict protein structural classes using a mathematical model that combines the nearest neighbor algorithm (NNA), mRMR (minimum redundancy, maximum relevance), and feature forward searching strategy. Jackknife cross-validation is used to evaluate the prediction accuracy. As a result, the prediction success rate improves to 68.8%, which is better than the 62.2% obtained when using only amino acid compositions. Therefore, we conclude that the physiochemical properties are factors that contribute to the protein folding phenomena and the most contributing features are found to be the amino acid composition. We expect that prediction accuracy will improve further as more sequence information comes to light. A web server for predicting the protein structural classes is available at http://app3.biosino.org:8080/liwenjin/index.jsp. PMID:18953662
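
    The core of the approach described in this abstract, a nearest neighbor classifier on amino acid composition evaluated by jackknife (leave-one-out) cross-validation, can be sketched in a few lines. This is an illustrative assumption, not the authors' model: the mRMR feature ranking and forward feature search are omitted, the four sequences and their class labels are invented toy data, and all function names are hypothetical.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq):
    """20-dimensional amino acid composition (fraction of each residue type)."""
    return [seq.count(aa) / len(seq) for aa in AMINO_ACIDS]


def nearest_neighbour_label(query, training):
    """Label of the training sample closest to query in Euclidean distance."""
    return min(training, key=lambda sample: math.dist(query, sample[0]))[1]


def jackknife_accuracy(samples):
    """Leave each sample out in turn and classify it from all the others."""
    correct = 0
    for i, (features, label) in enumerate(samples):
        others = samples[:i] + samples[i + 1:]
        if nearest_neighbour_label(features, others) == label:
            correct += 1
    return correct / len(samples)


# Invented toy data: not real proteins, and the labels are placeholders.
toy = [
    ("AAAALLLLEEEE", "all-alpha"),
    ("AALLLLEEEEEE", "all-alpha"),
    ("VVVVIIIITTTT", "all-beta"),
    ("VVIIIITTTTTT", "all-beta"),
]
samples = [(composition(seq), label) for seq, label in toy]
accuracy = jackknife_accuracy(samples)
```

    Extra physicochemical features of the kind the abstract mentions could be appended to each composition vector before the distance is computed.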

  11. Comparative melting and healing of B-DNA and Z-DNA by an infrared laser pulse.

    PubMed

    Man, Viet Hoang; Pan, Feng; Sagui, Celeste; Roland, Christopher

    2016-04-14

    We explore the use of a fast laser melting simulation approach combined with atomistic molecular dynamics simulations in order to determine the melting and healing responses of B-DNA and Z-DNA dodecamers with the same d(5'-CGCGCGCGCGCG-3')2 sequence. The frequency of the laser pulse is specifically tuned to disrupt Watson-Crick hydrogen bonds, thus inducing melting of the DNA duplexes. Subsequently, the structures relax and partially refold, depending on the field strength. In addition to the inherent interest of the nonequilibrium melting process, we propose that fast melting by an infrared laser pulse could be used as a technique for a fast comparison of relative stabilities of same-sequence oligonucleotides with different secondary structures with full atomistic detail of the structures and solvent. This could be particularly useful for nonstandard secondary structures involving non-canonical base pairs, mismatches, etc. PMID:27083751

  12. Comparative melting and healing of B-DNA and Z-DNA by an infrared laser pulse

    NASA Astrophysics Data System (ADS)

    Man, Viet Hoang; Pan, Feng; Sagui, Celeste; Roland, Christopher

    2016-04-01

    We explore the use of a fast laser melting simulation approach combined with atomistic molecular dynamics simulations in order to determine the melting and healing responses of B-DNA and Z-DNA dodecamers with the same d(5'-CGCGCGCGCGCG-3')2 sequence. The frequency of the laser pulse is specifically tuned to disrupt Watson-Crick hydrogen bonds, thus inducing melting of the DNA duplexes. Subsequently, the structures relax and partially refold, depending on the field strength. In addition to the inherent interest of the nonequilibrium melting process, we propose that fast melting by an infrared laser pulse could be used as a technique for a fast comparison of relative stabilities of same-sequence oligonucleotides with different secondary structures with full atomistic detail of the structures and solvent. This could be particularly useful for nonstandard secondary structures involving non-canonical base pairs, mismatches, etc.

  13. Predicting Conformational Flexibility in Protein Structure

    NASA Astrophysics Data System (ADS)

    Jacobs, Donald J.; Kuhn, Leslie A.; Thorpe, Michael F.

    1999-04-01

    The microstructure of a protein is represented as a generic bar-joint truss framework, where hard covalent forces and strong hydrogen bonds are modeled as distance constraints. The mechanical stability is analyzed using graph theoretical techniques with the aid of the FIRST program that determines the Floppy Inclusion and Rigid Substructure Topography. FIRST provides a real-time tool for evaluating intrinsic flexibility in protein structure. Unlike many methods for parsing protein folds, this approach calculates exact mechanical properties of a protein structure (and other macromolecules) under a given set of distance constraints. These properties include: counting the number of independent degrees of freedom, locating overconstrained regions where internal strain arises, partitioning the protein structure into rigid clusters and identifying underconstrained regions where continuous deformations can take place. We quantify the degree of conformational flexibility in HIV protease, and find that the characterization correlates well with mobility and conformational changes observed crystallographically.

  14. A physical approach to protein structure prediction: CASP4 results

    SciTech Connect

    Crivelli, Silvia; Eskow, Elizabeth; Bader, Brett; Lamberti, Vincent; Byrd, Richard; Schnabel, Robert; Head-Gordon, Teresa

    2001-02-27

    We describe our global optimization method called Stochastic Perturbation with Soft Constraints (SPSC), which uses information from known proteins to predict secondary structure, but not in the tertiary structure predictions or in generating the terms of the physics-based energy function. Our approach is also characterized by the use of an all atom energy function that includes a novel hydrophobic solvation function derived from experiments that shows promising ability for energy discrimination against misfolded structures. We present the results obtained using our SPSC method and energy function for blind prediction in the 4th Critical Assessment of Techniques for Protein Structure Prediction (CASP4) competition, and show that our approach is more effective on targets for which less information from known proteins is available. In fact our SPSC method produced the best prediction for one of the most difficult targets of the competition, a new fold protein of 240 amino acids.

  15. Quantifying variances in comparative RNA secondary structure prediction

    PubMed Central

    2013-01-01

    Background With the advancement of next-generation sequencing and transcriptomics technologies, regulatory effects involving RNA, in particular RNA structural changes are being detected. These results often rely on RNA secondary structure predictions. However, current approaches to RNA secondary structure modelling produce predictions with a high variance in predictive accuracy, and we have little quantifiable knowledge about the reasons for these variances. Results In this paper we explore a number of factors which can contribute to poor RNA secondary structure prediction quality. We establish a quantified relationship between alignment quality and loss of accuracy. Furthermore, we define two new measures to quantify uncertainty in alignment-based structure predictions. One of the measures improves on the “reliability score” reported by PPfold, and considers alignment uncertainty as well as base-pair probabilities. The other measure considers the information entropy for SCFGs over a space of input alignments. Conclusions Our predictive accuracy improves on the PPfold reliability score. We can successfully characterize many of the underlying reasons for and variances in poor prediction. However, there is still variability unaccounted for, which we therefore suggest comes from the RNA secondary structure predictive model itself. PMID:23634662

  16. Protein Structure and Function Prediction Using I-TASSER

    PubMed Central

    Yang, Jianyi; Zhang, Yang

    2016-01-01

    I-TASSER is a hierarchical protocol for automated protein structure prediction and structure-based function annotation. Starting from the amino acid sequence of target proteins, I-TASSER first generates full-length atomic structural models from multiple threading alignments and iterative structural assembly simulations followed by atomic-level structure refinement. The biological functions of the protein, including ligand-binding sites, enzyme commission number, and gene ontology terms, are then inferred from known protein function databases based on sequence and structure profile comparisons. I-TASSER is freely available as both an on-line server and a stand-alone package. This unit describes how to use the I-TASSER protocol to generate structure and function prediction and how to interpret the prediction results, as well as alternative approaches for further improving the I-TASSER modeling quality for distant-homologous and multi-domain protein targets. PMID:26678386

  17. PredyFlexy: flexibility and local structure prediction from sequence

    PubMed Central

    de Brevern, Alexandre G.; Bornot, Aurélie; Craveur, Pierrick; Etchebest, Catherine; Gelly, Jean-Christophe

    2012-01-01

    Protein structures are necessary for understanding protein function at a molecular level. Dynamics and flexibility of protein structures are also key elements of protein function. So, we have proposed to look at protein flexibility using novel methods: (i) using a structural alphabet and (ii) combining classical X-ray B-factor data and molecular dynamics simulations. First, we established a library composed of structural prototypes (LSPs) to describe protein structure by a limited set of recurring local structures. We developed a prediction method that proposes structural candidates in terms of LSPs and predicts protein flexibility along a given sequence. Second, we examine flexibility according to two different descriptors: X-ray B-factors considered as good indicators of flexibility and the root mean square fluctuations, based on molecular dynamics simulations. We then define three flexibility classes and propose a method based on the LSP prediction method for predicting flexibility along the sequence. This method does not resort to sophisticated learning of flexibility but predicts flexibility from average flexibility of predicted local structures. The method is implemented in PredyFlexy web server. Results are similar to those obtained with the most recent, cutting-edge methods based on direct learning of flexibility data conducted with sophisticated algorithms. PredyFlexy can be accessed at http://www.dsimb.inserm.fr/dsimb_tools/predyflexy/. PMID:22689641

  18. Predicting Crystal Structures with Data Mining of Quantum Calculations

    NASA Astrophysics Data System (ADS)

    Curtarolo, Stefano; Morgan, Dane; Persson, Kristin; Rodgers, John; Ceder, Gerbrand

    2003-09-01

    Predicting and characterizing the crystal structure of materials is a key problem in materials research and development. It is typically addressed with highly accurate quantum mechanical computations on a small set of candidate structures, or with empirical rules that have been extracted from a large amount of experimental information, but have limited predictive power. In this Letter, we transfer the concept of heuristic rule extraction to a large library of ab initio calculated information, and we demonstrate that this can be developed into a tool for crystal structure prediction.

  19. Computational methods in sequence and structure prediction

    NASA Astrophysics Data System (ADS)

    Lang, Caiyi

    This dissertation is organized into two parts. In the first part, we will discuss three computational methods for cis-regulatory element recognition in three different gene regulatory networks as the following: (a) Using a comprehensive "Phylogenetic Footprinting Comparison" method, we will investigate the promoter sequence structures of three enzymes (PAL, CHS and DFR) that catalyze sequential steps in the pathway from phenylalanine to anthocyanins in plants. Our result shows there exists a putative cis-regulatory element "AC(C/G)TAC(C)" in the upstream of these enzyme genes. We propose this cis-regulatory element to be responsible for the genetic regulation of these three enzymes, and this element might also be the binding site for MYB class transcription factor PAP1. (b) We will investigate the role of the Arabidopsis gene glutamate receptor 1.1 (AtGLR1.1) in C and N metabolism by utilizing the microarray data we obtained from AtGLR1.1 deficient lines (antiAtGLR1.1). We focus our investigation on the putatively co-regulated transcript profile of 876 genes we have collected in antiAtGLR1.1 lines. By (a) scanning the occurrence of several groups of known abscisic acid (ABA) related cis-regulatory elements in the upstream regions of 876 Arabidopsis genes; and (b) exhaustive scanning of all possible 6-10 bps motif occurrence in the upstream regions of the same set of genes, we are able to make a quantitative estimation on the enrichment level of each of the cis-regulatory element candidates. We finally conclude that one specific cis-regulatory element group, called "ABRE" elements, are statistically highly enriched within the 876-gene group as compared to their occurrence within the genome. (c) We will introduce a new general purpose algorithm, called "fuzzy REDUCE1", which we have developed recently for automated cis-regulatory element identification. In the second part, we will discuss our newly devised protein design framework. With this framework we have developed a software package which is capable of designing novel protein structures at the atomic resolution. This software package allows us to perform protein structure design with a flexible backbone. The backbone flexibility includes loop region relaxation as well as a secondary structure collective mode relaxation scheme. (Abstract shortened by UMI.)

  20. Structural class prediction of protein using novel feature extraction method from chaos game representation of predicted secondary structure.

    PubMed

    Zhang, Lichao; Kong, Liang; Han, Xiaodong; Lv, Jinfeng

    2016-07-01

    Protein structural class prediction plays an important role in protein structure and function analysis, drug design and many other biological applications. Extracting good representation from protein sequence is fundamental for this prediction task. In recent years, although several secondary structure based feature extraction strategies have been specially proposed for low-similarity protein sequences, the prediction accuracy still remains limited. To explore the potential of secondary structure information, this study proposed a novel feature extraction method from the chaos game representation of predicted secondary structure to mainly capture sequence order information and secondary structure segments distribution information in a given protein sequence. Several kinds of prediction accuracies obtained by the jackknife test are reported on three widely used low-similarity benchmark datasets (25PDB, 1189 and 640). Compared with the state-of-the-art prediction methods, the proposed method achieves the highest overall accuracies on all the three datasets. The experimental results confirm that the proposed feature extraction method is effective for accurate prediction of protein structural class. Moreover, it is anticipated that the proposed method could be extended to other graphical representations of protein sequence and be helpful in future research. PMID:27084358
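
    To make the term "chaos game representation" concrete, here is a minimal Python sketch of a generic chaos game applied to a three-state secondary structure string. It is not the feature extraction method of the paper: the triangle vertices, the starting point, the halfway step and the names `VERTICES` and `chaos_game` are all assumptions chosen for illustration.

```python
# Assumed vertex positions for the three secondary structure states
# (H = helix, E = strand, C = coil); any non-degenerate triangle would do.
VERTICES = {"H": (0.0, 0.0), "E": (1.0, 0.0), "C": (0.5, 1.0)}


def chaos_game(ss, start=(0.5, 0.5)):
    """Map a secondary structure string to a list of 2D points.

    Each symbol moves the current point halfway towards that symbol's vertex,
    so the position of a point encodes the recent history of the sequence.
    """
    x, y = start
    points = []
    for state in ss:
        vx, vy = VERTICES[state]
        x, y = (x + vx) / 2, (y + vy) / 2
        points.append((x, y))
    return points


# Hypothetical predicted secondary structure string.
for point in chaos_game("HHHEC"):
    print(point)
```

    Because each point depends on the order of the preceding symbols, statistics computed on the point cloud can carry sequence order information, which is the property the abstract highlights.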

  1. Are predicted protein structures of any value for binding site prediction and virtual ligand screening?

    PubMed Central

    Skolnick, Jeffrey; Zhou, Hongyi; Gao, Mu

    2013-01-01

    The recently developed field of ligand homology modeling, LHM, that extends the ideas of protein homology modeling to the prediction of ligand binding sites and for use in virtual ligand screening has emerged as a powerful new approach. Unlike traditional docking methodologies, LHM can be applied to low-to-moderate resolution predicted as well as experimental structures with little if any diminution in performance; thereby enabling ~75% of an average proteome to have potentially significant virtual screening predictions. In large scale benchmarking, LHM is able to predict off-target ligand binding. Thus, despite the widespread belief to the contrary, low-to-moderate resolution predicted structures have considerable utility for biochemical function prediction. PMID:23415854

  2. WeFold: A Coopetition for Protein Structure Prediction

    PubMed Central

    Khoury, George A.; Liwo, Adam; Khatib, Firas; Zhou, Hongyi; Chopra, Gaurav; Bacardit, Jaume; Bortot, Leandro O.; Faccioli, Rodrigo A.; Deng, Xin; He, Yi; Krupa, Pawel; Li, Jilong; Mozolewska, Magdalena A.; Sieradzan, Adam K.; Smadbeck, James; Wirecki, Tomasz; Cooper, Seth; Flatten, Jeff; Xu, Kefan; Baker, David; Cheng, Jianlin; Delbem, Alexandre C. B.; Floudas, Christodoulos A.; Keasar, Chen; Levitt, Michael; Popović, Zoran; Scheraga, Harold A.; Skolnick, Jeffrey; Crivelli, Silvia N.; Players, Foldit

    2014-01-01

    The protein structure prediction problem continues to elude scientists. Despite the introduction of many methods, only modest gains were made over the last decade for certain classes of prediction targets. To address this challenge, a social-media based worldwide collaborative effort, named WeFold, was undertaken by thirteen labs. During the collaboration, the labs were simultaneously competing with each other. Here, we present the first attempt at “coopetition” in scientific research applied to the protein structure prediction and refinement problems. The coopetition was possible by allowing the participating labs to contribute different components of their protein structure prediction pipelines and create new hybrid pipelines that they tested during CASP10. This manuscript describes both successes and areas needing improvement as identified throughout the first WeFold experiment and discusses the efforts that are underway to advance this initiative. A footprint of all contributions and structures are publicly accessible at http://www.wefold.org. PMID:24677212

  3. WeFold: a coopetition for protein structure prediction.

    PubMed

    Khoury, George A; Liwo, Adam; Khatib, Firas; Zhou, Hongyi; Chopra, Gaurav; Bacardit, Jaume; Bortot, Leandro O; Faccioli, Rodrigo A; Deng, Xin; He, Yi; Krupa, Pawel; Li, Jilong; Mozolewska, Magdalena A; Sieradzan, Adam K; Smadbeck, James; Wirecki, Tomasz; Cooper, Seth; Flatten, Jeff; Xu, Kefan; Baker, David; Cheng, Jianlin; Delbem, Alexandre C B; Floudas, Christodoulos A; Keasar, Chen; Levitt, Michael; Popović, Zoran; Scheraga, Harold A; Skolnick, Jeffrey; Crivelli, Silvia N

    2014-09-01

    The protein structure prediction problem continues to elude scientists. Despite the introduction of many methods, only modest gains were made over the last decade for certain classes of prediction targets. To address this challenge, a social-media based worldwide collaborative effort, named WeFold, was undertaken by 13 labs. During the collaboration, the laboratories were simultaneously competing with each other. Here, we present the first attempt at "coopetition" in scientific research applied to the protein structure prediction and refinement problems. The coopetition was possible by allowing the participating labs to contribute different components of their protein structure prediction pipelines and create new hybrid pipelines that they tested during CASP10. This manuscript describes both successes and areas needing improvement as identified throughout the first WeFold experiment and discusses the efforts that are underway to advance this initiative. A footprint of all contributions and structures are publicly accessible at http://www.wefold.org. PMID:24677212

  4. JPred4: a protein secondary structure prediction server.

    PubMed

    Drozdetskiy, Alexey; Cole, Christian; Procter, James; Barton, Geoffrey J

    2015-07-01

    JPred4 (http://www.compbio.dundee.ac.uk/jpred4) is the latest version of the popular JPred protein secondary structure prediction server which provides predictions by the JNet algorithm, one of the most accurate methods for secondary structure prediction. In addition to protein secondary structure, JPred also makes predictions of solvent accessibility and coiled-coil regions. The JPred service runs up to 94 000 jobs per month and has carried out over 1.5 million predictions in total for users in 179 countries. The JPred4 web server has been re-implemented in the Bootstrap framework and JavaScript to improve its design, usability and accessibility from mobile devices. JPred4 features higher accuracy, with a blind three-state (α-helix, β-strand and coil) secondary structure prediction accuracy of 82.0% while solvent accessibility prediction accuracy has been raised to 90% for residues <5% accessible. Reporting of results is enhanced both on the website and through the optional email summaries and batch submission results. Predictions are now presented in SVG format with options to view full multiple sequence alignments with and without gaps and insertions. Finally, the help-pages have been updated and tool-tips added as well as step-by-step tutorials. PMID:25883141
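
    The three-state accuracy quoted in this abstract is a per-residue agreement rate. A minimal Python sketch of that calculation follows, assuming predicted and observed states are given as equal-length strings over H, E and C; the function name `q3_accuracy` and the example strings are hypothetical and nothing here calls the JPred4 server.

```python
def q3_accuracy(predicted, observed):
    """Percentage of residues whose predicted state (H, E or C) matches the observed one."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("cannot score an empty sequence")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


# Hypothetical eight-residue example with one mismatch at the sixth position.
print(q3_accuracy("HHHEECCC", "HHHEEECC"))
```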

  5. A predictive structural model for bulk metallic glasses

    PubMed Central

    Laws, K. J.; Miracle, D. B.; Ferry, M.

    2015-01-01

    Great progress has been made in understanding the atomic structure of metallic glasses, but there is still no clear connection between atomic structure and glass-forming ability. Here we give new insights into perhaps the most important question in the field of amorphous metals: how can glass-forming ability be predicted from atomic structure? We give a new approach to modelling metallic glass atomic structures by solving three long-standing problems: we discover a new family of structural defects that discourage glass formation; we impose efficient local packing around all atoms simultaneously; and we enforce structural self-consistency. Fewer than a dozen binary structures satisfy these constraints, but extra degrees of freedom in structures with three or more different atom sizes significantly expand the number of relatively stable, ‘bulk' metallic glasses. The present work gives a new approach towards achieving the long-sought goal of a predictive capability for bulk metallic glasses. PMID:26370667

  6. A predictive structural model for bulk metallic glasses

    NASA Astrophysics Data System (ADS)

    Laws, K. J.; Miracle, D. B.; Ferry, M.

    2015-09-01

    Great progress has been made in understanding the atomic structure of metallic glasses, but there is still no clear connection between atomic structure and glass-forming ability. Here we give new insights into perhaps the most important question in the field of amorphous metals: how can glass-forming ability be predicted from atomic structure? We give a new approach to modelling metallic glass atomic structures by solving three long-standing problems: we discover a new family of structural defects that discourage glass formation; we impose efficient local packing around all atoms simultaneously; and we enforce structural self-consistency. Fewer than a dozen binary structures satisfy these constraints, but extra degrees of freedom in structures with three or more different atom sizes significantly expand the number of relatively stable, `bulk' metallic glasses. The present work gives a new approach towards achieving the long-sought goal of a predictive capability for bulk metallic glasses.

  7. A predictive structural model for bulk metallic glasses.

    PubMed

    Laws, K J; Miracle, D B; Ferry, M

    2015-01-01

    Great progress has been made in understanding the atomic structure of metallic glasses, but there is still no clear connection between atomic structure and glass-forming ability. Here we give new insights into perhaps the most important question in the field of amorphous metals: how can glass-forming ability be predicted from atomic structure? We give a new approach to modelling metallic glass atomic structures by solving three long-standing problems: we discover a new family of structural defects that discourage glass formation; we impose efficient local packing around all atoms simultaneously; and we enforce structural self-consistency. Fewer than a dozen binary structures satisfy these constraints, but extra degrees of freedom in structures with three or more different atom sizes significantly expand the number of relatively stable, 'bulk' metallic glasses. The present work gives a new approach towards achieving the long-sought goal of a predictive capability for bulk metallic glasses. PMID:26370667

  8. Methods for evaluating the predictive accuracy of structural dynamic models

    NASA Technical Reports Server (NTRS)

    Hasselman, Timothy K.; Chrostowski, Jon D.

    1991-01-01

    Modeling uncertainty is defined in terms of the difference between predicted and measured eigenvalues and eigenvectors. Data compiled from 22 sets of analysis/test results was used to create statistical databases for large truss-type space structures and both pretest and posttest models of conventional satellite-type space structures. Modeling uncertainty is propagated through the model to produce intervals of uncertainty on frequency response functions, both amplitude and phase. This methodology was used successfully to evaluate the predictive accuracy of several structures, including the NASA CSI Evolutionary Structure tested at Langley Research Center. Test measurements for this structure were within ± one-sigma intervals of predicted accuracy for the most part, demonstrating the validity of the methodology and computer code.

  9. An object programming based environment for protein secondary structure prediction.

    PubMed

    Giacomini, M; Ruggiero, C; Sacile, R

    1996-01-01

    The most frequently used methods for protein secondary structure prediction are empirical statistical methods and rule based methods. A consensus system based on object-oriented programming is presented, which integrates the two approaches with the aim of improving the prediction quality. This system uses an object-oriented knowledge representation based on the concepts of conformation, residue and protein, where the conformation class is the basis, the residue class derives from it and the protein class derives from the residue class. The system has been tested with satisfactory results on several proteins of the Brookhaven Protein Data Bank. Its results have been compared with the results of the most widely used prediction methods, and they show a higher prediction capability and greater stability. Moreover, the system itself provides an index of the reliability of its current prediction. This system can also be regarded as a basis structure for programs of this kind. PMID:8803560

  10. A-DNA and B-DNA: Comparing Their Historical X-Ray Fiber Diffraction Images

    ERIC Educational Resources Information Center

    Lucas, Amand A.

    2008-01-01

    A-DNA and B-DNA are two secondary molecular conformations (among other allomorphs) that double-stranded DNA drawn into a fiber can assume, depending on the relative water content and other chemical parameters of the fiber. They were the first two forms to be observed by X-ray fiber diffraction in the early 1950s, respectively by Wilkins and

  12. Text Prediction on Structured Data Entry in Healthcare

    PubMed Central

    Hua, L.; Wang, S.; Gong, Y.

    2014-01-01

    Summary Background Structured data entry pervades computerized patient safety event reporting systems and serves as a key component in collecting patient-related information in electronic health records. Clinicians would spend more time being with patients and arrive at a high probability of proper diagnosis and treatment, if data entry can be completed efficiently and effectively. Historically it has been proven text prediction holds potential for human performance regarding data entry in a variety of research areas. Objective This study aimed at examining a function of text prediction proposed for increasing efficiency and data quality in structured data entry. Methods We employed a two-group randomized design with fifty-two nurses in this usability study. Each participant was assigned the task of reporting patient falls by answering multiple choice questions either with or without the text prediction function. t-test statistics and linear regression model were applied to analyzing the results of the two groups. Results While both groups of participants exhibited a good capacity of accomplishing the assigned task, the results were an overall 13.0% time reduction and 3.9% increase of response accuracy for the group utilizing the prediction function. Conclusion As a primary attempt investigating the effectiveness of text prediction in healthcare, study findings validated the necessity of text prediction to structured data entry, and laid the ground for further research improving the effectiveness of text prediction in clinical settings. PMID:24734137

  13. Bayesian model of protein primary sequence for secondary structure prediction.

    PubMed

    Li, Qiwei; Dahl, David B; Vannucci, Marina; Hyun Joo; Tsai, Jerry W

    2014-01-01

    Determining the primary structure (i.e., amino acid sequence) of a protein has become cheaper, faster, and more accurate. Higher order protein structure provides insight into a protein's function in the cell. Understanding a protein's secondary structure is a first step towards this goal. Therefore, a number of computational prediction methods have been developed to predict secondary structure from just the primary amino acid sequence. The most successful methods use machine learning approaches that are quite accurate, but do not directly incorporate structural information. As a step towards improving secondary structure reduction given the primary structure, we propose a Bayesian model based on the knob-socket model of protein packing in secondary structure. The method considers the packing influence of residues on the secondary structure determination, including those packed close in space but distant in sequence. By performing an assessment of our method on 2 test sets we show how incorporation of multiple sequence alignment data, similarly to PSIPRED, provides balance and improves the accuracy of the predictions. Software implementing the methods is provided as a web application and a stand-alone implementation. PMID:25314659

  14. Blind Test of Physics-Based Prediction of Protein Structures

    PubMed Central

    Shell, M. Scott; Ozkan, S. Banu; Voelz, Vincent; Wu, Guohong Albert; Dill, Ken A.

    2009-01-01

    We report here a multiprotein blind test of a computer method to predict native protein structures based solely on an all-atom physics-based force field. We use the AMBER 96 potential function with an implicit (GB/SA) model of solvation, combined with replica-exchange molecular-dynamics simulations. Coarse conformational sampling is performed using the zipping and assembly method (ZAM), an approach that is designed to mimic the putative physical routes of protein folding. ZAM was applied to the folding of six proteins, from 76 to 112 monomers in length, in CASP7, a community-wide blind test of protein structure prediction. Because these predictions have about the same level of accuracy as typical bioinformatics methods, and do not utilize information from databases of known native structures, this work opens up the possibility of predicting the structures of membrane proteins, synthetic peptides, or other foldable polymers, for which there is little prior knowledge of native structures. This approach may also be useful for predicting physical protein folding routes, non-native conformations, and other physical properties from amino acid sequences. PMID:19186130

  15. An atomistic geometrical model of the B-DNA configuration for DNA-radiation interaction simulations

    NASA Astrophysics Data System (ADS)

    Bernal, M. A.; Sikansi, D.; Cavalcante, F.; Incerti, S.; Champion, C.; Ivanchenko, V.; Francis, Z.

    2013-12-01

    In this paper, an atomistic geometrical model for the B-DNA configuration is explained. This model accounts for five organization levels of the DNA, up to the 30 nm chromatin fiber. However, fragments of this fiber can be used to construct the whole genome. The algorithm developed in this work is capable of determining the closest atom to an arbitrary point in space. It can be used in any application in which a DNA geometrical model is needed, for instance, in investigations related to the effects of ionizing radiation on the human genetic material. Successful consistency checks were carried out to test the proposed model. Catalogue identifier: AEPZ_v1_0 Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEPZ_v1_0.html Program obtainable from: CPC Program Library, Queen’s University, Belfast, N. Ireland Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html No. of lines in distributed program, including test data, etc.: 1245 No. of bytes in distributed program, including test data, etc.: 6574 Distribution format: tar.gz Programming language: FORTRAN. Computer: Any. Operating system: Multi-platform. RAM: 2 Gb Classification: 3. Nature of problem: The Monte Carlo method is used to simulate the interaction of ionizing radiation with the human genetic material in order to determine DNA damage yields per unit absorbed dose. To accomplish this task, an algorithm to determine if a given energy deposition lies within a given target is needed. This target can be an atom or any other structure of the genetic material. Solution method: This is a stand-alone subroutine describing an atomic-resolution geometrical model of the B-DNA configuration. It is able to determine the closest atom to an arbitrary point in space. This model accounts for five organization levels of the human genetic material, from the nucleotide pair up to the 30 nm chromatin fiber. 
This subroutine carries out a series of coordinate transformations to find the closest atom containing an arbitrary point in space. Atom sizes are set according to the corresponding van der Waals radii. Restrictions: The geometrical model presented here does not include the chromosome organization level, but it could easily be built up by using fragments of the 30 nm chromatin fiber. Unusual features: To our knowledge, this is the first open source atomic-resolution DNA geometrical model developed for DNA-radiation interaction Monte Carlo simulations. In our tests, the current model took into account the explicit position of about 56×10^6 atoms, although the user may increase this number as needed. Running time: This subroutine can process about 2 million points within a few minutes on a typical current computer.
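
    The closest-atom query described in this record can be illustrated with a small brute-force sketch. It is not the authors' FORTRAN subroutine, which relies on coordinate transformations across the organization levels; here every atom is simply scanned. The function name, the three-atom test geometry and the radius table (commonly quoted Bondi van der Waals radii, in angstroms) are assumptions made for illustration.

```python
import numpy as np

# Assumed van der Waals radii in angstroms (commonly quoted Bondi values).
VDW_RADII = {"H": 1.20, "C": 1.70, "N": 1.55, "O": 1.52, "P": 1.80}


def closest_atom(point, coords, elements):
    """Return (index, distance, inside) for the atom whose centre is nearest to `point`.

    `inside` is True when the point lies within that atom's van der Waals sphere.
    This is a brute-force scan over all atoms, not the transformation-based
    search of the published subroutine.
    """
    point = np.asarray(point, dtype=float)
    coords = np.asarray(coords, dtype=float)
    dists = np.linalg.norm(coords - point, axis=1)  # distance to every atom centre
    idx = int(np.argmin(dists))
    inside = bool(dists[idx] <= VDW_RADII[elements[idx]])
    return idx, float(dists[idx]), inside


# Hypothetical three-atom geometry, coordinates in angstroms.
coords = [[0.0, 0.0, 0.0], [3.0, 0.0, 0.0], [0.0, 4.0, 0.0]]
elements = ["C", "O", "P"]

hit = closest_atom([2.5, 0.0, 0.0], coords, elements)     # falls inside the oxygen sphere
miss = closest_atom([10.0, 10.0, 10.0], coords, elements)  # far from every atom
```

    A scan like this costs one distance evaluation per atom per query, so for tens of millions of atoms a spatial index or the hierarchical transformations mentioned in the abstract would be needed in practice.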

  16. Modular prediction of protein structural classes from sequences of twilight-zone identity with predicting sequences

    PubMed Central

    2009-01-01

    Background Knowledge of structural class is used by numerous methods for identification of structural/functional characteristics of proteins and could be used for the detection of remote homologues, particularly for chains that share twilight-zone similarity. In contrast to existing sequence-based structural class predictors, which target four major classes and which are designed for high identity sequences, we predict seven classes from sequences that share twilight-zone identity with the training sequences. Results The proposed MODular Approach to Structural class prediction (MODAS) method is unique as it allows for selection of any subset of the classes. MODAS is also the first to utilize a novel, custom-built feature-based sequence representation that combines evolutionary profiles and predicted secondary structure. The features quantify information relevant to the definition of the classes including conservation of residues and arrangement and number of helix/strand segments. Our comprehensive design considers 8 feature selection methods and 4 classifiers to develop Support Vector Machine-based classifiers that are tailored for each of the seven classes. Tests on 5 twilight-zone and 1 high-similarity benchmark datasets and comparison with over two dozen modern competing predictors show that MODAS provides the best overall accuracy that ranges between 80% and 96.7% (83.5% for the twilight-zone datasets), depending on the dataset. This translates into 19% and 8% error rate reduction when compared against the best performing competing method on the two largest datasets. The proposed predictor provides accurate predictions at 58% accuracy for the membrane protein class, which is not considered by the majority of existing methods, even though this class accounts for only 2% of the data. Our predictive model is analyzed to demonstrate how and why the input features are associated with the corresponding classes. 
Conclusions The improved predictions stem from the novel features that express collocation of the secondary structure segments in the protein sequence and that combine evolutionary and secondary structure information. Our work demonstrates that conservation and arrangement of the secondary structure segments predicted along the protein chain can successfully predict structural classes which are defined based on the spatial arrangement of the secondary structures. A web server is available at http://biomine.ece.ualberta.ca/MODAS/. PMID:20003388

  17. Learning and alignment methods applied to protein structure prediction.

    PubMed

    Gracy, J; Chiche, L; Sallantin, J

    1993-01-01

    Learning techniques are able to extract structural knowledge specific to a selected set of proteins. We describe two algorithms that optimize scores expressing the propensity of a polypeptide sequence to adopt a local fold. The first algorithm generates secondary structure prediction rules based on a dictionary of geometrical patterns frequently found in the learning database. The second algorithm leads to scores that indicate the fit between an amino acid and a given local structural environment. Dynamic programming is then used to align structural information profiles by modifying the local mutation cost with the above learned functions. The main features of the system are exemplified on the structural prediction of the N-terminal domain of the CD4 antigen. Then the usefulness of additional 3-D information in the alignment is benchmarked on eight pairs of weakly homologous proteins. PMID:8347722

  18. Contingency Table Browser − prediction of early stage protein structure

    PubMed Central

    Kalinowska, Barbara; Krzykalski, Artur; Roterman, Irena

    2015-01-01

    The Early Stage (ES) intermediate represents the starting structure in protein folding simulations based on the Fuzzy Oil Drop (FOD) model. The accuracy of FOD predictions is greatly dependent on the accuracy of the chosen intermediate. A suitable intermediate can be constructed using the sequence-structure relationship information contained in the so-called contingency table, which expresses the likelihood of encountering various structural motifs for each tetrapeptide fragment in the amino acid sequence. The limited accuracy with which such structures could previously be predicted provided the motivation for a more in-depth study of the contingency table itself. The Contingency Table Browser is a tool which can visualize, search and analyze the table. Our work presents possible applications of the Contingency Table Browser, among them the analysis of specific protein sequences from the point of view of their structural ambiguity. PMID:26664034

  19. A life prediction model for laminated composite structural components

    NASA Technical Reports Server (NTRS)

    Allen, David H.

    1990-01-01

    A life prediction methodology for laminated continuous fiber composites subjected to fatigue loading conditions was developed. A summary of the completed research is presented. A phenomenological damage evolution law was formulated for matrix cracking which is independent of stacking sequence. Mechanistic and physical support was developed for this phenomenological evolution law. The damage evolution law was implemented in a finite element computer program, and preliminary predictions were obtained for a structural component undergoing damage induced by fatigue loading.

  20. Predicting crystal structures ab initio: group 14 nitrides and phosphides.

    PubMed

    Hart, Judy N; Allan, Neil L; Claeyssens, Frederik

    2010-08-14

    Crystal structures are predicted for a range of group 14 nitrides and phosphides with 1 : 1 stoichiometry, following our method of starting from the known structures for a range of binary compounds and looking for trends in the preferred local bonding environments in the optimised structures. We have previously applied this method to predict the structures of carbon nitride and phosphorus carbide. Here, we use a similar approach to predict the structures of silicon and germanium nitrides and phosphides with 1 : 1 stoichiometry. We find that the local bonding environments in the preferred structures for the nitrides are the same as those for the 3 : 4 stoichiometry. For the phosphides, we have found several possible structures with similar energies. Structures containing hypervalent phosphorus must be considered as these are often low in energy, particularly for GeP; these have not been included in previous work. The greater tendency to form hypervalent phosphorus in GeP than SiP can be rationalised by considering the bond enthalpies for the two compositions. PMID:20603659

  1. Confidence-Guided Local Structure Prediction with HHfrag

    PubMed Central

    Kalev, Ivan; Habeck, Michael

    2013-01-01

    We present a method to assess the reliability of local structure prediction from sequence. We introduce a greedy algorithm for filtering and enrichment of dynamic fragment libraries, compiled with remote-homology detection methods such as HHfrag. After filtering false hits at each target position, we reduce the fragment library to a minimal set of representative fragments, which are guaranteed to have correct local structure in regions of detectable conservation. We demonstrate that the location of conserved motifs in a protein sequence can be predicted by examining the recurrence and structural homogeneity of detected fragments. The resulting confidence score correlates with the local RMSD of the representative fragments and allows us to predict torsion angles from sequence with better accuracy compared to existing machine learning methods. PMID:24146881

  2. Cloud Prediction of Protein Structure and Function with PredictProtein for Debian

    PubMed Central

    Kaján, László; Yachdav, Guy; Vicedo, Esmeralda; Steinegger, Martin; Mirdita, Milot; Angermüller, Christof; Böhm, Ariane; Domke, Simon; Ertl, Julia; Mertes, Christian; Reisinger, Eva; Rost, Burkhard

    2013-01-01

    We report the release of PredictProtein for the Debian operating system and derivatives, such as Ubuntu, Bio-Linux, and Cloud BioLinux. The PredictProtein suite is available as a standard set of open source Debian packages. The release covers the most popular prediction methods from the Rost Lab, including methods for the prediction of secondary structure and solvent accessibility (profphd), nuclear localization signals (predictnls), and intrinsically disordered regions (norsnet). We also present two case studies that successfully utilize PredictProtein packages for high performance computing in the cloud: the first analyzes protein disorder for whole organisms, and the second analyzes the effect of all possible single sequence variants in protein coding regions of the human genome. PMID:23971032

  3. Cloud prediction of protein structure and function with PredictProtein for Debian.

    PubMed

    Kaján, László; Yachdav, Guy; Vicedo, Esmeralda; Steinegger, Martin; Mirdita, Milot; Angermüller, Christof; Böhm, Ariane; Domke, Simon; Ertl, Julia; Mertes, Christian; Reisinger, Eva; Staniewski, Cedric; Rost, Burkhard

    2013-01-01

    We report the release of PredictProtein for the Debian operating system and derivatives, such as Ubuntu, Bio-Linux, and Cloud BioLinux. The PredictProtein suite is available as a standard set of open source Debian packages. The release covers the most popular prediction methods from the Rost Lab, including methods for the prediction of secondary structure and solvent accessibility (profphd), nuclear localization signals (predictnls), and intrinsically disordered regions (norsnet). We also present two case studies that successfully utilize PredictProtein packages for high performance computing in the cloud: the first analyzes protein disorder for whole organisms, and the second analyzes the effect of all possible single sequence variants in protein coding regions of the human genome. PMID:23971032

  4. PredictProtein—an open resource for online prediction of protein structural and functional features

    PubMed Central

    Yachdav, Guy; Kloppmann, Edda; Kajan, Laszlo; Hecht, Maximilian; Goldberg, Tatyana; Hamp, Tobias; Hönigschmid, Peter; Schafferhans, Andrea; Roos, Manfred; Bernhofer, Michael; Richter, Lothar; Ashkenazy, Haim; Punta, Marco; Schlessinger, Avner; Bromberg, Yana; Schneider, Reinhard; Vriend, Gerrit; Sander, Chris; Ben-Tal, Nir; Rost, Burkhard

    2014-01-01

    PredictProtein is a meta-service for sequence analysis that has been predicting structural and functional features of proteins since 1992. Queried with a protein sequence it returns: multiple sequence alignments, predicted aspects of structure (secondary structure, solvent accessibility, transmembrane helices (TMSEG) and strands, coiled-coil regions, disulfide bonds and disordered regions) and function. The service incorporates analysis methods for the identification of functional regions (ConSurf), homology-based inference of Gene Ontology terms (metastudent), comprehensive subcellular localization prediction (LocTree3), protein–protein binding sites (ISIS2), protein–polynucleotide binding sites (SomeNA) and predictions of the effect of point mutations (non-synonymous SNPs) on protein function (SNAP2). Our goal has always been to develop a system optimized to meet the demands of experimentalists not highly experienced in bioinformatics. To this end, the PredictProtein results are presented as both text and a series of intuitive, interactive and visually appealing figures. The web server and sources are available at http://ppopen.rostlab.org. PMID:24799431

  5. Servers for sequence–structure relationship analysis and prediction

    PubMed Central

    Dosztányi, Zsuzsanna; Magyar, Csaba; Tusnády, Gábor E.; Cserző, Miklós; Fiser, András; Simon, István

    2003-01-01

    We describe several algorithms and public servers that were developed to analyze and predict various features of protein structures. These servers provide information about the covalent state of cysteine (CYSREDOX), as well as about residues involved in non-covalent cross links that play an important role in the structural stability of proteins (SCIDE and SCPRED). We also discuss methods and servers developed to identify helical transmembrane proteins from large databases and rough genomic data, including two of the most popular transmembrane prediction methods, DAS and HMMTOP. Several biologically interesting applications of these servers are also presented. The servers are available through http://www.enzim.hu/servers.html. PMID:12824327

  6. Servers for sequence-structure relationship analysis and prediction.

    PubMed

    Dosztányi, Zsuzsanna; Magyar, Csaba; Tusnády, Gábor E; Cserzo, Miklós; Fiser, András; Simon, István

    2003-07-01

    We describe several algorithms and public servers that were developed to analyze and predict various features of protein structures. These servers provide information about the covalent state of cysteine (CYSREDOX), as well as about residues involved in non-covalent cross links that play an important role in the structural stability of proteins (SCIDE and SCPRED). We also discuss methods and servers developed to identify helical transmembrane proteins from large databases and rough genomic data, including two of the most popular transmembrane prediction methods, DAS and HMMTOP. Several biologically interesting applications of these servers are also presented. The servers are available through http://www.enzim.hu/servers.html. PMID:12824327

  7. Adaptive modelling of structured molecular representations for toxicity prediction

    NASA Astrophysics Data System (ADS)

    Bertinetto, Carlo; Duce, Celia; Micheli, Alessio; Solaro, Roberto; Tiné, Maria Rosaria

    2012-12-01

    We investigated the possibility of modelling structure-toxicity relationships by direct treatment of the molecular structure (without using descriptors) through an adaptive model able to retain the appropriate structural information. With respect to traditional descriptor-based approaches, this provides a more general and flexible way to tackle prediction problems that is particularly suitable when little or no background knowledge is available. Our method employs a tree-structured molecular representation, which is processed by a recursive neural network (RNN). To explore the realization of RNN modelling in toxicological problems, we employed a data set containing growth impairment concentrations (IGC50) for Tetrahymena pyriformis.

  8. PCI-SS: MISO dynamic nonlinear protein secondary structure prediction

    PubMed Central

    Green, James R; Korenberg, Michael J; Aboul-Magd, Mohammed O

    2009-01-01

    Background Since the function of a protein is largely dictated by its three dimensional configuration, determining a protein's structure is of fundamental importance to biology. Here we report on a novel approach to determining the one dimensional secondary structure of proteins (distinguishing α-helices, β-strands, and non-regular structures) from primary sequence data which makes use of Parallel Cascade Identification (PCI), a powerful technique from the field of nonlinear system identification. Results Using PSI-BLAST divergent evolutionary profiles as input data, dynamic nonlinear systems are built through a black-box approach to model the process of protein folding. Genetic algorithms (GAs) are applied in order to optimize the architectural parameters of the PCI models. The three-state prediction problem is broken down into a combination of three binary sub-problems and protein structure classifiers are built using 2 layers of PCI classifiers. Careful construction of the optimization, training, and test datasets ensures that no homology exists between any training and testing data. A detailed comparison between PCI and 9 contemporary methods is provided over a set of 125 new protein chains guaranteed to be dissimilar to all training data. Unlike other secondary structure prediction methods, here a web service is developed to provide both human- and machine-readable interfaces to PCI-based protein secondary structure prediction. This server, called PCI-SS, is available at . In addition to a dynamic PHP-generated web interface for humans, a Simple Object Access Protocol (SOAP) interface is added to permit invocation of the PCI-SS service remotely. This machine-readable interface facilitates incorporation of PCI-SS into multi-faceted systems biology analysis pipelines requiring protein secondary structure information, and greatly simplifies high-throughput analyses. 
XML is used to represent the input protein sequence data and also to encode the resulting structure prediction in a machine-readable format. To our knowledge, this represents the only publicly available SOAP-interface for a protein secondary structure prediction service with published WSDL interface definition. Conclusion Relative to the 9 contemporary methods included in the comparison cascaded PCI classifiers perform well, however PCI finds greatest application as a consensus classifier. When PCI is used to combine a sequence-to-structure PCI-based classifier with the current leading ANN-based method, PSIPRED, the overall error rate (Q3) is maintained while the rate of occurrence of a particularly detrimental error is reduced by up to 25%. This improvement in BAD score, combined with the machine-readable SOAP web service interface makes PCI-SS particularly useful for inclusion in a tertiary structure prediction pipeline. PMID:19615046

  9. Sizing Structures and Predicting Weight of a Spacecraft

    NASA Technical Reports Server (NTRS)

    Cerro, Jeffrey; Shore, C. P.

    2006-01-01

    EZDESIT is a computer program for choosing the sizes of structural components and predicting the weight of a spacecraft, aircraft, or other vehicle. In designing a vehicle, EZDESIT is used in conjunction with a finite-element structural-analysis program: each structural component is sized within EZDESIT to withstand the loads expected to be encountered during operation, then the weights of all the structural finite elements are added to obtain the structural weight of the vehicle. The sizing of the structural components also alters the stiffness properties of the finite-element model. The finite-element analysis and structural component sizing are iterated until the weight of the vehicle converges to within a prescribed iterative difference.
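
    The analyse-size-reweigh loop described in this record can be sketched with a toy one-member model. This is not EZDESIT and does not call a finite-element program: a single linear load rule stands in for the analysis step, and every name, parameter and number below is a hypothetical chosen for illustration.

```python
def size_until_converged(payload_weight, load_factor, allowable_stress,
                         weight_density, length, tol=1e-6, max_iter=100):
    """Iterate sizing and weight estimation until the weight change drops below tol.

    Toy model (assumption): one axial member carries load_factor * vehicle weight.
    Returns (converged_weight, member_area, iterations_used).
    """
    weight = payload_weight  # initial guess: structure weighs nothing
    for iteration in range(1, max_iter + 1):
        load = load_factor * weight           # stand-in for the finite-element analysis
        area = load / allowable_stress        # size the member to carry that load
        new_weight = payload_weight + weight_density * area * length
        if abs(new_weight - weight) < tol:    # prescribed iterative difference
            return new_weight, area, iteration
        weight = new_weight
    raise RuntimeError("sizing loop did not converge")


# Hypothetical inputs in consistent units (N, m, Pa, N/m^3).
weight, area, n_iter = size_until_converged(
    payload_weight=1000.0, load_factor=3.0, allowable_stress=100e6,
    weight_density=27000.0, length=100.0)
```

    For this linear toy model the loop is a fixed-point iteration with the closed-form limit payload_weight / (1 - k), where k = load_factor * weight_density * length / allowable_stress, which makes the converged value easy to check.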

  10. Prediction of protein folding rates from simplified secondary structure alphabet.

    PubMed

    Huang, Jitao T; Wang, Titi; Huang, Shanran R; Li, Xin

    2015-10-21

    Protein folding is a very complicated and highly cooperative dynamic process. However, the folding kinetics is likely to depend more on a few key structural features. Here we find that secondary structures can determine folding rates only of large, multi-state folding proteins and fail to predict those of small, two-state proteins. The importance of secondary structures for protein folding is ordered as: extended β strand > α helix > bend > turn > undefined secondary structure > 3₁₀ helix > isolated β strand > π helix. Only the first three secondary structures, extended β strand, α helix and bend, can achieve a good correlation with folding rates. This suggests that the rate-limiting step of protein folding would depend upon the formation of regular secondary structures and the buckling of the chain. The reduced secondary structure alphabet provides a simplified description for machine learning applications in protein design. PMID:26247139

  11. Distance matrix-based approach to protein structure prediction.

    PubMed

    Kloczkowski, Andrzej; Jernigan, Robert L; Wu, Zhijun; Song, Guang; Yang, Lei; Kolinski, Andrzej; Pokarowski, Piotr

    2009-03-01

    Much structural information is encoded in the internal distances; a distance matrix-based approach can be used to predict protein structure and dynamics, and for structural refinement. Our approach is based on the square distance matrix $D = [r_{ij}^2]$ containing all square distances between residues in proteins. This distance matrix contains more information than the contact matrix C, that has elements of either 0 or 1 depending on whether the distance $r_{ij}$ is greater or less than a cutoff value $r_{\mathrm{cutoff}}$. We have performed spectral decomposition of the distance matrices, $D = \sum_k \lambda_k v_k v_k^T$, in terms of eigenvalues $\lambda_k$ and the corresponding eigenvectors $v_k$, and found that it contains at most five nonzero terms. A dominant eigenvector is proportional to $r^2$, the square distance of points from the center of mass, with the next three being the principal components of the system of points. By predicting $r^2$ from the sequence we can approximate a distance matrix of a protein with an expected RMSD value of about 7.3 Å, and by combining it with the prediction of the first principal component we can improve this approximation to 4.0 Å. We can also explain the role of hydrophobic interactions for the protein structure, because r is highly correlated with the hydrophobic profile of the sequence. Moreover, r is highly correlated with several sequence profiles which are useful in protein structure prediction, such as contact number, the residue-wise contact order (RWCO) or mean square fluctuations (i.e. crystallographic temperature factors). We have also shown that the next three components are related to spatial directionality of the secondary structure elements, and they may be also predicted from the sequence, improving overall structure prediction. 
We have also shown that the large number of available HIV-1 protease structures provides a remarkable sampling of conformations, which can be viewed as direct structural information about the dynamics. After structure matching, we apply principal component analysis (PCA) to obtain the important apparent motions for both bound and unbound structures. There are significant similarities between the first few key motions and the first few low-frequency normal modes calculated from a static representative structure with an elastic network model (ENM) that is based on the contact matrix C (related to D), strongly suggesting that the variations among the observed structures and the corresponding conformational changes are facilitated by the low-frequency, global motions intrinsic to the structure. Similarities are also found when the approach is applied to an NMR ensemble, as well as to atomic molecular dynamics (MD) trajectories. Thus, a sufficiently large number of experimental structures can directly provide important information about protein dynamics, but ENM can also provide a similar sampling of conformations. Finally, we use distance constraints from databases of known protein structures for structure refinement. We use the distributions of distances of various types in known protein structures to obtain the most probable ranges or the mean-force potentials for the distances. We then impose these constraints on structures to be refined or include the mean-force potentials directly in the energy minimization so that more plausible structural models can be built. This approach has been successfully used by us in 2006 in the CASPR structure refinement (http://predictioncenter.org/caspR). PMID:19224393
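
    The statement in this record that the spectral decomposition of the square distance matrix has at most five nonzero terms can be checked numerically. The sketch below uses random three-dimensional points as a stand-in for residue coordinates (an assumption; no protein data are used), and the variable names and the relative threshold for "nonzero" are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
points = rng.normal(size=(50, 3))   # hypothetical stand-in for 50 residue positions

# Square distance matrix D, whose element (i, j) is the squared distance r_ij^2.
diff = points[:, None, :] - points[None, :, :]
D = np.sum(diff ** 2, axis=-1)

# D is real and symmetric, so eigh returns real eigenvalues and orthonormal eigenvectors.
eigvals, eigvecs = np.linalg.eigh(D)

# Count eigenvalues that are nonzero relative to the largest one in magnitude.
scale = np.max(np.abs(eigvals))
keep = np.abs(eigvals) > 1e-9 * scale
nonzero = int(np.sum(keep))

# Rebuild D from the retained terms only: sum over k of lambda_k * v_k * v_k^T.
D_rebuilt = (eigvecs[:, keep] * eigvals[keep]) @ eigvecs[:, keep].T
```

    The count is five because D can be written as the sum of a rank-one term built from the squared norms, its transpose, and minus twice the Gram matrix of the coordinates, which has rank three for points in three dimensions.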

  12. Improving the accuracy of protein secondary structure prediction using structural alignment

    PubMed Central

    Montgomerie, Scott; Sundararaj, Shan; Gallin, Warren J; Wishart, David S

    2006-01-01

    Background The accuracy of protein secondary structure prediction has steadily improved over the past 30 years. Now many secondary structure prediction methods routinely achieve an accuracy (Q3) of about 75%. We believe this accuracy could be further improved by including structure (as opposed to sequence) database comparisons as part of the prediction process. Indeed, given the large size of the Protein Data Bank (>35,000 sequences), the probability of a newly identified sequence having a structural homologue is actually quite high. Results We have developed a method that performs structure-based sequence alignments as part of the secondary structure prediction process. By mapping the structure of a known homologue (sequence ID >25%) onto the query protein's sequence, it is possible to predict at least a portion of that query protein's secondary structure. By integrating this structural alignment approach with conventional (sequence-based) secondary structure methods and then combining it with a "jury-of-experts" system to generate a consensus result, it is possible to attain very high prediction accuracy. Using a sequence-unique test set of 1644 proteins from EVA, this new method achieves an average Q3 score of 81.3%. Extensive testing indicates this is approximately 4–5% better than any other method currently available. Assessments using non sequence-unique test sets (typical of those used in proteome annotation or structural genomics) indicate that this new method can achieve a Q3 score approaching 88%. Conclusion By using both sequence and structure databases and by exploiting the latest techniques in machine learning it is possible to routinely predict protein secondary structure with an accuracy well above 80%. A program and web server, called PROTEUS, that performs these secondary structure predictions is accessible at . For high throughput or batch sequence analyses, the PROTEUS programs, databases (and server) can be downloaded and run locally. 
PMID:16774686
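
    The Q3 score quoted in this record is the percentage of residues assigned the correct one of three states (helix H, strand E, coil C). The sketch below computes it and combines several predictions by a per-residue majority vote. The vote is a deliberately simple substitute for the "jury-of-experts" system of the abstract, whose actual combiner is not described here; the function names and the toy strings are assumptions.

```python
from collections import Counter


def q3(predicted, observed):
    """Percentage of residues whose three-state label (H, E, C) is predicted correctly."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


def majority_vote(predictions):
    """Per-residue majority vote; ties are resolved in favour of the earliest predictor."""
    consensus = []
    for column in zip(*predictions):
        counts = Counter(column)
        best = max(counts.values())
        # First label in predictor order that reaches the top count.
        consensus.append(next(state for state in column if counts[state] == best))
    return "".join(consensus)


# Toy data (hypothetical): one observed assignment and three imperfect predictions.
observed = "HHHHCCEEEE"
predictions = ["HHHHCCEEEC", "HHHCCCEEEE", "HHHHHCEEEE"]

consensus = majority_vote(predictions)
individual_scores = [q3(p, observed) for p in predictions]
consensus_score = q3(consensus, observed)
```

    In this toy case each predictor makes one error at a different position, so the vote recovers the observed assignment; real predictors make correlated errors and the gain is smaller.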

  13. Improved Chou-Fasman method for protein secondary structure prediction

    PubMed Central

    Chen, Hang; Gu, Fei; Huang, Zhengge

    2006-01-01

    Background Protein secondary structure prediction is a fundamental and important component in the analytical study of protein structure and functions. The prediction technique has been developed for several decades. The Chou-Fasman algorithm, one of the earliest methods, has been successfully applied to the prediction. However, this method has its limitations due to low accuracy, unreliable parameters, and over prediction. Thanks to recent developments in protein folding type-specific structure propensities and wavelet transformation, the shortcomings of the Chou-Fasman method can be overcome. Results We improved the Chou-Fasman method in three aspects. (a) Replace the nucleation regions with extreme values of coefficients calculated by the continuous wavelet transform. (b) Substitute the original secondary structure conformational parameters with folding type-specific secondary structure propensities. (c) Modify the Chou-Fasman rules. The improved Chou-Fasman method was tested on the CB396 data set, and three indices, Q3, Qpre and SOV, were used to measure its performance. We compared the indices with those obtained from the original Chou-Fasman method and four other popular methods. The results showed that our improved Chou-Fasman method performs better than the original one in all indices, with about 10–18% improvement. It is also comparable to other currently popular methods considering all the indices. Conclusion Our method has greatly improved the Chou-Fasman method. It is able to predict protein secondary structure as well as current popular methods. By locating nucleation regions with refined wavelet transform technology and by calculating propensity factors with a larger data set, it is likely to get a better result. PMID:17217506

  14. Crystal structure prediction: a novel approach based on minima hopping

    NASA Astrophysics Data System (ADS)

    Amsler, Maximilian; Goedecker, Stefan

    2012-02-01

    With increasing computational resources the prediction of crystal structures from first-principles calculations has become feasible, but still remains a demanding task. A reliable method to perform an efficient, systematic search for the ground state structure based solely on the system's composition is essential. Motivated by the promising results of the minima hopping method obtained on isolated systems, we have generalized the algorithm for crystal structure prediction. Optimized moves in the configurational space spanned by both atomic coordinates and simulation cell variables are performed to escape from local enthalpy minima, and revisiting known minima is avoided, thus allowing a fast exploration of the enthalpy surface. The predictive power of the novel method has been shown in several applications, of which the following will be presented. Superconducting phases in hydrogen rich materials were investigated, leading to the discovery of novel ground state structures. For the longstanding question of the crystal structure of cold compressed graphite, a new candidate phase was identified that perfectly matches experimental results. Finally, new low energy structures for materials with possible applications in hydrogen storage are presented.

  15. (PS)²: protein structure prediction server version 3.0.

    PubMed

    Huang, Tsun-Tsao; Hwang, Jenn-Kang; Chen, Chu-Huang; Chu, Chih-Sheng; Lee, Chi-Wen; Chen, Chih-Chieh

    2015-07-01

    Protein complexes are involved in many biological processes. Examining coupling between subunits of a complex would be useful to understand the molecular basis of protein function. Here, our updated (PS)² web server predicts the three-dimensional structures of protein complexes based on comparative modeling; furthermore, this server examines the coupling between subunits of the predicted complex by combining structural and evolutionary considerations. The predicted complex structure could be indicated and visualized by Java-based 3D graphics viewers and the structural and evolutionary profiles are shown and compared chain-by-chain. For each subunit, considerations with or without the packing contribution of other subunits cause the differences in similarities between structural and evolutionary profiles, and these differences imply which form, complex or monomeric, is preferred in the biological condition for the subunit. We believe that the (PS)² server would be a useful tool for biologists who are interested not only in the structures of protein complexes but also in the coupling between subunits of the complexes. The (PS)² server is freely available at http://ps2v3.life.nctu.edu.tw/. PMID:25943546

  16. A statistical sampling algorithm for RNA secondary structure prediction.

    PubMed

    Ding, Ye; Lawrence, Charles E

    2003-12-15

    An RNA molecule, particularly a long-chain mRNA, may exist as a population of structures. Furthermore, multiple structures have been demonstrated to play important functional roles. Thus, a representation of the ensemble of probable structures is of interest. We present a statistical algorithm to sample rigorously and exactly from the Boltzmann ensemble of secondary structures. The forward step of the algorithm computes the equilibrium partition functions of RNA secondary structures with recent thermodynamic parameters. Using conditional probabilities computed with the partition functions in a recursive sampling process, the backward step of the algorithm quickly generates a statistically representative sample of structures. With cubic run time for the forward step, quadratic run time in the worst case for the sampling step, and quadratic storage, the algorithm is efficient for broad applicability. We demonstrate that, by classifying sampled structures, the algorithm enables a statistical delineation and representation of the Boltzmann ensemble. Applications of the algorithm show that alternative biological structures are revealed through sampling. Statistical sampling provides a means to estimate the probability of any structural motif, with or without constraints. For example, the algorithm enables probability profiling of single-stranded regions in RNA secondary structure. Probability profiling for specific loop types is also illustrated. By overlaying probability profiles, a mutual accessibility plot can be displayed for predicting RNA:RNA interactions. Boltzmann probability-weighted density of states and free energy distributions of sampled structures can be readily computed. We show that a sample of moderate size from the ensemble of an enormous number of possible structures is sufficient to guarantee statistical reproducibility in the estimates of typical sampling statistics. 
Our applications suggest that the sampling algorithm may be well suited to prediction of mRNA structure and target accessibility. The algorithm is applicable to the rational design of small interfering RNAs (siRNAs), antisense oligonucleotides, and trans-cleaving ribozymes in gene knock-down studies. PMID:14654704
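
    The forward (partition function) and backward (sampling) steps described in this abstract can be illustrated with a deliberately simplified sketch. It is not the authors' implementation: instead of the thermodynamic parameters the abstract refers to, it assumes a toy energy model in which every base pair contributes the same Boltzmann weight. The names `partition_table`, `sample_structure`, `pair_weight` and `min_loop` are hypothetical.

```python
import random

# Toy model (assumption): canonical and GU wobble pairs, all with equal weight.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def _q(q, i, j):
    """Partition function of seq[i..j]; an empty interval has weight 1."""
    return 1.0 if i > j else q[i][j]


def partition_table(seq, pair_weight=2.0, min_loop=3):
    """Forward step: fill q[i][j] for all subsequences in cubic time."""
    n = len(seq)
    q = [[1.0] * n for _ in range(n)]
    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            total = _q(q, i + 1, j)  # case 1: base i is unpaired
            for k in range(i + min_loop + 1, j + 1):  # case 2: i pairs with k
                if (seq[i], seq[k]) in PAIRS:
                    total += pair_weight * _q(q, i + 1, k - 1) * _q(q, k + 1, j)
            q[i][j] = total
    return q


def sample_structure(seq, q, rng, pair_weight=2.0, min_loop=3):
    """Backward step: draw one dot-bracket structure with Boltzmann probability."""
    structure = ["."] * len(seq)
    stack = [(0, len(seq) - 1)]
    while stack:
        i, j = stack.pop()
        if i >= j:
            continue
        r = rng.random() * q[i][j] - _q(q, i + 1, j)
        if r < 0:  # base i stays unpaired
            stack.append((i + 1, j))
            continue
        for k in range(i + min_loop + 1, j + 1):
            if (seq[i], seq[k]) in PAIRS:
                r -= pair_weight * _q(q, i + 1, k - 1) * _q(q, k + 1, j)
                if r < 0:
                    structure[i], structure[k] = "(", ")"
                    stack.append((i + 1, k - 1))
                    stack.append((k + 1, j))
                    break
        else:
            stack.append((i + 1, j))  # round-off fallback: leave i unpaired
    return "".join(structure)


if __name__ == "__main__":
    seq = "GGGAAACCC"  # hypothetical example sequence
    table = partition_table(seq)
    rng = random.Random(0)
    for _ in range(5):
        print(sample_structure(seq, table, rng))
```

    Counting how often a motif appears in a large sample of such structures gives an estimate of its probability, which is the idea behind the probability profiles mentioned in the abstract.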

  17. Prediction of reactive hazards based on molecular structure.

    PubMed

    Saraf, S R; Rogers, W J; Mannan, M S

    2003-03-17

    There is considerable interest in prediction of reactive hazards based on chemical structure. Calorimetric measurements to determine reactivity can be resource consuming, so computational methods to predict reactivity hazards present an attractive option. This paper reviews some of the commonly employed theoretical hazard evaluation techniques, including the oxygen-balance method, ASTM CHETAH, and calculated adiabatic reaction temperature (CART). It also discusses the development of a study table to correlate and predict calorimetric properties of pure compounds. Quantitative structure-property relationships (QSPR) based on quantum mechanical calculations can be employed to correlate calorimetrically measured onset temperatures, T(o), and energies of reaction, -ΔH, with molecular properties. To test the feasibility of this approach, the QSPR technique is used to correlate differential scanning calorimeter (DSC) data, T(o) and -ΔH, with molecular properties for 19 nitro compounds. PMID:12628775
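
    The oxygen-balance method named in this abstract can be sketched as follows. The abstract does not state the formula; the sketch assumes the commonly cited definition for a compound C_x H_y N_w O_z, OB% = -1600 (2x + y/2 - z) / MW, which measures the oxygen deficit or surplus relative to complete oxidation to CO2 and H2O. The function name and the example compounds are illustrative choices, not taken from the paper.

```python
# Standard atomic masses in g/mol for the elements of CHNO compounds.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}


def oxygen_balance(c, h, n, o):
    """Oxygen balance in percent for the formula C_c H_h N_n O_o.

    Negative values mean the molecule lacks oxygen for complete oxidation
    to CO2 and H2O; positive values mean it carries excess oxygen.
    """
    molar_mass = (
        c * ATOMIC_MASS["C"]
        + h * ATOMIC_MASS["H"]
        + n * ATOMIC_MASS["N"]
        + o * ATOMIC_MASS["O"]
    )
    return -1600.0 * (2 * c + h / 2 - o) / molar_mass


if __name__ == "__main__":
    # 2,4,6-trinitrotoluene, C7H5N3O6 (illustrative nitro compound)
    print(f"TNT: {oxygen_balance(7, 5, 3, 6):.1f} %")
    # nitroglycerin, C3H5N3O9 (illustrative oxygen-rich compound)
    print(f"Nitroglycerin: {oxygen_balance(3, 5, 3, 9):.1f} %")
```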

  18. Structural Damage Prediction and Analysis for Hypervelocity Impact: Consulting

    NASA Technical Reports Server (NTRS)

    1995-01-01

    A portion of the contract NAS8-38856, 'Structural Damage Prediction and Analysis for Hypervelocity Impacts,' from NASA Marshall Space Flight Center (MSFC), included consulting which was to be documented in the final report. This attachment to the final report contains memos produced as part of that consulting.

  19. Process for predicting structural performance of mechanical systems

    DOEpatents

    Gardner, David R.; Hendrickson, Bruce A.; Plimpton, Steven J.; Attaway, Stephen W.; Heinstein, Martin W.; Vaughan, Courtenay T.

    1998-01-01

    A process for predicting the structural performance of a mechanical system represents the mechanical system by a plurality of surface elements. The surface elements are grouped according to their location in the volume occupied by the mechanical system so that contacts between surface elements can be efficiently located. The process is well suited for efficient practice on multiprocessor computers.
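
    The idea of grouping surface elements by location so that contacts can be found efficiently can be illustrated with a uniform-grid spatial hash. This is a generic sketch, not the patented process: it assumes each surface element is represented by a single centroid, and the names `candidate_contacts` and `cell_size` are hypothetical.

```python
from collections import defaultdict
from itertools import product


def candidate_contacts(centroids, cell_size):
    """Return index pairs (i, j), i < j, whose centroids fall in the same
    or in adjacent grid cells; only these pairs need an exact contact test."""
    grid = defaultdict(list)
    for idx, (x, y, z) in enumerate(centroids):
        key = (int(x // cell_size), int(y // cell_size), int(z // cell_size))
        grid[key].append(idx)

    pairs = set()
    for (cx, cy, cz), members in grid.items():
        # Visit the cell itself and its 26 neighbours.
        for dx, dy, dz in product((-1, 0, 1), repeat=3):
            neighbours = grid.get((cx + dx, cy + dy, cz + dz), ())
            for i in members:
                for j in neighbours:
                    if i < j:
                        pairs.add((i, j))
    return pairs


if __name__ == "__main__":
    # Hypothetical element centroids: two close together, one far away.
    points = [(0.1, 0.1, 0.1), (0.9, 0.2, 0.1), (5.0, 5.0, 5.0)]
    print(candidate_contacts(points, cell_size=1.0))
```

    Because each grid cell can be processed independently of distant cells, a decomposition of this kind also lends itself to the multiprocessor setting the abstract mentions.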

  20. Process for predicting structural performance of mechanical systems

    DOEpatents

    Gardner, D.R.; Hendrickson, B.A.; Plimpton, S.J.; Attaway, S.W.; Heinstein, M.W.; Vaughan, C.T.

    1998-05-19

    A process for predicting the structural performance of a mechanical system represents the mechanical system by a plurality of surface elements. The surface elements are grouped according to their location in the volume occupied by the mechanical system so that contacts between surface elements can be efficiently located. The process is well suited for efficient practice on multiprocessor computers. 12 figs.

  1. Automatic Prediction of Facial Trait Judgments: Appearance vs. Structural Models

    PubMed Central

    Rojas Q., Mario; Masip, David; Todorov, Alexander; Vitria, Jordi

    2011-01-01

    Evaluating other individuals with respect to personality characteristics plays a crucial role in human relations and it is the focus of attention for research in diverse fields such as psychology and interactive computer systems. In psychology, face perception has been recognized as a key component of this evaluation system. Multiple studies suggest that observers use face information to infer personality characteristics. Interactive computer systems are trying to take advantage of these findings and apply them to increase the natural aspect of interaction and to improve the performance of interactive computer systems. Here, we experimentally test whether the automatic prediction of facial trait judgments (e.g. dominance) can be made by using the full appearance information of the face and whether a reduced representation of its structure is sufficient. We evaluate two separate approaches: a holistic representation model using the facial appearance information and a structural model constructed from the relations among facial salient points. State of the art machine learning methods are applied to a) derive a facial trait judgment model from training data and b) predict a facial trait value for any face. Furthermore, we address the issue of whether there are specific structural relations among facial points that predict perception of facial traits. Experimental results over a set of labeled data (9 different trait evaluations) and classification rules (4 rules) suggest that a) prediction of perception of facial traits is learnable by both holistic and structural approaches; b) the most reliable prediction of facial trait judgments is obtained by certain types of holistic descriptions of the face appearance; and c) for some traits such as attractiveness and extroversion, there are relationships between specific structural features and social perceptions. PMID:21858069

  2. Assessment of structural integrity in pressure vessels predictions and verification

    SciTech Connect

    Loushin, L.L.

    1996-12-01

    Methods to assess the structural integrity of pressure vessels, piping, and storage tankage have been developed by a wide variety of sources. Of these efforts, the Materials Properties Council Program on Fitness-for-Service Evaluation Procedures for Operating Pressure Vessels, Tanks, and Piping in Refinery and Chemical Service is one of the most noteworthy. This fitness-for-service evaluation methodology is applied to real scenarios where the continued service of carbon and stainless steel pressure vessels was in question. How such assessments of structural integrity, fitness-for-service, remaining life, and failure modes would be made by an owner/user engineering specialist are described. The conclusions derived from this full-scale testing program demonstrate that technically sound and economically viable predictions are well within acceptable bounds of structural integrity. In real-life behavior, the pressure vessels tested to failure were far more resistant to catastrophic failure than was predicted.

  3. Combining Sequence and Structural Profiles for Protein Solvent Accessibility Prediction

    PubMed Central

    Bondugula, Rajkumar

    2009-01-01

    Solvent accessibility is an important structural feature for a protein. We propose a new method for solvent accessibility prediction that uses known structure and sequence information more efficiently. We first estimate the relative solvent accessibility of the query protein using fuzzy mean operator from the solvent accessibilities of known structure fragments that have similar sequences to the query protein. We then integrate the estimated solvent accessibility and the position specific scoring matrix of the query protein using a neural network. We tested our method on a large data set consisting of 3386 non-redundant proteins. The comparison with other methods shows slightly improved prediction accuracies with our method. The resulting system does not need to be re-trained when new data is available. We incorporated our method into the MUPRED system, which is available as a web server at http://digbio.missouri.edu/mupred. PMID:19642280

  4. Protein 8-class secondary structure prediction using conditional neural fields.

    PubMed

    Wang, Zhiyong; Zhao, Feng; Peng, Jian; Xu, Jinbo

    2011-10-01

    Compared with the protein 3-class secondary structure (SS) prediction, the 8-class prediction gains less attention and is also much more challenging, especially for proteins with few sequence homologs. This paper presents a new probabilistic method for 8-class SS prediction using conditional neural fields (CNFs), a recently invented probabilistic graphical model. This CNF method not only models the complex relationship between sequence features and SS, but also exploits the interdependency among SS types of adjacent residues. In addition to sequence profiles, our method also makes use of non-evolutionary information for SS prediction. Tested on the CB513 and RS126 data sets, our method achieves Q8 accuracy of 64.9 and 64.7%, respectively, which are much better than the SSpro8 web server (51.0 and 48.0%, respectively). Our method can also be used to predict other structure properties (e.g. solvent accessibility) of a protein or the SS of RNA. PMID:21805636
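
    The Q8 (and Q3) accuracy quoted in this abstract is a per-residue measure: the percentage of residues whose predicted secondary-structure label matches the reference label. A minimal sketch, assuming labels are given as equal-length strings with one character per residue; the function name and the example strings are hypothetical.

```python
def q_accuracy(predicted, observed):
    """Percentage of residues whose predicted label equals the observed one.

    The same function yields Q3 or Q8, depending on whether the label
    alphabet has three or eight secondary-structure classes.
    """
    if len(predicted) != len(observed):
        raise ValueError("label strings must have the same length")
    if not observed:
        raise ValueError("label strings must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


if __name__ == "__main__":
    # Hypothetical 3-class labels: H = helix, E = strand, C = coil.
    print(q_accuracy("HHHEECCC", "HHHEEECC"))
```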

  5. Structure-Based Predictive Models for Allosteric Hot Spots

    PubMed Central

    Demerdash, Omar N. A.; Daily, Michael D.; Mitchell, Julie C.

    2009-01-01

    In allostery, a binding event at one site in a protein modulates the behavior of a distant site. Identifying residues that relay the signal between sites remains a challenge. We have developed predictive models using support-vector machines, a widely used machine-learning method. The training data set consisted of residues classified as either hotspots or non-hotspots based on experimental characterization of point mutations from a diverse set of allosteric proteins. Each residue had an associated set of calculated features. Two sets of features were used, one consisting of dynamical, structural, network, and informatic measures, and another of structural measures defined by Daily and Gray [1]. The resulting models performed well on an independent data set consisting of hotspots and non-hotspots from five allosteric proteins. For the independent data set, our top 10 models using Feature Set 1 recalled 68–81% of known hotspots, and among total hotspot predictions, 58–67% were actual hotspots. Hence, these models have precision P = 58–67% and recall R = 68–81%. The corresponding models for Feature Set 2 had P = 55–59% and R = 81–92%. We combined the features from each set that produced models with optimal predictive performance. The top 10 models using this hybrid feature set had R = 73–81% and P = 64–71%, the best overall performance of any of the sets of models. Our methods identified hotspots in structural regions of known allosteric significance. Moreover, our predicted hotspots form a network of contiguous residues in the interior of the structures, in agreement with previous work. In conclusion, we have developed models that discriminate between known allosteric hotspots and non-hotspots with high accuracy and sensitivity. Moreover, the pattern of predicted hotspots corresponds to known functional motifs implicated in allostery, and is consistent with previous work describing sparse networks of allosterically important residues. PMID:19816556
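
    The precision and recall figures in this abstract follow the usual definitions: precision is the fraction of predicted hotspots that are actual hotspots, and recall is the fraction of actual hotspots that were predicted. A minimal sketch, assuming hotspots are identified by residue number; the function name and the residue numbers are hypothetical and not taken from the paper.

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for two collections of residue identifiers."""
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


if __name__ == "__main__":
    predicted_hotspots = {10, 25, 40, 77}  # hypothetical model output
    known_hotspots = {25, 40, 90}          # hypothetical experimental set
    p, r = precision_recall(predicted_hotspots, known_hotspots)
    print(f"P = {p:.0%}, R = {r:.0%}")
```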

  6. Predictability of the polymorphs of small organic compounds: crystal structure predictions of four benchmark blind test molecules.

    PubMed

    Chan, H C Stephen; Kendrick, John; Leusen, Frank J J

    2011-12-01

    Predicting the crystal structure of an organic molecule from first principles has been a major challenge in physical chemistry. Recently, the application of Density Functional Theory including a dispersive energy correction (the DFT(d) method) has been shown to be a reliable method for predicting experimental structures based purely on their ranking according to lattice energy. Further validation results of the application of the DFT(d) method to four organic molecules are presented here. The compounds were targets (labelled molecule II, VI, VII and XI) in previous blind tests of crystal structure prediction, and their structures proved difficult to predict. However, this study shows that the DFT(d) approach is capable of predicting the solid state structures of these small molecules. For molecule VII, the most stable (rank 1) predicted crystal structure corresponds to the experimentally observed structure. For molecule VI, the rank 1, 2 and 3 predicted structures correspond to the three experimental polymorphs, forms I, III and II, respectively. For molecules II and XI, their rank 1 predicted structures are energetically more stable than those corresponding to the experimental crystal structures, and were not found amongst the structures submitted by the participants in the blind tests. The rank 1 structure of molecule II is predicted to exist under high pressure, whilst the rank 1 structure predicted for molecule XI has the same space group and hydrogen bonding pattern as observed in the crystal of 1-amino-1-methyl-cyclopropane, which is structurally related to molecule XI. The experimental crystal structure of molecule II corresponds to the rank 4 prediction, 0.8 kJ mol(-1) above the global minimum structure, and the experimental structure of molecule XI corresponds to the rank 2 prediction, 0.4 kJ mol(-1) above the global minimum. PMID:21993855

  7. Predicting protein structures with a multiplayer online game.

    PubMed

    Cooper, Seth; Khatib, Firas; Treuille, Adrien; Barbero, Janos; Lee, Jeehyung; Beenen, Michael; Leaver-Fay, Andrew; Baker, David; Popović, Zoran; Players, Foldit

    2010-08-01

    People exert large amounts of problem-solving effort playing computer games. Simple image- and text-recognition tasks have been successfully 'crowd-sourced' through games, but it is not clear if more complex scientific problems can be solved with human-directed computing. Protein structure prediction is one such problem: locating the biologically relevant native conformation of a protein is a formidable computational challenge given the very large size of the search space. Here we describe Foldit, a multiplayer online game that engages non-scientists in solving hard prediction problems. Foldit players interact with protein structures using direct manipulation tools and user-friendly versions of algorithms from the Rosetta structure prediction methodology, while they compete and collaborate to optimize the computed energy. We show that top-ranked Foldit players excel at solving challenging structure refinement problems in which substantial backbone rearrangements are necessary to achieve the burial of hydrophobic residues. Players working collaboratively develop a rich assortment of new strategies and algorithms; unlike computational approaches, they explore not only the conformational space but also the space of possible search strategies. The integration of human visual problem-solving and strategy development capabilities with traditional computational algorithms through interactive multiplayer games is a powerful new approach to solving computationally-limited scientific problems. PMID:20686574

  8. Protein Secondary Structure Prediction Using Deep Convolutional Neural Fields

    PubMed Central

    Wang, Sheng; Peng, Jian; Ma, Jianzhu; Xu, Jinbo

    2016-01-01

    Protein secondary structure (SS) prediction is important for studying protein structure and function. When only the sequence (profile) information is used as input feature, currently the best predictors can obtain ~80% Q3 accuracy, which has not been improved in the past decade. Here we present DeepCNF (Deep Convolutional Neural Fields) for protein SS prediction. DeepCNF is a Deep Learning extension of Conditional Neural Fields (CNF), which is an integration of Conditional Random Fields (CRF) and shallow neural networks. DeepCNF can model not only complex sequence-structure relationship by a deep hierarchical architecture, but also interdependency between adjacent SS labels, so it is much more powerful than CNF. Experimental results show that DeepCNF can obtain ~84% Q3 accuracy, ~85% SOV score, and ~72% Q8 accuracy, respectively, on the CASP and CAMEO test proteins, greatly outperforming currently popular predictors. As a general framework, DeepCNF can be used to predict other protein structure properties such as contact number, disorder regions, and solvent accessibility. PMID:26752681

  9. Protein Secondary Structure Prediction Using Deep Convolutional Neural Fields

    NASA Astrophysics Data System (ADS)

    Wang, Sheng; Peng, Jian; Ma, Jianzhu; Xu, Jinbo

    2016-01-01

    Protein secondary structure (SS) prediction is important for studying protein structure and function. When only the sequence (profile) information is used as input feature, currently the best predictors can obtain ~80% Q3 accuracy, which has not been improved in the past decade. Here we present DeepCNF (Deep Convolutional Neural Fields) for protein SS prediction. DeepCNF is a Deep Learning extension of Conditional Neural Fields (CNF), which is an integration of Conditional Random Fields (CRF) and shallow neural networks. DeepCNF can model not only complex sequence-structure relationship by a deep hierarchical architecture, but also interdependency between adjacent SS labels, so it is much more powerful than CNF. Experimental results show that DeepCNF can obtain ~84% Q3 accuracy, ~85% SOV score, and ~72% Q8 accuracy, respectively, on the CASP and CAMEO test proteins, greatly outperforming currently popular predictors. As a general framework, DeepCNF can be used to predict other protein structure properties such as contact number, disorder regions, and solvent accessibility.

  10. Protein Secondary Structure Prediction Using Deep Convolutional Neural Fields.

    PubMed

    Wang, Sheng; Peng, Jian; Ma, Jianzhu; Xu, Jinbo

    2016-01-01

    Protein secondary structure (SS) prediction is important for studying protein structure and function. When only the sequence (profile) information is used as input feature, currently the best predictors can obtain ~80% Q3 accuracy, which has not been improved in the past decade. Here we present DeepCNF (Deep Convolutional Neural Fields) for protein SS prediction. DeepCNF is a Deep Learning extension of Conditional Neural Fields (CNF), which is an integration of Conditional Random Fields (CRF) and shallow neural networks. DeepCNF can model not only complex sequence-structure relationship by a deep hierarchical architecture, but also interdependency between adjacent SS labels, so it is much more powerful than CNF. Experimental results show that DeepCNF can obtain ~84% Q3 accuracy, ~85% SOV score, and ~72% Q8 accuracy, respectively, on the CASP and CAMEO test proteins, greatly outperforming currently popular predictors. As a general framework, DeepCNF can be used to predict other protein structure properties such as contact number, disorder regions, and solvent accessibility. PMID:26752681

  11. Prediction of the structure of symmetrical protein assemblies

    PubMed Central

    André, Ingemar; Bradley, Philip; Wang, Chu; Baker, David

    2007-01-01

    Biological supramolecular systems are commonly built up by the self-assembly of identical protein subunits to produce symmetrical oligomers with cyclical, icosahedral, or helical symmetry that play roles in processes ranging from allosteric control and molecular transport to motor action. The large size of these systems often makes them difficult to structurally characterize using experimental techniques. We have developed a computational protocol to predict the structure of symmetrical protein assemblies based on the structure of a single subunit. The method carries out simultaneous optimization of backbone, side chain, and rigid-body degrees of freedom, while restricting the search space to symmetrical conformations. Using this protocol, we can reconstruct, starting from the structure of a single subunit, the structure of cyclic oligomers and the icosahedral virus capsid of satellite panicum virus using a rigid backbone approximation. We predict the oligomeric state of EscJ from the type III secretion system both in its proposed cyclical and crystallized helical form. Finally, we show that the method can recapitulate the structure of an amyloid-like fibril formed by the peptide NNQQNY from the yeast prion protein Sup35 starting from the amino acid sequence alone and searching the complete space of backbone, side chain, and rigid-body degrees of freedom. PMID:17978193

  12. Predicting oxygen uptake and VOC emissions at enclosed drop structures

    SciTech Connect

    Rahme, Z.G.; Zytner, R.G.; Madani-Isfahani, M.; Corsi, R.L.

    1997-01-01

    Drop structures used during wastewater collection and treatment are sources for volatile organic compound (VOC) emissions. To assist in the reduction of such emissions, pilot-scale experiments were completed using municipal wastewater to study the effects of drop height, liquid flow rate, and tailwater depth on oxygen transfer, and to evaluate the effects of the same parameters on the stripping of 10 VOC tracers. Results were used to develop predictive models for oxygen and VOC transfer. Oxygen uptake at the pilot drop structure suggests that the drop height is the most important parameter influencing oxygen uptake at enclosed drop structures. Tailwater depth had little effect on oxygen transfer at the drop structure. Stripping of VOCs at drop structures was seen to be a strong function of Henry's law coefficient. This sensitivity was related to gas-phase resistance in mass-transfer and/or VOC accumulation in the air bubbles. Incorporating gas-phase resistance and an appropriate α factor for wastewater into the model allowed the prediction of VOC deficit ratios and estimation of VOC stripping at drop structures for both clean water and wastewater.

  13. Constraint Logic Programming approach to protein structure prediction

    PubMed Central

    Dal Palù, Alessandro; Dovier, Agostino; Fogolari, Federico

    2004-01-01

    Background: The protein structure prediction problem is one of the most challenging problems in biological sciences. Many approaches have been proposed using database information and/or simplified protein models. The protein structure prediction problem can be cast in the form of an optimization problem. Notwithstanding its importance, the problem has very seldom been tackled by Constraint Logic Programming, a declarative programming paradigm suitable for solving combinatorial optimization problems. Results: Constraint Logic Programming techniques have been applied to the protein structure prediction problem on the face-centered cube lattice model. Molecular dynamics techniques, endowed with the notion of constraint, have been also exploited. Even using a very simplified model, Constraint Logic Programming on the face-centered cube lattice model allowed us to obtain acceptable results for a few small proteins. As a test implementation their (known) secondary structure and the presence of disulfide bridges are used as constraints. Simplified structures obtained in this way have been converted to all atom models with plausible structure. Results have been compared with a similar approach using a well-established technique such as molecular dynamics. Conclusions: The results obtained on small proteins show that Constraint Logic Programming techniques can be employed for studying protein simplified models, which can be converted into realistic all atom models. The advantage of Constraint Logic Programming over other, much more explored, methodologies, resides in the rapid software prototyping, in the easy way of encoding heuristics, and in exploiting all the advances made in this research area, e.g. in constraint propagation and its use for pruning the huge search space. PMID:15571634

  14. Virality Prediction and Community Structure in Social Networks

    NASA Astrophysics Data System (ADS)

    Weng, Lilian; Menczer, Filippo; Ahn, Yong-Yeol

    2013-08-01

    How does network structure affect diffusion? Recent studies suggest that the answer depends on the type of contagion. Complex contagions, unlike infectious diseases (simple contagions), are affected by social reinforcement and homophily. Hence, the spread within highly clustered communities is enhanced, while diffusion across communities is hampered. A common hypothesis is that memes and behaviors are complex contagions. We show that, while most memes indeed spread like complex contagions, a few viral memes spread across many communities, like diseases. We demonstrate that the future popularity of a meme can be predicted by quantifying its early spreading pattern in terms of community concentration. The more communities a meme permeates, the more viral it is. We present a practical method to translate data about community structure into predictive knowledge about what information will spread widely. This connection contributes to our understanding in computational social science, social media analytics, and marketing applications.
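
    One way to quantify how concentrated a meme's early spread is across communities is to look at the share of its posts in the dominant community and at the Shannon entropy of the distribution over communities. The sketch below assumes each early post has already been assigned a community label; the function name, the input format and the example labels are hypothetical, and the abstract does not specify the exact measures used.

```python
import math
from collections import Counter


def community_concentration(community_labels):
    """Return (dominant_share, entropy_bits) for a meme's early posts.

    A high dominant share and low entropy mean the meme is confined to few
    communities; low share and high entropy mean it permeates many.
    """
    counts = Counter(community_labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one post is required")
    shares = [count / total for count in counts.values()]
    entropy = -sum(p * math.log2(p) for p in shares)
    return max(shares), entropy


if __name__ == "__main__":
    confined = ["a", "a", "a", "a"]   # hypothetical: one community only
    spread = ["a", "b", "c", "d"]     # hypothetical: four communities
    print(community_concentration(confined))
    print(community_concentration(spread))
```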

  15. Virality prediction and community structure in social networks.

    PubMed

    Weng, Lilian; Menczer, Filippo; Ahn, Yong-Yeol

    2013-01-01

    How does network structure affect diffusion? Recent studies suggest that the answer depends on the type of contagion. Complex contagions, unlike infectious diseases (simple contagions), are affected by social reinforcement and homophily. Hence, the spread within highly clustered communities is enhanced, while diffusion across communities is hampered. A common hypothesis is that memes and behaviors are complex contagions. We show that, while most memes indeed spread like complex contagions, a few viral memes spread across many communities, like diseases. We demonstrate that the future popularity of a meme can be predicted by quantifying its early spreading pattern in terms of community concentration. The more communities a meme permeates, the more viral it is. We present a practical method to translate data about community structure into predictive knowledge about what information will spread widely. This connection contributes to our understanding in computational social science, social media analytics, and marketing applications. PMID:23982106

  16. Predicting olfactory receptor neuron responses from odorant structure

    PubMed Central

    Schmuker, Michael; de Bruyne, Marien; Hähnel, Melanie; Schneider, Gisbert

    2007-01-01

    Background: Olfactory receptors work at the interface between the chemical world of volatile molecules and the perception of scent in the brain. Their main purpose is to translate chemical space into information that can be processed by neural circuits. Assuming that these receptors have evolved to cope with this task, the analysis of their coding strategy promises to yield valuable insight into how to encode chemical information in an efficient way. Results: We mimicked olfactory coding by modeling responses of primary olfactory neurons to small molecules using a large set of physicochemical molecular descriptors and artificial neural networks. We then tested these models by recording in vivo receptor neuron responses to a new set of odorants and successfully predicted the responses of five out of seven receptor neurons. Correlation coefficients ranged from 0.66 to 0.85, demonstrating the applicability of our approach for the analysis of olfactory receptor activation data. The molecular descriptors that are best-suited for response prediction vary for different receptor neurons, implying that each receptor neuron detects a different aspect of chemical space. Finally, we demonstrate that receptor responses themselves can be used as descriptors in a predictive model of neuron activation. Conclusion: The chemical meaning of molecular descriptors helps understand structure-response relationships for olfactory receptors and their "receptive fields". Moreover, it is possible to predict receptor neuron activation from chemical structure using machine-learning techniques, although this is still complicated by a lack of training data. PMID:17880742

  17. Statistics of noncoding RNAs: alignment and secondary structure prediction

    NASA Astrophysics Data System (ADS)

    Nechaev, S. K.; Tamm, M. V.; Valba, O. V.

    2011-05-01

    A new statistical approach to alignment (finding the longest common subsequence) of two random RNA-type sequences is proposed. We have constructed a generalized 'dynamic programming' algorithm for finding the extreme value of the free energy of two noncoding RNAs. In our procedure, we take into account the binding free energy of two random heteropolymer chains which are capable of forming the cloverleaf-like spatial structures typical for RNA molecules. The algorithm is based on two observations: (i) the standard alignment problem can be considered as a zero-temperature limit of a more general statistical problem of binding of two associating heteropolymer chains; (ii) this last problem can be generalized naturally to consider sequences with hierarchical cloverleaf-like structures (i.e. of RNA type). The approach also permits us to perform a 'secondary structure recovery'. Namely, we can predict the optimal secondary structures of interacting RNAs in a zero-temperature limit knowing only their primary sequences.
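
    The standard alignment problem that this abstract takes as its zero-temperature limit, finding the longest common subsequence of two sequences, has a textbook dynamic-programming solution. The sketch below covers only that baseline, not the authors' generalized free-energy algorithm; the function name and the example sequences are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of sequences a and b.

    Runs in O(len(a) * len(b)) time and keeps only two rows of the table.
    """
    previous = [0] * (len(b) + 1)
    for symbol_a in a:
        current = [0]
        for j, symbol_b in enumerate(b, start=1):
            if symbol_a == symbol_b:
                # Extend the best alignment of the two shorter prefixes.
                current.append(previous[j - 1] + 1)
            else:
                # Skip a symbol from one sequence or the other.
                current.append(max(previous[j], current[j - 1]))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    print(lcs_length("ACGU", "AGU"))
    print(lcs_length("GGCAU", "GCU"))
```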

  18. Fragment-HMM: a new approach to protein structure prediction.

    PubMed

    Li, Shuai Cheng; Bu, Dongbo; Xu, Jinbo; Li, Ming

    2008-11-01

    We designed a simple position-specific hidden Markov model to predict protein structure. Our new framework naturally repeats itself to converge to a final target, conglomerating fragment assembly, clustering, target selection, refinement, and consensus, all in one process. Our initial implementation of this theory converges to within 6 Å of the native structures for 100% of decoys on all six standard benchmark proteins used in ROSETTA (discussed by Simons and colleagues in a recent paper), which achieved only 14%-94% for the same data. The qualities of the best decoys and the final decoys our theory converges to are also notably better. PMID:18723665

  19. Predicting structure/property relations in polymeric photovoltaic devices.

    NASA Astrophysics Data System (ADS)

    Buxton, Gavin; Clarke, Nigel

    2007-03-01

    Plastic solar cells are attractive candidates for providing cheap, clean and renewable energy. However, such devices are critically dependent on the internal structure, or morphology, of the polymer constituents. We have developed a model that enables us to predict photovoltaic behaviour for arbitrary morphologies, which we also generate from numerical simulations. We illustrate the model by showing how diblock copolymer morphologies can be manipulated to optimise the photovoltaic effect in plastic solar cells. In this manner, we can correlate photovoltaic properties with device structure and hence guide experiments to optimise polymer morphologies to meet photovoltaic needs.

  20. Predicting structure and property relations in polymeric photovoltaic devices

    NASA Astrophysics Data System (ADS)

    Buxton, Gavin A.; Clarke, Nigel

    2006-08-01

    Plastic solar cells are attractive candidates for providing cheap, clean, and renewable energy. However, such devices are critically dependent on the internal structure, or morphology, of the polymer constituents. We have developed a model that enables us to predict photovoltaic behavior for arbitrary morphologies, which we also generate from numerical simulations. We illustrate the model by showing how diblock copolymer morphologies can be manipulated to optimize the photovoltaic effect in plastic solar cells. In this manner, we can correlate photovoltaic properties with device structure and hence guide experiments to optimize polymer morphologies to meet photovoltaic needs.

  1. Effects of scale in predicting global structural response

    NASA Technical Reports Server (NTRS)

    Deo, R. B.; Kan, H. P.

    1991-01-01

    Analytical techniques for scale-up effects were reviewed. The advantages and limitations of applying the principles of similitude to composite structures are summarized and illustrated by simple examples. An analytical procedure was formulated to design scale models of an axially compressed composite cylinder. A building-block approach was outlined where each structural detail is analyzed independently and the probable failure sequence of a selected component is predicted, taking into account load redistribution subsequent to first element failure. Details of this building-block approach are under development.

  2. Structure and stoichiometry prediction of surfaces reacting with multicomponent gases.

    PubMed

    Herrmann, Philipp; Heimel, Georg

    2015-01-14

    Reactive interactions of molecules with solid surfaces are of key interest for catalysis and surface functionalization. Here, conceptual shortcomings of previous theoretical methods for the prediction of steady-state surface structures and stoichiometries from first-principles thermodynamics are identified. An extension is then proposed, which now enables the unconstrained description of an arbitrary number of mutually reacting gas-phase species. PMID:25382305

  3. A tool for the prediction of structures of complex sugars.

    PubMed

    Xia, Junchao; Margulis, Claudio

    2008-12-01

    In two recent back-to-back articles (Xia et al., J Chem Theory Comput 3:1620-1628 and 1629-1643, 2007a, b) we started to address the problem of complex oligosaccharide conformation and folding. The scheme previously presented was based on exhaustive searches in configuration space in conjunction with Nuclear Overhauser Effect (NOE) calculations and the use of a complex rotameric library that takes branching into account. NOEs are extremely useful for structural determination but only provide information about short-range interactions and ordering. In contrast, the measurement of residual dipolar couplings (RDC) yields information about molecular ordering or folding that is long range in nature. In this article we show the results obtained by incorporating RDC calculations into our prediction scheme. Using this new approach we are able to accurately predict the structure of six human milk sugars: LNF-1, LND-1, LNF-2, LNF-3, LNnT and LNT. Our exhaustive search in dihedral configuration space combined with RDC and NOE calculations allows for highly accurate structural predictions that, because of the non-ergodic nature of these molecules on a time scale compatible with molecular dynamics simulations, are extremely hard to obtain otherwise (Almond et al., Biochemistry 43:5853-5863, 2004). Molecular dynamics simulations in explicit solvent using as initial configurations the structures predicted by our algorithm show that the histo-blood group epitopes in these sugars are relatively rigid and that the whole family of oligosaccharides derives its conformational variability almost exclusively from their common linkage (β-D-GlcNAc-(1→3)-β-D-Gal), which can exist in two distinct conformational states. A population analysis based on the conformational variability of this flexible glycosidic link indicates that the relative population of the two distinct states varies for different human milk oligosaccharides. PMID:18953494

  4. STITCHER: Dynamic assembly of likely amyloid and prion β-structures from secondary structure predictions.

    PubMed

    Bryan, Allen W; O'Donnell, Charles W; Menke, Matthew; Cowen, Lenore J; Lindquist, Susan; Berger, Bonnie

    2012-02-01

    The supersecondary structure of amyloids and prions, proteins of intense clinical and biological interest, is difficult to determine by standard experimental or computational means. In addition, significant conformational heterogeneity is known or suspected to exist in many amyloid fibrils. Previous work has demonstrated that probability-based prediction of discrete β-strand pairs can offer insight into these structures. Here, we devise a system of energetic rules that can be used to dynamically assemble these discrete β-strand pairs into complete amyloid β-structures. The STITCHER algorithm progressively 'stitches' strand-pairs into full β-sheets based on a novel free-energy model, incorporating experimentally observed amino-acid side-chain stacking contributions, entropic estimates, and steric restrictions for amyloidal parallel β-sheet construction. A dynamic program computes the top 50 structures and returns both the highest scoring structure and a consensus structure taken by polling this list for common discrete elements. Putative structural heterogeneity can be inferred from sequence regions that compose poorly. Predictions show agreement with experimental models of Alzheimer's amyloid beta peptide and the Podospora anserina Het-s prion. Predictions of the HET-s homolog HET-S also reflect experimental observations of poor amyloid formation. We put forward predicted structures for the yeast prion Sup35, suggesting N-terminal structural stability enabled by tyrosine ladders, and C-terminal heterogeneity. Predictions for the Rnq1 prion and alpha-synuclein are also given, identifying a similar mix of homogeneous and heterogeneous secondary structure elements. STITCHER provides novel insight into the energetic basis of amyloid structure, provides accurate structure predictions, and can help guide future experimental studies. PMID:22095906

  5. STITCHER: Dynamic assembly of likely amyloid and prion β-structures from secondary structure predictions

    PubMed Central

    Bryan, Allen W; O’Donnell, Charles W; Menke, Matthew; Cowen, Lenore J; Lindquist, Susan; Berger, Bonnie

    2012-01-01

    The supersecondary structure of amyloids and prions, proteins of intense clinical and biological interest, is difficult to determine by standard experimental or computational means. In addition, significant conformational heterogeneity is known or suspected to exist in many amyloid fibrils. Previous work has demonstrated that probability-based prediction of discrete β-strand pairs can offer insight into these structures. Here, we devise a system of energetic rules that can be used to dynamically assemble these discrete β-strand pairs into complete amyloid β-structures. The STITCHER algorithm progressively ‘stitches’ strand-pairs into full β-sheets based on a novel free-energy model, incorporating experimentally observed amino-acid side-chain stacking contributions, entropic estimates, and steric restrictions for amyloidal parallel β-sheet construction. A dynamic program computes the top 50 structures and returns both the highest scoring structure and a consensus structure taken by polling this list for common discrete elements. Putative structural heterogeneity can be inferred from sequence regions that compose poorly. Predictions show agreement with experimental models of Alzheimer’s amyloid beta peptide and the Podospora anserina Het-s prion. Predictions of the HET-s homolog HET-S also reflect experimental observations of poor amyloid formation. We put forward predicted structures for the yeast prion Sup35, suggesting N-terminal structural stability enabled by tyrosine ladders, and C-terminal heterogeneity. Predictions for the Rnq1 prion and alpha-synuclein are also given, identifying a similar mix of homogeneous and heterogeneous secondary structure elements. STITCHER provides novel insight into the energetic basis of amyloid structure, provides accurate structure predictions, and can help guide future experimental studies. Proteins 2012. © 2011 Wiley Periodicals, Inc. PMID:22095906

  6. Structure-based mutant stability predictions on proteins of unknown structure.

    PubMed

    Gonnelli, Giulia; Rooman, Marianne; Dehouck, Yves

    2012-10-31

    The ability to rapidly and accurately predict the effects of mutations on the physicochemical properties of proteins holds tremendous importance in the rational design of modified proteins for various types of industrial, environmental or pharmaceutical applications, as well as in elucidating the genetic background of complex diseases. In many cases, the absence of an experimentally resolved structure represents a major obstacle, since most currently available predictive software crucially depend on it. We investigate here the relevance of combining coarse-grained structure-based stability predictions with a simple comparative modeling procedure. Strikingly, our results show that the use of average to high quality structural models leads to virtually no loss in predictive power compared to the use of experimental structures. Even in the case of low quality models, the decrease in performance is quite limited and this combined approach remains markedly superior to other methods based exclusively on the analysis of sequence features. PMID:22782143

  7. Predicting ion binding properties for RNA tertiary structures.

    PubMed

    Tan, Zhi-Jie; Chen, Shi-Jie

    2010-09-01

    Recent experiments pointed to the potential importance of ion correlation for multivalent ions such as Mg²⁺ ions in RNA folding. In this study, we develop an all-atom model to predict the ion electrostatics in RNA folding. The model can treat ion correlation effects explicitly by considering an ensemble of discrete ion distributions. In contrast to the previous coarse-grained models that can treat ion correlation, this new model is based on all-atom nucleic acid structures. Thus, unlike the previous coarse-grained models, this new model allows us to treat complex tertiary structures such as HIV-1 DIS type RNA kissing complexes. Theory-experiment comparisons for a variety of tertiary structures indicate that the model gives improved predictions over the Poisson-Boltzmann theory, which underestimates the Mg²⁺ binding in the competition with Na⁺. Further systematic theory-experiment comparisons for a series of tertiary structures lead to a set of analytical formulas for Mg²⁺/Na⁺ ion-binding to various RNA and DNA structures over a wide range of Mg²⁺ and Na⁺ concentrations. PMID:20816069

  8. Improved hybrid optimization algorithm for 3D protein structure prediction.

    PubMed

    Zhou, Changjun; Hou, Caixia; Wei, Xiaopeng; Zhang, Qiang

    2014-07-01

    A new improved hybrid optimization algorithm, the PGATS algorithm, which is based on the toy off-lattice model, is presented for three-dimensional protein structure prediction problems. The algorithm combines the particle swarm optimization (PSO), genetic algorithm (GA), and tabu search (TS) algorithms. In addition, we adopt several improvement strategies. A stochastic disturbance factor is added to the particle swarm optimization to improve its search ability; the crossover and mutation operations of the genetic algorithm are changed to a kind of random linear method; finally, the tabu search algorithm is improved by appending a mutation operator. Through the combination of a variety of strategies and algorithms, protein structure prediction (PSP) in a 3D off-lattice model is achieved. The PSP problem is an NP-hard problem, but it can be treated as a global optimization problem with multiple extrema and multiple parameters. This is the theoretical principle of the hybrid optimization algorithm proposed in this paper. The algorithm combines local search and global search, which overcomes the shortcomings of a single algorithm and gives full play to the advantages of each algorithm. The method is verified on the current universal standard sequences, namely Fibonacci sequences and real protein sequences. Experiments show that the proposed new method outperforms single algorithms on the accuracy of calculating the protein sequence energy value, which proves it to be an effective way to predict the structure of proteins. PMID:25069136

  9. SVM-based method for protein structural class prediction using secondary structural content and structural information of amino acids.

    PubMed

    Mohammad, Tabrez Anwar Shamim; Nagarajaram, Hampapathalu Adimurthy

    2011-08-01

    The knowledge collated from the known protein structures has revealed that the proteins are usually folded into the four structural classes: all-α, all-β, α/β and α + β. A number of methods have been proposed to predict the protein's structural class from its primary structure; however, it has been observed that these methods fail or perform poorly in the cases of distantly related sequences. In this paper, we propose a new method for protein structural class prediction using low homology (twilight-zone) protein sequences dataset. Since protein structural class prediction is a typical classification problem, we have developed a Support Vector Machine (SVM)-based method for protein structural class prediction that uses features derived from the predicted secondary structure and predicted burial information of amino acid residues. The examination of different individual as well as feature combinations revealed that the combination of secondary structural content, secondary structural and solvent accessibility state frequencies of amino acids gave rise to the best leave-one-out cross-validation accuracy of ~81% which is comparable to the best accuracy reported in the literature so far. PMID:21776605

  10. How Good Are Simplified Models for Protein Structure Prediction?

    PubMed Central

    Newton, M. A. Hakim; Rashid, Mahmood A.; Pham, Duc Nghia; Sattar, Abdul

    2014-01-01

    Protein structure prediction (PSP) has been one of the most challenging problems in computational biology for several decades. The challenge is largely due to the complexity of the all-atomic details and the unknown nature of the energy function. Researchers have therefore used simplified energy models that consider interaction potentials only between the amino acid monomers in contact on discrete lattices. The restricted nature of the lattices and the energy models poses a twofold concern regarding the assessment of the models. Can a native or a very close structure be obtained when structures are mapped to lattices? Can the contact based energy models on discrete lattices guide the search towards the native structures? In this paper, we use the protein chain lattice fitting (PCLF) problem to address the first concern; we developed a constraint-based local search algorithm for the PCLF problem for cubic and face-centered cubic lattices and found very close lattice fits for the native structures. For the second concern, we use a number of techniques to sample the conformation space and find correlations between energy functions and root mean square deviation (RMSD) distance of the lattice-based structures with the native structures. Our analysis reveals weakness of several contact based energy models used that are popular in PSP. PMID:24876837
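
    As a rough illustration of what a contact-based energy on a discrete lattice looks like, the sketch below scores a conformation on a simple cubic lattice with the HP model. The HP model is an assumption chosen for brevity; the abstract does not say which energy models were assessed, and the function name is hypothetical.

```python
from itertools import combinations

def hp_contact_energy(sequence, coords):
    """Return the HP-model energy: -1 for every pair of H residues that are
    lattice neighbours (unit distance) but not adjacent along the chain."""
    if len(sequence) != len(coords):
        raise ValueError("sequence and coords must have the same length")
    if len(set(coords)) != len(coords):
        raise ValueError("conformation is not self-avoiding")
    energy = 0
    for i, j in combinations(range(len(sequence)), 2):
        if j - i < 2:
            continue  # chain neighbours do not count as contacts
        if sequence[i] == "H" and sequence[j] == "H":
            manhattan = sum(abs(a - b) for a, b in zip(coords[i], coords[j]))
            if manhattan == 1:  # nearest neighbours on the cubic lattice
                energy -= 1
    return energy

# A 4-residue chain folded into a square in the z = 0 plane (illustrative only).
square = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
print(hp_contact_energy("HPPH", square))
```

    Only the first and last residues touch without being chain neighbours, so a single H-H contact is counted. A face-centered cubic lattice would need a different neighbour test.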

  11. Structure Prediction: New Insights into Decrypting Long Noncoding RNAs

    PubMed Central

    Yan, Kun; Arfat, Yasir; Li, Dijie; Zhao, Fan; Chen, Zhihao; Yin, Chong; Sun, Yulong; Hu, Lifang; Yang, Tuanmin; Qian, Airong

    2016-01-01

    Long noncoding RNAs (lncRNAs), which form a diverse class of RNAs, remain the least understood type of noncoding RNAs in terms of their nature and identification. Emerging evidence has revealed that a small number of newly discovered lncRNAs perform important and complex biological functions such as dosage compensation, chromatin regulation, genomic imprinting, and nuclear organization. However, understanding the wide range of functions of lncRNAs related to various processes of cellular networks remains a great experimental challenge. Structural versatility is critical for RNAs to perform various functions and provides new insights into probing the functions of lncRNAs. In recent years, the computational method of RNA structure prediction has been developed to analyze the structure of lncRNAs. This novel methodology has provided basic but indispensable information for the rapid, large-scale and in-depth research of lncRNAs. This review focuses on mainstream RNA structure prediction methods at the secondary and tertiary levels to offer an additional approach to investigating the functions of lncRNAs. PMID:26805815

  12. EVO—Evolutionary algorithm for crystal structure prediction

    NASA Astrophysics Data System (ADS)

    Bahmann, Silvia; Kortus, Jens

    2013-06-01

    We present EVO, an evolution strategy designed for crystal structure search and prediction. The concept and main features of biological evolution such as creation of diversity and survival of the fittest have been transferred to crystal structure prediction. EVO successfully demonstrates its applicability to find crystal structures of the elements of the 3rd main group with their different spacegroups. For this we used the number of atoms in the conventional cell and multiples of it. Running EVO with different numbers of carbon atoms per unit cell yields graphite as the lowest energy structure as well as a diamond-like structure, both in one run. Our implementation also supports the search for 2D structures and was able to find a boron sheet with structural features so far not considered in literature.

    Program summary
    Program title: EVO
    Catalogue identifier: AEOZ_v1_0
    Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEOZ_v1_0.html
    Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
    Licensing provisions: GNU General Public License version 3
    No. of lines in distributed program, including test data, etc.: 23488
    No. of bytes in distributed program, including test data, etc.: 1830122
    Distribution format: tar.gz
    Programming language: Python.
    Computer: No limitations known.
    Operating system: Linux.
    RAM: Negligible compared to the requirements of the electronic structure programs used
    Classification: 7.8.
    External routines: Quantum ESPRESSO (http://www.quantum-espresso.org/), GULP (https://projects.ivec.org/gulp/)
    Nature of problem: Crystal structure search is a global optimisation problem in 3N+3 dimensions where N is the number of atoms in the unit cell. The high dimensional search space is accompanied by an unknown energy landscape.
    Solution method: Evolutionary algorithms transfer the main features of biological evolution to use them in global searches. The combination of the "survival of the fittest" (deterministic) and the randomised choice of the parents and normally distributed mutation steps (non-deterministic) provides a thorough search.
    Restrictions: The algorithm is in principle only restricted by a huge search space and simultaneously increasing calculation time (memory, etc.), which is not a problem for our piece of code but for the used electronic structure programs.
    Running time: The simplest provided case runs serially and takes 30 minutes to one hour. All other calculations run for significantly longer time depending on the parameters like the number and sort of atoms and the electronic structure program in use as well as the level of parallelism included.
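
    The solution method described above (random parent choice, normally distributed mutation steps, deterministic survival of the fittest) can be sketched as a generic (mu + lambda) evolution strategy. This is not the EVO code: a toy quadratic function stands in for the energies that EVO obtains from external electronic structure programs, and all names and parameter values are assumptions.

```python
import random

def evolve(fitness, dim, mu=5, lam=20, sigma=0.3, generations=200, seed=0):
    """Minimise `fitness` with a simple (mu + lambda) evolution strategy."""
    rng = random.Random(seed)
    population = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(mu)]
    for _ in range(generations):
        offspring = []
        for _ in range(lam):
            parent = rng.choice(population)  # randomised choice of the parent
            # Normally distributed mutation step on every coordinate.
            child = [x + rng.gauss(0.0, sigma) for x in parent]
            offspring.append(child)
        # Deterministic "survival of the fittest" over parents and offspring.
        population = sorted(population + offspring, key=fitness)[:mu]
    return population[0]

def sphere(x):
    """Toy energy landscape with its minimum at the origin."""
    return sum(v * v for v in x)

best = evolve(sphere, dim=3)
print(best, sphere(best))
```

    Because parents compete with their offspring, the best energy never increases from one generation to the next. A real crystal search would also need cell parameters and symmetry handling, which are omitted here.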

  13. Predicting the stability of large structured food webs.

    PubMed

    Allesina, Stefano; Grilli, Jacopo; Barabás, György; Tang, Si; Aljadeff, Johnatan; Maritan, Amos

    2015-01-01

    The stability of ecological systems has been a long-standing focus of ecology. Recently, tools from random matrix theory have identified the main drivers of stability in ecological communities whose network structure is random. However, empirical food webs differ greatly from random graphs. For example, their degree distribution is broader, they contain few trophic cycles, and they are almost interval. Here we derive an approximation for the stability of food webs whose structure is generated by the cascade model, in which 'larger' species consume 'smaller' ones. We predict the stability of these food webs with great accuracy, and our approximation also works well for food webs whose structure is determined empirically or by the niche model. We find that intervality and broad degree distributions tend to stabilize food webs, and that average interaction strength has little influence on stability, compared with the effect of variance and correlation. PMID:26198207

  14. Predicting the stability of large structured food webs

    PubMed Central

    Allesina, Stefano; Grilli, Jacopo; Barabás, György; Tang, Si; Aljadeff, Johnatan; Maritan, Amos

    2015-01-01

    The stability of ecological systems has been a long-standing focus of ecology. Recently, tools from random matrix theory have identified the main drivers of stability in ecological communities whose network structure is random. However, empirical food webs differ greatly from random graphs. For example, their degree distribution is broader, they contain few trophic cycles, and they are almost interval. Here we derive an approximation for the stability of food webs whose structure is generated by the cascade model, in which ‘larger’ species consume ‘smaller’ ones. We predict the stability of these food webs with great accuracy, and our approximation also works well for food webs whose structure is determined empirically or by the niche model. We find that intervality and broad degree distributions tend to stabilize food webs, and that average interaction strength has little influence on stability, compared with the effect of variance and correlation. PMID:26198207

  15. Structure Prediction and Validation of the ERK8 Kinase Domain

    PubMed Central

    Strambi, Angela; Mori, Mattia; Rossi, Matteo; Colecchia, David; Manetti, Fabrizio; Carlomagno, Francesca; Botta, Maurizio; Chiariello, Mario

    2013-01-01

    Extracellular signal-regulated kinase 8 (ERK8) has been already implicated in cell transformation and in the protection of genomic integrity and, therefore, proposed as a novel potential therapeutic target for cancer. In the absence of a crystal structure, we developed a three-dimensional model for its kinase domain. To validate our model we applied a structure-based virtual screening protocol consisting of pharmacophore screening and molecular docking. Experimental characterization of the hit compounds confirmed that a high percentage of the identified scaffolds was able to inhibit ERK8. We also confirmed an ATP competitive mechanism of action for the two best-performing molecules. Ultimately, we identified an ERK8 drug-resistant “gatekeeper” mutant that corroborated the predicted molecular binding mode, confirming the reliability of the generated structure. We expect that our model will be a valuable tool for the development of specific ERK8 kinase inhibitors. PMID:23326322

  16. Structural class tendency of polypeptide: A new conception in predicting protein structural class

    NASA Astrophysics Data System (ADS)

    Yu, Tao; Sun, Zhi-Bo; Sang, Jian-Ping; Huang, Sheng-You; Zou, Xian-Wu

    2007-12-01

    Prediction of protein domain structural classes is an important topic in protein science. In this paper, we propose a new concept, the structural class tendency of polypeptides (SCTP), which is based on the fact that a given amino acid fragment tends to be present in certain types of proteins. The SCTP is obtained from an available training data set, PDB40-B. Using the SCTP to predict protein structural classes with the Intimate Sorting predictive method, we obtained predictive accuracies (jackknife test) of 93.7%, 96.5%, and 78.6% for the testing data sets PDB40-j, Chou&Maggiora and CHOU, respectively. These results indicate that the SCTP approach is quite encouraging and promising. This new concept provides an effective tool to extract valuable information from protein sequences.

  17. Symmetry-adapted digital modeling II. The double-helix B-DNA.

    PubMed

    Janner, A

    2016-05-01

    The positions of phosphorus in B-DNA have the remarkable property of occurring (in axial projection) at well defined points in the three-dimensional space of a projected five-dimensional decagonal lattice, subdividing according to the golden mean ratio τ:1:τ [with τ = (1+√5)/2] the edges of an enclosing decagon. The corresponding planar integral indices n1, n2, n3, n4 (which are lattice point coordinates) are extended to include the axial index n5 as well, defined for each P position of the double helix with respect to the single decagonal lattice ΛP(aP, cP) with aP = 2.222 Å and cP = 0.676 Å. A finer decagonal lattice Λ(a, c), with a = aP/6 and c = cP, together with a selection of lattice points for each nucleotide with a given indexed P position (so as to define a discrete set in three dimensions) permits the indexing of the atomic positions of the B-DNA d(AGTCAGTCAG) derived by M. J. P. van Dongen. This is done for both DNA strands and the single lattice Λ. Considered first is the sugar-phosphate subsystem, and then each nucleobase guanine, adenine, cytosine and thymine. One gets in this way a digital modeling of d(AGTCAGTCAG) in a one-to-one correspondence between atomic and indexed positions and a maximal deviation of about 0.6 Å (for the value of the lattice parameters given above). It is shown how to get a digital modeling of the B-DNA double helix for any given code. Finally, a short discussion indicates how this procedure can be extended to derive coarse-grained B-DNA models. An example is given with a reduction factor of about 2 in the number of atomic positions. A few remarks about the wider interest of this investigation and possible future developments conclude the paper. PMID:27126108
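
    To make the golden-mean subdivision concrete, the sketch below splits one edge of a regular decagon into three segments in the ratio τ:1:τ and evaluates the finer lattice parameter a = aP/6. Reading "τ:1:τ" as three consecutive segments along the edge is an assumption, as are the unit circumradius and the function name; the paper's indexing of atomic positions is not reproduced.

```python
import math

TAU = (1 + math.sqrt(5)) / 2  # golden mean

def subdivide_edge(p, q):
    """Return the two interior points splitting segment pq in the ratio tau:1:tau."""
    total = 2 * TAU + 1
    fractions = (TAU / total, (TAU + 1) / total)
    return [tuple(a + f * (b - a) for a, b in zip(p, q)) for f in fractions]

# Vertices of a regular decagon with unit circumradius (illustrative size only).
decagon = [(math.cos(2 * math.pi * k / 10), math.sin(2 * math.pi * k / 10))
           for k in range(10)]
points = subdivide_edge(decagon[0], decagon[1])

a_P = 2.222   # Å, lattice parameter quoted in the abstract
a = a_P / 6   # Å, parameter of the finer decagonal lattice
print(points, round(a, 4))
```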

  18. Importance of coulombic end effects on cation accumulation near oligoelectrolyte B-DNA: a demonstration using 23Na NMR.

    PubMed

    Stein, V M; Bond, J P; Capp, M W; Anderson, C F; Record, M T

    1995-03-01

    The local cation concentration at the surface of oligomeric or polymeric B-DNA is expected, on the basis of MC simulations (Olmsted, M. C., C. F. Anderson, and M. T. Record, Jr. 1989. Proc. Natl. Acad. Sci. USA. 86:7766-7770), to decrease sharply as either end of the molecule is approached. In this paper we report 23Na NMR measurements indicating the importance of this "coulombic" end effect on the average extent of association of Na+ with oligomeric duplex DNA. In solutions containing either 20-bp synthetic DNA or 160-bp mononucleosomal calf thymus DNA at phosphate monomer concentrations [P] of 4-10 mM, measurements were made over the range of ratios 1 ≤ [Na]/[P] ≤ 20, corresponding to Na+ concentrations of 4-200 mM. The longitudinal 23Na NMR relaxation rates measured in these NaDNA solutions, Robs, are interpreted as population-weighted averages of contributions from "bound" (RB) and "free" (RF) 23Na relaxation rates. The observed enhancements of Robs indicate that RB significantly exceeds RF, which is approximately equal to the 23Na relaxation rate in an aqueous solution containing only NaCl. Under salt-free conditions ([Na]/[P] = 1), where the enhancement in Robs is maximal, we find that Robs − RF in the solution containing 160-bp DNA is approximately 1.8 times that observed for the 20-bp DNA. For the 160-bp oligomer (which theoretical calculations predict to be effectively polyion-like), we find that a plot of Robs vs. [P]/[Na] is linear, as observed previously for sonicated (approximately 700 bp) DNA samples. For the 20-bp oligonucleotide this plot exhibits a marked departure from linearity that can be fitted to a quadratic function of [P]/[Na]. Monte Carlo simulations based on a simplified model are capable of reproducing the qualitative trends in the 23Na NMR measurements analyzed here. In particular, the dependences of Robs − RF on the DNA charge magnitude |Z| (320 vs. 38 phosphates) and (for the 20-bp oligomer) on [Na]/[P] are well correlated with the calculated average surface concentration of Na+. Thus, effects of sodium concentration on RB appear to be of secondary importance. We conclude that 23Na NMR relaxation measurements are a sensitive probe of the effects of oligomer charge on the extent of ion accumulation near B-DNA oligonucleotides, as a function of [Na] and [P]. PMID:7756526
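
    The two-state interpretation of the observed relaxation rate as a population-weighted average of bound and free contributions can be written out as a short sketch. The bound fraction and the rates used below are made-up illustrative numbers, not values from the paper, and the function names are hypothetical.

```python
def observed_rate(p_bound, r_bound, r_free):
    """Population-weighted average of bound and free relaxation rates."""
    if not 0.0 <= p_bound <= 1.0:
        raise ValueError("p_bound must be a fraction between 0 and 1")
    return p_bound * r_bound + (1.0 - p_bound) * r_free

def excess_rate(p_bound, r_bound, r_free):
    """R_obs - R_F, which equals p_bound * (R_B - R_F) in the two-state picture."""
    return observed_rate(p_bound, r_bound, r_free) - r_free

# Illustrative numbers only, rates in s^-1.
print(excess_rate(0.5, 200.0, 20.0))
```

    In this picture, if RB is roughly constant, the excess rate tracks the bound fraction, which is why the abstract can relate it to the calculated surface concentration of Na+.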

  19. StructBoost: Boosting Methods for Predicting Structured Output Variables.

    PubMed

    Chunhua Shen; Guosheng Lin; van den Hengel, Anton

    2014-10-01

    Boosting is a method for learning a single accurate predictor by linearly combining a set of less accurate weak learners. Recently, structured learning has found many applications in computer vision. Inspired by structured support vector machines (SSVM), here we propose a new boosting algorithm for structured output prediction, which we refer to as StructBoost. StructBoost supports nonlinear structured learning by combining a set of weak structured learners. As SSVM generalizes SVM, our StructBoost generalizes standard boosting approaches such as AdaBoost, or LPBoost to structured learning. The resulting optimization problem of StructBoost is more challenging than SSVM in the sense that it may involve exponentially many variables and constraints. In contrast, for SSVM one usually has an exponential number of constraints and a cutting-plane method is used. In order to efficiently solve StructBoost, we formulate an equivalent 1-slack formulation and solve it using a combination of cutting planes and column generation. We show the versatility and usefulness of StructBoost on a range of problems such as optimizing the tree loss for hierarchical multi-class classification, optimizing the Pascal overlap criterion for robust visual tracking and learning conditional random field parameters for image segmentation. PMID:26352637

  20. Addressing the Role of Conformational Diversity in Protein Structure Prediction

    PubMed Central

    Parisi, Gustavo; Fornasari, Maria Silvina

    2016-01-01

    Computational modeling of tertiary structures has become of standard use to study proteins that lack experimental characterization. Unfortunately, 3D structure prediction methods and model quality assessment programs often overlook that an ensemble of conformers in equilibrium populates the native state of proteins. In this work we collected sets of publicly available protein models and the corresponding target structures experimentally solved and studied how they describe the conformational diversity of the protein. For each protein, we assessed the quality of the models against known conformers by several standard measures and identified those models ranked best. We found that model rankings are defined by both the selected target conformer and the similarity measure used. 70% of the proteins in our datasets show that different models are structurally closest to different conformers of the same protein target. We observed that model building protocols such as template-based or ab initio approaches describe in similar ways the conformational diversity of the protein, although for template-based methods this description may depend on the sequence similarity between target and template sequences. Taken together, our results support the idea that protein structure modeling could help to identify members of the native ensemble, highlight the importance of considering conformational diversity in protein 3D quality evaluations and endorse the study of the variability of the native structure for a meaningful biological analysis. PMID:27159429

  1. Quantitative structure-property relationships for predicting Henry's law constant from molecular structure.

    PubMed

    Dearden, John C; Schüürmann, Gerrit

    2003-08-01

    Various models are available for the prediction of Henry's law constant (H) or the air-water partition coefficient (Kaw), its dimensionless counterpart. Incremental methods are based on structural features such as atom types, bond types, and local structural environments; other regression models employ physicochemical properties, structural descriptors such as connectivity indices, and descriptors reflecting the electronic structure. There are also methods to calculate H from the ratio of vapor pressure (p_v) and water solubility (S_w) that in turn can be estimated from molecular structure, and quantum chemical continuum-solvation models to predict H via the solvation free energy (ΔG_s). This review is confined to methods that calculate H from molecular structure without experimental information and covers more than 40 methods published in the last 26 years. For a subset of eight incremental methods and four continuum-solvation models, a comparative analysis of their prediction performance is made using a test set of 700 compounds that includes a significant number of more complex and drug-like chemical structures. The results reveal substantial differences in the application range as well as in the prediction capability, a general decrease in prediction performance with decreasing H, and surprisingly large individual prediction errors, which are particularly striking for some quantum chemical schemes. The overall best-performing method appears to be the bond contribution method as implemented in the HENRYWIN software package, yielding a predictive squared correlation coefficient (q²) of 0.87 and a standard error of 1.03 log units for the test set. PMID:12924576
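
    The vapor pressure to solubility ratio and the conversion to the dimensionless air-water partition coefficient can be sketched as follows. The sketch assumes SI units (Pa and mol/m³), Kaw = H/(RT), and a made-up compound; it is not one of the reviewed estimation methods, and the function names are hypothetical.

```python
import math

R = 8.314462618  # molar gas constant, J mol^-1 K^-1

def henry_constant(vapor_pressure_pa, solubility_mol_m3):
    """Henry's law constant H in Pa m^3 mol^-1 from the ratio p_v / S_w."""
    return vapor_pressure_pa / solubility_mol_m3

def air_water_partition(h_pa_m3_mol, temperature_k=298.15):
    """Dimensionless air-water partition coefficient Kaw = H / (R T)."""
    return h_pa_m3_mol / (R * temperature_k)

h = henry_constant(1000.0, 10.0)  # hypothetical compound: 1000 Pa, 10 mol/m^3
k_aw = air_water_partition(h)
print(h, k_aw, math.log10(k_aw))
```

    The ratio is generally taken as an approximation for sparingly soluble compounds, so it should be treated as an estimate rather than a measured value.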

  2. FOURIER ANALYSIS OF EXTENDED FINE STRUCTURE WITH AUTOREGRESSIVE PREDICTION

    SciTech Connect

    Barton, J.; Shirley, D.A.

    1985-01-01

    Autoregressive prediction is adapted to double the resolution of Angle-Resolved Photoemission Extended Fine Structure (ARPEFS) Fourier transforms. Even with the optimal taper (weighting function), the commonly used taper-and-transform Fourier method has limited resolution: it assumes the signal is zero beyond the limits of the measurement. By seeking the Fourier spectrum of an infinite extent oscillation consistent with the measurements but otherwise having maximum entropy, the errors caused by finite data range can be reduced. Our procedure developed to implement this concept applies autoregressive prediction to extrapolate the signal to an extent controlled by a taper width. Difficulties encountered when processing actual ARPEFS data are discussed. A key feature of this approach is the ability to convert improved measurements (signal-to-noise or point density) into improved Fourier resolution.

  3. Improved thermodynamic parameters for prediction of structure H hydrate equilibria

    SciTech Connect

    Mehta, A.P.; Sloan, E.D.

    1996-07-01

    An improved set of all the thermodynamic and molecular properties required for the prediction of the existing 20 systems of structure H (sH) hydrate phase equilibrium data is presented. The statistical thermodynamics model was based on the van der Waals and Platteeuw theory, and the spherical core Kihara potential was used for guest-water interactions. Optimized Kihara parameters and reference thermodynamic properties were derived from experimental data of over 20 sH hydrate forming systems. The model could fit all the existing sH hydrate data within an accuracy of ±6%. Inhibitor predictions were also shown to fit recent data with no adjustable parameters. The feasibility of using hydrate cage occupancies to derive refined Kihara parameters of the guest molecules was investigated. Possible existence of sH hydrates at cryogenic temperatures was also established based on the model.

  4. Structural syntactic prediction measured with ELAN: evidence from ERPs.

    PubMed

    Fonteneau, Elisabeth

    2013-02-01

    The current study used event-related potentials (ERPs) to investigate how and when argument structure information is used during the processing of sentences with a filler-gap dependency. We hypothesize that one specific property - animacy (living vs. non-living) - is used by the parser during the building of the syntactic structure. Participants heard sentences that were rated off-line as having an expected noun (Who did the Lion King chase the caravan with?) or an unexpected noun (Who did Lion King chase the animal with?). This prediction is based on the animacy properties relation between the wh-word and the noun in the object position. ERPs from the noun in the unexpected condition (animal) elicited a typical Early Left Anterior Negativity (ELAN)/P600 complex compared to the noun in the expected condition (caravan). Firstly, these results demonstrate that the ELAN reflects not only grammatical category violation but also animacy property expectations in filler-gap dependency. Secondly, our data suggests that the language comprehension system is able to make detailed predictions about aspects of the upcoming words to build up the syntactic structure. PMID:23262082

  5. Cortical structure predicts success in performing musical transformation judgments.

    PubMed

    Foster, Nicholas E V; Zatorre, Robert J

    2010-10-15

    Recognizing melodies by their interval structure, or "relative pitch," is a fundamental aspect of musical perception. By using relative pitch, we are able to recognize tunes regardless of the key in which they are played. We sought to determine the cortical areas important for relative pitch processing using two morphometric techniques. Cortical differences have been reported in musicians within right auditory cortex (AC), a region considered important for pitch-based processing, and we have previously reported a functional correlation between relative pitch processing in the anterior intraparietal sulcus (IPS). We addressed the hypothesis that regional variation of cortical structure within AC and IPS is related to relative pitch ability using two anatomical techniques, cortical thickness (CT) analysis and voxel-based morphometry (VBM) of magnetic resonance imaging data. Persons with variable amounts of formal musical training were tested on a melody transposition task, as well as two musical control tasks and a speech control task. We found that gray matter concentration and cortical thickness in right Heschl's sulcus and bilateral IPS both predicted relative pitch task performance and correlated to a lesser extent with performance on the two musical control tasks. After factoring out variance explained by musical training, only relative pitch performance was predicted by cortical structure in these regions. These results directly demonstrate the functional relevance of previously reported anatomical differences in the auditory cortex of musicians. The findings in the IPS provide further support for the existence of a multimodal network for systematic transformation of stimulus information in this region. PMID:20600982

  6. Structural brain MRI trait polygenic score prediction of cognitive abilities

    PubMed Central

    Luciano, Michelle; Marioni, Riccardo E; Hernández, Maria Valdés; Maniega, Susana Munoz; Hamilton, Iona F; Royle, Natalie A.; Scotland, Generation; Chauhan, Ganesh; Bis, Joshua C.; Debette, Stephanie; DeCarli, Charles; Fornage, Myriam; Schmidt, Reinhold; Ikram, M. Arfan; Launer, Lenore J.; Seshadri, Sudha; Bastin, Mark E.; Porteous, David J.; Wardlaw, Joanna; Deary, Ian J

    2016-01-01

    Structural brain magnetic resonance imaging (MRI) traits share part of their genetic variance with cognitive traits. Here, we use genetic association results from large meta-analytic studies of genome-wide association for brain infarcts, white matter hyperintensities, intracranial, hippocampal and total brain volumes to estimate polygenic scores for these traits in three Scottish samples: Generation Scotland: Scottish Family Health Study (GS:SFHS), and the Lothian Birth Cohorts of 1936 (LBC1936) and 1921 (LBC1921). These five brain MRI trait polygenic scores were then used to 1) predict corresponding MRI traits in the LBC1936 (numbers ranged 573 to 630 across traits) and 2) predict cognitive traits in all three cohorts (in 8,115 to 8,250 persons). In the LBC1936, all MRI phenotypic traits were correlated with at least one cognitive measure; and polygenic prediction of MRI traits was observed for intracranial volume. Meta-analysis of the correlations between MRI polygenic scores and cognitive traits revealed a significant negative correlation (maximal r=0.08) between the hippocampal volume polygenic score and measures of global cognitive ability collected in childhood and in old age in the Lothian Birth Cohorts. The lack of association to a related general cognitive measure when including the GS:SFHS points to either type 1 error or the importance of using prediction samples that closely match the demographics of the genome-wide association samples from which prediction is based. Ideally, these analyses should be repeated in larger samples with data on both MRI and cognition, and using MRI GWA results from even larger meta-analysis studies. PMID:26427786

  7. Failure prediction of thin beryllium sheets used in spacecraft structures

    NASA Technical Reports Server (NTRS)

    Roschke, Paul N.; Papados, Photios; Mascorro, Edward

    1991-01-01

    In an attempt to predict failure for cross-rolled beryllium sheet structures, high order macroscopic failure criteria are used. These require the knowledge of in-plane uniaxial and shear strengths. Test results are included for in-plane biaxial tension, uniaxial compression for two different material orientations, and shear. All beryllium specimens have the same chemical composition. In addition, all experimental work was performed in a controlled laboratory environment. Numerical simulation complements these tests. A brief bibliography supplements references listed in a previous report.

  8. The sequential structure of brain activation predicts skill.

    PubMed

    Anderson, John R; Bothell, Daniel; Fincham, Jon M; Moon, Jungaa

    2016-01-29

    In an fMRI study, participants were trained to play a complex video game. They were scanned early and then again after substantial practice. While better players showed greater activation in one region (right dorsal striatum) their relative skill was better diagnosed by considering the sequential structure of whole brain activation. Using a cognitive model that played this game, we extracted a characterization of the mental states that are involved in playing a game and the statistical structure of the transitions among these states. There was a strong correspondence between this measure of sequential structure and the skill of different players. Using multi-voxel pattern analysis, it was possible to recognize, with relatively high accuracy, the cognitive states participants were in during particular scans. We used the sequential structure of these activation-recognized states to predict the skill of individual players. These findings indicate that important features about information-processing strategies can be identified from a model-based analysis of the sequential structure of brain activation. PMID:26707716

  9. Gene function prediction based on the Gene Ontology hierarchical structure.

    PubMed

    Cheng, Liangxi; Lin, Hongfei; Hu, Yuncui; Wang, Jian; Yang, Zhihao

    2014-01-01

    The information of the Gene Ontology annotation is helpful in the explanation of life science phenomena, and can provide great support for the research of the biomedical field. The use of the Gene Ontology is gradually affecting the way people store and understand bioinformatic data. To facilitate the prediction of gene functions with the aid of text mining methods and existing resources, we transform it into a multi-label top-down classification problem and develop a method that uses the hierarchical relationships in the Gene Ontology structure to relieve the quantitative imbalance of positive and negative training samples. Meanwhile the method enhances the discriminating ability of classifiers by retaining and highlighting the key training samples. Additionally, the top-down classifier based on a tree structure takes the relationship of target classes into consideration and thus solves the incompatibility between the classification results and the Gene Ontology structure. Our experiment on the Gene Ontology annotation corpus achieves an F-value performance of 50.7% (precision: 52.7% recall: 48.9%). The experimental results demonstrate that when the size of training set is small, it can be expanded via topological propagation of associated documents between the parent and child nodes in the tree structure. The top-down classification model applies to the set of texts in an ontology structure or with a hierarchical relationship. PMID:25192339
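
    The F-value quoted in this abstract is consistent with the usual harmonic mean of precision and recall. The short check below assumes the balanced F1 definition (beta = 1); the abstract does not state which variant was used, and the function name is illustrative.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """F-beta score; with beta = 1 this is the harmonic mean of precision and recall."""
    if precision <= 0 or recall <= 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Values reported in the abstract, in percent.
print(round(f_measure(52.7, 48.9), 1))  # 50.7
```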

  10. Gene Function Prediction Based on the Gene Ontology Hierarchical Structure

    PubMed Central

    Cheng, Liangxi; Lin, Hongfei; Hu, Yuncui; Wang, Jian; Yang, Zhihao

    2014-01-01

    The information of the Gene Ontology annotation is helpful in the explanation of life science phenomena, and can provide great support for the research of the biomedical field. The use of the Gene Ontology is gradually affecting the way people store and understand bioinformatic data. To facilitate the prediction of gene functions with the aid of text mining methods and existing resources, we transform it into a multi-label top-down classification problem and develop a method that uses the hierarchical relationships in the Gene Ontology structure to relieve the quantitative imbalance of positive and negative training samples. Meanwhile the method enhances the discriminating ability of classifiers by retaining and highlighting the key training samples. Additionally, the top-down classifier based on a tree structure takes the relationship of target classes into consideration and thus solves the incompatibility between the classification results and the Gene Ontology structure. Our experiment on the Gene Ontology annotation corpus achieves an F-value performance of 50.7% (precision: 52.7% recall: 48.9%). The experimental results demonstrate that when the size of training set is small, it can be expanded via topological propagation of associated documents between the parent and child nodes in the tree structure. The top-down classification model applies to the set of texts in an ontology structure or with a hierarchical relationship. PMID:25192339

  11. Unbiased charge oscillations in B-DNA: monomer polymers and dimer polymers.

    PubMed

    Lambropoulos, K; Chatzieleftheriou, M; Morphis, A; Kaklamanis, K; Theodorakou, M; Simserides, C

    2015-09-01

    We call monomer a B-DNA base pair and examine, analytically and numerically, electron or hole oscillations in monomer and dimer polymers, i.e., periodic sequences with repetition unit made of one or two monomers. We employ a tight-binding (TB) approach at the base-pair level to readily determine the spatiotemporal evolution of a single extra carrier along a N base-pair B-DNA segment. We study highest occupied molecular orbital and lowest unoccupied molecular orbital eigenspectra as well as the mean over time probabilities to find the carrier at a particular monomer. We use the pure mean transfer rate k to evaluate the easiness of charge transfer. The inverse decay length β for exponential fits k(d), where d is the charge transfer distance, and the exponent η for power-law fits k(N) are computed; generally power-law fits are better. We illustrate that increasing the number of different parameters involved in the TB description, the fall of k(d) or k(N) becomes steeper and show the range covered by β and η. Finally, for both the time-independent and the time-dependent problems, we analyze the palindromicity and the degree of eigenspectrum dependence of the probabilities to find the carrier at a particular monomer. PMID:26465516
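
    For the smallest case, two coupled base pairs, the carrier oscillation has a closed form that can be evaluated directly. The sketch below uses the textbook two-level result, not the authors' N base-pair code: t is the hopping integral, delta the difference of on-site energies, and all numerical inputs are placeholders supplied by the caller.

```python
import math

HBAR_EV_S = 6.582119569e-16  # reduced Planck constant in eV*s


def _gap(t_ev: float, delta_ev: float) -> float:
    """Splitting between the two eigenenergies of the two-site Hamiltonian."""
    return math.sqrt(delta_ev ** 2 + 4.0 * t_ev ** 2)


def transfer_probability(t_ev: float, delta_ev: float, time_s: float) -> float:
    """Probability of finding the carrier on site 2 at time_s after starting on site 1."""
    gap = _gap(t_ev, delta_ev)
    if gap == 0.0:
        return 0.0
    amplitude = 4.0 * t_ev ** 2 / gap ** 2
    return amplitude * math.sin(gap * time_s / (2.0 * HBAR_EV_S)) ** 2


def oscillation_period(t_ev: float, delta_ev: float) -> float:
    """Period of the probability oscillation, h divided by the eigenenergy gap."""
    return 2.0 * math.pi * HBAR_EV_S / _gap(t_ev, delta_ev)


def mean_probability_site2(t_ev: float, delta_ev: float) -> float:
    """Time-averaged probability on site 2 (half the oscillation amplitude)."""
    gap = _gap(t_ev, delta_ev)
    return 0.0 if gap == 0.0 else 2.0 * t_ev ** 2 / gap ** 2
```

    With equal on-site energies (delta = 0) the carrier is transferred completely every half period and the mean probability is 1/2 on each site; a nonzero delta reduces the amplitude and shortens the period.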

  12. Unbiased charge oscillations in B-DNA: Monomer polymers and dimer polymers

    NASA Astrophysics Data System (ADS)

    Lambropoulos, K.; Chatzieleftheriou, M.; Morphis, A.; Kaklamanis, K.; Theodorakou, M.; Simserides, C.

    2015-09-01

    We call monomer a B-DNA base pair and examine, analytically and numerically, electron or hole oscillations in monomer and dimer polymers, i.e., periodic sequences with repetition unit made of one or two monomers. We employ a tight-binding (TB) approach at the base-pair level to readily determine the spatiotemporal evolution of a single extra carrier along a N base-pair B-DNA segment. We study highest occupied molecular orbital and lowest unoccupied molecular orbital eigenspectra as well as the mean over time probabilities to find the carrier at a particular monomer. We use the pure mean transfer rate k to evaluate the easiness of charge transfer. The inverse decay length β for exponential fits k(d), where d is the charge transfer distance, and the exponent η for power-law fits k(N) are computed; generally power-law fits are better. We illustrate that increasing the number of different parameters involved in the TB description, the fall of k(d) or k(N) becomes steeper and show the range covered by β and η. Finally, for both the time-independent and the time-dependent problems, we analyze the palindromicity and the degree of eigenspectrum dependence of the probabilities to find the carrier at a particular monomer.

  13. Predicting fracture in micron-scale polycrystalline silicon MEMS structures.

    SciTech Connect

    Hazra, Siddharth S.; de Boer, Maarten Pieter; Boyce, Brad Lee; Ohlhausen, James Anthony; Foulk, James W., III; Reedy, Earl David, Jr.

    2010-09-01

    Designing reliable MEMS structures presents numerous challenges. Polycrystalline silicon fractures in a brittle manner with considerable variability in measured strength. Furthermore, it is not clear how to use a measured tensile strength distribution to predict the strength of a complex MEMS structure. To address such issues, two recently developed high throughput MEMS tensile test techniques have been used to measure strength distribution tails. The measured tensile strength distributions enable the definition of a threshold strength as well as an inferred maximum flaw size. The nature of strength-controlling flaws has been identified and sources of the observed variation in strength investigated. A double edge-notched specimen geometry was also tested to study the effect of a severe, micron-scale stress concentration on the measured strength distribution. Strength-based, Weibull-based, and fracture mechanics-based failure analyses were performed and compared with the experimental results.
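
    A threshold strength of the kind mentioned in this abstract is commonly expressed with a three-parameter Weibull distribution. The sketch below assumes that standard form; the report's actual fitting procedure and parameter values are not given in the abstract, so the function name and arguments are illustrative only.

```python
import math


def weibull_failure_probability(stress: float, threshold: float,
                                scale: float, modulus: float) -> float:
    """Three-parameter Weibull failure probability.

    Below the threshold strength no failure is predicted; above it the
    probability rises as 1 - exp(-((stress - threshold) / scale) ** modulus).
    All stresses must share one unit (for example GPa).
    """
    if scale <= 0 or modulus <= 0:
        raise ValueError("scale and modulus must be positive")
    if stress <= threshold:
        return 0.0
    return 1.0 - math.exp(-(((stress - threshold) / scale) ** modulus))
```

    Setting the threshold to zero recovers the more familiar two-parameter Weibull form.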

  14. Methods for evaluating the predictive accuracy of structural dynamic models

    NASA Technical Reports Server (NTRS)

    Hasselman, T. K.; Chrostowski, Jon D.

    1990-01-01

    Uncertainty of frequency response using the fuzzy set method and on-orbit response prediction using laboratory test data to refine an analytical model are emphasized with respect to large space structures. Two aspects of the fuzzy set approach were investigated relative to its application to large structural dynamics problems: (1) minimizing the number of parameters involved in computing possible intervals; and (2) the treatment of extrema which may occur in the parameter space enclosed by all possible combinations of the important parameters of the model. Extensive printer graphics were added to the SSID code to help facilitate model verification, and an application of this code to the LaRC Ten Bay Truss is included in the appendix to illustrate this graphics capability.

  15. Factors Influencing Progressive Failure Analysis Predictions for Laminated Composite Structure

    NASA Technical Reports Server (NTRS)

    Knight, Norman F., Jr.

    2008-01-01

    Progressive failure material modeling methods used for structural analysis including failure initiation and material degradation are presented. Different failure initiation criteria and material degradation models are described that define progressive failure formulations. These progressive failure formulations are implemented in a user-defined material model for use with a nonlinear finite element analysis tool. The failure initiation criteria include the maximum stress criteria, maximum strain criteria, the Tsai-Wu failure polynomial, and the Hashin criteria. The material degradation model is based on the ply-discounting approach where the local material constitutive coefficients are degraded. Applications and extensions of the progressive failure analysis material model address two-dimensional plate and shell finite elements and three-dimensional solid finite elements. Implementation details are described in the present paper. Parametric studies for laminated composite structures are discussed to illustrate the features of the progressive failure modeling methods that have been implemented and to demonstrate their influence on progressive failure analysis predictions.
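
    Two of the failure initiation criteria named in this abstract, maximum stress and Tsai-Wu, can be written compactly for a single ply in plane stress. The sketch below is not the paper's user-defined material model: the class and function names are hypothetical, compressive strengths are entered as positive magnitudes, and the interaction coefficient F12 uses the common assumption of -0.5*sqrt(F11*F22).

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class PlyStrengths:
    xt: float  # longitudinal tensile strength
    xc: float  # longitudinal compressive strength (positive magnitude)
    yt: float  # transverse tensile strength
    yc: float  # transverse compressive strength (positive magnitude)
    s: float   # in-plane shear strength


def max_stress_index(s1: float, s2: float, t12: float, p: PlyStrengths) -> float:
    """Largest stress-to-strength ratio; failure initiates when it reaches 1."""
    r1 = s1 / p.xt if s1 >= 0 else -s1 / p.xc
    r2 = s2 / p.yt if s2 >= 0 else -s2 / p.yc
    return max(r1, r2, abs(t12) / p.s)


def tsai_wu_index(s1: float, s2: float, t12: float, p: PlyStrengths) -> float:
    """Tsai-Wu failure polynomial for plane stress; failure initiates at 1."""
    f1 = 1.0 / p.xt - 1.0 / p.xc
    f2 = 1.0 / p.yt - 1.0 / p.yc
    f11 = 1.0 / (p.xt * p.xc)
    f22 = 1.0 / (p.yt * p.yc)
    f66 = 1.0 / p.s ** 2
    f12 = -0.5 * math.sqrt(f11 * f22)  # assumed interaction term
    return (f1 * s1 + f2 * s2 + f11 * s1 ** 2 + f22 * s2 ** 2
            + f66 * t12 ** 2 + 2.0 * f12 * s1 * s2)
```

    In a ply-discounting scheme, a ply whose index reaches 1 would then have its stiffness coefficients reduced before the analysis continues.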

  16. Exploiting homology information in nontemplate based prediction of protein structures.

    PubMed

    Iacoangeli, Alfredo; Marcatili, Paolo; Tramontano, Anna

    2015-10-13

    In this paper we describe a novel strategy for exploring the conformational space of proteins and show that this leads to better models for proteins the structure of which is not amenable to template based methods. Our strategy is based on the assumption that the energy global minimum of homologous proteins must correspond to similar conformations, while the precise profiles of their energy landscape, and consequently the positions of the local minima, are likely to be different. In line with this hypothesis, we apply a replica exchange Monte Carlo simulation protocol that, rather than using different parameters for each parallel simulation, uses the sequences of homologous proteins. We show that our results are competitive with respect to alternative methods, including those producing the best model for each of the analyzed targets in the CASP10 (10th Critical Assessment of techniques for protein Structure Prediction) experiment free modeling category. PMID:26574289

  17. Strain Concentration at Structural Discontinuities and Its Prediction Based on Characteristics of Compliance Change in Structures

    NASA Astrophysics Data System (ADS)

    Kasahara, Naoto

    Elevated temperature structural design codes pay attention to strain concentration at structural discontinuities due to creep and plasticity, since it causes an increase in creep-fatigue damage of materials. One of the difficulties in predicting strain concentration is its dependence on the magnitude of loading, the constitutive equations, and the duration of loading. In this study, the author investigated the fundamental mechanism of strain concentration and its main factors. The results revealed that strain concentration is caused by strain redistribution between elastic and inelastic regions, which can be quantified by the characteristics of structural compliance. The characteristics of structural compliance are controlled by elastic region in structures and are insensitive to constitutive equations. It means that inelastic analysis can be easily applied to obtain compliance characteristics. By utilizing this fact, a simplified inelastic analysis method was proposed based on the characteristics of compliance change for the prediction of strain concentration.

  18. Prediction of Alzheimer's disease using individual structural connectivity networks

    PubMed Central

    Shao, Junming; Myers, Nicholas; Yang, Qinli; Feng, Jing; Plant, Claudia; Böhm, Christian; Förstl, Hans; Kurz, Alexander; Zimmer, Claus; Meng, Chun; Riedl, Valentin; Wohlschläger, Afra; Sorg, Christian

    2012-01-01

    Alzheimer's disease (AD) progressively degrades the brain's gray and white matter. Changes in white matter reflect changes in the brain's structural connectivity pattern. Here, we established individual structural connectivity networks (ISCNs) to distinguish predementia and dementia AD from healthy aging in individual scans. Diffusion tractography was used to construct ISCNs with a fully automated procedure for 21 healthy control subjects (HC), 23 patients with mild cognitive impairment and conversion to AD dementia within 3 years (AD-MCI), and 17 patients with mild AD dementia. Three typical pattern classifiers were used for AD prediction. Patients with AD and AD-MCI were separated from HC with accuracies greater than 95% and 90%, respectively, irrespective of prediction approach and specific fiber properties. Most informative connections involved medial prefrontal, posterior parietal, and insular cortex. Patients with mild AD were separated from those with AD-MCI with an accuracy of approximately 85%. Our finding provides evidence that ISCNs are sensitive to the impact of earliest stages of AD. ISCNs may be useful as a white matter-based imaging biomarker to distinguish healthy aging from AD. PMID:22405045

  19. Structure-Based Predictive model for Coal Char Combustion.

    SciTech Connect

    Hurt, R.; Colo, J; Essenhigh, R.; Hadad, C; Stanley, E.

    1997-09-24

    During the third quarter of this project, progress was made on both major technical tasks. Progress was made in the chemistry department at OSU on the calculation of thermodynamic properties for a number of model organic compounds. Modelling work was carried out at Brown to adapt a thermodynamic model of carbonaceous mesophase formation, originally applied to pitch carbonization, to the prediction of coke texture in coal combustion. This latter work makes use of the FG-DVC model of coal pyrolysis developed by Advanced Fuel Research to specify the pool of aromatic clusters that participate in the order/disorder transition. This modelling approach shows promise for the mechanistic prediction of the rank dependence of char structure and will therefore be pursued further. Crystalline ordering phenomena were also observed in a model char prepared from phenol-formaldehyde carbonized at 900°C and 1300°C using high-resolution TEM fringe imaging. Dramatic changes occur in the structure between 900 and 1300°C, making this char a suitable candidate for upcoming in situ work on the hot stage TEM. Work also proceeded on molecular dynamics simulations at Boston University and on equipment modification and testing for the combustion experiments with widely varying flame types at Ohio State.

  20. Hybrid Global Optimization Algorithms for Protein Structure Prediction: Alternating Hybrids

    PubMed Central

    Klepeis, J. L.; Pieja, M. J.; Floudas, C. A.

    2003-01-01

    Hybrid global optimization methods attempt to combine the beneficial features of two or more algorithms, and can be powerful methods for solving challenging nonconvex optimization problems. In this paper, novel classes of hybrid global optimization methods, termed alternating hybrids, are introduced for application as a tool in treating the peptide and protein structure prediction problems. In particular, these new optimization methods take the form of hybrids between a deterministic global optimization algorithm, the αBB, and a stochastically based method, conformational space annealing (CSA). The αBB method, as a theoretically proven global optimization approach, exhibits consistency, as it guarantees convergence to the global minimum for twice-continuously differentiable constrained nonlinear programming problems, but can benefit from computationally related enhancements. On the other hand, the independent CSA algorithm is highly efficient, though the method lacks theoretical guarantees of convergence. Furthermore, both the αBB method and the CSA method are found to identify ensembles of low-energy conformers, an important feature for determining the true free energy minimum of the system. The proposed hybrid methods combine the desirable features of efficiency and consistency, thus enabling the accurate prediction of the structures of larger peptides. Computational studies for met-enkephalin and melittin, employing sequential and parallel computing frameworks, demonstrate the promise for these proposed hybrid methods. PMID:12547770

  1. Prediction of Alzheimer's disease using individual structural connectivity networks.

    PubMed

    Shao, Junming; Myers, Nicholas; Yang, Qinli; Feng, Jing; Plant, Claudia; Böhm, Christian; Förstl, Hans; Kurz, Alexander; Zimmer, Claus; Meng, Chun; Riedl, Valentin; Wohlschläger, Afra; Sorg, Christian

    2012-12-01

    Alzheimer's disease (AD) progressively degrades the brain's gray and white matter. Changes in white matter reflect changes in the brain's structural connectivity pattern. Here, we established individual structural connectivity networks (ISCNs) to distinguish predementia and dementia AD from healthy aging in individual scans. Diffusion tractography was used to construct ISCNs with a fully automated procedure for 21 healthy control subjects (HC), 23 patients with mild cognitive impairment and conversion to AD dementia within 3 years (AD-MCI), and 17 patients with mild AD dementia. Three typical pattern classifiers were used for AD prediction. Patients with AD and AD-MCI were separated from HC with accuracies greater than 95% and 90%, respectively, irrespective of prediction approach and specific fiber properties. Most informative connections involved medial prefrontal, posterior parietal, and insular cortex. Patients with mild AD were separated from those with AD-MCI with an accuracy of approximately 85%. Our finding provides evidence that ISCNs are sensitive to the impact of earliest stages of AD. ISCNs may be useful as a white matter-based imaging biomarker to distinguish healthy aging from AD. PMID:22405045

  2. Predicting the genotoxicity of thiophene derivatives from molecular structure.

    PubMed

    Mosier, Philip D; Jurs, Peter C; Custer, Laura L; Durham, Stephen K; Pearl, Greg M

    2003-06-01

    We report several binary classification models that directly link the genetic toxicity of a series of 140 thiophene derivatives with information derived from the compounds' molecular structure. Genetic toxicity was measured using an SOS Chromotest. IMAX (maximal SOS induction factor) values were recorded for each of the 140 compounds both in the presence and in the absence of S9 rat liver homogenate. Compounds were classified as genotoxic if IMAX ≥ 1.5 in either test or nongenotoxic if IMAX < 1.5 for both tests. The molecular structures were represented by numerical descriptors that encoded the topological, geometric, electronic, and polar surface area properties of the thiophene derivatives. The classification models used were linear discriminant analysis (LDA), k-nearest neighbor classification (k-NN), and the probabilistic neural network (PNN). These were used in conjunction with either a genetic algorithm or a generalized simulated annealing to find optimal subsets of descriptors for each classifier. The quality of the resulting models was determined by the number of misclassified compounds, with preference given to models that produced fewer false negative classifications. Model sizes ranged from seven descriptors for LDA to three descriptors for k-NN and PNN. Very good classification results were obtained with all three classifiers. Classification rates for the LDA, k-NN, and PNN models were 80, 85, and 85%, respectively, for the prediction set compounds. Additionally, a consensus model was generated that incorporated all three of the basic model types. This consensus model correctly predicted the genotoxicity of 95% of the prediction set compounds. PMID:12807355
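
    The labeling rule stated in this abstract, together with one simple way of combining three classifiers, can be sketched as follows. The threshold of 1.5 comes from the abstract; the majority vote is only an assumption about how a consensus of LDA, k-NN, and PNN predictions might be formed, since the abstract does not describe the combination rule, and the function names are illustrative.

```python
from typing import Sequence

IMAX_THRESHOLD = 1.5  # from the abstract: genotoxic if IMAX >= 1.5 in either test


def is_genotoxic(imax_with_s9: float, imax_without_s9: float) -> bool:
    """Label a compound genotoxic if either SOS Chromotest result reaches the threshold."""
    return imax_with_s9 >= IMAX_THRESHOLD or imax_without_s9 >= IMAX_THRESHOLD


def majority_vote(predictions: Sequence[bool]) -> bool:
    """Assumed consensus rule: genotoxic when more than half of the models say so."""
    if not predictions:
        raise ValueError("at least one prediction is required")
    return sum(predictions) > len(predictions) / 2
```

    Because the study prefers models with fewer false negatives, a stricter alternative would flag a compound as genotoxic when any single model does.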

  3. Protein structure prediction with local adjust tabu search algorithm

    PubMed Central

    2014-01-01

    Background: Protein folding structure prediction is one of the most challenging problems in the bioinformatics domain. Because of the complexity of realistic protein structures, a simplified structure model and a computational method must be adopted in the research. The AB off-lattice model is one such simplified model, which considers only two classes of amino acids, hydrophobic (A) residues and hydrophilic (B) residues. Results: The main work of this paper is to discuss how to optimize the lowest energy configurations in the 2D and 3D off-lattice models by using Fibonacci sequences and real protein sequences. In order to avoid falling into local minima and to converge faster to the global minimum, we introduce a novel method (SATS) to the protein structure problem, which combines the simulated annealing algorithm and the tabu search algorithm. Various strategies, such as the new encoding strategy, the adaptive neighborhood generation strategy and the local adjustment strategy, are adopted successfully for high-speed searching of the optimal conformation corresponding to the lowest energy of the protein sequences. Experimental results show that some of the results obtained by the improved SATS are better than those reported in the previous literature, and we can be sure that the lowest energy folding states for short Fibonacci sequences have been found. Conclusions: Although the off-lattice models are not very realistic, they can reflect some important characteristics of realistic proteins. It can be found that the 3D off-lattice model is more like the native folding structure of a realistic protein than the 2D off-lattice model. In addition, compared with some previous research, the proposed hybrid algorithm can more effectively and more quickly search the spatial folding structure of a protein chain. PMID:25474708
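
    The benchmark setting described in this abstract can be illustrated with the Fibonacci sequences and the 2D AB off-lattice energy in its usual Stillinger form. This sketch covers only the model, not the SATS search itself; it assumes unit bond lengths, the recursion S0 = A, S1 = B, S(i) = S(i-2) + S(i-1), and interaction coefficients of 1 (AA), 0.5 (BB), and -0.5 (AB). Function names are hypothetical.

```python
import math
from typing import Sequence, Tuple


def fibonacci_sequence(n: int) -> str:
    """Return the n-th Fibonacci AB string: S0 = 'A', S1 = 'B', S(i) = S(i-2) + S(i-1)."""
    if n < 0:
        raise ValueError("n must be non-negative")
    prev, cur = "A", "B"
    if n == 0:
        return prev
    for _ in range(n - 1):
        prev, cur = cur, prev + cur
    return cur


def _coefficient(a: str, b: str) -> float:
    """Pair coefficient: AA attracts strongly, BB weakly, AB is weakly repulsive."""
    if a == "A" and b == "A":
        return 1.0
    if a == "B" and b == "B":
        return 0.5
    return -0.5


def ab_energy_2d(seq: str, coords: Sequence[Tuple[float, float]]) -> float:
    """Bending plus Lennard-Jones-like energy of a 2D chain with unit bonds."""
    n = len(seq)
    if n != len(coords):
        raise ValueError("sequence and coordinates must have equal length")
    energy = 0.0
    # Bending term: (1/4) * (1 - cos(theta)) at every interior residue.
    for i in range(1, n - 1):
        ux, uy = coords[i][0] - coords[i - 1][0], coords[i][1] - coords[i - 1][1]
        vx, vy = coords[i + 1][0] - coords[i][0], coords[i + 1][1] - coords[i][1]
        energy += 0.25 * (1.0 - (ux * vx + uy * vy))
    # Non-bonded term over residue pairs that are not chain neighbors.
    for i in range(n - 2):
        for j in range(i + 2, n):
            r = math.dist(coords[i], coords[j])
            energy += 4.0 * (r ** -12 - _coefficient(seq[i], seq[j]) * r ** -6)
    return energy
```

    A search method such as simulated annealing or tabu search would repeatedly perturb the bond angles, rebuild the coordinates, and evaluate this energy to decide which conformations to keep.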

  4. Interaction of Iron II Complexes with B-DNA. Insights from Molecular Modeling, Spectroscopy, and Cellular Biology

    PubMed Central

    Gattuso, Hugo; Duchanois, Thibaut; Besancenot, Vanessa; Barbieux, Claire; Assfeld, Xavier; Becuwe, Philippe; Gros, Philippe C.; Grandemange, Stephanie; Monari, Antonio

    2015-01-01

    We report the characterization of the interaction between B-DNA and three terpyridin iron II complexes. Relatively long time-scale molecular dynamics (MD) is used in order to characterize the stable interaction modes. By means of molecular modeling and UV-vis spectroscopy, we prove that they may lead to stable interactions with the DNA duplex. Furthermore, the presence of larger π-conjugated moieties also leads to the appearance of intercalation binding mode. Non-covalent stabilizing interactions between the iron complexes and the DNA are also characterized and evidenced by the analysis of the gradient of the electronic density. Finally, the structural deformations induced on the DNA in the different binding modes are also evidenced. The synthesis and chemical characterization of the three complexes is reported, as well as their absorption spectra in presence of DNA duplexes to prove the interaction with DNA. Finally, their effects on human cell cultures have also been evidenced to further enlighten their biological effects. PMID:26734600

  5. Crystal structure prediction from first principles: The crystal structures of glycine

    NASA Astrophysics Data System (ADS)

    Lund, Albert M.; Pagola, Gabriel I.; Orendt, Anita M.; Ferraro, Marta B.; Facelli, Julio C.

    2015-04-01

    Here we present the results of our unbiased searches of glycine polymorphs obtained using the genetic algorithms search implemented in MGAC, modified genetic algorithm for crystals, coupled with the local optimization and energy evaluation provided by Quantum Espresso. We demonstrate that it is possible to predict the crystal structures of a biomedical molecule using solely first principles calculations. We were able to find all the ambient pressure stable glycine polymorphs, which are found in the same energetic ordering as observed experimentally and the agreement between the experimental and predicted structures is of such accuracy that the two are visually almost indistinguishable.

  6. Crystal Structure Prediction from First Principles: The Crystal Structures of Glycine

    PubMed Central

    Lund, Albert M.; Pagola, Gabriel I.; Orendt, Anita M.; Ferraro, Marta B.; Facelli, Julio C.

    2015-01-01

    Here we present the results of our unbiased searches of glycine polymorphs obtained using the Genetic Algorithms search implemented in Modified Genetic Algorithm for Crystals coupled with the local optimization and energy evaluation provided by Quantum Espresso. We demonstrate that it is possible to predict the crystal structures of a biomedical molecule using solely first principles calculations. We were able to find all the ambient pressure stable glycine polymorphs, which are found in the same energetic ordering as observed experimentally and the agreement between the experimental and predicted structures is of such accuracy that the two are visually almost indistinguishable. PMID:25843964

  7. How evolutionary crystal structure prediction works--and why.

    PubMed

    Oganov, Artem R; Lyakhov, Andriy O; Valle, Mario

    2011-03-15

    Once the crystal structure of a chemical substance is known, many properties can be predicted reliably and routinely. Therefore if researchers could predict the crystal structure of a material before it is synthesized, they could significantly accelerate the discovery of new materials. In addition, the ability to predict crystal structures at arbitrary conditions of pressure and temperature is invaluable for the study of matter at extreme conditions, where experiments are difficult. Crystal structure prediction (CSP), the problem of finding the most stable arrangement of atoms given only the chemical composition, has long remained a major unsolved scientific problem. Two problems are entangled here: search, the efficient exploration of the multidimensional energy landscape, and ranking, the correct calculation of relative energies. For organic crystals, which contain a few molecules in the unit cell, search can be quite simple as long as a researcher does not need to include many possible isomers or conformations of the molecules; therefore ranking becomes the main challenge. For inorganic crystals, quantum mechanical methods often provide correct relative energies, making search the most critical problem. Recent developments provide useful practical methods for solving the search problem to a considerable extent. One can use simulated annealing, metadynamics, random sampling, basin hopping, minima hopping, and data mining. Genetic algorithms have been applied to crystals since 1995, but with limited success, which necessitated the development of a very different evolutionary algorithm. This Account reviews CSP using one of the major techniques, the hybrid evolutionary algorithm USPEX (Universal Structure Predictor: Evolutionary Xtallography). Using recent developments in the theory of energy landscapes, we unravel the reasons evolutionary techniques work for CSP and point out their limitations. 
We demonstrate that the energy landscapes of chemical systems have an overall shape and explore their intrinsic dimensionalities. Because of the inverse relationships between order and energy and between the dimensionality and diversity of an ensemble of crystal structures, the chances that a random search will find the ground state decrease exponentially with increasing system size. A well-designed evolutionary algorithm allows for much greater computational efficiency. We illustrate the power of evolutionary CSP through applications that examine matter at high pressure, where new, unexpected phenomena take place. Evolutionary CSP has allowed researchers to make unexpected discoveries such as a transparent phase of sodium, a partially ionic form of boron, complex superconducting forms of calcium, a novel superhard allotrope of carbon, polymeric modifications of nitrogen, and a new class of compounds, perhydrides. These methods have also led to the discovery of novel hydride superconductors including the "impossible" LiH(n) (n=2, 6, 8) compounds, and CaLi(2). We discuss extensions of the method to molecular crystals, systems of variable composition, and the targeted optimization of specific physical properties. PMID:21361336

  8. Development of advanced structural analysis methodologies for predicting widespread fatigue damage in aircraft structures

    NASA Technical Reports Server (NTRS)

    Harris, Charles E.; Starnes, James H., Jr.; Newman, James C., Jr.

    1995-01-01

    NASA is developing a 'tool box' that includes a number of advanced structural analysis computer codes which, taken together, represent the comprehensive fracture mechanics capability required to predict the onset of widespread fatigue damage. These structural analysis tools have complementary and specialized capabilities ranging from a finite-element-based stress-analysis code for two- and three-dimensional built-up structures with cracks to a fatigue and fracture analysis code that uses stress-intensity factors and material-property data found in 'look-up' tables or from equations. NASA is conducting critical experiments necessary to verify the predictive capabilities of the codes, and these tests represent a first step in the technology-validation and industry-acceptance processes. NASA has established cooperative programs with aircraft manufacturers to facilitate the comprehensive transfer of this technology by making these advanced structural analysis codes available to industry.

  9. The extended evolutionary synthesis: its structure, assumptions and predictions

    PubMed Central

    Laland, Kevin N.; Uller, Tobias; Feldman, Marcus W.; Sterelny, Kim; Müller, Gerd B.; Moczek, Armin; Jablonka, Eva; Odling-Smee, John

    2015-01-01

    Scientific activities take place within the structured sets of ideas and assumptions that define a field and its practices. The conceptual framework of evolutionary biology emerged with the Modern Synthesis in the early twentieth century and has since expanded into a highly successful research program to explore the processes of diversification and adaptation. Nonetheless, the ability of that framework satisfactorily to accommodate the rapid advances in developmental biology, genomics and ecology has been questioned. We review some of these arguments, focusing on literatures (evo-devo, developmental plasticity, inclusive inheritance and niche construction) whose implications for evolution can be interpreted in two ways—one that preserves the internal structure of contemporary evolutionary theory and one that points towards an alternative conceptual framework. The latter, which we label the ‘extended evolutionary synthesis' (EES), retains the fundaments of evolutionary theory, but differs in its emphasis on the role of constructive processes in development and evolution, and reciprocal portrayals of causation. In the EES, developmental processes, operating through developmental bias, inclusive inheritance and niche construction, share responsibility for the direction and rate of evolution, the origin of character variation and organism–environment complementarity. We spell out the structure, core assumptions and novel predictions of the EES, and show how it can be deployed to stimulate and advance research in those fields that study or use evolutionary biology. PMID:26246559

  10. The extended evolutionary synthesis: its structure, assumptions and predictions.

    PubMed

    Laland, Kevin N; Uller, Tobias; Feldman, Marcus W; Sterelny, Kim; Müller, Gerd B; Moczek, Armin; Jablonka, Eva; Odling-Smee, John

    2015-08-22

    Scientific activities take place within the structured sets of ideas and assumptions that define a field and its practices. The conceptual framework of evolutionary biology emerged with the Modern Synthesis in the early twentieth century and has since expanded into a highly successful research program to explore the processes of diversification and adaptation. Nonetheless, the ability of that framework satisfactorily to accommodate the rapid advances in developmental biology, genomics and ecology has been questioned. We review some of these arguments, focusing on literatures (evo-devo, developmental plasticity, inclusive inheritance and niche construction) whose implications for evolution can be interpreted in two ways—one that preserves the internal structure of contemporary evolutionary theory and one that points towards an alternative conceptual framework. The latter, which we label the 'extended evolutionary synthesis' (EES), retains the fundaments of evolutionary theory, but differs in its emphasis on the role of constructive processes in development and evolution, and reciprocal portrayals of causation. In the EES, developmental processes, operating through developmental bias, inclusive inheritance and niche construction, share responsibility for the direction and rate of evolution, the origin of character variation and organism-environment complementarity. We spell out the structure, core assumptions and novel predictions of the EES, and show how it can be deployed to stimulate and advance research in those fields that study or use evolutionary biology. PMID:26246559

  11. The experimental search for new predicted binary-alloy structures

    NASA Astrophysics Data System (ADS)

    Erb, K. C.; Richey, Lauren; Lang, Candace; Campbell, Branton; Hart, Gus

    2010-10-01

    Predicting new ordered phases in metallic alloys is a productive line of inquiry because configurational ordering in an alloy can dramatically alter its useful material properties. One is able to infer the existence of an ordered phase in an alloy using first-principles calculated formation enthalpies (G. L. W. Hart, "Where are Nature's missing structures?," Nature Materials 6, 941-945 (2007)). Using this approach, we have been able to identify stable (i.e. lowest energy) orderings in a variety of binary metallic alloys. Many of these phases have been observed experimentally in the past, though others have not. In pursuit of several of the missing structures, we have characterized potential orderings in PtCd, PtPd and PtMo alloys using synchrotron x-ray powder diffraction and symmetry-analysis tools (B. J. Campbell, H. T. Stokes, D. E. Tanner, and D. M. Hatch, "ISODISPLACE: a web-based tool for exploring structural distortions," J. Appl. Cryst. 39, 607-614 (2006)).

  12. Lifetime Reliability Prediction of Ceramic Structures Under Transient Thermomechanical Loads

    NASA Technical Reports Server (NTRS)

    Nemeth, Noel N.; Jadaan, Osama J.; Gyekenyesi, John P.

    2005-01-01

    An analytical methodology is developed to predict the probability of survival (reliability) of ceramic components subjected to harsh thermomechanical loads that can vary with time (transient reliability analysis). This capability enables more accurate prediction of ceramic component integrity against fracture in situations such as turbine startup and shutdown, operational vibrations, atmospheric reentry, or other rapid heating or cooling situations (thermal shock). The transient reliability analysis methodology developed herein incorporates the following features: fast-fracture transient analysis (reliability analysis without slow crack growth, SCG); transient analysis with SCG (reliability analysis with time-dependent damage due to SCG); a computationally efficient algorithm to compute the reliability for components subjected to repeated transient loading (block loading); cyclic fatigue modeling using a combined SCG and Walker fatigue law; proof testing for transient loads; and Weibull and fatigue parameters that are allowed to vary with temperature or time. Component-to-component variation in strength (stochastic strength response) is accounted for with the Weibull distribution, and either the principle of independent action or the Batdorf theory is used to predict the effect of multiaxial stresses on reliability. The reliability analysis can be performed either as a function of the component surface (for surface-distributed flaws) or component volume (for volume-distributed flaws). The transient reliability analysis capability has been added to the NASA CARES/ Life (Ceramic Analysis and Reliability Evaluation of Structures/Life) code. CARES/Life was also updated to interface with commercially available finite element analysis software, such as ANSYS, when used to model the effects of transient load histories. Examples are provided to demonstrate the features of the methodology as implemented in the CARES/Life program.
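
As a rough illustration of the Weibull treatment of strength scatter and of the principle of independent action (PIA) named above, the following sketch evaluates survival probability for a unit volume under uniform stress. It is a simplified textbook form, not the CARES/Life implementation: the code integrates nothing over the component surface or volume, ignores slow crack growth and transient loading, and uses hypothetical function names and parameter values.

```python
import math


def weibull_survival(stress: float, scale: float, modulus: float) -> float:
    """Two-parameter Weibull survival probability for a uniform tensile stress.

    Compressive or zero stress is assumed not to contribute to failure.
    """
    if stress <= 0.0:
        return 1.0
    return math.exp(-((stress / scale) ** modulus))


def pia_survival(principal_stresses: list[float], scale: float, modulus: float) -> float:
    """Principle of independent action: each tensile principal stress acts independently,

    so the individual risks of rupture add in the exponent.
    """
    risk = sum((s / scale) ** modulus for s in principal_stresses if s > 0.0)
    return math.exp(-risk)


if __name__ == "__main__":
    # Hypothetical values: 300 MPa scale parameter, Weibull modulus of 10.
    print(weibull_survival(250.0, 300.0, 10.0))
    print(pia_survival([250.0, 120.0, -40.0], 300.0, 10.0))
```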

  13. Structural Acoustic Prediction and Interior Noise Control Technology

    NASA Technical Reports Server (NTRS)

    Mathur, G. P.; Chin, C. L.; Simpson, M. A.; Lee, J. T.; Palumbo, Daniel L. (Technical Monitor)

    2001-01-01

    This report documents the results of Task 14, "Structural Acoustic Prediction and Interior Noise Control Technology". The task was to evaluate the performance of tuned foam elements (termed Smart Foam) both analytically and experimentally. Results taken from a three-dimensional finite element model of an active, tuned foam element are presented. Measurements of sound absorption and sound transmission loss were taken using the model. These results agree well with published data. Experimental performance data were taken in Boeing's Interior Noise Test Facility where 12 smart foam elements were applied to a 757 sidewall. Several configurations were tested. Noise reductions of 5-10 dB were achieved over the 200-800 Hz bandwidth of the controller. Accelerometers mounted on the panel provided a good reference for the controller. Configurations with far-field error microphones outperformed near-field cases.

  14. Leveraging structure for enzyme function prediction: methods, opportunities and challenges

    PubMed Central

    Jacobson, Matthew P.; Kalyanaraman, Chakrapani; Zhao, Suwen; Tian, Boxue

    2014-01-01

    The rapid growth of the number of protein sequences that can be inferred from sequenced genomes presents challenges for function assignment, as only a small fraction (currently <%) have been experimentally characterized. Bioinformatics tools are commonly used to predict functions of uncharacterized proteins. Recently there has been significant progress in using protein structures as an additional source of information to infer aspects of enzyme function, which is the focus of this review. Successful application of these approaches has led to the identification of novel metabolites, enzyme activities, and biochemical pathways. We discuss opportunities to systematically elucidate protein domains of unknown function, orphan enzyme activities, dead-end metabolites, and pathways in secondary metabolism. PMID:24998033

  15. Optimizing Non-Decomposable Loss Functions in Structured Prediction

    PubMed Central

    Ranjbar, Mani; Lan, Tian; Wang, Yang; Robinovitch, Steven N.; Li, Ze-Nian; Mori, Greg

    2012-01-01

    We develop an algorithm for structured prediction with non-decomposable performance measures. The algorithm learns parameters of Markov random fields and can be applied to multivariate performance measures. Examples include performance measures such as Fβ score (natural language processing), intersection over union (object category segmentation), Precision/Recall at k (search engines) and ROC area (binary classifiers). We attack this optimization problem by approximating the loss function with a piecewise linear function. The loss augmented inference forms a quadratic program (QP), which we solve using LP relaxation. We apply this approach to two tasks: object class-specific segmentation and human action retrieval from videos. We show significant improvement over baseline approaches that either use simple loss functions or simple scoring functions on the PASCAL VOC and H3D Segmentation datasets, and a nursing home action recognition dataset. PMID:22868650
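
Two of the performance measures listed above, Fβ and intersection over union, can be written out to show why they are called non-decomposable: both depend on counts accumulated over the whole prediction, so they cannot be expressed as a sum of independent per-element losses. This is a plain sketch of the measures only, with hypothetical function names; it does not implement the paper's piecewise-linear approximation or the LP relaxation.

```python
def confusion_counts(y_true: list[int], y_pred: list[int]) -> tuple[int, int, int]:
    """Return (true positives, false positives, false negatives) for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, fn


def f_beta(y_true: list[int], y_pred: list[int], beta: float = 1.0) -> float:
    """F-beta score; beta > 1 weights recall more heavily than precision."""
    tp, fp, fn = confusion_counts(y_true, y_pred)
    b2 = beta * beta
    denom = (1.0 + b2) * tp + b2 * fn + fp
    return (1.0 + b2) * tp / denom if denom else 0.0


def intersection_over_union(y_true: list[int], y_pred: list[int]) -> float:
    """IoU of the predicted and true positive sets (e.g. segmentation masks)."""
    tp, fp, fn = confusion_counts(y_true, y_pred)
    union = tp + fp + fn
    return tp / union if union else 0.0


if __name__ == "__main__":
    truth = [1, 1, 0, 0]
    guess = [1, 0, 1, 0]
    print(f_beta(truth, guess), intersection_over_union(truth, guess))
```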

  16. High Precision Prediction of Functional Sites in Protein Structures

    PubMed Central

    Buturovic, Ljubomir; Wong, Mike; Tang, Grace W.; Altman, Russ B.; Petkovic, Dragutin

    2014-01-01

    We address the problem of assigning biological function to solved protein structures. Computational tools play a critical role in identifying potential active sites and informing screening decisions for further lab analysis. A critical parameter in the practical application of computational methods is the precision, or positive predictive value. Precision measures the level of confidence the user should have in a particular computed functional assignment. Low precision annotations lead to futile laboratory investigations and waste scarce research resources. In this paper we describe an advanced version of the protein function annotation system FEATURE, which achieved 99% precision and average recall of 95% across 20 representative functional sites. The system uses a Support Vector Machine classifier operating on the microenvironment of physicochemical features around an amino acid. We also compared performance of our method with state-of-the-art sequence-level annotator Pfam in terms of precision, recall and localization. To our knowledge, no other functional site annotator has been rigorously evaluated against these key criteria. The software and predictive models are incorporated into the WebFEATURE service at http://feature.stanford.edu/wf4.0-beta. PMID:24632601

  17. Automatic measurement of voice onset time using discriminative structured prediction.

    PubMed

    Sonderegger, Morgan; Keshet, Joseph

    2012-12-01

    A discriminative large-margin algorithm for automatic measurement of voice onset time (VOT) is described, considered as a case of predicting structured output from speech. Manually labeled data are used to train a function that takes as input a speech segment of an arbitrary length containing a voiceless stop, and outputs its VOT. The function is explicitly trained to minimize the difference between predicted and manually measured VOT; it operates on a set of acoustic feature functions designed based on spectral and temporal cues used by human VOT annotators. The algorithm is applied to initial voiceless stops from four corpora, representing different types of speech. Using several evaluation methods, the algorithm's performance is near human intertranscriber reliability, and compares favorably with previous work. Furthermore, the algorithm's performance is minimally affected by training and testing on different corpora, and remains essentially constant as the amount of training data is reduced to 50-250 manually labeled examples, demonstrating the method's practical applicability to new datasets. PMID:23231126

  18. Engineering Property Prediction Tools for Tailored Polymer Composite Structures

    SciTech Connect

    Nguyen, Ba Nghiep; Foss, Peter; Wyzgoski, Michael; Trantina, Gerry; Kunc, Vlastimil; Schutte, Carol; Smith, Mark T.

    2009-12-23

    This report summarizes our FY 2009 research activities for the project titled: "Engineering Property Prediction Tools for Tailored Polymer Composite Structures." These activities include (i) the completion of the development of a fiber length attrition model for injection-molded long-fiber thermoplastics (LFTs), (ii) development of a fatigue damage model for LFTs and its implementation in ABAQUS, (iii) development of an impact damage model for LFTs and its implementation in ABAQUS, (iv) development of characterization methods for fatigue testing, (v) characterization of creep and fatigue responses of glass-fiber/polyamide (PA6,6) and glass-fiber/polypropylene (PP), (vi) characterization of fiber length distribution along the flow length of glass/PA6,6 and glass-fiber/PP, and (vii) characterization of impact responses of glass-fiber/PA6,6. The fiber length attrition model accurately captures the fiber length distribution along the flow length of the studied glass-fiber/PP material. The fatigue damage model is able to predict the S-N and stiffness reduction data which are valuable to the fatigue design of LFTs. The impact damage model correctly captures damage accumulation observed in experiments of glass-fiber/PA6,6 plaques. Further work includes validations of these models for representative LFT materials and a complex LFT part.

  19. Structure activity relationships: their function in biological prediction

    SciTech Connect

    Schultz, T.W.

    1982-01-01

    Quantitative structure activity relationships provide a means of ranking or predicting biological effects based on chemical structure. For each compound used to formulate a structure activity model two kinds of quantitative information are required: (1) biological activity and (2) molecular properties. Molecular properties are of three types: (1) molecular shape, (2) physicochemical parameters, and (3) abstract quantitations of molecular structure. Currently the two best descriptors are the hydrophobic parameter, log 1-octanol/water partition coefficient (log P), and the $^1\chi^v$ (one-chi-v) molecular connectivity index. Biological responses can be divided into three main categories: (1) non-specific effects due to membrane perturbation, (2) non-specific effects due to interaction with functional groups of proteins, and (3) specific effects due to interaction with receptors. Twenty-six synthetic fossil fuel-related nitrogen-containing aromatic compounds were examined to determine the quantitative correlation between log P and $^1\chi^v$ and population growth impairment of Tetrahymena pyriformis. Nitro-containing compounds are the most active, followed by amino-containing compounds and azaarenes. Within each analog series activity increases with alkyl substitution and ring addition. The planar model $\log \mathrm{BR} = 0.5564 \log P + 0.3000\,{}^1\chi^v - 2.0138$ was determined using mono-nitrogen substituted compounds. Attempts to extrapolate this model to dinitrogen-containing molecules were, for the most part, unsuccessful because of a change in mode of action from membrane perturbation to uncoupling of oxidative phosphorylation.
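
The planar model quoted in this abstract is a two-descriptor linear equation, so it can be evaluated directly. The sketch below uses the three coefficients exactly as reported; the function name and the descriptor values in the example are hypothetical, and, as the abstract notes, the model was fitted to mono-nitrogen compounds and should not be expected to extrapolate to dinitrogen-containing molecules.

```python
# Coefficients as reported in the abstract for mono-nitrogen substituted compounds.
COEF_LOG_P = 0.5564
COEF_CHI = 0.3000
INTERCEPT = -2.0138


def predict_log_br(log_p: float, chi_1v: float) -> float:
    """Evaluate log BR = 0.5564 log P + 0.3000 (1-chi-v) - 2.0138."""
    return COEF_LOG_P * log_p + COEF_CHI * chi_1v + INTERCEPT


if __name__ == "__main__":
    # Hypothetical descriptor values, chosen only to show the call.
    print(predict_log_br(log_p=2.0, chi_1v=3.0))
```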

  20. How to predict very large and complex crystal structures

    NASA Astrophysics Data System (ADS)

    Lyakhov, Andriy O.; Oganov, Artem R.; Valle, Mario

    2010-09-01

    Evolutionary crystal structure prediction proved to be a powerful approach in discovering new materials. Certain limitations are encountered for systems with a large number of degrees of freedom ("large systems") and complex energy landscapes ("complex systems"). We explore the nature of these limitations and address them with a number of newly developed tools. For large systems a major problem is the lack of diversity: any randomly produced population consists predominantly of high-energy disordered structures, offering virtually no routes toward the ordered ground state. We offer two solutions: first, modified variation operators that favor atoms with higher local order (a function we introduce here), and, second, construction of the first generation non-randomly, using pseudo-subcells with, in general, fractional atomic occupancies. This enhances order and diversity and improves energies of the structures. We introduce an additional variation operator, coordinate mutation, which applies preferentially to low-order ("badly placed") atoms. Biasing other variation operators by local order is also found to produce improved results. One promising version of coordinate mutation, explored here, displaces atoms along the eigenvector of the lowest-frequency vibrational mode. For complex energy landscapes, the key problem is the possible existence of several energy funnels - in this situation it is possible to get trapped in one funnel (not necessarily containing the ground state). To address this problem, we develop an algorithm incorporating the ideas of abstract "distance" between structures. These new ingredients improve the performance of the evolutionary algorithm USPEX, in terms of efficiency and reliability, for large and complex systems.

  1. Protein structure prediction provides comparable performance to crystallographic structures in docking-based virtual screening.

    PubMed

    Du, Hongying; Brender, Jeffrey R; Zhang, Jian; Zhang, Yang

    2015-01-01

    Structure based virtual screening has largely been limited to protein targets for which either an experimental structure is available or a strongly homologous template exists so that a high-resolution model can be constructed. The performance of state of the art protein structure predictions in virtual screening in systems where only weakly homologous templates are available is largely untested. Using the challenging DUD database of structural decoys, we show here that even using templates with only weak sequence homology (<30% sequence identity) structural models can be constructed by I-TASSER which achieve comparable enrichment rates to using the experimental bound crystal structure in the majority of the cases studied. For 65% of the targets, the I-TASSER models, which are constructed essentially in the apo conformations, reached 70% of the virtual screening performance of using the holo-crystal structures. A correlation was observed between the success of I-TASSER in modeling the global fold and local structures in the binding pockets of the proteins versus the relative success in virtual screening. The virtual screening performance can be further improved by the recognition of chemical features of the ligand compounds. These results suggest that the combination of structure-based docking and advanced protein structure modeling methods should be a valuable approach to the large-scale drug screening and discovery studies, especially for the proteins lacking crystallographic structures. PMID:25220914

  2. Structure-Based Predictive model for Coal Char Combustion.

    SciTech Connect

    Hurt, R.; Calo, J.; Essenhigh, R.; Hadad, C.; Stanley, E.

    1997-06-25

    During the second quarter of this project, progress was made on both major technical tasks. Three parallel efforts were initiated on the modeling of carbon structural evolution. Structural ordering during carbonization was studied by a numerical simulation scheme proposed by Alan Kerstein involving molecular weight growth and rotational mobility. Work was also initiated to adapt a model of carbonaceous mesophase formation, originally developed under parallel NSF funding, to the prediction of coke texture. This latter work makes use of the FG-DVC model of coal pyrolysis developed by Advanced Fuel Research to specify the pool of aromatic clusters that participate in the order/disorder transition. Boston University has initiated molecular dynamics simulations of carbonization processes and Ohio State has begun theoretical treatment of surface reactions. Experimental work has also begun on model compound studies at Brown and on pilot-scale combustion systems with widely varying flame types at OSE. The work on mobility / growth models shows great promise and is discussed in detail in the body of the report.

  3. Structure prediction and electronic structure study of pristine and doped cuprous sulfide (Cu2S)

    NASA Astrophysics Data System (ADS)

    Khatri, Prashant; Al-Jassim, Mowafak M.; Huda, Muhammad N.

    2014-03-01

    Cuprous sulfide (Cu2S) is among the materials that have high potential for use in solar cells, but it is highly unstable, mainly due to the formation of Cu vacancies. Because of this instability and the mobile nature of Cu in Cu2S, the material is hard to study, and as a result not much is known about its structural details. A systematic theoretical understanding is necessary to utilize its potential fully in photovoltaic devices. The goal of this study is to predict the most probable, energetically favorable structure for stoichiometric Cu2S, and to find a mechanism to stabilize it against the formation of Cu vacancies. DFT, DFT+U and hybrid-functional DFT have been used in predicting the structure and studying the properties. Many different structures have been considered while performing the calculations. An acanthite-like Cu2S structure has been found to be the most energetically favorable. We have also studied the structures with Cu vacancies. A detailed theoretical analysis of these aspects will be presented. National Renewable Energy Laboratory (NREL).

  4. Prediction and classification of ncRNAs using structural information

    PubMed Central

    2014-01-01

    Background Evidence is accumulating that non-coding transcripts, previously thought to be functionally inert, play important roles in various cellular activities. High throughput techniques like next generation sequencing have resulted in the generation of vast amounts of sequence data. It is therefore desirable, not only to discriminate coding and non-coding transcripts, but also to assign the noncoding RNA (ncRNA) transcripts into respective classes (families). Although there are several algorithms available for this task, their classification performance remains a major concern. Acknowledging the crucial role that non-coding transcripts play in cellular processes, it is required to develop algorithms that are able to precisely classify ncRNA transcripts. Results In this study, we initially develop prediction tools to discriminate coding or non-coding transcripts and thereafter classify ncRNAs into respective classes. In comparison to the existing methods that employed multiple features, our SVM-based method by using a single feature (tri-nucleotide composition), achieved MCC of 0.98. Knowing that the structure of a ncRNA transcript could provide insights into its biological function, we use graph properties of predicted ncRNA structures to classify the transcripts into 18 different non-coding RNA classes. We developed classification models using a variety of algorithms (BayeNet, NaiveBayes, MultilayerPerceptron, IBk, libSVM, SMO and RandomForest) and observed that model based on RandomForest performed better than other models. As compared to the GraPPLE study, the sensitivity (of 13 classes) and specificity (of 14 classes) was higher. Moreover, the overall sensitivity of 0.43 outperforms the sensitivity of GraPPLE (0.33) whereas the overall MCC measure of 0.40 (in contrast to MCC of 0.29 of GraPPLE) was significantly higher for our method. This clearly demonstrates that our models are more accurate than existing models. 
Conclusions This work conclusively demonstrates that a simple feature, tri-nucleotide composition, is sufficient to discriminate between coding and non-coding RNA sequences. Similarly, a feature set based on graph properties, combined with the RandomForest algorithm, is most suitable for classifying different ncRNA classes. We have also developed an online and standalone tool, RNAcon (http://crdd.osdd.net/raghava/rnacon). PMID:24521294
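
    The two quantities named in this abstract, tri-nucleotide composition and the Matthews correlation coefficient (MCC), can be illustrated with a minimal sketch. It is not the RNAcon implementation: the function names, the use of overlapping windows, and the handling of ambiguous bases are assumptions made here for illustration only.

```python
from itertools import product
from math import sqrt

# All 64 possible tri-nucleotides over the RNA alphabet, in a fixed order.
TRIMERS = ["".join(p) for p in product("ACGU", repeat=3)]


def trinucleotide_composition(seq):
    """Return a 64-element frequency vector of overlapping tri-nucleotides.

    Assumption: windows are overlapping (step of 1) and DNA input is
    converted to RNA; the abstract does not state either choice.
    """
    seq = seq.upper().replace("T", "U")
    counts = dict.fromkeys(TRIMERS, 0)
    total = 0
    for i in range(len(seq) - 2):
        kmer = seq[i:i + 3]
        if kmer in counts:  # skip windows with ambiguous bases such as N
            counts[kmer] += 1
            total += 1
    if total == 0:
        return [0.0] * len(TRIMERS)
    return [counts[k] / total for k in TRIMERS]


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from a binary confusion matrix."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    vec = trinucleotide_composition("AUGAUG")
    print(vec[TRIMERS.index("AUG")])  # share of windows equal to AUG
    print(mcc(tp=50, tn=50, fp=0, fn=0))
```

    A vector like this would be the input to a classifier such as the SVM mentioned above; the classifier itself is left out of the sketch.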

  5. Failure prediction of thin beryllium sheets used in spacecraft structures

    NASA Technical Reports Server (NTRS)

    Roschke, Paul N.; Mascorro, Edward; Papados, Photios; Serna, Oscar R.

    1991-01-01

    The primary objective of this study is to develop a method for prediction of failure of thin beryllium sheets that undergo complex states of stress. Major components of the research include experimental evaluation of strength parameters for cross-rolled beryllium sheet, application of the Tsai-Wu failure criterion to plate bending problems, development of a high order failure criterion, application of the new criterion to a variety of structures, and incorporation of both failure criteria into a finite element code. A Tsai-Wu failure model for SR-200 sheet material is developed from available tensile data, experiments carried out by NASA on two circular plates, and compression and off-axis experiments performed in this study. The failure surface obtained from the resulting criterion forms an ellipsoid. By supplementing experimental data used in the two-dimensional criterion and modifying previously suggested failure criteria, a multi-dimensional failure surface is proposed for thin beryllium structures. The new criterion for orthotropic material is represented by a failure surface in six-dimensional stress space. In order to determine coefficients of the governing equation, a number of uniaxial, biaxial, and triaxial experiments are required. These experiments and a complementary ultrasonic investigation are described in detail. Finally, validity of the criterion and newly determined mechanical properties is established through experiments on structures composed of SR-200 sheet material. These experiments include a plate-plug arrangement under a complex state of stress and a series of plates with an out-of-plane central point load. Both criteria have been incorporated into a general purpose finite element analysis code. Numerical simulation incrementally applies loads to a structural component that is being designed and checks each nodal point in the model for exceedance of a failure criterion.
If stresses at all locations do not exceed the failure criterion, the load is increased and the process is repeated. Failure results for the plate-plug and clamped plate tests are accurate to within 2 percent.
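
    The incremental load check described above can be sketched with the standard plane-stress form of the Tsai-Wu criterion. This is an illustration only, not the report's finite element code: the function names are hypothetical, the interaction term F12 uses a common default rather than a measured value, stresses are assumed to scale linearly with the load factor, and the strength values in the usage lines are placeholders, not SR-200 data.

```python
from math import sqrt


def tsai_wu_coefficients(xt, xc, yt, yc, s):
    """Plane-stress Tsai-Wu coefficients from strength magnitudes (all > 0).

    xt/xc: tensile/compressive strength along axis 1, yt/yc: along axis 2,
    s: in-plane shear strength.
    """
    f1 = 1.0 / xt - 1.0 / xc
    f2 = 1.0 / yt - 1.0 / yc
    f11 = 1.0 / (xt * xc)
    f22 = 1.0 / (yt * yc)
    f66 = 1.0 / s ** 2
    # Assumption: common default for the interaction term; in practice it
    # is determined from biaxial tests.
    f12 = -0.5 * sqrt(f11 * f22)
    return f1, f2, f11, f22, f66, f12


def failure_index(stress, coeffs):
    """Tsai-Wu index for (sigma1, sigma2, tau12); failure when index >= 1."""
    s1, s2, t12 = stress
    f1, f2, f11, f22, f66, f12 = coeffs
    return (f1 * s1 + f2 * s2 + f11 * s1 ** 2 + f22 * s2 ** 2
            + f66 * t12 ** 2 + 2.0 * f12 * s1 * s2)


def load_factor_at_failure(unit_stresses, coeffs, step=0.01, max_steps=100000):
    """Raise the load factor in fixed steps until any node reaches the criterion.

    unit_stresses holds the nodal stress states for a load factor of 1.
    Returns the first factor at which some node fails, or None.
    """
    for k in range(1, max_steps + 1):
        factor = k * step
        for stress in unit_stresses:
            scaled = tuple(factor * c for c in stress)
            if failure_index(scaled, coeffs) >= 1.0:
                return factor
    return None


if __name__ == "__main__":
    # Placeholder strengths in consistent units (hypothetical values).
    coeffs = tsai_wu_coefficients(xt=400.0, xc=300.0, yt=350.0, yc=300.0, s=200.0)
    nodes = [(100.0, 0.0, 0.0), (50.0, 20.0, 10.0)]
    print(load_factor_at_failure(nodes, coeffs, step=0.5))
```

    A real analysis would recompute nodal stresses from the finite element solution at each load increment instead of scaling a unit-load result.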

  6. Tertiary structure prediction of RNA-RNA complexes using a secondary structure and fragment-based method.

    PubMed

    Yamasaki, Satoshi; Hirokawa, Takatsugu; Asai, Kiyoshi; Fukui, Kazuhiko

    2014-02-24

    A method has been developed for predicting the tertiary structures of RNA-RNA complex structures using secondary structure information and a fragment assembly algorithm. The linker base pair and secondary structure potential derived from the secondary structure information are particularly useful for prediction. Application of this method to several kinds of RNA-RNA complex structures, including kissing loops, hammerhead ribozymes, and other functional RNAs, produced promising results. Use of the secondary structure potential effectively restrained the conformational search space, leading to successful prediction of kissing loop structures, which mainly consist of common structural elements. The failure to predict more difficult targets had various causes but should be overcome through such measures as tuning the balance of the energy contributions from the Watson-Crick and non-Watson-Crick base pairs, by obtaining knowledge about a wider variety of RNA structures. PMID:24479711

  7. Predictive modeling of pedestal structure in KSTAR using EPED model

    SciTech Connect

    Han, Hyunsun; Kim, J. Y.; Kwon, Ohjin

    2013-10-15

    A predictive calculation is given for the structure of edge pedestal in the H-mode plasma of the KSTAR (Korea Superconducting Tokamak Advanced Research) device using the EPED model. Particularly, the dependence of pedestal width and height on various plasma parameters is studied in detail. The two codes, ELITE and HELENA, are utilized for the stability analysis of the peeling-ballooning and kinetic ballooning modes, respectively. Summarizing the main results, the pedestal slope and height have a strong dependence on plasma current, rapidly increasing with it, while the pedestal width is almost independent of it. The plasma density or collisionality gives initially a mild stabilization, increasing the pedestal slope and height, but above some threshold value its effect turns to a destabilization, reducing the pedestal width and height. Among several plasma shape parameters, the triangularity gives the most dominant effect, rapidly increasing the pedestal width and height, while the effect of elongation and squareness appears to be relatively weak. Implication of these edge results, particularly in relation to the global plasma performance, is discussed.

  8. Chromatin structure predicts epigenetic therapy responsiveness in sarcoma

    PubMed Central

    Mills, Joslyn; Hricik, Todd; Siddiqi, Sara; Matushansky, Igor

    2010-01-01

    To formally explore the potential therapeutic effect of histone deacetylase inhibitors (HDACIs) and DNA-methyltransferase inhibitors (DNA-MIs) on sarcomas, we treated a large sarcoma cell line panel with five different HDACIs in the absence and presence of the DNA-MI decitabine. We observed that the IC50 of each HDACI was consistent for all cell lines while decitabine as a single agent showed no growth inhibition at standard doses. Combination HDACI/DNA-MI therapy showed a preferential synergism for specific sarcoma cell lines. Subsequently we identified and validated (in vitro and in vivo) a two gene set signature (high CUGBP2; low RHOJ) that associated with the synergistic phenotype. We further uncover that the epigenetic synergism leading to specific upregulation of CDKI p21 in specific cell lines is a function of the differences in the degree of baseline chromatin modification. Finally, we show that these chromatin and gene expression patterns are similarly present in the majority of high grade primary sarcomas. Our results provide the first demonstration of a gene set that can predict responsiveness to HDACI/DNA-MI and links this responsiveness mechanistically to the baseline chromatin structure. PMID:21216937

  9. Predictive modeling of pedestal structure in KSTAR using EPED model

    NASA Astrophysics Data System (ADS)

    Han, Hyunsun; Kwon, Ohjin; Kim, J. Y.

    2013-10-01

    A predictive calculation is given for the structure of edge pedestal in the H-mode plasma of the KSTAR (Korea Superconducting Tokamak Advanced Research) device using the EPED model. Particularly, the dependence of pedestal width and height on various plasma parameters is studied in detail. The two codes, ELITE and HELENA, are utilized for the stability analysis of the peeling-ballooning and kinetic ballooning modes, respectively. Summarizing the main results, the pedestal slope and height have a strong dependence on plasma current, rapidly increasing with it, while the pedestal width is almost independent of it. The plasma density or collisionality gives initially a mild stabilization, increasing the pedestal slope and height, but above some threshold value its effect turns to a destabilization, reducing the pedestal width and height. Among several plasma shape parameters, the triangularity gives the most dominant effect, rapidly increasing the pedestal width and height, while the effect of elongation and squareness appears to be relatively weak. Implication of these edge results, particularly in relation to the global plasma performance, is discussed.

  10. Structural kinematics based damage zone prediction in gradient structures using vibration database

    NASA Astrophysics Data System (ADS)

    Talha, Mohammad; Ashokkumar, Chimpalthradi R.

    2014-05-01

    To explore the applications of functionally graded materials (FGMs) in dynamic structures, structural kinematics based health monitoring technique becomes an important problem. Depending upon the displacements in three dimensions, the health of the material to withstand dynamic loads is inferred in this paper, which is based on the net compressive and tensile displacements that each structural degree of freedom takes. These net displacements at each finite element node predict damage zones of the FGM where the material is likely to fail due to a vibration response which is categorized according to loading condition. The damage zone prediction of a dynamically active FGM plate has been accomplished using Reddy's higher-order theory. The constituent material properties are assumed to vary in the thickness direction according to the power-law behavior. The proposed C0 finite element model (FEM) is applied to get net tensile and compressive displacement distributions across the structures. A plate made of Aluminum/Zirconia is considered to illustrate the concept of structural kinematics-based health monitoring aspects of FGMs.
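
    The abstract does not give the power-law expression itself. A form commonly used in the FGM literature is shown below as an assumption, with symbols chosen here: P is any effective property, the subscripts c and m denote the ceramic and metal constituents, h is the plate thickness, z is the thickness coordinate measured from the mid-plane, and n is the volume fraction index.

```latex
P(z) = P_m + \left(P_c - P_m\right) V_c(z),
\qquad
V_c(z) = \left(\frac{z}{h} + \frac{1}{2}\right)^{n},
\qquad
-\frac{h}{2} \le z \le \frac{h}{2}, \quad n \ge 0 .
```

    Under this form the plate is fully metal at z = -h/2 and fully ceramic at z = +h/2, and n = 0 recovers a homogeneous ceramic plate.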

  11. STRUCTURE-BASED PREDICTIVE MODEL FOR COAL CHAR COMBUSTION

    SciTech Connect

    CHRISTOPHER M. HADAD; JOSEPH M. CALO; ROBERT H. ESSENHIGH; ROBERT H. HURT

    1998-06-04

    During the past quarter of this project, significant progress continued to be made on both major technical tasks. Progress was made at OSU on advancing the application of computational chemistry to oxidative attack on model polyaromatic hydrocarbons (PAHs) and graphitic structures. This work is directed at the application of quantitative ab initio molecular orbital theory to address the decomposition products and mechanisms of coal char reactivity. Previously, it was shown that the "hybrid" B3LYP method can be used to provide quantitative information concerning the stability of the corresponding radicals that arise by hydrogen atom abstraction from monocyclic aromatic rings. In the most recent quarter, these approaches have been extended to larger carbocyclic ring systems, such as coronene, in order to compare the properties of a large carbonaceous PAH to that of the smaller, monocyclic aromatic systems. It was concluded that, at least for bond dissociation energy considerations, the properties of the large PAHs can be modeled reasonably well by smaller systems. In addition to the preceding work, investigations were initiated on the interaction of selected radicals in the "radical pool" with the different types of aromatic structures. In particular, the different pathways for addition vs. abstraction to benzene and furan by H and OH radicals were examined. Thus far, the addition channel appears to be significantly favored over abstraction on both kinetic and thermochemical grounds. Experimental work at Brown University in support of the development of predictive structural models of coal char combustion was focused on elucidating the role of coal mineral matter impurities on reactivity. An "inverse" approach was used where a carbon material was doped with coal mineral matter. The carbon material was derived from a high carbon content fly ash (Fly Ash 23 from the Salem Basin Power Plant). The ash was obtained from Pittsburgh #8 coal (PSOC 1451).
Doped samples were then burned in a high temperature flame reactor fitted with rapid quench extractive sampling. It was found that the specific reaction rate decreased with increasing ash content by about an order of magnitude over the ash content range investigated. In this case, it was concluded that at least one of the primary reasons for the resultant observation was that an increasing amount of carbon becomes inaccessible to oxygen by being covered with a fused, "protective," ash layer. Progress continued on equipment modification and testing for the combustion experiments with widely varying flame types at OSU.

  12. STRUCTURE-BASED PREDICTIVE MODEL FOR COAL CHAR COMBUSTION

    SciTech Connect

    CHRISTOPHER M. HADAD; JOSEPH M. CALO; ROBERT H. ESSENHIGH; ROBERT H. HURT

    1998-09-11

    Progress was made this period on a number of tasks. A significant advance was made in the incorporation of macrostructural ideas into high temperature combustion models. Work at OSU by R. Essenhigh in collaboration with the University of Stuttgart has led to a theory that the zone I/II transition in char combustion lies within the range of conditions of interest for pulverized char combustion. The group has presented evidence that some combustion data, previously interpreted with zone II models, in fact takes place in the transition from zone II to zone I. This idea was used at Brown to make modifications to the CBK model (a char kinetics package specially designed for carbon burnout prediction, currently used by a number of research and furnace modeling groups in academia and industry). The resulting new model version, CBK8, shows improved ability to predict extinction behavior in the late stages of combustion, especially for particles with low ash content. The full development and release of CBK8, along with detailed descriptions of the role of the zone I/II transition, will be reported on in subsequent reports. ABB-CE is currently implementing CBK7 into a special version of the CFD code Fluent for use in the modeling and design of their boilers. They have been apprised of the development, and have expressed interest in incorporating the new feature, realizing full CBK8 capabilities into their combustion codes. The computational chemistry task at OSU continued to study oxidative pathways for PAH, with emphasis this period on heteroatom containing ring compounds. Preliminary XPS studies were also carried out. Combustion experiments were also carried out at OSU this period, leading to the acquisition of samples at various residence times and the measurement of their oxidation reactivity by nonisothermal TGA techniques.
Several members of the project team attended the Carbon Conference this period and made contacts with representatives from the new FETC Consortium for Premium Carbon Products from Coal. Possibilities for interactions with this new center will be explored. Also this period, an invited review paper was prepared for the 27th International Symposium on Combustion, to be held in Boulder, Colorado in August. The paper is entitled "Structure, Properties, and Reactivity of Solid Fuels," and reports on a number of advances made in this collaborative project.

  13. The Proteome Folding Project: Proteome-scale prediction of structure and function

    PubMed Central

    Drew, Kevin; Winters, Patrick; Butterfoss, Glenn L.; Berstis, Viktors; Uplinger, Keith; Armstrong, Jonathan; Riffle, Michael; Schweighofer, Erik; Bovermann, Bill; Goodlett, David R.; Davis, Trisha N.; Shasha, Dennis; Malmström, Lars; Bonneau, Richard

    2011-01-01

    The incompleteness of proteome structure and function annotation is a critical problem for biologists and, in particular, severely limits interpretation of high-throughput and next-generation experiments. We have developed a proteome annotation pipeline based on structure prediction, where function and structure annotations are generated using an integration of sequence comparison, fold recognition, and grid-computing-enabled de novo structure prediction. We predict protein domain boundaries and three-dimensional (3D) structures for protein domains from 94 genomes (including human, Arabidopsis, rice, mouse, fly, yeast, Escherichia coli, and worm). De novo structure predictions were distributed on a grid of more than 1.5 million CPUs worldwide (World Community Grid). We generated significant numbers of new confident fold annotations (9% of domains that are otherwise unannotated in these genomes). We demonstrate that predicted structures can be combined with annotations from the Gene Ontology database to predict new and more specific molecular functions. PMID:21824995

  14. APC targeted micelle for enhanced intradermal delivery of hepatitis B DNA vaccine.

    PubMed

    Layek, Buddhadev; Lipp, Lindsey; Singh, Jagdish

    2015-06-10

    Chronic hepatitis B is a serious liver disease and puts people at high risk of death from cirrhosis and liver cancer. Although DNA vaccination has emerged as a potential immunotherapeutic strategy for the treatment of chronic hepatitis B, the efficiencies were not adequate in clinical trials. Here we describe the design, synthesis, and evaluation of mannosylated phenylalanine grafted chitosan (Man-CS-Phe) as a DNA delivery vector for direct transfection of antigen presenting cells to improve cellular and humoral immunity to plasmid-coded antigen. The cationic Man-CS-Phe micelles condense plasmid DNA into nanoscale polyplexes and provide efficient protection of complexed DNA from nuclease degradation. The mannose receptor-mediated enhanced cell uptake and high in vitro transfection efficiency of the polyplexes were demonstrated in RAW 264.7 and DC 2.4 cells using GFP-expressing plasmid DNA. Furthermore, intradermal immunization of BALB/c mice indicated that hepatitis B DNA vaccine/Man-CS-Phe polyplexes not only induced multi-fold higher serum antibody titer in comparison to all other formulations including FuGENE HD, but also significantly stimulated T-cell proliferation and skewed T helper toward Th1 polarization. These results illustrate that Man-CS-Phe can serve as a promising DNA delivery vector to harness both cellular and humoral arms of the immune system. PMID:25886704

  15. A comparison study on feature selection of DNA structural properties for promoter prediction

    PubMed Central

    2012-01-01

    Background Promoter prediction is an integral step for understanding gene regulation and annotating genomes. Traditional promoter analysis is mainly based on sequence compositional features. Recently, many kinds of structural features have been employed in promoter prediction. However, considering the high-dimensionality and overfitting problems, it is infeasible to utilize all available features for promoter prediction. Thus it is necessary to choose some appropriate features for the prediction task. Results This paper conducts an extensive comparison study on feature selection of DNA structural properties for promoter prediction. Firstly, to examine whether promoters possess some special structures, we carry out a systematic comparison among the profiles of thirteen structural features on promoter and non-promoter sequences. Secondly, we investigate the correlations between these structural features and promoter sequences. Thirdly, both filter and wrapper methods are utilized to select appropriate feature subsets from thirteen different kinds of structural features for promoter prediction, and the predictive power of the selected feature subsets is evaluated. Finally, we compare the prediction performance of the feature subsets selected in this paper with nine existing promoter prediction approaches. Conclusions Experimental results show that the structural features are differentially correlated to promoters. Specifically, DNA-bending stiffness, DNA denaturation and energy-related features are highly correlated with promoters. The predictive power for promoter sequences differs greatly among different structural features. Selecting the relevant features can significantly improve the accuracy of promoter prediction. PMID:22226192

  16. Predicting aphasia type from brain damage measured with structural MRI.

    PubMed

    Yourganov, Grigori; Smith, Kimberly G; Fridriksson, Julius; Rorden, Chris

    2015-12-01

    Chronic aphasia is a common consequence of a left-hemisphere stroke. Since the early insights by Broca and Wernicke, studying the relationship between the loci of cortical damage and patterns of language impairment has been one of the concerns of aphasiology. We utilized multivariate classification in a cross-validation framework to predict the type of chronic aphasia from the spatial pattern of brain damage. Our sample consisted of 98 patients with five types of aphasia (Broca's, Wernicke's, global, conduction, and anomic), classified based on scores on the Western Aphasia Battery (WAB). Binary lesion maps were obtained from structural MRI scans (obtained at least 6 months poststroke, and within 2 days of behavioural assessment); after spatial normalization, the lesions were parcellated into a disjoint set of brain areas. The proportion of damage to the brain areas was used to classify patients' aphasia type. To create this parcellation, we relied on five brain atlases; our classifier (support vector machine - SVM) could differentiate between different kinds of aphasia using any of the five parcellations. In our sample, the best classification accuracy was obtained when using a novel parcellation that combined two previously published brain atlases, with the first atlas providing the segmentation of grey matter, and the second atlas used to segment the white matter. For each aphasia type, we computed the relative importance of different brain areas for distinguishing it from other aphasia types; our findings were consistent with previously published reports of lesion locations implicated in different types of aphasia. Overall, our results revealed that automated multivariate classification could distinguish between aphasia types based on damage to atlas-defined brain areas. PMID:26465238

  18. RNA Secondary Structure Prediction by Using Discrete Mathematics: An Interdisciplinary Research Experience for Undergraduate Students

    ERIC Educational Resources Information Center

    Ellington, Roni; Wachira, James; Nkwanta, Asamoah

    2010-01-01

    The focus of this Research Experience for Undergraduates (REU) project was on RNA secondary structure prediction by using a lattice walk approach. The lattice walk approach is a combinatorial and computational biology method used to enumerate possible secondary structures and predict RNA secondary structure from RNA sequences. The method uses…

  19. Theoretical prediction of electronic structures of fully pi-conjugated zinc oligoporphyrins with curved surface structures.

    PubMed

    Yamaguchi, Yoichi

    2004-05-01

    A theoretical prediction of the electronic structures of fully pi-conjugated zinc oligoporphyrins with curved surface, ring, tube, and ball-shaped structures was conducted with a view to the future development of triply meso-meso-, beta-beta-, and beta-beta-linked planar zinc oligoporphyrins. The excitation energies and oscillator strengths for the optimal ring and ball structures were calculated using the time-dependent density functional theory (DFT). Although there is an extremely small energy difference of < 0.1 eV between the highest occupied molecular orbital (HOMO) and the lowest unoccupied molecular orbital (LUMO) of the ring structure relative to the same-sized triply linked planar one, the Q and B bands of the former show smaller redshifted excitation energies and intensified oscillator strengths relative to those of the latter, due to the structurally shortened effective pi-conjugated lengths for the electron transition. It is expected that the ball structure becomes an excellent electron acceptor and shows the highly redshifted Q' band in the near-IR region relative to the monomer. The minimum value of the HOMO-LUMO energy gaps of the infinite-length ring structures was estimated using periodic boundary conditions within the DFT, resulting in the metallic characters of both the tube structures with and without the spiral triply linked porphyrin array. The relation between the diameters and strain energies of the tube and ball structures was also examined. The present fused zinc porphyrins may become more colorful materials with new optoelectronic properties including artificial photosynthesis than the carbon nanotubes and fullerenes when the axial coordinations of the central metal of porphyrins are functionally used. PMID:15267712

  20. A permutation based simulated annealing algorithm to predict pseudoknotted RNA secondary structures.

    PubMed

    Tsang, Herbert H; Wiese, Kay C

    2015-01-01

    Pseudoknots are RNA tertiary structures which perform essential biological functions. This paper discusses SARNA-Predict-pk, an RNA pseudoknotted secondary structure prediction algorithm based on Simulated Annealing (SA). The research presented here extends previous work on SARNA-Predict and further examines the effect of the new algorithm to include prediction of RNA secondary structure with pseudoknots. An evaluation of the performance of SARNA-Predict-pk in terms of prediction accuracy is made via comparison with several state-of-the-art prediction algorithms using 20 individual known structures from seven RNA classes. We measured the sensitivity and specificity of nine prediction algorithms. Three of these are dynamic programming algorithms: Pseudoknot (pknotsRE), NUPACK, and pknotsRG-mfe. One, Sfold, uses a statistical clustering approach, and the other five are heuristic algorithms: SARNA-Predict-pk, ILM, STAR, IPknot and HotKnots. The results presented in this paper demonstrate that SARNA-Predict-pk can out-perform other state-of-the-art algorithms in terms of prediction accuracy. This supports the use of the proposed method on pseudoknotted RNA secondary structure prediction of other known structures. PMID:26558299

  1. Rapid prediction of structural responses of double-bottom structures in shoal grounding scenario

    NASA Astrophysics Data System (ADS)

    Hu, Zhiqiang; Wang, Ge; Yao, Qi; Yu, Zhaolong

    2016-03-01

    This study presents a simplified analytical model for predicting the structural responses of double-bottom ships in a shoal grounding scenario. This solution is based on a series of analytical models developed from elastic-plastic mechanism theories for different structural components, including bottom girders, floors, bottom plating, and attached stiffeners. We verify this simplified analytical model by numerical simulation, and establish finite element models for a typical tanker hold and a rigid indenter representing seabed obstacles. Employing the LS-DYNA finite element solver, we conduct numerical simulations for shoal-grounding cases with a wide range of slope angles and indentation depths. In comparison with numerical simulations, we verify the proposed simplified analytical model with respect to the total energy dissipation and the horizontal grounding resistance. We also investigate the interaction effect of deformation patterns between bottom structure components. Our results show that the total energy dissipation and resistances predicted by the analytical model agree well with those from numerical simulations.

  2. A survey of machine learning methods for secondary and supersecondary protein structure prediction.

    PubMed

    Ho, Hui Kian; Zhang, Lei; Ramamohanarao, Kotagiri; Martin, Shawn

    2013-01-01

    In this chapter we provide a survey of protein secondary and supersecondary structure prediction using methods from machine learning. Our focus is on machine learning methods applicable to β-hairpin and β-sheet prediction, but we also discuss methods for more general supersecondary structure prediction. We provide background on the secondary and supersecondary structures that we discuss, the features used to describe them, and the basic theory behind the machine learning methods used. We survey the machine learning methods available for secondary and supersecondary structure prediction and compare them where possible. PMID:22987348

  3. Surface pressure profiles, vortex structure and initialization for hurricane prediction. Part II: numerical simulations of track, structure and intensity

    NASA Astrophysics Data System (ADS)

    Davidson, Noel E.; Ma, Yimin

    2012-07-01

    In part 1 of this study, an assessment of commonly used surface pressure profiles to represent TC structures was made. Using the Australian tropical cyclone model, the profiles are tested in case studies of high-resolution prediction of track, structure and intensity. We demonstrate that: (1) track forecasts are mostly insensitive to the imposed structure; (2) in some cases [here Katrina (2005)], specification of vortex structure can have a large impact on prediction of structure and intensity; (3) the forecast model mostly preserves the characteristics of the initial structure and so correct structure at t = 0 is a requirement for improved structure forecasting; and (4) skilful prediction of intensity does not guarantee skilful prediction of structure. It is shown that for Ivan (2004) the initial structure from each profile is preserved during the simulations, and that markedly different structures can have similar intensities. Evidence presented suggests that different initial profiles can sometimes change the timing of intensification. Thus, correct initial vortex structure is an essential ingredient for more accurate intensity and structure prediction.

  4. Structural synthetic biotechnology: from molecular structure to predictable design for industrial strain development.

    PubMed

    Chen, Zhen; Wilmanns, Matthias; Zeng, An-Ping

    2010-10-01

    The future of industrial biotechnology requires efficient development of highly productive and robust strains of microorganisms. Present praxis of strain development cannot adequately fulfill this requirement, primarily owing to the inability to control reactions precisely at a molecular level, or to predict reliably the behavior of cells upon perturbation. Recent developments in two areas of biology are changing the situation rapidly: structural biology has revealed details about enzymes and associated bioreactions at an atomic level; and synthetic biology has provided tools to design and assemble precisely controllable modules for re-programming cellular metabolic circuitry. However, because of different emphases, to date, these two areas have developed separately. A linkage between them is desirable to harness their concerted potential. We therefore propose structural synthetic biotechnology as a new field in biotechnology, specifically for application to the development of industrial microbial strains. PMID:20727604

  5. Aircraft Structural Mass Property Prediction Using Conceptual-Level Structural Analysis

    NASA Technical Reports Server (NTRS)

    Sexstone, Matthew G.

    1998-01-01

    This paper describes a methodology that extends the use of the Equivalent LAminated Plate Solution (ELAPS) structural analysis code from conceptual-level aircraft structural analysis to conceptual-level aircraft mass property analysis. Mass property analysis in aircraft structures has historically depended upon parametric weight equations at the conceptual design level and Finite Element Analysis (FEA) at the detailed design level. ELAPS allows for the modeling of detailed geometry, metallic and composite materials, and non-structural mass coupled with analytical structural sizing to produce high-fidelity mass property analyses representing fully configured vehicles early in the design process. This capability is especially valuable for unusual configuration and advanced concept development where existing parametric weight equations are inapplicable and FEA is too time consuming for conceptual design. This paper contrasts the use of ELAPS relative to empirical weight equations and FEA. ELAPS modeling techniques are described and the ELAPS-based mass property analysis process is detailed. Examples of mass property stochastic calculations produced during a recent systems study are provided. This study involved the analysis of three remotely piloted aircraft required to carry scientific payloads to very high altitudes at subsonic speeds. Due to the extreme nature of this high-altitude flight regime, few existing vehicle designs are available for use in performance and weight prediction. ELAPS was employed within a concurrent engineering analysis process that simultaneously produces aerodynamic, structural, and static aeroelastic results for input to aircraft performance analyses. The ELAPS models produced for each concept were also used to provide stochastic analyses of wing structural mass properties. The results of this effort indicate that ELAPS is an efficient means to conduct multidisciplinary trade studies at the conceptual design level.

  6. Aircraft Structural Mass Property Prediction Using Conceptual-Level Structural Analysis

    NASA Technical Reports Server (NTRS)

    Sexstone, Matthew G.

    1998-01-01

    This paper describes a methodology that extends the use of the Equivalent LAminated Plate Solution (ELAPS) structural analysis code from conceptual-level aircraft structural analysis to conceptual-level aircraft mass property analysis. Mass property analysis in aircraft structures has historically depended upon parametric weight equations at the conceptual design level and Finite Element Analysis (FEA) at the detailed design level. ELAPS allows for the modeling of detailed geometry, metallic and composite materials, and non-structural mass coupled with analytical structural sizing to produce high-fidelity mass property analyses representing fully configured vehicles early in the design process. This capability is especially valuable for unusual configuration and advanced concept development where existing parametric weight equations are inapplicable and FEA is too time consuming for conceptual design. This paper contrasts the use of ELAPS relative to empirical weight equations and FEA. ELAPS modeling techniques are described and the ELAPS-based mass property analysis process is detailed. Examples of mass property stochastic calculations produced during a recent systems study are provided. This study involved the analysis of three remotely piloted aircraft required to carry scientific payloads to very high altitudes at subsonic speeds. Due to the extreme nature of this high-altitude flight regime, few existing vehicle designs are available for use in performance and weight prediction. ELAPS was employed within a concurrent engineering analysis process that simultaneously produces aerodynamic, structural, and static aeroelastic results for input to aircraft performance analyses. The ELAPS models produced for each concept were also used to provide stochastic analyses of wing structural mass properties. The results of this effort indicate that ELAPS is an efficient means to conduct multidisciplinary trade studies at the conceptual design level.

  7. Structural Dynamic Analyses And Test Predictions For Spacecraft Structures With Non-Linearities

    NASA Astrophysics Data System (ADS)

    Vergniaud, Jean-Baptiste; Soula, Laurent; Newerla, Alfred

    2012-07-01

    The overall objective of the mechanical development and verification process is to ensure that the spacecraft structure is able to sustain the mechanical environments encountered during launch. In general, spacecraft structures are assumed a priori to behave linearly, i.e. the responses to a static load or dynamic excitation increase or decrease in proportion to the amplitude of the applied load or excitation. However, past experience has shown that various non-linearities may exist in spacecraft structures, and their dynamic effects can significantly affect the development and verification process. Current processes are mainly adapted to linear spacecraft structure behaviour, and no clear rules exist for dealing with major structural non-linearities. They are handled outside the process by individual analysis and margin policy, and by analyses after tests to justify the coverage of the coupled loads analyses (CLA). Non-linearities primarily affect the current spacecraft development and verification process in two respects. (1) Prediction of flight loads by launcher/satellite CLA: only linear satellite models are delivered for performing CLA, and no well-established rules exist on how to properly linearize a model when non-linearities are present. The potential impact of the linearization on the CLA results has not yet been properly analyzed, so it is difficult to confirm that CLA results will cover actual flight levels. (2) Management of satellite verification tests: the CLA results generated with a linear satellite finite element model (FEM) are assumed to be flight representative. If internal non-linearities are present in the tested satellite, it may be difficult to determine which input level must be applied to cover satellite internal loads. The non-linear behaviour can also disturb the shaker control, putting the satellite at risk by potentially imposing excessive levels.
    This paper presents the results of a test campaign performed in the frame of an ESA TRP study [1]. A breadboard including typical non-linearities was designed, manufactured and tested through a typical spacecraft dynamic test campaign. The study demonstrated the capability to perform non-linear dynamic test predictions on a flight-representative spacecraft, the good correlation of test results with FEM predictions, and the possibility to identify modal behaviour and to characterize non-linearities from test results. As a synthesis of this study, overall guidelines have been derived for the mechanical verification process to improve the level of expertise on tests involving spacecraft with non-linearities.

  8. CASP11 – An Evaluation of a Modular BCL::Fold-Based Protein Structure Prediction Pipeline

    PubMed Central

    Fischer, Axel W.; Heinze, Sten; Putnam, Daniel K.; Li, Bian; Pino, James C.; Xia, Yan; Lopez, Carlos F.; Meiler, Jens

    2016-01-01

    In silico prediction of a protein’s tertiary structure remains an unsolved problem. The community-wide Critical Assessment of Protein Structure Prediction (CASP) experiment provides a double-blind study to evaluate improvements in protein structure prediction algorithms. We developed a protein structure prediction pipeline employing a three-stage approach, consisting of low-resolution topology search, high-resolution refinement, and molecular dynamics simulation to predict the tertiary structure of proteins from the primary structure alone or including distance restraints either from predicted residue-residue contacts, nuclear magnetic resonance (NMR) nuclear Overhauser effect (NOE) experiments, or mass spectroscopy (MS) cross-linking (XL) data. The protein structure prediction pipeline was evaluated in the CASP11 experiment on twenty regular protein targets as well as thirty-three ‘assisted’ protein targets, which also had distance restraints available. Although the low-resolution topology search module was able to sample models with a global distance test total score (GDT_TS) value greater than 30% for twelve out of twenty proteins, frequently it was not possible to select the most accurate models for refinement, resulting in a general decay of model quality over the course of the prediction pipeline. In this study, we provide a detailed overall analysis, study one target protein in more detail as it travels through the protein structure prediction pipeline, and evaluate the impact of limited experimental data. PMID:27046050
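
The GDT_TS value quoted in this abstract can be illustrated with a short sketch. It assumes the commonly used definition: the average, over cutoffs of 1, 2, 4 and 8 Å, of the percentage of residues whose model position lies within the cutoff of the experimental position. The input is a list of per-residue distances from one already-chosen superposition; the full measure also searches over superpositions, which is omitted here. The function name `gdt_ts` and the example distances are illustrative and do not come from the paper.

```python
def gdt_ts(distances, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Approximate GDT_TS (in percent) from per-residue distances in angstroms.

    Assumption: the distances come from a single fixed superposition of model
    and reference; the search over superpositions is not performed here.
    """
    if not distances:
        raise ValueError("at least one residue distance is required")
    n = len(distances)
    # Fraction of residues within each cutoff, then the mean over all cutoffs.
    fractions = [sum(d <= c for d in distances) / n for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)


# Hypothetical distances for a five-residue toy model.
example = [0.5, 1.5, 3.0, 6.0, 12.0]
print(f"GDT_TS = {gdt_ts(example):.1f}%")
```

Under this definition, a model counts towards the "greater than 30%" threshold when the averaged fraction of well-placed residues exceeds 0.30.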

  9. CONSIDERATION OF REACTION INTERMEDIATES IN STRUCTURE-ACTIVITY RELATIONSHIPS: A KEY TO UNDERSTANDING AND PREDICTION

    EPA Science Inventory

    Consideration of Reaction Intermediates in Structure-Activity Relationships: A Key to Understanding and Prediction

    A structure-activity relationship (SAR) represents an empirical means for generalizing chemical information relative to biological activity, and is frequent...

  10. STRUCTURE-ACTIVITY RELATIONSHIP STUDIES AND THEIR ROLE IN PREDICTING AND INVESTIGATING CHEMICAL TOXICITY

    EPA Science Inventory

    Structure-Activity Relationship Studies and their Role in Predicting and Investigating Chemical Toxicity

    Structure-activity relationships (SAR) represent attempts to generalize chemical information relative to biological activity for the twin purposes of generating insigh...

  11. PREDICTING TOXICOLOGICAL ENDPOINTS OF CHEMICALS USING QUANTITATIVE STRUCTURE-ACTIVITY RELATIONSHIPS (QSARS)

    EPA Science Inventory

    Quantitative structure-activity relationships (QSARs) are being developed to predict the toxicological endpoints for untested chemicals similar in structure to chemicals that have known experimental toxicological data. Based on a very large number of predetermined descriptors, a...

  12. Predicting emergency evacuation and sheltering behavior: a structured analytical approach.

    PubMed

    Dombroski, Matt; Fischhoff, Baruch; Fischbeck, Paul

    2006-12-01

    We offer a general approach to predicting public compliance with emergency recommendations. It begins with a formal risk assessment of an anticipated emergency, whose parameters include factors potentially affecting and affected by behavior, as identified by social science research. Standard procedures are used to elicit scientific experts' judgments regarding these behaviors and dependencies, in the context of an emergency scenario. Their judgments are used to refine the model and scenario, enabling local emergency coordinators to predict the behavior of citizens in their area. The approach is illustrated with a case study involving a radiological dispersion device (RDD) exploded in downtown Pittsburgh, PA. Both groups of experts (national and local) predicted approximately 80-90% compliance with an order to evacuate workplaces and 60-70% compliance with an order to shelter in place at home. They predicted 10% lower compliance for people asked to shelter at the office or to evacuate their homes. They predicted 10% lower compliance should the media be skeptical, rather than supportive. They also identified preparatory policies that could improve public compliance by 20-30%. We consider the implications of these results for improving emergency risk assessment models and for anticipating and improving preparedness for disasters, using Hurricane Katrina as a further case in point. PMID:17184405

  13. μABC: a systematic microsecond molecular dynamics study of tetranucleotide sequence effects in B-DNA

    PubMed Central

    Pasi, Marco; Maddocks, John H.; Beveridge, David; Bishop, Thomas C.; Case, David A.; Cheatham, Thomas; Dans, Pablo D.; Jayaram, B.; Lankas, Filip; Laughton, Charles; Mitchell, Jonathan; Osman, Roman; Orozco, Modesto; Pérez, Alberto; Petkevičiūtė, Daiva; Spackova, Nada; Sponer, Jiri; Zakrzewska, Krystyna; Lavery, Richard

    2014-01-01

    We present the results of microsecond molecular dynamics simulations carried out by the ABC group of laboratories on a set of B-DNA oligomers containing the 136 distinct tetranucleotide base sequences. We demonstrate that the resulting trajectories have extensively sampled the conformational space accessible to B-DNA at room temperature. We confirm that base sequence effects depend strongly not only on the specific base pair step, but also on the specific base pairs that flank each step. Beyond sequence effects on average helical parameters and conformational fluctuations, we also identify tetranucleotide sequences that oscillate between several distinct conformational substates. By analyzing the conformation of the phosphodiester backbones, it is possible to understand for which sequences these substates will arise, and what impact they will have on specific helical parameters. PMID:25260586
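
The figure of 136 distinct tetranucleotides can be checked by enumeration. The sketch below assumes that a tetranucleotide in double-stranded DNA is equivalent to its reverse complement, since both describe the same duplex read from opposite strands. The abstract does not state this counting rule; it is the usual convention and is treated here as an assumption. All function names are illustrative.

```python
from itertools import product

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def distinct_tetranucleotides():
    """Enumerate tetranucleotides, merging each one with its reverse complement."""
    seen = set()
    for bases in product("ACGT", repeat=4):
        seq = "".join(bases)
        # Keep one canonical representative per duplex: the lexicographically
        # smaller of the sequence and its reverse complement.
        seen.add(min(seq, reverse_complement(seq)))
    return sorted(seen)


unique = distinct_tetranucleotides()
palindromes = [s for s in unique if s == reverse_complement(s)]
print(len(unique), len(palindromes))  # 136 16
```

Of the 4^4 = 256 possible sequences, 16 are their own reverse complement, which gives (256 - 16) / 2 + 16 = 136.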

  14. Predicting Gene Structures from Multiple RT-PCR Tests

    NASA Astrophysics Data System (ADS)

    Kováč, Jakub; Vinař, Tomáš; Brejová, Broňa

    It has been demonstrated that the use of additional information such as ESTs and protein homology can significantly improve accuracy of gene prediction. However, many sources of external information are still being omitted from consideration. Here, we investigate the use of product lengths from RT-PCR experiments in gene finding. We present hardness results and practical algorithms for several variants of the problem and apply our methods to a real RT-PCR data set in the Drosophila genome. We conclude that the use of RT-PCR data can improve the sensitivity of gene prediction and locate novel splicing variants.

  15. Perspective: Role of structure prediction in materials discovery and design

    NASA Astrophysics Data System (ADS)

    Needs, Richard J.; Pickard, Chris J.

    2016-05-01

    Materials informatics owes much to bioinformatics and the Materials Genome Initiative has been inspired by the Human Genome Project. But there is more to bioinformatics than genomes, and the same is true for materials informatics. Here we describe the rapidly expanding role of searching for structures of materials using first-principles electronic-structure methods. Structure searching has played an important part in unraveling structures of dense hydrogen and in identifying the record-high-temperature superconducting component in hydrogen sulfide at high pressures. We suggest that first-principles structure searching has already demonstrated its ability to determine structures of a wide range of materials and that it will play a central and increasing part in materials discovery and design.

  16. Probabilistic predictions of penetrating injury to anatomic structures.

    PubMed Central

    Ogunyemi, O.; Webber, B.; Clarke, J. R.

    1997-01-01

    This paper presents an interactive 3D graphical system which allows the user to visualize different bullet path hypotheses and stab wound paths and computes the probability that an anatomical structure associated with a given penetration path is injured. Probabilities can help to identify those anatomical structures which have potentially critical damage from penetrating trauma and differentiate these from structures that are not seriously injured. PMID:9357718

  17. Computational Approaches to RNA Structure Prediction, Analysis and Design

    PubMed Central

    Laing, Christian; Schlick, Tamar

    2011-01-01

    RNA molecules are important cellular components involved in many fundamental biological processes. Understanding the mechanisms behind their functions requires RNA tertiary structure knowledge. While modeling approaches for the study of RNA structures and dynamics lag behind efforts in protein folding, much progress has been achieved in the past two years. Here, we review recent advances in RNA folding algorithms, RNA tertiary motif discovery, applications of graph theory approaches to RNA structure and function, and in silico generation of RNA sequence pools for aptamer design. Advances within each area can be combined to impact many problems in RNA structure and function. PMID:21514143

  18. De novo tertiary structure prediction using RNA123--benchmarking and application to Macugen.

    PubMed

    Eriksson, Emma S E; Joshi, Lokesh; Billeter, Martin; Eriksson, Leif A

    2014-08-01

    The present benchmarking study utilizes the RNA123 program for de novo prediction of tertiary structures of a set of 50 RNA molecules for which X-ray/NMR structures are available, based on the nucleic acid sequence only. All molecules contain a hairpin loop motif and a helical structure of canonical and non-canonical base pairs, interrupted by bulges and internal loops to various degrees. RNA molecules with double helices made up purely by canonical base pairing, and molecules containing symmetric internal loops of non-canonical base pairing are, overall, very well predicted. Structures containing bulges and asymmetric internal loops, and more complex structures containing multiple bulges and internal loops in the same molecule, result in larger deviations from their X-ray/NMR predicted structures due to higher degree of flexibility of the nucleotide bases in these regions. In a majority of the molecules included herein, the RNA123 program was, however, able to predict the tertiary structure with a heavy atom RMSD of less than 5 Å to the X-ray/NMR structure, and the models were in most cases structurally closer to the X-ray/NMR structures than models predicted by MC-Fold and MC-Sym. A set of RNA molecules containing pseudoknot tertiary structure motifs were included, but neither of the programs was able to predict the folding of the single-stranded stem onto the helix without additional structural input. The RNA123 program was then applied to predict the tertiary structure of the RNA segment of Macugen®, the first RNA aptamer approved for clinical use, and for which no tertiary structure has yet been solved. Four possible tertiary structures were predicted for this 27-nucleic-acid-long RNA molecule, which will be used in constructing a full model of the PEGylated aptamer and its interaction with the vascular endothelial growth factor target. PMID:25107358
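
The heavy-atom RMSD criterion used in this abstract can be sketched as follows. The sketch assumes the two structures are already superimposed and that their heavy atoms are listed in the same order. The optimal superposition step that structure-comparison tools normally perform first is not included. The function name and coordinates are illustrative only.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched lists of (x, y, z) points.

    Assumption: both lists hold the same atoms in the same order and the
    structures have already been superimposed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical two-atom example, coordinates in angstroms.
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
reference = [(0.0, 0.0, 3.0), (1.0, 4.0, 0.0)]
print(f"RMSD = {rmsd(model, reference):.2f} A")
```

A prediction would meet the threshold reported above when this value, computed over all heavy atoms after superposition, is below 5 Å.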

  19. Prediction of Harmful Human Health Effects of Chemicals from Structure

    NASA Astrophysics Data System (ADS)

    Cronin, Mark T. D.

    There is a great need to assess the harmful effects of chemicals to which man is exposed. Various in silico techniques including chemical grouping and category formation, as well as the use of (Q)SARs can be applied to predict the toxicity of chemicals for a number of toxicological effects. This chapter provides an overview of the state of the art of the prediction of the harmful effects of chemicals to human health. A variety of existing data can be used to obtain information; many such data are formalized into freely available and commercial databases. (Q)SARs can be developed (as illustrated with reference to skin sensitization) for local and global data sets. In addition, chemical grouping techniques can be applied on "similar" chemicals to allow for read-across predictions. Many "expert systems" are now available that incorporate these approaches. With these in silico approaches available, the techniques to apply them successfully have become essential. Integration of different in silico approaches with each other, as well as with other alternative approaches, e.g., in vitro and -omics through the development of integrated testing strategies, will assist in the more efficient prediction of the harmful health effects of chemicals.

  20. Superfamily Assignments for the Yeast Proteome through Integration of Structure Prediction with the Gene Ontology

    PubMed Central

    Malmström, Lars; Riffle, Michael; Strauss, Charlie E. M; Chivian, Dylan; Davis, Trisha N; Bonneau, Richard; Baker, David

    2007-01-01

    Saccharomyces cerevisiae is one of the best-studied model organisms, yet the three-dimensional structure and molecular function of many yeast proteins remain unknown. Yeast proteins were parsed into 14,934 domains, and those lacking sequence similarity to proteins of known structure were folded using the Rosetta de novo structure prediction method on the World Community Grid. This structural data was integrated with process, component, and function annotations from the Saccharomyces Genome Database to assign yeast protein domains to SCOP superfamilies using a simple Bayesian approach. We have predicted the structure of 3,338 putative domains and assigned SCOP superfamily annotations to 581 of them. We have also assigned structural annotations to 7,094 predicted domains based on fold recognition and homology modeling methods. The domain predictions and structural information are available in an online database at http://rd.plos.org/10.1371_journal.pbio.0050076_01. PMID:17373854

  1. Predicting human resting-state functional connectivity from structural connectivity

    PubMed Central

    Honey, C. J.; Sporns, O.; Cammoun, L.; Gigandet, X.; Thiran, J. P.; Meuli, R.; Hagmann, P.

    2009-01-01

    In the cerebral cortex, the activity levels of neuronal populations are continuously fluctuating. When neuronal activity, as measured using functional MRI (fMRI), is temporally coherent across 2 populations, those populations are said to be functionally connected. Functional connectivity has previously been shown to correlate with structural (anatomical) connectivity patterns at an aggregate level. In the present study we investigate, with the aid of computational modeling, whether systems-level properties of functional networks—including their spatial statistics and their persistence across time—can be accounted for by properties of the underlying anatomical network. We measured resting state functional connectivity (using fMRI) and structural connectivity (using diffusion spectrum imaging tractography) in the same individuals at high resolution. Structural connectivity then provided the couplings for a model of macroscopic cortical dynamics. In both model and data, we observed (i) that strong functional connections commonly exist between regions with no direct structural connection, rendering the inference of structural connectivity from functional connectivity impractical; (ii) that indirect connections and interregional distance accounted for some of the variance in functional connectivity that was unexplained by direct structural connectivity; and (iii) that resting-state functional connectivity exhibits variability within and across both scanning sessions and model runs. These empirical and modeling results demonstrate that although resting state functional connectivity is variable and is frequently present between regions without direct structural linkage, its strength, persistence, and spatial statistics are nevertheless constrained by the large-scale anatomical structure of the human cerebral cortex. PMID:19188601
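
The notion of functional connectivity as temporal coherence between two populations can be illustrated with a minimal sketch. It assumes that coherence is summarised by the Pearson correlation between regional time series. This is a common choice, but the abstract does not name the statistic, so it is an assumption here. Region names and signal values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


def functional_connectivity(series):
    """Map each unordered pair of regions to the correlation of their signals."""
    regions = sorted(series)
    return {
        (a, b): pearson(series[a], series[b])
        for i, a in enumerate(regions)
        for b in regions[i + 1:]
    }


# Hypothetical signals for three regions sampled at four time points.
signals = {
    "region_1": [1.0, 2.0, 3.0, 4.0],
    "region_2": [2.0, 4.0, 6.0, 8.0],
    "region_3": [4.0, 3.0, 2.0, 1.0],
}
for pair, r in functional_connectivity(signals).items():
    print(pair, round(r, 3))
```

Note that a high correlation between two regions says nothing about whether a direct structural link exists between them, which is the point made in finding (i) of the abstract.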

  2. Finite Element Based HWB Centerbody Structural Optimization and Weight Prediction

    NASA Technical Reports Server (NTRS)

    Gern, Frank H.

    2012-01-01

    This paper describes a scalable structural model suitable for Hybrid Wing Body (HWB) centerbody analysis and optimization. The geometry of the centerbody and primary wing structure is based on a Vehicle Sketch Pad (VSP) surface model of the aircraft and a FLOPS compatible parameterization of the centerbody. Structural analysis, optimization, and weight calculation are based on a Nastran finite element model of the primary HWB structural components, featuring centerbody, mid section, and outboard wing. Different centerbody designs like single bay or multi-bay options are analyzed and weight calculations are compared to current FLOPS results. For proper structural sizing and weight estimation, internal pressure and maneuver flight loads are applied. Results are presented for aerodynamic loads, deformations, and centerbody weight.

  3. The impact of population structure on genomic prediction in stratified populations.

    PubMed

    Guo, Zhigang; Tucker, Dominic M; Basten, Christopher J; Gandhi, Harish; Ersoz, Elhan; Guo, Baohong; Xu, Zhanyou; Wang, Daolong; Gay, Gilles

    2014-03-01

    Impacts of population structure on the evaluation of genomic heritability and prediction were investigated and quantified using high-density markers in diverse panels in rice and maize. Population structure is an important factor affecting estimation of genomic heritability and assessment of genomic prediction in stratified populations. In this study, our first objective was to assess effects of population structure on estimations of genomic heritability using the diversity panels in rice and maize. Results indicate population structure explained 33 and 7.5% of genomic heritability for rice and maize, respectively, depending on traits, with the remaining heritability explained by within-subpopulation variation. Estimates of within-subpopulation heritability were higher than that derived from quantitative trait loci identified in genome-wide association studies, suggesting 65% improvement in genetic gains. The second objective was to evaluate effects of population structure on genomic prediction using cross-validation experiments. When population structure exists in both training and validation sets, correcting for population structure led to a significant decrease in accuracy with genomic prediction. In contrast, when prediction was limited to a specific subpopulation, population structure showed little effect on accuracy and within-subpopulation genetic variance dominated predictions. Finally, effects of genomic heritability on genomic prediction were investigated. Accuracies with genomic prediction increased with genomic heritability in both training and validation sets, with the former showing a slightly greater impact. In summary, our results suggest that the population structure contribution to genomic prediction varies based on prediction strategies, and is also affected by the genetic architectures of traits and populations. 
    In practical breeding, these conclusions may be helpful to better understand and utilize the different genetic resources in genomic prediction. PMID:24452438

  4. Vfold: A Web Server for RNA Structure and Folding Thermodynamics Prediction

    PubMed Central

    Xu, Xiaojun; Zhao, Peinan; Chen, Shi-Jie

    2014-01-01

    Background: The ever-increasing discovery of non-coding RNAs leads to unprecedented demand for the accurate modeling of RNA folding, including the predictions of two-dimensional (base pair) and three-dimensional all-atom structures and folding stabilities. Accurate modeling of RNA structure and stability has far-reaching impact on our understanding of RNA functions in human health and our ability to design RNA-based therapeutic strategies. Results: The Vfold server offers a web interface to predict (a) RNA two-dimensional structure from the nucleotide sequence, (b) three-dimensional structure from the two-dimensional structure and the sequence, and (c) folding thermodynamics (heat capacity melting curve) from the sequence. To predict the two-dimensional structure (base pairs), the server generates an ensemble of structures, including loop structures with the different intra-loop mismatches, and evaluates the free energies using the experimental parameters for the base stacks and the loop entropy parameters given by a coarse-grained RNA folding model (the Vfold model) for the loops. To predict the three-dimensional structure, the server assembles the motif scaffolds using structure templates extracted from the known PDB structures and refines the structure using all-atom energy minimization. Conclusions: The Vfold-based web server provides a user-friendly tool for the prediction of RNA structure and stability. The web server and the source codes are freely accessible for public use at “http://rna.physics.missouri.edu”. PMID:25215508

  5. Statistical potential for assessment and prediction of protein structures.

    PubMed

    Shen, Min-Yi; Sali, Andrej

    2006-11-01

    Protein structures in the Protein Data Bank provide a wealth of data about the interactions that determine the native states of proteins. Using the probability theory, we derive an atomic distance-dependent statistical potential from a sample of native structures that does not depend on any adjustable parameters (Discrete Optimized Protein Energy, or DOPE). DOPE is based on an improved reference state that corresponds to noninteracting atoms in a homogeneous sphere with the radius dependent on a sample native structure; it thus accounts for the finite and spherical shape of the native structures. The DOPE potential was extracted from a nonredundant set of 1472 crystallographic structures. We tested DOPE and five other scoring functions by the detection of the native state among six multiple target decoy sets, the correlation between the score and model error, and the identification of the most accurate non-native structure in the decoy set. For all decoy sets, DOPE is the best performing function in terms of all criteria, except for a tie in one criterion for one decoy set. To facilitate its use in various applications, such as model assessment, loop modeling, and fitting into cryo-electron microscopy mass density maps combined with comparative protein structure modeling, DOPE was incorporated into the modeling package MODELLER-8. PMID:17075131

  6. Principles for Predicting RNA Secondary Structure Design Difficulty.

    PubMed

    Anderson-Lee, Jeff; Fisker, Eli; Kosaraju, Vineet; Wu, Michelle; Kong, Justin; Lee, Jeehyung; Lee, Minjae; Zada, Mathew; Treuille, Adrien; Das, Rhiju

    2016-02-27

    Designing RNAs that form specific secondary structures is enabling better understanding and control of living systems through RNA-guided silencing, genome editing and protein organization. Little is known, however, about which RNA secondary structures might be tractable for downstream sequence design, increasing the time and expense of design efforts due to inefficient secondary structure choices. Here, we present insights into specific structural features that increase the difficulty of finding sequences that fold into a target RNA secondary structure, summarizing the design efforts of tens of thousands of human participants and three automated algorithms (RNAInverse, INFO-RNA and RNA-SSD) in the Eterna massive open laboratory. Subsequent tests through three independent RNA design algorithms (NUPACK, DSS-Opt and MODENA) confirmed the hypothesized importance of several features in determining design difficulty, including sequence length, mean stem length, symmetry and specific difficult-to-design motifs such as zigzags. Based on these results, we have compiled an Eterna100 benchmark of 100 secondary structure design challenges that span a large range in design difficulty to help test future efforts. Our in silico results suggest new routes for improving computational RNA design methods and for extending these insights to assess "designability" of single RNA structures, as well as of switches for in vitro and in vivo applications. PMID:26902426

  7. Titanium α-ω phase transformation pathway and a predicted metastable structure

    NASA Astrophysics Data System (ADS)

    Zarkevich, N. A.; Johnson, D. D.

    2016-01-01

    As titanium is a highly utilized metal for structural lightweighting, its phases, transformation pathways (transition states), and structures have scientific and industrial importance. Using a proper solid-state nudged elastic band method employing two climbing images combined with density functional theory DFT + U methods for accurate energetics, we detail the pressure-induced α (ductile) to ω (brittle) transformation at the coexistence pressure. We find two transition states along the minimal-enthalpy path and discover a metastable body-centered orthorhombic structure, with stable phonons, a lower density than the end-point phases, and decreasing stability with increasing pressure.

  8. Climate and species richness predict the phylogenetic structure of African mammal communities.

    PubMed

    Kamilar, Jason M; Beaudrot, Lydia; Reed, Kaye E

    2015-01-01

    We have little knowledge of how climatic variation (and by proxy, habitat variation) influences the phylogenetic structure of tropical communities. Here, we quantified the phylogenetic structure of mammal communities in Africa to investigate how community structure varies with respect to climate and species richness variation across the continent. In addition, we investigated how phylogenetic patterns vary across carnivores, primates, and ungulates. We predicted that climate would differentially affect the structure of communities from different clades due to between-clade biological variation. We examined 203 communities using two metrics, the net relatedness (NRI) and nearest taxon (NTI) indices. We used simultaneous autoregressive models to predict community phylogenetic structure from climate variables and species richness. We found that most individual communities exhibited a phylogenetic structure consistent with a null model, but both climate and species richness significantly predicted variation in community phylogenetic metrics. Using NTI, species rich communities were composed of more distantly related taxa for all mammal communities, as well as for communities of carnivorans or ungulates. Temperature seasonality predicted the phylogenetic structure of mammal, carnivoran, and ungulate communities, and annual rainfall predicted primate community structure. Additional climate variables related to temperature and rainfall also predicted the phylogenetic structure of ungulate communities. We suggest that both past interspecific competition and habitat filtering have shaped variation in tropical mammal communities. The significant effect of climatic factors on community structure has important implications for the diversity of mammal communities given current models of future climate change. PMID:25875361

  9. Climate and Species Richness Predict the Phylogenetic Structure of African Mammal Communities

    PubMed Central

    Kamilar, Jason M.; Beaudrot, Lydia; Reed, Kaye E.

    2015-01-01

    We have little knowledge of how climatic variation (and by proxy, habitat variation) influences the phylogenetic structure of tropical communities. Here, we quantified the phylogenetic structure of mammal communities in Africa to investigate how community structure varies with respect to climate and species richness variation across the continent. In addition, we investigated how phylogenetic patterns vary across carnivores, primates, and ungulates. We predicted that climate would differentially affect the structure of communities from different clades due to between-clade biological variation. We examined 203 communities using two metrics, the net relatedness (NRI) and nearest taxon (NTI) indices. We used simultaneous autoregressive models to predict community phylogenetic structure from climate variables and species richness. We found that most individual communities exhibited a phylogenetic structure consistent with a null model, but both climate and species richness significantly predicted variation in community phylogenetic metrics. Using NTI, species rich communities were composed of more distantly related taxa for all mammal communities, as well as for communities of carnivorans or ungulates. Temperature seasonality predicted the phylogenetic structure of mammal, carnivoran, and ungulate communities, and annual rainfall predicted primate community structure. Additional climate variables related to temperature and rainfall also predicted the phylogenetic structure of ungulate communities. We suggest that both past interspecific competition and habitat filtering have shaped variation in tropical mammal communities. The significant effect of climatic factors on community structure has important implications for the diversity of mammal communities given current models of future climate change. PMID:25875361

  10. A method for WD40 repeat detection and secondary structure prediction.

    PubMed

    Wang, Yang; Jiang, Fan; Zhuo, Zhu; Wu, Xian-Hui; Wu, Yun-Dong

    2013-01-01

    WD40-repeat proteins (WD40s), as one of the largest protein families in eukaryotes, play vital roles in assembling protein-protein/DNA/RNA complexes. WD40s fold into similar β-propeller structures despite diversified sequences. A program, WDSP (WD40 repeat protein Structure Predictor), has been developed to accurately identify WD40 repeats and predict their secondary structures. The method is designed specifically for WD40 proteins by incorporating both local residue information and non-local family-specific structural features. It overcomes the problem of highly diversified protein sequences and variable loops. In addition, WDSP achieves a better prediction in identifying multiple WD40-domain proteins by taking the global combination of repeats into consideration. In secondary structure prediction, the average Q3 accuracy of WDSP in a jack-knife test reaches 93.7%. A disease-related protein, LRRK2, was used as a representative example to demonstrate the structure prediction. PMID:23776530

  11. Prediction of structural features and application to outer membrane protein identification

    PubMed Central

    Yan, Renxiang; Wang, Xiaofeng; Huang, Lanqing; Yan, Feidi; Xue, Xiaoyu; Cai, Weiwen

    2015-01-01

    Protein three-dimensional (3D) structures provide insightful information in many fields of biology. One-dimensional properties derived from 3D structures such as secondary structure, residue solvent accessibility, residue depth and backbone torsion angles are helpful to protein function prediction, fold recognition and ab initio folding. Here, we predict various structural features with the assistance of neural network learning. Based on an independent test dataset, protein secondary structure prediction generates an overall Q3 accuracy of ~80%. Meanwhile, the prediction of relative solvent accessibility obtains the highest mean absolute error of 0.164, and prediction of residue depth achieves the lowest mean absolute error of 0.062. We further improve the outer membrane protein identification by including the predicted structural features in a scoring function using a simple profile-to-profile alignment. The results demonstrate that the accuracy of outer membrane protein identification can be improved by ~3% at a 1% false positive level when structural features are incorporated. Finally, our methods are available as two convenient and easy-to-use programs. One is PSSM-2-Features for predicting secondary structure, relative solvent accessibility, residue depth and backbone torsion angles, the other is PPA-OMP for identifying outer membrane proteins from proteomes. PMID:26104144
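    The two evaluation measures quoted in this abstract can be written down compactly. The sketch below uses the standard definitions: Q3 is the percentage of residues whose three-state label (helix H, strand E, coil C) is predicted correctly, and the mean absolute error averages the absolute differences between predicted and observed values. The function names and example data are hypothetical.

```python
def q3_accuracy(predicted, observed):
    """Percentage of residues whose three-state label (H, E, C) is correct."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


def mean_absolute_error(predicted, observed):
    """Average absolute difference between predicted and observed values."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)


# Hypothetical six-residue example: five of six states match.
q3 = q3_accuracy("HHHEEC", "HHHECC")

# Hypothetical relative solvent accessibility values for two residues.
mae = mean_absolute_error([0.1, 0.5], [0.2, 0.3])
```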

  12. Progress in predicting protein function from structure: unique features of O-glycosidases.

    PubMed

    Stawiski, E W; Mandel-Gutfreund, Y; Lowenthal, A C; Gregoret, L M

    2002-01-01

    The Structural Genomics Initiative promises to deliver between 10,000 and 20,000 new protein structures within the next ten years. One challenge will be to predict the functions of these proteins from their structures. Since the newly solved structures will be enriched in proteins with little sequence identity to those whose structures are known, new methods for predicting function will be required. Here we describe the unique structural characteristics of O-glycosidases, enzymes that hydrolyze O-glycosidic bonds between carbohydrates. O-glycosidase function has evolved independently many times and enzymes that carry out this function are represented by a large number of different folds. We show that O-glycosidases nonetheless have characteristic structural features that cross sequence and fold families. The electrostatic surfaces of this class of enzymes are particularly distinctive. We also demonstrate that accurate prediction of O-glycosidase function from structure is possible. PMID:11928515

  13. Molecular stripping in the NF-κB/IκB/DNA genetic regulatory network.

    PubMed

    Potoyan, Davit A; Zheng, Weihua; Komives, Elizabeth A; Wolynes, Peter G

    2016-01-01

    Genetic switches based on the NF-κB/IκB/DNA system are master regulators of an array of cellular responses. Recent kinetic experiments have shown that IκB can actively remove NF-κB bound to its genetic sites via a process called "molecular stripping." This allows the NF-κB/IκB/DNA switch to function under kinetic control rather than the thermodynamic control contemplated in the traditional models of gene switches. Using molecular dynamics simulations of coarse-grained predictive energy landscape models for the constituent proteins by themselves and interacting with the DNA, we explore the functional motions of the transcription factor NF-κB and its various binary and ternary complexes with DNA and the inhibitor IκB. These studies show that the function of the NF-κB/IκB/DNA genetic switch is realized via an allosteric mechanism. Molecular stripping occurs through the activation of a domain twist mode by the binding of IκB that occurs through conformational selection. Free energy calculations for DNA binding show that the binding of IκB not only results in a significant decrease of the affinity of the transcription factor for the DNA but also kinetically speeds DNA release. Projections of the free energy onto various reaction coordinates reveal the structural details of the stripping pathways. PMID:26699500

  14. Molecular docking studies of phytochemicals from Phyllanthus niruri against Hepatitis B DNA Polymerase

    PubMed Central

    Mohan, Mekha; James, Priyanka; Valsalan, Ravisankar; Nazeem, Puthiyaveetil Abdulla

    2015-01-01

    Hepatitis B virus (HBV) infection is the leading cause of liver disorders and can lead to hepatocellular carcinoma, cirrhosis and liver damage, which in turn can cause death of patients. HBV DNA Polymerase is essential for HBV replication in the host and hence is used as one of the most potent pharmacological targets for the inhibition of HBV. Chronic hepatitis B is currently treated with nucleotide analogues that suppress viral reverse transcriptase activity, and most of them are reported to encounter viral resistance. Therefore, it is of interest to model HBV DNA Polymerase to dock known phytochemicals. The present study focuses on homology modeling and molecular docking analysis of phytocompounds from the traditional antidote Phyllanthus niruri and other nucleoside analogues against HBV DNA Polymerase using the software Discovery Studio 4.0. The 3D structure of HBV DNA Polymerase was predicted based on a previously reported alignment. Docking studies revealed that a few phytochemicals from Phyllanthus niruri had good interactions with HBV DNA Polymerase. These compounds had acceptable binding properties for further in vitro validation. Thus the study puts forth experimental validation for the traditional antidote, and these phytocompounds could be further promoted as potential lead molecules. PMID:26527851

  15. Multiple classifier integration for the prediction of protein structural classes.

    PubMed

    Chen, Lei; Lu, Lin; Feng, Kairui; Li, Wenjin; Song, Jie; Zheng, Lulu; Yuan, Youlang; Zeng, Zhenbin; Feng, Kaiyan; Lu, Wencong; Cai, Yudong

    2009-11-15

    Supervised classifiers, such as artificial neural networks, partition trees, and support vector machines, are often used for the prediction and analysis of biological data. However, choosing an appropriate classifier is not straightforward because each classifier has its own strengths and weaknesses, and each biological dataset has its own characteristics. By integrating many classifiers together, people can avoid the dilemma of choosing an individual classifier out of many to achieve optimized classification results (Rahman et al., Multiple Classifier Combination for Character Recognition: Revisiting the Majority Voting System and Its Variation, Springer, Berlin, 2002, 167-178). The classification algorithms come from Weka (Witten and Frank, Data Mining: Practical Machine Learning Tools and Techniques, Morgan Kaufmann, San Francisco, 2005) (a collection of software tools for machine learning algorithms). By integrating many predictors (classifiers) together through simple voting, the correct prediction (classification) rates are 65.21% and 65.63% for a basic training dataset and an independent test set, respectively. These results are better than any single machine learning algorithm collected in Weka when exactly the same data are used. Furthermore, we introduce an integration strategy which takes care of both classifier weightings and classifier redundancy. A feature selection strategy, called minimum redundancy maximum relevance (mRMR), is transferred into algorithm selection to deal with classifier redundancy in this research, and the weightings are based on the performance of each classifier. The best classification results are obtained when 11 algorithms are selected by the mRMR method and integrated together through majority votes with weightings. As a result, the correct prediction rates are 68.56% and 69.29% for the basic training dataset and the independent test dataset, respectively. The web-server is available at http://chemdata.shu.edu.cn/protein_st/. PMID:19274708
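    The integration step described in this abstract, simple voting and then voting with performance-based weightings, can be illustrated with a short sketch. This is a generic weighted majority vote, not the authors' implementation. The class labels and weights are hypothetical, and the mRMR-based selection of which classifiers take part is not shown.

```python
from collections import defaultdict


def weighted_majority_vote(predictions, weights=None):
    """Return the label with the largest total weight.

    predictions: one predicted label per classifier.
    weights: one non-negative weight per classifier (equal weights if omitted).
    Ties go to the tied label that was seen first.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    if weights is None:
        weights = [1.0] * len(predictions)
    if len(weights) != len(predictions):
        raise ValueError("need exactly one weight per prediction")
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)


# Three hypothetical classifiers voting on one protein.
votes = ["alpha", "beta", "beta"]
simple = weighted_majority_vote(votes)

# Weighting each classifier by a hypothetical accuracy-derived score
# lets one strong classifier outvote two weak ones (0.9 versus 0.6).
weighted = weighted_majority_vote(votes, [0.9, 0.3, 0.3])
```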

  16. Multidrug resistance ABC transporter structure predictions by homology modeling approaches.

    PubMed

    Honorat, Mylène; Falson, Pierre; Terreux, Raphael; Di Pietro, Attilio; Dumontet, Charles; Payen, Léa

    2011-03-01

    Human multidrug resistance ABC transporters are ubiquitous membrane proteins responsible for the efflux of multiple, endogenous or exogenous, compounds out of the cells, and therefore they are involved in the multidrug resistance (MDR) phenotype. They thus deeply impact the pharmacokinetic parameters and toxicity properties of drugs. There is great pressure to develop inhibitors of these pumps, by either ligand-based drug design or (more ideally) structure-based drug design. To that end, many biochemical studies have been carried out to characterize their transport functions, and much effort has been spent to obtain high-resolution structures. Currently, besides the 3D structures of the bacterial ABC transporters Sav1866 and MsbA, only the complete mouse ABCB1 structure has been published at high resolution, illustrating the tremendous difficulty in getting such information, given that the human genome contains 48 ABC transporter-encoding genes. Homology modeling is consequently a reasonable approach to overcome this obstacle. The present review describes, in the first part, the different approaches which have been published to set up human ABC pump 3D homology models allowing the localization of binding sites for drug candidates, and the identification of critical residues therein. In the second part, the review proposes a more accurate strategy and practical keys to use such biological tools for initiating structure-based drug design. PMID:21470105

  17. RNA-Puzzles Round II: assessment of RNA structure prediction programs applied to three large RNA structures.

    PubMed

    Miao, Zhichao; Adamiak, Ryszard W; Blanchet, Marc-Frédérick; Boniecki, Michal; Bujnicki, Janusz M; Chen, Shi-Jie; Cheng, Clarence; Chojnowski, Grzegorz; Chou, Fang-Chieh; Cordero, Pablo; Cruz, José Almeida; Ferré-D'Amaré, Adrian R; Das, Rhiju; Ding, Feng; Dokholyan, Nikolay V; Dunin-Horkawicz, Stanislaw; Kladwang, Wipapat; Krokhotin, Andrey; Lach, Grzegorz; Magnus, Marcin; Major, François; Mann, Thomas H; Masquida, Benoît; Matelska, Dorota; Meyer, Mélanie; Peselis, Alla; Popenda, Mariusz; Purzycka, Katarzyna J; Serganov, Alexander; Stasiewicz, Juliusz; Szachniuk, Marta; Tandon, Arpit; Tian, Siqi; Wang, Jian; Xiao, Yi; Xu, Xiaojun; Zhang, Jinwei; Zhao, Peinan; Zok, Tomasz; Westhof, Eric

    2015-06-01

    This paper is a report of a second round of RNA-Puzzles, a collective and blind experiment in three-dimensional (3D) RNA structure prediction. Three puzzles, Puzzles 5, 6, and 10, represented sequences of three large RNA structures with limited or no homology with previously solved RNA molecules. A lariat-capping ribozyme, as well as riboswitches complexed to adenosylcobalamin and tRNA, were predicted by seven groups using RNAComposer, ModeRNA/SimRNA, Vfold, Rosetta, DMD, MC-Fold, 3dRNA, and AMBER refinement. Some groups derived models using data from state-of-the-art chemical-mapping methods (SHAPE, DMS, CMCT, and mutate-and-map). The comparisons between the predictions and the three subsequently released crystallographic structures, solved at diffraction resolutions of 2.5-3.2 Å, were carried out automatically using various sets of quality indicators. The comparisons clearly demonstrate the state of present-day de novo prediction abilities as well as the limitations of these state-of-the-art methods. All of the best prediction models have similar topologies to the native structures, which suggests that computational methods for RNA structure prediction can already provide useful structural information for biological problems. However, the prediction accuracy for non-Watson-Crick interactions, key to proper folding of RNAs, is low and some predicted models had high Clash Scores. These two difficulties point to some of the continuing bottlenecks in RNA structure prediction. All submitted models are available for download at http://ahsoka.u-strasbg.fr/rnapuzzles/. PMID:25883046

  18. RNA-Puzzles Round II: assessment of RNA structure prediction programs applied to three large RNA structures

    PubMed Central

    Miao, Zhichao; Adamiak, Ryszard W.; Blanchet, Marc-Frédérick; Boniecki, Michal; Bujnicki, Janusz M.; Chen, Shi-Jie; Cheng, Clarence; Chojnowski, Grzegorz; Chou, Fang-Chieh; Cordero, Pablo; Cruz, José Almeida; Ferré-D'Amaré, Adrian R.; Das, Rhiju; Ding, Feng; Dokholyan, Nikolay V.; Dunin-Horkawicz, Stanislaw; Kladwang, Wipapat; Krokhotin, Andrey; Lach, Grzegorz; Magnus, Marcin; Major, François; Mann, Thomas H.; Masquida, Benoît; Matelska, Dorota; Meyer, Mélanie; Peselis, Alla; Popenda, Mariusz; Purzycka, Katarzyna J.; Serganov, Alexander; Stasiewicz, Juliusz; Szachniuk, Marta; Tandon, Arpit; Tian, Siqi; Wang, Jian; Xiao, Yi; Xu, Xiaojun; Zhang, Jinwei; Zhao, Peinan; Zok, Tomasz; Westhof, Eric

    2015-01-01

    This paper is a report of a second round of RNA-Puzzles, a collective and blind experiment in three-dimensional (3D) RNA structure prediction. Three puzzles, Puzzles 5, 6, and 10, represented sequences of three large RNA structures with limited or no homology with previously solved RNA molecules. A lariat-capping ribozyme, as well as riboswitches complexed to adenosylcobalamin and tRNA, were predicted by seven groups using RNAComposer, ModeRNA/SimRNA, Vfold, Rosetta, DMD, MC-Fold, 3dRNA, and AMBER refinement. Some groups derived models using data from state-of-the-art chemical-mapping methods (SHAPE, DMS, CMCT, and mutate-and-map). The comparisons between the predictions and the three subsequently released crystallographic structures, solved at diffraction resolutions of 2.5–3.2 Å, were carried out automatically using various sets of quality indicators. The comparisons clearly demonstrate the state of present-day de novo prediction abilities as well as the limitations of these state-of-the-art methods. All of the best prediction models have similar topologies to the native structures, which suggests that computational methods for RNA structure prediction can already provide useful structural information for biological problems. However, the prediction accuracy for non-Watson–Crick interactions, key to proper folding of RNAs, is low and some predicted models had high Clash Scores. These two difficulties point to some of the continuing bottlenecks in RNA structure prediction. All submitted models are available for download at http://ahsoka.u-strasbg.fr/rnapuzzles/. PMID:25883046

  19. An adaptive genetic algorithm for crystal structure prediction

    SciTech Connect

    Wu, Shunqing; Ji, Min; Wang, Cai-Zhuang; Nguyen, Manh Cuong; Zhao, Xin; Umemoto, K.; Wentzcovitch, R. M.; Ho, Kai-Ming

    2013-10-31

    We present a genetic algorithm (GA) for structural search that combines the speed of structure exploration by classical potentials with the accuracy of density functional theory (DFT) calculations in an adaptive and iterative way. This strategy increases the efficiency of the DFT-based GA by several orders of magnitude. This gain allows a considerable increase in the size and complexity of systems that can be studied by first principles. The performance of the method is illustrated by successful structure identifications of complex binary and ternary intermetallic compounds with 36 and 54 atoms per cell, respectively. The discovery of a multi-TPa Mg-silicate phase with unit cell containing up to 56 atoms is also reported. Such a phase is likely to be an essential component of terrestrial exoplanetary mantles.

  20. An adaptive genetic algorithm for crystal structure prediction.

    PubMed

    Wu, S Q; Ji, M; Wang, C Z; Nguyen, M C; Zhao, X; Umemoto, K; Wentzcovitch, R M; Ho, K M

    2014-01-22

    We present a genetic algorithm (GA) for structural search that combines the speed of structure exploration by classical potentials with the accuracy of density functional theory (DFT) calculations in an adaptive and iterative way. This strategy increases the efficiency of the DFT-based GA by several orders of magnitude. This gain allows a considerable increase in the size and complexity of systems that can be studied by first principles. The performance of the method is illustrated by successful structure identifications of complex binary and ternary intermetallic compounds with 36 and 54 atoms per cell, respectively. The discovery of a multi-TPa Mg-silicate phase with unit cell containing up to 56 atoms is also reported. Such a phase is likely to be an essential component of terrestrial exoplanetary mantles. PMID:24351274

  1. Prediction of protein structural class using novel evolutionary collocation-based sequence representation.

    PubMed

    Chen, Ke; Kurgan, Lukasz A; Ruan, Jishou

    2008-07-30

    Knowledge of structural classes is useful in understanding folding patterns in proteins. Although existing structural class prediction methods applied virtually all state-of-the-art classifiers, many of them use a relatively simple protein sequence representation that often includes amino acid (AA) composition. To this end, we propose a novel sequence representation that incorporates evolutionary information encoded using PSI-BLAST profile-based collocation of AA pairs. We used six benchmark datasets and five representative classifiers to quantify and compare the quality of the structural class prediction with the proposed representation. The best classifier, a support vector machine, achieved 61-96% accuracy on the six datasets. These predictions were comprehensively compared with a wide range of recently proposed methods for prediction of structural classes. Our comprehensive comparison shows superiority of the proposed representation, which results in error rate reductions that range between 14% and 26% when compared with predictions of the best-performing, previously published classifiers on the considered datasets. The study also shows that, for the benchmark datasets that include sequences characterized by low identity (i.e., 25%, 30%, and 40%), the prediction accuracies are 20-35% lower than for the other three datasets that include sequences with a higher degree of similarity. In conclusion, the proposed representation is shown to substantially improve the accuracy of the structural class prediction. A web server that implements the presented prediction method is freely available at http://biomine.ece.ualberta.ca/Structural_Class/SCEC.html. PMID:18293306

  2. HYPLOSP: a knowledge-based approach to protein local structure prediction.

    PubMed

    Chen, Ching-Tai; Lin, Hsin-Nan; Sung, Ting-Yi; Hsu, Wen-Lian

    2006-12-01

    Local structure prediction can facilitate ab initio structure prediction, protein threading, and remote homology detection. However, the accuracy of existing methods is limited. In this paper, we propose a knowledge-based prediction method that assigns a measure called the local match rate to each position of an amino acid sequence to estimate the confidence of our method. Empirically, the accuracy of the method correlates positively with the local match rate; therefore, we employ it to predict the local structures of positions with a high local match rate. For positions with a low local match rate, we propose a neural network prediction method. To better utilize the knowledge-based and neural network methods, we design a hybrid prediction method, HYPLOSP (HYbrid method to Protein LOcal Structure Prediction) that combines both methods. To evaluate the performance of the proposed methods, we first perform cross-validation experiments by applying our knowledge-based method, a neural network method, and HYPLOSP to a large dataset of 3,925 protein chains. We test our methods extensively on three different structural alphabets and evaluate their performance by two widely used criteria, Maximum Deviation of backbone torsion Angle (MDA) and Q(N), which is similar to Q(3) in secondary structure prediction. We then compare HYPLOSP with three previous studies using a dataset of 56 new protein chains. HYPLOSP shows promising results in terms of MDA and Q(N) accuracy and demonstrates its alphabet-independent capability. PMID:17245815

  3. Building a better fragment library for de novo protein structure prediction.

    PubMed

    de Oliveira, Saulo H P; Shi, Jiye; Deane, Charlotte M

    2015-01-01

    Fragment-based approaches are the current standard for de novo protein structure prediction. These approaches rely on accurate and reliable fragment libraries to generate good structural models. In this work, we describe a novel method for structure fragment library generation and its application in fragment-based de novo protein structure prediction. The importance of correct testing procedures in assessing the quality of fragment libraries is demonstrated. This includes, in particular, the exclusion of homologs to the target from the libraries to correctly simulate a de novo protein structure prediction scenario, something which surprisingly is not always done. We demonstrate that fragments presenting different predominant predicted secondary structures should be treated differently during the fragment library generation step and that exhaustive and random search strategies should both be used. This information was used to develop a novel method, Flib. On a validation set of 41 structurally diverse proteins, Flib libraries present both a higher precision and coverage than two of the state-of-the-art methods, NNMake and HHFrag. Flib also achieves better precision and coverage on the set of 275 protein domains used in the two previous experiments of the Critical Assessment of Structure Prediction (CASP9 and CASP10). We compared Flib libraries against NNMake libraries in a structure prediction context. Of the 13 cases in which a correct answer was generated, Flib models were more accurate than NNMake models for 10. Flib is available for download at http://www.stats.ox.ac.uk/research/proteins/resources. PMID:25901595

  4. Building a Better Fragment Library for De Novo Protein Structure Prediction

    PubMed Central

    de Oliveira, Saulo H. P.; Shi, Jiye; Deane, Charlotte M.

    2015-01-01

    Fragment-based approaches are the current standard for de novo protein structure prediction. These approaches rely on accurate and reliable fragment libraries to generate good structural models. In this work, we describe a novel method for structure fragment library generation and its application in fragment-based de novo protein structure prediction. The importance of correct testing procedures in assessing the quality of fragment libraries is demonstrated. This includes, in particular, the exclusion of homologs to the target from the libraries to correctly simulate a de novo protein structure prediction scenario, something which surprisingly is not always done. We demonstrate that fragments presenting different predominant predicted secondary structures should be treated differently during the fragment library generation step and that exhaustive and random search strategies should both be used. This information was used to develop a novel method, Flib. On a validation set of 41 structurally diverse proteins, Flib libraries present both a higher precision and coverage than two of the state-of-the-art methods, NNMake and HHFrag. Flib also achieves better precision and coverage on the set of 275 protein domains used in the two previous experiments of the Critical Assessment of Structure Prediction (CASP9 and CASP10). We compared Flib libraries against NNMake libraries in a structure prediction context. Of the 13 cases in which a correct answer was generated, Flib models were more accurate than NNMake models for 10. Flib is available for download at http://www.stats.ox.ac.uk/research/proteins/resources. PMID:25901595

  5. Conformational Transitions upon Ligand Binding: Holo-Structure Prediction from Apo Conformations

    PubMed Central

    Seeliger, Daniel; de Groot, Bert L.

    2010-01-01

    Biological function of proteins is frequently associated with the formation of complexes with small-molecule ligands. Experimental structure determination of such complexes at atomic resolution, however, can be time-consuming and costly. Computational methods for structure prediction of protein/ligand complexes, particularly docking, are as yet restricted by their limited consideration of receptor flexibility, rendering them not applicable for predicting protein/ligand complexes if large conformational changes of the receptor upon ligand binding are involved. Accurate receptor models in the ligand-bound state (holo structures), however, are a prerequisite for successful structure-based drug design. Hence, if only an unbound (apo) structure is available distinct from the ligand-bound conformation, structure-based drug design is severely limited. We present a method to predict the structure of protein/ligand complexes based solely on the apo structure, the ligand and the radius of gyration of the holo structure. The method is applied to ten cases in which proteins undergo structural rearrangements of up to 7.1 Å backbone RMSD upon ligand binding. In all cases, receptor models within 1.6 Å backbone RMSD to the target were predicted and close-to-native ligand binding poses were obtained for 8 of 10 cases in the top-ranked complex models. A protocol is presented that is expected to enable structure modeling of protein/ligand complexes and structure-based drug design for cases where crystal structures of ligand-bound conformations are not available. PMID:20066034
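    The backbone RMSD values quoted in this abstract measure how far two conformations of the same atoms lie from each other. As a minimal sketch of the quantity itself, the function below computes the root-mean-square deviation between two coordinate lists. It assumes the structures have already been superimposed, since no optimal fitting is performed here, and the coordinates shown are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two lists of (x, y, z) points.

    Assumes both lists describe the same atoms in the same order and that
    the structures are already superimposed; no fitting is done.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Hypothetical two-atom example: every atom is displaced by 1 unit along z.
apo = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
holo = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
shift = rmsd(apo, holo)
```

    When comparing real structures, the coordinates would normally be restricted to backbone atoms and fitted onto each other before this calculation.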

  6. LOCUSTRA: accurate prediction of local protein structure using a two-layer support vector machine approach.

    PubMed

    Zimmermann, Olav; Hansmann, Ulrich H E

    2008-09-01

    Constraint generation for 3D structure prediction and structure-based database searches benefit from fine-grained prediction of local structure. In this work, we present LOCUSTRA, a novel scheme for the multiclass prediction of local structure that uses two layers of support vector machines (SVM). Using a 16-letter structural alphabet from de Brevern et al. (Proteins: Struct., Funct., Bioinf. 2000, 41, 271-287), we assess its prediction ability for an independent test set of 222 proteins and compare our method to three-class secondary structure prediction and direct prediction of dihedral angles. The prediction accuracy is Q16=61.0% for the 16 classes of the structural alphabet and Q3=79.2% for a simple mapping to the three secondary classes helix, sheet, and coil. We achieve a mean phi(psi) error of 24.74 degrees (38.35 degrees) and a median RMSDA (root-mean-square deviation of the (dihedral) angles) per protein chain of 52.1 degrees. These results compare favorably with related approaches. The LOCUSTRA web server is freely available to researchers at http://www.fz-juelich.de/nic/cbb/service/service.php. PMID:18763837

  7. On the crystallographic accuracy of structure prediction by implicit water models: Tests for cyclic peptides

    NASA Astrophysics Data System (ADS)

    Goldtzvik, Yonathan; Goldstein, Moshe; Benny Gerber, R.

    2013-03-01

    Five small cyclic peptides and four implicit water models were selected for this study. DEEPSAM, a structure prediction algorithm built upon TINKER, was used. Structures predicted using implicit water models were compared with experimental data, and with predictions calculated in the gas phase. The existence of very accurate X-ray crystallographic data allowed firm and conclusive comparisons between predictions and experiment. The introduction of implicit water models into the calculations improved the RMSD from experiment by about 13% compared with computations neglecting the presence of water. GBSA is shown to be consistently the best implicit water model.

  8. Memoir: template-based structure prediction for membrane proteins.

    PubMed

    Ebejer, Jean-Paul; Hill, Jamie R; Kelm, Sebastian; Shi, Jiye; Deane, Charlotte M

    2013-07-01

    Membrane proteins are estimated to be the targets of 50% of drugs that are currently in development, yet we have few membrane protein crystal structures. As a result, for a membrane protein of interest, the much-needed structural information usually comes from a homology model. Current homology modelling software is optimized for globular proteins, and ignores the constraints that the membrane is known to place on protein structure. Our Memoir server produces homology models using alignment and coordinate generation software that has been designed specifically for transmembrane proteins. Memoir is easy to use, with the only inputs being a structural template and the sequence that is to be modelled. We provide a video tutorial and a guide to assessing model quality. Supporting data aid manual refinement of the models. These data include a set of alternative conformations for each modelled loop, and a multiple sequence alignment that incorporates the query and template. Memoir works with both α-helical and β-barrel types of membrane proteins and is freely available at http://opig.stats.ox.ac.uk/webapps/memoir. PMID:23640332

  9. A Structural Equation Model for Predicting Business Student Performance

    ERIC Educational Resources Information Center

    Pomykalski, James J.; Dion, Paul; Brock, James L.

    2008-01-01

    In this study, the authors developed a structural equation model that accounted for 79% of the variability of a student's final grade point average by using a sample size of 147 students. The model is based on student grades in 4 foundational business courses: introduction to business, macroeconomics, statistics, and using databases. Educators and…

  10. Anatomy of the herpes simplex virus 1 strain F glycoprotein B gene: primary sequence and predicted protein structure of the wild type and of monoclonal antibody-resistant mutants.

    PubMed Central

    Pellett, P E; Kousoulas, K G; Pereira, L; Roizman, B

    1985-01-01

    In this paper we report the nucleotide sequence and predicted amino acid sequence of glycoprotein B of herpes simplex virus 1 strain F and the amino acid substitutions in the domains of the glycoprotein B gene of three mutants selected for resistance to monoclonal antibody H126-5 or H233 but not to both. Analyses of the amino acid sequence with respect to hydropathicity and secondary structure yielded a two-dimensional model of the protein. The model predicts an N-terminal, 29-amino-acid cleavable signal sequence, a 696-amino-acid hydrophilic surface domain containing six potential sites for N-linked glycosylation, a 69-amino-acid hydrophobic domain containing three segments traversing the membrane, and a charged 109-amino-acid domain projecting into the cytoplasm and previously shown to marker rescue glycoprotein B syn mutations. The nucleotide sequence of the mutant glycoprotein B DNA fragments previously shown to marker transfer or rescue the mutations revealed that the amino acid substitutions cluster in the hydrophilic surface domain between amino acids 273 and 305. Analyses of the secondary structure of these regions, coupled with the experimentally derived observation that the H126-5- and H233-antibody cognitive sites do not overlap, indicate the approximate locations of the epitopes of these neutralizing, surface-reacting, and immune-precipitating monoclonal antibodies. The predicted perturbations in the secondary structure introduced by the amino acid substitutions correlate with the extent of loss of reactivity with monoclonal antibodies in various immunoassays. PMID:2981343

  11. Validation of finite element and boundary element methods for predicting structural vibration and radiated noise

    NASA Technical Reports Server (NTRS)

    Seybert, A. F.; Wu, X. F.; Oswald, Fred B.

    1992-01-01

    Analytical and experimental validation of methods to predict structural vibration and radiated noise are presented. A rectangular box excited by a mechanical shaker was used as a vibrating structure. Combined finite element method (FEM) and boundary element method (BEM) models of the apparatus were used to predict the noise radiated from the box. The FEM was used to predict the vibration, and the surface vibration was used as input to the BEM to predict the sound intensity and sound power. Vibration predicted by the FEM model was validated by experimental modal analysis. Noise predicted by the BEM was validated by sound intensity measurements. Three types of results are presented for the total radiated sound power: (1) sound power predicted by the BEM modeling using vibration data measured on the surface of the box; (2) sound power predicted by the FEM/BEM model; and (3) sound power measured by a sound intensity scan. The sound power predicted from the BEM model using measured vibration data yields an excellent prediction of radiated noise. The sound power predicted by the combined FEM/BEM model also gives a good prediction of radiated noise except for a shift of the natural frequencies that is due to limitations in the FEM model.

  12. A Historical Perspective and Overview of Protein Structure Prediction

    NASA Astrophysics Data System (ADS)

    Wooley, John C.; Ye, Yuzhen

    Carrying on many different biological functions, proteins are all composed of one or more polypeptide chains, each containing from several to hundreds or even thousands of the 20 amino acids. During the 1950s at the dawn of modern biochemistry, an essential question for biochemists was to understand the structure and function of these polypeptide chains. The sequences of proteins, also referred to as their primary structures, determine the different chemical properties for different proteins, and thus continue to captivate much of the attention of biochemists. As an early step in characterizing protein chemistry, British biochemist Frederick Sanger designed an experimental method to identify the sequence of insulin (Sanger et al., 1955). He became the first person to obtain the primary structure of a protein and in 1958 won his first Nobel Prize in Chemistry. This important progress in sequencing did not answer the question of whether a single (individual) protein has a distinctive shape in three dimensions (3D), and if so, what factors determine its 3D architecture. However, during the period when Sanger was studying the primary structure of proteins, American biochemist Christian Anfinsen observed that the active polypeptide chain of a model protein, bovine pancreatic ribonuclease (RNase), could fold spontaneously into a unique 3D structure, which was later called the native conformation of the protein (Anfinsen et al., 1954). Anfinsen also studied the refolding of the RNase enzyme and observed that an enzyme unfolded under an extreme chemical environment could refold spontaneously back into its native conformation upon changing the environment back to natural conditions (Anfinsen et al., 1961). By 1962, Anfinsen had developed his theory of protein folding (which was summarized in his 1972 Nobel acceptance speech): "The native conformation is determined by the totality of interatomic interactions and hence, by the amino acid sequence, in a given environment."

  13. Ab initio NMR Confirmed Evolutionary Structure Prediction for Organic Molecular Crystals

    NASA Astrophysics Data System (ADS)

    Pham, Cong-Huy; Kucukbenli, Emine; de Gironcoli, Stefano

    2015-03-01

    Ab initio crystal structure prediction of even small organic compounds is extremely challenging due to polymorphism, molecular flexibility and difficulties in addressing the dispersion interaction from first principles. We recently implemented vdW-aware density functionals and demonstrated their success in energy ordering of amino acid crystals. In this work we combine this development with the evolutionary structure prediction method to study cholesterol polymorphs. Cholesterol crystals have paramount importance in various diseases, from cancer to atherosclerosis. The structures of some polymorphs (e.g. ChM, ChAl, ChAh) have already been resolved while some others, which display distinct NMR spectra and are involved in disease formation, are yet to be determined. Here we thoroughly assess the applicability of evolutionary structure prediction to address such real world problems. We validate the newly predicted structures with ab initio NMR chemical shift data using secondary referencing for an improved comparison with experiments.

  14. The Prediction of Botulinum Toxin Structure Based on in Silico and in Vitro Analysis

    NASA Astrophysics Data System (ADS)

    Suzuki, Tomonori; Miyazaki, Satoru

    2011-01-01

    Many biological systems are mediated through protein-protein interactions. Knowledge of protein-protein complex structure is required for understanding their function. The determination of large and flexible protein-protein complex structures by experimental studies remains difficult, costly and time-consuming; therefore, computational prediction of protein structures by homology modeling and docking studies is a valuable method. In addition, MD simulation is one of the most powerful methods for observing the real dynamics of proteins. Here, we predict the protein-protein complex structure of botulinum toxin to analyze its properties. These bioinformatics methods are useful for reporting the relation between the flexibility of the backbone structure and the activity.

  15. The structure of evaporating and combusting sprays: Measurements and predictions

    NASA Technical Reports Server (NTRS)

    Shuen, J. S.; Solomon, A. S. P.; Faeth, G. M.

    1984-01-01

    An apparatus developed to allow observations of monodisperse sprays consists of a methane-fueled turbulent jet diffusion flame with monodisperse methanol drops injected at the burner exit. Mean and fluctuating-phase velocities, drop sizes, drop-mass fluxes and mean-gas temperatures were measured. Initial drop diameters of 100 and 180 microns are being considered in order to vary drop penetration in the flow and effects of turbulent dispersion. Baseline tests of the burner flame with no drops present were also conducted. Calibration tests, needed to establish methods for predicting drop transport, involve drops supported in the post-flame region of a flat-flame burner operated at various mixture ratios. Spray models which are being evaluated include: (1) locally homogeneous flow (LHF) analysis, (2) deterministic separated flow (DSF) analysis and (3) stochastic separated flow (SSF) analysis.

  16. The structure of evaporating and combusting sprays: Measurements and predictions

    NASA Astrophysics Data System (ADS)

    Shuen, J. S.; Solomon, A. S. P.; Faeth, G. M.

    1984-07-01

    An apparatus developed to allow observations of monodisperse sprays consists of a methane-fueled turbulent jet diffusion flame with monodisperse methanol drops injected at the burner exit. Mean and fluctuating-phase velocities, drop sizes, drop-mass fluxes and mean-gas temperatures were measured. Initial drop diameters of 100 and 180 microns are being considered in order to vary drop penetration in the flow and effects of turbulent dispersion. Baseline tests of the burner flame with no drops present were also conducted. Calibration tests, needed to establish methods for predicting drop transport, involve drops supported in the post-flame region of a flat-flame burner operated at various mixture ratios. Spray models which are being evaluated include: (1) locally homogeneous flow (LHF) analysis, (2) deterministic separated flow (DSF) analysis and (3) stochastic separated flow (SSF) analysis.

  17. The structure of evaporating and combusting sprays: Measurements and predictions

    NASA Technical Reports Server (NTRS)

    Shuen, J. S.; Solomon, A. S. P.; Faeth, G. M.

    1982-01-01

    An apparatus was constructed to provide measurements in open sprays with no zones of recirculation, in order to provide well-defined conditions for use in evaluating spray models. Measurements were completed in a gas jet, in order to test experimental methods, and are currently in progress for nonevaporating sprays. A locally homogeneous flow (LHF) model, where interphase transport rates are assumed to be infinitely fast; a separated flow (SF) model, which allows for finite interphase transport rates but neglects effects of turbulent fluctuations on drop motion; and a stochastic SF model, which considers effects of turbulent fluctuations on drop motion, were evaluated using existing data on particle-laden jets. The LHF model generally overestimates rates of particle dispersion while the SF model underestimates dispersion rates. The stochastic SF model yields satisfactory predictions except at high particle mass loadings where effects of turbulence modulation may have caused the model to overestimate turbulence levels.

  18. Structure of evaporating and combusting sprays: measurements and predictions

    SciTech Connect

    Shuen, J.S.; Solomon, A.S.P.; Faeth, G.M.

    1984-07-01

    An apparatus developed to allow observations of monodisperse sprays consists of a methane-fueled turbulent jet diffusion flame with monodisperse methanol drops injected at the burner exit. Mean and fluctuating-phase velocities, drop sizes, drop-mass fluxes and mean-gas temperatures were measured. Initial drop diameters of 100 and 180 microns are being considered in order to vary drop penetration in the flow and effects of turbulent dispersion. Baseline tests of the burner flame with no drops present were also conducted. Calibration tests, needed to establish methods for predicting drop transport, involve drops supported in the post-flame region of a flat-flame burner operated at various mixture ratios. Spray models which are being evaluated include: (1) locally homogeneous flow (LHF) analysis, (2) deterministic separated flow (DSF) analysis and (3) stochastic separated flow (SSF) analysis.

  19. Practical theories for service life prediction of critical aerospace structural components

    NASA Technical Reports Server (NTRS)

    Ko, William L.; Monaghan, Richard C.; Jackson, Raymond H.

    1992-01-01

    A new second-order theory was developed for predicting the service lives of aerospace structural components. The predictions based on this new theory were compared with those based on the Ko first-order theory and the classical theory of service life predictions. The new theory gives very accurate service life predictions. An equivalent constant-amplitude stress cycle method was proposed for representing the random load spectrum for crack growth calculations. This method predicts the most conservative service life. The proposed use of minimum detectable crack size, instead of proof load established crack size as an initial crack size for crack growth calculations, could give a more realistic service life.

  20. Structural Damage Prediction and Analysis for Hypervelocity Impacts: Handbook

    NASA Technical Reports Server (NTRS)

    Elfer, N. C.

    1996-01-01

    This handbook reviews the analysis of structural damage on spacecraft due to hypervelocity impacts by meteoroid and space debris. These impacts can potentially cause structural damage to a Space Station module wall. This damage ranges from craters, bulges, minor penetrations, and spall to critical damage associated with a large hole, or even rupture. The analysis of damage depends on a variety of assumptions and the area of most concern is at a velocity beyond well controlled laboratory capability. In the analysis of critical damage, one of the key questions is how much momentum can actually be transferred to the pressure vessel wall. When penetration occurs without maximum bulging at high velocity and obliquities (if less momentum is deposited in the rear wall), then large tears and rupture may be avoided. In analysis of rupture effects of cylindrical geometry, biaxial loading, bending of the crack, a central hole strain rate and R-curve effects are discussed.

  1. Structure of Evaporating and Combusting Sprays: Measurements and Predictions

    NASA Technical Reports Server (NTRS)

    Shuen, J. S.; Solomon, A. S. P.; Faeth, G. M.

    1983-01-01

    Complete measurements of the structure of nonevaporating, evaporating and combusting sprays for sufficiently well defined boundary conditions to allow evaluation of models of these processes were obtained. The development of rational design methods for aircraft combustion chambers and other devices involving spray combustion was investigated. Three methods for treating the discrete phase are being considered: a locally homogeneous flow (LHF) model, a deterministic separated flow (DSF) model, and a stochastic separated flow (SSF) model. The main properties of these models are summarized.

  2. Structure Based Predictive Model for Coal Char Combustion

    SciTech Connect

    Robert Hurt; Joseph Calo; Robert Essenhigh; Christopher Hadad

    2000-12-30

    This unique collaborative project has taken a very fundamental look at the origin of the structure and combustion reactivity of coal chars. It was a combined experimental and theoretical effort involving three universities and collaborators from universities outside the U.S. and from U.S. National Laboratories and contract research companies. The project goal was to improve our understanding of char structure and behavior by examining the fundamental chemistry of its polyaromatic building blocks. The project team investigated the elementary oxidative attack on polyaromatic systems, and coupled this with a study of the assembly processes that convert these polyaromatic clusters to mature carbon materials (or chars). We believe that the work done in this project has defined a powerful new science-based approach to the understanding of char behavior. The work on aromatic oxidation pathways made extensive use of computational chemistry, and was led by Professor Christopher Hadad in the Department of Chemistry at Ohio State University. Laboratory experiments on char structure, properties, and combustion reactivity were carried out at both OSU and Brown, led by Principal Investigators Joseph Calo, Robert Essenhigh, and Robert Hurt. Modeling activities were divided into two parts. First, unique models of crystal structure development were formulated by the team at Brown (PIs Hurt and Calo) with input from Boston University and significant collaboration with Dr. Alan Kerstein at Sandia and with Dr. Zhong-Ying Chen at SAIC. Second, new combustion models were developed and tested, led by Professor Essenhigh at OSU, Dieter Foertsch (a collaborator at the University of Stuttgart), and Professor Hurt at Brown. One product of this work is the CBK8 model of carbon burnout, which has already found practical use in CFD codes and in other numerical models of pulverized fuel combustion processes, such as EPRI's NOxLOI Predictor.
    The remainder of the report consists of detailed technical discussion organized into chapters whose organization is dictated by the nature of the research performed. Chapter 2 is entitled 'Experimental Work on Char Structure, Properties, and Reactivity', and focuses on fundamental structural studies at Brown using both phenol-formaldehyde resin chars as model carbons and real coal chars. This work includes the first known in situ high resolution TEM studies of carbonization processes, and some intriguing work on 'memory loss', a form of interaction between annealing and oxidation phenomena in chars. Chapter 3, entitled 'Computational Chemistry of Aromatic Oxidation Pathways', presents in detail the OSU work targeted at understanding the elementary molecular pathways of aromatic oxidation. Chapter 4 describes the 'Mesoscale Structural Models', using a combination of thermodynamic (equilibrium) approaches based on liquid crystal theory and kinetic simulations accounting for the effects of limited layer mobility in many fossil fuel derived carbons containing cross-linking agents. Chapter 5, entitled 'Combustion Modeling', presents work on extinction in the late stages of combustion and the development and features of the CBK8 model.

  3. Genomic-scale comparison of sequence- and structure-based methods of function prediction: Does structure provide additional insight?

    PubMed Central

    Fetrow, Jacquelyn S.; Siew, Naomi; Di Gennaro, Jeannine A.; Martinez-Yamout, Maria; Dyson, H. Jane; Skolnick, Jeffrey

    2001-01-01

    A function annotation method using the sequence-to-structure-to-function paradigm is applied to the identification of all disulfide oxidoreductases in the Saccharomyces cerevisiae genome. The method identifies 27 sequences as potential disulfide oxidoreductases. All previously known thioredoxins, glutaredoxins, and disulfide isomerases are correctly identified. Three of the 27 predictions are probable false-positives. Three novel predictions, which subsequently have been experimentally validated, are presented. Two additional novel predictions suggest a disulfide oxidoreductase regulatory mechanism for two subunits (OST3 and OST6) of the yeast oligosaccharyltransferase complex. Based on homology, this prediction can be extended to a potential tumor suppressor gene, N33, in humans, whose biochemical function was not previously known. Attempts to obtain a folded, active N33 construct to test the prediction were unsuccessful. The results show that structure prediction coupled with biochemically relevant structural motifs is a powerful method for the function annotation of genome sequences and can provide more detailed, robust predictions than function prediction methods that rely on sequence comparison alone. PMID:11316881

  4. STRUCTURE BASED PREDICTIVE MODEL FOR COAL CHAR COMBUSTION

    SciTech Connect

    Robert Hurt; Joseph Calo; Robert Essenhigh; Christopher Hadad

    2001-06-15

    This report is part of the ongoing effort at Brown University and Ohio State University to develop structure based models of coal combustion. A very fundamental approach is taken to the description of coal chars and their reaction processes, and the results are therefore expected to have broad applicability to the spectrum of carbon materials of interest in energy technologies. This quarter, the project was in a period of no-cost extension and discussions were held about the end phase of the project and possible continuations. The technical tasks were essentially dormant this period, but presentations of results were made, and plans were formulated for renewed activity in the fiscal year 2001.

  5. Thermodynamic and phylogenetic prediction of RNA secondary structures in the coding region of hepatitis C virus.

    PubMed Central

    Tuplin, Andrew; Wood, Jonny; Evans, David J; Patel, Arvind H; Simmonds, Peter

    2002-01-01

    The existence and functional importance of RNA secondary structure in the replication of positive-stranded RNA viruses is increasingly recognized. We applied several computational methods to detect RNA secondary structure in the coding region of hepatitis C virus (HCV), including thermodynamic prediction, calculation of free energy on folding, and a newly developed method to scan sequences for covariant sites and associated secondary structures using a parsimony-based algorithm. Each of the prediction methods provided evidence for complex RNA folding in the core- and NS5B-encoding regions of the genome. The positioning of covariant sites and associated predicted stem-loop structures coincided with thermodynamic predictions of RNA base pairing, and localized precisely in parts of the genome with marked suppression of variability at synonymous sites. Combined, there was evidence for a total of six evolutionarily conserved stem-loop structures in the NS5B-encoding region and two in the core gene. The virus most closely related to HCV, GB virus-B (GBV-B) also showed evidence for similar internal base pairing in its coding region, although predictions of secondary structures were limited by the absence of comparative sequence data for this virus. While the role(s) of stem-loops in the coding region of HCV and GBV-B are currently unknown, the structure predictions in this study could provide the starting point for functional investigations using recently developed self-replicating clones of HCV. PMID:12088154

  6. A new approach to assess and predict the functional roles of proteins across all known structures.

    PubMed

    Julfayev, Elchin S; McLaughlin, Ryan J; Tao, Yi-Ping; McLaughlin, William A

    2011-03-01

    The three dimensional atomic structures of proteins provide information regarding their function; and codified relationships between structure and function enable the assessment of function from structure. In the current study, a new data mining tool was implemented that checks current gene ontology (GO) annotations and predicts new ones across all the protein structures available in the Protein Data Bank (PDB). The tool overcomes some of the challenges of utilizing large amounts of protein annotation and measurement information to form correspondences between protein structure and function. Protein attributes were extracted from the Structural Biology Knowledgebase and open source biological databases. Based on the presence or absence of a given set of attributes, a given protein's functional annotations were inferred. The results show that attributes derived from the three dimensional structures of proteins enhanced predictions over those using attributes derived only from primary amino acid sequence. Some predictions reflected known but not completely documented GO annotations. For example, predictions for the GO term for copper ion binding reflected information that a copper ion was known to interact with the protein, based on a ligand interaction database. Other predictions were novel and require further experimental validation. These include predictions for proteins labeled as unknown function in the PDB. Two examples are a role in the regulation of transcription for the protein AF1396 from Archaeoglobus fulgidus and a role in RNA metabolism for the protein psuG from Thermotoga maritima. PMID:21445639

  7. Extended Aging Theories for Predictions of Safe Operational Life of Critical Airborne Structural Components

    NASA Technical Reports Server (NTRS)

    Ko, William L.; Chen, Tony

    2006-01-01

    The previously developed Ko closed-form aging theory has been reformulated into a more compact mathematical form for easier application. A new equivalent loading theory and empirical loading theories have also been developed and incorporated into the revised Ko aging theory for the prediction of a safe operational life of airborne failure-critical structural components. The new set of aging and loading theories was applied to predict the safe number of flights for the B-52B aircraft to carry a launch vehicle, the structural life of critical components consumed by load excursion to proof load value, and the ground-sitting life of B-52B pylon failure-critical structural components. A special life prediction method was developed for the preflight predictions of operational life of failure-critical structural components of the B-52H pylon system, for which no flight data are available.

  8. Predicting Target DNA Sequences of DNA-Binding Proteins Based on Unbound Structures

    PubMed Central

    Chen, Chien-Yu; Chien, Ting-Ying; Lin, Chih-Kang; Lin, Chih-Wei; Weng, Yi-Zhong; Chang, Darby Tien-Hao

    2012-01-01

    DNA-binding proteins such as transcription factors use DNA-binding domains (DBDs) to bind to specific sequences in the genome to initiate many important biological functions. Accurate prediction of such target sequences, often represented by position weight matrices (PWMs), is an important step to understand many biological processes. Recent studies have shown that knowledge-based potential functions can be applied on protein-DNA co-crystallized structures to generate PWMs that are considerably consistent with experimental data. However, this success has not been extended to DNA-binding proteins lacking co-crystallized structures. This study aims at investigating the possibility of predicting the DNA sequences bound by DNA-binding proteins from the proteins' unbound structures (structures of the unbound state). Given an unbound query protein and a template complex, the proposed method first employs structure alignment to generate synthetic protein-DNA complexes for the query protein. Once a complex is available, an atomic-level knowledge-based potential function is employed to predict PWMs characterizing the sequences to which the query protein can bind. The evaluation of the proposed method is based on seven DNA-binding proteins, which have structures of both DNA-bound and unbound forms for prediction as well as annotated PWMs for validation. Since this work is the first attempt to predict target sequences of DNA-binding proteins from their unbound structures, three types of structural variations that presumably influence the prediction accuracy were examined and discussed. Based on the analyses conducted in this study, the conformational change of proteins upon binding DNA was shown to be the key factor. 
    This study sheds light on the challenge of predicting the target DNA sequences of a protein lacking co-crystallized structures, which encourages more efforts on the structure alignment-based approaches in addition to docking- and homology modeling-based approaches for generating synthetic complexes. PMID:22312425
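
    For readers unfamiliar with the PWM representation mentioned in this abstract, the following is a minimal sketch of what a position weight matrix is and how it scores a candidate site. It builds the matrix from a handful of aligned example sites by counting, which is the textbook construction and not the knowledge-based potential method the study describes; the function names, the pseudocount, and the uniform background frequency are all assumptions made for illustration.

```python
import math

BASES = "ACGT"

def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds position weight matrix from aligned DNA sites.

    Assumptions: all sites have the same length, contain only A/C/G/T,
    and the background base frequency is uniform.
    """
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    pwm = []
    for pos in range(length):
        # Start every base at the pseudocount so unseen bases get a finite score.
        counts = {base: pseudocount for base in BASES}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append({b: math.log2((counts[b] / total) / background) for b in BASES})
    return pwm

def score_site(pwm, sequence):
    """Sum the per-position log-odds scores of a candidate sequence."""
    if len(sequence) != len(pwm):
        raise ValueError("sequence length must match the PWM length")
    return sum(column[base] for column, base in zip(pwm, sequence))
```

    With hypothetical sites such as `["ACGT", "ACGA", "ACGT"]`, a sequence matching the consensus scores higher than an unrelated one, which is the sense in which a PWM characterizes the sequences a protein can bind.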

  9. Combining Evolutionary Information and an Iterative Sampling Strategy for Accurate Protein Structure Prediction

    PubMed Central

    Braun, Tatjana; Koehler Leman, Julia; Lange, Oliver F.

    2015-01-01

    Recent work has shown that the accuracy of ab initio structure prediction can be significantly improved by integrating evolutionary information in the form of intra-protein residue-residue contacts. Following this seminal result, much effort is put into the improvement of contact predictions. However, there is also a substantial need to develop structure prediction protocols tailored to the type of restraints gained by contact predictions. Here, we present a structure prediction protocol that combines evolutionary information with the resolution-adapted structural recombination approach of Rosetta, called RASREC. Compared to the classic Rosetta ab initio protocol, RASREC achieves improved sampling, better convergence and higher robustness against incorrect distance restraints, making it the ideal sampling strategy for the stated problem. To demonstrate the accuracy of our protocol, we tested the approach on a diverse set of 28 globular proteins. Our method is able to converge for 26 out of the 28 targets and improves the average TM-score of the entire benchmark set from 0.55 to 0.72 when compared to the top ranked models obtained by the EVFold web server using identical contact predictions. Using a smaller benchmark, we furthermore show that the prediction accuracy of our method is only slightly reduced when the contact prediction accuracy is comparatively low. This observation is of special interest for protein sequences that only have a limited number of homologs. PMID:26713437
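
    The TM-score values quoted in this abstract can be made concrete with a short sketch of the standard formula, TM = (1/L) Σ 1 / (1 + (d_i/d0)^2) with d0 = 1.24 (L − 15)^(1/3) − 1.8, where L is the target length and d_i the distance between the i-th pair of aligned residues. The full TM-score is the maximum of this sum over all superpositions; the sketch below assumes a single fixed superposition whose distances are supplied by the caller, and it rejects short targets (L ≤ 21) as a simplification. The function names are hypothetical.

```python
def tm_d0(target_length):
    """Length-dependent distance scale d0 (in angstroms) of the TM-score."""
    if target_length <= 21:
        # Simplification of this sketch: very short chains are not handled.
        raise ValueError("this sketch only handles target_length > 21")
    return 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8

def tm_score_fixed_superposition(distances, target_length):
    """TM-score sum for one given superposition.

    distances: per-residue distances (angstroms) for the aligned pairs.
    Unaligned target residues simply contribute nothing to the sum.
    """
    d0 = tm_d0(target_length)
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances)
    return total / target_length
```

    Because the sum is normalized by the target length, a model that places only half of the residues perfectly scores 0.5, which helps in reading the reported improvement from 0.55 to 0.72.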

  10. Family Structure versus Family Relationships for Predicting to Substance Use/Abuse and Illegal Behavior.

    ERIC Educational Resources Information Center

    Friedman, Alfred S.; Terras, Arlene; Glassman, Kimberly

    2000-01-01

    Study looked at sample of African-American adolescent males to determine the degree to which family structure (e.g., single parent vs. two-parent families) vs. the nature of the family relationships predict sons' involvement in substance use/abuse and illegal behavior. Of 33 relationships measures analyzed, 3 predicted the degree of recent…

  11. Web applet for predicting structure and thermodynamics of complex fluids

    NASA Astrophysics Data System (ADS)

    Popp, Theodore R.; Hollingshead, Kyle B.; Truskett, Thomas M.

    2015-03-01

    Based on a recently introduced analytical strategy [Hollingshead et al., J. Chem. Phys. 139, 161102 (2013)], we present a web applet that can quickly and semi-quantitatively estimate the equilibrium radial distribution function and related thermodynamic properties of a fluid from knowledge of its pair interaction. We describe the applet's features and present two (of many possible) examples of how it can be used to illustrate concepts of interest for introductory statistical mechanics courses: the transition from ideal gas-like behavior to correlated-liquid behavior with increasing density, and the tradeoff between dominant length scales with changing temperature in a system with ramp-shaped repulsions. The latter type of interaction qualitatively captures distinctive thermodynamic properties of liquid water, because its energetic bias toward locally open structures mimics that of water's hydrogen-bond network.
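
    As a point of reference for the ideal gas-like limit mentioned in this abstract, the sketch below evaluates the textbook low-density approximation g(r) ≈ exp(−u(r)/kT) for a ramp-shaped repulsion. This is not the analytical strategy of Hollingshead et al. that the applet implements; it only holds when the density is low enough that correlations beyond the pair level are negligible. The specific ramp form (hard core plus linear ramp), its parameters, and the function names are assumptions chosen for illustration, with energies measured in units where the Boltzmann constant is 1.

```python
import math

def ramp_potential(r, core=1.0, ramp_end=2.0, height=1.0):
    """Hypothetical ramp-shaped pair repulsion.

    Infinite inside the hard core, decreasing linearly from `height`
    at r = core to zero at r = ramp_end, and zero beyond.
    """
    if r < core:
        return math.inf
    if r < ramp_end:
        return height * (ramp_end - r) / (ramp_end - core)
    return 0.0

def dilute_rdf(r, temperature, potential=ramp_potential):
    """Low-density limit of the radial distribution function, exp(-u/T)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # math.exp(-inf) evaluates to 0.0, so the hard core is excluded.
    return math.exp(-potential(r) / temperature)
```

    In this limit, raising the temperature pushes g(r) on the ramp toward 1, so the ramp length scale matters less and the hard core dominates, a dilute-gas analogue of the tradeoff between length scales that the abstract describes.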

  12. The structure of evaporating and combusting sprays: Measurements and predictions

    NASA Technical Reports Server (NTRS)

    Shuen, J. S.; Solomon, A. S. P.; Faeth, F. M.

    1983-01-01

    The structure of particle-laden jets and nonevaporating and evaporating sprays was measured in order to evaluate models of these processes. Three models are being evaluated: (1) a locally homogeneous flow model, where slip between the phases is neglected and the flow is assumed to be in local thermodynamic equilibrium; (2) a deterministic separated flow model, where slip and finite interphase transport rates are considered but effects of particle/drop dispersion by turbulence and effects of turbulence on interphase transport rates are ignored; and (3) a stochastic separated flow model, where effects of interphase slip, turbulent dispersion and turbulent fluctuations are considered using random sampling for turbulence properties in conjunction with random-walk computations for particle motion. All three models use a k-e-g turbulence model. All testing and data reduction are completed for the particle laden jets. Mean and fluctuating velocities of the continuous phase and mean mixture fraction were measured in the evaporating sprays.

  13. Geometric programming prediction of design trends for OMV protective structures

    NASA Technical Reports Server (NTRS)

    Mog, R. A.; Horn, J. R.

    1990-01-01

    The global optimization trends of protective honeycomb structural designs for spacecraft subject to hypervelocity meteoroid and space debris impact are presented. This nonlinear problem is first formulated for weight minimization of the orbital maneuvering vehicle (OMV) using a generic monomial predictor. Five problem formulations are considered, each dependent on the selection of independent design variables. Each case is optimized by considering the dual geometric programming problem. The dual variables are solved for in terms of the generic estimated exponents of the monomial predictor. The primal variables are then solved for by conversion. Finally, parametric design trends are developed for ranges of the estimated regression parameters. Results specify nonmonotonic relationships for the optimal first and second sheet mass per unit areas in terms of the estimated exponents.

  14. STRUCTURE-BASED PREDICTIVE MODEL FOR COAL CHAR COMBUSTION

    SciTech Connect

    Robert H. Hurt; Eric M. Suuberg

    2000-05-03

    This report is part of the ongoing effort at Brown University and Ohio State University to develop structure-based models of coal combustion. A very fundamental approach is taken to the description of coal chars and their reaction processes, and the results are therefore expected to have broad applicability to the spectrum of carbon materials of interest in energy technologies. This quarter, our work on structure development in carbons continued. A combination of hot stage in situ and ex situ polarized light microscopy was used to identify the preferred orientation of graphene layers at gas interfaces in pitches used as carbon material precursors. The experiments show that edge-on orientation is the equilibrium state of the gas/pitch interface, implying that basal-rich surfaces have higher free energies than edge-rich surfaces in pitch. This result is in agreement with previous molecular modeling studies and TEM observations in the early stages of carbonization. The results may have important implications for the design of tailored carbons with edge-rich or basal-rich surfaces. In the computational chemistry task, we have continued our investigations into the reactivity of large aromatic rings. The role of H-atom abstraction as well as radical addition to monocyclic aromatic rings has been examined, and a manuscript is currently being revised after peer review. We have also shown that OH radical is more effective than H atom in the radical addition process with monocyclic rings. We have extended this analysis to H-atom and OH-radical addition to phenanthrene. Work on combustion kinetics focused on the theoretical analysis of the data previously gathered using thermogravimetric analysis.

  15. Link prediction based on hyperbolic mapping with community structure for complex networks

    NASA Astrophysics Data System (ADS)

    Wang, Zuxi; Wu, Yao; Li, Qingguang; Jin, Fengdong; Xiong, Wei

    2016-05-01

    Link prediction has become a topic of growing interest in the complex network field in recent years. However, the existing link prediction methods are unsatisfactory for processing topological information and have high time complexity. This paper presents a novel method of Link Prediction with Community Structure (LPCS) based on hyperbolic mapping. Unlike the existing link prediction methods, LPCS treats the network from an overall perspective in order to utilize its global structure information. LPCS takes full advantage of the community structure and its hierarchical organization to map networks into hyperbolic space, obtains the hyperbolic coordinates that depict the global structure information of the network, uses hyperbolic distance to describe the similarity between nodes, and finally predicts missing links according to the degree of similarity between unconnected node pairs. The combination of the hyperbolic geometry framework and the community structure makes LPCS perform well in predicting missing links, and the time complexity of LPCS is linear, so LPCS can be applied to large scale networks in acceptable time. LPCS outperforms many state-of-the-art link prediction methods in networks obeying a power-law degree distribution.
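
    The ranking step described in the abstract above (score unconnected node pairs by hyperbolic distance, smaller distance meaning higher similarity) can be sketched as follows. The abstract does not state which model of hyperbolic space or which coordinates LPCS uses, so this sketch assumes polar coordinates (r, theta) in the hyperbolic plane of curvature -1 and the hyperbolic law of cosines. The function names and the toy coordinates are hypothetical, and the community-based mapping that produces the coordinates is not reproduced here.

```python
import math
from itertools import combinations


def hyperbolic_distance(r1, theta1, r2, theta2):
    """Distance between two points given in polar coordinates (curvature -1)."""
    # Angular separation folded into [0, pi].
    dtheta = math.pi - abs(math.pi - (abs(theta1 - theta2) % (2 * math.pi)))
    arg = (math.cosh(r1) * math.cosh(r2)
           - math.sinh(r1) * math.sinh(r2) * math.cos(dtheta))
    # Guard against rounding pushing the argument slightly below 1.
    return math.acosh(max(arg, 1.0))


def rank_missing_links(coords, edges):
    """Return unconnected pairs as (distance, u, v), closest pairs first.

    coords: dict mapping node -> (r, theta); assumed to be given.
    edges: iterable of (u, v) pairs already present in the network.
    """
    existing = {frozenset(e) for e in edges}
    candidates = []
    for u, v in combinations(sorted(coords), 2):
        if frozenset((u, v)) not in existing:
            candidates.append((hyperbolic_distance(*coords[u], *coords[v]), u, v))
    return sorted(candidates)


if __name__ == "__main__":
    # Toy coordinates for illustration only.
    coords = {"a": (1.0, 0.0), "b": (1.0, 0.1), "c": (1.0, 3.0)}
    for dist, u, v in rank_missing_links(coords, [("a", "c")]):
        print(u, v, round(dist, 3))
```

    Note that this naive ranking enumerates all node pairs and is therefore quadratic in the number of nodes; it illustrates only the scoring rule, not the linear-time behaviour claimed for LPCS.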

  16. Handling context-sensitivity in protein structures using graph theory: bona fide prediction.

    PubMed

    Samudrala, R; Moult, J

    1997-01-01

    We constructed five comparative models in a blind manner for the second meeting on the Critical Assessment of protein Structure Prediction methods (CASP2). The method used is based on a novel graph-theoretic clique-finding approach, and attempts to address the problem of interconnected structural changes in the comparative modeling of protein structures. We discuss briefly how the method is used for protein structure prediction, and detail how it performs in the blind tests. We find that compared to CASP1, significant improvements in building insertions and deletions and sidechain conformations have been achieved. PMID:9485494

  17. Predicting multi-wall structural response to hypervelocity impact using the hull code

    NASA Technical Reports Server (NTRS)

    Schonberg, William P.

    1993-01-01

    Previously, multi-wall structures have been analyzed extensively, primarily through experiment, as a means of increasing the meteoroid/space debris impact protection of spacecraft. As structural configurations become more varied, the number of tests required to characterize their response increases dramatically. As an alternative to experimental testing, numerical modeling of high-speed impact phenomena is often being used to predict the response of a variety of structural systems under different impact loading conditions. The results of comparing experimental tests to Hull Hydrodynamic Computer Code predictions are reported. Also, the results of a numerical parametric study of multi-wall structural response to hypervelocity cylindrical projectile impact are presented.

  18. Influence of assignment on the prediction of transmembrane helices in protein structures.

    PubMed

    Pylouster, Jean; Bornot, Aurélie; Etchebest, Catherine; de Brevern, Alexandre G

    2010-11-01

    α-Helical transmembrane proteins (TMPα) are composed of a series of helices embedded in the lipid bilayer. Due to technical difficulties, few 3D structures are available. Therefore, the design of structural models of TMPα is of major interest. We study the secondary structures of TMPα by analyzing the influence of secondary structure assignment methods (SSAMs). For this purpose, a published and updated benchmark databank of TMPα is used and several SSAMs (9) are evaluated. The analysis of the results points to significant differences in SSA depending on the methods used. Pairwise comparisons between SSAMs led to more than 10% of disagreement. Helical regions corresponding to transmembrane zones are often correctly characterized. The study of the sequence-structure relationship shows very limited differences with regard to the structural disagreement. Secondary structure prediction based on Bayes' rule and using only a single sequence gives correct prediction rates ranging from 78 to 81%. A structural alphabet approach gives a slightly better prediction, i.e., only 2% less than the best equivalent approach, whereas the prediction rate with a very different assignment exceeds 86%. This last result highlights the importance of the correct assignment choice to evaluate the prediction assessment. PMID:20349322

  19. Prediction of structural response to large earthquakes by using recordings from smaller earthquakes

    USGS Publications Warehouse

    Safak, Erdal

    1994-01-01

    The feasibility of predicting structural response to large earthquakes by using recorded responses from collocated smaller earthquakes is investigated. Records from large earthquakes can be approximated as linear combinations of records from smaller earthquakes. Two methods are introduced to predict structural response to a large earthquake by using the recorded response to a smaller earthquake. The accuracy of the methods is tested by applying them to data from a highway overpass.

  20. Correlation of predicted and measured thermal stresses on a truss-type aircraft structure

    NASA Technical Reports Server (NTRS)

    Jenkins, J. M.; Schuster, L. S.; Carter, A. L.

    1978-01-01

    A test structure representing a portion of a hypersonic vehicle was instrumented with strain gages and thermocouples. This test structure was then subjected to laboratory heating representative of supersonic and hypersonic flight conditions. A finite element computer model of this structure was developed using several types of elements with the NASA structural analysis (NASTRAN) computer program. Temperature inputs from the test were used to generate predicted model thermal stresses and these were correlated with the test measurements.

  1. STRUCTURE-BASED PREDICTIVE MODEL FOR COAL CHAR COMBUSTION

    SciTech Connect

    CHRISTOPHER M. HADAD; JOSEPH M. CALO; ROBERT H. ESSENHIGH; ROBERT H. HURT

    1999-01-13

    Significant progress continued to be made during the past reporting quarter on both major technical tasks. During the reporting period at OSU, computational investigations were conducted of addition vs. abstraction reactions of H, O(³P), and OH with monocyclic aromatic hydrocarbons. The potential energy surfaces for more than 80 unique reactions of H, O(³P), and OH with aromatic hydrocarbons were determined at the B3LYP/6-31G(d) level of theory. The calculated transition state barriers and reaction free energies indicate that the addition channel is preferred at 298 K, but that the abstraction channel becomes dominant at high temperatures. The thermodynamic preference for reactivity with aromatic hydrocarbons increases in the order O(³P) < H < OH. Abstraction from six-membered aromatic rings is more facile than abstraction from five-membered aromatic rings. However, addition to five-membered rings is thermodynamically more favorable than addition to six-membered rings. The free energies for the abstraction and addition reactions of H, O, and OH with aromatic hydrocarbons and the characteristics of the respective transition states can be used to calculate the reaction rate constants for these important combustion reactions. Experimental work at Brown University has investigated the effect of reaction on the structural evolution of different chars (i.e., phenolic resin char and chars produced from three different coals) in a TGA/TPD-MS system. It has been found that samples of different age of these chars appeared to lose their "memory" concerning their initial structures at high burn-offs. During the reporting period, thermal desorption experiments of selected samples were conducted. These spectra show that the populations of low temperature oxygen surface complexes, which are primarily responsible for reactivity, are more similar for the high burn-off than for the low burn-off samples of different ages; i.e., the populations of active sites are more similar for the "younger" and "older" chars at high burn-offs. Progress continued on experimental work at OSU. Another furnace run was conducted with a Pittsburgh seam coal. Temperature profiles were obtained, as well as char samples from three sampling ports. Nonisothermal TGA reactivities were also obtained for these samples. Work is continuing on final "fine-tuning" of the gas analysis section.

  2. Prediction of complex super-secondary structure βαβ motifs based on combined features

    PubMed Central

    Sun, Lixia; Hu, Xiuzhen; Li, Shaobo; Jiang, Zhuo; Li, Kun

    2015-01-01

    Prediction of a complex super-secondary structure is a key step in the study of tertiary structures of proteins. The strand-loop-helix-loop-strand (βαβ) motif is an important complex super-secondary structure in proteins. Many functional sites and active sites often occur in polypeptides of βαβ motifs. Therefore, the accurate prediction of βαβ motifs is very important to recognizing protein tertiary structure and the study of protein function. In this study, the βαβ motif dataset was first constructed using the DSSP package. A statistical analysis was then performed on βαβ motifs and non-βαβ motifs. The target motif was selected, and the length of the loop-α-loop varies from 10 to 26 amino acids. The ideal fixed-length pattern comprised 32 amino acids. A Support Vector Machine algorithm was developed for predicting βαβ motifs by using the sequence information, the predicted structure and function information to express the sequence feature. The overall predictive accuracy of 5-fold cross-validation and independent test was 81.7% and 76.7%, respectively. The Matthews correlation coefficients of the 5-fold cross-validation and independent test are 0.63 and 0.53, respectively. Results demonstrate that the proposed method is an effective approach for predicting βαβ motifs and can be used for structure and function studies of proteins. PMID:26858540
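
    The two quality measures reported in the abstract above, overall accuracy and the Matthews correlation coefficient (MCC), are both computed from the four cells of a binary confusion matrix. The sketch below uses the standard definitions; the function names are hypothetical and the counts in the usage line are invented for illustration, since the abstract does not report the underlying confusion matrix.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient, in the range [-1, 1].

    Returns 0.0 when a row or column of the confusion matrix is empty,
    a common convention for the otherwise undefined case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Invented counts: motif = positive class, non-motif = negative class.
    tp, tn, fp, fn = 80, 85, 15, 20
    print(accuracy(tp, tn, fp, fn), matthews_corrcoef(tp, tn, fp, fn))
```

    Unlike accuracy, the MCC stays near zero for a classifier that simply predicts the majority class, which makes it the more informative figure when motif and non-motif examples are unevenly represented.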

  3. Evaluation of several lightweight stochastic context-free grammars for RNA secondary structure prediction

    PubMed Central

    Dowell, Robin D; Eddy, Sean R

    2004-01-01

    Background RNA secondary structure prediction methods based on probabilistic modeling can be developed using stochastic context-free grammars (SCFGs). Such methods can readily combine different sources of information that can be expressed probabilistically, such as an evolutionary model of comparative RNA sequence analysis and a biophysical model of structure plausibility. However, the number of free parameters in an integrated model for consensus RNA structure prediction can become untenable if the underlying SCFG design is too complex. Thus a key question is, what small, simple SCFG designs perform best for RNA secondary structure prediction? Results Nine different small SCFGs were implemented to explore the tradeoffs between model complexity and prediction accuracy. Each model was tested for single sequence structure prediction accuracy on a benchmark set of RNA secondary structures. Conclusions Four SCFG designs had prediction accuracies near the performance of current energy minimization programs. One of these designs, introduced by Knudsen and Hein in their PFOLD algorithm, has only 21 free parameters and is significantly simpler than the others. PMID:15180907

  4. Predicting structure and stability for RNA complexes with intermolecular loop-loop base-pairing.

    PubMed

    Cao, Song; Xu, Xiaojun; Chen, Shi-Jie

    2014-06-01

    RNA loop-loop interactions are essential for genomic RNA dimerization and regulation of gene expression. In this article, a statistical mechanics-based computational method that predicts the structures and thermodynamic stabilities of RNA complexes with loop-loop kissing interactions is described. The method accounts for the entropy changes for the formation of loop-loop interactions, which is a notable advancement that other computational models have neglected. Benchmark tests with several experimentally validated systems show that the inclusion of the entropy parameters can indeed improve predictions for RNA complexes. Furthermore, the method can predict not only the native structures of RNA/RNA complexes but also alternative metastable structures. For instance, the model predicts that the SL1 domain of HIV-1 RNA can form two different dimer structures with similar stabilities. The prediction is consistent with experimental observation. In addition, the model predicts two different binding sites for hTR dimerization: One binding site has been experimentally proposed, and the other structure, which has a higher stability, is structurally feasible and needs further experimental validation. PMID:24751648

  5. Predicting structure and stability for RNA complexes with intermolecular loop–loop base-pairing

    PubMed Central

    Cao, Song; Xu, Xiaojun; Chen, Shi-Jie

    2014-01-01

    RNA loop–loop interactions are essential for genomic RNA dimerization and regulation of gene expression. In this article, a statistical mechanics-based computational method that predicts the structures and thermodynamic stabilities of RNA complexes with loop–loop kissing interactions is described. The method accounts for the entropy changes for the formation of loop–loop interactions, which is a notable advancement that other computational models have neglected. Benchmark tests with several experimentally validated systems show that the inclusion of the entropy parameters can indeed improve predictions for RNA complexes. Furthermore, the method can predict not only the native structures of RNA/RNA complexes but also alternative metastable structures. For instance, the model predicts that the SL1 domain of HIV-1 RNA can form two different dimer structures with similar stabilities. The prediction is consistent with experimental observation. In addition, the model predicts two different binding sites for hTR dimerization: One binding site has been experimentally proposed, and the other structure, which has a higher stability, is structurally feasible and needs further experimental validation. PMID:24751648

  6. RNA 3D Modules in Genome-Wide Predictions of RNA 2D Structure

    PubMed Central

    Theis, Corinna; Zirbel, Craig L.; zu Siederdissen, Christian Höner; Anthon, Christian; Hofacker, Ivo L.; Nielsen, Henrik; Gorodkin, Jan

    2015-01-01

    Recent experimental and computational progress has revealed a large potential for RNA structure in the genome. This has been driven by computational strategies that exploit multiple genomes of related organisms to identify common sequences and secondary structures. However, these computational approaches have two main challenges: they are computationally expensive and they have a relatively high false discovery rate (FDR). Simultaneously, RNA 3D structure analysis has revealed modules composed of non-canonical base pairs which occur in non-homologous positions, apparently by independent evolution. These modules can, for example, occur inside structural elements which in RNA 2D predictions appear as internal loops. Hence one question is if the use of such RNA 3D information can improve the prediction accuracy of RNA secondary structure at a genome-wide level. Here, we use RNAz in combination with 3D module prediction tools and apply them on a 13-way vertebrate sequence-based alignment. We find that RNA 3D modules predicted by metaRNAmodules and JAR3D are significantly enriched in the screened windows compared to their shuffled counterparts. The initially estimated FDR of 47.0% is lowered to below 25% when certain 3D module predictions are present in the window of the 2D prediction. We discuss the implications and prospects for further development of computational strategies for detection of RNA 2D structure in genomic sequence. PMID:26509713

  7. Target highlights in CASP9: Experimental target structures for the critical assessment of techniques for protein structure prediction.

    PubMed

    Kryshtafovych, Andriy; Moult, John; Bartual, Sergio G; Bazan, J Fernando; Berman, Helen; Casteel, Darren E; Christodoulou, Evangelos; Everett, John K; Hausmann, Jens; Heidebrecht, Tatjana; Hills, Tanya; Hui, Raymond; Hunt, John F; Seetharaman, Jayaraman; Joachimiak, Andrzej; Kennedy, Michael A; Kim, Choel; Lingel, Andreas; Michalska, Karolina; Montelione, Gaetano T; Otero, José M; Perrakis, Anastassis; Pizarro, Juan C; van Raaij, Mark J; Ramelot, Theresa A; Rousseau, Francois; Tong, Liang; Wernimont, Amy K; Young, Jasmine; Schwede, Torsten

    2011-01-01

    One goal of the CASP community wide experiment on the critical assessment of techniques for protein structure prediction is to identify the current state of the art in protein structure prediction and modeling. A fundamental principle of CASP is blind prediction on a set of relevant protein targets, that is, the participating computational methods are tested on a common set of experimental target proteins, for which the experimental structures are not known at the time of modeling. Therefore, the CASP experiment would not have been possible without broad support of the experimental protein structural biology community. In this article, several experimental groups discuss the structures of the proteins which they provided as prediction targets for CASP9, highlighting structural and functional peculiarities of these structures: the long tail fiber protein gp37 from bacteriophage T4, the cyclic GMP-dependent protein kinase Iβ dimerization/docking domain, the ectodomain of the JTB (jumping translocation breakpoint) transmembrane receptor, Autotaxin in complex with an inhibitor, the DNA-binding J-binding protein 1 domain essential for biosynthesis and maintenance of DNA base-J (β-D-glucosyl-hydroxymethyluracil) in Trypanosoma and Leishmania, a so far uncharacterized 73 residue domain from Ruminococcus gnavus with a fold typical for PDZ-like domains, a domain from the phycobilisome core-membrane linker phycobiliprotein ApcE from Synechocystis, the heat shock protein 90 activators PFC0360w and PFC0270w from Plasmodium falciparum, and 2-oxo-3-deoxygalactonate kinase from Klebsiella pneumoniae. PMID:22020785

  8. Experimental validation of finite element and boundary element methods for predicting structural vibration and radiated noise

    NASA Technical Reports Server (NTRS)

    Seybert, A. F.; Wu, T. W.; Wu, X. F.

    1994-01-01

    This research report is presented in three parts. In the first part, acoustical analyses were performed on modes of vibration of the housing of a transmission of a gear test rig developed by NASA. The modes of vibration of the transmission housing were measured using experimental modal analysis. The boundary element method (BEM) was used to calculate the sound pressure and sound intensity on the surface of the housing and the radiation efficiency of each mode. The radiation efficiency of each of the transmission housing modes was then compared to theoretical results for a finite baffled plate. In the second part, analytical and experimental validation of methods to predict structural vibration and radiated noise are presented. A rectangular box excited by a mechanical shaker was used as a vibrating structure. Combined finite element method (FEM) and boundary element method (BEM) models of the apparatus were used to predict the noise level radiated from the box. The FEM was used to predict the vibration, while the BEM was used to predict the sound intensity and total radiated sound power using surface vibration as the input data. Vibration predicted by the FEM model was validated by experimental modal analysis; noise predicted by the BEM was validated by measurements of sound intensity. Three types of results are presented for the total radiated sound power: sound power predicted by the BEM model using vibration data measured on the surface of the box; sound power predicted by the FEM/BEM model; and sound power measured by an acoustic intensity scan. In the third part, the structure used in part two was modified. A rib was attached to the top plate of the structure. The FEM and BEM were then used to predict structural vibration and radiated noise respectively. The predicted vibration and radiated noise were then validated through experimentation.

  9. Predictability of gene ontology slim-terms from primary structure information in Embryophyta plant proteins

    PubMed Central

    2013-01-01

    Background Proteins are the key elements on the path from genetic information to the development of life. The roles played by the different proteins are difficult to uncover experimentally as this process involves complex procedures such as genetic modifications, injection of fluorescent proteins, gene knock-out methods and others. The knowledge learned from each protein is usually annotated in databases through different methods such as those proposed by The Gene Ontology (GO) consortium. Different methods have been proposed in order to predict GO terms from primary structure information, but very few are available for large-scale functional annotation of plants, and reported success rates are much lower than those reported by non-plant predictors. This paper explores the predictability of GO annotations on proteins belonging to the Embryophyta group from a set of features extracted solely from their primary amino acid sequence. Results High predictability of several GO terms was found for Molecular Function and Cellular Component. As expected, a lower degree of predictability was found on Biological Process ontology annotations, although a few biological processes were easily predicted. Proteins related to transport and transcription were particularly well predicted from primary structure information. The most discriminant features for prediction were those related to electric charges of the amino-acid sequence and hydropathicity derived features. Conclusions An analysis of GO-slim terms predictability in plants was carried out, in order to determine single categories or groups of functions that are most related with primary structure information. For each highly predictable GO term, the features responsible for this success were identified and discussed. Going beyond most published studies, which focus on a few categories or single ontologies, the results in this paper comprise a complete landscape of GO predictability from primary structure encompassing 75 GO terms at molecular, cellular and phenotypical level. Thus, it provides a valuable guide for researchers interested in further advances in protein function prediction on Embryophyta plants. PMID:23441934

  10. Full-length RNA structure prediction of the HIV-1 genome reveals a conserved core domain.

    PubMed

    Sükösd, Zsuzsanna; Andersen, Ebbe S; Seemann, Stefan E; Jensen, Mads Krogh; Hansen, Mathias; Gorodkin, Jan; Kjems, Jørgen

    2015-12-01

    A distance constrained secondary structural model of the ≈10 kb RNA genome of HIV-1 has been predicted but higher-order structures, involving long distance interactions, are currently unknown. We present the first global RNA secondary structure model for the HIV-1 genome, which integrates both comparative structure analysis and information from experimental data in a full-length prediction without distance constraints. Besides recovering known structural elements, we predict several novel structural elements that are conserved in HIV-1 evolution. Our results also indicate that the structure of the HIV-1 genome is highly variable in most regions, with a limited number of stable and conserved RNA secondary structures. Most interestingly, a set of long distance interactions forms a core organizing structure (COS) that organizes the genome into three major structural domains. Despite overlapping protein-coding regions the COS is supported by a particularly high frequency of compensatory base changes, suggesting functional importance for this element. This new structural element potentially organizes the whole genome into three major domains protruding from a conserved core structure with potential roles in replication and evolution for the virus. PMID:26476446

  11. Full-length RNA structure prediction of the HIV-1 genome reveals a conserved core domain

    PubMed Central

    Sükösd, Zsuzsanna; Andersen, Ebbe S.; Seemann, Stefan E.; Jensen, Mads Krogh; Hansen, Mathias; Gorodkin, Jan; Kjems, Jørgen

    2015-01-01

    A distance constrained secondary structural model of the ≈10 kb RNA genome of HIV-1 has been predicted but higher-order structures, involving long distance interactions, are currently unknown. We present the first global RNA secondary structure model for the HIV-1 genome, which integrates both comparative structure analysis and information from experimental data in a full-length prediction without distance constraints. Besides recovering known structural elements, we predict several novel structural elements that are conserved in HIV-1 evolution. Our results also indicate that the structure of the HIV-1 genome is highly variable in most regions, with a limited number of stable and conserved RNA secondary structures. Most interestingly, a set of long distance interactions forms a core organizing structure (COS) that organizes the genome into three major structural domains. Despite overlapping protein-coding regions the COS is supported by a particularly high frequency of compensatory base changes, suggesting functional importance for this element. This new structural element potentially organizes the whole genome into three major domains protruding from a conserved core structure with potential roles in replication and evolution for the virus. PMID:26476446

  12. Prediction of rodent carcinogenicity bioassays from molecular structure using inductive logic programming

    SciTech Connect

    King, R.D.; Srinivasan, A.

    1996-10-01

    The machine learning program Progol was applied to the problem of forming the structure-activity relationship (SAR) for a set of compounds tested for carcinogenicity in rodent bioassays by the U.S. National Toxicology Program (NTP). Progol is the first inductive logic programming (ILP) algorithm to use a fully relational method for describing chemical structure in SARs, based on using atoms and their bond connectivities. Progol is well suited to forming SARs for carcinogenicity as it is designed to produce easily understandable rules (structural alerts) for sets of noncongeneric compounds. The Progol SAR method was tested by prediction of a set of compounds that have been widely predicted by other SAR methods (the compounds used in the NTP's first round of carcinogenesis predictions). For these compounds no method (human or machine) was significantly more accurate than Progol. Progol was the most accurate method that did not use data from biological tests on rodents (however, the difference in accuracy is not significant). The Progol predictions were based solely on chemical structure and the results of tests for Salmonella mutagenicity. Using the full NTP database, the prediction accuracy of Progol was estimated to be 63% (±3%) using 5-fold cross validation. A set of structural alerts for carcinogenesis was automatically generated and the chemical rationale for them investigated; these structural alerts are statistically independent of the Salmonella mutagenicity. Carcinogenicity is predicted for the compounds used in the NTP's second round of carcinogenesis predictions. The results for prediction of carcinogenesis, taken together with the previous successful applications of predicting mutagenicity in nitroaromatic compounds, and inhibition of angiogenesis by suramin analogues, show that Progol has a role to play in understanding the SARs of cancer-related compounds. 29 refs., 2 figs., 4 tabs.

  13. Direct-Coupling Analysis of nucleotide coevolution facilitates RNA secondary and tertiary structure prediction

    PubMed Central

    De Leonardis, Eleonora; Lutz, Benjamin; Ratz, Sebastian; Cocco, Simona; Monasson, Rémi; Schug, Alexander; Weigt, Martin

    2015-01-01

    Despite the biological importance of non-coding RNA, their structural characterization remains challenging. Making use of the rapidly growing sequence databases, we analyze nucleotide coevolution across homologous sequences via Direct-Coupling Analysis to detect nucleotide-nucleotide contacts. For a representative set of riboswitches, we show that the results of Direct-Coupling Analysis in combination with a generalized Nussinov algorithm systematically improve the results of RNA secondary structure prediction beyond traditional covariance approaches based on mutual information. Even more importantly, we show that the results of Direct-Coupling Analysis are enriched in tertiary structure contacts. By integrating these predictions into molecular modeling tools, systematically improved tertiary structure predictions can be obtained, as compared to using secondary structure information alone. PMID:26420827
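
    The Nussinov-style dynamic program mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' generalized algorithm: the `score` callback, the `canonical_pair_score` helper and the `min_loop` parameter are assumptions introduced here. In a coevolution-based setting, `score` could return a coupling strength for positions `i` and `j` instead of a fixed value per canonical pair.

```python
# Hypothetical pair set for illustration: Watson-Crick pairs plus G-U wobble.
CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def canonical_pair_score(a, b, i, j):
    """Score 1.0 for a canonical pair, 0.0 otherwise (positions are ignored)."""
    return 1.0 if (a, b) in CANONICAL else 0.0


def nussinov_max_score(seq, score, min_loop=3):
    """Maximum total score of a nested (pseudoknot-free) set of base pairs.

    best[i][j] holds the optimum for the subsequence seq[i..j]; paired
    positions must be separated by at least `min_loop` unpaired bases.
    """
    n = len(seq)
    if n == 0:
        return 0.0
    best = [[0.0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            # Case 1: position j stays unpaired.
            value = best[i][j - 1]
            # Case 2: position j pairs with some k, splitting the interval.
            for k in range(i, j - min_loop):
                left = best[i][k - 1] if k > i else 0.0
                inner = best[k + 1][j - 1]
                value = max(value, left + inner + score(seq[k], seq[j], k, j))
            best[i][j] = value
    return best[0][n - 1]
```

    With the unit score above, the function simply counts the largest number of nested canonical pairs; swapping in a position-dependent score changes the objective without changing the recursion.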

  14. Lattice-free prediction of three-dimensional structure of programmed DNA assemblies

    PubMed Central

    Pan, Keyao; Kim, Do-Nyun; Zhang, Fei; Adendorff, Matthew R.; Yan, Hao; Bathe, Mark

    2014-01-01

    DNA can be programmed to self-assemble into high molecular weight 3D assemblies with precise nanometer-scale structural features. Although numerous sequence design strategies exist to realize these assemblies in solution, there is currently no computational framework to predict their 3D structures on the basis of programmed underlying multi-way junction topologies constrained by DNA duplexes. Here, we introduce such an approach and apply it to assemblies designed using the canonical immobile four-way junction. The procedure is used to predict the 3D structure of high molecular weight planar and spherical ring-like origami objects, a tile-based sheet-like ribbon, and a 3D crystalline tensegrity motif, in quantitative agreement with experiments. Our framework provides a new approach to predict programmed nucleic acid 3D structure on the basis of prescribed secondary structure motifs, with possible application to the design of such assemblies for use in biomolecular and materials science. PMID:25470497

  15. Lattice-free prediction of three-dimensional structure of programmed DNA assemblies.

    PubMed

    Pan, Keyao; Kim, Do-Nyun; Zhang, Fei; Adendorff, Matthew R; Yan, Hao; Bathe, Mark

    2014-01-01

    DNA can be programmed to self-assemble into high molecular weight 3D assemblies with precise nanometer-scale structural features. Although numerous sequence design strategies exist to realize these assemblies in solution, there is currently no computational framework to predict their 3D structures on the basis of programmed underlying multi-way junction topologies constrained by DNA duplexes. Here, we introduce such an approach and apply it to assemblies designed using the canonical immobile four-way junction. The procedure is used to predict the 3D structure of high molecular weight planar and spherical ring-like origami objects, a tile-based sheet-like ribbon, and a 3D crystalline tensegrity motif, in quantitative agreement with experiments. Our framework provides a new approach to predict programmed nucleic acid 3D structure on the basis of prescribed secondary structure motifs, with possible application to the design of such assemblies for use in biomolecular and materials science. PMID:25470497

  16. All-atom 3D structure prediction of transmembrane β-barrel proteins from sequences.

    PubMed

    Hayat, Sikander; Sander, Chris; Marks, Debora S; Elofsson, Arne

    2015-04-28

    Transmembrane β-barrels (TMBs) carry out major functions in substrate transport and protein biogenesis but experimental determination of their 3D structure is challenging. Encouraged by successful de novo 3D structure prediction of globular and α-helical membrane proteins from sequence alignments alone, we developed an approach to predict the 3D structure of TMBs. The approach combines the maximum-entropy evolutionary coupling method for predicting residue contacts (EVfold) with a machine-learning approach (boctopus2) for predicting β-strands in the barrel. In a blinded test for 19 TMB proteins of known structure that have a sufficient number of diverse homologous sequences available, this combined method (EVfold_bb) predicts hydrogen-bonded residue pairs between adjacent β-strands at an accuracy of ∼70%. This accuracy is sufficient for the generation of all-atom 3D models. In the transmembrane barrel region, the average 3D structure accuracy [template-modeling (TM) score] of top-ranked models is 0.54 (ranging from 0.36 to 0.85), with a higher (44%) number of residue pairs in correct strand-strand registration than in earlier methods (18%). Although the nonbarrel regions are predicted less accurately overall, the evolutionary couplings identify some highly constrained loop residues and, for FecA protein, the barrel including the structure of a plug domain can be accurately modeled (TM score = 0.68). Lower prediction accuracy tends to be associated with insufficient sequence information and we therefore expect increasing numbers of β-barrel families to become accessible to accurate 3D structure prediction as the number of available sequences increases. PMID:25858953
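
    For readers unfamiliar with the TM score quoted in this abstract, the sketch below evaluates the standard TM-score formula for a fixed superposition. It assumes the per-residue distances between aligned Cα atoms are already known; the search over superpositions that a full TM-score program performs is omitted, and the function names are illustrative.

```python
def tm_score_d0(target_length):
    """Length-dependent distance scale d0 (in angstroms) of the TM-score."""
    if target_length < 19:
        # Below 19 residues the formula gives a non-positive d0.
        raise ValueError("d0 formula is only meaningful for target_length >= 19")
    return 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8


def tm_score(distances, target_length):
    """TM-score for one fixed superposition.

    distances: distances (angstroms) between aligned residue pairs.
    target_length: number of residues in the target (reference) structure.
    """
    d0 = tm_score_d0(target_length)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length
```

    A perfect model of the whole target scores 1.0, and residues that are not aligned contribute nothing, so a model covering only half of the target perfectly scores 0.5.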

  17. All-atom 3D structure prediction of transmembrane β-barrel proteins from sequences

    PubMed Central

    Hayat, Sikander; Sander, Chris; Marks, Debora S.

    2015-01-01

    Transmembrane β-barrels (TMBs) carry out major functions in substrate transport and protein biogenesis but experimental determination of their 3D structure is challenging. Encouraged by successful de novo 3D structure prediction of globular and α-helical membrane proteins from sequence alignments alone, we developed an approach to predict the 3D structure of TMBs. The approach combines the maximum-entropy evolutionary coupling method for predicting residue contacts (EVfold) with a machine-learning approach (boctopus2) for predicting β-strands in the barrel. In a blinded test for 19 TMB proteins of known structure that have a sufficient number of diverse homologous sequences available, this combined method (EVfold_bb) predicts hydrogen-bonded residue pairs between adjacent β-strands at an accuracy of ∼70%. This accuracy is sufficient for the generation of all-atom 3D models. In the transmembrane barrel region, the average 3D structure accuracy [template-modeling (TM) score] of top-ranked models is 0.54 (ranging from 0.36 to 0.85), with a higher (44%) number of residue pairs in correct strand–strand registration than in earlier methods (18%). Although the nonbarrel regions are predicted less accurately overall, the evolutionary couplings identify some highly constrained loop residues and, for FecA protein, the barrel including the structure of a plug domain can be accurately modeled (TM score = 0.68). Lower prediction accuracy tends to be associated with insufficient sequence information and we therefore expect increasing numbers of β-barrel families to become accessible to accurate 3D structure prediction as the number of available sequences increases. PMID:25858953

  18. Hybrid experimental/analytical models of structural dynamics - Creation and use for predictions

    NASA Technical Reports Server (NTRS)

    Balmes, Etienne

    1993-01-01

    An original complete methodology for the construction of predictive models of damped structural vibrations is introduced. A consistent definition of normal and complex modes is given which leads to an original method to accurately identify non-proportionally damped normal mode models. A new method to create predictive hybrid experimental/analytical models of damped structures is introduced, and the ability of hybrid models to predict the response to system configuration changes is discussed. Finally a critical review of the overall methodology is made by application to the case of the MIT/SERC interferometer testbed.

  19. A genetic algorithm for predicting the structures of interfaces in multicomponent systems.

    PubMed

    Chua, Alvin L-S; Benedek, Nicole A; Chen, Lin; Finnis, Mike W; Sutton, Adrian P

    2010-05-01

    Recent years have seen great advances in our ability to predict crystal structures from first principles. However, previous algorithms have focused on the prediction of bulk crystal structures, where the global minimum is the target. Here, we present a general atomistic approach to simulate in multicomponent systems the structures and free energies of grain boundaries and heterophase interfaces with fixed stoichiometric and non-stoichiometric compositions. The approach combines a new genetic algorithm, which uses empirical interatomic potentials to explore the configurational phase space of boundaries, with subsequent refinement of structures and free energies by first-principles electronic structure methods. We introduce a structural order parameter to bias the genetic algorithm search away from the global minimum (which would be bulk crystal), while not favouring any particular structure types, unless they lower the energy. We demonstrate the power and efficiency of the algorithm by considering non-stoichiometric grain boundaries in a ternary oxide, SrTiO3. PMID:20190770

  20. Predicting adverse drug reaction profiles by integrating protein interaction networks with drug structures.

    PubMed

    Huang, Liang-Chin; Wu, Xiaogang; Chen, Jake Y

    2013-01-01

    The prediction of adverse drug reactions (ADRs) has become increasingly important, due to the rising concern about serious ADRs that can cause drugs to fail to reach or stay in the market. We proposed a framework for predicting ADR profiles by integrating protein-protein interaction (PPI) networks with drug structures. We compared ADR prediction performances over 18 ADR categories through four feature groups: only drug targets, drug targets with PPI networks, drug structures, and drug targets with PPI networks plus drug structures. The results showed that the integration of PPI networks and drug structures can significantly improve the ADR prediction performance. The median AUC values for the four groups were 0.59, 0.61, 0.65, and 0.70. We used the protein features in the best two models, "Cardiac disorders" (median-AUC: 0.82) and "Psychiatric disorders" (median-AUC: 0.76), to build ADR-specific PPI networks with literature support. For validation, we examined 30 drugs withdrawn from the U.S. market to see if our approach can predict their ADR profiles and explain why they were withdrawn. Except for three drugs having ADRs in the categories we did not predict, 25 out of 27 withdrawn drugs (92.6%) having severe ADRs were successfully predicted by our approach. PMID:23184540

  1. Assessing a novel approach for predicting local 3D protein structures from sequence.

    PubMed

    Benros, Cristina; de Brevern, Alexandre G; Etchebest, Catherine; Hazout, Serge

    2006-03-01

    We developed a novel approach for predicting local protein structure from sequence. It relies on the Hybrid Protein Model (HPM), an unsupervised clustering method we previously developed. This model learns three-dimensional protein fragments encoded into a structural alphabet of 16 protein blocks (PBs). Here, we focused on 11-residue fragments encoded as a series of seven PBs and used HPM to cluster them according to their local similarities. We thus built a library of 120 overlapping prototypes (mean fragments from each cluster), with good three-dimensional local approximation, i.e., a mean accuracy of 1.61 Å Cα root-mean-square distance. Our prediction method is intended to optimize the exploitation of the sequence-structure relations deduced from this library of long protein fragments. This was achieved by setting up a system of 120 experts, each defined by logistic regression to optimize the discrimination from sequence of a given prototype relative to the others. For a target sequence window, the experts computed probabilities of sequence-structure compatibility for the prototypes and ranked them, proposing the top scorers as structural candidates. Predictions were defined as successful when a prototype <2.5 Å from the true local structure was found among those proposed. Our strategy yielded a prediction rate of 51.2% for an average of 4.2 candidates per sequence window. We also proposed a confidence index to estimate prediction quality. Our approach predicts from sequence alone and will thus provide valuable information for proteins without structural homologs. Candidates will also contribute to global structure prediction by fragment assembly. PMID:16385557
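
    The candidate-ranking and success criterion described in this abstract can be illustrated with a short sketch. It assumes the per-prototype probabilities from the experts and the distances of each prototype to the true local structure are already available; the function names, prototype labels and numbers are hypothetical, and a fixed number of candidates is used here for simplicity.

```python
def propose_candidates(expert_probabilities, n_candidates):
    """Return the prototypes with the highest expert probabilities."""
    ranked = sorted(expert_probabilities, key=expert_probabilities.get, reverse=True)
    return ranked[:n_candidates]


def prediction_succeeds(candidates, rmsd_to_true, threshold=2.5):
    """A prediction counts as a success if any proposed prototype lies
    closer than `threshold` angstroms (RMSD) to the true local structure."""
    return any(rmsd_to_true[p] < threshold for p in candidates)


# Toy values for three hypothetical prototypes.
probabilities = {"P1": 0.1, "P2": 0.6, "P3": 0.3}
rmsd = {"P1": 1.0, "P2": 3.1, "P3": 2.0}
```

    In this toy case, proposing two candidates succeeds because P3 is within 2.5 Å, while proposing only the top scorer P2 fails.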

  2. Theoretical predictions of structures in dispersions containing charged colloidal particles and non-adsorbing polymers.

    PubMed

    Xie, Fei; Turesson, Martin; Woodward, Clifford E; van Gruijthuijsen, Kitty; Stradner, Anna; Forsman, Jan

    2016-04-20

    We develop a theoretical model to describe structural effects on a specific system of charged colloidal polystyrene particles, upon the addition of non-adsorbing PEG polymers. This system has previously been investigated experimentally, by scattering methods, so we are able to quantitatively compare predicted structure factors with corresponding experimental data. Our aim is to construct a model that is coarse-grained enough to be computationally manageable, yet detailed enough to capture the important physics. To this end, we utilize classical polymer density functional theory, wherein all possible polymer configurations are accounted for, subject to a mean-field Boltzmann weight. We make efforts to counteract drawbacks with this mean-field approach, resulting in structural predictions that agree very well with computationally more demanding simulations. Electrostatic interactions are handled at the fully non-linear Poisson-Boltzmann level, and we demonstrate that a linearization leads to less accurate predictions. The particle charge is an experimentally unknown parameter. We define the surface charge such that the experimental and theoretical gel point at equal polymer concentration coincide. Assuming a fixed surface charge for a certain salt concentration, we find very good agreements between measured and predicted structure factors across a wide range of polymer concentrations. We also present predictions for other structural quantities, such as radial distribution functions, and cluster size distributions. Finally, we demonstrate that our model predicts the occurrence of equilibrium clusters at high polymer concentrations, but low particle volume fractions and salt levels. PMID:27056112

  3. Predicting faunal fire responses in heterogeneous landscapes: the role of habitat structure.

    PubMed

    Swan, Matthew; Christie, Fiona; Sitters, Holly; York, Alan; Di Stefano, Julian

    2015-12-01

    Predicting the effects of fire on biota is important for biodiversity conservation in fire-prone landscapes. Time since fire is often used to predict the occurrence of fauna, yet for many species, it is a surrogate variable and it is temporal change in resource availability to which animals actually respond. Therefore prediction of fire-fauna relationships will be uncertain if time since fire is not strongly related to resources. In this study, we used a space-for-time substitution across a large diverse landscape to investigate interrelationships between the occurrence of ground-dwelling mammals, time since fire, and structural resources. We predicted that much variation in habitat structure would remain unexplained by time since fire and that habitat structure would predict species' occurrence better than time since fire. In line with predictions, we found that time since fire was moderately correlated with habitat structure yet was a poor surrogate for mammal occurrence. Variables representing habitat structure were better predictors of occurrence than time since fire for all species considered. Our results suggest that time since fire is unlikely to be a useful surrogate for ground-dwelling mammals in heterogeneous landscapes. Faunal conservation in fire-prone landscapes will benefit from a combined understanding of fauna-resource relationships and the ways in which fire (including planned fires and wildfires) alters the spatial and temporal distribution of faunal resources. PMID:26910956

  4. Bayesian probabilistic approach for predicting backbone structures in terms of protein blocks.

    PubMed

    de Brevern, A G; Etchebest, C; Hazout, S

    2000-11-15

    By using an unsupervised cluster analyzer, we have identified a local structural alphabet composed of 16 folding patterns of five consecutive Cα ("protein blocks"). The dependence that exists between successive blocks is explicitly taken into account. A Bayesian approach based on the relation protein block-amino acid propensity is used for prediction and leads to a success rate close to 35%. Sharing sequence windows associated with certain blocks into "sequence families" improves the prediction accuracy by 6%. This prediction accuracy exceeds 75% when keeping the first four predicted protein blocks at each site of the protein. In addition, two different strategies are proposed: the first one defines the number of protein blocks in each site needed for respecting a user-fixed prediction accuracy, and alternatively, the second one defines the different protein sites to be predicted with a user-fixed number of blocks and a chosen accuracy. This last strategy applied to the ubiquitin conjugating enzyme (alpha/beta protein) shows that 91% of the sites may be predicted with a prediction accuracy larger than 77% considering only three blocks per site. The prediction strategies proposed improve our knowledge about sequence-structure dependence and should be very useful in ab initio protein modelling. PMID:11025540
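
    A Bayesian ranking of protein blocks from a sequence window might look like the sketch below. It is a simplified illustration that assumes independence between window positions and uses made-up probability tables; the function name, the table layout and the two-block toy alphabet are assumptions, not the published parameters.

```python
import math


def rank_protein_blocks(window, prior, likelihood, top_k=4):
    """Rank protein blocks by log posterior for one sequence window.

    prior[block] is P(block); likelihood[block][position][residue] is
    P(residue at that window position | block). Positions are treated
    as independent, which is a simplifying assumption.
    """
    log_posterior = {}
    for block, p_block in prior.items():
        total = math.log(p_block)
        for position, residue in enumerate(window):
            total += math.log(likelihood[block][position][residue])
        log_posterior[block] = total
    ranked = sorted(log_posterior, key=log_posterior.get, reverse=True)
    return ranked[:top_k]


# Toy tables for two blocks and a three-residue window (made-up numbers).
toy_prior = {"m": 0.5, "d": 0.5}
toy_likelihood = {
    "m": [{"A": 0.7, "V": 0.3}] * 3,
    "d": [{"A": 0.2, "V": 0.8}] * 3,
}
```

    Keeping several top-ranked blocks per site, as `top_k` does here, mirrors the idea in the abstract of trading the number of proposed blocks against prediction accuracy.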

  5. Challenging the state-of-the-art in protein structure prediction: Highlights of experimental target structures for the 10th Critical Assessment of Techniques for Protein Structure Prediction Experiment CASP10

    PubMed Central

    Kryshtafovych, Andriy; Moult, John; Bales, Patrick; Bazan, J. Fernando; Biasini, Marco; Burgin, Alex; Chen, Chen; Cochran, Frank V.; Craig, Timothy K.; Das, Rhiju; Fass, Deborah; Garcia-Doval, Carmela; Herzberg, Osnat; Lorimer, Donald; Luecke, Hartmut; Ma, Xiaolei; Nelson, Daniel C.; van Raaij, Mark J.; Rohwer, Forest; Segall, Anca; Seguritan, Victor; Zeth, Kornelius; Schwede, Torsten

    2014-01-01

    For the last two decades, CASP has assessed the state of the art in techniques for protein structure prediction and identified areas which required further development. CASP would not have been possible without the prediction targets provided by the experimental structural biology community. In the latest experiment, CASP10, over 100 structures were suggested as prediction targets, some of which appeared to be extraordinarily difficult for modeling. In this paper, authors of some of the most challenging targets discuss which specific scientific question motivated the experimental structure determination of the target protein, which structural features were especially interesting from a structural or functional perspective, and to what extent these features were correctly reproduced in the predictions submitted to CASP10. Specifically, the following targets will be presented: the acid-gated urea channel, a difficult to predict trans-membrane protein from the important human pathogen Helicobacter pylori; the structure of human interleukin IL-34, a recently discovered helical cytokine; the structure of a functionally uncharacterized enzyme OrfY from Thermoproteus tenax formed by a gene duplication and a novel fold; an ORFan domain of mimivirus sulfhydryl oxidase R596; the fibre protein gp17 from bacteriophage T7; the Bacteriophage CBA-120 tailspike protein; a virus coat protein from metagenomic samples of the marine environment; and finally an unprecedented class of structure prediction targets based on engineered disulfide-rich small proteins. PMID:24318984

  6. Challenging the state of the art in protein structure prediction: Highlights of experimental target structures for the 10th Critical Assessment of Techniques for Protein Structure Prediction Experiment CASP10.

    PubMed

    Kryshtafovych, Andriy; Moult, John; Bales, Patrick; Bazan, J Fernando; Biasini, Marco; Burgin, Alex; Chen, Chen; Cochran, Frank V; Craig, Timothy K; Das, Rhiju; Fass, Deborah; Garcia-Doval, Carmela; Herzberg, Osnat; Lorimer, Donald; Luecke, Hartmut; Ma, Xiaolei; Nelson, Daniel C; van Raaij, Mark J; Rohwer, Forest; Segall, Anca; Seguritan, Victor; Zeth, Kornelius; Schwede, Torsten

    2014-02-01

    For the last two decades, CASP has assessed the state of the art in techniques for protein structure prediction and identified areas which required further development. CASP would not have been possible without the prediction targets provided by the experimental structural biology community. In the latest experiment, CASP10, more than 100 structures were suggested as prediction targets, some of which appeared to be extraordinarily difficult for modeling. In this article, authors of some of the most challenging targets discuss which specific scientific question motivated the experimental structure determination of the target protein, which structural features were especially interesting from a structural or functional perspective, and to what extent these features were correctly reproduced in the predictions submitted to CASP10. Specifically, the following targets will be presented: the acid-gated urea channel, a difficult to predict transmembrane protein from the important human pathogen Helicobacter pylori; the structure of human interleukin (IL)-34, a recently discovered helical cytokine; the structure of a functionally uncharacterized enzyme OrfY from Thermoproteus tenax formed by a gene duplication and a novel fold; an ORFan domain of mimivirus sulfhydryl oxidase R596; the fiber protein gene product 17 from bacteriophage T7; the bacteriophage CBA-120 tailspike protein; a virus coat protein from metagenomic samples of the marine environment; and finally, an unprecedented class of structure prediction targets based on engineered disulfide-rich small proteins. PMID:24318984

  7. Damage predictions of aluminum thin-walled structures subjected to explosive loads.

    SciTech Connect

    Saul, W. Venner; Reu, Phillip L.; Gruda, Jeffrey Donald; Haulenbeek, Kimberly K.; Larsen, Marvin Elwood; Phelan, James M.; Stofleth, Jerome H.; Corona, Edmundo; Gwinn, Kenneth West

    2010-11-01

    Predicting failure of thin-walled structures from explosive loading is a very complex task. The problem can be divided into two parts: the detonation of the explosive to produce the loading on the structure, and the structural response. First, the factors that affect the explosive loading include size, shape, stand-off, confinement, and chemistry of the explosive. The goal of the first part of the analysis is predicting the pressure on the structure based on these factors. The hydrodynamic code CTH is used to conduct these calculations. Secondly, the response of a structure to the explosive loading is predicted using a detailed finite element model within the explicit analysis code Presto. Material response, to failure, must be established in the analysis to model the failure of this class of structures; validation of this behavior is also required to allow these analyses to be predictive for their intended use. The presentation will detail the validation tests used to support this program. Validation tests using explosively loaded aluminum thin flat plates were used to study all the aspects mentioned above. Experimental measurements of the pressures generated by the explosive and the resulting plate deformations provided data for comparison against analytical predictions. These included pressure-time histories and digital image correlation of the full field plate deflections. The issues studied in the structural analysis were mesh sensitivity, strain based failure metrics, and the coupling methodologies between the blast and structural models. These models have been successfully validated using these tests, thereby increasing confidence of the results obtained in the prediction of failure thresholds of complex structures, including aircraft.

  8. A probabilistic model for secondary structure prediction from protein chemical shifts.

    PubMed

    Mechelke, Martin; Habeck, Michael

    2013-06-01

    Protein chemical shifts encode detailed structural information that is difficult and computationally costly to describe at a fundamental level. Statistical and machine learning approaches have been used to infer correlations between chemical shifts and secondary structure from experimental chemical shifts. These methods range from simple statistics such as the chemical shift index to complex methods using neural networks. Notwithstanding their higher accuracy, more complex approaches tend to obscure the relationship between secondary structure and chemical shift and often involve many parameters that need to be trained. We present hidden Markov models (HMMs) with Gaussian emission probabilities to model the dependence between protein chemical shifts and secondary structure. The continuous emission probabilities are modeled as conditional probabilities for a given amino acid and secondary structure type. Using these distributions as outputs of first- and second-order HMMs, we achieve a prediction accuracy of 82.3%, which is competitive with existing methods for predicting secondary structure from protein chemical shifts. Incorporation of sequence-based secondary structure prediction into our HMM improves the prediction accuracy to 84.0%. Our findings suggest that an HMM with correlated Gaussian distributions conditioned on the secondary structure provides an adequate generative model of chemical shifts. PMID:23292699
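
    The idea of decoding secondary structure from chemical shifts with a Gaussian-emission HMM can be sketched with a small Viterbi decoder. This is a deliberately reduced illustration: it uses a first-order model with one scalar observation per residue and univariate Gaussians, whereas the abstract describes emissions conditioned on amino acid type and correlated distributions. All parameter values and names below are made up.

```python
import math


def gaussian_logpdf(x, mean, std):
    """Log density of a univariate normal distribution."""
    return -0.5 * math.log(2 * math.pi * std * std) - (x - mean) ** 2 / (2 * std * std)


def viterbi(observations, states, log_start, log_trans, emission):
    """Most probable state path; emission[state] = (mean, std)."""
    if not observations:
        return []
    scores = {s: log_start[s] + gaussian_logpdf(observations[0], *emission[s])
              for s in states}
    back = []
    for x in observations[1:]:
        new_scores, pointers = {}, {}
        for s in states:
            # Best predecessor for state s at this position.
            prev = max(states, key=lambda p: scores[p] + log_trans[p][s])
            new_scores[s] = (scores[prev] + log_trans[prev][s]
                             + gaussian_logpdf(x, *emission[s]))
            pointers[s] = prev
        scores = new_scores
        back.append(pointers)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: scores[s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Toy two-state model: helix (H) and coil (C), made-up shift statistics.
STATES = ["H", "C"]
LOG_START = {"H": math.log(0.5), "C": math.log(0.5)}
LOG_TRANS = {"H": {"H": math.log(0.9), "C": math.log(0.1)},
             "C": {"H": math.log(0.1), "C": math.log(0.9)}}
EMISSION = {"H": (3.0, 1.0), "C": (0.0, 1.0)}
```

    Decoding the toy observations `[3.1, 2.9, 3.2, 0.1, -0.2, 0.0]` with these parameters yields three helix states followed by three coil states.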

  9. Mimicking the folding pathway to improve homology-free protein structure prediction

    NASA Astrophysics Data System (ADS)

    Freed, Karl; Debartolo, Joe; Colubri, Andres; Jha, Abhishek; Fitzgerald, James; Sosnick, Tobin

    2010-03-01

    Although it has long been established that a protein's sequence encodes its structure, the prediction of structure from sequence remains an outstanding problem that impacts numerous scientific disciplines including many genome projects. By iteratively fixing secondary structure assignments of residues during Monte Carlo simulations of folding, our coarse-grained model without information concerning homology or explicit side chains outperforms current homology-based secondary structure prediction methods for many proteins. The computationally rapid algorithm using only single residue (phi, psi) dihedral angle moves also generates tertiary structures of comparable accuracy to existing all-atom methods for many small proteins, particularly ones with low homology. Hence, given appropriate search strategies and scoring functions, reduced representations can be used for accurately predicting secondary structure as well as providing three-dimensional structures, thereby increasing the size of proteins approachable by homology-free methods and the accuracy of template methods whose accuracy depends on the quality of the input secondary structure. Inclusion of information from evolutionarily related sequences enhances the statistics and the accuracy of the predictions.

  10. Improving protein fold recognition and structural class prediction accuracies using physicochemical properties of amino acids.

    PubMed

    Raicar, Gaurav; Saini, Harsh; Dehzangi, Abdollah; Lal, Sunil; Sharma, Alok

    2016-08-01

    Predicting the three-dimensional (3-D) structure of a protein is an important task in the field of bioinformatics and biological sciences. However, directly predicting the 3-D structure from the primary structure is hard to achieve. Therefore, predicting the fold or structural class of a protein sequence is generally used as an intermediate step in determining the protein's 3-D structure. For protein fold recognition (PFR) and structural class prediction (SCP), two steps are required - feature extraction step and classification step. Feature extraction techniques generally utilize syntactical-based information, evolutionary-based information and physicochemical-based information to extract features. In this study, we explore the importance of utilizing the physicochemical properties of amino acids for improving PFR and SCP accuracies. For this, we propose a Forward Consecutive Search (FCS) scheme which aims to strategically select physicochemical attributes that will supplement the existing feature extraction techniques for PFR and SCP. An exhaustive search is conducted on all the existing 544 physicochemical attributes using the proposed FCS scheme and a subset of physicochemical attributes is identified. Features extracted from these selected attributes are then combined with existing syntactical-based and evolutionary-based features, to show an improvement in the recognition and prediction performance on benchmark datasets. PMID:27164998

  11. A novel method for structure-based prediction of ion channel conductance properties.

    PubMed Central

    Smart, O S; Breed, J; Smith, G R; Sansom, M S

    1997-01-01

    A rapid and easy-to-use method of predicting the conductance of an ion channel from its three-dimensional structure is presented. The method combines the pore dimensions of the channel as measured in the HOLE program with an Ohmic model of conductance. An empirically based correction factor is then applied. The method yielded good results for six experimental channel structures (none of which were included in the training set) with predictions accurate to within an average factor of 1.62 to the true values. The predictive r² was equal to 0.90, which is indicative of a good predictive ability. The procedure is used to validate model structures of alamethicin and phospholamban. Two genuine predictions for the conductance of channels with known structure but without reported conductances are given. A modification of the procedure that calculates the expected results for the effect of the addition of nonelectrolyte polymers on conductance is set out. Results for a cholera toxin B-subunit crystal structure agree well with the measured values. The difficulty in interpreting such studies is discussed, with the conclusion that measurements on channels of known structure are required. PMID:9138559
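
    An Ohmic estimate of pore conductance from a radius profile can be sketched as below. The pore is treated as a stack of thin cylindrical slabs of electrolyte whose resistances add in series. The function name, the slab discretisation and the `correction_factor` argument are assumptions made for illustration; the form and value of the empirical correction used in the paper are not reproduced here.

```python
import math

ANGSTROM = 1e-10  # metres per angstrom


def ohmic_conductance(radii_angstrom, dz_angstrom, conductivity_s_per_m,
                      correction_factor=1.0):
    """Conductance (siemens) of a pore modelled as slabs in series.

    radii_angstrom: pore radius at each slab along the channel axis.
    dz_angstrom: thickness of each slab.
    conductivity_s_per_m: bulk conductivity of the electrolyte.
    correction_factor: hypothetical divisor standing in for an
        empirical correction; 1.0 leaves the Ohmic estimate unchanged.
    """
    resistance = 0.0
    for r in radii_angstrom:
        if r <= 0:
            raise ValueError("pore radius must be positive")
        area = math.pi * (r * ANGSTROM) ** 2
        resistance += (dz_angstrom * ANGSTROM) / (conductivity_s_per_m * area)
    return 1.0 / resistance / correction_factor
```

    For a uniform cylinder this reduces to the familiar expression: conductivity times cross-sectional area divided by length. The narrowest slabs dominate the sum, which is why the constriction region of a channel largely sets the estimate.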

  12. Biochemical functional predictions for protein structures of unknown or uncertain function

    PubMed Central

    Mills, Caitlyn L.; Beuning, Penny J.; Ondrechen, Mary Jo

    2015-01-01

    With the exponential growth in the determination of protein sequences and structures via genome sequencing and structural genomics efforts, there is a growing need for reliable computational methods to determine the biochemical function of these proteins. This paper reviews the efforts to address the challenge of annotating the function at the molecular level of uncharacterized proteins. While sequence- and three-dimensional-structure-based methods for protein function prediction have been reviewed previously, the recent trends in local structure-based methods have received less attention. These local structure-based methods are the primary focus of this review. Computational methods have been developed to predict the residues important for catalysis and the local spatial arrangements of these residues can be used to identify protein function. In addition, the combination of different types of methods can help obtain more information and better predictions of function for proteins of unknown function. Global initiatives, including the Enzyme Function Initiative (EFI), COMputational BRidges to EXperiments (COMBREX), and the Critical Assessment of Function Annotation (CAFA), are evaluating and testing the different approaches to predicting the function of proteins of unknown function. These initiatives and global collaborations will increase the capability and reliability of methods to predict biochemical function computationally and will add substantial value to the current volume of structural genomics data by reducing the number of absent or inaccurate functional annotations. PMID:25848497

  13. Manual for the prediction of blast and fragment loadings on structures

    SciTech Connect

    Not Available

    1980-11-01

    The purpose of this manual is to provide Architect-Engineer (AE) firms guidance for the prediction of air blast, ground shock and fragment loadings on structures as a result of accidental explosions in or near these structures. Information in this manual is the result of an extensive literature survey and data gathering effort, supplemented by some original analytical studies on various aspects of blast phenomena. Many prediction equations and graphs are presented, accompanied by numerous example problems illustrating their use. The manual is complementary to existing structural design manuals and is intended to reflect the current state-of-the-art in prediction of blast and fragment loads for accidental explosions of high explosives at the Pantex Plant. In some instances, particularly for explosions within blast-resistant structures of complex geometry, rational estimation of these loads is beyond the current state-of-the-art.

  14. Structural protein descriptors in 1-dimension and their sequence-based predictions.

    PubMed

    Kurgan, Lukasz; Disfani, Fatemeh Miri

    2011-09-01

    The last few decades have seen increasing interest in the development and application of 1-dimensional (1D) descriptors of protein structure. These descriptors project 3D structural features onto 1D strings of residue-wise structural assignments. They cover a wide range of structural aspects including conformation of the backbone, burying depth/solvent exposure and flexibility of residues, and inter-chain residue-residue contacts. We perform a first-of-its-kind comprehensive comparative review of the existing 1D structural descriptors. We define, review and categorize ten structural descriptors and we also describe, summarize and contrast over eighty computational models that are used to predict these descriptors from the protein sequences. We show that the majority of the recent sequence-based predictors utilize machine learning models, with the most popular being neural networks, support vector machines, hidden Markov models, and support vector and linear regressions. These methods provide high-throughput predictions and most of them are accessible to a non-expert user via web servers and/or stand-alone software packages. We empirically evaluate several recent sequence-based predictors of secondary structure, disorder, and solvent accessibility descriptors using a benchmark set based on CASP8 targets. Our analysis shows that the secondary structure can be predicted with over 80% accuracy and segment overlap (SOV), disorder with over 0.9 AUC, 0.6 Matthews Correlation Coefficient (MCC), and 75% SOV, and relative solvent accessibility with PCC of 0.7 and MCC of 0.6 (0.86 when homology is used). We demonstrate that the secondary structure predicted from sequence without the use of homology modeling is as good as the structure extracted from the 3D folds predicted by top-performing template-based methods. PMID:21787299
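
    Illustrative sketch (not taken from the cited record): the Matthews Correlation Coefficient quoted above can be computed from residue-wise binary labels as shown here. The function name and the label convention ("D" for a disordered residue, anything else for ordered) are assumptions chosen for the example.

```python
import math


def mcc_from_labels(true_labels, pred_labels, positive="D"):
    """Matthews correlation coefficient for two equal-length label strings."""
    if len(true_labels) != len(pred_labels):
        raise ValueError("label strings must have the same length")
    tp = tn = fp = fn = 0
    for t, p in zip(true_labels, pred_labels):
        if t == positive and p == positive:
            tp += 1
        elif t != positive and p != positive:
            tn += 1
        elif p == positive:
            fp += 1
        else:
            fn += 1
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention the coefficient is reported as 0 when the denominator vanishes.
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    print(mcc_from_labels("DDDOOOOO", "DDOOOOOD"))
```

    Unlike plain accuracy, the coefficient stays near zero for a predictor that always outputs the majority class, which is why it is often preferred for unbalanced descriptors such as disorder.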

  15. A non-B DNA can replace heptamer of V(D)J recombination when present along with a nonamer: implications in chromosomal translocations and cancer.

    PubMed

    Nishana, Mayilaadumveettil; Raghavan, Sathees C

    2012-11-15

    The RAG (recombination-activating gene) complex is responsible for the generation of antigen receptor diversity by acting as a sequence-specific nuclease. Recent studies have shown that it also acts as a structure-specific nuclease. However, little is known about the factors regulating this activity at the genomic level. We show in the present study that the proximity of a V(D)J nonamer to heteroduplex DNA significantly increases RAG cleavage and binding efficiencies at physiological concentrations of MgCl(2). The position of the nonamer with respect to heteroduplex DNA was important, but not orientation. A spacer length of 18 bp between the nonamer and mismatch was optimal for RAG-mediated DNA cleavage. Mutations to the sequence of the nonamer and deletion of the nonamer-binding domain of RAG1 reinforced the role of the nonamer in the enhancement in RAG cleavage. Interestingly, partial mutation of the nonamer did not significantly reduce RAG cleavage on heteroduplex DNA, suggesting that even cryptic nonamers were sufficient to enhance RAG cleavage. More importantly, we show that the fragile region involved in chromosomal translocations associated with BCL2 (B-cell lymphoma 2) can be cleaved by RAGs following a nonamer-dependent mechanism. Hence our results from the present study suggest that a non-B DNA can replace the heptamer of RSS (recombination signal sequence) when present adjacent to nonamers, explaining the generation of certain chromosomal translocations in lymphoid malignancies. PMID:22891626

  16. conSSert: Consensus SVM Model for Accurate Prediction of Ordered Secondary Structure.

    PubMed

    Kieslich, Chris A; Smadbeck, James; Khoury, George A; Floudas, Christodoulos A

    2016-03-28

    Accurate prediction of protein secondary structure remains a crucial step in most approaches to the protein-folding problem, yet the prediction of ordered secondary structure, specifically beta-strands, remains a challenge. We developed a consensus secondary structure prediction method, conSSert, which is based on support vector machines (SVM) and provides exceptional accuracy for the prediction of beta-strands with QE accuracy of over 0.82 and a Q2-EH of 0.86. conSSert uses as input probabilities for the three types of secondary structure (helix, strand, and coil) that are predicted by four top performing methods: PSSpred, PSIPRED, SPINE-X, and RAPTOR. conSSert was trained/tested using 4261 protein chains from PDBSelect25, and 8632 chains from PISCES. Further validation was performed using targets from CASP9, CASP10, and CASP11. Our data suggest that poor performance in strand prediction is likely a result of training bias and not solely due to the nonlocal nature of beta-sheet contacts. conSSert is freely available for noncommercial use as a webservice: http://ares.tamu.edu/conSSert/ . PMID:26928531

  17. Towards crystal structure prediction of complex organic compounds – a report on the fifth blind test

    PubMed Central

    Bardwell, David A.; Adjiman, Claire S.; Arnautova, Yelena A.; Bartashevich, Ekaterina; Boerrigter, Stephan X. M.; Braun, Doris E.; Cruz-Cabeza, Aurora J.; Day, Graeme M.; Della Valle, Raffaele G.; Desiraju, Gautam R.; van Eijck, Bouke P.; Facelli, Julio C.; Ferraro, Marta B.; Grillo, Damian; Habgood, Matthew; Hofmann, Detlef W. M.; Hofmann, Fridolin; Jose, K. V. Jovan; Karamertzanis, Panagiotis G.; Kazantsev, Andrei V.; Kendrick, John; Kuleshova, Liudmila N.; Leusen, Frank J. J.; Maleev, Andrey V.; Misquitta, Alston J.; Mohamed, Sharmarke; Needs, Richard J.; Neumann, Marcus A.; Nikylov, Denis; Orendt, Anita M.; Pal, Rumpa; Pantelides, Constantinos C.; Pickard, Chris J.; Price, Louise S.; Price, Sarah L.; Scheraga, Harold A.; van de Streek, Jacco; Thakur, Tejender S.; Tiwari, Siddharth; Venuti, Elisabetta; Zhitkov, Ilia K.

    2011-01-01

    Following on from the success of the previous crystal structure prediction blind tests (CSP1999, CSP2001, CSP2004 and CSP2007), a fifth such collaborative project (CSP2010) was organized at the Cambridge Crystallographic Data Centre. A range of methodologies was used by the participating groups in order to evaluate the ability of the current computational methods to predict the crystal structures of the six organic molecules chosen as targets for this blind test. The first four targets, two rigid molecules, one semi-flexible molecule and a 1:1 salt, matched the criteria for the targets from CSP2007, while the last two targets belonged to two new challenging categories – a larger, much more flexible molecule and a hydrate with more than one polymorph. Each group submitted three predictions for each target it attempted. There was at least one successful prediction for each target, and two groups were able to successfully predict the structure of the large flexible molecule as their first place submission. The results show that while not as many groups successfully predicted the structures of the three smallest molecules as in CSP2007, there is now evidence that methodologies such as dispersion-corrected density functional theory (DFT-D) are able to reliably do so. The results also highlight the many challenges posed by more complex systems and show that there are still issues to be overcome. PMID:22101543

  18. Impact of RNA structure on the prediction of donor and acceptor splice sites

    PubMed Central

    Marashi, Sayed-Amir; Eslahchi, Changiz; Pezeshk, Hamid; Sadeghi, Mehdi

    2006-01-01

    Background: Gene identification in genomic DNA sequences by computational methods has become an important task in bioinformatics, and computational gene prediction tools are now essential components of every genome sequencing project. Prediction of splice sites is a key step of all gene structural prediction algorithms. Results: We sought the role of mRNA secondary structures and their information contents for five vertebrate and plant splice site datasets. We selected 900-nucleotide sequences centered at each (real or decoy) donor and acceptor site, and predicted their corresponding RNA structures by Vienna software. Then, based on whether the nucleotide is in a stem or not, the conventional four-letter nucleotide alphabet was translated into an eight-letter alphabet. Zero-, first- and second-order Markov models were selected as the signal detection methods. It is shown that applying the eight-letter alphabet compared to the four-letter alphabet considerably increases the accuracy of both donor and acceptor site predictions in the case of higher-order Markov models. Conclusion: Our results imply that RNA structure contains important data and future gene prediction programs can take advantage of such information. PMID:16772025
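
    Illustrative sketch (not taken from the cited record): one way to realise a four-to-eight-letter translation is to combine each nucleotide with a stem/non-stem flag read from a dot-bracket structure string. The specific encoding used here (upper case for paired positions, lower case for unpaired ones) and the function name are assumptions; the record does not state which symbols the authors used.

```python
def to_eight_letter(sequence, dot_bracket):
    """Encode an RNA sequence plus dot-bracket structure in an 8-symbol alphabet.

    Paired positions ('(' or ')') become A, C, G, U; unpaired positions ('.')
    become a, c, g, u. This particular symbol choice is hypothetical.
    """
    if len(sequence) != len(dot_bracket):
        raise ValueError("sequence and structure must have the same length")
    encoded = []
    for base, symbol in zip(sequence.upper().replace("T", "U"), dot_bracket):
        if base not in "ACGU":
            raise ValueError(f"unexpected nucleotide: {base!r}")
        if symbol in "()":
            encoded.append(base)          # nucleotide lies in a stem
        elif symbol == ".":
            encoded.append(base.lower())  # nucleotide is unpaired
        else:
            raise ValueError(f"unexpected structure symbol: {symbol!r}")
    return "".join(encoded)


if __name__ == "__main__":
    print(to_eight_letter("GGGAAACCC", "(((...)))"))
```

    The encoded string can then be fed to the same Markov-model code as a plain nucleotide sequence, only with a larger alphabet.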

  19. Protein secondary structure prediction based on an improved support vector machines approach.

    PubMed

    Kim, Hyunsoo; Park, Haesun

    2003-08-01

    The prediction of protein secondary structure is an important step in the prediction of protein tertiary structure. A new protein secondary structure prediction method, SVMpsi, was developed to improve the current level of prediction by incorporating new tertiary classifiers and their jury decision system, and the PSI-BLAST PSSM profiles. Additionally, efficient methods to handle unbalanced data and a new optimization strategy for maximizing the Q(3) measure were developed. The SVMpsi produces the highest published Q(3) and SOV94 scores on both the RS126 and CB513 data sets to date. For a new KP480 set, the prediction accuracy of SVMpsi was Q(3) = 78.5% and SOV94 = 82.8%. Moreover, the blind test results for 136 non-redundant protein sequences which do not contain homologues of training data sets were Q(3) = 77.2% and SOV94 = 81.8%. The SVMpsi results in CASP5 illustrate that it is another competitive method to predict protein secondary structure. PMID:12968073
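
    Illustrative sketch (not taken from the cited record): the Q(3) measure is the percentage of residues whose three-state class (helix H, strand E, coil C) is predicted correctly. The function name and the string-based input format are assumptions made for the example.

```python
def q3(true_ss, pred_ss):
    """Three-state per-residue accuracy, as a percentage."""
    if len(true_ss) != len(pred_ss):
        raise ValueError("observed and predicted strings must have the same length")
    if not true_ss:
        raise ValueError("empty input")
    allowed = set("HEC")
    if not (set(true_ss) <= allowed and set(pred_ss) <= allowed):
        raise ValueError("only the three states H, E and C are allowed")
    correct = sum(1 for t, p in zip(true_ss, pred_ss) if t == p)
    return 100.0 * correct / len(true_ss)


if __name__ == "__main__":
    print(f"{q3('HHHEEECCC', 'HHHEECCCC'):.1f}%")
```

    Q(3) counts residues independently; segment-based scores such as SOV additionally reward getting whole helices and strands in the right place.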

  20. DOX: A new computational protocol for accurate prediction of the protein-ligand binding structures.

    PubMed

    Rao, Li; Chi, Bo; Ren, Yanliang; Li, Yongjian; Xu, Xin; Wan, Jian

    2016-01-30

    Molecular docking techniques have now been widely used to predict the protein-ligand binding modes, especially when the structures of crystal complexes are not available. Most docking algorithms are able to effectively generate and rank a large number of probable binding poses. However, it is hard for them to accurately evaluate these poses and identify the most accurate binding structure. In this study, we first examined the performance of some docking programs, based on a testing set made of 15 crystal complexes with drug statins for the human 3-hydroxy-3-methylglutaryl coenzyme A reductase (HMGR). We found that most of the top ranking HMGR-statin binding poses, predicted by the docking programs, were energetically unstable as revealed by the high theoretical-level calculations, which were usually accompanied by the large deviations from the geometric parameters of the corresponding crystal binding structures. Subsequently, we proposed a new computational protocol, DOX, based on the joint use of molecular Docking, ONIOM, and eXtended ONIOM (XO) methods to predict the accurate binding structures for the protein-ligand complexes of interest. Our testing results demonstrate that the DOX protocol can efficiently predict accurate geometries for all 15 HMGR-statin crystal complexes without exception. This study suggests a promising computational route, as an effective alternative to the experimental one, toward predicting the accurate binding structures, which is the prerequisite for all the deep understandings of the properties, functions, and mechanisms of the protein-ligand complexes. © 2015 Wiley Periodicals, Inc. PMID:26459237

  1. Structural link prediction based on ant colony approach in social networks

    NASA Astrophysics Data System (ADS)

    Sherkat, Ehsan; Rahgozar, Maseud; Asadpour, Masoud

    2015-02-01

    As the size and number of online social networks are increasing day by day, social network analysis has become a popular issue in many branches of science. Link prediction is one of the key rolling issues in the analysis of social networks' evolution. As the size of social networks increases, the need for scalable link prediction algorithms is felt more strongly. The aim of this paper is to introduce a new unsupervised structural link prediction algorithm based on the ant colony approach. Recently, the ant colony approach has been used for solving some graph problems. Different kinds of networks are used for testing the proposed approach. In some networks, the proposed scalable algorithm has the best result in comparison to other structural unsupervised link prediction algorithms. In order to evaluate the algorithm results, methods like the top-n precision, area under the Receiver Operating Characteristic (ROC) and Precision-Recall curves are carried out on real-world networks.
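
    Illustrative sketch (not taken from the cited record): top-n precision for link prediction is the fraction of the n highest-scored candidate links that actually appear in the held-out network. The data layout (a dict of scored node pairs and a set of frozensets for true links) and the function name are assumptions made for the example.

```python
def top_n_precision(scores, true_links, n):
    """Fraction of the n best-scored candidate links that are real links.

    scores     -- dict mapping (u, v) node pairs to a predicted score
    true_links -- set of frozenset({u, v}) for links present in the test network
    n          -- number of top-ranked candidates to inspect
    """
    if n <= 0 or n > len(scores):
        raise ValueError("n must be between 1 and the number of candidates")
    ranked = sorted(scores, key=scores.get, reverse=True)[:n]
    hits = sum(1 for pair in ranked if frozenset(pair) in true_links)
    return hits / n


if __name__ == "__main__":
    candidate_scores = {("a", "b"): 0.9, ("a", "c"): 0.8,
                        ("c", "d"): 0.5, ("b", "c"): 0.1}
    held_out = {frozenset(("a", "b")), frozenset(("c", "d"))}
    print(top_n_precision(candidate_scores, held_out, 2))
```

    Storing true links as frozensets makes the check independent of the order in which the two endpoints of an undirected link are listed.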

  2. A benchmark server using high resolution protein structure data, and benchmark results for membrane helix predictions

    PubMed Central

    2013-01-01

    Background: Helical membrane proteins are vital for the interaction of cells with their environment. Predicting the location of membrane helices in protein amino acid sequences provides substantial understanding of their structure and function and identifies membrane proteins in sequenced genomes. Currently there is no comprehensive benchmark tool for evaluating prediction methods, and there is no publication comparing all available prediction tools. Current benchmark literature is outdated, as recently determined membrane protein structures are not included. Current literature is also limited to global assessments, as specialised benchmarks for predicting specific classes of membrane proteins were not previously carried out. Description: We present a benchmark server at http://sydney.edu.au/pharmacy/sbio/software/TMH_benchmark.shtml that uses recent high resolution protein structural data to provide a comprehensive assessment of the accuracy of existing membrane helix prediction methods. The server further allows a user to compare uploaded predictions generated by novel methods, permitting the comparison of these novel methods against all existing methods compared by the server. Benchmark metrics include sensitivity and specificity of predictions for membrane helix location and orientation, and many others. The server allows for customised evaluations such as assessing prediction method performances for specific helical membrane protein subtypes. We report results for custom benchmarks which illustrate how the server may be used for specialised benchmarks. Which prediction method is the best performing method depends on which measure is being benchmarked. The OCTOPUS membrane helix prediction method is consistently one of the highest performing methods across all measures in the benchmarks that we performed. Conclusions: The benchmark server allows general and specialised assessment of existing and novel membrane helix prediction methods. Users can employ this benchmark server to determine the most suitable method for the type of prediction the user needs to perform, be it general whole-genome annotation or the prediction of specific types of helical membrane protein. Creators of novel prediction methods can use this benchmark server to evaluate the performance of their new methods. The benchmark server will be a valuable tool for researchers seeking to extract more sophisticated information from the large and growing protein sequence databases. PMID:23530628

  3. Selective refinement and selection of near-native models in protein structure prediction

    PubMed Central

    Zhang, Jiong; Barz, Bogdan; Zhang, Jingfen; Xu, Dong; Kosztin, Ioan

    2015-01-01

    In recent years in silico protein structure prediction has reached a level where fully automated servers can generate large pools of near-native structures. However, the identification and further refinement of the best structures from the pool of models remain problematic. To address these issues, we have developed (i) a target-specific selective refinement (SR) protocol; and (ii) a molecular dynamics (MD) simulation based ranking (SMDR) method. In SR the all-atom refinement of structures is accomplished via the Rosetta Relax protocol, subject to specific constraints determined by the size and complexity of the target. The best-refined models are selected with SMDR by testing their relative stability against gradual heating through all-atom MD simulations. Through extensive testing we have found that Mufold-MD, our fully automated protein structure prediction server updated with the SR and SMDR modules, consistently outperformed its previous versions. PMID:26214389

  4. Analytical Methodology for Predicting the Onset of Widespread Fatigue Damage in Fuselage Structure

    NASA Technical Reports Server (NTRS)

    Harris, Charles E.; Newman, James C., Jr.; Piascik, Robert S.; Starnes, James H., Jr.

    1996-01-01

    NASA has developed a comprehensive analytical methodology for predicting the onset of widespread fatigue damage in fuselage structure. The determination of the number of flights and operational hours of aircraft service life that are related to the onset of widespread fatigue damage includes analyses for crack initiation, fatigue crack growth, and residual strength. Therefore, the computational capability required to predict analytically the onset of widespread fatigue damage must be able to represent a wide range of crack sizes from the material (microscale) level to the global structural-scale level. NASA studies indicate that the fatigue crack behavior in aircraft structure can be represented conveniently by the following three analysis scales: small three-dimensional cracks at the microscale level, through-the-thickness two-dimensional cracks at the local structural level, and long cracks at the global structural level. The computational requirements for each of these three analysis scales are described in this paper.

  5. Selective refinement and selection of near-native models in protein structure prediction.

    PubMed

    Zhang, Jiong; Barz, Bogdan; Zhang, Jingfen; Xu, Dong; Kosztin, Ioan

    2015-10-01

    In recent years in silico protein structure prediction has reached a level where fully automated servers can generate large pools of near-native structures. However, the identification and further refinement of the best structures from the pool of models remain problematic. To address these issues, we have developed (i) a target-specific selective refinement (SR) protocol; and (ii) a molecular dynamics (MD) simulation based ranking (SMDR) method. In SR the all-atom refinement of structures is accomplished via the Rosetta Relax protocol, subject to specific constraints determined by the size and complexity of the target. The best-refined models are selected with SMDR by testing their relative stability against gradual heating through all-atom MD simulations. Through extensive testing we have found that Mufold-MD, our fully automated protein structure prediction server updated with the SR and SMDR modules, consistently outperformed its previous versions. PMID:26214389

  6. Predicting RNA 3D structure using a coarse-grain helix-centered model

    PubMed Central

    Kerpedjiev, Peter; Höner zu Siederdissen, Christian; Hofacker, Ivo L.

    2015-01-01

    A 3D model of RNA structure can provide information about its function and regulation that is not possible with just the sequence or secondary structure. Current models suffer from low accuracy and long running times and either neglect or presume knowledge of the long-range interactions which stabilize the tertiary structure. Our coarse-grained, helix-based, tertiary structure model operates with only a few degrees of freedom compared with all-atom models while preserving the ability to sample tertiary structures given a secondary structure. It strikes a balance between the precision of an all-atom tertiary structure model and the simplicity and effectiveness of a secondary structure representation. It provides a simplified tool for exploring global arrangements of helices and loops within RNA structures. We provide an example of a novel energy function relying only on the positions of stems and loops. We show that coupling our model to this energy function produces predictions as good as or better than the current state of the art tools. We propose that given the wide range of conformational space that needs to be explored, a coarse-grain approach can explore more conformations in less iterations than an all-atom model coupled to a fine-grain energy function. Finally, we emphasize the overarching theme of providing an ensemble of predicted structures, something which our tool excels at, rather than providing a handful of the lowest energy structures. PMID:25904133

  7. Prediction of biodegradability from chemical structure: Modeling of ready biodegradation test data

    SciTech Connect

    Loonen, H.; Lindgren, F.; Hansen, B.

    1999-08-01

    Biodegradation data were collected and evaluated for 894 substances with widely varying chemical structures. All data were determined according to the Japanese Ministry of International Trade and Industry (MITI) I test protocol. The MITI I test is a screening test for ready biodegradability and has been described by Organization for Economic Cooperation and Development (OECD) test guideline 301 C and European Union (EU) test guideline C4F. The chemicals were characterized by a set of 127 predefined structural fragments. This data set was used to develop a model for the prediction of the biodegradability of chemicals under standardized OECD and EU ready biodegradation test conditions. Partial least squares (PLS) discriminant analysis was used for the model development. The model was evaluated by means of internal cross-validation and repeated external validation. The importance of various structural fragments and fragment interactions was investigated. The most important fragments include the presence of a long alkyl chain; hydroxy, ester, and acid groups (enhancing biodegradation); and the presence of one or more aromatic rings and halogen substituents (retarding biodegradation). More than 85% of the model predictions were correct when using the complete data set. The not readily biodegradable predictions were slightly better than the readily biodegradable predictions (86 vs 84%). The average percentage of correct predictions from four external validation studies was 83%. Model optimization by including fragment interactions improved the model's predictive capability to 89%. It can be concluded that the PLS model provides predictions of high reliability for a diverse range of chemical structures. The predictions conform to the concept of readily biodegradable (or not readily biodegradable) as defined by OECD and EU test guidelines.

  8. A Bayesian approach to improved calibration and prediction of groundwater models with structural error

    NASA Astrophysics Data System (ADS)

    Xu, Tianfang; Valocchi, Albert J.

    2015-11-01

    Numerical groundwater flow and solute transport models are usually subject to model structural error due to simplification and/or misrepresentation of the real system, which raises questions regarding the suitability of conventional least squares regression-based (LSR) calibration. We present a new framework that explicitly describes the model structural error statistically in an inductive, data-driven way. We adopt a fully Bayesian approach that integrates Gaussian process error models into the calibration, prediction, and uncertainty analysis of groundwater flow models. We test the usefulness of the fully Bayesian approach with a synthetic case study of the impact of pumping on surface-ground water interaction. We illustrate through this example that the Bayesian parameter posterior distributions differ significantly from parameters estimated by conventional LSR, which does not account for model structural error. For the latter method, parameter compensation for model structural error leads to biased, overconfident prediction under changing pumping condition. In contrast, integrating Gaussian process error models significantly reduces predictive bias and leads to prediction intervals that are more consistent with validation data. Finally, we carry out a generalized LSR recalibration step to assimilate the Bayesian prediction while preserving mass conservation and other physical constraints, using a full error covariance matrix obtained from Bayesian results. It is found that the recalibrated model achieved lower predictive bias compared to the model calibrated using conventional LSR. The results highlight the importance of explicit treatment of model structural error especially in circumstances where subsequent decision-making and risk analysis require accurate prediction and uncertainty quantification.

  9. Prediction of vibration characteristics in beam structure using sub-scale modeling with experimental validation

    NASA Astrophysics Data System (ADS)

    Zai, Behzad Ahmed; Sami, Saad; Khan, M. Amir; Ahmad, Furqan; Park, Myung Kyun

    2015-09-01

    Geometric or sub-scale modeling techniques are used for the evaluation of large and complex dynamic structures to ensure accurate reproduction of the load path, thus leading to the true dynamic characteristics of such structures. The sub-scale modeling technique is very effective in predicting the vibration characteristics of an original large structure when experimental testing is not feasible due to the absence of a large testing facility. Previous research focused mainly on free and harmonic vibration cases, with little or no consideration of readily encountered random vibration. A sub-scale modeling technique is proposed for estimating the vibration characteristics of any large-scale structure, such as launch vehicles, mega structures, etc., under various vibration load cases by utilizing a precise scaled-down model of that dynamic structure. In order to establish an analytical correlation between the original structure and its scaled models, different scale models of an isotropic cantilever beam are selected and analyzed under various vibration conditions (i.e., free, harmonic and random) using the finite element package ANSYS. The developed correlations are also validated through experimental testing. The predictions made from the vibratory response of the scaled-down beam through the established sets of correlations are found to be similar to the response measured from testing of the original beam structure. The established correlations are equally applicable to the prediction of dynamic characteristics of any complex structure through its scaled-down models. This paper presents a modified sub-scale modeling technique that enables accurate prediction of the vibration characteristics of large and complex structures under not only sinusoidal but also random vibrations.
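
    Illustrative sketch (not taken from the cited record): for free vibration, the textbook Euler-Bernoulli result for a uniform cantilever shows how a geometrically scaled model relates to the full-size beam. When every dimension is multiplied by a factor s and the material is unchanged, the natural frequencies scale as 1/s. The function name, the rectangular cross-section and the material values are assumptions for the example; the correlations developed in the record (including those for harmonic and random loading) are not reproduced here.

```python
import math

# First root of cos(x) * cosh(x) = -1, the cantilever frequency equation.
BETA1_L = 1.875104


def cantilever_first_frequency(youngs_modulus, density, length, width, thickness):
    """First bending natural frequency (Hz) of a uniform rectangular cantilever.

    Euler-Bernoulli beam theory; SI units throughout.
    """
    second_moment = width * thickness ** 3 / 12.0
    area = width * thickness
    return (BETA1_L ** 2 / (2.0 * math.pi)) * math.sqrt(
        youngs_modulus * second_moment / (density * area * length ** 4)
    )


if __name__ == "__main__":
    # Illustrative aluminium-like properties and dimensions.
    full = cantilever_first_frequency(70e9, 2700.0, 1.0, 0.05, 0.01)
    half = cantilever_first_frequency(70e9, 2700.0, 0.5, 0.025, 0.005)
    print(f"full scale: {full:.2f} Hz, half scale: {half:.2f} Hz, "
          f"ratio: {half / full:.3f}")
```

    The ratio follows because the second moment of area scales as s^4, the cross-sectional area as s^2 and the length term as s^4, leaving a factor of 1/s^2 under the square root.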

  10. PAIRpred: partner-specific prediction of interacting residues from sequence and structure.

    PubMed

    Minhas, Fayyaz ul Amir Afsar; Geiss, Brian J; Ben-Hur, Asa

    2014-07-01

    We present a novel partner-specific protein-protein interaction site prediction method called PAIRpred. Unlike most existing machine learning binding site prediction methods, PAIRpred uses information from both proteins in a protein complex to predict pairs of interacting residues from the two proteins. PAIRpred captures sequence and structure information about residue pairs through pairwise kernels that are used for training a support vector machine classifier. As a result, PAIRpred presents a more detailed model of protein binding, and offers state of the art accuracy in predicting binding sites at the protein level as well as inter-protein residue contacts at the complex level. We demonstrate PAIRpred's performance on Docking Benchmark 4.0 and recent CAPRI targets. We present a detailed performance analysis outlining the contribution of different sequence and structure features, together with a comparison to a variety of existing interface prediction techniques. We have also studied the impact of binding-associated conformational change on prediction accuracy and found PAIRpred to be more robust to such structural changes than existing schemes. As an illustration of the potential applications of PAIRpred, we provide a case study in which PAIRpred is used to analyze the nature and specificity of the interface in the interaction of human ISG15 protein with NS1 protein from influenza A virus. Python code for PAIRpred is available at http://combi.cs.colostate.edu/supplements/pairpred/. PMID:24243399
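
    Illustrative sketch (not taken from the cited record): a pairwise kernel scores the similarity of two residue pairs by combining a base kernel evaluated on the individual residues. The symmetric tensor-product form below is one common construction and is shown only as an assumption; the record does not specify which pairwise kernels PAIRpred uses, and the feature vectors, the RBF base kernel and the function names are hypothetical.

```python
import math


def rbf(x, y, gamma=0.5):
    """Gaussian (RBF) base kernel on two equal-length feature vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def pairwise_kernel(pair1, pair2, base=rbf):
    """Symmetric tensor-product kernel between two residue pairs.

    Each pair is (features of residue 1, features of residue 2). The value is
    k(a, c) * k(b, d) + k(a, d) * k(b, c), so swapping the two residues inside
    a pair leaves the result unchanged.
    """
    (a, b), (c, d) = pair1, pair2
    return base(a, c) * base(b, d) + base(a, d) * base(b, c)


if __name__ == "__main__":
    r1, r2 = (0.1, 0.9), (0.4, 0.2)   # hypothetical residue feature vectors
    r3, r4 = (0.2, 0.8), (0.5, 0.1)
    print(pairwise_kernel((r1, r2), (r3, r4)))
```

    A kernel matrix built this way over all candidate residue pairs can be passed to any support vector machine implementation that accepts precomputed kernels.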

  11. Structural descriptor database: a new tool for sequence-based functional site prediction

    PubMed Central

    Bernardes, Juliana S; Fernandez, Jorge H; Vasconcelos, Ana Tereza R

    2008-01-01

    Background: The Structural Descriptor Database (SDDB) is a web-based tool that predicts the function of proteins and functional site positions based on the structural properties of related protein families. Structural alignments and functional residues of a known protein set (defined as the training set) are used to build special Hidden Markov Models (HMM) called HMM descriptors. SDDB uses previously calculated and stored HMM descriptors for predicting active sites, binding residues, and protein function. The database integrates biologically relevant data filtered from several databases such as PDB, PDBSUM, CSA and SCOP. It accepts queries in FASTA format and predicts functional residue positions, protein-ligand interactions, and protein function, based on the SCOP database. Results: To assess the SDDB performance, we used different data sets. The Trypsin-like serine protease data set assessed how well SDDB predicts functional sites when curated data is available. The SCOP family data set was used to analyze SDDB performance by using training data extracted from PDBSUM (binding sites) and from CSA (active sites). The ATP-binding experiment was used to compare our approach with the most current method. For all evaluations, significant improvements were obtained with SDDB. Conclusion: SDDB performed better when trustworthy training data was available. SDDB worked better in predicting active sites rather than binding sites because the former are more conserved than the latter. Nevertheless, by using our prediction method we obtained results with precision above 70%. PMID:19032768

  12. Quantitative structure-activity relationships for predicting skin and respiratory sensitization.

    PubMed

    Rodford, Rosemary; Patlewicz, Grace; Walker, John D; Payne, Martin P

    2003-08-01

    Quantitative structure-activity relationships (QSARs) for predicting skin and respiratory sensitization are reviewed. Overall, progress has been hampered by the sparseness of good quality experimental data, a fact that makes it difficult, at this time, to recommend one or two QSARs for predicting skin and respiratory sensitization. Creation of appropriate data sets for uninvestigated classes of chemicals by experimentation should facilitate the development of more robust QSARs for predicting skin and respiratory sensitization. Such QSARs will be valuable in the evaluation of identifiable toxic hazards where dose responses are relevant, as is the case for skin and respiratory sensitization. PMID:12924584

  13. A Deep Learning Network Approach to ab initio Protein Secondary Structure Prediction.

    PubMed

    Spencer, Matt; Eickholt, Jesse; Cheng, Jianlin

    2015-01-01

    Ab initio protein secondary structure (SS) predictions are utilized to generate tertiary structure predictions, which are increasingly demanded due to the rapid discovery of proteins. Although recent developments have slightly exceeded previous methods of SS prediction, accuracy has stagnated around 80 percent and many wonder if prediction cannot be advanced beyond this ceiling. Disciplines that have traditionally employed neural networks are experimenting with novel deep learning techniques in attempts to stimulate progress. Since neural networks have historically played an important role in SS prediction, we wanted to determine whether deep learning could contribute to the advancement of this field as well. We developed an SS predictor that makes use of the position-specific scoring matrix generated by PSI-BLAST and deep learning network architectures, which we call DNSS. Graphical processing units and CUDA software optimize the deep network architecture and efficiently train the deep networks. Optimal parameters for the training process were determined, and a workflow comprising three separately trained deep networks was constructed in order to make refined predictions. This deep learning network approach was used to predict SS for a fully independent test dataset of 198 proteins, achieving a Q3 accuracy of 80.7 percent and a Sov accuracy of 74.2 percent. PMID:25750595
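
    The Q3 figure quoted above is the fraction of residues whose three-state label (helix, strand, coil) is predicted correctly. A minimal sketch of that metric follows; the function name `q3_accuracy` and the toy strings are illustrative assumptions and are not taken from DNSS or its test set.

```python
def q3_accuracy(true_ss: str, pred_ss: str) -> float:
    """Fraction of residues whose three-state label (H, E, C) matches."""
    if len(true_ss) != len(pred_ss):
        raise ValueError("sequences must have the same length")
    if not true_ss:
        raise ValueError("sequences must be non-empty")
    # Count positions where the predicted state equals the true state.
    correct = sum(t == p for t, p in zip(true_ss, pred_ss))
    return correct / len(true_ss)


# Toy example (hypothetical labels, for illustration only).
true_ss = "HHHHEEEECC"
pred_ss = "HHHCEEEECH"
print(q3_accuracy(true_ss, pred_ss))
```

    Q3 treats every residue equally, which is why segment-based measures such as Sov are usually reported alongside it.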

  14. Predicting forest structure across space and time using lidar and Landsat time series (Invited)

    NASA Astrophysics Data System (ADS)

    Cohen, W. B.; Pflugmacher, D.; Yang, Z.

    2013-12-01

    Lidar is unprecedented in its ability to provide detailed characterizations of forest structure. However, use of lidar is currently limited to relatively small areas associated with specific projects. Moreover, lidar data are even more severely limited historically, which inhibits retrospective analyses of structure change. Landsat data is commonly dismissed when considering a need to map forest structure due to its lack of sensitivity to structural variability. But with the opening of the archive by USGS, Landsat data can now be used in creative ways that take advantage of dense time series to describe historic disturbance and recovery. Because the condition and state of a forest at any given location is largely a function of its disturbance history, this provides an opportunity to use Landsat time series to inform statistical models that predict current forest structure. Additionally, because Landsat time series go back to 1972, it becomes possible to extend those models back in time to derive structure trajectories for retrospective analyses. We will present the results from one or two studies in the Pacific Northwest, USA that use disturbance history metrics derived from Landsat time series to demonstrate the new power of Landsat to predict forest structure (e.g., aboveground live biomass, height). The primary metrics used relate to the magnitude of the greatest disturbance, pre- and post- disturbance spectral trends, and current spectral properties. This is accomplished using a limited field dataset to translate a lidar coverage into the structure measures of interest, and then sampling the lidar data to build a robust statistical relationship between lidar-derived structure and disturbance history. We examined the effect of number of years of history on prediction strength and found that R2 increases and RMSE decreases for a period of ~20 years. 
    This means we can predict forest structure as far back as 1992, using the 20 years of history information contained in the MSS to TM data from 1972-1992. Because the time series data are highly calibrated through time, we can apply the model developed for the current period directly to the Landsat time series from 1972-1992 to predict 1992 forest structure. Results compare well to re-measured field data such that change in forest structure between 1992 and the present could be reliably calculated directly from a difference in the two predictions.
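
    The abstract judges prediction strength by R2 and RMSE. As a rough sketch of how these two statistics are computed from paired observations and predictions (the function name and the numbers are assumptions for illustration, not data from the study):

```python
import math


def r2_and_rmse(observed, predicted):
    """Return (R^2, RMSE) for paired observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    mean_obs = sum(observed) / n
    # Residual and total sums of squares.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observations are equal")
    return 1.0 - ss_res / ss_tot, math.sqrt(ss_res / n)


# Hypothetical structure values (arbitrary units) and model predictions.
r2, rmse = r2_and_rmse([1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0])
print(r2, rmse)
```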

  15. On the relevance of sophisticated structural annotations for disulfide connectivity pattern prediction.

    PubMed

    Becker, Julien; Maes, Francis; Wehenkel, Louis

    2013-01-01

    Disulfide bridges strongly constrain the native structure of many proteins and predicting their formation is therefore a key sub-problem of protein structure and function inference. Most recently proposed approaches for this prediction problem adopt the following pipeline: first they enrich the primary sequence with structural annotations, second they apply a binary classifier to each candidate pair of cysteines to predict disulfide bonding probabilities and finally, they use a maximum weight graph matching algorithm to derive the predicted disulfide connectivity pattern of a protein. In this paper, we adopt this three step pipeline and propose an extensive study of the relevance of various structural annotations and feature encodings. In particular, we consider five kinds of structural annotations, among which three are novel in the context of disulfide bridge prediction. So as to be usable by machine learning algorithms, these annotations must be encoded into features. For this purpose, we propose four different feature encodings based on local windows and on different kinds of histograms. The combination of structural annotations with these possible encodings leads to a large number of possible feature functions. In order to identify a minimal subset of relevant feature functions among those, we propose an efficient and interpretable feature function selection scheme, designed so as to avoid any form of overfitting. We apply this scheme on top of three supervised learning algorithms: k-nearest neighbors, support vector machines and extremely randomized trees. 
    Our results indicate that the use of only the PSSM (position-specific scoring matrix) together with the CSP (cysteine separation profile) is sufficient to construct a high-performance disulfide pattern predictor and that extremely randomized trees reach a disulfide pattern prediction accuracy of [Formula: see text] on the benchmark dataset SPX[Formula: see text], which corresponds to [Formula: see text] improvement over the state of the art. A web-application is available at http://m24.giga.ulg.ac.be:81/x3CysBridges. PMID:23533562
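
    The last step of the pipeline described above, turning pairwise bonding scores into a connectivity pattern, is a maximum weight matching problem. The sketch below solves it by exhaustive recursion, which is only practical for a handful of cysteines; it is not the authors' implementation, and polynomial-time matching algorithms would normally be used instead. The function name and the score matrix are hypothetical.

```python
def max_weight_matching(weights):
    """Best pairing of items 0..n-1 given a symmetric score matrix.

    Each item is used at most once and may stay unpaired.
    Returns (total score, list of (i, j) pairs). Exponential time.
    """

    def solve(remaining):
        if len(remaining) < 2:
            return 0.0, []
        first, rest = remaining[0], remaining[1:]
        # Option 1: leave `first` unbonded.
        best_score, best_pairs = solve(rest)
        # Option 2: bond `first` with each other remaining item.
        for k, partner in enumerate(rest):
            score, pairs = solve(rest[:k] + rest[k + 1:])
            score += weights[first][partner]
            if score > best_score:
                best_score, best_pairs = score, [(first, partner)] + pairs
        return best_score, best_pairs

    return solve(tuple(range(len(weights))))


# Hypothetical bonding probabilities for four cysteines.
scores = [
    [0.0, 0.9, 0.2, 0.1],
    [0.9, 0.0, 0.3, 0.4],
    [0.2, 0.3, 0.0, 0.8],
    [0.1, 0.4, 0.8, 0.0],
]
total, pattern = max_weight_matching(scores)
print(total, pattern)
```

    With these toy scores the pattern pairs cysteines 0 and 1 and cysteines 2 and 3, since that combination has the highest summed score.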

  16. On the Relevance of Sophisticated Structural Annotations for Disulfide Connectivity Pattern Prediction

    PubMed Central

    Becker, Julien; Maes, Francis; Wehenkel, Louis

    2013-01-01

    Disulfide bridges strongly constrain the native structure of many proteins and predicting their formation is therefore a key sub-problem of protein structure and function inference. Most recently proposed approaches for this prediction problem adopt the following pipeline: first they enrich the primary sequence with structural annotations, second they apply a binary classifier to each candidate pair of cysteines to predict disulfide bonding probabilities and finally, they use a maximum weight graph matching algorithm to derive the predicted disulfide connectivity pattern of a protein. In this paper, we adopt this three step pipeline and propose an extensive study of the relevance of various structural annotations and feature encodings. In particular, we consider five kinds of structural annotations, among which three are novel in the context of disulfide bridge prediction. So as to be usable by machine learning algorithms, these annotations must be encoded into features. For this purpose, we propose four different feature encodings based on local windows and on different kinds of histograms. The combination of structural annotations with these possible encodings leads to a large number of possible feature functions. In order to identify a minimal subset of relevant feature functions among those, we propose an efficient and interpretable feature function selection scheme, designed so as to avoid any form of overfitting. We apply this scheme on top of three supervised learning algorithms: k-nearest neighbors, support vector machines and extremely randomized trees. Our results indicate that the use of only the PSSM (position-specific scoring matrix) together with the CSP (cysteine separation profile) is sufficient to construct a high-performance disulfide pattern predictor and that extremely randomized trees reach a disulfide pattern prediction accuracy of [Formula: see text] on the benchmark dataset SPX[Formula: see text], which corresponds to [Formula: see text] improvement over the state of the art. A web-application is available at http://m24.giga.ulg.ac.be:81/x3CysBridges. PMID:23533562

  17. ASTRO-FOLD 2.0: an Enhanced Framework for Protein Structure Prediction

    PubMed Central

    Subramani, A.; Wei, Y.; Floudas, C. A.

    2011-01-01

    The three-dimensional (3-D) structure prediction of proteins, given their amino acid sequence, is addressed using the first-principles-based approach ASTRO-FOLD 2.0. The key features presented are: (1) Secondary structure prediction using a novel optimization-based consensus approach, (2) β-sheet topology prediction using mixed-integer linear optimization (MILP), (3) Residue-to-residue contact prediction using a high-resolution distance-dependent force field and MILP formulation, (4) Tight dihedral angle and distance bound generation for loop residues using dihedral angle clustering and non-linear optimization (NLP), (5) 3-D structure prediction using deterministic global optimization, stochastic conformational space annealing, and the full-atomistic ECEPP/3 potential, (6) Near-native structure selection using a traveling salesman problem-based clustering approach, ICON, and (7) Improved bound generation using chemical shifts of subsets of heavy atoms, generated by SPARTA and CS23D. Computational results of ASTRO-FOLD 2.0 on 47 blind targets of the recently concluded CASP9 experiment are presented. PMID:23049093

  18. Structure-aided prediction of mammalian transcription factor complexes in conserved non-coding elements

    PubMed Central

    Guturu, Harendra; Doxey, Andrew C.; Wenger, Aaron M.; Bejerano, Gill

    2013-01-01

    Mapping the DNA-binding preferences of transcription factor (TF) complexes is critical for deciphering the functions of cis-regulatory elements. Here, we developed a computational method that compares co-occurring motif spacings in conserved versus unconserved regions of the human genome to detect evolutionarily constrained binding sites of rigid TF complexes. Structural data were used to estimate TF complex physical plausibility, explore overlapping motif arrangements seldom tackled by non-structure-aware methods, and generate and analyse three-dimensional models of the predicted complexes bound to DNA. Using this approach, we predicted 422 physically realistic TF complex motifs at 18% false discovery rate, the majority of which (326, 77%) contain some sequence overlap between binding sites. The set of mostly novel complexes is enriched in known composite motifs, predictive of binding site configurations in TF–TF–DNA crystal structures, and supported by ChIP-seq datasets. Structural modelling revealed three cooperativity mechanisms: direct protein–protein interactions, potentially indirect interactions and ‘through-DNA’ interactions. Indeed, 38% of the predicted complexes were found to contain four or more bases in which TF pairs appear to synergize through overlapping binding to the same DNA base pairs in opposite grooves or strands. Our TF complex and associated binding site predictions are available as a web resource at http://bejerano.stanford.edu/complex. PMID:24218641

  19. Towards Practical Carbonation Prediction and Modelling for Service Life Design of Reinforced Concrete Structures

    NASA Astrophysics Data System (ADS)

    Ekolu, O. S.

    2015-11-01

    Amongst the scientific community, interest in the durability of concrete structures has been high for over 40 years. Of the various causes of degradation of concrete structures, corrosion is the most widespread durability problem and carbonation is one of the two causes of steel reinforcement corrosion. While much scientific understanding has been gained from the numerous carbonation studies undertaken over the past years, it is still not possible to accurately predict carbonation and apply it in the design of structures. This underscores the complex nature of the mechanisms as influenced by several interactive factors. Based on critical literature and some experience of the author, it is found that there still exist major challenges in establishing a mathematical constitutive relation for realistic carbonation prediction. While most current models employ permeability/diffusion as the main model property, analysis shows that the most practical material property would be compressive strength, which has a low coefficient of variation of 20% compared to 30 to 50% for permeability. This important characteristic of compressive strength, combined with its merit of simplicity and data availability at all stages of a structure's life, promotes its potential use in modelling over permeability. By using compressive strength in carbonation prediction, the need for accelerated testing and permeability measurement can be avoided. This paper attempts to examine the issues associated with carbonation prediction, which could underlie the current lack of a sound established prediction method. Suggestions are then made for possible employment of different or alternative approaches.

  20. Fast computational methods for predicting protein structure from primary amino acid sequence

    DOEpatents

    Agarwal, Pratul Kumar

    2011-07-19

    The present invention provides a method utilizing primary amino acid sequence of a protein, energy minimization, molecular dynamics and protein vibrational modes to predict three-dimensional structure of a protein. The present invention also determines possible intermediates in the protein folding pathway. The present invention has important applications to the design of novel drugs as well as protein engineering. The present invention predicts the three-dimensional structure of a protein independent of size of the protein, overcoming a significant limitation in the prior art.

  1. Artificial Intelligence in Prediction of Secondary Protein Structure Using CB513 Database.

    PubMed

    Avdagic, Zikrija; Purisevic, Elvir; Omanovic, Samir; Coralic, Zlatan

    2009-01-01

    In this paper we describe CB513, a non-redundant dataset suitable for the development of algorithms for prediction of secondary protein structure. A program was written in Borland Delphi to transform data from our dataset into a form suitable for training a neural network for prediction of secondary protein structure, implemented in the MATLAB Neural-Network Toolbox. Learning (training and testing) of the neural network is investigated with different window sizes, different numbers of neurons in the hidden layer and different numbers of training epochs, while using the CB513 dataset. PMID:21347158
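
    The "windows" mentioned above usually refer to a sliding window of residues centred on the position being classified. The following Python sketch shows one common way to build such network inputs with one-hot encoding; it is an assumption about the general technique, not a reconstruction of the authors' Delphi or MATLAB code, and the function name is hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def window_features(sequence, window=5):
    """One-hot encode a window centred on each residue.

    Slots that fall past either end of the sequence, or that hold an
    unknown residue letter, are left as all zeros.
    """
    if window % 2 == 0:
        raise ValueError("window size must be odd")
    half = window // 2
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    features = []
    for centre in range(len(sequence)):
        row = [0] * (window * len(AMINO_ACIDS))
        for offset in range(-half, half + 1):
            pos = centre + offset
            if 0 <= pos < len(sequence) and sequence[pos] in index:
                slot = (offset + half) * len(AMINO_ACIDS)
                row[slot + index[sequence[pos]]] = 1
        features.append(row)
    return features


# Each residue of a toy sequence becomes one input vector of length 3 * 20.
vectors = window_features("ACD", window=3)
print(len(vectors), len(vectors[0]))
```

    Larger windows give the network more sequence context per residue at the cost of a longer input vector, which is the trade-off explored when the window size is varied.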

  2. Prediction of service life of aircraft structural components using the half-cycle method

    NASA Technical Reports Server (NTRS)

    Ko, William L.

    1987-01-01

    The service life of aircraft structural components undergoing random stress cycling was analyzed by the application of fracture mechanics. The initial crack sizes at the critical stress points for the fatigue-crack growth analysis were established through proof load tests. The fatigue-crack growth rates for random stress cycles were calculated using the half-cycle method. A new equation was developed for calculating the number of remaining flights for the structural components. The number of remaining flights predicted by the new equation is much lower than that predicted by the conventional equation.

  3. New tools and expanded data analysis capabilities at the protein structure prediction center

    PubMed Central

    Kryshtafovych, Andriy; Prlic, Andreas; Dmytriv, Zinoviy; Daniluk, Pawel; Milostan, Maciej; Eyrich, Volker; Hubbard, Tim; Fidelis, Krzysztof

    2007-01-01

    We outline the main tasks performed by the Protein Structure Prediction Center in support of the CASP7 experiment and provide a brief review of the major measures used in the automatic evaluation of predictions. We describe in more detail the software developed to facilitate analysis of modeling success over and beyond the available templates and the adopted Java-based tool enabling visualization of multiple structural superpositions between target and several models/templates. We also give an overview of the CASP infrastructure provided by the Center and discuss the organization of the results web pages available through http://predictioncenter.org PMID:17705273

  4. Artificial Intelligence in Prediction of Secondary Protein Structure Using CB513 Database

    PubMed Central

    Avdagic, Zikrija; Purisevic, Elvir; Omanovic, Samir; Coralic, Zlatan

    2009-01-01

    In this paper we describe CB513, a non-redundant dataset suitable for the development of algorithms for prediction of secondary protein structure. A program was written in Borland Delphi to transform data from our dataset into a form suitable for training a neural network for prediction of secondary protein structure, implemented in the MATLAB Neural-Network Toolbox. Learning (training and testing) of the neural network is investigated with different window sizes, different numbers of neurons in the hidden layer and different numbers of training epochs, while using the CB513 dataset. PMID:21347158

  5. Ab-initio crystal structure prediction. A case study: NaBH₄

    SciTech Connect

    Caputo, Riccarda; Tekin, Adem

    2011-07-15

    Crystal structure prediction from first principles is still one of the most challenging and interesting issues in condensed matter science. We explored the potential energy surface of NaBH₄ by a combined ab-initio approach, based on global structure optimizations and quantum chemistry. In particular, we used simulated annealing (SA) and density functional theory (DFT) calculations. The methodology enabled the identification of several local minima, of which the global minimum corresponded to the tetragonal ground-state structure (P4₂/nmc), and the prediction of higher energy stable structures, among which a monoclinic (Pm) one was identified to be 22.75 kJ/mol above the ground state at T=298 K. In between, orthorhombic and cubic structures were recovered, in particular those with Pnma and F-43m symmetries. Graphical abstract: The total electron energy difference of the calculated stable structures. Here, the tetragonal (IT 137) and the monoclinic (IT 6) symmetry groups corresponded to the lowest and the highest energy structures, respectively. Highlights: (1) The potential energy surface of NaBH₄ is investigated. (2) This is done by a combination of global structure optimizations based on simulated annealing and density functional calculations. (3) We successfully reproduced the experimentally found tetragonal and orthorhombic structures of NaBH₄. (4) Furthermore, we found a new stable high energy structure.

  6. Analysis and Design of Fuselage Structures Including Residual Strength Prediction Methodology

    NASA Technical Reports Server (NTRS)

    Knight, Norman F.

    1998-01-01

    The goal of this research project is to develop and assess methodologies for the design and analysis of fuselage structures accounting for residual strength. Two primary objectives are included in this research activity: development of structural analysis methodology for predicting residual strength of fuselage shell-type structures; and the development of accurate, efficient analysis, design and optimization tools for fuselage shell structures. Assessment of these tools for robustness, efficiency, and usage in a fuselage shell design environment will be integrated with these two primary research objectives.

  7. Effect of Using Suboptimal Alignments in Template-Based Protein Structure Prediction

    PubMed Central

    Chen, Hao; Kihara, Daisuke

    2010-01-01

    Computational protein structure prediction remains a challenging task in protein bioinformatics. In recent years, the importance of template-based structure prediction has been increasing due to the growing number of protein structures solved by the structural genomics projects. To capitalize on the significant efforts and investments made in the structural genomics projects, it is urgent to establish effective ways to use the solved structures as templates by developing methods for exploiting remotely related proteins that cannot be simply identified by homology. In this work, we examine the effect of employing suboptimal alignments in template-based protein structure prediction. We show that suboptimal alignments are often more accurate than the optimal one, and such accurate suboptimal alignments can occur even at a very low rank of the alignment score. Suboptimal alignments contain a significant number of correct amino acid residue contacts. Moreover, suboptimal alignments can improve template-based models when used as input to Modeller. Finally, we employ suboptimal alignments for handling a contact potential in a probabilistic way in a threading program, SUPRB. The probabilistic contacts strategy outperforms the partly thawed approach, which only uses the optimal alignment in defining residue contacts, and also the reranking strategy, which uses the contact potential in reranking alignments. The comparison with existing methods in the template-recognition test shows that SUPRB is very competitive and outperforms existing methods. PMID:21058297

  8. Molecular Phylogeny and Predicted 3D Structure of Plant beta-D-N-Acetylhexosaminidase

    PubMed Central

    Hossain, Md. Anowar

    2014-01-01

    β-D-N-Acetylhexosaminidase, a family 20 glycosyl hydrolase, catalyzes the removal of β-1,4-linked N-acetylhexosamine residues from oligosaccharides and their conjugates. We constructed a phylogenetic tree of β-hexosaminidases to analyze the evolutionary history and predicted functions of plant hexosaminidases. Phylogenetic analysis reveals the complex evolutionary history of plant β-hexosaminidase, which can be described by gene duplication events. The 3D structure of tomato β-hexosaminidase (β-Hex-Sl) was predicted by homology modeling using 1now as a template. Structural conformity studies of the best fit model showed that more than 98% of the residues lie inside the favoured and allowed regions, whereas only 0.9% lie in the unfavourable region. The predicted 3D structure contains 531 amino acid residues with glycosyl hydrolase20b domain-I and glycosyl hydrolase20 superfamily domain-II including the (β/α)8 barrel in the central part. The α and β contents of the modeled structure were found to be 33.3% and 12.2%, respectively. Eleven amino acids were found to be involved in the ligand-binding site; Asp(330) and Glu(331) could play important roles in enzyme-catalyzed reactions. The predicted model provides a structural framework that can act as a guide to develop a hypothesis for β-Hex-Sl mutagenesis experiments for exploring the functions of this class of enzymes in the plant kingdom. PMID:25165734

  9. Predicting RNA-binding sites from the protein structure based on electrostatics, evolution and geometry

    PubMed Central

    Chen, Yao Chi; Lim, Carmay

    2008-01-01

    An RNA-binding protein places a surface helix, β-ribbon, or loop in an RNA helix groove and/or uses a cavity to accommodate unstacked bases. Hence, our strategy for predicting RNA-binding residues is based on detecting a surface patch and a disparate cleft. These were generated and scored according to the gas-phase electrostatic energy change upon mutating each residue to Asp−/Glu− and each residue's relative conservation. The method requires as input the protein structure and sufficient homologous sequences to define each residue's relative conservation. It yields as output a priority list of surface patch residues followed by a backup list of surface cleft residues distant from the patch residues for experimental testing of RNA binding. Among the 69 structurally non-homologous proteins tested, 81% possess an RNA-binding site with at least 70% of the maximum number of true positives in randomly generated patches of the same size as the predicted site; only two proteins did not contain any true RNA-binding residues in both predicted regions. Regardless of the protein conformational changes upon RNA binding, the prediction accuracies based on the RNA-free/bound protein structures were found to be comparable and their binding sites overlapped as long as there are no disordered RNA-binding regions in the free structure that are ordered in the corresponding RNA-bound protein structure. PMID:18276647

  10. M3Ag17(SPh)12 Nanoparticles and Their Structure Prediction.

    PubMed

    Wickramasinghe, Sameera; Atnagulov, Aydar; Yoon, Bokwon; Barnett, Robert N; Griffith, Wendell P; Landman, Uzi; Bigioni, Terry P

    2015-09-16

    Although silver nanoparticles are of great fundamental and practical interest, only one structure has been determined thus far: M4Ag44(SPh)30, where M is a monocation, and SPh is an aromatic thiolate ligand. This is in part due to the fact that no other molecular silver nanoparticles have been synthesized with aromatic thiolate ligands. Here we report the synthesis of M3Ag17(4-tert-butylbenzene-thiol)12, which has good stability and an unusual optical spectrum. We also present a rational strategy for predicting the structure of this molecule. First-principles calculations support the structural model, predict a HOMO-LUMO energy gap of 1.77 eV, and predict a new "monomer mount" capping motif, Ag(SR)3, for Ag nanoparticles. The calculated optical absorption spectrum is in good correspondence with the measured spectrum. Heteroatom substitution was also used as a structural probe. First-principles calculations based on the structural model predicted a strong preference for a single Au atom substitution in agreement with experiment. PMID:26301320

  11. The four ingredients of single-sequence RNA secondary structure prediction. A unifying perspective

    PubMed Central

    Rivas, Elena

    2013-01-01

    Any method for RNA secondary structure prediction is determined by four ingredients. The architecture is the choice of features implemented by the model (such as stacked basepairs, loop length distributions, etc.). The architecture determines the number of parameters in the model. The scoring scheme is the nature of those parameters (whether thermodynamic, probabilistic, or weights). The parameterization stands for the specific values assigned to the parameters. These three ingredients are referred to as “the model.” The fourth ingredient is the folding algorithms used to predict plausible secondary structures given the model and the sequence of a structural RNA. Here, I make several unifying observations drawn from looking at more than 40 years of methods for RNA secondary structure prediction in the light of this classification. As a final observation, there seems to be a performance ceiling that affects all methods with complex architectures, a ceiling that impacts all scoring schemes with remarkable similarity. This suggests that modeling RNA secondary structure by using intrinsic sequence-based plausible “foldability” will require the incorporation of other forms of information in order to constrain the folding space and to improve prediction accuracy. This could give an advantage to probabilistic scoring systems since a probabilistic framework is a natural platform to incorporate different sources of information into one single inference problem. PMID:23695796
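
    To make the four ingredients concrete, here is a deliberately simple sketch using Nussinov-style base-pair maximization, an algorithm chosen for illustration and not one singled out in the abstract. The architecture is "nested base pairs only", the scoring scheme is a plain weight per pair, the parameterization is the value 1 per pair plus a minimum hairpin loop length, and the folding algorithm is the dynamic program. The function name and the `min_loop` default are assumptions.

```python
def nussinov_max_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs in an RNA sequence.

    A pair (i, k) is allowed only if at least `min_loop` unpaired
    bases separate the two positions.
    """
    allowed = {("A", "U"), ("U", "A"), ("G", "C"),
               ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] holds the optimum for the subsequence seq[i..j].
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i + 1][j]  # case: position i stays unpaired
            for k in range(i + min_loop + 1, j + 1):  # case: i pairs with k
                if (seq[i], seq[k]) in allowed:
                    inside = best[i + 1][k - 1]
                    outside = best[k + 1][j] if k < j else 0
                    score = max(score, 1 + inside + outside)
            best[i][j] = score
    return best[0][n - 1]


# A toy hairpin: three stacked pairs closing a three-base loop.
print(nussinov_max_pairs("GGGAAAUCC"))
```

    Swapping the per-pair weight for thermodynamic or probabilistic parameters changes the scoring scheme and parameterization while leaving the same kind of folding algorithm in place.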

  12. New insights from cluster analysis methods for RNA secondary structure prediction.

    PubMed

    Rogers, Emily; Heitsch, Christine

    2016-05-01

    A widening gap exists between the best practices for RNA secondary structure prediction developed by computational researchers and the methods used in practice by experimentalists. Minimum free energy predictions, although broadly used, are outperformed by methods which sample from the Boltzmann distribution and data mine the results. In particular, moving beyond the single structure prediction paradigm yields substantial gains in accuracy. Furthermore, the largest improvements in accuracy and precision come from viewing secondary structures not at the base pair level but at lower granularity/higher abstraction. This suggests that random errors affecting precision and systematic ones affecting accuracy are both reduced by this 'fuzzier' view of secondary structures. Thus experimentalists who are willing to adopt a more rigorous, multilayered approach to secondary structure prediction by iterating through these levels of granularity will be much better able to capture fundamental aspects of RNA base pairing. WIREs RNA 2016, 7:278-294. doi: 10.1002/wrna.1334 For further resources related to this article, please visit the WIREs website. PMID:26971529

  13. ENTPRISE: An Algorithm for Predicting Human Disease-Associated Amino Acid Substitutions from Sequence Entropy and Predicted Protein Structures

    PubMed Central

    Zhou, Hongyi; Gao, Mu; Skolnick, Jeffrey

    2016-01-01

    The advance of next-generation sequencing technologies has made exome sequencing rapid and relatively inexpensive. A major application of exome sequencing is the identification of genetic variations likely to cause Mendelian diseases. This requires processing large amounts of sequence information and therefore computational approaches that can accurately and efficiently identify the subset of disease-associated variations are needed. The accuracy and high false positive rates of existing computational tools leave much room for improvement. Here, we develop a boosted tree regression machine-learning approach to predict human disease-associated amino acid variations by utilizing a comprehensive combination of protein sequence and structure features. On comparing our method, ENTPRISE, to the state-of-the-art methods SIFT, PolyPhen-2, MUTATIONASSESSOR, MUTATIONTASTER, and FATHMM, ENTPRISE exhibits significant improvement. In particular, on a testing dataset consisting of only proteins with balanced disease-associated and neutral variations defined as having the ratio of neutral/disease-associated variations between 0.3 and 3, the Matthews correlation coefficient by ENTPRISE is 0.493 as compared to 0.432 by PPH2-HumVar, 0.406 by SIFT, 0.403 by MUTATIONASSESSOR, 0.402 by PPH2-HumDiv, 0.305 by MUTATIONTASTER, and 0.181 by FATHMM. ENTPRISE is then applied to nucleic acid binding proteins in the human proteome. Disease-associated predictions are shown to be highly correlated with the number of protein-protein interactions. Both these predictions and the ENTPRISE server are freely available for academic users as a web service at http://cssb.biology.gatech.edu/entprise/. PMID:26982818
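
    The methods above are compared by their Matthews correlation coefficient, which summarizes a binary confusion matrix in a single value between -1 and 1. A minimal sketch of the standard formula follows; the function name `mcc` and the counts are illustrative assumptions, not figures from the ENTPRISE benchmark.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # By convention, return 0 when any marginal total is zero.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts: disease-associated = positive, neutral = negative.
print(mcc(tp=40, tn=45, fp=5, fn=10))
```

    Unlike plain accuracy, the coefficient stays informative when the two classes are unbalanced, which is relevant to the neutral/disease-associated ratio restriction described in the abstract.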

  14. ENTPRISE: An Algorithm for Predicting Human Disease-Associated Amino Acid Substitutions from Sequence Entropy and Predicted Protein Structures.

    PubMed

    Zhou, Hongyi; Gao, Mu; Skolnick, Jeffrey

    2016-01-01

    The advance of next-generation sequencing technologies has made exome sequencing rapid and relatively inexpensive. A major application of exome sequencing is the identification of genetic variations likely to cause Mendelian diseases. This requires processing large amounts of sequence information and therefore computational approaches that can accurately and efficiently identify the subset of disease-associated variations are needed. The limited accuracy and high false-positive rates of existing computational tools leave much room for improvement. Here, we develop a boosted tree regression machine-learning approach to predict human disease-associated amino acid variations by utilizing a comprehensive combination of protein sequence and structure features. On comparing our method, ENTPRISE, to the state-of-the-art methods SIFT, PolyPhen-2, MUTATIONASSESSOR, MUTATIONTASTER, and FATHMM, ENTPRISE exhibits significant improvement. In particular, on a testing dataset consisting of only proteins with balanced disease-associated and neutral variations defined as having the ratio of neutral/disease-associated variations between 0.3 and 3, the Matthews correlation coefficient by ENTPRISE is 0.493 as compared to 0.432 by PPH2-HumVar, 0.406 by SIFT, 0.403 by MUTATIONASSESSOR, 0.402 by PPH2-HumDiv, 0.305 by MUTATIONTASTER, and 0.181 by FATHMM. ENTPRISE is then applied to nucleic acid binding proteins in the human proteome. Disease-associated predictions are shown to be highly correlated with the number of protein-protein interactions. Both these predictions and the ENTPRISE server are freely available for academic users as a web service at http://cssb.biology.gatech.edu/entprise/. PMID:26982818

  15. Evaluation of a universal flow-through model for predicting and designing phosphorus removal structures.

    PubMed

    Penn, Chad; Bowen, James; McGrath, Joshua; Nairn, Robert; Fox, Garey; Brown, Glenn; Wilson, Stuart; Gill, Clinton

    2016-05-01

    Phosphorus (P) removal structures have been shown to decrease dissolved P loss from agricultural and urban areas which may reduce the threat of eutrophication. In order to design or quantify performance of these structures, the relationship between discrete and cumulative removal with cumulative P loading must be determined, either by individual flow-through experiments or model prediction. A model was previously developed for predicting P removal with P sorption materials (PSMs) under flow-through conditions, as a function of inflow P concentration, retention time (RT), and PSM characteristics. The objective of this study was to compare model results to measured P removal data from several PSMs under a range of conditions (P concentrations and RT) and scales ranging from laboratory to field. Materials tested included acid mine drainage residuals (AMDRs), treated and non-treated electric arc furnace (EAF) steel slag at different size fractions, and flue gas desulfurization (FGD) gypsum. Equations for P removal curves and cumulative P removed were not significantly different between predicted and actual values for any of the 23 scenarios examined. However, the model did tend to slightly over-predict cumulative P removal for calcium-based PSMs. The ability of the model to predict P removal for various materials, RTs, and P concentrations in both controlled settings and field structures validates its use in design and quantification of these structures. This ability to predict P removal without constant monitoring is vital to widespread adoption of P removal structures, especially for meeting discharge regulations and nutrient trading programs. PMID:26950026

  16. An improved hybrid global optimization method for protein tertiary structure prediction

    PubMed Central

    McAllister, Scott R.

    2009-01-01

    First principles approaches to the protein structure prediction problem must search through an enormous conformational space to identify low-energy, near-native structures. In this paper, we describe the formulation of the tertiary structure prediction problem as a nonlinear constrained minimization problem, where the goal is to minimize the energy of a protein conformation subject to constraints on torsion angles and interatomic distances. The core of the proposed algorithm is a hybrid global optimization method that combines the benefits of the αBB deterministic global optimization approach with conformational space annealing. These global optimization techniques employ a local minimization strategy that combines torsion angle dynamics and rotamer optimization to identify and improve the selection of initial conformations and then applies a sequential quadratic programming approach to further minimize the energy of the protein conformations subject to constraints. The proposed algorithm demonstrates the ability to identify both lower energy protein structures, as well as larger ensembles of low-energy conformations. PMID:20357906

  17. Structural predictions based on the compositions of cathodic materials by first-principles calculations

    NASA Astrophysics Data System (ADS)

    Li, Yang; Lian, Fang; Chen, Ning; Hao, Zhen-jia; Chou, Kuo-chih

    2015-05-01

    A first-principles method is applied to comparatively study the stability of lithium metal oxides with layered or spinel structures to predict the most energetically favorable structure for different compositions. The binding and reaction energies of the real or virtual layered LiMO2 and spinel LiM2O4 (M = Sc-Cu, Y-Ag, Mg-Sr, and Al-In) are calculated. The effect of element M on the structural stability, especially in the case of multiple-cation compounds, is discussed herein. The calculation results indicate that the phase stability depends on both the binding and reaction energies. The oxidation state of element M also plays a role in determining the dominant structure, i.e., layered or spinel phase. Moreover, calculation-based theoretical predictions of the phase stability of the doped materials agree with the previously reported experimental data.

  18. A Non-parametric Bayesian Approach for Predicting RNA Secondary Structures

    NASA Astrophysics Data System (ADS)

    Sato, Kengo; Hamada, Michiaki; Mituyama, Toutai; Asai, Kiyoshi; Sakakibara, Yasubumi

    Since many functional RNAs form stable secondary structures which are related to their functions, RNA secondary structure prediction is a crucial problem in bioinformatics. We propose a novel model for generating RNA secondary structures based on a non-parametric Bayesian approach, called hierarchical Dirichlet processes for stochastic context-free grammars (HDP-SCFGs). Here non-parametric means that some meta-parameters, such as the number of non-terminal symbols and production rules, do not have to be fixed. Instead their distributions are inferred in order to be adapted (in the Bayesian sense) to the training sequences provided. The results of our RNA secondary structure predictions show that HDP-SCFGs are more accurate than the MFE-based and other generative models.

  19. Hidden Markov models that use predicted local structure for fold recognition: alphabets of backbone geometry.

    PubMed

    Karchin, Rachel; Cline, Melissa; Mandel-Gutfreund, Yael; Karplus, Kevin

    2003-06-01

    An important problem in computational biology is predicting the structure of the large number of putative proteins discovered by genome sequencing projects. Fold-recognition methods attempt to solve the problem by relating the target proteins to known structures, searching for template proteins homologous to the target. Remote homologs that may have significant structural similarity are often not detectable by sequence similarities alone. To address this, we incorporated predicted local structure, a generalization of secondary structure, into two-track profile hidden Markov models (HMMs). We did not rely on a simple helix-strand-coil definition of secondary structure, but experimented with a variety of local structure descriptions, following a principled protocol to establish which descriptions are most useful for improving fold recognition and alignment quality. On a test set of 1298 nonhomologous proteins, HMMs incorporating a 3-letter STRIDE alphabet improved fold recognition accuracy by 15% over amino-acid-only HMMs and 23% over PSI-BLAST, measured by ROC-65 numbers. We compared two-track HMMs to amino-acid-only HMMs on a difficult alignment test set of 200 protein pairs (structurally similar with 3-24% sequence identity). HMMs with a 6-letter STRIDE secondary track improved alignment quality by 62%, relative to DALI structural alignments, while HMMs with an STR track (an expanded DSSP alphabet that subdivides strands into six states) improved by 40% relative to CE. PMID:12784210

  20. ATP-Dependent Chromatin Remodeling by the Cockayne Syndrome B DNA Repair-Transcription-Coupling Factor

    PubMed Central

    Citterio, Elisabetta; Van Den Boom, Vincent; Schnitzler, Gavin; Kanaar, Roland; Bonte, Edgar; Kingston, Robert E.; Hoeijmakers, Jan H. J.; Vermeulen, Wim

    2000-01-01

    The Cockayne syndrome B protein (CSB) is required for coupling DNA excision repair to transcription in a process known as transcription-coupled repair (TCR). Cockayne syndrome patients show UV sensitivity and severe neurodevelopmental abnormalities. CSB is a DNA-dependent ATPase of the SWI2/SNF2 family. SWI2/SNF2-like proteins are implicated in chromatin remodeling during transcription. Since chromatin structure also affects DNA repair efficiency, chromatin remodeling activities within repair are expected. Here we used purified recombinant CSB protein to investigate whether it can remodel chromatin in vitro. We show that binding of CSB to DNA results in an alteration of the DNA double-helix conformation. In addition, we find that CSB is able to remodel chromatin structure at the expense of ATP hydrolysis. Specifically, CSB can alter DNase I accessibility to reconstituted mononucleosome cores and disarrange an array of nucleosomes regularly spaced on plasmid DNA. In addition, we show that CSB interacts not only with double-stranded DNA but also directly with core histones. Finally, intact histone tails play an important role in CSB remodeling. CSB is the first repair protein found to play a direct role in modulating nucleosome structure. The relevance of this finding to the interplay between transcription and repair is discussed. PMID:11003660

  1. PiDNA: Predicting protein-DNA interactions with structural models.

    PubMed

    Lin, Chih-Kang; Chen, Chien-Yu

    2013-07-01

    Predicting binding sites of a transcription factor in the genome is an important, but challenging, issue in studying gene regulation. In the past decade, a large number of protein-DNA co-crystallized structures available in the Protein Data Bank have facilitated the understanding of interacting mechanisms between transcription factors and their binding sites. Recent studies have shown that both physics-based and knowledge-based potential functions can be applied to protein-DNA complex structures to deliver position weight matrices (PWMs) that are consistent with the experimental data. To further use the available structural models, the proposed Web server, PiDNA, aims at first constructing reliable PWMs by applying an atomic-level knowledge-based scoring function on numerous in silico mutated complex structures, and then using the PWM constructed by the structure models with small energy changes to predict the interaction between proteins and DNA sequences. With PiDNA, the users can easily predict the relative preference of all the DNA sequences with limited mutations from the native sequence co-crystallized in the model in a single run. More predictions on sequences with unlimited mutations can be realized by additional requests or file uploading. Three types of information can be downloaded after prediction: (i) the ranked list of mutated sequences, (ii) the PWM constructed by the favourable mutated structures, and (iii) any mutated protein-DNA complex structure models specified by the user. This study first shows that the constructed PWMs are similar to the annotated PWMs collected from databases or literature. Second, the prediction accuracy of PiDNA in detecting relatively high-specificity sites is evaluated by comparing the ranked lists against in vitro experiments from protein-binding microarrays. Finally, PiDNA is shown to be able to select the experimentally validated binding sites from 10,000 random sites with high accuracy. 
    With PiDNA, the users can design biological experiments based on the predicted sequence specificity and/or request mutated structure models for further protein design. As well, it is expected that PiDNA can be incorporated with chromatin immunoprecipitation data to refine large-scale inference of in vivo protein-DNA interactions. PiDNA is available at: http://dna.bime.ntu.edu.tw/pidna. PMID:23703214
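
    The PiDNA abstract relies on position weight matrices (PWMs) to rank candidate binding sites. The sketch below illustrates the general idea only: it builds a log-odds PWM from a small set of equal-length sequences and ranks sites by summed score. This is a simplification and an assumption on our part; PiDNA derives its PWMs from energy-scored, in silico mutated complex structures rather than from sequence counts, and the function names, pseudocount, uniform background, and example sequences here are all hypothetical.

```python
import math

BASES = "ACGT"


def build_pwm(sequences, pseudocount=1.0, background=0.25):
    """Build a log-odds PWM (one dict per position) from aligned sequences."""
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences must have the same length")
    pwm = []
    for pos in range(length):
        # Start every base at the pseudocount so unseen bases get a finite score.
        counts = {base: pseudocount for base in BASES}
        for seq in sequences:
            counts[seq[pos]] += 1
        total = sum(counts.values())
        pwm.append(
            {base: math.log2((counts[base] / total) / background) for base in BASES}
        )
    return pwm


def score(pwm, site):
    """Sum the per-position log-odds scores for one candidate site."""
    if len(site) != len(pwm):
        raise ValueError("site length must match PWM length")
    return sum(column[base] for column, base in zip(pwm, site))


def rank_sites(pwm, sites):
    """Return candidate sites ordered from highest to lowest PWM score."""
    return sorted(sites, key=lambda site: score(pwm, site), reverse=True)


if __name__ == "__main__":
    # Hypothetical favourable sequences, for illustration only.
    pwm = build_pwm(["ACGT", "ACGT", "ACGA"])
    print(rank_sites(pwm, ["TTTT", "ACGA", "ACGT"]))
```

    A ranked list of this kind corresponds loosely to output (i) in the abstract, and the matrix itself to output (ii).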

  2. Multiscale model for predicting shear zone structure and permeability in deforming rock

    NASA Astrophysics Data System (ADS)

    Cleary, Paul W.; Pereira, Gerald G.; Lemiale, Vincent; Piane, Claudio Delle; Clennell, M. Ben

    2016-04-01

    A novel multiscale model is proposed for the evolution of faults in rocks, which predicts their internal properties and permeability as strain increases. The macroscale model, based on smoothed particle hydrodynamics (SPH), predicts system scale deformation by a pressure-dependent elastoplastic representation of the rock and shear zone. Being a continuum method, SPH contains no intrinsic information on the grain scale structure or behaviour of the shear zone, so a series of discrete element method microscale shear cell models are embedded into the macroscale model at specific locations. In the example used here, the overall geometry and kinematics of a direct shear test on a block of intact rock is simulated. Deformation is imposed by a macroscale model where stresses and displacement rates are applied at the shear cell walls in contact with the rock. Since the microscale models within the macroscale block of deforming rock now include representations of the grains, the structure of the shear zone, the evolution of the size and shape distribution of these grains, and the dilatancy of the shear zone can all be predicted. The microscale dilatancy can be used to vary the macroscale model dilatancy both spatially and temporally to give a full two-way coupling between the spatial scales. The ability of this model to predict shear zone structure then allows the prediction of the shear zone permeability using the Lattice-Boltzmann method.

  3. Predicting community structure in snakes on Eastern Nearctic islands using ecological neutral theory and phylogenetic methods.

    PubMed

    Burbrink, Frank T; McKelvy, Alexander D; Pyron, R Alexander; Myers, Edward A

    2015-11-22

    Predicting species presence and richness on islands is important for understanding the origins of communities and how likely it is that species will disperse and resist extinction. The equilibrium theory of island biogeography (ETIB) and, as a simple model of sampling abundances, the unified neutral theory of biodiversity (UNTB), predict that in situations where mainland to island migration is high, species-abundance relationships explain the presence of taxa on islands. Thus, more abundant mainland species should have a higher probability of occurring on adjacent islands. In contrast to UNTB, if certain groups have traits that permit them to disperse to islands better than other taxa, then phylogeny may be more predictive of which taxa will occur on islands. Taking surveys of 54 island snake communities in the Eastern Nearctic along with mainland communities that have abundance data for each species, we use phylogenetic assembly methods and UNTB estimates to predict island communities. Species richness is predicted by island area, whereas turnover from the mainland to island communities is random with respect to phylogeny. Community structure appears to be ecologically neutral and abundance on the mainland is the best predictor of presence on islands. With regard to young and proximate islands, where allopatric or cladogenetic speciation is not a factor, we find that simple neutral models following UNTB and ETIB predict the structure of island communities. PMID:26609083

  4. Striatal structure and function predict individual biases in learning to avoid pain.

    PubMed

    Eldar, Eran; Hauser, Tobias U; Dayan, Peter; Dolan, Raymond J

    2016-04-26

    Pain is an elemental inducer of avoidance. Here, we demonstrate that people differ in how they learn to avoid pain, with some individuals refraining from actions that resulted in painful outcomes, whereas others favor actions that helped prevent pain. These individual biases were best explained by differences in learning from outcome prediction errors and were associated with distinct forms of striatal responses to painful outcomes. Specifically, striatal responses to pain were modulated in a manner consistent with an aversive prediction error in individuals who learned predominantly from pain, whereas in individuals who learned predominantly from success in preventing pain, modulation was consistent with an appetitive prediction error. In contrast, striatal responses to success in preventing pain were consistent with an appetitive prediction error in both groups. Furthermore, variation in striatal structure, encompassing the region where pain prediction errors were expressed, predicted participants' predominant mode of learning, suggesting the observed learning biases may reflect stable individual traits. These results reveal functional and structural neural components underlying individual differences in avoidance learning, which may be important contributors to psychiatric disorders involving pathological harm avoidance behavior. PMID:27071092

  5. Striatal structure and function predict individual biases in learning to avoid pain

    PubMed Central

    Eldar, Eran; Hauser, Tobias U.; Dayan, Peter; Dolan, Raymond J.

    2016-01-01

    Pain is an elemental inducer of avoidance. Here, we demonstrate that people differ in how they learn to avoid pain, with some individuals refraining from actions that resulted in painful outcomes, whereas others favor actions that helped prevent pain. These individual biases were best explained by differences in learning from outcome prediction errors and were associated with distinct forms of striatal responses to painful outcomes. Specifically, striatal responses to pain were modulated in a manner consistent with an aversive prediction error in individuals who learned predominantly from pain, whereas in individuals who learned predominantly from success in preventing pain, modulation was consistent with an appetitive prediction error. In contrast, striatal responses to success in preventing pain were consistent with an appetitive prediction error in both groups. Furthermore, variation in striatal structure, encompassing the region where pain prediction errors were expressed, predicted participants’ predominant mode of learning, suggesting the observed learning biases may reflect stable individual traits. These results reveal functional and structural neural components underlying individual differences in avoidance learning, which may be important contributors to psychiatric disorders involving pathological harm avoidance behavior. PMID:27071092

  6. Prediction of the Fundamental Period of Infilled RC Frame Structures Using Artificial Neural Networks

    PubMed Central

    Asteris, Panagiotis G.; Tsaris, Athanasios K.; Cavaleri, Liborio; Repapis, Constantinos C.; Papalou, Angeliki; Di Trapani, Fabio; Karypidis, Dimitrios F.

    2016-01-01

    The fundamental period is one of the most critical parameters for the seismic design of structures. Several approaches for its estimation exist in the literature, but they often conflict with each other, making their use questionable. Furthermore, the majority of these approaches do not take into account the presence of infill walls in the structure, despite the fact that infill walls increase the stiffness and mass of the structure, leading to significant changes in the fundamental period. In the present paper, artificial neural networks (ANNs) are used to predict the fundamental period of infilled reinforced concrete (RC) structures. For the training and the validation of the ANN, a large data set is used based on a detailed investigation of the parameters that affect the fundamental period of RC structures. The comparison of the predicted values with analytical ones indicates the potential of using ANNs for the prediction of the fundamental period of infilled RC frame structures taking into account the crucial parameters that influence its value. PMID:27066069

  7. De novo structure prediction and experimental characterization of folded peptoid oligomers

    PubMed Central

    Butterfoss, Glenn L.; Yoo, Barney; Jaworski, Jonathan N.; Chorny, Ilya; Dill, Ken A.; Zuckermann, Ronald N.; Bonneau, Richard; Kirshenbaum, Kent; Voelz, Vincent A.

    2012-01-01

    Peptoid molecules are biomimetic oligomers that can fold into unique three-dimensional structures. As part of an effort to advance computational design of folded oligomers, we present blind-structure predictions for three peptoid sequences using a combination of Replica Exchange Molecular Dynamics (REMD) simulation and Quantum Mechanical refinement. We correctly predicted the structure of an N-aryl peptoid trimer to within 0.2 Å rmsd-backbone and a cyclic peptoid nonamer to an accuracy of 1.0 Å rmsd-backbone. X-ray crystallographic structures are presented for a linear N-alkyl peptoid trimer and for the cyclic peptoid nonamer. The peptoid macrocycle structure features a combination of cis and trans backbone amides, significant nonplanarity of the amide bonds, and a unique “basket” arrangement of (S)-N(1-phenylethyl) side chains encompassing a bound ethanol molecule. REMD simulations of the peptoid trimers reveal that well-folded peptoids can exhibit funnel-like conformational free energy landscapes similar to those for ordered polypeptides. These results indicate that physical modeling can successfully perform de novo structure prediction for small peptoid molecules. PMID:22908242

  8. Prediction of the Fundamental Period of Infilled RC Frame Structures Using Artificial Neural Networks.

    PubMed

    Asteris, Panagiotis G; Tsaris, Athanasios K; Cavaleri, Liborio; Repapis, Constantinos C; Papalou, Angeliki; Di Trapani, Fabio; Karypidis, Dimitrios F

    2016-01-01

    The fundamental period is one of the most critical parameters for the seismic design of structures. Several approaches for its estimation exist in the literature, but they often conflict with each other, making their use questionable. Furthermore, the majority of these approaches do not take into account the presence of infill walls in the structure, despite the fact that infill walls increase the stiffness and mass of the structure, leading to significant changes in the fundamental period. In the present paper, artificial neural networks (ANNs) are used to predict the fundamental period of infilled reinforced concrete (RC) structures. For the training and the validation of the ANN, a large data set is used based on a detailed investigation of the parameters that affect the fundamental period of RC structures. The comparison of the predicted values with analytical ones indicates the potential of using ANNs for the prediction of the fundamental period of infilled RC frame structures taking into account the crucial parameters that influence its value. PMID:27066069

  9. Thermodynamic ground state of MgB6 predicted from first principles structure search methods

    SciTech Connect

    Wang, Hui; Department of Physics and Engineering Physics, University of Saskatchewan, Saskatoon, Saskatchewan S7N 5E2; LeBlanc, K. A.; Gao, Bo; Yao, Yansun; Canadian Light Source, Saskatoon, Saskatchewan S7N 0X4

    2014-01-28

    Crystalline structures of magnesium hexaboride, MgB6, were investigated using unbiased structure searching methods combined with first principles density functional calculations. An orthorhombic Cmcm structure was predicted as the thermodynamic ground state of MgB6. The energy of the Cmcm structure is significantly lower than the theoretical MgB6 models previously considered based on a primitive cubic arrangement of boron octahedra. The Cmcm structure is stable against the decomposition to elemental magnesium and boron solids at atmospheric pressure and high pressures up to 18.3 GPa. A unique feature of the predicted Cmcm structure is that the boron atoms are clustered into two forms: localized B6 octahedra and extended B∞ ribbons. Within the boron ribbons, the electrons are delocalized and this leads to a metallic ground state with vanished electric dipoles. The present prediction is in contrast to the previous proposal that the crystalline MgB6 maintains a semiconducting state with permanent dipole moments. MgB6 is estimated to have much weaker electron-phonon coupling compared with that of MgB2, and therefore it is not expected to be able to sustain superconductivity at high temperatures.

  10. Interaction of the IκBα C-terminal PEST sequence with NF-κB: Insights into the Inhibition of NF-κB DNA binding by IκBα

    PubMed Central

    Sue, Shih-Che; Dyson, H. Jane

    2009-01-01

    The transcription factor NF-κB (p50/p65) binds either a κB DNA element or its inhibitor protein, IκBα, but these two binding events are mutually exclusive. The reason for this exclusivity is not obvious from the available crystal structure data. The C-terminal PEST-like sequence of IκBα appears to be involved in the process, but it is located in both of the published X-ray structures of the IκBα/NF-κB complex at a significant distance away from the DNA contact loop in the NF-κB DNA-binding domain. We have used nuclear magnetic resonance spectroscopy and differential isotopic labeling to probe the interactions between the p50/p65 NF-κB heterodimer and IκBα in solution. Our measurements are able to resolve a local structural discrepancy between the two crystal structures, and we confirm that the primary interaction of the IκBα PEST domain is with the DNA-binding domain of the p65 subunit. Mutagenesis of key arginine residues in the DNA contact sequence results in the loss of specific interaction of the PEST sequence with the p65 subdomain. We conclude that the local structure of the IκBα/NF-κB complex in the region of the PEST sequence is consistent with a direct interaction of this acidic sequence with the basic DNA contact sequence in p65, thus reducing the affinity of NF-κB for DNA by a competitive mechanism that is still to be elucidated fully. PMID:19327364

  11. Protein Tertiary Structure Prediction Based on Main Chain Angle Using a Hybrid Bees Colony Optimization Algorithm

    NASA Astrophysics Data System (ADS)

    Mahmood, Zakaria N.; Mahmuddin, Massudi; Mahmood, Mohammed Nooraldeen

    Encoding the amino acid sequences of proteins in order to classify them into their respective families and subfamilies is an important research area. However, for a given protein, the exact action, whether hormonal, enzymatic, transmembrane, or as a nuclear receptor, does not depend solely on the amino acid sequence but also on the way the amino acid chain folds. This study provides a prototype system that is able to predict protein tertiary structure. Several methods are used to develop and evaluate the system to produce better accuracy in protein 3D structure prediction. The Bees Optimization algorithm, which is inspired by the food-foraging behaviour of honey bees, is used in the searching phase. In this study, the experiment is conducted on short-sequence proteins that have been used in previous research with well-known tools. The proposed approach shows a promising result.

  12. Flow-induced vibration prediction using fluid-structure interaction method

    SciTech Connect

    Oyamada, O.; Kawahata, J.; Ono, S.; Murayama, K.; Takamura, N.

    1995-12-01

    An FEM analysis program is developed that can predict flow-induced vibration (FIV) in piping systems by solving the fluid-structure interaction. The program solves the relationship between fluid vibration sources (pump, orifice, etc.) and piping vibration by coupling the fluid and the pipe structure together. Present evaluation methods for flow-induced vibration are based on conventional design methods with excessive margins or on rules accumulated from past plant experience. Therefore, countermeasures based on the measurement of vibrations in start-up tests are very important. In such methods, the phenomena of flow-induced vibration are not accurately predicted but are sometimes conservatively evaluated. The newly developed program provides the function to predict the influence of fluid pulsations on piping vibrations. In this paper, the development and application results of the program are presented.

  13. Factor Structure, Stability, and Predictive Validity of College Students' Relationship Self-Efficacy Beliefs

    ERIC Educational Resources Information Center

    Lopez, Frederick G.; Morua, Wendy; Rice, Kenneth G.

    2007-01-01

    This study explored the underlying structure, stability, and predictive validity of college students' scores on a measure of relationship maintenance self-efficacy beliefs. Three identified efficacy-related factors were found to be stable; related in expected directions with gender, commitment status, and adult attachment orientations; and…

  14. DEVELOPMENT OF QUANTITATIVE STRUCTURE-ACTIVITY RELATIONSHIPS FOR PREDICTING BIODEGRADATION KINETICS

    EPA Science Inventory

    Results have been presented on the development of a structure-activity relationship for biodegradation using a group contribution approach. Using this approach, reported results of the kinetic rate constant agree within 20% with the predicted values. Additional compound studies are...

  15. Weighted Structural Regression: A Broad Class of Adaptive Methods for Improving Linear Prediction.

    ERIC Educational Resources Information Center

    Pruzek, Robert M.; Lepak, Greg M.

    1992-01-01

    Adaptive forms of weighted structural regression are developed and discussed. Bootstrapping studies indicate that the new methods have potential to recover known population regression weights and predict criterion score values routinely better than do ordinary least squares methods. The new methods are scale free and simple to compute. (SLD)

  16. Improved Displacement Transfer Functions for Structure Deformed Shape Predictions Using Discretely Distributed Surface Strains

    NASA Technical Reports Server (NTRS)

    Ko, William L.; Fleischer, Van Tran

    2012-01-01

    In the formulations of earlier Displacement Transfer Functions for structure shape predictions, the surface strain distributions, along a strain-sensing line, were represented with piecewise linear functions. To improve the shape-prediction accuracies, Improved Displacement Transfer Functions were formulated using piecewise nonlinear strain representations. Through discretization of an embedded beam (depth-wise cross section of a structure along a strain-sensing line) into multiple small domains, piecewise nonlinear functions were used to describe the surface strain distributions along the discretized embedded beam. Such a piecewise approach enabled the piecewise integrations of the embedded beam curvature equations to yield slope and deflection equations in recursive forms. The resulting Improved Displacement Transfer Functions, written in summation forms, were expressed in terms of beam geometrical parameters and surface strains along the strain-sensing line. By feeding the surface strains into the Improved Displacement Transfer Functions, structural deflections could be calculated at multiple points for mapping out the overall structural deformed shapes for visual display. The shape-prediction accuracies of the Improved Displacement Transfer Functions were then examined in view of finite-element-calculated deflections using different tapered cantilever tubular beams. It was found that by using the piecewise nonlinear strain representations, the shape-prediction accuracies could be greatly improved, especially for highly tapered cantilever tubular beams.
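
    The abstract above describes integrating beam curvature, obtained from measured surface strains, twice along a strain-sensing line to recover slope and deflection recursively. The sketch below illustrates that recursion under stated assumptions and is not the report's closed-form Displacement Transfer Functions: it uses classical beam theory (curvature equals surface strain divided by the distance from the neutral axis to the surface), a cantilever fixed at the first station, and plain trapezoidal integration between stations. The function name, argument names, and sign convention are hypothetical.

```python
def deflections_from_strains(x, strain, half_depth):
    """Recover slope and deflection along a cantilever from surface strains.

    x          -- station positions along the sensing line, x[0] at the fixed end
    strain     -- measured surface bending strain at each station
    half_depth -- distance from neutral axis to the surface at each station
                  (varies along a tapered beam)

    Returns (slope, deflection) lists with one entry per station.
    """
    if not (len(x) == len(strain) == len(half_depth)):
        raise ValueError("x, strain and half_depth must have equal length")

    # Classical beam theory: curvature = surface strain / half depth.
    curvature = [eps / c for eps, c in zip(strain, half_depth)]

    # Fixed-end boundary conditions: zero slope and zero deflection at x[0].
    slope = [0.0]
    deflection = [0.0]
    for i in range(1, len(x)):
        dx = x[i] - x[i - 1]
        # First integration (trapezoid): curvature -> slope.
        slope.append(slope[-1] + 0.5 * (curvature[i - 1] + curvature[i]) * dx)
        # Second integration (trapezoid): slope -> deflection.
        deflection.append(deflection[-1] + 0.5 * (slope[-2] + slope[-1]) * dx)
    return slope, deflection


if __name__ == "__main__":
    # Hypothetical uniform beam with constant surface strain.
    stations = [i * 0.1 for i in range(11)]
    slope, deflection = deflections_from_strains(
        stations, [1e-3] * 11, [0.05] * 11
    )
    print(slope[-1], deflection[-1])
```

    For constant curvature the trapezoidal recursion is exact, so the tip deflection equals strain times length squared divided by twice the half depth. Strain fields that vary nonlinearly between stations are where higher-order piecewise representations, such as those the abstract describes, would be expected to matter.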

  17. Body Vigilance in Nonclinical and Anxiety Disorder Samples: Structure, Correlates, and Prediction of Health Concerns

    ERIC Educational Resources Information Center

    Olatunji, Bunmi O.; Deacon, Brett J.; Abramowitz, Jonathan S.; Valentiner, David P.

    2007-01-01

    The Body Vigilance Scale (BVS) is a measure developed to assess one's conscious attendance to internal cues. The present report investigated the structure, correlates, and predictive utility of the BVS in nonclinical (N=442) and anxiety (N=135) disorder samples. The findings of Study 1 suggest that the BVS is 1-dimensional in a nonclinical sample,…

  18. DSSTOX WEBSITE LAUNCH: IMPROVING PUBLIC ACCESS TO DATABASES FOR BUILDING STRUCTURE-TOXICITY PREDICTION MODELS

    EPA Science Inventory

    DSSTox Website Launch: Improving Public Access to Databases for Building Structure-Toxicity Prediction Models
    Ann M. Richard
    US Environmental Protection Agency, Research Triangle Park, NC, USA

    Distributed: Decentralized set of standardized, field-delimited databases,...

  19. Predicting hepatotoxicity using ToxCast in vitro bioactivity and chemical structure

    EPA Science Inventory

    Background: The U.S. EPA ToxCast™ program is screening thousands of environmental chemicals for bioactivity using hundreds of high-throughput in vitro assays to build predictive models of toxicity. We represented chemicals based on bioactivity and chemical structure descriptors ...

  20. Prediction of trabecular bone principal structural orientation using quantitative ultrasound scanning.

    PubMed

    Lin, Liangjun; Cheng, Jiqi; Lin, Wei; Qin, Yi-Xian

    2012-06-26

    Bone adapts its structure in response to its mechanical environment, as described by Wolff's Law. The alignment of trabecular structure adapts to the particular mechanical milieu applied to it. Where normal mechanical loading is absent, it is important to assess the anisotropic deterioration of bone under extreme conditions, i.e., long-term space missions and disease-oriented disuse, in order to predict the risk of fractures. The propagation of ultrasound waves in trabecular bone is substantially influenced by the anisotropy of the trabecular structure. Previous studies have shown that both ultrasound velocity and amplitude depend on the incident angle of the ultrasound signal into the bone sample. In this work, seven bovine trabecular bone balls were used for rotational ultrasound measurement around three anatomical axes to elucidate the ability of ultrasound to identify trabecular orientation. Both ultrasound attenuation (ATT) and fast wave velocity (UV) were used to calculate the principal orientation of the trabecular bone. Compared with the orientation given by the mean intercept length (MIL) tensor obtained from μCT, the prediction by UV differed by 4.45°, while the prediction by ATT differed by 11.67°. This result demonstrates the ability of ultrasound to serve as a non-invasive measurement tool for the principal structural orientation of the trabecular bone. PMID:22560370
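
    The angle difference reported in this abstract compares two principal orientations. A minimal sketch of such a comparison follows, under the assumption that each orientation is given as a 3D direction vector and that an axis and its reverse describe the same orientation; the function name is hypothetical and the abstract does not state how the authors computed the angle.

```python
import math
from typing import Sequence


def axis_angle_deg(u: Sequence[float], v: Sequence[float]) -> float:
    """Smallest angle in degrees between two undirected axes.

    Assumption: an orientation has no sign, so u and -u are equivalent and
    the result always lies between 0 and 90 degrees.
    """
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("orientation vectors must be non-zero")
    # abs() folds opposite directions together; min() guards against
    # rounding pushing the cosine slightly above 1.
    cosine = min(1.0, abs(dot) / (norm_u * norm_v))
    return math.degrees(math.acos(cosine))
```

    With this convention, a predicted axis pointing exactly opposite to the reference axis counts as perfect agreement.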

  1. Comparison of Algorithms for Prediction of Protein Structural Features from Evolutionary Data.

    PubMed

    Bywater, Robert P

    2016-01-01

    Proteins have many functions and predicting these is still one of the major challenges in theoretical biophysics and bioinformatics. Foremost amongst these functions is the need to fold correctly thereby allowing the other genetically dictated tasks that the protein has to carry out to proceed efficiently. In this work, some earlier algorithms for predicting protein domain folds are revisited and they are compared with more recently developed methods. In dealing with intractable problems such as fold prediction, when different algorithms show convergence onto the same result there is every reason to take all algorithms into account such that a consensus result can be arrived at. In this work it is shown that the application of different algorithms in protein structure prediction leads to results that do not converge as such but rather they collude in a striking and useful way that has never been considered before. PMID:26963911

  2. Comparison of Algorithms for Prediction of Protein Structural Features from Evolutionary Data

    PubMed Central

    Bywater, Robert P.

    2016-01-01

    Proteins have many functions and predicting these is still one of the major challenges in theoretical biophysics and bioinformatics. Foremost amongst these functions is the need to fold correctly thereby allowing the other genetically dictated tasks that the protein has to carry out to proceed efficiently. In this work, some earlier algorithms for predicting protein domain folds are revisited and they are compared with more recently developed methods. In dealing with intractable problems such as fold prediction, when different algorithms show convergence onto the same result there is every reason to take all algorithms into account such that a consensus result can be arrived at. In this work it is shown that the application of different algorithms in protein structure prediction leads to results that do not converge as such but rather they collude in a striking and useful way that has never been considered before. PMID:26963911

  3. I-TASSER: fully automated protein structure prediction in CASP8.

    PubMed

    Zhang, Yang

    2009-01-01

    The I-TASSER algorithm for 3D protein structure prediction was tested in CASP8, with the procedure fully automated in both the Server and Human sections. The quality of the server models is close to that of human ones but the human predictions incorporate more diverse templates from other servers which improve the human predictions in some of the distant homology targets. For the first time, the sequence-based contact predictions from machine learning techniques are found helpful for both template-based modeling (TBM) and template-free modeling (FM). In TBM, although the accuracy of the sequence-based contact predictions is on average lower than that from template-based ones, the novel contacts in the sequence-based predictions, which are complementary to the threading templates in the weakly aligned or unaligned regions, are important to improve the global and local packing in these regions. Moreover, the newly developed atomic structural refinement algorithm was tested in CASP8 and found to improve the hydrogen-bonding networks and the overall TM-score, which is mainly due to its ability to remove steric clashes so that the models can be generated from cluster centroids. Nevertheless, one of the major issues of the I-TASSER pipeline is the model selection, where the best models could not be appropriately recognized when the correct templates are detected only by a minority of the threading algorithms. There are also problems related to domain-splitting and mirror image recognition which mainly influence the performance of I-TASSER modeling in the FM-based structure predictions. PMID:19768687

  4. Use of tiling array data and RNA secondary structure predictions to identify noncoding RNA genes

    PubMed Central

    Weile, Christian; Gardner, Paul P; Hedegaard, Mads M; Vinther, Jeppe

    2007-01-01

    Background: Within the last decade a large number of noncoding RNA genes have been identified, but this may only be the tip of the iceberg. Using comparative genomics a large number of sequences that have signals concordant with conserved RNA secondary structures have been discovered in the human genome. Moreover, genome-wide transcription profiling with tiling arrays indicates that the majority of the genome is transcribed. Results: We have combined tiling array data with genome-wide structural RNA predictions to search for novel noncoding and structural RNA genes that are expressed in the human neuroblastoma cell line SK-N-AS. Using this strategy, we identify thousands of human candidate RNA genes. To further verify the expression of these genes, we focused on candidate genes that had stable hairpin structures or a high level of covariance. Using northern blotting, we verify the expression of 2 out of 3 of the hairpin structures and 3 out of 9 high covariance structures in SK-N-AS cells. Conclusion: Our results demonstrate that many human noncoding, structured and conserved RNA genes remain to be discovered and that tissue-specific tiling array data can be used in combination with computational predictions of sequences encoding structural RNAs to improve the search for such genes. PMID:17645787

  5. De novo prediction of protein folding pathways and structure using the principle of sequential stabilization

    PubMed Central

    Adhikari, Aashish N.; Freed, Karl F.; Sosnick, Tobin R.

    2012-01-01

    Motivated by the relationship between the folding mechanism and the native structure, we develop a unified approach for predicting folding pathways and tertiary structure using only the primary sequence as input. Simulations begin from a realistic unfolded state devoid of secondary structure and use a chain representation lacking explicit side chains, rendering the simulations many orders of magnitude faster than molecular dynamics simulations. The multiple round nature of the algorithm mimics the authentic folding process and tests the effectiveness of sequential stabilization (SS) as a search strategy wherein 2° structural elements add onto existing structures in a process of progressive learning and stabilization of structure found in prior rounds of folding. Because no a priori knowledge is used, we can identify kinetically significant non-native interactions and intermediates, sometimes generated by only two mutations, while the evolution of contact matrices is often consistent with experiments. Moreover, structure prediction improves substantially by incorporating information from prior rounds. The success of our simple, homology-free approach affirms the validity of our description of the primary determinants of folding pathways and structure, and the effectiveness of SS as a search strategy. PMID:23045636

  6. Facing the challenges of structure-based target prediction by inverse virtual screening.

    PubMed

    Schomburg, Karen T; Bietz, Stefan; Briem, Hans; Henzler, Angela M; Urbaczek, Sascha; Rarey, Matthias

    2014-06-23

    Computational target prediction for bioactive compounds is a promising field in assessing off-target effects. Structure-based methods not only predict off-targets, but, simultaneously, binding modes, which are essential for understanding the mode of action and rationally designing selective compounds. Here, we highlight the current open challenges of computational target prediction methods based on protein structures and show why inverse screening rather than sequential pairwise protein-ligand docking methods are needed. A new inverse screening method based on triangle descriptors is introduced: iRAISE (inverse Rapid Index-based Screening Engine). A Scoring Cascade considering the reference ligand as well as the ligand and active site coverage is applied to overcome interprotein scoring noise of common protein-ligand scoring functions. Furthermore, a statistical evaluation of a score cutoff for each individual protein pocket is used. The ranking and binding mode prediction capabilities are evaluated on different datasets and compared to inverse docking and pharmacophore-based methods. On the Astex Diverse Set, iRAISE ranks more than 35% of the targets to the first position and predicts more than 80% of the binding modes with a root-mean-square deviation (RMSD) accuracy of <2.0 Å. With a median computing time of 5 s per protein, large amounts of protein structures can be screened rapidly. On a test set with 7915 protein structures and 117 query ligands, iRAISE predicts the first true positive in a ranked list among the top eight ranks (median), i.e., among 0.28% of the targets. PMID:24851945
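
    The binding-mode criterion quoted in this abstract (RMSD below 2.0 Å) can be sketched as follows. This is a generic illustration rather than the iRAISE evaluation code: it assumes the predicted and reference poses list the same atoms in the same order, applies no superposition, and ignores ligand symmetry. Function names are hypothetical.

```python
import math
from typing import Iterable, Sequence, Tuple

Coords = Sequence[Tuple[float, float, float]]


def pose_rmsd(predicted: Coords, reference: Coords) -> float:
    """Root-mean-square deviation between two poses of the same ligand.

    Assumption: atoms are matched by index and both poses share one
    coordinate frame, as is usual when comparing docked poses in place.
    """
    if len(predicted) != len(reference) or len(predicted) == 0:
        raise ValueError("poses must be non-empty and have the same atom count")
    squared = sum(
        (p - r) ** 2
        for atom_p, atom_r in zip(predicted, reference)
        for p, r in zip(atom_p, atom_r)
    )
    return math.sqrt(squared / len(predicted))


def success_rate(pairs: Iterable[Tuple[Coords, Coords]], cutoff: float = 2.0) -> float:
    """Fraction of (predicted, reference) pose pairs with RMSD below the cutoff."""
    results = [pose_rmsd(pred, ref) < cutoff for pred, ref in pairs]
    if not results:
        raise ValueError("no pose pairs given")
    return sum(results) / len(results)
```

    Symmetry-aware atom matching would give lower RMSD values for symmetric ligands; it is omitted here to keep the sketch short.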

  7. Better prediction of sub-cellular localization by combining evolutionary and structural information.

    PubMed

    Nair, Rajesh; Rost, Burkhard

    2003-12-01

    The native sub-cellular compartment of a protein is one aspect of its function. Thus, predicting localization is an important step toward predicting function. Short zip code-like sequence fragments regulate some of the shuttling between compartments. Cataloguing and predicting such motifs is the most accurate means of determining localization in silico. However, only a few motifs are currently known, and not all the trafficking appears regulated in this way. The amino acid composition of a protein correlates with its localization. All general prediction methods employed this observation. Here, we explored the evolutionary information contained in multiple alignments and aspects of protein structure to predict localization in the absence of homology and targeting motifs. Our final system combined statistical rules and a variety of neural networks to achieve an overall four-state accuracy above 65%, a significant improvement over systems using only composition. The system was at its best for extra-cellular and nuclear proteins; it was significantly less accurate than TargetP for mitochondrial proteins. Interestingly, all methods that were developed on SWISS-PROT sequences failed grossly when fed with sequences from proteins of known structures taken from PDB. We therefore developed two separate systems: one for proteins of known structure and one for proteins of unknown structure. Finally, we applied the PDB-based system along with homology-based inferences and automatic text analysis to annotate all eukaryotic proteins in the PDB (http://cubic.bioc.columbia.edu/db/LOC3D). We imagine that this pilot method, certainly in combination with similar tools, may be valuable for target selection in structural genomics. PMID:14635133

  8. Predicting inactive conformations of protein kinases using active structures: conformational selection of type-II inhibitors.

    PubMed

    Xu, Min; Yu, Lu; Wan, Bo; Yu, Long; Huang, Qiang

    2011-01-01

    Protein kinases have been found to possess two characteristic conformations in their activation loops: the active DFG-in conformation and the inactive DFG-out conformation. Recently, there has been considerable interest in developing type-II inhibitors, which target the DFG-out conformation and are more specific than the type-I inhibitors that bind to the active DFG-in conformation. However, solving crystal structures of kinases with the DFG-out conformation remains a challenge, and this seriously hampers the application of structure-based approaches in the development of novel type-II inhibitors. To overcome this limitation, here we present a computational approach for predicting the DFG-out inactive conformation using the DFG-in active structures, and develop related conformational selection protocols for the use of the predicted DFG-out models in the binding pose prediction and virtual screening of type-II ligands. With the DFG-out models, we predicted the binding poses for known type-II inhibitors, and the results were found to be in good agreement with the X-ray crystal structures. We also tested the abilities of the DFG-out models to recognize their specific type-II inhibitors by screening a database of small molecules. The AUC (area under curve) results indicated that the predicted DFG-out models were selective toward their specific type-II inhibitors. Therefore, the computational approach and protocols presented in this study are very promising for the structure-based design and screening of novel type-II kinase inhibitors. PMID:21818358
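
    The AUC used in this abstract to assess screening selectivity can be computed from ranked scores alone. The sketch below uses the rank-based (Mann-Whitney) form of the ROC AUC; it assumes higher scores mean a compound is ranked as more likely active and counts ties as half. The function name is hypothetical, and the abstract does not say which tool the authors used.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], is_active: Sequence[bool]) -> float:
    """Area under the ROC curve as the probability that a randomly chosen
    active compound scores higher than a randomly chosen inactive one.

    Assumption: higher score = better rank; tied pairs count as 0.5.
    """
    if len(scores) != len(is_active):
        raise ValueError("scores and labels must have the same length")
    actives = [s for s, flag in zip(scores, is_active) if flag]
    inactives = [s for s, flag in zip(scores, is_active) if not flag]
    if not actives or not inactives:
        raise ValueError("need at least one active and one inactive compound")

    wins = 0.0
    for a in actives:
        for d in inactives:
            if a > d:
                wins += 1.0
            elif a == d:
                wins += 0.5
    return wins / (len(actives) * len(inactives))
```

    The pairwise loop is quadratic in the number of compounds; for a large screening database a sort-based rank sum gives the same value faster. If a docking score is lower-is-better, negate it before calling the function.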

  9. A Fully Bayesian Approach to Improved Calibration and Prediction of Groundwater Models With Structure Error

    NASA Astrophysics Data System (ADS)

    Xu, T.; Valocchi, A. J.

    2014-12-01

    Effective water resource management typically relies on numerical models to analyse groundwater flow and solute transport processes. These models are usually subject to model structure error due to simplification and/or misrepresentation of the real system. As a result, the model outputs may systematically deviate from measurements, thus violating a key assumption for traditional regression-based calibration and uncertainty analysis. On the other hand, the bias induced by model structure error can be described statistically in an inductive, data-driven way based on historical model-to-measurement misfit. We adopt a fully Bayesian approach that integrates a Gaussian process error model, which accounts for model structure error, into the calibration, prediction and uncertainty analysis of groundwater models. The posterior distributions of parameters of the groundwater model and the Gaussian process error model are jointly inferred using DREAM, an efficient Markov chain Monte Carlo sampler. We test the usefulness of the fully Bayesian approach on a synthetic case study of surface-ground water interaction under changing pumping conditions. We first illustrate through this example that traditional least squares regression without accounting for model structure error yields biased parameter estimates due to parameter compensation as well as biased predictions. In contrast, the Bayesian approach gives less biased parameter estimates. Moreover, the integration of a Gaussian process error model significantly reduces predictive bias and leads to prediction intervals that are more consistent with observations. The results highlight the importance of explicit treatment of model structure error, especially in circumstances where subsequent decision-making and risk analysis require accurate prediction and uncertainty quantification. In addition, the data-driven error modelling approach is capable of extracting more information from observation data than using a groundwater model alone.

  10. Structural response measurements and predictions for the SANDIA 34-Meter Test Bed

    SciTech Connect

    Ashwill, T.D.; Veers, P.S.

    1989-01-01

    Measurements of structural response during operation of the 34-Meter Test Bed vertical axis wind turbine are compared with analytical predictions. Measured structural data include stationary and rotating modal frequencies, cable natural frequencies, and operating stresses. These data are compared to analytical results obtained with the use of NASTRAN-based structural codes. In the case of operating stresses, analytical results with and without turbulence are compared to measured stresses. Data taken during two significant events, a high wind over-speed condition with an emergency stop and a cable resonance that couples with a tower natural frequency, are shown. 11 refs., 19 figs.

  11. Prediction of structures and infrared spectra of the candidate circumstellar molecules Si_nO_m

    NASA Astrophysics Data System (ADS)

    Liu, Na; Zhao, Hui-Yan; Zheng, Li-Jia; Qin, Sheng-Li; Liu, Ying

    2016-01-01

    A systematic study of the geometric structures of steady states and metastable states of silicon oxide clusters has been performed using density functional theory. We find that silicon-rich and oxygen-rich clusters have different characteristics. Oxygen-rich clusters usually have oxygen atoms on the edges of the clusters, but separated from others by Si atoms. However, silicon-rich clusters tend to have rings nested within each other. The spectra for the structures have been calculated to compare with observed spectra. The predicted structures and spectroscopic properties are expected to be useful for the identification of silicon oxide species in the interstellar medium.

  12. Staple Fitness: A Concept to Understand and Predict the Structures of Thiolated Gold Nanoclusters

    SciTech Connect

    Jiang, Deen

    2011-01-01

    A profound connection has been found between the structures of thiolated gold clusters and the combinatorial problem of pairing up dots on a surface. The bridge is the concept of staple fitness: the fittest combination corresponds to the experimental structure. This connection has been demonstrated for both Au25(SR)18 and Au38(SR)24 (-SR being a thiolate group) and applied to predict a promising structure for the recently synthesized Au19(SR)13.

  13. Prediction of the rodent carcinogenicity of organic compounds from their chemical structures using the FALS method.

    PubMed Central

    Moriguchi, I; Hirano, H; Hirono, S

    1996-01-01

    Fuzzy adaptive least-squares (FALS), a pattern recognition method recently developed in our laboratory for correlating structure with activity rating, was used to generate quantitative structure-activity relationship (QSAR) models on the carcinogenicity of organic compounds of several chemical classes. Using the predictive models obtained from the chemical class-based FALS QSAR approach, the rodent carcinogenicity or noncarcinogenicity of a group of organic chemicals currently being tested by the U.S. National Toxicology Program was estimated from their chemical structures. PMID:8933054

  14. sDFIRE: Sequence-specific statistical energy function for protein structure prediction by decoy selections.

    PubMed

    Hoque, Md Tamjidul; Yang, Yuedong; Mishra, Avdesh; Zhou, Yaoqi

    2016-05-01

    An important unsolved problem in molecular and structural biology is the protein folding and structure prediction problem. One major bottleneck for solving this is the lack of an accurate energy function to discriminate near-native conformations from other possible conformations. Here we have developed the sDFIRE energy function, which is an optimized linear combination of DFIRE (the Distance-scaled Finite Ideal gas Reference state based Energy), the orientation-dependent (polar-polar and polar-nonpolar) statistical potentials, and the matching scores between predicted and model structural properties, including predicted main-chain torsion angles and solvent accessible surface area. The weights for these scoring terms are optimized on three widely used decoy sets consisting of a total of 134 proteins. Independent tests on CASP8 and CASP9 decoy sets indicate that sDFIRE outperforms other state-of-the-art energy functions in selecting near-native structures and in the Pearson's correlation coefficient between the energy score and structural accuracy of the model (measured by TM-score). PMID:26849026
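
    The two evaluation ideas in this abstract, a weighted linear combination of scoring terms and the Pearson correlation between energy and model accuracy, can be sketched generically. This is not the sDFIRE implementation: the term names, weights and function names below are hypothetical placeholders, and lower energy is assumed to mean a better decoy.

```python
import math
from typing import Dict, List, Sequence


def combined_energy(terms: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted linear combination of per-decoy scoring terms.

    Assumption: every key in `terms` has a weight; names are placeholders.
    """
    return sum(weights[name] * value for name, value in terms.items())


def select_decoy(decoys: List[Dict[str, float]], weights: Dict[str, float]) -> int:
    """Index of the decoy with the lowest combined energy."""
    if not decoys:
        raise ValueError("no decoys given")
    energies = [combined_energy(d, weights) for d in decoys]
    return energies.index(min(energies))


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient, e.g. between energies and TM-scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences of at least two values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)
```

    Under the lower-is-better assumption, an energy function that tracks accuracy well would give a correlation with TM-score close to -1 across a decoy set.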

  15. Prediction of Spontaneous Protein Deamidation from Sequence-Derived Secondary Structure and Intrinsic Disorder

    PubMed Central

    Lorenzo, J. Ramiro; Alonso, Leonardo G.; Sánchez, Ignacio E.

    2015-01-01

    Asparagine residues in proteins undergo spontaneous deamidation, a post-translational modification that may act as a molecular clock for the regulation of protein function and turnover. Asparagine deamidation is modulated by protein local sequence, secondary structure and hydrogen bonding. We present NGOME, an algorithm able to predict non-enzymatic deamidation of internal asparagine residues in proteins in the absence of structural data, using sequence-based predictions of secondary structure and intrinsic disorder. Compared to previous algorithms, NGOME does not require three-dimensional structures yet yields better predictions than available sequence-only methods. Four case studies of specific proteins show how NGOME may help the user identify deamidation-prone asparagine residues, often related to protein gain of function, protein degradation or protein misfolding in pathological processes. A fifth case study applies NGOME at a proteomic scale and unveils a correlation between asparagine deamidation and protein degradation in yeast. NGOME is freely available as a webserver at the National EMBnet node Argentina, URL: http://www.embnet.qb.fcen.uba.ar/ in the subpage “Protein and nucleic acid structure and sequence analysis”. PMID:26674530

  16. RNA Secondary Structure Prediction by Using Discrete Mathematics: An Interdisciplinary Research Experience for Undergraduate Students

    PubMed Central

    Ellington, Roni; Wachira, James

    2010-01-01

    The focus of this Research Experience for Undergraduates (REU) project was on RNA secondary structure prediction by using a lattice walk approach. The lattice walk approach is a combinatorial and computational biology method used to enumerate possible secondary structures and predict RNA secondary structure from RNA sequences. The method uses discrete mathematical techniques and identifies specified base pairs as parameters. The goal of the REU was to introduce upper-level undergraduate students to the principles and challenges of interdisciplinary research in molecular biology and discrete mathematics. At the beginning of the project, students from the biology and mathematics departments of a mid-sized university received instruction on the role of secondary structure in the function of eukaryotic RNAs and RNA viruses, RNA related to combinatorics, and the National Center for Biotechnology Information resources. The student research projects focused on RNA secondary structure prediction on a regulatory region of the yellow fever virus RNA genome and on an untranslated region of an mRNA of a gene associated with the neurological disorder epilepsy. At the end of the project, the REU students gave poster and oral presentations, and they submitted written final project reports to the program director. The outcome of the REU was that the students gained transferable knowledge and skills in bioinformatics and an awareness of the applications of discrete mathematics to biological research problems. PMID:20810968
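
    Enumerating possible secondary structures, as mentioned in this abstract, can be illustrated with a standard counting recurrence. The sketch below is not the lattice walk approach itself, which the abstract does not describe in detail. It counts pseudoknot-free structures by dynamic programming, assuming Watson-Crick and G-U wobble pairs and a minimum hairpin loop of three unpaired bases; the function names and defaults are hypothetical.

```python
from functools import lru_cache
from typing import Callable

# Assumed pairing rules: Watson-Crick pairs plus the G-U wobble pair.
ALLOWED_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def default_can_pair(a: str, b: str) -> bool:
    return (a, b) in ALLOWED_PAIRS


def count_structures(
    seq: str,
    min_loop: int = 3,
    can_pair: Callable[[str, str], bool] = default_can_pair,
) -> int:
    """Count pseudoknot-free secondary structures of seq, including the
    fully unpaired one. min_loop is the fewest unpaired bases a hairpin
    loop may enclose."""

    @lru_cache(maxsize=None)
    def count(i: int, j: int) -> int:
        # An empty or single-base stretch has exactly one structure.
        if j - i < 1:
            return 1
        # Case 1: base i is unpaired.
        total = count(i + 1, j)
        # Case 2: base i pairs with base k, splitting the stretch into the
        # enclosed part (i+1..k-1) and the part after the pair (k+1..j).
        for k in range(i + min_loop + 1, j + 1):
            if can_pair(seq[i], seq[k]):
                total += count(i + 1, k - 1) * count(k + 1, j)
        return total

    return count(0, len(seq) - 1)
```

    The recursion depth grows with sequence length, so this form suits short teaching examples; a bottom-up table would be needed for long sequences.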

  17. Advances in Rosetta structure prediction for difficult molecular-replacement problems

    SciTech Connect

    DiMaio, Frank

    2013-11-01

    Modeling advances using Rosetta structure prediction to aid in solving difficult molecular-replacement problems are discussed. Recent work has shown the effectiveness of structure-prediction methods in solving difficult molecular-replacement problems. The Rosetta protein structure modeling suite can aid in the solution of difficult molecular-replacement problems using templates from 15 to 25% sequence identity; Rosetta refinement guided by noisy density has consistently led to solved structures where other methods fail. In this paper, an overview of the use of Rosetta for these difficult molecular-replacement problems is provided and new modeling developments that further improve model quality are described. Several variations to the method are introduced that significantly reduce the time needed to generate a model and the sampling required to improve the starting template. The improvements are benchmarked on a set of nine difficult cases and it is shown that this improved method obtains consistently better models in less running time. Finally, strategies for best using Rosetta to solve difficult molecular-replacement problems are presented and future directions for the role of structure-prediction methods in crystallography are discussed.

  18. Predicting Protein Ligand Binding Sites by Combining Evolutionary Sequence Conservation and 3D Structure

    PubMed Central

    Capra, John A.; Laskowski, Roman A.; Thornton, Janet M.; Singh, Mona; Funkhouser, Thomas A.

    2009-01-01

    Identifying a protein's functional sites is an important step towards characterizing its molecular function. Numerous structure- and sequence-based methods have been developed for this problem. Here we introduce ConCavity, a small molecule binding site prediction algorithm that integrates evolutionary sequence conservation estimates with structure-based methods for identifying protein surface cavities. In large-scale testing on a diverse set of single- and multi-chain protein structures, we show that ConCavity substantially outperforms existing methods for identifying both 3D ligand binding pockets and individual ligand binding residues. As part of our testing, we perform one of the first direct comparisons of conservation-based and structure-based methods. We find that the two approaches provide largely complementary information, which can be combined to improve upon either approach alone. We also demonstrate that ConCavity has state-of-the-art performance in predicting catalytic sites and drug binding pockets. Overall, the algorithms and analysis presented here significantly improve our ability to identify ligand binding sites and further advance our understanding of the relationship between evolutionary sequence conservation and structural and functional attributes of proteins. Data, source code, and prediction visualizations are available on the ConCavity web site (http://compbio.cs.princeton.edu/concavity/). PMID:19997483

  19. Protein local 3D structure prediction by Super Granule Support Vector Machines (Super GSVM)

    PubMed Central

    2009-01-01

    Background: Understanding the relationship between the protein sequence and the 3D structure is a major research area in bioinformatics. The prediction of complete protein tertiary structure based only on sequence information is still impractical. This paper aims at revealing the hidden knowledge of the sequence motifs and the local tertiary structure. Results: In this paper, we propose a Super Granule Support Vector Machine (Super GSVM) model to obtain high-quality protein sequence motifs and to predict local tertiary structure information based purely on sequence information. Conclusion: The proposed model overcomes the innate shortcoming of using the SVM on such a large data set, which is the inherent computational complexity involved in training support vectors for huge datasets including half a million samples. The satisfactory prediction results show that the Super GSVM model generates decent protein sequence clusters and has the ability to capture the hidden sequence-to-structure information. This model also has strong potential for the application of SVMs in other research areas with huge datasets. PMID:19811680

  20. RNA secondary structure prediction by using discrete mathematics: an interdisciplinary research experience for undergraduate students.

    PubMed

    Ellington, Roni; Wachira, James; Nkwanta, Asamoah

    2010-01-01

    The focus of this Research Experience for Undergraduates (REU) project was on RNA secondary structure prediction by using a lattice walk approach. The lattice walk approach is a combinatorial and computational biology method used to enumerate possible secondary structures and predict RNA secondary structure from RNA sequences. The method uses discrete mathematical techniques and identifies specified base pairs as parameters. The goal of the REU was to introduce upper-level undergraduate students to the principles and challenges of interdisciplinary research in molecular biology and discrete mathematics. At the beginning of the project, students from the biology and mathematics departments of a mid-sized university received instruction on the role of secondary structure in the function of eukaryotic RNAs and RNA viruses, RNA related to combinatorics, and the National Center for Biotechnology Information resources. The student research projects focused on RNA secondary structure prediction on a regulatory region of the yellow fever virus RNA genome and on an untranslated region of an mRNA of a gene associated with the neurological disorder epilepsy. At the end of the project, the REU students gave poster and oral presentations, and they submitted written final project reports to the program director. The outcome of the REU was that the students gained transferable knowledge and skills in bioinformatics and an awareness of the applications of discrete mathematics to biological research problems. PMID:20810968

  1. Ligand-Target Prediction by Structural Network Biology Using nAnnoLyze

    PubMed Central

    Martínez-Jiménez, Francisco; Marti-Renom, Marc A.

    2015-01-01

    Target identification is essential for drug design, drug-drug interaction prediction, dosage adjustment and side effect anticipation. Specifically, the knowledge of structural details is essential for understanding the mode of action of a compound on a target protein. Here, we present nAnnoLyze, a method for target identification that relies on the hypothesis that structurally similar binding sites bind similar ligands. nAnnoLyze integrates structural information into a bipartite network of interactions and similarities to predict structurally detailed compound-protein interactions at proteome scale. The method was benchmarked on a dataset of 6,282 known interacting ligand-target pairs, reaching an area under the Receiver Operating Characteristic curve (AUC) of 0.96 when using the drug names as an input feature for the classifier, and an AUC of 0.70 for “anonymous” compounds or compounds not present in the training set. nAnnoLyze achieved higher accuracies than its predecessor, AnnoLyze. We applied the method to predict interactions for all the compounds in the DrugBank database with each human protein structure and provide examples of target identification for known drugs against human diseases. The accuracy and applicability of our method to any compound indicate that a comparative docking approach such as nAnnoLyze enables large-scale annotation and analysis of compound–protein interactions and thus may benefit drug development. PMID:25816344

  2. Microbes as Engines of Ecosystem Function: When Does Community Structure Enhance Predictions of Ecosystem Processes?

    PubMed

    Graham, Emily B; Knelman, Joseph E; Schindlbacher, Andreas; Siciliano, Steven; Breulmann, Marc; Yannarell, Anthony; Beman, J M; Abell, Guy; Philippot, Laurent; Prosser, James; Foulquier, Arnaud; Yuste, Jorge C; Glanville, Helen C; Jones, Davey L; Angel, Roey; Salminen, Janne; Newton, Ryan J; Bürgmann, Helmut; Ingram, Lachlan J; Hamer, Ute; Siljanen, Henri M P; Peltoniemi, Krista; Potthast, Karin; Bañeras, Lluís; Hartmann, Martin; Banerjee, Samiran; Yu, Ri-Qing; Nogaro, Geraldine; Richter, Andreas; Koranda, Marianne; Castle, Sarah C; Goberna, Marta; Song, Bongkeun; Chatterjee, Amitava; Nunes, Olga C; Lopes, Ana R; Cao, Yiping; Kaisermann, Aurore; Hallin, Sara; Strickland, Michael S; Garcia-Pausas, Jordi; Barba, Josep; Kang, Hojeong; Isobe, Kazuo; Papaspyrou, Sokratis; Pastorelli, Roberta; Lagomarsino, Alessandra; Lindström, Eva S; Basiliko, Nathan; Nemergut, Diana R

    2016-01-01

    Microorganisms are vital in mediating the earth's biogeochemical cycles; yet, despite our rapidly increasing ability to explore complex environmental microbial communities, the relationship between microbial community structure and ecosystem processes remains poorly understood. Here, we address a fundamental and unanswered question in microbial ecology: 'When do we need to understand microbial community structure to accurately predict function?' We present a statistical analysis investigating the value of environmental data and microbial community structure independently and in combination for explaining rates of carbon and nitrogen cycling processes within 82 global datasets. Environmental variables were the strongest predictors of process rates but left 44% of variation unexplained on average, suggesting the potential for microbial data to increase model accuracy. Although only 29% of our datasets were significantly improved by adding information on microbial community structure, we observed improvement in models of processes mediated by narrow phylogenetic guilds via functional gene data, and conversely, improvement in models of facultative microbial processes via community diversity metrics. Our results also suggest that microbial diversity can strengthen predictions of respiration rates beyond microbial biomass parameters, as 53% of models were improved by incorporating both sets of predictors compared to 35% by microbial biomass alone. Our analysis represents the first comprehensive analysis of research examining links between microbial community structure and ecosystem function. Taken together, our results indicate that a greater understanding of microbial communities informed by ecological principles may enhance our ability to predict ecosystem process rates relative to assessments based on environmental variables and microbial physiology. PMID:26941732

  3. Development and application of vibroacoustic structural data banks in predicting vibration design and test criteria for rocket vehicle structures

    NASA Technical Reports Server (NTRS)

    Bandgren, H. J.; Smith, W. C.

    1973-01-01

    A method of predicting broadband random vibration criteria for components on space vehicles is presented. Large amounts of vibration and acoustic data obtained from flights and static firing tests of space vehicles were formulated into vibroacoustic data banks for the structural categories of ring frame, skin stringer, and honeycomb. The vibration spectra with their associated acoustic spectra are normalized to a reference acoustic spectrum. The individual normalized spectra are grouped according to definite structural characteristics and statistically analyzed to form the vibroacoustic data banks described in this report. These data banks represent the reference vibration criteria available for determining the new vehicle vibration criteria.

  4. Contact Prediction for Beta and Alpha-Beta Proteins Using Integer Linear Optimization and its Impact on the First Principles 3D Structure Prediction Method ASTRO-FOLD

    PubMed Central

    Rajgaria, R.; Wei, Y.; Floudas, C. A.

    2010-01-01

    An integer linear optimization model is presented to predict residue contacts in β, α + β, and α/β proteins. The total energy of a protein is expressed as the sum of a Cα – Cα distance dependent contact energy contribution and a hydrophobic contribution. The model selects contacts that assign the lowest energy to the protein structure while satisfying a set of constraints that are included to enforce certain physically observed topological information. A new method based on hydrophobicity is proposed to find the β-sheet alignments. These β-sheet alignments are used as constraints for contacts between residues of β-sheets. This model was tested on three independent protein test sets and CASP8 test proteins consisting of β, α + β, and α/β proteins and was found to perform very well. The average accuracy of the predictions (separated by at least six residues) was approximately 61%. The average true positive and false positive distances were also calculated for each of the test sets and they are 7.58 Å and 15.88 Å, respectively. Residue contact prediction can be directly used to facilitate protein tertiary structure prediction. The proposed residue contact prediction model is incorporated into the first principles protein tertiary structure prediction approach, ASTRO-FOLD. The effectiveness of the contact prediction model was further demonstrated by the improvement in the quality of the protein structure ensemble generated using the predicted residue contacts for a test set of 10 proteins. PMID:20225257

  5. Displacement Theories for In-Flight Deformed Shape Predictions of Aerospace Structures

    NASA Technical Reports Server (NTRS)

    Ko, William L.; Richards, W. L.; Tran, Van t.

    2007-01-01

    Displacement theories are developed for a variety of structures with the goal of providing real-time shape predictions for aerospace vehicles during flight. These theories are initially developed for a cantilever beam to predict the deformed shapes of the Helios flying wing. The main structural configuration of the Helios wing is a cantilever wing tubular spar subjected to bending, torsion, and combined bending and torsion loading. The displacement equations that are formulated are expressed in terms of strains measured at multiple sensing stations equally spaced on the surface of the wing spar. Displacement theories for other structures, such as tapered cantilever beams, two-point supported beams, wing boxes, and plates also are developed. The accuracy of the displacement theories is successfully validated by finite-element analysis and classical beam theory using input-strains generated by finite-element analysis. The displacement equations and associated strain-sensing system (such as fiber optic sensors) create a powerful means for in-flight deformation monitoring of aerospace structures. This method serves multiple purposes for structural shape sensing, loads monitoring, and structural health monitoring. Ultimately, the calculated displacement data can be visually displayed to the ground-based pilot or used as input to the control system to actively control the shape of structures during flight.
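
    The closed-form displacement equations of the report are not reproduced in this abstract, so the following is only a generic sketch of the idea they rest on: surface bending strains at equally spaced stations give curvatures, and integrating twice from the clamped root of a cantilever gives deflections. The function name, the `half_depth` parameter (distance from the neutral axis to the strain sensor) and the use of the trapezoidal rule are assumptions for illustration, not the authors' formulation.

```python
def deflections_from_strains(strains, spacing, half_depth):
    """Estimate cantilever deflections from surface bending strains.

    strains    -- strain at each sensing station, root first
    spacing    -- distance between adjacent stations
    half_depth -- distance from the neutral axis to the sensor surface

    Small-deflection beam theory is assumed: curvature = strain / half_depth.
    Slope and deflection are both zero at the clamped root.
    """
    if not strains:
        raise ValueError("at least one sensing station is required")

    curvatures = [e / half_depth for e in strains]

    # First integration (trapezoidal rule): curvature -> slope.
    slopes = [0.0]
    for k0, k1 in zip(curvatures, curvatures[1:]):
        slopes.append(slopes[-1] + 0.5 * (k0 + k1) * spacing)

    # Second integration (trapezoidal rule): slope -> deflection.
    deflections = [0.0]
    for s0, s1 in zip(slopes, slopes[1:]):
        deflections.append(deflections[-1] + 0.5 * (s0 + s1) * spacing)

    return deflections


if __name__ == "__main__":
    # Uniform strain means constant curvature, so the tip deflection
    # should equal curvature * length**2 / 2.
    print(deflections_from_strains([0.001] * 5, spacing=1.0, half_depth=0.05))
```

    For uniform strain the trapezoidal rule is exact, which makes this case a convenient check against classical beam theory.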

  6. Accurate secondary structure prediction and fold recognition for circular dichroism spectroscopy

    PubMed Central

    Micsonai, András; Wien, Frank; Kernya, Linda; Lee, Young-Ho; Goto, Yuji; Réfrégiers, Matthieu; Kardos, József

    2015-01-01

    Circular dichroism (CD) spectroscopy is a widely used technique for the study of protein structure. Numerous algorithms have been developed for the estimation of the secondary structure composition from the CD spectra. These methods often fail to provide acceptable results on α/β-mixed or β-structure–rich proteins. The problem arises from the spectral diversity of β-structures, which has hitherto been considered as an intrinsic limitation of the technique. The predictions are less reliable for proteins of unusual β-structures such as membrane proteins, protein aggregates, and amyloid fibrils. Here, we show that the parallel/antiparallel orientation and the twisting of the β-sheets account for the observed spectral diversity. We have developed a method called β-structure selection (BeStSel) for the secondary structure estimation that takes into account the twist of β-structures. This method can reliably distinguish parallel and antiparallel β-sheets and accurately estimates the secondary structure for a broad range of proteins. Moreover, the secondary structure components applied by the method are characteristic to the protein fold, and thus the fold can be predicted to the level of topology in the CATH classification from a single CD spectrum. By constructing a web server, we offer a general tool for a quick and reliable structure analysis using conventional CD or synchrotron radiation CD (SRCD) spectroscopy for the protein science research community. The method is especially useful when X-ray or NMR techniques fail. Using BeStSel on data collected by SRCD spectroscopy, we investigated the structure of amyloid fibrils of various disease-related proteins and peptides. PMID:26038575

  7. Structure based function prediction of proteins using fragment library frequency vectors

    PubMed Central

    Yadav, Akshay; Jayaraman, Valadi Krishnamoorthy

    2012-01-01

    The function of a protein is primarily dictated by its structure. It is therefore far more logical to look for functional clues in the protein's overall 3-dimensional fold or global structure. In this paper, we have developed a novel Support Vector Machine (SVM) based prediction model for functional classification and prediction of proteins using features extracted from the global structure based on fragment libraries. Fragment libraries have previously been used for ab initio modelling of proteins and for protein structure comparisons. The query protein structure is broken down into a collection of short contiguous backbone fragments, and this collection is discretized using a library of fragments. The input feature vector is a frequency vector that counts the occurrences of each library fragment in the collection of fragments by all-to-all fragment comparisons. SVM models were trained and optimised to obtain the best 10-fold cross-validation (CV) accuracy for classification. As an example, this method was applied to the prediction and classification of Cell Adhesion Molecules (CAMs). Thirty-four different fragment libraries with sizes ranging from 4 to 400 and fragment lengths ranging from 4 to 12 were used to obtain the best prediction model. The best 10-fold CV accuracy of 95.25% was obtained for a library of 400 fragments of length 10. An accuracy of 87.5% was obtained on an unseen test dataset consisting of 20 CAMs and 20 NonCAMs. This shows that protein structure can be accurately and uniquely described using 400 representative fragments of length 10. PMID:23144557

  8. Predictions of Native American Population Structure Using Linguistic Covariates in a Hidden Regression Framework

    PubMed Central

    Jay, Flora; François, Olivier; Blum, Michael G. B.

    2011-01-01

    Background The mainland of the Americas is home to a remarkable diversity of languages, and the relationships between genes and languages have attracted considerable attention in the past. Here we investigate to what extent geography and languages can predict the genetic structure of Native American populations. Methodology/Principal Findings Our approach is based on a Bayesian latent cluster regression model in which cluster membership is explained by geographic and linguistic covariates. After correcting for geographic effects, we find that the inclusion of linguistic information improves the prediction of individual membership to genetic clusters. We further compare the predictive power of Greenberg's and The Ethnologue classifications of Amerindian languages. We report that The Ethnologue classification provides a better genetic proxy than Greenberg's classification at the stock and at the group levels. Although high predictive values can be achieved from The Ethnologue classification, we nevertheless emphasize that Choco, Chibchan and Tupi linguistic families do not exhibit a univocal correspondence with genetic clusters. Conclusions/Significance The Bayesian latent class regression model described here is efficient at predicting population genetic structure using geographic and linguistic information in Native American populations. PMID:21305006

  9. Structural MRI-Based Predictions in Patients with Treatment-Refractory Depression (TRD)

    PubMed Central

    Johnston, Blair A.; Steele, J. Douglas; Tolomeo, Serenella; Christmas, David; Matthews, Keith

    2015-01-01

    The application of machine learning techniques to psychiatric neuroimaging offers the possibility to identify robust, reliable and objective disease biomarkers both within and between contemporary syndromal diagnoses that could guide routine clinical practice. The use of quantitative methods to identify psychiatric biomarkers is consequently important, particularly with a view to making predictions relevant to individual patients, rather than at a group-level. Here, we describe predictions of treatment-refractory depression (TRD) diagnosis using structural T1-weighted brain scans obtained from twenty adult participants with TRD and 21 never depressed controls. We report 85% accuracy of individual subject diagnostic prediction. Using an automated feature selection method, the major brain regions supporting this significant classification were in the caudate, insula, habenula and periventricular grey matter. It was not, however, possible to predict the degree of ‘treatment resistance’ in individual patients, at least as quantified by the Massachusetts General Hospital (MGH-S) clinical staging method; but the insula was again identified as a region of interest. Structural brain imaging data alone can be used to predict diagnostic status, but not MGH-S staging, with a high degree of accuracy in patients with TRD. PMID:26186455

  10. Prediction and Design of Materials from Crystal Structures to Nanocrystal Morphology and Assembly

    NASA Astrophysics Data System (ADS)

    Hennig, Richard

    2012-02-01

    Predictions of structure formation by computational methods have the potential to accelerate materials discovery and design. Here we present two computational approaches for the prediction of crystal structures and the morphology of nanoparticles. Many materials properties are controlled by composition and crystal structure. We show that evolutionary algorithms coupled to ab-initio relaxations can accurately predict the crystal structure and composition of compounds without any prior information about the system. We will discuss results for various systems including the prediction of unexpected quasi-1D and 2D electronic structures in Li-Be compounds under pressure [1] and of the crystal structure of the superconducting high-pressure phase of Eu [2]. The self-assembly of nanocrystals into mesoscale superlattices provides a path to the design of materials with tunable electronic, physical and chemical properties for various applications. The self-assembly is controlled by the nanocrystal shape and by ligand-mediated interactions between them. To understand this, it is necessary to know the effect of the ligands on the surface energies (which tune the nanocrystal shape), as well as the relative coverage of the different facets (which control the interactions). Density functional calculations for the binding energy of oleic acid-based ligands on PbSe nanocrystals determine the surface energies as a function of ligand coverage. The Wulff construction predicts the thermodynamic equilibrium shape of the PbSe nanocrystals as a function of the ligand coverage. We show that the different ligand binding energies on the 100 and 111 facets result in different ligand coverages on the facets and predict a transition in the equilibrium shape from octahedral to cubic when increasing the ligand concentration during synthesis.
Our results furthermore suggest that the experimentally observed transformation of the nanocrystal superlattice structure from fcc to bcc is caused by the preferential detachment of ligands from particular facets, leading to anisotropic ligand coverage [3]. [1] J. Feng, R. G. Hennig, N. W. Ashcroft and Roald Hoffmann. Nature 451, 445 (2008). [2] W. Bi, Y. Meng, R. S. Kumar, A. L. Cornelius, W. W. Tipton, R. G. Hennig, Y. Zhang, C. Chen, and J. S. Schilling. Phys. Rev. B 83, 104106 (2011). [3] J. J. Choi, C. R. Bealing, K. Bian, K. J. Hughes, W. Zhang, D.-M. Smilgies, R. G. Hennig, James R. Engstrom, and Tobias Hanrath. J. Am. Chem. Soc. 133, 3131 (2011).

  11. SAHG, a comprehensive database of predicted structures of all human proteins

    PubMed Central

    Motono, Chie; Nakata, Junichi; Koike, Ryotaro; Shimizu, Kana; Shirota, Matsuyuki; Amemiya, Takayuki; Tomii, Kentaro; Nagano, Nozomi; Sakaya, Naofumi; Misoo, Kiyotaka; Sato, Miwa; Kidera, Akinori; Hiroaki, Hidekazu; Shirai, Tsuyoshi; Kinoshita, Kengo; Noguchi, Tamotsu; Ota, Motonori

    2011-01-01

    Most proteins from higher organisms are known to be multi-domain proteins and contain substantial numbers of intrinsically disordered (ID) regions. To analyse such protein sequences, those from human for instance, we developed a special protein-structure-prediction pipeline and accumulated the products in the Structure Atlas of Human Genome (SAHG) database at http://bird.cbrc.jp/sahg. With the pipeline, human proteins were examined by local alignment methods (BLAST, PSI-BLAST and Smith–Waterman profile–profile alignment), global–local alignment methods (FORTE) and prediction tools for ID regions (POODLE-S) and homology modeling (MODELLER). Conformational changes of protein models upon ligand-binding were predicted by simultaneous modeling using templates of apo and holo forms. When there were no suitable templates for holo forms and the apo models were accurate, we prepared holo models using prediction methods for ligand-binding (eF-seek) and conformational change (the elastic network model and the linear response theory). Models are displayed as animated images. As of July 2010, SAHG contains 42 581 protein-domain models in approximately 24 900 unique human protein sequences from the RefSeq database. Annotation of models with functional information and links to other databases such as EzCatDB, InterPro or HPRD are also provided to facilitate understanding the protein structure-function relationships. PMID:21051360

  12. Predicting US Infants' and Toddlers' TV/Video Viewing Rates: Mothers' Cognitions and Structural Life Circumstances.

    PubMed

    Vaala, Sarah E; Hornik, Robert C

    2014-04-01

    There has been rising international concern over media use with children under two. As little is known about the factors associated with more or less viewing among very young children, this study examines maternal factors predictive of TV/video viewing rates among American infants and toddlers. Guided by the Integrative Model of Behavioral Prediction, this survey study examines relationships between children's rates of TV/video viewing and their mothers' structural life circumstances (e.g., number of children in the home; mother's screen use), and cognitions (e.g., attitudes; norms). Results suggest that mothers' structural circumstances and cognitions respectively contribute independent explanatory power to the prediction of children's TV/video viewing. Influence of structural circumstances is partially mediated through cognitions. Mothers' attitudes as well as their own TV/video viewing behavior were particularly predictive of children's viewing. Implications of these findings for international efforts to understand and reduce infant/toddler TV/video exposure are discussed. PMID:25489335

  13. Prediction of HIV drug resistance from genotype with encoded three-dimensional protein structure

    PubMed Central

    2014-01-01

    Background Drug resistance has become a severe challenge for treatment of HIV infections. Mutations accumulate in the HIV genome and make certain drugs ineffective. Prediction of resistance from genotype data is a valuable guide in choice of drugs for effective therapy. Results In order to improve the computational prediction of resistance from genotype data we have developed a unified encoding of the protein sequence and three-dimensional protein structure of the drug target for classification and regression analysis. The method was tested on genotype-resistance data for mutants of HIV protease and reverse transcriptase. Our graph based sequence-structure approach gives high accuracy with a new sparse dictionary classification method, as well as support vector machine and artificial neural networks classifiers. Cross-validated regression analysis with the sparse dictionary gave excellent correlation between predicted and observed resistance. Conclusion The approach of encoding the protein structure and sequence as a 210-dimensional vector, based on Delaunay triangulation, has promise as an accurate method for predicting resistance from sequence for drugs inhibiting HIV protease and reverse transcriptase. PMID:25081370

  14. Enhanced hybrid search algorithm for protein structure prediction using the 3D-HP lattice model.

    PubMed

    Zhou, Changjun; Hou, Caixia; Zhang, Qiang; Wei, Xiaopeng

    2013-09-01

    The problem of protein structure prediction in the hydrophobic-polar (HP) lattice model is the prediction of protein tertiary structure. This problem is usually referred to as the protein folding problem. This paper presents a method for applying an enhanced hybrid search algorithm to the problem of protein folding prediction, using the three-dimensional (3D) HP lattice model. The enhanced hybrid search algorithm is a combination of the particle swarm optimizer (PSO) and tabu search (TS) algorithms. Since the PSO algorithm is very easily trapped in local minima in the later stages of evolution, we combined PSO with the TS algorithm, which has global optimization properties. Because crossover and mutation operations are applied many times to the PSO and TS algorithms, the enhanced hybrid search algorithm is called the MCMPSO-TS (multiple crossover and mutation PSO-TS) algorithm. Experimental results show that the MCMPSO-TS algorithm can find the best solutions so far for the listed benchmarks, which will help comparison with future approaches. Moreover, real protein sequences and Fibonacci sequences are verified in the 3D HP lattice model for the first time. Compared with previous evolutionary algorithms, the new hybrid search algorithm is novel and can be used effectively to predict 3D protein folding structure. With continuous development and changes in amino acid sequences, the new algorithm will also contribute to the study of new protein sequences. PMID:23824509
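
    As a minimal sketch of the objective such search algorithms optimize, the function below scores one conformation in the standard 3D HP lattice model: each pair of H residues that are lattice neighbours but not consecutive in the chain contributes -1. The function name and input layout are assumptions for illustration; the PSO and TS search steps of MCMPSO-TS are not shown.

```python
def hp_energy(sequence, coords):
    """Energy of a conformation in the 3D HP cubic-lattice model.

    sequence -- string of 'H' (hydrophobic) and 'P' (polar) residues
    coords   -- list of (x, y, z) integer tuples, one per residue
    """
    if len(sequence) != len(coords):
        raise ValueError("sequence and coords must have the same length")
    if len(set(coords)) != len(coords):
        raise ValueError("conformation is not self-avoiding")

    def adjacent(a, b):
        # Lattice neighbours differ by exactly one unit along one axis.
        return sum(abs(x - y) for x, y in zip(a, b)) == 1

    # Consecutive residues must occupy neighbouring lattice sites.
    for i in range(len(coords) - 1):
        if not adjacent(coords[i], coords[i + 1]):
            raise ValueError("chain is broken between residues %d and %d" % (i, i + 1))

    contacts = 0
    for i in range(len(sequence)):
        # Start at i + 2 to skip chain neighbours, which do not count.
        for j in range(i + 2, len(sequence)):
            if sequence[i] == "H" and sequence[j] == "H" and adjacent(coords[i], coords[j]):
                contacts += 1
    return -contacts


if __name__ == "__main__":
    # A four-residue chain folded into a square: the two end residues touch.
    square = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
    print(hp_energy("HPPH", square))
```

    Lower (more negative) energy means more buried hydrophobic contacts, so a search algorithm would minimise this value over self-avoiding walks.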

  15. Prediction of clathrate structure type and guest position by molecular mechanics.

    PubMed

    Fleischer, Everly B; Janda, Kenneth C

    2013-05-16

    The clathrate hydrates occur in various types in which the number, size, and shape of the various cages differ. Usually the clathrate type of a specific guest is predicted from the size and shape of the molecular guest. We have developed a methodology to determine the clathrate type using molecular mechanics with the MMFF force field, with a strategy that calculates the energy of formation of the clathrate from the sum of the guest/cage energies. The clathrate type with the most negative (most stable) energy of formation would be the type predicted (we mainly focused on type I, type II, or bromine type). This strategy allows a calculation to predict the clathrate type for any cage guest in a few minutes on a laptop computer. It proved successful in predicting the clathrate structure for 46 out of 47 guest molecules. The molecular mechanics calculations also provide a prediction of the guest position within the cage and clathrate structure. These predictions are generally consistent with the X-ray and neutron diffraction studies. By supplementing the diffraction study with molecular mechanics, we gain more detailed insight into the structure. We have also compared MM calculations to studies of the multiple occupancy of the cages. Finally, we present a density functional calculation that demonstrates that the interiors of the clathrate cages have a relatively uniform and low electrostatic potential in comparison with the outside oxygen and hydrogen atoms. This implies that van der Waals forces will usually be dominant in the guest-cage interactions. PMID:23600658

  16. Predicting acute aquatic toxicity of structurally diverse chemicals in fish using artificial intelligence approaches.

    PubMed

    Singh, Kunwar P; Gupta, Shikha; Rai, Premanjali

    2013-09-01

    The research aims to develop global modeling tools capable of categorizing structurally diverse chemicals into various toxicity classes according to the EEC and European Community directives, and to predict their acute toxicity in fathead minnow using a set of selected molecular descriptors. Accordingly, artificial intelligence based classification and regression models, such as probabilistic neural networks (PNN), generalized regression neural networks (GRNN), multilayer perceptron neural network (MLPN), radial basis function neural network (RBFN), support vector machines (SVM), gene expression programming (GEP), and decision tree (DT), were constructed using the experimental toxicity data. Diversity and non-linearity in the chemicals' data were tested using the Tanimoto similarity index and Brock-Dechert-Scheinkman statistics. Predictive and generalization abilities of the various models constructed here were compared using several statistical parameters. PNN and GRNN models performed relatively better than MLPN, RBFN, SVM, GEP, and DT. In both two- and four-category classifications, PNN yielded a considerably high accuracy of classification in training (95.85 percent and 90.07 percent) and validation data (91.30 percent and 86.96 percent), respectively. GRNN rendered a high correlation between the measured and model-predicted -log LC50 values for both the training (0.929) and validation (0.910) data and low prediction errors (RMSE) of 0.52 and 0.49 for the two sets. Efficiency of the selected PNN and GRNN models in predicting acute toxicity of new chemicals was adequately validated using external datasets of different fish species (fathead minnow, bluegill, trout, and guppy). The PNN and GRNN models showed good predictive and generalization abilities and can be used as tools for predicting toxicities of structurally diverse chemical compounds. PMID:23764236
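
    The Tanimoto similarity index mentioned above can be sketched as follows for binary structural fingerprints, represented here as sets of "on" bit positions. This is the standard set-based definition; how the authors generated fingerprints is not stated in the abstract, and the function name and the example bit positions are made up for illustration.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto similarity between two fingerprints given as sets of on-bits.

    Returns |A & B| / |A | B|, a value between 0 (no shared bits)
    and 1 (identical fingerprints).
    """
    a, b = set(fp_a), set(fp_b)
    union = a | b
    if not union:
        # Convention chosen for this sketch: two empty fingerprints
        # carry no information, so report zero similarity.
        return 0.0
    return len(a & b) / len(union)


if __name__ == "__main__":
    # Hypothetical on-bit positions for two compounds.
    print(tanimoto({1, 2, 3}, {2, 3, 4}))
```

    A dataset whose pairwise Tanimoto values are mostly low is structurally diverse, which is the sense in which the index is used as a diversity test.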

  17. SPOT-Seq-RNA: Predicting protein-RNA complex structure and RNA-binding function by fold recognition and binding affinity prediction

    PubMed Central

    Yang, Yuedong; Zhao, Huiying; Wang, Jihua; Zhou, Yaoqi

    2013-01-01

    Summary RNA-binding proteins (RBPs) play key roles in RNA metabolism and post-transcriptional regulation. Computational methods have been developed separately for prediction of RNA-binding proteins and RNA-binding residues by machine learning techniques and prediction of protein-RNA complex structures by rigid or semi-flexible structure-to-structure docking. Here, we describe a template-based technique called SPOT-Seq-RNA that integrates prediction of RNA-binding proteins, RNA-binding residues, and protein-RNA complex structures into a single package. This integration is achieved by combining template-based structure-prediction software, SPARKS X, with binding-affinity prediction software, DRNA. This tool yields reasonable sensitivity (46%) and high precision (84%) for an independent test set of 215 RBPs and 5766 non-RBPs. SPOT-Seq-RNA is computationally efficient for genome-scale prediction of RNA-binding proteins and protein-RNA complex structures. Its application to human genome study has revealed a similar sensitivity and ability to uncover hundreds of novel RBPs beyond simple homology. The online server and downloadable version of SPOT-Seq-RNA are available at http://sparks.informatics.iupui.edu/server/SPOT-Seq-RNA/ PMID:24573478

  18. Structural predictions of neurobiologically relevant G-protein coupled receptors and intrinsically disordered proteins.

    PubMed

    Rossetti, Giulia; Dibenedetto, Domenica; Calandrini, Vania; Giorgetti, Alejandro; Carloni, Paolo

    2015-09-15

    G protein coupled receptors (GPCRs) and intrinsically disordered proteins (IDPs) are key players in neuronal function and dysfunction. Unfortunately, their structural characterization is lacking in most cases. On the one hand, no experimental structure has been determined for the two largest GPCR subfamilies, both key proteins in neuronal pathways. These are the odorant (450 members out of 900 human GPCRs) and the bitter taste receptor (25 members) subfamilies. On the other hand, the structural characterization of IDPs is also highly non-trivial. They exist as dynamic, highly flexible structural ensembles that undergo conformational conversions on a wide range of timescales, spanning from picoseconds to milliseconds. Computational methods may be of great help in characterizing these neuronal proteins. Here we review recent progress from our lab and other groups in developing and applying in silico methods for structural predictions of these highly relevant, fascinating and challenging systems. PMID:25797436

  19. A Comparative Taxonomy of Parallel Algorithms for RNA Secondary Structure Prediction

    PubMed Central

    Al-Khatib, Ra’ed M.; Abdullah, Rosni; Rashid, Nur’Aini Abdul

    2010-01-01

    RNA molecules have been found to play crucial roles in numerous biological and medical procedures and processes. Determining RNA structures has become a major problem in biology. Recently, computer scientists have provided biologists with RNA secondary structures that ease the understanding of RNA functions and roles. Detecting RNA secondary structure is an NP-hard problem, especially for pseudoknotted RNA structures. The detection process is also time-consuming; as a result, an alternative approach such as using parallel architectures is a desirable option. The main goal of this paper is an intensive investigation of the parallel methods used in the literature to address the demanding issues related to RNA secondary structure prediction. We then introduce a new taxonomy for parallel RNA folding methods. Based on this proposed taxonomy, a systematic and scientific comparison is performed among the existing methods. PMID:20458364
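
    For orientation, the sketch below shows the classical Nussinov dynamic program, which maximises the number of base pairs in a pseudoknot-free structure in polynomial time. It is not one of the parallel methods surveyed in the paper; it is included only because the cells on each diagonal of its table are independent of one another, which is the kind of structure parallel folding methods typically exploit. The pairing rules and the `min_loop` default are assumptions chosen for illustration.

```python
# Watson-Crick pairs plus the G-U wobble pair, in both orientations.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def nussinov_max_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs in seq (Nussinov recurrence).

    min_loop is the minimum number of unpaired bases enclosed by a pair.
    """
    n = len(seq)
    if n == 0:
        return 0
    # dp[i][j] holds the best pair count for the subsequence seq[i..j].
    dp = [[0] * n for _ in range(n)]
    # Fill by increasing span; all cells with the same span are independent.
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # base j left unpaired
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in PAIRS:
                    left = dp[i][k - 1] if k > i else 0
                    # Pair (k, j) splits the problem into two nested parts.
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]


if __name__ == "__main__":
    print(nussinov_max_pairs("GGGAAAUCC"))
```

    The triple loop gives cubic running time in the sequence length, which is part of why parallel architectures are attractive for long sequences.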

  20. RosettaHoles: Rapid assessment of protein core packing for structure prediction, refinement, design, and validation

    PubMed Central

    Sheffler, Will; Baker, David

    2009-01-01

    We present a novel method called RosettaHoles for visual and quantitative assessment of underpacking in the protein core. RosettaHoles generates a set of spherical cavity balls that fill the empty volume between atoms in the protein interior. For visualization, the cavity balls are aggregated into contiguous overlapping clusters and small cavities are discarded, leaving an uncluttered representation of the unfilled regions of space in a structure. For quantitative analysis, the cavity ball data are used to estimate the probability of observing a given cavity in a high-resolution crystal structure. RosettaHoles provides excellent discrimination between real and computationally generated structures, is predictive of incorrect regions in models, identifies problematic structures in the Protein Data Bank, and promises to be a useful validation tool for newly solved experimental structures. PMID:19177366

  1. Neural signature of hierarchically structured expectations predicts clustering and transfer of rule sets in reinforcement learning.

    PubMed

    Collins, Anne Gabrielle Eva; Frank, Michael Joshua

    2016-07-01

    Often the world is structured such that distinct sensory contexts signify the same abstract rule set. Learning from feedback thus informs us not only about the value of stimulus-action associations but also about which rule set applies. Hierarchical clustering models suggest that learners discover structure in the environment, clustering distinct sensory events into a single latent rule set. Such structure enables a learner to transfer any newly acquired information to other contexts linked to the same rule set, and facilitates re-use of learned knowledge in novel contexts. Here, we show that humans exhibit this transfer, generalization and clustering during learning. Trial-by-trial model-based analysis of EEG signals revealed that subjects' reward expectations incorporated this hierarchical structure; these structured neural signals were predictive of behavioral transfer and clustering. These results further our understanding of how humans learn and generalize flexibly by building abstract, behaviorally relevant representations of the complex, high-dimensional sensory environment. PMID:27082659

  2. Tertiary structure-based prediction of conformational B-cell epitopes through B factors

    PubMed Central

    Ren, Jing; Liu, Qian; Ellis, John; Li, Jinyan

    2014-01-01

    Motivation: A B-cell epitope is a small area on the surface of an antigen that binds to an antibody. Accurately locating epitopes is of critical importance for vaccine development. Compared with wet-lab methods, computational methods have strong potential for efficient and large-scale epitope prediction for antigen candidates at much lower cost. However, it is still not clear which features are good determinants for accurate epitope prediction, leading to the unsatisfactory performance of existing prediction methods. Method and results: We propose a much more accurate B-cell epitope prediction method. Our method uses a new feature, the B factor (obtained from X-ray crystallography), combined with other basic physicochemical, statistical, evolutionary and structural features of each residue. These basic features are extended by a sequence window and a structure window. All these features are then learned by a two-stage random forest model to identify clusters of antigenic residues and to remove isolated outliers. Tested on a dataset of 55 epitopes from 45 tertiary structures, we prove that our method significantly outperforms all three existing structure-based epitope predictors. Following comprehensive analysis, it is found that features such as B factor, relative accessible surface area and protrusion index play an important role in characterizing B-cell epitopes. Our detailed case studies on an HIV antigen and an influenza antigen confirm that our second stage learning is effective for clustering true antigenic residues and for eliminating self-made prediction errors introduced by the first-stage learning. Availability and implementation: Source code is available on request. Contact: jinyan.li@uts.edu.au Supplementary information: Supplementary data are available at Bioinformatics online. PMID:24931993

  3. Rule generation for protein secondary structure prediction with support vector machines and decision tree.

    PubMed

    He, Jieyue; Hu, Hae-Jin; Harrison, Robert; Tai, Phang C; Pan, Yi

    2006-03-01

    Support vector machines (SVMs) have shown strong generalization ability in a number of application areas, including protein structure prediction. However, their poor comprehensibility hinders the success of the SVM for protein structure prediction. An explanation of how a decision is made is important for accepting machine learning technology, especially for applications such as bioinformatics. A reasonable interpretation is not only useful to guide the "wet experiments"; the extracted rules are also helpful for integrating computational intelligence with symbolic AI systems for advanced deduction. A decision tree, on the other hand, has good comprehensibility. In this paper, a novel approach to rule generation for protein secondary structure prediction that integrates the merits of both the SVM and the decision tree is presented. This approach combines the SVM with a decision tree into a new algorithm called SVM_DT, which proceeds in three steps. The algorithm first trains an SVM. Then, a new training set is generated through careful selection from the output of the SVM. Finally, the obtained training set is used to train a decision tree learning system and to extract the corresponding rule sets. The results of the protein secondary structure prediction experiments on the RS126 data set show that the comprehensibility of SVM_DT is much better than that of the SVM. Moreover, the generalization ability of SVM_DT is better than that of C4.5 decision trees and is similar to that of the SVM. Hence, SVM_DT can be used not only for prediction, but also for guiding biological experiments. PMID:16570873

  4. Synthesized variable structure control and gray prediction for a class of perturbed systems.

    PubMed

    Chou, Chien-Hsin

    2003-04-01

    Based on the Lyapunov stability theorem, we apply a gray prediction scheme to eliminate the "chattering" disadvantage of traditional variable structure control. By using a moving window of recent past data, we can directly identify the system dynamics and unknown perturbations via the gray prediction scheme. Therefore, the upper bound of the perturbations does not need to be known in advance. In addition, the presented control scheme ensures globally uniformly ultimate boundedness for the overall controlled system. Finally, a numerical example is given to illustrate the feasibility of the proposed control scheme. PMID:12708542
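    The abstract does not say which gray model the controller uses. As a minimal sketch, assuming the standard GM(1,1) gray model is fitted to a moving window of recent positive samples, a one-step-ahead forecast could look like the following. The function name `gm11_forecast` and the window handling are illustrative assumptions, not the author's implementation.

```python
import math


def gm11_forecast(window, steps=1):
    """Fit a GM(1,1) gray model to a window of positive samples and
    forecast the next `steps` values (illustrative sketch only)."""
    n = len(window)
    if n < 4:
        raise ValueError("GM(1,1) needs at least 4 samples")

    # Accumulated generating operation: running sum of the raw series.
    x1 = []
    total = 0.0
    for value in window:
        total += value
        x1.append(total)

    # Background values: mean of consecutive accumulated values.
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, n)]
    y = list(window[1:])

    # Least-squares fit of y = slope * z + b, where slope = -a.
    m = len(z)
    sz, sy = sum(z), sum(y)
    szz = sum(v * v for v in z)
    szy = sum(zi * yi for zi, yi in zip(z, y))
    det = m * szz - sz * sz
    slope = (m * szy - sz * sy) / det
    b = (sy - slope * sz) / m
    a = -slope

    if abs(a) < 1e-12:
        # Degenerate case: the series is essentially constant.
        return [b] * steps

    def x1_hat(k):
        # Time response of the accumulated series, k is a 0-based index.
        return (window[0] - b / a) * math.exp(-a * k) + b / a

    # De-accumulate to recover forecasts of the raw series.
    return [x1_hat(n - 1 + s) - x1_hat(n - 2 + s) for s in range(1, steps + 1)]


if __name__ == "__main__":
    print(gm11_forecast([100.0, 110.0, 121.0, 133.1]))
```

    In a control loop the window would slide forward by one sample per step, so the fit tracks slowly varying dynamics without a preset bound on the perturbation.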

  5. Structure classification and melting temperature prediction in octet AB solids via machine learning

    NASA Astrophysics Data System (ADS)

    Pilania, G.; Gubernatis, J. E.; Lookman, T.

    2015-06-01

    Machine learning methods are being increasingly used in condensed matter physics and materials science to classify crystal structures and predict material properties. However, the reliability of these methods for a given problem, especially when large data sets are unavailable, has not been well studied. By addressing the tasks of classifying crystal structure and predicting melting temperatures of the octet subset of AB solids, we performed such a study and found potential problems with using machine learning methods on relatively small data sets. At the same time, however, we can reaffirm the potential power of such methods for these tasks. In particular, we uncovered an important new material feature, the excess Born effective charge, that significantly increased the accuracy of the predictions for the classification problem we defined. This discovery leads us to propose a new scale for the degree of ionicity and covalency in these solids. More specifically, we partitioned the crystal structures of a set of 75 octet solids into those that are ionic and covalent bonded and thus performed a binary classification task. We found that using the standard indices (rσ, rπ), suggested by St. John and Bloch several decades ago, enabled an average success in classification of 92%. Using just rσ and the excess Born effective charge ΔZA of the A atom enabled an average success of 97%, but we also found relatively large variations about these averages that were dependent on how certain machine learning methods were used and for which a standard deviation was not a proper measure of the degree of confidence we can place in either average. Instead, we calculated and report with 95% confidence that the traditional classification pair predicts an accuracy in the interval [89%, 95%] and the accuracy of the new pair lies in the interval [96%, 99%]. For melting temperature predictions, the size of our data set was 46. We estimate the root-mean-squared error of our resulting model to be 11% of the mean melting temperature of the data, but we note that if the accuracy of this predicted error is itself measured, our estimated fitting error itself has a root-mean-square error of 50%. In short, what we illustrate is that classification and regression predictions can vary significantly, depending on the details of how machine learning methods are applied to small data sets. This variation makes it important, if not essential, to average the predictions and compute confidence intervals about these averages to report results meaningfully. However, when properly used, these statistical methods can advance our understanding and improve predictions of material properties even for small data sets.
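    The abstract recommends reporting confidence intervals rather than a standard deviation for small data sets but does not describe the procedure. One common option, shown here as a sketch and not as the authors' method, is a percentile bootstrap over per-sample outcomes. The outcome list (69 correct out of 75, matching the quoted 92% on 75 solids) is hypothetical illustration data.

```python
import random


def bootstrap_accuracy_interval(correct, n_boot=2000, level=0.95, seed=0):
    """Percentile-bootstrap interval for classification accuracy.

    `correct` is a list of 0/1 flags, one per sample (1 = classified
    correctly). Returns (accuracy, lower, upper).
    """
    rng = random.Random(seed)
    n = len(correct)
    resampled_means = []
    for _ in range(n_boot):
        # Resample the outcomes with replacement and record the accuracy.
        sample = [correct[rng.randrange(n)] for _ in range(n)]
        resampled_means.append(sum(sample) / n)
    resampled_means.sort()

    # Cut off equal tails on both sides of the sorted bootstrap accuracies.
    lo_idx = int((1.0 - level) / 2.0 * n_boot)
    hi_idx = int((1.0 + level) / 2.0 * n_boot) - 1
    return sum(correct) / n, resampled_means[lo_idx], resampled_means[hi_idx]


if __name__ == "__main__":
    # Hypothetical outcomes: 69 of 75 samples classified correctly.
    outcomes = [1] * 69 + [0] * 6
    print(bootstrap_accuracy_interval(outcomes))
```

    A percentile interval stays inside [0, 1] and can be asymmetric, which a mean plus or minus a standard deviation cannot express when accuracies sit close to 100%.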

  6. Predicting mostly disordered proteins by using structure-unknown protein data

    PubMed Central

    Shimizu, Kana; Muraoka, Yoichi; Hirose, Shuichi; Tomii, Kentaro; Noguchi, Tamotsu

    2007-01-01

    Background: Predicting intrinsically disordered proteins is important in structural biology because they are thought to carry out various cellular functions even though they have no stable three-dimensional structure. We know the structures of far more ordered proteins than disordered proteins. The structural distribution of proteins in nature can therefore be inferred to differ from that of proteins whose structures have been determined experimentally. We know many more protein sequences than we do protein structures, and many of the known sequences can be expected to be those of disordered proteins. Thus it would be efficient to use the information of structure-unknown proteins in order to avoid training data sparseness. We propose a novel method for predicting which proteins are mostly disordered by using a spectral graph transducer (SGT) and training with a huge amount of structure-unknown sequences as well as structure-known sequences. Results: When the proposed method was evaluated on data that included 82 disordered proteins and 526 ordered proteins, its sensitivity was 0.723 and its specificity was 0.977. It resulted in a Matthews correlation coefficient 0.202 points higher than that obtained using FoldIndex, 0.221 points higher than that obtained using the method based on plotting hydrophobicity against the number of contacts and 0.07 points higher than that obtained using support vector machines (SVMs). To examine robustness against training data sparseness, we investigated the correlation between two results obtained when the method was trained on different datasets and tested on the same dataset. The correlation coefficient for the proposed method is 0.14 higher than that for the method using SVMs. When the proposed SGT-based method was compared with four per-residue predictors (VL3, GlobPlot, DISOPRED2 and IUPred (long)), its sensitivity was 0.834 for disordered proteins, which is 0.052–0.523 higher than that of the per-residue predictors, and its specificity was 0.991 for ordered proteins, which is 0.036–0.153 higher than that of the per-residue predictors. The proposed method was also evaluated on data that included 417 partially disordered proteins. It predicted the frequency of disordered proteins to be 1.95% for the proteins with 5%–10% disordered sequences, 1.46% for the proteins with 10%–20% disordered sequences and 16.57% for proteins with 20%–40% disordered sequences. Conclusion: The proposed method, which utilizes the information of structure-unknown data, predicts disordered proteins more accurately than other methods and is less affected by training data sparseness. PMID:17338828
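    The evaluation above is reported in terms of sensitivity, specificity and the Matthews correlation coefficient. As a short sketch of how these three quantities follow from a binary confusion matrix, the function below uses their standard definitions. The counts in the usage example are hypothetical and chosen only to match the class sizes quoted in the abstract (82 disordered, 526 ordered); they are not the paper's results.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, MCC) from confusion-matrix counts.

    tp/fn count the positive class (here: disordered proteins),
    tn/fp count the negative class (here: ordered proteins).
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, mcc


if __name__ == "__main__":
    # Hypothetical counts for 82 positives and 526 negatives.
    sens, spec, mcc = binary_metrics(tp=60, fn=22, tn=514, fp=12)
    print(f"sensitivity={sens:.3f} specificity={spec:.3f} MCC={mcc:.3f}")
```

    Because the two classes are strongly imbalanced here, MCC is a more informative single number than accuracy, which would remain high even for a predictor that labels every protein as ordered.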

  7. Structure band-gap correlations in semiconductors: Implications for computational band gap prediction

    NASA Astrophysics Data System (ADS)

    Schneider, Guenter; Foster, David H.

    2014-03-01

    Large-scale structure prediction for novel materials requires computationally inexpensive lattice relaxation methods, which are typically based on density functional theory (DFT) using a semi-local approximation for the exchange-correlation functional. These methods provide structural parameters accurate to within a few percent, but cannot predict band gaps. Band-gap calculations require much more computationally expensive methods such as hybrid functionals or the GW approximation. Such an accuracy-tiered method fails dramatically for Cu3PSe4. When the generalized gradient approximation (GGA) is used to relax the lattice and ions, band gaps calculated using both the single-shot GGA+GW method and the Heyd-Scuseria-Ernzerhof (HSE) hybrid functional method are a full 0.5 eV lower than the band gaps calculated for the unrelaxed, experimental structure. The GW and HSE methods predict accurate band gaps only when used with the correct experimental structure. We show that in Cu3PSe4, the calculated band gap depends strongly on the P-Se bond length, which can be explained by the P-Se* anti-bonding character of the lowest conduction band state. We show this effect for different lattice relaxation methods including recently developed meta-GGAs.

  8. Prediction of Protein Structural Class Based on Gapped-Dipeptides and a Recursive Feature Selection Approach

    PubMed Central

    Liu, Taigang; Qin, Yufang; Wang, Yongjie; Wang, Chunhua

    2015-01-01

    The prior knowledge of protein structural class may offer useful clues on understanding its functionality as well as its tertiary structure. Though various significant efforts have been made to find a fast and effective computational approach to address this problem, it is still a challenging topic in the field of bioinformatics. The position-specific score matrix (PSSM) profile has been shown to provide a useful source of information for improving the prediction performance of protein structural class. However, this information has not been adequately explored. To this end, in this study, we present a feature extraction technique which is based on gapped-dipeptides composition computed directly from PSSM. Then, a careful feature selection technique is performed based on support vector machine-recursive feature elimination (SVM-RFE). These optimal features are selected to construct a final predictor. The results of jackknife tests on four working datasets show that our method obtains satisfactory prediction accuracies by extracting features solely based on PSSM and could serve as a very promising tool to predict protein structural class. PMID:26712737
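    The abstract does not give the formula for the gapped-dipeptide composition. A minimal sketch, assuming each feature is the average over sequence positions of the product of two PSSM scores separated by a fixed gap, is shown below. The function name and the averaging formula are assumptions for illustration and may differ from the paper's definition.

```python
def gapped_dipeptide_features(pssm, gap):
    """Gapped-dipeptide composition computed from a PSSM (assumed formula).

    `pssm` is a list of rows, one per sequence position, each holding one
    score per residue type (20 columns for a real profile). For residue
    types i and j the feature is the mean over positions k of
    pssm[k][i] * pssm[k + gap + 1][j]. Returns a flat list of
    width * width values in row-major (i, j) order.
    """
    length = len(pssm)
    width = len(pssm[0])
    span = gap + 1
    pairs = length - span
    if pairs <= 0:
        raise ValueError("sequence too short for this gap")

    features = [0.0] * (width * width)
    for k in range(pairs):
        left, right = pssm[k], pssm[k + span]
        for i in range(width):
            for j in range(width):
                features[i * width + j] += left[i] * right[j]
    return [value / pairs for value in features]


if __name__ == "__main__":
    # Toy 3-position profile over a 2-letter alphabet.
    toy = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
    print(gapped_dipeptide_features(toy, gap=0))
    print(gapped_dipeptide_features(toy, gap=1))
```

    Concatenating the outputs for several gap values would give the high-dimensional vector that a selection step such as SVM-RFE then prunes.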

  9. Structure-Based Function Prediction of Uncharacterized Protein Using Binding Sites Comparison

    PubMed Central

    Konc, Janez; Hodošček, Milan; Ogrizek, Mitja; Trykowska Konc, Joanna; Janežič, Dušanka

    2013-01-01

    A challenge in structural genomics is prediction of the function of uncharacterized proteins. When proteins cannot be related to other proteins of known activity, identification of function based on sequence or structural homology is impossible, and in such cases it would be useful to assess structurally conserved binding sites in connection with the protein's function. In this paper, we propose the function of a protein of unknown activity, the Tm1631 protein from Thermotoga maritima, by comparing its predicted binding site to a library containing thousands of candidate structures. The comparison revealed numerous similarities with nucleotide binding sites including, specifically, a DNA-binding site of endonuclease IV. We constructed a model of this Tm1631 protein with a DNA ligand from the newly found similar binding site using ProBiS, and validated this model by molecular dynamics. The interactions predicted by the Tm1631-DNA model corresponded to those known to be important in the endonuclease IV-DNA complex model, and the corresponding binding free energies calculated from these models were in close agreement. We thus propose that Tm1631 is a DNA binding enzyme with endonuclease activity that recognizes DNA lesions in which at least two consecutive nucleotides are unpaired. Our approach is general, and can be applied to any protein of unknown function. It might also be useful to guide experimental determination of function of uncharacterized proteins. PMID:24244144

  10. Automated antibody structure prediction using Accelrys tools: Results and best practices

    PubMed Central

    Fasnacht, Marc; Butenhof, Ken; Goupil-Lamy, Anne; Hernandez-Guzman, Francisco; Huang, Hongwei; Yan, Lisa

    2014-01-01

    We describe the methodology and results from our participation in the second Antibody Modeling Assessment experiment. During the experiment we predicted the structure of eleven unpublished antibody Fv fragments. Our prediction methods centered on template-based modeling; potential templates were selected from an antibody database based on their sequence similarity to the target in the framework regions. Depending on the quality of the templates, we constructed models of the antibody framework regions using either a single, chimeric or multiple template approach. The hypervariable loop regions in the initial models were rebuilt by grafting the corresponding regions from suitable templates onto the model. For the H3 loop region, we further refined models using ab initio methods. The final models were subjected to constrained energy minimization to resolve severe local structural problems. The analysis of the submitted models shows that Accelrys tools allow for the construction of quite accurate models for the framework and the canonical CDR regions, with RMSDs to the X-ray structure on average below 1 Å for most of these regions. The results show that accurate prediction of the H3 hypervariable loops remains a challenge. Furthermore, model quality assessment of the submitted models shows that the models are of quite high quality, with local geometry assessment scores similar to those of the target X-ray structures. Proteins 2014; 82:1583–1598. © 2014 The Authors. Proteins published by Wiley Periodicals, Inc. PMID:24833271

  11. Correlation of pp data with predictions of improved six-quark structure models

    NASA Astrophysics Data System (ADS)

    González, P.; Lafrance, P.; Lomon, E. L.

    1987-04-01

    Recent experimental data indicate a structure in ΔσL corresponding to a pp mass of 2.7 GeV/c², as earlier predicted for a six-quark ¹S₀ state by an R-matrix treatment of the cloudy-bag-model quark degrees of freedom interior to a coupled-isobar-channel system. The ¹S₀ model is improved to agree with 2π production data at 800 MeV laboratory energy. The resulting ¹S₀ partial wave and recently improved models of the background partial waves as well as older versions of the phase parameters predict experimental observables in the resonance region. The predicted width and inelasticity are consistent with the data. Detailed energy and angular dependence of the model are in agreement with ΔσL, CLL, and CNN data in the resonance energy region. More data on these observables are needed to confirm the structure and its characteristics. Measurable aspects of the structure in other observables are displayed. Another six-quark resonance structure, in the ¹D₂ state, is described.

  12. Constrained evolutionary algorithm for structure prediction of molecular crystals: methodology and applications.

    PubMed

    Zhu, Qiang; Oganov, Artem R; Glass, Colin W; Stokes, Harold T

    2012-06-01

    Evolutionary crystal structure prediction has proved to be a powerful approach for studying a wide range of materials. Here we present a specifically designed algorithm for the prediction of the structure of complex crystals consisting of well-defined molecular units. The main feature of this new approach is that each unit is treated as a whole body, which drastically reduces the search space and improves the efficiency, but necessitates the introduction of new variation operators described here. To increase the diversity of the population of structures, the initial population and part (~20%) of the new generations are produced using space-group symmetry combined with random cell parameters, and random positions and orientations of molecular units. We illustrate the efficiency and reliability of this approach by a number of tests (ice, ammonia, carbon dioxide, methane, benzene, glycine and butane-1,4-diammonium dibromide). This approach easily predicts the crystal structure of methane A containing 21 methane molecules (105 atoms) per unit cell. We demonstrate that this new approach also has a high potential for the study of complex inorganic crystals, as shown by the examples of the complex hydrogen storage material Mg(BH4)2 and elemental boron. PMID:22610672

  13. A fast method for large-scale de novo peptide and miniprotein structure prediction.

    PubMed

    Maupetit, Julien; Derreumaux, Philippe; Tufféry, Pierre

    2010-03-01

    Although peptides have many biological and biomedical implications, an accurate method predicting their equilibrium structural ensembles from amino acid sequences and suitable for large-scale experiments is still missing. We introduce a new approach, PEP-FOLD, to the de novo prediction of peptides and miniproteins. It first predicts, in terms of a Hidden Markov Model-derived structural alphabet, a limited number of local conformations at each position of the structure. It then performs their assembly using a greedy procedure driven by a coarse-grained energy score. On a benchmark of 52 peptides with 9-23 amino acids, PEP-FOLD generates lowest-energy conformations within 2.8 and 2.3 Å Cα root-mean-square deviation from the full nuclear magnetic resonance (NMR) structures and the NMR rigid cores, respectively, outperforming previous approaches. For 13 miniproteins with 27-49 amino acids, PEP-FOLD reaches an accuracy of 3.6 and 4.6 Å Cα root-mean-square deviation for the most-native and lowest-energy conformations, using the nonflexible regions identified by NMR. PEP-FOLD simulations are fast (a few minutes only), therefore opening the door to in silico large-scale rational design of new bioactive peptides and miniproteins. PMID:19569182
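    The accuracies above are quoted as Cα root-mean-square deviations. As a small sketch of what that number measures, the function below computes the RMSD between two equally long lists of Cα coordinates. It assumes the two structures have already been optimally superimposed (real comparisons normally fit one structure onto the other first), and the function name and coordinates are illustrative.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """RMSD (in the units of the input, e.g. angstroms) between two
    lists of (x, y, z) C-alpha positions, matched by index.

    Assumes the structures are already superimposed; no fitting is done.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("structures must have the same number of atoms")
    if not coords_a:
        raise ValueError("no coordinates given")

    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))


if __name__ == "__main__":
    model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    reference = [(0.0, 0.0, 3.0), (1.0, 4.0, 0.0)]
    print(ca_rmsd(model, reference))
```

    Restricting the atom lists to the NMR rigid core, as the benchmark does, simply means passing only the Cα positions of the nonflexible residues.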

  14. Less-structured time in children's daily lives predicts self-directed executive functioning

    PubMed Central

    Barker, Jane E.; Semenov, Andrei D.; Michaelson, Laura; Provan, Lindsay S.; Snyder, Hannah R.; Munakata, Yuko

    2014-01-01

    Executive functions (EFs) in childhood predict important life outcomes. Thus, there is great interest in attempts to improve EFs early in life. Many interventions are led by trained adults, including structured training activities in the lab, and less-structured activities implemented in schools. Such programs have yielded gains in children's externally-driven executive functioning