Sample records for molecular ensemble based

  1. Molecular activity prediction by means of supervised subspace projection based ensembles of classifiers.

    PubMed

    Cerruela García, G; García-Pedrajas, N; Luque Ruiz, I; Gómez-Nieto, M Á

    2018-03-01

    This paper proposes a method for molecular activity prediction in QSAR studies using ensembles of classifiers constructed by means of two supervised subspace projection methods, namely nonparametric discriminant analysis (NDA) and hybrid discriminant analysis (HDA). We studied the performance of the proposed ensembles compared to classical ensemble methods using four molecular datasets and eight different models for the representation of the molecular structure. Using several measures and statistical tests for classifier comparison, we observe that our proposal improves the classification results with respect to classical ensemble methods. Therefore, we show that ensembles constructed using supervised subspace projections offer an effective way of creating classifiers in cheminformatics.

  2. Hybrid nanomembrane-based capacitors for the determination of the dielectric constant of semiconducting molecular ensembles.

    PubMed

    Petrini, Paula A; Silva, Ricardo M L; de Oliveira, Rafael F; Merces, Leandro; Bof Bufon, Carlos C

    2018-06-29

    Considerable advances in the field of molecular electronics have been achieved over the recent years. One persistent challenge, however, is the exploitation of the electronic properties of molecules fully integrated into devices. Typically, the molecular electronic properties are investigated using sophisticated techniques incompatible with a practical device technology, such as the scanning tunneling microscopy. The incorporation of molecular materials in devices is not a trivial task as the typical dimensions of electrical contacts are much larger than the molecular ones. To tackle this issue, we report on hybrid capacitors using mechanically-compliant nanomembranes to encapsulate ultrathin molecular ensembles for the investigation of molecular dielectric properties. As the prototype material, copper (II) phthalocyanine (CuPc) has been chosen as information on its dielectric constant (k CuPc) at the molecular scale is missing. Here, hybrid nanomembrane-based capacitors containing metallic nanomembranes, insulating Al2O3 layers, and the CuPc molecular ensembles have been fabricated and evaluated. The Al2O3 is used to prevent short circuits through the capacitor plates as the molecular layer is considerably thin (<30 nm). From the electrical measurements of devices with molecular layers of different thicknesses, the CuPc dielectric constant has been reliably determined (k CuPc = 4.5 ± 0.5). These values suggest a mild contribution of the molecular orientation on the CuPc dielectric properties. The reported nanomembrane-based capacitor is a viable strategy for the dielectric characterization of ultrathin molecular ensembles integrated into a practical, real device technology.

  3. Hybrid nanomembrane-based capacitors for the determination of the dielectric constant of semiconducting molecular ensembles

    NASA Astrophysics Data System (ADS)

    Petrini, Paula A.; Silva, Ricardo M. L.; de Oliveira, Rafael F.; Merces, Leandro; Bof Bufon, Carlos C.

    2018-06-01

    Considerable advances in the field of molecular electronics have been achieved over the recent years. One persistent challenge, however, is the exploitation of the electronic properties of molecules fully integrated into devices. Typically, the molecular electronic properties are investigated using sophisticated techniques incompatible with a practical device technology, such as the scanning tunneling microscopy. The incorporation of molecular materials in devices is not a trivial task as the typical dimensions of electrical contacts are much larger than the molecular ones. To tackle this issue, we report on hybrid capacitors using mechanically-compliant nanomembranes to encapsulate ultrathin molecular ensembles for the investigation of molecular dielectric properties. As the prototype material, copper (II) phthalocyanine (CuPc) has been chosen as information on its dielectric constant (k CuPc) at the molecular scale is missing. Here, hybrid nanomembrane-based capacitors containing metallic nanomembranes, insulating Al2O3 layers, and the CuPc molecular ensembles have been fabricated and evaluated. The Al2O3 is used to prevent short circuits through the capacitor plates as the molecular layer is considerably thin (<30 nm). From the electrical measurements of devices with molecular layers of different thicknesses, the CuPc dielectric constant has been reliably determined (k CuPc = 4.5 ± 0.5). These values suggest a mild contribution of the molecular orientation on the CuPc dielectric properties. The reported nanomembrane-based capacitor is a viable strategy for the dielectric characterization of ultrathin molecular ensembles integrated into a practical, real device technology.

  4. An ensemble predictive modeling framework for breast cancer classification.

    PubMed

    Nagarajan, Radhakrishnan; Upreti, Meenakshi

    2017-12-01

    Molecular changes often precede clinical presentation of diseases and can be useful surrogates with potential to assist in informed clinical decision making. Recent studies have demonstrated the usefulness of modeling approaches such as classification that can predict the clinical outcomes from molecular expression profiles. While useful, a majority of these approaches implicitly use all molecular markers as features in the classification process often resulting in sparse high-dimensional projection of the samples often comparable to that of the sample size. In this study, a variant of the recently proposed ensemble classification approach is used for predicting good and poor-prognosis breast cancer samples from their molecular expression profiles. In contrast to traditional single and ensemble classifiers, the proposed approach uses multiple base classifiers with varying feature sets obtained from two-dimensional projection of the samples in conjunction with a majority voting strategy for predicting the class labels. In contrast to our earlier implementation, base classifiers in the ensembles are chosen based on maximal sensitivity and minimal redundancy by choosing only those with low average cosine distance. The resulting ensemble sets are subsequently modeled as undirected graphs. Performance of four different classification algorithms is shown to be better within the proposed ensemble framework in contrast to using them as traditional single classifier systems. Significance of a subset of genes with high-degree centrality in the network abstractions across the poor-prognosis samples is also discussed. Copyright © 2017 Elsevier Inc. All rights reserved.
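
    The majority voting step mentioned in this abstract can be illustrated with a minimal sketch. It assumes each base classifier has already produced one label per sample; the function name `majority_vote` and the example labels are hypothetical and are not taken from the paper, and the selection of base classifiers by cosine distance is not reproduced here.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one label per sample.

    predictions: list of lists, one inner list per base classifier,
    all of the same length (one label per sample).
    Ties are resolved in favour of the label seen first among the votes.
    """
    n_samples = len(predictions[0])
    voted = []
    for i in range(n_samples):
        # Collect the label each base classifier assigned to sample i.
        votes = Counter(clf[i] for clf in predictions)
        voted.append(votes.most_common(1)[0][0])
    return voted


# Hypothetical output of three base classifiers on three samples.
base_predictions = [
    ["good", "poor", "poor"],
    ["good", "good", "poor"],
    ["poor", "good", "poor"],
]
ensemble_labels = majority_vote(base_predictions)
print(ensemble_labels)
```

    Using an odd number of base classifiers on a two-class problem avoids ties altogether.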

  5. Cosolvent-Based Molecular Dynamics for Ensemble Docking: Practical Method for Generating Druggable Protein Conformations.

    PubMed

    Uehara, Shota; Tanaka, Shigenori

    2017-04-24

    Protein flexibility is a major hurdle in current structure-based virtual screening (VS). In spite of the recent advances in high-performance computing, protein-ligand docking methods still demand tremendous computational cost to take into account the full degree of protein flexibility. In this context, ensemble docking has proven its utility and efficiency for VS studies, but it still needs a rational and efficient method to select and/or generate multiple protein conformations. Molecular dynamics (MD) simulations are useful to produce distinct protein conformations without abundant experimental structures. In this study, we present a novel strategy that makes use of cosolvent-based molecular dynamics (CMD) simulations for ensemble docking. By mixing small organic molecules into a solvent, CMD can stimulate dynamic protein motions and induce partial conformational changes of binding pocket residues appropriate for the binding of diverse ligands. The present method has been applied to six diverse target proteins and assessed by VS experiments using many actives and decoys of DEKOIS 2.0. The simulation results have revealed that the CMD is beneficial for ensemble docking. Utilizing cosolvent simulation allows the generation of druggable protein conformations, improving the VS performance compared with the use of a single experimental structure or ensemble docking by standard MD with pure water as the solvent.

  6. CarcinoPred-EL: Novel models for predicting the carcinogenicity of chemicals using molecular fingerprints and ensemble learning methods.

    PubMed

    Zhang, Li; Ai, Haixin; Chen, Wen; Yin, Zimo; Hu, Huan; Zhu, Junfeng; Zhao, Jian; Zhao, Qi; Liu, Hongsheng

    2017-05-18

    Carcinogenicity refers to a highly toxic end point of certain chemicals, and has become an important issue in the drug development process. In this study, three novel ensemble classification models, namely Ensemble SVM, Ensemble RF, and Ensemble XGBoost, were developed to predict carcinogenicity of chemicals using seven types of molecular fingerprints and three machine learning methods based on a dataset containing 1003 diverse compounds with rat carcinogenicity. Among these three models, Ensemble XGBoost is found to be the best, giving an average accuracy of 70.1 ± 2.9%, sensitivity of 67.0 ± 5.0%, and specificity of 73.1 ± 4.4% in five-fold cross-validation and an accuracy of 70.0%, sensitivity of 65.2%, and specificity of 76.5% in external validation. In comparison with some recent methods, the ensemble models outperform some machine learning-based approaches and yield equal accuracy and higher specificity but lower sensitivity than rule-based expert systems. It is also found that the ensemble models could be further improved if more data were available. As an application, the ensemble models are employed to discover potential carcinogens in the DrugBank database. The results indicate that the proposed models are helpful in predicting the carcinogenicity of chemicals. A web server called CarcinoPred-EL has been built for these models ( http://ccsipb.lnu.edu.cn/toxicity/CarcinoPred-EL/ ).
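
    The accuracy, sensitivity, and specificity figures quoted in this abstract follow the standard confusion-matrix definitions. The sketch below shows how such values are computed for a binary problem; the function name and the toy labels are hypothetical, and it assumes both classes occur in `y_true`.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, sensitivity, specificity) for binary labels.

    Assumes y_true contains at least one positive and one negative sample,
    so that neither denominator below is zero.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)

    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn)  # fraction of true positives recovered
    specificity = tn / (tn + fp)  # fraction of true negatives recovered
    return accuracy, sensitivity, specificity


# Toy labels: 1 = carcinogen, 0 = non-carcinogen (illustrative only).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
acc, sens, spec = classification_metrics(y_true, y_pred)
print(acc, sens, spec)
```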

  7. Bioactive focus in conformational ensembles: a pluralistic approach

    NASA Astrophysics Data System (ADS)

    Habgood, Matthew

    2017-12-01

    Computational generation of conformational ensembles is key to contemporary drug design. Selecting the members of the ensemble that will approximate the conformation most likely to bind to a desired target (the bioactive conformation) is difficult, given that the potential energy usually used to generate and rank the ensemble is a notoriously poor discriminator between bioactive and non-bioactive conformations. In this study an approach to generating a focused ensemble is proposed in which each conformation is assigned multiple rankings based not just on potential energy but also on solvation energy, hydrophobic or hydrophilic interaction energy, radius of gyration, and on a statistical potential derived from Cambridge Structural Database data. The best ranked structures derived from each system are then assembled into a new ensemble that is shown to be better focused on bioactive conformations. This pluralistic approach is tested on ensembles generated by the Molecular Operating Environment's Low Mode Molecular Dynamics module, and by the Cambridge Crystallographic Data Centre's conformation generator software.

  8. Molecular dynamics of liquid crystals

    NASA Astrophysics Data System (ADS)

    Sarman, Sten

    1997-02-01

    We derive Green-Kubo relations for the viscosities of a nematic liquid crystal. The derivation is based on the application of a Gaussian constraint algorithm that makes the director angular velocity of a liquid crystal a constant of motion. Setting this velocity equal to zero means that a director-based coordinate system becomes an inertial frame and that the constraint torques do not do any work on the system. The system consequently remains in equilibrium. However, one generates a different equilibrium ensemble. The great advantage of this ensemble is that the Green-Kubo relations for the viscosities become linear combinations of time correlation function integrals, whereas they are complicated rational functions in the conventional canonical ensemble. This facilitates the numerical evaluation of the viscosities by molecular dynamics simulations.

  9. Similarity Measures for Protein Ensembles

    PubMed Central

    Lindorff-Larsen, Kresten; Ferkinghoff-Borg, Jesper

    2009-01-01

    Analyses of similarities and changes in protein conformation can provide important information regarding protein function and evolution. Many scores, including the commonly used root mean square deviation, have therefore been developed to quantify the similarities of different protein conformations. However, instead of examining individual conformations it is in many cases more relevant to analyse ensembles of conformations that have been obtained either through experiments or from methods such as molecular dynamics simulations. We here present three approaches that can be used to compare conformational ensembles in the same way as the root mean square deviation is used to compare individual pairs of structures. The methods are based on the estimation of the probability distributions underlying the ensembles and subsequent comparison of these distributions. We first validate the methods using a synthetic example from molecular dynamics simulations. We then apply the algorithms to revisit the problem of ensemble averaging during structure determination of proteins, and find that an ensemble refinement method is able to recover the correct distribution of conformations better than standard single-molecule refinement. PMID:19145244

  10. Using simulation to interpret experimental data in terms of protein conformational ensembles.

    PubMed

    Allison, Jane R

    2017-04-01

    In their biological environment, proteins are dynamic molecules, necessitating an ensemble structural description. Molecular dynamics simulations and solution-state experiments provide complementary information in the form of atomically detailed coordinates and averages or distributions of structural properties or related quantities. Recently, increases in the temporal and spatial scale of conformational sampling and comparison of the more diverse conformational ensembles thus generated have revealed the importance of sampling rare events. Excitingly, new methods based on maximum entropy and Bayesian inference are promising to provide a statistically sound mechanism for combining experimental data with molecular dynamics simulations. Copyright © 2016 Elsevier Ltd. All rights reserved.

  11. Simulation of weak polyelectrolytes: a comparison between the constant pH and the reaction ensemble method

    NASA Astrophysics Data System (ADS)

    Landsgesell, Jonas; Holm, Christian; Smiatek, Jens

    2017-03-01

    The reaction ensemble and the constant pH method are well-known chemical equilibrium approaches to simulate protonation and deprotonation reactions in classical molecular dynamics and Monte Carlo simulations. In this article, we demonstrate the similarity between both methods under certain conditions. We perform molecular dynamics simulations of a weak polyelectrolyte in order to compare the titration curves obtained by both approaches. Our findings reveal a good agreement between the methods when the reaction ensemble is used to sweep the reaction constant. Pronounced differences between the reaction ensemble and the constant pH method can be observed for stronger acids and bases in terms of adaptive pH values. These deviations are due to the presence of explicit protons in the reaction ensemble method, which induce a screening of electrostatic interactions between the charged titrable groups of the polyelectrolyte. The outcomes of our simulation hint at a better applicability of the reaction ensemble method for systems in confined geometries and titrable groups in polyelectrolytes with different pKa values.

  12. A Kolmogorov-Smirnov test for the molecular clock based on Bayesian ensembles of phylogenies

    PubMed Central

    Antoneli, Fernando; Passos, Fernando M.; Lopes, Luciano R.

    2018-01-01

    Divergence date estimates are central to understanding evolutionary processes and depend, in the case of molecular phylogenies, on tests of molecular clocks. Here we propose two non-parametric tests of strict and relaxed molecular clocks built upon a framework that uses the empirical cumulative distribution (ECD) of branch lengths obtained from an ensemble of Bayesian trees and the well-known non-parametric (one-sample and two-sample) Kolmogorov-Smirnov (KS) goodness-of-fit test. In the strict clock case, the method consists in using the one-sample KS test to directly test if the phylogeny is clock-like, in other words, if it follows a Poisson law. The ECD is computed from the discretized branch lengths and the parameter λ of the expected Poisson distribution is calculated as the average branch length over the ensemble of trees. To compensate for the auto-correlation in the ensemble of trees and pseudo-replication, we take advantage of thinning and effective sample size, two features provided by Bayesian inference MCMC samplers. Finally, it is observed that tree topologies with very long or very short branches lead to Poisson mixtures, and in this case we propose the use of the two-sample KS test with samples from two continuous branch length distributions, one obtained from an ensemble of clock-constrained trees and the other from an ensemble of unconstrained trees. Moreover, in this second form the test can also be applied to test for relaxed clock models. The use of a statistically equivalent ensemble of phylogenies to obtain the branch lengths ECD, instead of one consensus tree, yields considerable reduction of the effects of small sample size and provides a gain of power. PMID:29300759
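
    The one-sample step described in this abstract (an ECD of discretized branch lengths compared with a Poisson law whose λ is the mean branch length) can be sketched as follows. This is only an illustration under stated assumptions: the function names and the toy branch lengths are hypothetical, only the KS distance is computed (no p-value), and the thinning and effective-sample-size corrections from the paper are not modelled.

```python
import math
from collections import Counter


def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam), summed term by term."""
    term = math.exp(-lam)  # P(X = 0)
    total = term
    for i in range(1, k + 1):
        term *= lam / i    # P(X = i) from P(X = i - 1)
        total += term
    return total


def ks_statistic_poisson(branch_lengths):
    """Return (lam, D) for non-negative integer branch lengths.

    lam is the sample mean; D is the largest absolute gap between the
    empirical cumulative distribution and the Poisson(lam) CDF.
    Both are step functions with jumps at integers, so checking the
    integers 0..max(branch_lengths) is sufficient.
    """
    n = len(branch_lengths)
    lam = sum(branch_lengths) / n
    freq = Counter(branch_lengths)
    cumulative = 0
    d = 0.0
    for k in range(max(branch_lengths) + 1):
        cumulative += freq.get(k, 0)
        d = max(d, abs(cumulative / n - poisson_cdf(k, lam)))
    return lam, d


# Toy discretized branch lengths (illustrative only).
lengths = [0, 1, 1, 2, 2, 2, 3, 4]
lam, d_stat = ks_statistic_poisson(lengths)
print(lam, d_stat)
```

    Because λ is estimated from the same sample, critical values from the standard KS tables would not apply directly to this statistic.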

  13. Girsanov reweighting for path ensembles and Markov state models

    NASA Astrophysics Data System (ADS)

    Donati, L.; Hartmann, C.; Keller, B. G.

    2017-06-01

    The sensitivity of molecular dynamics on changes in the potential energy function plays an important role in understanding the dynamics and function of complex molecules. We present a method to obtain path ensemble averages of a perturbed dynamics from a set of paths generated by a reference dynamics. It is based on the concept of path probability measure and the Girsanov theorem, a result from stochastic analysis to estimate a change of measure of a path ensemble. Since Markov state models (MSMs) of the molecular dynamics can be formulated as a combined phase-space and path ensemble average, the method can be extended to reweight MSMs by combining it with a reweighting of the Boltzmann distribution. We demonstrate how to efficiently implement the Girsanov reweighting in a molecular dynamics simulation program by calculating parts of the reweighting factor "on the fly" during the simulation, and we benchmark the method on test systems ranging from a two-dimensional diffusion process and an artificial many-body system to alanine dipeptide and valine dipeptide in implicit and explicit water. The method can be used to study the sensitivity of molecular dynamics on external perturbations as well as to reweight trajectories generated by enhanced sampling schemes to the original dynamics.

  14. Multi-Conformer Ensemble Docking to Difficult Protein Targets

    DOE PAGES

    Ellingson, Sally R.; Miao, Yinglong; Baudry, Jerome; ...

    2014-09-08

    We investigate large-scale ensemble docking using five proteins from the Directory of Useful Decoys (DUD, dud.docking.org) for which docking to crystal structures has proven difficult. Molecular dynamics trajectories are produced for each protein and an ensemble of representative conformational structures extracted from the trajectories. Docking calculations are performed on these selected simulation structures and ensemble-based enrichment factors compared with those obtained using docking in crystal structures of the same protein targets or random selection of compounds. We also found simulation-derived snapshots with improved enrichment factors that increased the chemical diversity of docking hits for four of the five selected proteins. A combination of all the docking results obtained from molecular dynamics simulation followed by selection of top-ranking compounds appears to be an effective strategy for increasing the number and diversity of hits when using docking to screen large libraries of chemicals against difficult protein targets.

  15. Hybrid quantum processors: molecular ensembles as quantum memory for solid state circuits.

    PubMed

    Rabl, P; DeMille, D; Doyle, J M; Lukin, M D; Schoelkopf, R J; Zoller, P

    2006-07-21

    We investigate a hybrid quantum circuit where ensembles of cold polar molecules serve as long-lived quantum memories and optical interfaces for solid state quantum processors. The quantum memory realized by collective spin states (ensemble qubit) is coupled to a high-Q stripline cavity via microwave Raman processes. We show that, for convenient trap-surface distances of a few μm, strong coupling between the cavity and ensemble qubit can be achieved. We discuss basic quantum information protocols, including a swap from the cavity photon bus to the molecular quantum memory, and a deterministic two qubit gate. Finally, we investigate coherence properties of molecular ensemble quantum bits.

  16. Artificial neural networks for efficient clustering of conformational ensembles and their potential for medicinal chemistry.

    PubMed

    Pandini, Alessandro; Fraccalvieri, Domenico; Bonati, Laura

    2013-01-01

    The biological function of proteins is strictly related to their molecular flexibility and dynamics: enzymatic activity, protein-protein interactions, ligand binding and allosteric regulation are important mechanisms involving protein motions. Computational approaches, such as Molecular Dynamics (MD) simulations, are now routinely used to study the intrinsic dynamics of target proteins as well as to complement molecular docking approaches. These methods have also successfully supported the process of rational design and discovery of new drugs. Identification of functionally relevant conformations is a key step in these studies. This is generally done by cluster analysis of the ensemble of structures in the MD trajectory. Recently Artificial Neural Network (ANN) approaches, in particular methods based on Self-Organising Maps (SOMs), have been reported performing more accurately and providing more consistent results than traditional clustering algorithms in various data-mining problems. In the specific case of conformational analysis, SOMs have been successfully used to compare multiple ensembles of protein conformations demonstrating a potential in efficiently detecting the dynamic signatures central to biological function. Moreover, examples of the use of SOMs to address problems relevant to other stages of the drug-design process, including clustering of docking poses, have been reported. In this contribution we review recent applications of ANN algorithms in analysing conformational and structural ensembles and we discuss their potential in computer-based approaches for medicinal chemistry.

  17. Preserving the Boltzmann ensemble in replica-exchange molecular dynamics.

    PubMed

    Cooke, Ben; Schmidler, Scott C

    2008-10-28

    We consider the convergence behavior of replica-exchange molecular dynamics (REMD) [Sugita and Okamoto, Chem. Phys. Lett. 314, 141 (1999)] based on properties of the numerical integrators in the underlying isothermal molecular dynamics (MD) simulations. We show that a variety of deterministic algorithms favored by molecular dynamics practitioners for constant-temperature simulation of biomolecules fail either to be measure invariant or irreducible, and are therefore not ergodic. We then show that REMD using these algorithms also fails to be ergodic. As a result, the entire configuration space may not be explored even in an infinitely long simulation, and the simulation may not converge to the desired equilibrium Boltzmann ensemble. Moreover, our analysis shows that for initial configurations with unfavorable energy, it may be impossible for the system to reach a region surrounding the minimum energy configuration. We demonstrate these failures of REMD algorithms for three small systems: a Gaussian distribution (simple harmonic oscillator dynamics), a bimodal mixture of Gaussians distribution, and the alanine dipeptide. Examination of the resulting phase plots and equilibrium configuration densities indicates significant errors in the ensemble generated by REMD simulation. We describe a simple modification to address these failures based on a stochastic hybrid Monte Carlo correction, and prove that this is ergodic.

  18. Assessing an ensemble docking-based virtual screening strategy for kinase targets by considering protein flexibility.

    PubMed

    Tian, Sheng; Sun, Huiyong; Pan, Peichen; Li, Dan; Zhen, Xuechu; Li, Youyong; Hou, Tingjun

    2014-10-27

    In this study, to accommodate receptor flexibility, based on multiple receptor conformations, a novel ensemble docking protocol was developed by using the naïve Bayesian classification technique, and it was evaluated in terms of the prediction accuracy of docking-based virtual screening (VS) of three important targets in the kinase family: ALK, CDK2, and VEGFR2. First, for each target, the representative crystal structures were selected by structural clustering, and the capability of molecular docking based on each representative structure to discriminate inhibitors from non-inhibitors was examined. Then, for each target, 50 ns molecular dynamics (MD) simulations were carried out to generate an ensemble of the conformations, and multiple representative structures/snapshots were extracted from each MD trajectory by structural clustering. On average, the representative crystal structures outperform the representative structures extracted from MD simulations in terms of the capabilities to separate inhibitors from non-inhibitors. Finally, by using the naïve Bayesian classification technique, an integrated VS strategy was developed to combine the prediction results of molecular docking based on different representative conformations chosen from crystal structures and MD trajectories. It was encouraging to observe that the integrated VS strategy yields better performance than the docking-based VS based on any single rigid conformation. This novel protocol may provide an improvement over existing strategies to search for more diverse and promising active compounds for a target of interest.

  19. Ensemble inequivalence and Maxwell construction in the self-gravitating ring model

    NASA Astrophysics Data System (ADS)

    Rocha Filho, T. M.; Silvestre, C. H.; Amato, M. A.

    2018-06-01

    The statement that Gibbs equilibrium ensembles are equivalent is a baseline in many approaches in the context of equilibrium statistical mechanics. However, it is known that for some physical systems this equivalence may not hold. In this paper we illustrate from first principles the inequivalence between the canonical and microcanonical ensembles for a system with long range interactions. We make use of molecular dynamics simulations and Monte Carlo simulations to explore the thermodynamic properties of the self-gravitating ring model and discuss under what conditions the Maxwell construction is applicable.

  20. Ensemble Sampling vs. Time Sampling in Molecular Dynamics Simulations of Thermal Conductivity

    DOE PAGES

    Gordiz, Kiarash; Singh, David J.; Henry, Asegun

    2015-01-29

    In this report we compare time sampling and ensemble averaging as two different methods available for phase space sampling. For the comparison, we calculate thermal conductivities of solid argon and silicon structures, using equilibrium molecular dynamics. We introduce two different schemes for the ensemble averaging approach, and show that both can reduce the total simulation time as compared to time averaging. It is also found that velocity rescaling is an efficient mechanism for phase space exploration. Although our methodology is tested using classical molecular dynamics, the ensemble generation approaches may find their greatest utility in computationally expensive simulations such as first principles molecular dynamics. For such simulations, where each time step is costly, time sampling can require long simulation times because each time step must be evaluated sequentially and therefore phase space averaging is achieved through sequential operations. On the other hand, with ensemble averaging, phase space sampling can be achieved through parallel operations, since each ensemble is independent. For this reason, particularly when using massively parallel architectures, ensemble sampling can result in much shorter simulation times and exhibits similar overall computational effort.

  1. An Ensemble-Based Protocol for the Computational Prediction of Helix-Helix Interactions in G Protein-Coupled Receptors using Coarse-Grained Molecular Dynamics.

    PubMed

    Altwaijry, Nojood A; Baron, Michael; Wright, David W; Coveney, Peter V; Townsend-Nicholson, Andrea

    2017-05-09

    The accurate identification of the specific points of interaction between G protein-coupled receptor (GPCR) oligomers is essential for the design of receptor ligands targeting oligomeric receptor targets. A coarse-grained molecular dynamics computer simulation approach would provide a compelling means of identifying these specific protein-protein interactions and could be applied both for known oligomers of interest and as a high-throughput screen to identify novel oligomeric targets. However, to be effective, this in silico modeling must provide accurate, precise, and reproducible information. This has been achieved recently in numerous biological systems using an ensemble-based all-atom molecular dynamics approach. In this study, we describe an equivalent methodology for ensemble-based coarse-grained simulations. We report the performance of this method when applied to four different GPCRs known to oligomerize using error analysis to determine the ensemble size and individual replica simulation time required. Our measurements of distance between residues shown to be involved in oligomerization of the fifth transmembrane domain from the adenosine A2A receptor are in very good agreement with the existing biophysical data and provide information about the nature of the contact interface that cannot be determined experimentally. Calculations of distance between rhodopsin, CXCR4, and β1AR transmembrane domains reported to form contact points in homodimers correlate well with the corresponding measurements obtained from experimental structural data, providing an ability to predict contact interfaces computationally. Interestingly, error analysis enables identification of noninteracting regions. Our results confirm that GPCR interactions can be reliably predicted using this novel methodology.

  2. Genetic Feedback Regulation of Frontal Cortical Neuronal Ensembles Through Activity-Dependent Arc Expression and Dopaminergic Input.

    PubMed

    Mastwal, Surjeet; Cao, Vania; Wang, Kuan Hong

    2016-01-01

    Mental functions involve coordinated activities of specific neuronal ensembles that are embedded in complex brain circuits. Aberrant neuronal ensemble dynamics is thought to form the neurobiological basis of mental disorders. A major challenge in mental health research is to identify these cellular ensembles and determine what molecular mechanisms constrain their emergence and consolidation during development and learning. Here, we provide a perspective based on recent studies that use activity-dependent gene Arc/Arg3.1 as a cellular marker to identify neuronal ensembles and a molecular probe to modulate circuit functions. These studies have demonstrated that the transcription of Arc is activated in selective groups of frontal cortical neurons in response to specific behavioral tasks. Arc expression regulates the persistent firing of individual neurons and predicts the consolidation of neuronal ensembles during repeated learning. Therefore, the Arc pathway represents a prototypical example of activity-dependent genetic feedback regulation of neuronal ensembles. The activation of this pathway in the frontal cortex starts during early postnatal development and requires dopaminergic (DA) input. Conversely, genetic disruption of Arc leads to a hypoactive mesofrontal dopamine circuit and its related cognitive deficit. This mutual interaction suggests an auto-regulatory mechanism to amplify the impact of neuromodulators and activity-regulated genes during postnatal development. Such a mechanism may contribute to the association of mutations in dopamine and Arc pathways with neurodevelopmental psychiatric disorders. As the mesofrontal dopamine circuit shows extensive activity-dependent developmental plasticity, activity-guided modulation of DA projections or Arc ensembles during development may help to repair circuit deficits related to neuropsychiatric disorders.

  3. Convergence and reproducibility in molecular dynamics simulations of the DNA duplex d(GCACGAACGAACGAACGC).

    PubMed

    Galindo-Murillo, Rodrigo; Roe, Daniel R; Cheatham, Thomas E

    2015-05-01

    The structure and dynamics of DNA are critically related to its function. Molecular dynamics simulations augment experiment by providing detailed information about the atomic motions. However, to date the simulations have not been long enough for convergence of the dynamics and structural properties of DNA. Molecular dynamics simulations performed with AMBER using the ff99SB force field with the parmbsc0 modifications, including ensembles of independent simulations, were compared to long timescale molecular dynamics performed with the specialized Anton MD engine on the B-DNA structure d(GCACGAACGAACGAACGC). To assess convergence, the decay of the average RMSD values over longer and longer time intervals was evaluated in addition to assessing convergence of the dynamics via the Kullback-Leibler divergence of principal component projection histograms. These molecular dynamics simulations, including one of the longest simulations of DNA published to date at ~44 μs, surprisingly suggest that the structure and dynamics of the DNA helix, neglecting the terminal base pairs, are essentially fully converged on the ~1-5 μs timescale. We can now reproducibly converge the structure and dynamics of B-DNA helices, omitting the terminal base pairs, on the μs time scale with both the AMBER and CHARMM C36 nucleic acid force fields. Results from independent ensembles of simulations starting from different initial conditions, when aggregated, match the results from long timescale simulations on the specialized Anton MD engine. With access to large-scale GPU resources or the specialized MD engine "Anton" it is possible for a variety of molecular systems to reproducibly and reliably converge the conformational ensemble of sampled structures. This article is part of a Special Issue entitled: Recent developments of molecular dynamics. Copyright © 2014. Published by Elsevier B.V.
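
    The convergence test mentioned in this abstract compares histograms of principal component projections using the Kullback-Leibler divergence. A minimal sketch of that comparison is given below. It is not the authors' code: the function names, the equal-width binning, and the small pseudocount that avoids empty bins are all assumptions made for illustration.

```python
import math


def histogram(values, lo, hi, bins):
    """Count values into equal-width bins on [lo, hi]; out-of-range values are clipped."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = int((v - lo) / width)
        idx = min(max(idx, 0), bins - 1)  # clip to the first/last bin
        counts[idx] += 1
    return counts


def kl_divergence(p_counts, q_counts, pseudocount=1e-9):
    """D_KL(P || Q) in nats between two histograms that share the same binning.

    The pseudocount (an assumption of this sketch) keeps empty bins from
    producing log(0) or a division by zero.
    """
    n_bins = len(p_counts)
    p_total = sum(p_counts) + pseudocount * n_bins
    q_total = sum(q_counts) + pseudocount * n_bins
    kl = 0.0
    for p_c, q_c in zip(p_counts, q_counts):
        p = (p_c + pseudocount) / p_total
        q = (q_c + pseudocount) / q_total
        kl += p * math.log(p / q)
    return kl
```

    In use, one would project two independent trajectories (or two time windows of one trajectory) onto the same principal component, histogram both with identical bin edges, and watch the divergence decay toward zero as sampling improves.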

  4. Geometric analysis characterizes molecular rigidity in generic and non-generic protein configurations

    PubMed Central

    Budday, Dominik; Leyendecker, Sigrid; van den Bedem, Henry

    2015-01-01

    Proteins operate and interact with partners by dynamically exchanging between functional substates of a conformational ensemble on a rugged free energy landscape. Understanding how these substates are linked by coordinated, collective motions requires exploring a high-dimensional space, which remains a tremendous challenge. While molecular dynamics simulations can provide atomically detailed insight into the dynamics, computational demands to adequately sample conformational ensembles of large biomolecules and their complexes often require tremendous resources. Kinematic models can provide high-level insights into conformational ensembles and molecular rigidity beyond the reach of molecular dynamics by reducing the dimensionality of the search space. Here, we model a protein as a kinematic linkage and present a new geometric method to characterize molecular rigidity from the constraint manifold Q and its tangent space Tq Q at the current configuration q. In contrast to methods based on combinatorial constraint counting, our method is valid for both generic and non-generic, e.g., singular configurations. Importantly, our geometric approach provides an explicit basis for collective motions along floppy modes, resulting in an efficient procedure to probe conformational space. An atomically detailed structural characterization of coordinated, collective motions would allow us to engineer or allosterically modulate biomolecules by selectively stabilizing conformations that enhance or inhibit function with broad implications for human health. PMID:26213417

  5. Geometric analysis characterizes molecular rigidity in generic and non-generic protein configurations

    NASA Astrophysics Data System (ADS)

    Budday, Dominik; Leyendecker, Sigrid; van den Bedem, Henry

    2015-10-01

    Proteins operate and interact with partners by dynamically exchanging between functional substates of a conformational ensemble on a rugged free energy landscape. Understanding how these substates are linked by coordinated, collective motions requires exploring a high-dimensional space, which remains a tremendous challenge. While molecular dynamics simulations can provide atomically detailed insight into the dynamics, computational demands to adequately sample conformational ensembles of large biomolecules and their complexes often require tremendous resources. Kinematic models can provide high-level insights into conformational ensembles and molecular rigidity beyond the reach of molecular dynamics by reducing the dimensionality of the search space. Here, we model a protein as a kinematic linkage and present a new geometric method to characterize molecular rigidity from the constraint manifold Q and its tangent space Tq Q at the current configuration q. In contrast to methods based on combinatorial constraint counting, our method is valid for both generic and non-generic, e.g., singular configurations. Importantly, our geometric approach provides an explicit basis for collective motions along floppy modes, resulting in an efficient procedure to probe conformational space. An atomically detailed structural characterization of coordinated, collective motions would allow us to engineer or allosterically modulate biomolecules by selectively stabilizing conformations that enhance or inhibit function with broad implications for human health.

  6. An Ensemble-Based Protocol for the Computational Prediction of Helix–Helix Interactions in G Protein-Coupled Receptors using Coarse-Grained Molecular Dynamics

    PubMed Central

    2017-01-01

    The accurate identification of the specific points of interaction between G protein-coupled receptor (GPCR) oligomers is essential for the design of receptor ligands targeting oligomeric receptor targets. A coarse-grained molecular dynamics computer simulation approach would provide a compelling means of identifying these specific protein–protein interactions and could be applied both for known oligomers of interest and as a high-throughput screen to identify novel oligomeric targets. However, to be effective, this in silico modeling must provide accurate, precise, and reproducible information. This has been achieved recently in numerous biological systems using an ensemble-based all-atom molecular dynamics approach. In this study, we describe an equivalent methodology for ensemble-based coarse-grained simulations. We report the performance of this method when applied to four different GPCRs known to oligomerize using error analysis to determine the ensemble size and individual replica simulation time required. Our measurements of distance between residues shown to be involved in oligomerization of the fifth transmembrane domain from the adenosine A2A receptor are in very good agreement with the existing biophysical data and provide information about the nature of the contact interface that cannot be determined experimentally. Calculations of distance between rhodopsin, CXCR4, and β1AR transmembrane domains reported to form contact points in homodimers correlate well with the corresponding measurements obtained from experimental structural data, providing an ability to predict contact interfaces computationally. Interestingly, error analysis enables identification of noninteracting regions. Our results confirm that GPCR interactions can be reliably predicted using this novel methodology. PMID:28383913

  7. Uncertainty Quantification in Alchemical Free Energy Methods.

    PubMed

    Bhati, Agastya P; Wan, Shunzhou; Hu, Yuan; Sherborne, Brad; Coveney, Peter V

    2018-06-12

    Alchemical free energy methods have gained much importance recently from several reports of improved ligand-protein binding affinity predictions based on their implementation using molecular dynamics simulations. A large number of variants of such methods implementing different accelerated sampling techniques and free energy estimators are available, each claimed to be better than the others in its own way. However, the key features of reproducibility and quantification of associated uncertainties in such methods have barely been discussed. Here, we apply a systematic protocol for uncertainty quantification to a number of popular alchemical free energy methods, covering both absolute and relative free energy predictions. We show that a reliable measure of error estimation is provided by ensemble simulation (an ensemble of independent MD simulations), which applies irrespective of the free energy method. The need to use ensemble methods is fundamental and holds regardless of the duration of time of the molecular dynamics simulations performed.
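
    The idea of estimating uncertainty from an ensemble of independent replicas can be illustrated with a short sketch. The abstract does not state which estimator the authors use, so the standard error of the mean and the bootstrap resampling below are assumptions, as are the function names and the default of 2000 resamples.

```python
import math
import random


def ensemble_estimate(replica_values):
    """Return (mean, standard error of the mean) over independent replica results."""
    n = len(replica_values)
    mean = sum(replica_values) / n
    # Sample variance with the n - 1 (Bessel) correction.
    var = sum((x - mean) ** 2 for x in replica_values) / (n - 1)
    return mean, math.sqrt(var / n)


def bootstrap_sem(replica_values, n_resamples=2000, seed=0):
    """Bootstrap estimate of the standard error of the ensemble mean.

    Replicas are resampled with replacement; the spread of the resampled
    means approximates the uncertainty of the ensemble average.
    """
    rng = random.Random(seed)
    n = len(replica_values)
    means = []
    for _ in range(n_resamples):
        sample = [replica_values[rng.randrange(n)] for _ in range(n)]
        means.append(sum(sample) / n)
    grand = sum(means) / n_resamples
    return math.sqrt(sum((m - grand) ** 2 for m in means) / (n_resamples - 1))
```

    Each entry of `replica_values` would be one free energy estimate from one independent simulation. Both functions assume at least two replicas.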

  8. Predicting drug-induced liver injury using ensemble learning methods and molecular fingerprints.

    PubMed

    Ai, Haixin; Chen, Wen; Zhang, Li; Huang, Liangchao; Yin, Zimo; Hu, Huan; Zhao, Qi; Zhao, Jian; Liu, Hongsheng

    2018-05-21

    Drug-induced liver injury (DILI) is a major safety concern in the drug-development process, and various methods have been proposed to predict the hepatotoxicity of compounds during the early stages of drug trials. In this study, we developed an ensemble model using three machine learning algorithms and 12 molecular fingerprints from a dataset containing 1,241 diverse compounds. The ensemble model achieved an average accuracy of 71.1±2.6%, sensitivity of 79.9±3.6%, specificity of 60.3±4.8%, and area under the receiver operating characteristic curve (AUC) of 0.764±0.026 in five-fold cross-validation and an accuracy of 84.3%, sensitivity of 86.9%, specificity of 75.4%, and AUC of 0.904 in an external validation dataset of 286 compounds collected from the Liver Toxicity Knowledge Base (LTKB). Compared with previous methods, the ensemble model achieved relatively high accuracy and sensitivity. We also identified several substructures related to DILI. In addition, we provide a web server offering access to our models (http://ccsipb.lnu.edu.cn/toxicity/HepatoPred-EL/).
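
    The accuracy, sensitivity, and specificity reported in this abstract follow from a binary confusion matrix. The sketch below computes them for hypothetical labels and also shows one common way to combine several base classifiers, majority voting. The abstract does not say how the authors combined their models, so the voting rule is an assumption and the function names are illustrative only.

```python
def majority_vote(votes_per_model):
    """Combine 0/1 predictions from several models by strict majority.

    votes_per_model[m][i] is model m's prediction for sample i. A sample is
    labeled 1 only when more than half of the models predict 1.
    """
    n_models = len(votes_per_model)
    n_samples = len(votes_per_model[0])
    return [
        1 if 2 * sum(model[i] for model in votes_per_model) > n_models else 0
        for i in range(n_samples)
    ]


def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity for 0/1 labels.

    Assumes both classes occur in y_true, otherwise a ratio is undefined.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }
```

    With an odd number of base models, as in the three-algorithm setup described above, a strict majority never ties.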

  9. Investigating energy-based pool structure selection in the structure ensemble modeling with experimental distance constraints: The example from a multidomain protein Pub1.

    PubMed

    Zhu, Guanhua; Liu, Wei; Bao, Chenglong; Tong, Dudu; Ji, Hui; Shen, Zuowei; Yang, Daiwen; Lu, Lanyuan

    2018-05-01

    The structural variations of multidomain proteins with flexible parts mediate many biological processes, and a structure ensemble can be determined by selecting a weighted combination of representative structures from a simulated structure pool, producing the best fit to experimental constraints such as interatomic distance. In this study, a hybrid structure-based and physics-based atomistic force field with an efficient sampling strategy is adopted to simulate a model di-domain protein against experimental paramagnetic relaxation enhancement (PRE) data that correspond to distance constraints. The molecular dynamics simulations produce a wide range of conformations depicted on a protein energy landscape. Subsequently, a conformational ensemble recovered with low-energy structures and the minimum-size restraint is identified in good agreement with experimental PRE rates, and the result is also supported by chemical shift perturbations and small-angle X-ray scattering data. It is illustrated that the regularizations of energy and ensemble-size prevent an arbitrary interpretation of protein conformations. Moreover, energy is found to serve as a critical control to refine the structure pool and prevent data overfitting, because the absence of energy regularization exposes ensemble construction to the noise from high-energy structures and causes a more ambiguous representation of protein conformations. Finally, we perform structure-ensemble optimizations with a topology-based structure pool, to enhance the understanding on the ensemble results from different sources of pool candidates. © 2018 Wiley Periodicals, Inc.

  10. Grand canonical ensemble Monte Carlo simulation of the dCpG/proflavine crystal hydrate.

    PubMed

    Resat, H; Mezei, M

    1996-09-01

    The grand canonical ensemble Monte Carlo molecular simulation method is used to investigate hydration patterns in the crystal hydrate structure of the dCpG/proflavine intercalated complex. The objective of this study is to show by example that the recently advocated grand canonical ensemble simulation is a computationally efficient method for determining the positions of the hydrating water molecules in protein and nucleic acid structures. A detailed molecular simulation convergence analysis and an analogous comparison of the theoretical results with experiments clearly show that the grand ensemble simulations can be far more advantageous than the comparable canonical ensemble simulations.

  11. CABS-flex: server for fast simulation of protein structure fluctuations

    PubMed Central

    Jamroz, Michal; Kolinski, Andrzej; Kmiecik, Sebastian

    2013-01-01

    The CABS-flex server (http://biocomp.chem.uw.edu.pl/CABSflex) implements a CABS-model–based protocol for the fast simulation of near-native dynamics of globular proteins. In this application, the CABS model was shown to be a computationally efficient alternative to all-atom molecular dynamics, a classical simulation approach. The simulation method has been validated on a large set of molecular dynamics simulation data. Using a single input (user-provided file in PDB format), the CABS-flex server outputs an ensemble of protein models (in all-atom PDB format) reflecting the flexibility of the input structure, together with the accompanying analysis (residue mean-square-fluctuation profile and others). The ensemble of predicted models can be used in structure-based studies of protein functions and interactions. PMID:23658222

  12. CABS-flex: Server for fast simulation of protein structure fluctuations.

    PubMed

    Jamroz, Michal; Kolinski, Andrzej; Kmiecik, Sebastian

    2013-07-01

    The CABS-flex server (http://biocomp.chem.uw.edu.pl/CABSflex) implements a CABS-model-based protocol for the fast simulation of near-native dynamics of globular proteins. In this application, the CABS model was shown to be a computationally efficient alternative to all-atom molecular dynamics, a classical simulation approach. The simulation method has been validated on a large set of molecular dynamics simulation data. Using a single input (user-provided file in PDB format), the CABS-flex server outputs an ensemble of protein models (in all-atom PDB format) reflecting the flexibility of the input structure, together with the accompanying analysis (residue mean-square-fluctuation profile and others). The ensemble of predicted models can be used in structure-based studies of protein functions and interactions.

  13. Structural, electronic, and dynamical properties of liquid water by ab initio molecular dynamics based on SCAN functional within the canonical ensemble

    NASA Astrophysics Data System (ADS)

    Zheng, Lixin; Chen, Mohan; Sun, Zhaoru; Ko, Hsin-Yu; Santra, Biswajit; Dhuvad, Pratikkumar; Wu, Xifan

    2018-04-01

    We perform ab initio molecular dynamics (AIMD) simulation of liquid water in the canonical ensemble at ambient conditions using the strongly constrained and appropriately normed (SCAN) meta-generalized-gradient approximation (GGA) functional approximation and carry out systematic comparisons with the results obtained from the GGA-level Perdew-Burke-Ernzerhof (PBE) functional and Tkatchenko-Scheffler van der Waals (vdW) dispersion correction inclusive PBE functional. We analyze various properties of liquid water including radial distribution functions, oxygen-oxygen-oxygen triplet angular distribution, tetrahedrality, hydrogen bonds, diffusion coefficients, ring statistics, density of states, band gaps, and dipole moments. We find that the SCAN functional is generally more accurate than the other two functionals for liquid water by not only capturing the intermediate-range vdW interactions but also mitigating the overly strong hydrogen bonds prescribed in PBE simulations. We also compare the results of SCAN-based AIMD simulations in the canonical and isothermal-isobaric ensembles. Our results suggest that SCAN provides a reliable description for most structural, electronic, and dynamical properties in liquid water.

  14. Novel Breast Cancer Therapeutics Based on Bacterial Cupredoxin

    DTIC Science & Technology

    2008-09-01

    M. and Lim, C. (1999) Exploring the dynamic information content of a protein NMR structure: comparison of a molecular dynamics simulation with the...crowding has structural effects on the folded ensemble of polypeptides. energy landscape theory excluded volume effect molecular simulations protein... molecular simulations (51). Thermo- dynamic properties such as the radius of gyration (Rg), shape parameters ( and S) (11), and the fraction of native

  15. Green-Kubo relations for the viscosity of biaxial nematic liquid crystals

    NASA Astrophysics Data System (ADS)

    Sarman, Sten

    1996-09-01

    We derive Green-Kubo relations for the viscosities of a biaxial nematic liquid crystal. In this system there are seven shear viscosities, three twist viscosities, and three cross coupling coefficients between the antisymmetric strain rate and the symmetric traceless pressure tensor. According to the Onsager reciprocity relations these couplings are equal to the cross couplings between the symmetric traceless strain rate and the antisymmetric pressure. Our method is based on a comparison of the microscopic linear response generated by the SLLOD equations of motion for planar Couette flow (so named because of their close connection to the Doll's tensor Hamiltonian) and the macroscopic linear phenomenological relations between the pressure tensor and the strain rate. In order to obtain simple Green-Kubo relations we employ an equilibrium ensemble where the angular velocities of the directors are identically zero. This is achieved by adding constraint torques to the equations for the molecular angular accelerations. One finds that all the viscosity coefficients can be expressed as linear combinations of time correlation function integrals (TCFIs). This is much simpler compared to the expressions in the conventional canonical ensemble, where the viscosities are complicated rational functions of the TCFIs. The reason for this is, that in the constrained angular velocity ensemble, the thermodynamic forces are given external parameters whereas the thermodynamic fluxes are ensemble averages of phase functions. This is not the case in the canonical ensemble. The simplest way of obtaining numerical estimates of viscosity coefficients of a particular molecular model system is to evaluate these fluctuation relations by equilibrium molecular dynamics simulations.
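
    For orientation, the familiar Green-Kubo relation for the shear viscosity of an isotropic fluid is shown below. This is the standard textbook form, not one of the biaxial nematic relations derived in the paper. The notation is an assumption of this note: V is the system volume, T the temperature, k_B the Boltzmann constant, and P_xy an off-diagonal component of the pressure tensor.

```latex
\eta = \frac{V}{k_B T} \int_0^{\infty}
       \left\langle P_{xy}(0)\, P_{xy}(t) \right\rangle \, dt
```

    The integral on the right is a time correlation function integral of the kind the abstract calls a TCFI. In the constrained ensemble described above, each viscosity coefficient is reported to be a linear combination of such integrals.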

  16. Supramolecular chemistry-general principles and selected examples from anion recognition and metallosupramolecular chemistry.

    PubMed

    Albrecht, Markus

    2007-12-01

    This review gives an introduction to supramolecular chemistry, describing in the first part general principles and focusing on terms like noncovalent interaction, molecular recognition, self-assembly, and supramolecular function. In the second part, these are illustrated by simple examples from our laboratories. Supramolecular chemistry is the science that bridges the gap between the world of molecules and nanotechnology. In supramolecular chemistry, noncovalent interactions occur between molecular building blocks, which by molecular recognition and self-assembly form (functional) supramolecular entities. It is also termed the "chemistry of the noncovalent bond." Molecular recognition relies on geometrical complementarity according to the "key-and-lock" principle, with nonshape-dependent effects, e.g., solvation, also being highly influential. Self-assembly leads to the formation of well-defined aggregates. The overall structure of the target ensemble is controlled by the symmetry features of the building blocks. Finally, the aggregates can possess special properties or supramolecular functions, which are found only in the ensemble and not in the participating molecules. This review gives an introduction to supramolecular chemistry and illustrates the fundamental principles by recent examples from our group.

  17. Grand canonical ensemble Monte Carlo simulation of the dCpG/proflavine crystal hydrate.

    PubMed Central

    Resat, H; Mezei, M

    1996-01-01

    The grand canonical ensemble Monte Carlo molecular simulation method is used to investigate hydration patterns in the crystal hydrate structure of the dCpG/proflavine intercalated complex. The objective of this study is to show by example that the recently advocated grand canonical ensemble simulation is a computationally efficient method for determining the positions of the hydrating water molecules in protein and nucleic acid structures. A detailed molecular simulation convergence analysis and an analogous comparison of the theoretical results with experiments clearly show that the grand ensemble simulations can be far more advantageous than the comparable canonical ensemble simulations. PMID:8873992

  18. Motor-motor interactions in ensembles of muscle myosin: using theory to connect single molecule to ensemble measurements

    NASA Astrophysics Data System (ADS)

    Walcott, Sam

    2013-03-01

    Interactions between the proteins actin and myosin drive muscle contraction. Properties of a single myosin interacting with an actin filament are largely known, but a trillion myosins work together in muscle. We are interested in how single-molecule properties relate to ensemble function. Myosin's reaction rates depend on force, so ensemble models keep track of both molecular state and force on each molecule. These models make subtle predictions, e.g. that myosin, when part of an ensemble, moves actin faster than when isolated. This acceleration arises because forces between molecules speed reaction kinetics. Experiments support this prediction and allow parameter estimates. A model based on this analysis describes experiments from single molecule to ensemble. In vivo, actin is regulated by proteins that, when present, cause the binding of one myosin to speed the binding of its neighbors; binding becomes cooperative. Although such interactions preclude the mean field approximation, a set of linear ODEs describes these ensembles under simplified experimental conditions. In these experiments cooperativity is strong, with the binding of one molecule affecting ten neighbors on either side. We progress toward a description of myosin ensembles under physiological conditions.

  19. Wang-Landau Reaction Ensemble Method: Simulation of Weak Polyelectrolytes and General Acid-Base Reactions.

    PubMed

    Landsgesell, Jonas; Holm, Christian; Smiatek, Jens

    2017-02-14

    We present a novel method for the study of weak polyelectrolytes and general acid-base reactions in molecular dynamics and Monte Carlo simulations. The approach combines the advantages of the reaction ensemble and the Wang-Landau sampling method. Deprotonation and protonation reactions are simulated explicitly with the help of the reaction ensemble method, while the accurate sampling of the corresponding phase space is achieved by the Wang-Landau approach. The combination of both techniques provides a sufficient statistical accuracy such that meaningful estimates for the density of states and the partition sum can be obtained. With regard to these estimates, several thermodynamic observables like the heat capacity or reaction free energies can be calculated. We demonstrate that the computation times for the calculation of titration curves with a high statistical accuracy can be significantly decreased when compared to the original reaction ensemble method. The applicability of our approach is validated by the study of weak polyelectrolytes and their thermodynamic properties.

  20. Reproducing the Ensemble Average Polar Solvation Energy of a Protein from a Single Structure: Gaussian-Based Smooth Dielectric Function for Macromolecular Modeling.

    PubMed

    Chakravorty, Arghya; Jia, Zhe; Li, Lin; Zhao, Shan; Alexov, Emil

    2018-02-13

    Typically, the ensemble average polar component of solvation energy (ΔG polar solv ) of a macromolecule is computed using molecular dynamics (MD) or Monte Carlo (MC) simulations to generate a conformational ensemble, and then a single/rigid-conformation solvation energy calculation is performed on each snapshot. The primary objective of this work is to demonstrate that a Poisson-Boltzmann (PB)-based approach using a Gaussian-based smooth dielectric function for macromolecular modeling, previously developed by us (Li et al. J. Chem. Theory Comput. 2013, 9 (4), 2126-2136), can reproduce that ensemble average (ΔG polar solv ) of a protein from a single structure. We show that the Gaussian-based dielectric model reproduces the ensemble average ΔG polar solv (⟨ΔG polar solv ⟩) from an energy-minimized structure of a protein regardless of the minimization environment (structure minimized in vacuo, implicit or explicit waters, or crystal structure); the best case, however, is when it is paired with an in vacuo-minimized structure. In other minimization environments (implicit or explicit waters or crystal structure), the traditional two-dielectric model can still be selected with which the model produces correct solvation energies. Our observations from this work reflect how the ability to appropriately mimic the motion of residues, especially the salt bridge residues, influences a dielectric model's ability to reproduce the ensemble average value of polar solvation free energy from a single in vacuo-minimized structure.

  1. Adaptive sampling strategies with high-throughput molecular dynamics

    NASA Astrophysics Data System (ADS)

    Clementi, Cecilia

    Despite recent significant hardware and software developments, the complete thermodynamic and kinetic characterization of large macromolecular complexes by molecular simulations still presents significant challenges. The high dimensionality of these systems and the complexity of the associated potential energy surfaces (creating multiple metastable regions connected by high free energy barriers) usually do not allow adequate sampling of the relevant regions of their configurational space by means of a single, long Molecular Dynamics (MD) trajectory. Several different approaches have been proposed to tackle this sampling problem. We focus on the development of ensemble simulation strategies, where data from a large number of weakly coupled simulations are integrated to explore the configurational landscape of a complex system more efficiently. Ensemble methods are of increasing interest as the hardware roadmap is now mostly based on increasing core counts, rather than clock speeds. The main challenge in the development of an ensemble approach for efficient sampling is in the design of strategies to adaptively distribute the trajectories over the relevant regions of the system's configurational space, without using any a priori information on the system's global properties. We will discuss the definition of smart adaptive sampling approaches that can redirect computational resources towards unexplored yet relevant regions. Our approaches are based on new developments in dimensionality reduction for high dimensional dynamical systems, and optimal redistribution of resources. NSF CHE-1152344, NSF CHE-1265929, Welch Foundation C-1570.

  2. Unbiased Rare Event Sampling in Spatial Stochastic Systems Biology Models Using a Weighted Ensemble of Trajectories

    PubMed Central

    Donovan, Rory M.; Tapia, Jose-Juan; Sullivan, Devin P.; Faeder, James R.; Murphy, Robert F.; Dittrich, Markus; Zuckerman, Daniel M.

    2016-01-01

    The long-term goal of connecting scales in biological simulation can be facilitated by scale-agnostic methods. We demonstrate that the weighted ensemble (WE) strategy, initially developed for molecular simulations, applies effectively to spatially resolved cell-scale simulations. The WE approach runs an ensemble of parallel trajectories with assigned weights and uses a statistical resampling strategy of replicating and pruning trajectories to focus computational effort on difficult-to-sample regions. The method can also generate unbiased estimates of non-equilibrium and equilibrium observables, sometimes with significantly less aggregate computing time than would be possible using standard parallelization. Here, we use WE to orchestrate particle-based kinetic Monte Carlo simulations, which include spatial geometry (e.g., of organelles, plasma membrane) and biochemical interactions among mobile molecular species. We study a series of models exhibiting spatial, temporal and biochemical complexity and show that although WE has important limitations, it can achieve performance significantly exceeding standard parallel simulation—by orders of magnitude for some observables. PMID:26845334
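
    The resampling step described in this abstract, replicating and pruning weighted trajectories, can be sketched for a single bin as follows. This is an illustrative simplification rather than the authors' implementation: the function name, the choice to split the heaviest walker and merge the two lightest, and the tuple representation of a walker are all assumptions. The property it demonstrates is that splitting and merging change the number of trajectories while preserving the total probability weight.

```python
import random


def resample_bin(walkers, target, rng):
    """Resample one bin to exactly `target` walkers, preserving total weight.

    walkers: non-empty list of (state, weight) tuples with positive weights.
    rng: a random.Random instance used to pick the survivor of each merge.
    """
    walkers = list(walkers)
    # Split: replace the heaviest walker by two copies with half the weight each.
    while len(walkers) < target:
        walkers.sort(key=lambda w: w[1])
        state, weight = walkers.pop()
        walkers.extend([(state, weight / 2), (state, weight / 2)])
    # Merge: combine the two lightest walkers. The survivor is chosen with
    # probability proportional to its weight and inherits the summed weight.
    while len(walkers) > target:
        walkers.sort(key=lambda w: w[1])
        (s1, w1), (s2, w2) = walkers[0], walkers[1]
        survivor = s1 if rng.random() < w1 / (w1 + w2) else s2
        walkers = [(survivor, w1 + w2)] + walkers[2:]
    return walkers
```

    A full weighted ensemble run would alternate short bursts of dynamics with this resampling applied to every occupied bin. Choosing the merge survivor in proportion to weight is what keeps the resampled ensemble statistically unbiased.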

  3. Combining Rosetta with molecular dynamics (MD): A benchmark of the MD-based ensemble protein design.

    PubMed

    Ludwiczak, Jan; Jarmula, Adam; Dunin-Horkawicz, Stanislaw

    2018-07-01

    Computational protein design is a set of procedures for computing amino acid sequences that will fold into a specified structure. Rosetta Design, a commonly used software for protein design, allows for the effective identification of sequences compatible with a given backbone structure, while molecular dynamics (MD) simulations can thoroughly sample near-native conformations. We benchmarked a procedure in which Rosetta design is started on MD-derived structural ensembles and showed that such a combined approach generates 20-30% more diverse sequences than currently available methods with only a slight increase in computation time. Importantly, the increase in diversity is achieved without a loss in the quality of the designed sequences assessed by their resemblance to natural sequences. We demonstrate that the MD-based procedure is also applicable to de novo design tasks started from backbone structures without any sequence information. In addition, we implemented a protocol that can be used to assess the stability of designed models and to select the best candidates for experimental validation. In sum our results demonstrate that the MD ensemble-based flexible backbone design can be a viable method for protein design, especially for tasks that require a large pool of diverse sequences. Copyright © 2018 Elsevier Inc. All rights reserved.

  4. Force-momentum-based self-guided Langevin dynamics: A rapid sampling method that approaches the canonical ensemble

    NASA Astrophysics Data System (ADS)

    Wu, Xiongwu; Brooks, Bernard R.

    2011-11-01

    The self-guided Langevin dynamics (SGLD) is a method to accelerate conformational searching. This method is unique in the way that it selectively enhances and suppresses molecular motions based on their frequency to accelerate conformational searching without modifying energy surfaces or raising temperatures. It has been applied to studies of many long time scale events, such as protein folding. Recent progress in the understanding of the conformational distribution in SGLD simulations makes SGLD also an accurate method for quantitative studies. The SGLD partition function provides a way to convert the SGLD conformational distribution to the canonical ensemble distribution and to calculate ensemble average properties through reweighting. Based on the SGLD partition function, this work presents a force-momentum-based self-guided Langevin dynamics (SGLDfp) simulation method to directly sample the canonical ensemble. This method includes interaction forces in its guiding force to compensate the perturbation caused by the momentum-based guiding force so that it can approximately sample the canonical ensemble. Using several example systems, we demonstrate that SGLDfp simulations can approximately maintain the canonical ensemble distribution and significantly accelerate conformational searching. With optimal parameters, SGLDfp and SGLD simulations can cross energy barriers of more than 15 kT and 20 kT, respectively, at similar rates for LD simulations to cross energy barriers of 10 kT. The SGLDfp method is size extensive and works well for large systems. For studies where preserving accessible conformational space is critical, such as free energy calculations and protein folding studies, SGLDfp is an efficient approach to search and sample the conformational space.
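
    Computing ensemble averages through reweighting, as mentioned in this abstract, amounts to a weighted mean over sampled frames. The sketch below shows only that generic step. The SGLD-specific weight expression is not given in the abstract and is not reproduced here, so the function takes per-frame log weights as an input. The function name and the log-weight convention are assumptions.

```python
import math


def reweighted_average(observables, log_weights):
    """Weighted ensemble average <A> = sum(w_i * A_i) / sum(w_i).

    log_weights holds ln(w_i) for each frame. Subtracting the maximum before
    exponentiating avoids overflow without changing the normalized result.
    """
    shift = max(log_weights)
    weights = [math.exp(lw - shift) for lw in log_weights]
    norm = sum(weights)
    return sum(a * w for a, w in zip(observables, weights)) / norm
```

    When all log weights are equal, the result reduces to the plain arithmetic mean, which is the expected behavior for frames already drawn from the target ensemble.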

  5. Molecular dynamic simulations of selective self-diffusion of CH4/CO2/H2O/N2 in coal

    NASA Astrophysics Data System (ADS)

    Song, Y.; Jiang, B.; Li, F. L.

    2017-06-01

    The self-diffusion coefficients (D) of CH4/CO2/H2O/N2 at a relatively broad range of temperatures (298.15∼458.15 K) and pressures (1∼6 MPa) under the NPT, NPH, NVE, and NVT ensembles were obtained after the calculations of molecular mechanics (MM), annealing kinetics (AK), grand canonical Monte Carlo (GCMC), and molecular dynamics (MD) based on the Wiser bituminous coal model (WM). The Ds of the adsorbates at the saturated adsorption configurations are D CH4418K. The average swelling ratios manifest as H2O (14.7∼35.18%) > CO2 (13.38∼32.25%) > CH4 (15.35∼23.71%) > N2 (11.47∼22.14%) (NPH, 1∼6 MPa). There exist differences in D, swelling ratios, and E among the various ensembles, indicating that the selection of ensembles has an important influence on the MD calculations for self-diffusion coefficients.

  6. Effects of ensembles on methane hydrate nucleation kinetics.

    PubMed

    Zhang, Zhengcai; Liu, Chan-Juan; Walsh, Matthew R; Guo, Guang-Jun

    2016-06-21

    By performing molecular dynamics simulations to form a hydrate with a methane nano-bubble in liquid water at 250 K and 50 MPa, we report how different ensembles, such as the NPT, NVT, and NVE ensembles, affect the nucleation kinetics of the methane hydrate. The nucleation trajectories are monitored using the face-saturated incomplete cage analysis (FSICA) and the mutually coordinated guest (MCG) order parameter (OP). The nucleation rate and the critical nucleus are obtained using the mean first-passage time (MFPT) method based on the FS cages and the MCG-1 OPs, respectively. The fitting results of MFPT show that hydrate nucleation and growth are coupled together, consistent with the cage adsorption hypothesis which emphasizes that the cage adsorption of methane is a mechanism for both hydrate nucleation and growth. For the three different ensembles, the hydrate nucleation rate is quantitatively ordered as follows: NPT > NVT > NVE, while the sequence of hydrate crystallinity is exactly reversed. However, the largest size of the critical nucleus appears in the NVT ensemble, rather than in the NVE ensemble. These results are helpful for choosing a suitable ensemble when studying hydrate formation via computer simulations, and emphasize the importance of the degree of order of the critical nucleus.

  7. Ensemble pharmacophore meets ensemble docking: a novel screening strategy for the identification of RIPK1 inhibitors

    NASA Astrophysics Data System (ADS)

    Fayaz, S. M.; Rajanikant, G. K.

    2014-07-01

    Programmed cell death has been a fascinating area of research since it throws new challenges and questions in spite of the tremendous ongoing research in this field. Recently, necroptosis, a programmed form of necrotic cell death, has been implicated in many diseases including neurological disorders. Receptor interacting serine/threonine protein kinase 1 (RIPK1) is an important regulatory protein involved in the necroptosis and inhibition of this protein is essential to stop necroptotic process and eventually cell death. Current structure-based virtual screening methods involve a wide range of strategies and recently, considering the multiple protein structures for pharmacophore extraction has been emphasized as a way to improve the outcome. However, using the pharmacophoric information completely during docking is very important. Further, in such methods, using the appropriate protein structures for docking is desirable. If not, potential compound hits, obtained through pharmacophore-based screening, may not have correct ranks and scores after docking. Therefore, a comprehensive integration of different ensemble methods is essential, which may provide better virtual screening results. In this study, dual ensemble screening, a novel computational strategy was used to identify diverse and potent inhibitors against RIPK1. All the pharmacophore features present in the binding site were captured using both the apo and holo protein structures and an ensemble pharmacophore was built by combining these features. This ensemble pharmacophore was employed in pharmacophore-based screening of ZINC database. The compound hits, thus obtained, were subjected to ensemble docking. The leads acquired through docking were further validated through feature evaluation and molecular dynamics simulation.

  8. Bayesian ensemble refinement by replica simulations and reweighting.

    PubMed

    Hummer, Gerhard; Köfinger, Jürgen

    2015-12-28

    We describe different Bayesian ensemble refinement methods, examine their interrelation, and discuss their practical application. With ensemble refinement, the properties of dynamic and partially disordered (bio)molecular structures can be characterized by integrating a wide range of experimental data, including measurements of ensemble-averaged observables. We start from a Bayesian formulation in which the posterior is a functional that ranks different configuration space distributions. By maximizing this posterior, we derive an optimal Bayesian ensemble distribution. For discrete configurations, this optimal distribution is identical to that obtained by the maximum entropy "ensemble refinement of SAXS" (EROS) formulation. Bayesian replica ensemble refinement enhances the sampling of relevant configurations by imposing restraints on averages of observables in coupled replica molecular dynamics simulations. We show that the strength of the restraints should scale linearly with the number of replicas to ensure convergence to the optimal Bayesian result in the limit of infinitely many replicas. In the "Bayesian inference of ensembles" method, we combine the replica and EROS approaches to accelerate the convergence. An adaptive algorithm can be used to sample directly from the optimal ensemble, without replicas. We discuss the incorporation of single-molecule measurements and dynamic observables such as relaxation parameters. The theoretical analysis of different Bayesian ensemble refinement approaches provides a basis for practical applications and a starting point for further investigations.
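
The reweighting idea can be illustrated with a minimal sketch for a single ensemble-averaged observable. It assumes an EROS-style objective of the form theta * S_KL + chi^2 / 2 with a Gaussian error sigma; the symbol names, the toy data, and the bisection solver are illustrative assumptions, not the authors' implementation. Minimizing that objective gives weights proportional to w0_i * exp(-lam * f_i), with lam fixed by the self-consistency condition lam * theta * sigma^2 = <f> - target.

```python
import math

def reweight(f, w0, lam):
    """Normalized weights w_i proportional to w0_i * exp(-lam * f_i)."""
    shift = max(-lam * fi for fi in f)  # shift exponents for numerical stability
    raw = [w * math.exp(-lam * fi - shift) for fi, w in zip(f, w0)]
    z = sum(raw)
    return [r / z for r in raw]

def average(f, w):
    """Weighted ensemble average of the observable."""
    return sum(fi * wi for fi, wi in zip(f, w))

def refine(f, w0, target, sigma, theta, lo=-50.0, hi=50.0, iters=200):
    """Bisect on lam until lam * theta * sigma**2 == <f>_lam - target.

    The residual g(lam) is strictly decreasing in lam, so the root is unique
    provided it lies inside the (assumed) bracket [lo, hi].
    """
    def g(lam):
        return average(f, reweight(f, w0, lam)) - target - lam * theta * sigma ** 2
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return reweight(f, w0, lam), lam

# Toy data (hypothetical): four configurations with observable values f_i,
# uniform prior weights, and an experimental average of 3.0 +/- 0.1.
f_values = [1.0, 2.0, 3.0, 4.0]
prior = [0.25, 0.25, 0.25, 0.25]
target, sigma, theta = 3.0, 0.1, 1.0

weights, lam = refine(f_values, prior, target, sigma, theta)
refined_avg = average(f_values, weights)
print("refined weights:", [round(w, 3) for w in weights])
print("prior average:", average(f_values, prior), "refined average:", round(refined_avg, 3))
```

A larger theta expresses more confidence in the prior ensemble and keeps the refined weights closer to the prior; a smaller theta pulls the refined average closer to the experimental value.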

  9. Bayesian ensemble refinement by replica simulations and reweighting

    NASA Astrophysics Data System (ADS)

    Hummer, Gerhard; Köfinger, Jürgen

    2015-12-01

    We describe different Bayesian ensemble refinement methods, examine their interrelation, and discuss their practical application. With ensemble refinement, the properties of dynamic and partially disordered (bio)molecular structures can be characterized by integrating a wide range of experimental data, including measurements of ensemble-averaged observables. We start from a Bayesian formulation in which the posterior is a functional that ranks different configuration space distributions. By maximizing this posterior, we derive an optimal Bayesian ensemble distribution. For discrete configurations, this optimal distribution is identical to that obtained by the maximum entropy "ensemble refinement of SAXS" (EROS) formulation. Bayesian replica ensemble refinement enhances the sampling of relevant configurations by imposing restraints on averages of observables in coupled replica molecular dynamics simulations. We show that the strength of the restraints should scale linearly with the number of replicas to ensure convergence to the optimal Bayesian result in the limit of infinitely many replicas. In the "Bayesian inference of ensembles" method, we combine the replica and EROS approaches to accelerate the convergence. An adaptive algorithm can be used to sample directly from the optimal ensemble, without replicas. We discuss the incorporation of single-molecule measurements and dynamic observables such as relaxation parameters. The theoretical analysis of different Bayesian ensemble refinement approaches provides a basis for practical applications and a starting point for further investigations.

  10. Nullspace Sampling with Holonomic Constraints Reveals Molecular Mechanisms of Protein Gαs.

    PubMed

    Pachov, Dimitar V; van den Bedem, Henry

    2015-07-01

    Proteins perform their function or interact with partners by exchanging between conformational substates on a wide range of spatiotemporal scales. Structurally characterizing these exchanges is challenging, both experimentally and computationally. Large, diffusional motions are often on timescales that are difficult to access with molecular dynamics simulations, especially for large proteins and their complexes. The low frequency modes of normal mode analysis (NMA) report on molecular fluctuations associated with biological activity. However, NMA is limited to a second order expansion about a minimum of the potential energy function, which limits opportunities to observe diffusional motions. By contrast, kino-geometric conformational sampling (KGS) permits large perturbations while maintaining the exact geometry of explicit conformational constraints, such as hydrogen bonds. Here, we extend KGS and show that a conformational ensemble of the α subunit Gαs of heterotrimeric stimulatory protein Gs exhibits structural features implicated in its activation pathway. Activation of protein Gs by G protein-coupled receptors (GPCRs) is associated with GDP release and large conformational changes of its α-helical domain. Our method reveals a coupled α-helical domain opening motion while, simultaneously, Gαs helix α5 samples an activated conformation. These motions are moderated in the activated state. The motion centers on a dynamic hub near the nucleotide-binding site of Gαs, and radiates to helix α4. We find that comparative NMA-based ensembles underestimate the amplitudes of the motion. Additionally, the ensembles fall short in predicting the accepted direction of the full activation pathway. 
    Taken together, our findings suggest that nullspace sampling with explicit, holonomic constraints yields ensembles that illuminate molecular mechanisms involved in GDP release and protein Gs activation, and further establish conformational coupling between key structural elements of Gαs.

  11. Nullspace Sampling with Holonomic Constraints Reveals Molecular Mechanisms of Protein Gαs

    PubMed Central

    Pachov, Dimitar V.; van den Bedem, Henry

    2015-01-01

    Proteins perform their function or interact with partners by exchanging between conformational substates on a wide range of spatiotemporal scales. Structurally characterizing these exchanges is challenging, both experimentally and computationally. Large, diffusional motions are often on timescales that are difficult to access with molecular dynamics simulations, especially for large proteins and their complexes. The low frequency modes of normal mode analysis (NMA) report on molecular fluctuations associated with biological activity. However, NMA is limited to a second order expansion about a minimum of the potential energy function, which limits opportunities to observe diffusional motions. By contrast, kino-geometric conformational sampling (KGS) permits large perturbations while maintaining the exact geometry of explicit conformational constraints, such as hydrogen bonds. Here, we extend KGS and show that a conformational ensemble of the α subunit Gαs of heterotrimeric stimulatory protein Gs exhibits structural features implicated in its activation pathway. Activation of protein Gs by G protein-coupled receptors (GPCRs) is associated with GDP release and large conformational changes of its α-helical domain. Our method reveals a coupled α-helical domain opening motion while, simultaneously, Gαs helix α5 samples an activated conformation. These motions are moderated in the activated state. The motion centers on a dynamic hub near the nucleotide-binding site of Gαs, and radiates to helix α4. We find that comparative NMA-based ensembles underestimate the amplitudes of the motion. Additionally, the ensembles fall short in predicting the accepted direction of the full activation pathway. 
    Taken together, our findings suggest that nullspace sampling with explicit, holonomic constraints yields ensembles that illuminate molecular mechanisms involved in GDP release and protein Gs activation, and further establish conformational coupling between key structural elements of Gαs. PMID:26218073

  12. In Silico Design of Smart Binders to Anthrax PA

    DTIC Science & Technology

    2012-09-01

    nanosecond(ns) molecular dynamics simulation in the NPT ensemble (constant particle number, pressure, and temperature) at 300K, with the CHARMM force...protective antigen (PA). Before the docking runs, the DS23 peptide was simulated using molecular dynamics to generate an ensemble of structures...structure), we do not see a large amount of structural change when using molecular dynamics after Rosetta docking. We note that this RMSD does not take

  13. Knowledge-Based Methods To Train and Optimize Virtual Screening Ensembles

    PubMed Central

    2016-01-01

    Ensemble docking can be a successful virtual screening technique that addresses the innate conformational heterogeneity of macromolecular drug targets. Yet, lacking a method to identify a subset of conformational states that effectively segregates active and inactive small molecules, ensemble docking may result in the recommendation of a large number of false positives. Here, three knowledge-based methods that construct structural ensembles for virtual screening are presented. Each method selects ensembles by optimizing an objective function calculated using the receiver operating characteristic (ROC) curve: either the area under the ROC curve (AUC) or a ROC enrichment factor (EF). As the number of receptor conformations, N, becomes large, the methods differ in their asymptotic scaling. Given a set of small molecules with known activities and a collection of target conformations, the most resource intense method is guaranteed to find the optimal ensemble but scales as O(2^N). A recursive approximation to the optimal solution scales as O(N^2), and a more severe approximation leads to a faster method that scales linearly, O(N). The techniques are generally applicable to any system, and we demonstrate their effectiveness on the androgen nuclear hormone receptor (AR), cyclin-dependent kinase 2 (CDK2), and the peroxisome proliferator-activated receptor δ (PPAR-δ) drug targets. Conformations that consisted of a crystal structure and molecular dynamics simulation cluster centroids were used to form AR and CDK2 ensembles. Multiple available crystal structures were used to form PPAR-δ ensembles. For each target, we show that the three methods perform similarly to one another on both the training and test sets. PMID:27097522
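
The trade-off between an exhaustive search and a cheaper approximation can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: each compound is scored by its best (lowest) docking score over the selected conformations, the exhaustive search visits all 2^N - 1 non-empty subsets, and a greedy forward selection (needing on the order of N^2 AUC evaluations) stands in for the recursive approximation described in the abstract. The score table is made up.

```python
from itertools import combinations

def auc(scores, labels):
    """Probability that an active outranks an inactive (lower score is better).

    Ties count as half a win. Assumes at least one active and one inactive.
    """
    actives = [s for s, y in zip(scores, labels) if y]
    inactives = [s for s, y in zip(scores, labels) if not y]
    wins = sum((a < i) + 0.5 * (a == i) for a in actives for i in inactives)
    return wins / (len(actives) * len(inactives))

def ensemble_auc(table, labels, subset):
    """AUC when each compound keeps its best score over the chosen conformations."""
    best = [min(row[j] for j in subset) for row in table]
    return auc(best, labels)

def exhaustive(table, labels):
    """Optimal subset by brute force: 2^N - 1 candidate ensembles."""
    n = len(table[0])
    subsets = (c for k in range(1, n + 1) for c in combinations(range(n), k))
    return max(subsets, key=lambda s: ensemble_auc(table, labels, s))

def greedy(table, labels):
    """Forward selection: add the most helpful conformation until AUC stops improving."""
    n = len(table[0])
    chosen, best = [], -1.0
    while len(chosen) < n:
        candidates = [j for j in range(n) if j not in chosen]
        j = max(candidates, key=lambda c: ensemble_auc(table, labels, chosen + [c]))
        score = ensemble_auc(table, labels, chosen + [j])
        if score <= best:
            break
        chosen.append(j)
        best = score
    return tuple(sorted(chosen)), best

# Hypothetical docking scores: rows are compounds, columns are receptor conformations.
table = [
    [-9.0, -5.0, -6.0], [-5.0, -8.0, -6.0], [-6.0, -6.0, -7.0],  # actives
    [-6.0, -4.0, -9.0], [-4.0, -5.0, -5.0], [-5.0, -6.0, -4.0],  # inactives
]
labels = [1, 1, 1, 0, 0, 0]

best_subset = exhaustive(table, labels)
greedy_subset, greedy_score = greedy(table, labels)
print("exhaustive:", best_subset, ensemble_auc(table, labels, best_subset))
print("greedy:    ", greedy_subset, greedy_score)
```

The greedy result can never exceed the exhaustive optimum on the training data, but it avoids the exponential number of subsets once N grows beyond a few dozen conformations.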

  14. Canonical-ensemble extended Lagrangian Born-Oppenheimer molecular dynamics for the linear scaling density functional theory.

    PubMed

    Hirakawa, Teruo; Suzuki, Teppei; Bowler, David R; Miyazaki, Tsuyoshi

    2017-10-11

    We discuss the development and implementation of a constant temperature (NVT) molecular dynamics scheme that combines the Nosé-Hoover chain thermostat with the extended Lagrangian Born-Oppenheimer molecular dynamics (BOMD) scheme, using a linear scaling density functional theory (DFT) approach. An integration scheme for this canonical-ensemble extended Lagrangian BOMD is developed and discussed in the context of the Liouville operator formulation. Linear scaling DFT canonical-ensemble extended Lagrangian BOMD simulations are tested on bulk silicon and silicon carbide systems to evaluate our integration scheme. The results show that the conserved quantity remains stable with no systematic drift even in the presence of the thermostat.

  15. High-Temperature unfolding of a trp-Cage mini-protein: a molecular dynamics simulation study

    PubMed Central

    Seshasayee, Aswin Sai Narain

    2005-01-01

    Background: Trp cage is a recently-constructed fast-folding miniprotein. It consists of a short helix, a 3₁₀ helix and a C-terminal poly-proline that packs against a Trp in the alpha helix. It is known to fold within 4 ns. Results: High-temperature unfolding molecular dynamics simulations of the Trp cage miniprotein have been carried out in explicit water using the OPLS-AA force-field incorporated in the program GROMACS. The radius of gyration (Rg) and Root Mean Square Deviation (RMSD) have been used as order parameters to follow the unfolding process. Distributions of Rg were used to identify ensembles. Conclusion: Three ensembles could be identified. While the native-state ensemble shows an Rg distribution that is slightly skewed, the second ensemble, which is presumably the Transition State Ensemble (TSE), shows an excellent fit. The denatured ensemble shows large fluctuations, but a Gaussian curve could be fitted. This means that the unfolding process is two-state. Representative structures from each of these ensembles are presented here. PMID:15760474
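
For readers unfamiliar with the order parameter, the sketch below computes a mass-weighted radius of gyration from a list of coordinates. It is a generic textbook definition with made-up coordinates, not the GROMACS analysis used in the study.

```python
import math

def radius_of_gyration(coords, masses):
    """Mass-weighted Rg: root of the mean squared distance from the center of mass."""
    total = sum(masses)
    com = [sum(m * r[k] for r, m in zip(coords, masses)) / total for k in range(3)]
    msd = sum(
        m * sum((r[k] - com[k]) ** 2 for k in range(3))
        for r, m in zip(coords, masses)
    ) / total
    return math.sqrt(msd)

# Toy system (hypothetical): two unit masses placed symmetrically about the origin.
toy_coords = [(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
toy_masses = [1.0, 1.0]
rg = radius_of_gyration(toy_coords, toy_masses)
print("Rg =", rg)
```

Evaluating this quantity for every trajectory frame and histogramming the values gives the kind of Rg distribution the abstract uses to separate native, transition-state, and denatured ensembles.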

  16. Weaving colloidal webs around droplets: spontaneous assembly of extended colloidal networks encasing microfluidic droplet ensembles.

    PubMed

    Zheng, Lu; Ho, Leon Yoon; Khan, Saif A

    2016-10-26

    The ability to form transient, self-assembling solid networks that 'cocoon' emulsion droplets on-demand allows new possibilities in the rapidly expanding area of microfluidic droplet-based materials science. In this communication, we demonstrate the spontaneous formation of extended colloidal networks that encase large microfluidic droplet ensembles, thus completely arresting droplet motion and effectively isolating each droplet from others in the ensemble. To do this, we employ molecular inclusion complexes of β-cyclodextrin, which spontaneously form and assemble into colloidal solids at the droplet interface and beyond, via the outward diffusion of a guest molecule (dichloromethane) from the droplets. We illustrate the advantage of such transient network-based droplet stabilization in the area of pharmaceutical crystallization, where we are able to fabricate monodisperse spherical crystalline microgranules of 5-methyl-2-[(2-nitrophenyl)amino]-3-thiophenecarbonitrile (ROY), a model hydrophobic drug, with a dramatic enhancement of particle properties compared to conventional methods.

  17. Sampling the isothermal-isobaric ensemble by Langevin dynamics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gao, Xingyu; Institute of Applied Physics and Computational Mathematics, Fenghao East Road 2, Beijing 100094; CAEP Software Center for High Performance Numerical Simulation, Huayuan Road 6, Beijing 100088

    2016-03-28

    We present a new method of conducting fully flexible-cell molecular dynamics simulation in the isothermal-isobaric ensemble based on Langevin equations of motion. The stochastic coupling to all particle and cell degrees of freedom is introduced in a correct way, in the sense that the stationary configurational distribution is proved to be consistent with that of the isothermal-isobaric ensemble. In order to apply the proposed method in computer simulations, a second order symmetric numerical integration scheme is developed by Trotter’s splitting of the single-step propagator. Moreover, a practical guide for choosing working parameters is suggested for user-specified thermo- and baro-coupling time scales. The method and software implementation are carefully validated by a numerical example.
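
As a much simpler illustration of a symmetric splitting of the Langevin propagator, the sketch below applies the well-known BAOAB scheme to a one-dimensional harmonic oscillator at constant temperature. This is a substitute technique chosen for brevity: it has no cell degrees of freedom or barostat and is not the flexible-cell scheme of the paper. All parameter values are arbitrary.

```python
import math
import random

def baoab_harmonic(n_steps, dt=0.05, gamma=1.0, kT=1.0, mass=1.0, k_spring=1.0, seed=0):
    """Run BAOAB Langevin dynamics for U(x) = k x^2 / 2 and return the mean of x^2."""
    rng = random.Random(seed)
    c1 = math.exp(-gamma * dt)                       # momentum decay over one step
    c2 = math.sqrt((1.0 - c1 * c1) * mass * kT)      # matching noise amplitude
    x, p = 0.0, 0.0
    sum_x2 = 0.0
    for _ in range(n_steps):
        p += 0.5 * dt * (-k_spring * x)              # B: half kick from the force
        x += 0.5 * dt * p / mass                     # A: half drift
        p = c1 * p + c2 * rng.gauss(0.0, 1.0)        # O: exact Ornstein-Uhlenbeck step
        x += 0.5 * dt * p / mass                     # A: half drift
        p += 0.5 * dt * (-k_spring * x)              # B: half kick from the force
        sum_x2 += x * x
    return sum_x2 / n_steps

mean_x2 = baoab_harmonic(200_000)
print("sampled <x^2> =", round(mean_x2, 3), "(equipartition predicts kT / k = 1.0)")
```

The palindromic B-A-O-A-B ordering is what makes the step symmetric; comparing the sampled variance against kT / k is a quick check that the thermostat reproduces the intended configurational distribution.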

  18. Combined Monte Carlo/torsion-angle molecular dynamics for ensemble modeling of proteins, nucleic acids and carbohydrates.

    PubMed

    Zhang, Weihong; Howell, Steven C; Wright, David W; Heindel, Andrew; Qiu, Xiangyun; Chen, Jianhan; Curtis, Joseph E

    2017-05-01

    We describe a general method to use Monte Carlo simulation followed by torsion-angle molecular dynamics simulations to create ensembles of structures to model a wide variety of soft-matter biological systems. Our particular emphasis is on modeling low-resolution small-angle scattering and reflectivity structural data. We provide examples of this method applied to HIV-1 Gag protein and derived fragment proteins, TraI protein, linear B-DNA, a nucleosome core particle, and a glycosylated monoclonal antibody. This procedure will enable a large community of researchers to model low-resolution experimental data with greater accuracy by using robust physics-based simulation and sampling methods, which are a significant improvement over traditional methods used to interpret such data. Published by Elsevier Inc.

  19. Protein-protein structure prediction by scoring molecular dynamics trajectories of putative poses.

    PubMed

    Sarti, Edoardo; Gladich, Ivan; Zamuner, Stefano; Correia, Bruno E; Laio, Alessandro

    2016-09-01

    The prediction of protein-protein interactions and their structural configuration remains a largely unsolved problem. Most of the algorithms aimed at finding the native conformation of a protein complex starting from the structure of its monomers are based on searching the structure corresponding to the global minimum of a suitable scoring function. However, protein complexes are often highly flexible, with mobile side chains and transient contacts due to thermal fluctuations. Flexibility can be neglected if one aims at finding quickly the approximate structure of the native complex, but may play a role in structure refinement, and in discriminating solutions characterized by similar scores. We here benchmark the capability of some state-of-the-art scoring functions (BACH-SixthSense, PIE/PISA and Rosetta) in discriminating finite-temperature ensembles of structures corresponding to the native state and to non-native configurations. We produce the ensembles by running thousands of molecular dynamics simulations in explicit solvent starting from poses generated by rigid docking and optimized in vacuum. We find that while Rosetta outperformed the other two scoring functions in scoring the structures in vacuum, BACH-SixthSense and PIE/PISA perform better in distinguishing near-native ensembles of structures generated by molecular dynamics in explicit solvent. Proteins 2016; 84:1312-1320. © 2016 Wiley Periodicals, Inc.

  20. Crystal molecular dynamics simulations to speed up MM/PB(GB)SA evaluation of binding free energies of di-mannose deoxy analogs with P51G-m4-Cyanovirin-N.

    PubMed

    Vorontsov, Ivan I; Miyashita, Osamu

    2011-04-30

    Complexes of two Cyanovirin-N (CVN) mutants, m4-CVN and P51G-m4-CVN, with deoxy di-mannose analogs were employed as models to generate conformational ensembles using explicit water Molecular Dynamics (MD) simulations in solution and in a crystal environment. The results were utilized for evaluation of binding free energies with the molecular mechanics Poisson-Boltzmann (or Generalized Born) surface area, MM/PB(GB)SA, methods. The calculations provided a ranking of deoxy di-mannose ligand affinities in agreement with the available qualitative experimental evidence. This confirms the importance of the hydrogen-bond network between di-mannose 3'- and 4'-hydroxyl groups and the protein binding site B(M) as a basis of the CVN activity as an effective HIV fusion inhibitor. Comparison of binding free energies averaged over snapshots from the solution and crystal simulations showed that the crystal matrix holds great promise for accelerating conformational ensemble generation, the most time-consuming step in the MM/PB(GB)SA approach. The correlation between energy values based on solution versus crystal ensembles is 0.95 for both the MM/PBSA and MM/GBSA methods. Copyright © 2010 Wiley Periodicals, Inc.

  1. SQUEEZE-E: The Optimal Solution for Molecular Simulations with Periodic Boundary Conditions.

    PubMed

    Wassenaar, Tsjerk A; de Vries, Sjoerd; Bonvin, Alexandre M J J; Bekker, Henk

    2012-10-09

    In molecular simulations of macromolecules, it is desirable to limit the amount of solvent in the system to avoid spending computational resources on uninteresting solvent-solvent interactions. As a consequence, periodic boundary conditions are commonly used, with a simulation box chosen as small as possible, for a given minimal distance between images. Here, we describe how such a simulation cell can be set up for ensembles, taking into account a priori available or estimable information regarding conformational flexibility. Doing so ensures that any conformation present in the input ensemble will satisfy the distance criterion during the simulation. This helps avoid periodicity artifacts due to conformational changes. The method introduces three new approaches in computational geometry: (1) The first is the derivation of an optimal packing of ensembles, for which the mathematical framework is described. (2) A new method for approximating the α-hull and the contact body for single bodies and ensembles is presented, which is orders of magnitude faster than existing routines, allowing the calculation of packings of large ensembles and/or large bodies. (3) A routine is described for searching a combination of three vectors on a discretized contact body forming a reduced base for a lattice with minimal cell volume. The new algorithms reduce the time required to calculate packings of single bodies from minutes or hours to seconds. The use and efficacy of the method is demonstrated for ensembles obtained from NMR, MD simulations, and elastic network modeling. An implementation of the method has been made available online at http://haddock.chem.uu.nl/services/SQUEEZE/ and has been made available as an option for running simulations through the weNMR GRID MD server at http://haddock.science.uu.nl/enmr/services/GROMACS/main.php .

  2. Multiensemble Markov models of molecular thermodynamics and kinetics.

    PubMed

    Wu, Hao; Paul, Fabian; Wehmeyer, Christoph; Noé, Frank

    2016-06-07

    We introduce the general transition-based reweighting analysis method (TRAM), a statistically optimal approach to integrate both unbiased and biased molecular dynamics simulations, such as umbrella sampling or replica exchange. TRAM estimates a multiensemble Markov model (MEMM) with full thermodynamic and kinetic information at all ensembles. The approach combines the benefits of Markov state models (clustering of high-dimensional spaces and modeling of complex many-state systems) with those of the multistate Bennett acceptance ratio (exploiting biased or high-temperature ensembles to accelerate rare-event sampling). TRAM does not depend on any rate model in addition to the widely used Markov state model approximation, but uses only fundamental relations such as detailed balance and binless reweighting of configurations between ensembles. Previous methods, including the multistate Bennett acceptance ratio, discrete TRAM, and Markov state models are special cases and can be derived from the TRAM equations. TRAM is demonstrated by efficiently computing MEMMs in cases where other estimators break down, including the full thermodynamics and rare-event kinetics from high-dimensional simulation data of an all-atom protein-ligand binding model.
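
The Markov state model ingredient on its own can be sketched in a few lines. The example below is a plain single-ensemble estimator (transition counts at a fixed lag, row normalization, stationary distribution by power iteration), not TRAM itself; the discrete trajectory and function names are made up for illustration.

```python
def estimate_msm(dtraj, n_states, lag=1):
    """Row-normalized transition matrix from a discrete trajectory.

    Assumes every state has at least one outgoing transition at this lag;
    otherwise the row sum would be zero.
    """
    counts = [[0] * n_states for _ in range(n_states)]
    for a, b in zip(dtraj[:-lag], dtraj[lag:]):
        counts[a][b] += 1
    return [[c / sum(row) for c in row] for row in counts]

def stationary(P, iters=1000):
    """Stationary distribution by repeatedly applying pi <- pi P."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

# Hypothetical trajectory over three conformational states (0, 1, 2).
dtraj = [0, 0, 0, 1, 1, 0, 0, 1, 2, 2, 1, 1, 0, 0, 0, 1, 2, 2, 2, 1, 0]
P = estimate_msm(dtraj, n_states=3)
pi = stationary(P)
print("transition matrix:", [[round(p, 3) for p in row] for row in P])
print("stationary distribution:", [round(x, 3) for x in pi])
```

An estimator of this kind only uses data from one thermodynamic state; the point of the multiensemble approach in the abstract is to combine such kinetic information with reweighting across biased or high-temperature simulations.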

  3. Multiensemble Markov models of molecular thermodynamics and kinetics

    PubMed Central

    Wu, Hao; Paul, Fabian; Noé, Frank

    2016-01-01

    We introduce the general transition-based reweighting analysis method (TRAM), a statistically optimal approach to integrate both unbiased and biased molecular dynamics simulations, such as umbrella sampling or replica exchange. TRAM estimates a multiensemble Markov model (MEMM) with full thermodynamic and kinetic information at all ensembles. The approach combines the benefits of Markov state models—clustering of high-dimensional spaces and modeling of complex many-state systems—with those of the multistate Bennett acceptance ratio of exploiting biased or high-temperature ensembles to accelerate rare-event sampling. TRAM does not depend on any rate model in addition to the widely used Markov state model approximation, but uses only fundamental relations such as detailed balance and binless reweighting of configurations between ensembles. Previous methods, including the multistate Bennett acceptance ratio, discrete TRAM, and Markov state models are special cases and can be derived from the TRAM equations. TRAM is demonstrated by efficiently computing MEMMs in cases where other estimators break down, including the full thermodynamics and rare-event kinetics from high-dimensional simulation data of an all-atom protein–ligand binding model. PMID:27226302

  4. The Flexible C-terminal Arm of the Lassa Arenavirus Z-Protein Mediates Interactions with Multiple Binding Partners

    PubMed Central

    May, Eric R.; Armen, Roger S.; Mannan, Aristotle M.; Brooks, Charles L.

    2010-01-01

    The arenavirus genome encodes for a Z-protein, which contains a RING domain that coordinates two zinc ions, and has been identified as having several functional roles at various stages of the virus life cycle. Z-protein binds to multiple host proteins and has been directly implicated in the promotion of viral budding, repression of mRNA translation and apoptosis of infected cells. Using homology models of the Z-protein from Lassa strain arenavirus, replica exchange molecular dynamics were employed to refine the structures, which were then subsequently clustered. Population weighted ensembles of low energy cluster representatives were predicted based upon optimal agreement of the chemical shifts computed with the SPARTA program with the experimental NMR chemical shifts. A member of the refined ensemble was identified to be a potential binder of budding factor Tsg101 based on its correspondence to the structure of the HIV-1 Gag late domain when bound to Tsg101. Members of these ensembles were docked against the crystal structure of human eIF4E translation initiation factor. Two plausible binding modes emerged based upon their agreement with experimental observation, favorable interaction energies and stability during molecular dynamics trajectories. Mutations to Z are proposed that would either inhibit both binding mechanisms or selectively inhibit only one mode. The C-terminal domain conformation of the most populated member of the representative ensemble shielded protein binding recognition motifs for Tsg101 and eIF4E, and represents the most populated state free in solution. We propose that C-terminal flexibility is key for mediating the different functional states of the Z-protein. PMID:20544962

  5. RNA unrestrained molecular dynamics ensemble improves agreement with experimental NMR data compared to single static structure: a test case

    NASA Astrophysics Data System (ADS)

    Beckman, Robert A.; Moreland, David; Louise-May, Shirley; Humblet, Christine

    2006-05-01

    Nuclear magnetic resonance (NMR) provides structural and dynamic information reflecting an average, often non-linear, of multiple solution-state conformations. Therefore, a single optimized structure derived from NMR refinement may be misleading if the NMR data actually result from averaging of distinct conformers. It is hypothesized that a conformational ensemble generated by a valid molecular dynamics (MD) simulation should be able to improve agreement with the NMR data set compared with the single optimized starting structure. Using a model system consisting of two sequence-related self-complementary ribonucleotide octamers for which NMR data was available, 0.3 ns particle mesh Ewald MD simulations were performed in the AMBER force field in the presence of explicit water and counterions. Agreement of the averaged properties of the molecular dynamics ensembles with NMR data such as homonuclear proton nuclear Overhauser effect (NOE)-based distance constraints, homonuclear proton and heteronuclear 1H-31P coupling constant ( J) data, and qualitative NMR information on hydrogen bond occupancy, was systematically assessed. Despite the short length of the simulation, the ensemble generated from it agreed with the NMR experimental constraints more completely than the single optimized NMR structure. This suggests that short unrestrained MD simulations may be of utility in interpreting NMR results. As expected, a 0.5 ns simulation utilizing a distance dependent dielectric did not improve agreement with the NMR data, consistent with its inferior exploration of conformational space as assessed by 2-D RMSD plots. Thus, ability to rapidly improve agreement with NMR constraints may be a sensitive diagnostic of the MD methods themselves.
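
The remark that NMR observables are non-linear averages can be made concrete with a small sketch. It assumes the commonly used r^-6 averaging for NOE-derived distances; the abstract does not say which averaging the authors applied, and the two-conformer distances are made up.

```python
def noe_effective_distance(distances):
    """Effective NOE distance <r^-6>^(-1/6) over an ensemble of conformers."""
    mean_inv6 = sum(r ** -6 for r in distances) / len(distances)
    return mean_inv6 ** (-1.0 / 6.0)

# Hypothetical proton pair that is 2 A apart in one conformer and 4 A in another.
conformer_distances = [2.0, 4.0]
r_eff = noe_effective_distance(conformer_distances)
r_mean = sum(conformer_distances) / len(conformer_distances)
print("arithmetic mean distance:", r_mean)
print("NOE effective distance:  ", round(r_eff, 2))
```

Because short distances dominate the r^-6 average, the effective distance (roughly 2.24 Å here) sits far below the arithmetic mean of 3 Å, which is why a single structure fitted to such data can misrepresent an ensemble of distinct conformers.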

  6. Density functional theory calculation of refractive indices of liquid-forming silicon oil compounds

    NASA Astrophysics Data System (ADS)

    Lee, Sanghun; Park, Sung Soo; Hagelberg, Frank

    2012-02-01

    A combination of quantum chemical calculation and molecular dynamics simulation is applied to compute refractive indices of liquid-forming silicon oils. The densities of these species are obtained from molecular dynamics simulations based on the NPT ensemble while the molecular polarizabilities are evaluated by density functional theory. This procedure is shown to yield results well compatible with available experimental data, suggesting that it represents a robust and economic route for determining the refractive indices of liquid-forming organic complexes containing silicon.
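
    A density and a molecular polarizability can be combined into a refractive index through the Lorentz-Lorenz relation. The sketch below assumes that relation; the abstract does not name the formula the authors used. The function name and the water-like test values are illustrative only.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole


def refractive_index(density_g_cm3, molar_mass_g_mol, alpha_A3):
    """Lorentz-Lorenz estimate of n from density and polarizability volume.

    (n^2 - 1) / (n^2 + 2) = (4*pi/3) * N * alpha, with N the number density
    and alpha the polarizability volume (Gaussian units).
    """
    number_density = density_g_cm3 * AVOGADRO / molar_mass_g_mol  # per cm^3
    alpha_cm3 = alpha_A3 * 1e-24                                  # Å^3 -> cm^3
    x = 4.0 * math.pi / 3.0 * number_density * alpha_cm3
    if not 0.0 <= x < 1.0:
        raise ValueError("Lorentz-Lorenz factor out of physical range")
    return math.sqrt((1.0 + 2.0 * x) / (1.0 - x))


# Illustrative input: approximate values for liquid water, not data from the paper.
n_water = refractive_index(0.997, 18.015, 1.45)
```

    In a workflow like the one described, the density would come from the NPT simulation and the polarizability from the DFT calculation.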

  7. Nanosecond to submillisecond dynamics in dye-labeled single-stranded DNA, as revealed by ensemble measurements and photon statistics at single-molecule level.

    PubMed

    Kaji, Takahiro; Ito, Syoji; Iwai, Shigenori; Miyasaka, Hiroshi

    2009-10-22

    Single-molecule and ensemble time-resolved fluorescence measurements were applied for the investigation of the conformational dynamics of single-stranded DNA, ssDNA, connected with a fluorescein dye by a C6 linker, where the motions both of DNA and the C6 linker affect the geometry of the system. From the ensemble measurement of the fluorescence quenching via photoinduced electron transfer with a guanine base in the DNA sequence, three main conformations were found in aqueous solution: a conformation unaffected by the guanine base in the excited state lifetime of fluorescein, a conformation in which the fluorescence is dynamically quenched in the excited-state lifetime, and a conformation leading to rapid quenching via nonfluorescent complex. The analysis by using the parameters acquired from the ensemble measurements for interphoton time distribution histograms and FCS autocorrelations by the single-molecule measurement revealed that interconversion in these three conformations took place with two characteristic time constants of several hundreds of nanoseconds and tens of microseconds. The advantage of the combination use of the ensemble measurements with the single-molecule detections for rather complex dynamic motions is discussed by integrating the experimental results with those obtained by molecular dynamics simulation.

  8. A benchmark for reaction coordinates in the transition path ensemble

    PubMed Central

    2016-01-01

    The molecular mechanism of a reaction is embedded in its transition path ensemble, the complete collection of reactive trajectories. Utilizing the information in the transition path ensemble alone, we developed a novel metric, which we termed the emergent potential energy, for distinguishing reaction coordinates from the bath modes. The emergent potential energy can be understood as the average energy cost for making a displacement of a coordinate in the transition path ensemble. Where displacing a bath mode invokes essentially no cost, it costs significantly to move the reaction coordinate. Based on some general assumptions of the behaviors of reaction and bath coordinates in the transition path ensemble, we proved theoretically with statistical mechanics that the emergent potential energy could serve as a benchmark of reaction coordinates and demonstrated its effectiveness by applying it to a prototypical system of biomolecular dynamics. Using the emergent potential energy as guidance, we developed a committor-free and intuition-independent method for identifying reaction coordinates in complex systems. We expect this method to be applicable to a wide range of reaction processes in complex biomolecular systems. PMID:27059559

  9. Characterizing rare-event property distributions via replicate molecular dynamics simulations of proteins.

    PubMed

    Krishnan, Ranjani; Walton, Emily B; Van Vliet, Krystyn J

    2009-11-01

    As computational resources increase, molecular dynamics simulations of biomolecules are becoming an increasingly informative complement to experimental studies. In particular, it has now become feasible to use multiple initial molecular configurations to generate an ensemble of replicate production-run simulations that allows for more complete characterization of rare events such as ligand-receptor unbinding. However, there are currently no explicit guidelines for selecting an ensemble of initial configurations for replicate simulations. Here, we use clustering analysis and steered molecular dynamics simulations to demonstrate that the configurational changes accessible in molecular dynamics simulations of biomolecules do not necessarily correlate with observed rare-event properties. This informs selection of a representative set of initial configurations. We also employ statistical analysis to identify the minimum number of replicate simulations required to sufficiently sample a given biomolecular property distribution. Together, these results suggest a general procedure for generating an ensemble of replicate simulations that will maximize accurate characterization of rare-event property distributions in biomolecules.

  10. Muscle activation described with a differential equation model for large ensembles of locally coupled molecular motors.

    PubMed

    Walcott, Sam

    2014-10-01

    Molecular motors, by turning chemical energy into mechanical work, are responsible for active cellular processes. Often groups of these motors work together to perform their biological role. Motors in an ensemble are coupled and exhibit complex emergent behavior. Although large motor ensembles can be modeled with partial differential equations (PDEs) by assuming that molecules function independently of their neighbors, this assumption is violated when motors are coupled locally. It is therefore unclear how to describe the ensemble behavior of the locally coupled motors responsible for biological processes such as calcium-dependent skeletal muscle activation. Here we develop a theory to describe locally coupled motor ensembles and apply the theory to skeletal muscle activation. The central idea is that a muscle filament can be divided into two phases: an active and an inactive phase. Dynamic changes in the relative size of these phases are described by a set of linear ordinary differential equations (ODEs). As the dynamics of the active phase are described by PDEs, muscle activation is governed by a set of coupled ODEs and PDEs, building on previous PDE models. With comparison to Monte Carlo simulations, we demonstrate that the theory captures the behavior of locally coupled ensembles. The theory also plausibly describes and predicts muscle experiments from molecular to whole muscle scales, suggesting that a micro- to macroscale muscle model is within reach.

  11. Gibbs Ensemble Simulations of the Solvent Swelling of Polymer Films

    NASA Astrophysics Data System (ADS)

    Gartner, Thomas; Epps, Thomas, III; Jayaraman, Arthi

    Solvent vapor annealing (SVA) is a useful technique to tune the morphology of block polymer, polymer blend, and polymer nanocomposite films. Despite SVA's utility, standardized SVA protocols have not been established, partly due to a lack of fundamental knowledge regarding the interplay between the polymer(s), solvent, substrate, and free-surface during solvent annealing and evaporation. An understanding of how to tune polymer film properties in a controllable manner through SVA processes is needed. Herein, the thermodynamic implications of the presence of solvent in the swollen polymer film are explored through two alternative Gibbs ensemble simulation methods that we have developed and extended: Gibbs ensemble molecular dynamics (GEMD) and hybrid Monte Carlo (MC)/molecular dynamics (MD). In this poster, we will describe these simulation methods and demonstrate their application to polystyrene films swollen by toluene and n-hexane. Polymer film swelling experiments, Gibbs ensemble molecular simulations, and polymer reference interaction site model (PRISM) theory are combined to calculate an effective Flory-Huggins χ (χeff) for polymer-solvent mixtures. The effects of solvent chemistry, solvent content, polymer molecular weight, and polymer architecture on χeff are examined, providing a platform to control and understand the thermodynamics of polymer film swelling.

  12. Monte Carlo and Molecular Dynamics in the Multicanonical Ensemble: Connections between Wang-Landau Sampling and Metadynamics

    NASA Astrophysics Data System (ADS)

    Vogel, Thomas; Perez, Danny; Junghans, Christoph

    2014-03-01

    We show direct formal relationships between the Wang-Landau iteration [PRL 86, 2050 (2001)], metadynamics [PNAS 99, 12562 (2002)] and statistical temperature molecular dynamics [PRL 97, 050601 (2006)], the major Monte Carlo and molecular dynamics work horses for sampling from a generalized, multicanonical ensemble. We aim at helping to consolidate the developments in the different areas by indicating how methodological advancements can be transferred in a straightforward way, avoiding the parallel, largely independent, developments tracks observed in the past.

  13. Peptidic Macrocycles - Conformational Sampling and Thermodynamic Characterization

    PubMed Central

    2018-01-01

    Macrocycles are of considerable interest as highly specific drug candidates, yet they challenge standard conformer generators with their large number of rotatable bonds and conformational restrictions. Here, we present a molecular dynamics-based routine that bypasses current limitations in conformational sampling and extensively profiles the free energy landscape of peptidic macrocycles in solution. We perform accelerated molecular dynamics simulations to capture a diverse conformational ensemble. By applying an energetic cutoff, followed by geometric clustering, we demonstrate the striking robustness and efficiency of the approach in identifying highly populated conformational states of cyclic peptides. The resulting structural and thermodynamic information is benchmarked against interproton distances from NMR experiments and conformational states identified by X-ray crystallography. Using three different model systems of varying size and flexibility, we show that the method reliably reproduces experimentally determined structural ensembles and is capable of identifying key conformational states that include the bioactive conformation. Thus, the described approach is a robust method to generate conformations of peptidic macrocycles and holds promise for structure-based drug design. PMID:29652495

  14. Peptidic Macrocycles - Conformational Sampling and Thermodynamic Characterization.

    PubMed

    Kamenik, Anna S; Lessel, Uta; Fuchs, Julian E; Fox, Thomas; Liedl, Klaus R

    2018-05-29

    Macrocycles are of considerable interest as highly specific drug candidates, yet they challenge standard conformer generators with their large number of rotatable bonds and conformational restrictions. Here, we present a molecular dynamics-based routine that bypasses current limitations in conformational sampling and extensively profiles the free energy landscape of peptidic macrocycles in solution. We perform accelerated molecular dynamics simulations to capture a diverse conformational ensemble. By applying an energetic cutoff, followed by geometric clustering, we demonstrate the striking robustness and efficiency of the approach in identifying highly populated conformational states of cyclic peptides. The resulting structural and thermodynamic information is benchmarked against interproton distances from NMR experiments and conformational states identified by X-ray crystallography. Using three different model systems of varying size and flexibility, we show that the method reliably reproduces experimentally determined structural ensembles and is capable of identifying key conformational states that include the bioactive conformation. Thus, the described approach is a robust method to generate conformations of peptidic macrocycles and holds promise for structure-based drug design.

  15. Liquid Water from First Principles: Validation of Different Sampling Approaches

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mundy, C J; Kuo, W; Siepmann, J

    2004-05-20

    A series of first principles molecular dynamics and Monte Carlo simulations were carried out for liquid water to assess the validity and reproducibility of different sampling approaches. These simulations include Car-Parrinello molecular dynamics simulations using the program CPMD with different values of the fictitious electron mass in the microcanonical and canonical ensembles, Born-Oppenheimer molecular dynamics using the programs CPMD and CP2K in the microcanonical ensemble, and Metropolis Monte Carlo using CP2K in the canonical ensemble. With the exception of one simulation for 128 water molecules, all other simulations were carried out for systems consisting of 64 molecules. It is found that the structural and thermodynamic properties of these simulations are in excellent agreement with each other as long as adiabatic sampling is maintained in the Car-Parrinello molecular dynamics simulations either by choosing a sufficiently small fictitious mass in the microcanonical ensemble or by Nosé-Hoover thermostats in the canonical ensemble. Using the Becke-Lee-Yang-Parr exchange and correlation energy functionals and norm-conserving Troullier-Martins or Goedecker-Teter-Hutter pseudopotentials, simulations at a fixed density of 1.0 g/cm³ and a temperature close to 315 K yield a height of the first peak in the oxygen-oxygen radial distribution function of about 3.0, a classical constant-volume heat capacity of about 70 J K⁻¹ mol⁻¹, and a self-diffusion constant of about 0.1 Å²/ps.

  16. Effect of the Crystal Environment on Side-Chain Conformational Dynamics in Cyanovirin-N Investigated through Crystal and Solution Molecular Dynamics Simulations

    PubMed Central

    Ahlstrom, Logan S.; Vorontsov, Ivan I.; Shi, Jun; Miyashita, Osamu

    2017-01-01

    Side chains in protein crystal structures are essential for understanding biochemical processes such as catalysis and molecular recognition. However, crystal packing could influence side-chain conformation and dynamics, thus complicating functional interpretations of available experimental structures. Here we investigate the effect of crystal packing on side-chain conformational dynamics with crystal and solution molecular dynamics simulations using Cyanovirin-N as a model system. Side-chain ensembles for solvent-exposed residues obtained from simulation largely reflect the conformations observed in the X-ray structure. This agreement is most striking for crystal-contacting residues during crystal simulation. Given the high level of correspondence between our simulations and the X-ray data, we compare side-chain ensembles in solution and crystal simulations. We observe large decreases in conformational entropy in the crystal for several long, polar and contacting residues on the protein surface. Such cases agree well with the average loss in conformational entropy per residue upon protein folding and are accompanied by a change in side-chain conformation. This finding supports the application of surface engineering to facilitate crystallization. Our simulation-based approach demonstrated here with Cyanovirin-N establishes a framework for quantitatively comparing side-chain ensembles in solution and in the crystal across a larger set of proteins to elucidate the effect of the crystal environment on protein conformations. PMID:28107510

  17. Effect of the Crystal Environment on Side-Chain Conformational Dynamics in Cyanovirin-N Investigated through Crystal and Solution Molecular Dynamics Simulations.

    PubMed

    Ahlstrom, Logan S; Vorontsov, Ivan I; Shi, Jun; Miyashita, Osamu

    2017-01-01

    Side chains in protein crystal structures are essential for understanding biochemical processes such as catalysis and molecular recognition. However, crystal packing could influence side-chain conformation and dynamics, thus complicating functional interpretations of available experimental structures. Here we investigate the effect of crystal packing on side-chain conformational dynamics with crystal and solution molecular dynamics simulations using Cyanovirin-N as a model system. Side-chain ensembles for solvent-exposed residues obtained from simulation largely reflect the conformations observed in the X-ray structure. This agreement is most striking for crystal-contacting residues during crystal simulation. Given the high level of correspondence between our simulations and the X-ray data, we compare side-chain ensembles in solution and crystal simulations. We observe large decreases in conformational entropy in the crystal for several long, polar and contacting residues on the protein surface. Such cases agree well with the average loss in conformational entropy per residue upon protein folding and are accompanied by a change in side-chain conformation. This finding supports the application of surface engineering to facilitate crystallization. Our simulation-based approach demonstrated here with Cyanovirin-N establishes a framework for quantitatively comparing side-chain ensembles in solution and in the crystal across a larger set of proteins to elucidate the effect of the crystal environment on protein conformations.

  18. Using 1H and 13C NMR chemical shifts to determine cyclic peptide conformations: a combined molecular dynamics and quantum mechanics approach.

    PubMed

    Nguyen, Q Nhu N; Schwochert, Joshua; Tantillo, Dean J; Lokey, R Scott

    2018-05-10

    Solving conformations of cyclic peptides can provide insight into structure-activity and structure-property relationships, which can help in the design of compounds with improved bioactivity and/or ADME characteristics. The most common approaches for determining the structures of cyclic peptides are based on NMR-derived distance restraints obtained from NOESY or ROESY cross-peak intensities, and 3J-based dihedral restraints using the Karplus relationship. Unfortunately, these observables are often too weak, sparse, or degenerate to provide unequivocal, high-confidence solution structures, prompting us to investigate an alternative approach that relies only on 1H and 13C chemical shifts as experimental observables. This method, which we call conformational analysis from NMR and density-functional prediction of low-energy ensembles (CANDLE), uses molecular dynamics (MD) simulations to generate conformer families and density functional theory (DFT) calculations to predict their 1H and 13C chemical shifts. Iterative conformer searches and DFT energy calculations on a cyclic peptide-peptoid hybrid yielded Boltzmann ensembles whose predicted chemical shifts matched the experimental values better than any single conformer. For these compounds, CANDLE outperformed the classic NOE- and 3J-coupling-based approach by disambiguating similar β-turn types and also enabled the structural elucidation of the minor conformer. Through the use of chemical shifts, in conjunction with DFT and MD calculations, CANDLE can help illuminate conformational ensembles of cyclic peptides in solution.
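
    The comparison between a Boltzmann ensemble and single conformers can be sketched as follows. This is not the CANDLE implementation: the energies, shifts, temperature, and function names are hypothetical, and the sketch only shows the weighting and averaging steps.

```python
import math

R_KCAL = 0.0019872041  # gas constant in kcal/(mol K)


def boltzmann_weights(energies_kcal, temperature=298.15):
    """Normalized Boltzmann populations from relative conformer energies."""
    e_min = min(energies_kcal)  # shift by the minimum for numerical stability
    factors = [math.exp(-(e - e_min) / (R_KCAL * temperature))
               for e in energies_kcal]
    total = sum(factors)
    return [f / total for f in factors]


def ensemble_shifts(shifts_per_conformer, weights):
    """Population-weighted average of per-nucleus chemical shifts."""
    n_nuclei = len(shifts_per_conformer[0])
    return [sum(w * conf[i] for w, conf in zip(weights, shifts_per_conformer))
            for i in range(n_nuclei)]


def rmsd(predicted, experimental):
    """Root-mean-square deviation between two lists of shifts."""
    n = len(predicted)
    return math.sqrt(sum((p - e) ** 2 for p, e in zip(predicted, experimental)) / n)


# Hypothetical example: two conformers, two nuclei, shifts in ppm.
energies = [0.0, 0.0]
shifts = [[1.0, 2.0], [3.0, 4.0]]
weights = boltzmann_weights(energies)
averaged = ensemble_shifts(shifts, weights)
```

    Comparing `rmsd(averaged, experimental)` with the value for each single conformer gives the kind of ensemble-versus-conformer test the abstract describes.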

  19. Advanced ensemble modelling of flexible macromolecules using X-ray solution scattering.

    PubMed

    Tria, Giancarlo; Mertens, Haydyn D T; Kachala, Michael; Svergun, Dmitri I

    2015-03-01

    Dynamic ensembles of macromolecules mediate essential processes in biology. Understanding the mechanisms driving the function and molecular interactions of 'unstructured' and flexible molecules requires alternative approaches to those traditionally employed in structural biology. Small-angle X-ray scattering (SAXS) is an established method for structural characterization of biological macromolecules in solution, and is directly applicable to the study of flexible systems such as intrinsically disordered proteins and multi-domain proteins with unstructured regions. The Ensemble Optimization Method (EOM) [Bernadó et al. (2007). J. Am. Chem. Soc. 129, 5656-5664] was the first approach introducing the concept of ensemble fitting of the SAXS data from flexible systems. In this approach, a large pool of macromolecules covering the available conformational space is generated and a sub-ensemble of conformers coexisting in solution is selected guided by the fit to the experimental SAXS data. This paper presents a series of new developments and advancements to the method, including significantly enhanced functionality and also quantitative metrics for the characterization of the results. Building on the original concept of ensemble optimization, the algorithms for pool generation have been redesigned to allow for the construction of partially or completely symmetric oligomeric models, and the selection procedure was improved to refine the size of the ensemble. Quantitative measures of the flexibility of the system studied, based on the characteristic integral parameters of the selected ensemble, are introduced. These improvements are implemented in the new EOM version 2.0, and the capabilities as well as inherent limitations of the ensemble approach in SAXS, and of EOM 2.0 in particular, are discussed.

  20. Development and Validation of a Computational Model Ensemble for the Early Detection of BCRP/ABCG2 Substrates during the Drug Design Stage.

    PubMed

    Gantner, Melisa E; Peroni, Roxana N; Morales, Juan F; Villalba, María L; Ruiz, María E; Talevi, Alan

    2017-08-28

    Breast Cancer Resistance Protein (BCRP) is an ATP-dependent efflux transporter linked to the multidrug resistance phenomenon in many diseases such as epilepsy and cancer and a potential source of drug interactions. For these reasons, the early identification of substrates and nonsubstrates of this transporter during the drug discovery stage is of great interest. We have developed a computational nonlinear model ensemble based on conformational independent molecular descriptors using a combined strategy of genetic algorithms, J48 decision tree classifiers, and data fusion. The best model ensemble consists of averaging the ranking of the 12 decision trees that showed the best performance on the training set, which also demonstrated a good performance for the test set. It was experimentally validated using the ex vivo everted rat intestinal sac model. Five anticonvulsant drugs classified as nonsubstrates for BCRP by the model ensemble were experimentally evaluated, and none of them proved to be a BCRP substrate under the experimental conditions used, thus confirming the predictive ability of the model ensemble. The model ensemble reported here is a potentially valuable tool to be used as an in silico ADME filter in computer-aided drug discovery campaigns intended to overcome BCRP-mediated multidrug resistance issues and to prevent drug-drug interactions.
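
    Rank averaging across an ensemble of models can be sketched as below. How each decision tree produces a score is not described in the abstract, so the score table, the tie-handling rule, and the function names here are assumptions.

```python
def ranks(scores):
    """Rank compounds by score (rank 1 = highest); ties share the mean rank."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    result = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next compound has the same score.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2.0 + 1.0
        for k in range(pos, end + 1):
            result[order[k]] = mean_rank
        pos = end + 1
    return result


def average_rank(score_table):
    """score_table[m][j] is the score of compound j under model m."""
    per_model = [ranks(row) for row in score_table]
    n_models = len(per_model)
    return [sum(col) / n_models for col in zip(*per_model)]


# Hypothetical scores for three compounds from two models.
table = [[0.9, 0.1, 0.5],
         [0.2, 0.3, 0.8]]
fused = average_rank(table)
```

    A lower averaged rank means the models collectively place that compound nearer the top of the list.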

  1. Experimentally assessing molecular dynamics sampling of the protein native state conformational distribution

    PubMed Central

    Hernández, Griselda; Anderson, Janet S.; LeMaster, David M.

    2012-01-01

    The acute sensitivity to conformation exhibited by amide hydrogen exchange reactivity provides a valuable test for the physical accuracy of model ensembles developed to represent the Boltzmann distribution of the protein native state. A number of molecular dynamics studies of ubiquitin have predicted a well-populated transition in the tight turn immediately preceding the primary site of proteasome-directed polyubiquitylation Lys 48. Amide exchange reactivity analysis demonstrates that this transition is 10³-fold rarer than these predictions. More strikingly, for the most populated novel conformational basin predicted from a recent 1 ms MD simulation of bovine pancreatic trypsin inhibitor (at 13% of total), experimental hydrogen exchange data indicates a population below 10⁻⁶. The most sophisticated efforts to directly incorporate experimental constraints into the derivation of model protein ensembles have been applied to ubiquitin, as illustrated by three recently deposited studies (PDB codes 2NR2, 2K39 and 2KOX). Utilizing the extensive set of experimental NOE constraints, each of these three ensembles yields a modestly more accurate prediction of the exchange rates for the highly exposed amides than does a standard unconstrained molecular simulation. However, for the less frequently exposed amide hydrogens, the 2NR2 ensemble offers no improvement in rate predictions as compared to the unconstrained MD ensemble. The other two NMR-constrained ensembles performed markedly worse, either underestimating (2KOX) or overestimating (2K39) the extent of conformational diversity. PMID:22425325

  2. Bayesian refinement of protein structures and ensembles against SAXS data using molecular dynamics

    PubMed Central

    Shevchuk, Roman; Hub, Jochen S.

    2017-01-01

    Small-angle X-ray scattering is an increasingly popular technique used to detect protein structures and ensembles in solution. However, the refinement of structures and ensembles against SAXS data is often ambiguous due to the low information content of SAXS data, unknown systematic errors, and unknown scattering contributions from the solvent. We offer a solution to such problems by combining Bayesian inference with all-atom molecular dynamics simulations and explicit-solvent SAXS calculations. The Bayesian formulation correctly weights the SAXS data versus prior physical knowledge, it quantifies the precision or ambiguity of fitted structures and ensembles, and it accounts for unknown systematic errors due to poor buffer matching. The method further provides a probabilistic criterion for identifying the number of states required to explain the SAXS data. The method is validated by refining ensembles of a periplasmic binding protein against calculated SAXS curves. Subsequently, we derive the solution ensembles of the eukaryotic chaperone heat shock protein 90 (Hsp90) against experimental SAXS data. We find that the SAXS data of the apo state of Hsp90 is compatible with a single wide-open conformation, whereas the SAXS data of Hsp90 bound to ATP or to an ATP-analogue strongly suggest heterogenous ensembles of a closed and a wide-open state. PMID:29045407

  3. Generalized Ensemble Sampling of Enzyme Reaction Free Energy Pathways

    PubMed Central

    Wu, Dongsheng; Fajer, Mikolai I.; Cao, Liaoran; Cheng, Xiaolin; Yang, Wei

    2016-01-01

    Free energy path sampling plays an essential role in computational understanding of chemical reactions, particularly those occurring in enzymatic environments. Among a variety of molecular dynamics simulation approaches, the generalized ensemble sampling strategy is uniquely attractive for the fact that it not only can enhance the sampling of rare chemical events but also can naturally ensure consistent exploration of environmental degrees of freedom. In this review, we plan to provide a tutorial-like tour on an emerging topic: generalized ensemble sampling of enzyme reaction free energy path. The discussion is largely focused on our own studies, particularly ones based on the metadynamics free energy sampling method and the on-the-path random walk path sampling method. We hope that this mini presentation will provide interested practitioners some meaningful guidance for future algorithm formulation and application study. PMID:27498634

  4. Molecular dynamics simulations: advances and applications

    PubMed Central

    Hospital, Adam; Goñi, Josep Ramon; Orozco, Modesto; Gelpí, Josep L

    2015-01-01

    Molecular dynamics simulations have evolved into a mature technique that can be used effectively to understand macromolecular structure-to-function relationships. Present simulation times are close to biologically relevant ones. Information gathered about the dynamic properties of macromolecules is rich enough to shift the usual paradigm of structural bioinformatics from studying single structures to analyzing conformational ensembles. Here, we describe the foundations of molecular dynamics and the improvements made toward obtaining such ensembles. Specific application of the technique to three main issues (allosteric regulation, docking, and structure refinement) is discussed. PMID:26604800

  5. Conformational ensembles of RNA oligonucleotides from integrating NMR and molecular simulations.

    PubMed

    Bottaro, Sandro; Bussi, Giovanni; Kennedy, Scott D; Turner, Douglas H; Lindorff-Larsen, Kresten

    2018-05-01

    RNA molecules are key players in numerous cellular processes and are characterized by a complex relationship between structure, dynamics, and function. Despite their apparent simplicity, RNA oligonucleotides are very flexible molecules, and understanding their internal dynamics is particularly challenging using experimental data alone. We show how to reconstruct the conformational ensemble of four RNA tetranucleotides by combining atomistic molecular dynamics simulations with nuclear magnetic resonance spectroscopy data. The goal is achieved by reweighting simulations using a maximum entropy/Bayesian approach. In this way, we overcome problems of current simulation methods, as well as in interpreting ensemble- and time-averaged experimental data. We determine the populations of different conformational states by considering several nuclear magnetic resonance parameters and point toward properties that are not captured by state-of-the-art molecular force fields. Although our approach is applied on a set of model systems, it is fully general and may be used to study the conformational dynamics of flexible biomolecules and to detect inaccuracies in molecular dynamics force fields.
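
    Maximum entropy reweighting of simulation frames can be shown in its simplest form. The sketch assumes a single observable, no experimental error, and uniform prior weights; the authors' Bayesian treatment is more general. The function names, the bisection bracket, and the data are hypothetical.

```python
import math


def reweight(values, lam, prior=None):
    """Weights proportional to prior_i * exp(-lam * values_i), normalized."""
    if prior is None:
        prior = [1.0] * len(values)
    exponents = [-lam * v for v in values]
    shift = max(exponents)  # subtract the maximum to avoid overflow
    raw = [p * math.exp(e - shift) for p, e in zip(prior, exponents)]
    total = sum(raw)
    return [r / total for r in raw]


def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights))


def fit_lambda(values, target, lo=-50.0, hi=50.0, tol=1e-10):
    """Bisect for the multiplier; the reweighted mean decreases with lam.

    Assumes the bracket [lo, hi] is wide enough for the given data.
    """
    if not min(values) < target < max(values):
        raise ValueError("target must lie strictly inside the sampled range")
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if weighted_mean(values, reweight(values, mid)) > target:
            lo = mid  # mean still too high, so lam must increase
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Hypothetical observable computed on three frames; "experiment" says 2.5.
frame_values = [1.0, 2.0, 3.0]
lam = fit_lambda(frame_values, 2.5)
new_weights = reweight(frame_values, lam)
```

    The resulting weights are the smallest perturbation of the prior, in the relative-entropy sense, that reproduces the target average.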

  6. Loss of intramolecular electrostatic interactions and limited conformational ensemble may promote self-association of cis-tau peptide.

    PubMed

    Barman, Arghya; Hamelberg, Donald

    2015-03-01

    Self-association of proteins can be triggered by a change in the distribution of the conformational ensemble. Posttranslational modification, such as phosphorylation, can induce a shift in the ensemble of conformations. In the brain of Alzheimer's disease patients, the formation of intra-cellular neurofibrillary tangles deposition is a result of self-aggregation of hyper-phosphorylated tau protein. Biochemical and NMR studies suggest that the cis peptidyl prolyl conformation of a phosphorylated threonine-proline motif in the tau protein renders tau more prone to aggregation than the trans isomer. However, little is known about the role of peptidyl prolyl cis/trans isomerization in tau aggregation. Here, we show that intra-molecular electrostatic interactions are better formed in the trans isomer. We explore the conformational landscape of the tau segment containing the phosphorylated-Thr(231)-Pro(232) motif using accelerated molecular dynamics and show that intra-molecular electrostatic interactions are coupled to the isomeric state of the peptidyl prolyl bond. Our results suggest that the loss of intra-molecular interactions and the more restricted conformational ensemble of the cis isomer could favor self-aggregation. The results are consistent with experiments, providing valuable complementary atomistic insights and a hypothetical model for isomer specific aggregation of the tau protein. © 2014 Wiley Periodicals, Inc.

  7. High temperature phonon dispersion in graphene using classical molecular dynamics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Anees, P.; Panigrahi, B. K.; Valsakumar, M. C.

    2014-04-24

    Phonon dispersion and phonon density of states of graphene are calculated using classical molecular dynamics simulations. In this method, the dynamical matrix is constructed based on linear response theory by computing the displacement of atoms during the simulations. The computed phonon dispersions show excellent agreement with experiments. The simulations are done in both NVT and NPT ensembles at 300 K, and the LO/TO modes are found to harden at the Γ point. The NPT ensemble simulations capture the anharmonicity of the crystal accurately and the hardening of LO/TO modes is more pronounced. We also found that at 300 K the C-C bond length reduces below the equilibrium value and the ZA bending mode frequency becomes imaginary close to Γ along K-Γ direction, which indicates instability of the flat 2D graphene sheets.

  8. Molecular docking and molecular dynamics simulation analyses of urea with ammoniated and ammoxidized lignin.

    PubMed

    Li, Wenzhuo; Zhang, Song; Zhao, Yingying; Huang, Shuaiyu; Zhao, Jiangshan

    2017-01-01

    Ammoniated lignin, prepared through the Mannich reaction of lignin, has more advantages as a slow-release carrier of urea molecules than ammoxidized lignin and lignin. The advantages of the ammoniated lignin include its added amine groups and a high molecular mass that remains similar to that of lignin. Three organic molecules, guaiacyl, 2-hydroxybenzylamine and 5-carbamoylpentanoic acid, are the monomers of lignin, ammoniated lignin and ammoxidized lignin, respectively. We studied how the interactions of lignin, ammoniated lignin and ammoxidized lignin with urea differ, based on radial distribution function (RDF) results from molecular dynamics (MD) simulations. Glass transition temperature (Tg) and solubility parameter (δ) of ammoniated and ammoxidized lignin have been calculated by MD simulations in the constant-temperature and constant-pressure ensemble (NPT). Molecular docking results showed the interaction sites of the urea on the ammoniated and ammoxidized lignin, and three different interaction modes were identified. Root mean square deviation (RMSD) values indicate how the mobility of the urea molecule is affected by the three different interaction modes. A series of MD simulations in the constant-temperature and constant-volume ensemble (NVT) allowed us to calculate the diffusivity of urea, which was affected by the urea content of ammoniated and ammoxidized lignin. Copyright © 2016 Elsevier Inc. All rights reserved.
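
    The abstract above mentions computing the diffusivity of urea from NVT simulations but does not say how. One common route is the Einstein relation, D = MSD(t) / (6 t) in three dimensions. The Python sketch below illustrates that relation on a synthetic Brownian trajectory; the function names, the fit through the origin, and the test trajectory are assumptions for illustration and are not taken from the authors' workflow.

```python
import math
import random


def brownian_trajectory(n_steps, dt, d_true, seed=0):
    """Synthetic 3D Brownian path with known diffusion coefficient (test data only)."""
    rng = random.Random(seed)
    sigma = math.sqrt(2.0 * d_true * dt)  # per-coordinate step width
    pos = [0.0, 0.0, 0.0]
    traj = [tuple(pos)]
    for _ in range(n_steps):
        pos = [x + rng.gauss(0.0, sigma) for x in pos]
        traj.append(tuple(pos))
    return traj


def mean_squared_displacement(traj, lag):
    """Average squared displacement over all time origins separated by `lag` frames."""
    n = len(traj) - lag
    total = 0.0
    for t in range(n):
        total += sum((traj[t + lag][k] - traj[t][k]) ** 2 for k in range(3))
    return total / n


def diffusion_coefficient(traj, dt, lags):
    """Einstein relation in 3D: slope of MSD versus time, divided by 6.

    The slope is a least-squares fit constrained through the origin,
    which is a simplifying assumption of this sketch.
    """
    times = [lag * dt for lag in lags]
    msds = [mean_squared_displacement(traj, lag) for lag in lags]
    slope = sum(t * m for t, m in zip(times, msds)) / sum(t * t for t in times)
    return slope / 6.0


if __name__ == "__main__":
    traj = brownian_trajectory(20000, 0.01, 1.0, seed=1)
    print(diffusion_coefficient(traj, 0.01, range(1, 11)))
```

    For a real MD trajectory the coordinates would need to be unwrapped across periodic boundaries before the MSD is computed, and only the linear part of the MSD curve should enter the fit.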

  9. Convergence and reproducibility in molecular dynamics simulations of the DNA duplex d(GCACGAACGAACGAACGC)

    PubMed Central

    Galindo-Murillo, Rodrigo; Roe, Daniel R.; Cheatham, Thomas E.

    2014-01-01

    Background: The structure and dynamics of DNA are critically related to its function. Molecular dynamics (MD) simulations augment experiment by providing detailed information about the atomic motions. However, to date the simulations have not been long enough for convergence of the dynamics and structural properties of DNA. Methods: MD simulations performed with AMBER using the ff99SB force field with the parmbsc0 modifications, including ensembles of independent simulations, were compared to long timescale MD performed with the specialized Anton MD engine on the B-DNA structure d(GCACGAACGAACGAACGC). To assess convergence, the decay of the average RMSD values over longer and longer time intervals was evaluated in addition to assessing convergence of the dynamics via the Kullback-Leibler divergence of principal component projection histograms. Results: These MD simulations, including one of the longest simulations of DNA published to date at ~44 μs, surprisingly suggest that the structure and dynamics of the DNA helix, neglecting the terminal base pairs, are essentially fully converged on the ~1–5 μs timescale. Conclusions: We can now reproducibly converge the structure and dynamics of B-DNA helices, omitting the terminal base pairs, on the μs time scale with both the AMBER and CHARMM C36 nucleic acid force fields. Results from independent ensembles of simulations starting from different initial conditions, when aggregated, match the results from long timescale simulations on the specialized Anton MD engine. General Significance: With access to large-scale GPU resources or the specialized MD engine “Anton”, it is possible to reproducibly and reliably converge the conformational ensemble of sampled structures for a variety of molecular systems. PMID:25219455
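
    The convergence test described above compares histograms of principal component projections through the Kullback-Leibler divergence. A minimal Python sketch of that comparison follows; the function names, the fixed binning, and the pseudocount used to avoid empty bins are assumptions for illustration and are not taken from the authors' analysis scripts.

```python
import math


def histogram(values, lo, hi, n_bins):
    """Count values into n_bins equal-width bins on [lo, hi); values outside are ignored."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        if lo <= v < hi:
            index = min(int((v - lo) / width), n_bins - 1)
            counts[index] += 1
    return counts


def kl_divergence(counts_p, counts_q, pseudocount=1e-9):
    """Kullback-Leibler divergence D(P || Q) for two histograms on the same bins.

    The pseudocount is an assumption of this sketch; it keeps the
    logarithm finite when a bin is empty in one histogram.
    """
    if len(counts_p) != len(counts_q):
        raise ValueError("histograms must share the same bins")
    p = [c + pseudocount for c in counts_p]
    q = [c + pseudocount for c in counts_q]
    total_p, total_q = sum(p), sum(q)
    divergence = 0.0
    for pi, qi in zip(p, q):
        pi, qi = pi / total_p, qi / total_q
        divergence += pi * math.log(pi / qi)
    return divergence
```

    In a convergence study, the two histograms would be the projections of two independent simulations (or two halves of one simulation) onto the same principal component; a divergence that decays toward zero as the sampled time grows indicates that both runs cover the same distribution.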

  10. Modulating RNA Alignment Using Directional Dynamic Kinks: Application in Determining an Atomic-Resolution Ensemble for a Hairpin using NMR Residual Dipolar Couplings.

    PubMed

    Salmon, Loïc; Giambaşu, George M; Nikolova, Evgenia N; Petzold, Katja; Bhattacharya, Akash; Case, David A; Al-Hashimi, Hashim M

    2015-10-14

    Approaches that combine experimental data and computational molecular dynamics (MD) to determine atomic resolution ensembles of biomolecules require the measurement of abundant experimental data. NMR residual dipolar couplings (RDCs) carry rich dynamics information; however, difficulties in modulating overall alignment of nucleic acids have limited the ability to fully extract this information. We present a strategy for modulating RNA alignment that is based on introducing variable dynamic kinks in terminal helices. With this strategy, we measured seven sets of RDCs in a cUUCGg apical loop and used this rich data set to test the accuracy of a 0.8 μs MD simulation computed using the Amber ff10 force field as well as to determine an atomic resolution ensemble. The MD-generated ensemble quantitatively reproduces the measured RDCs, but selection of a sub-ensemble was required to satisfy the RDCs within error. The largest discrepancies between the RDC-selected and MD-generated ensembles are observed for the most flexible loop residues and backbone angles connecting the loop to the helix, with the RDC-selected ensemble resulting in more uniform dynamics. Comparison of the RDC-selected ensemble with NMR spin relaxation data suggests that the dynamics occurs on the ps-ns time scales, as verified by measurements of R1ρ relaxation-dispersion data. The RDC-satisfying ensemble samples many conformations adopted by the hairpin in crystal structures, indicating that intrinsic plasticity may play important roles in conformational adaptation. The approach presented here can be applied to test nucleic acid force fields and to characterize dynamics in diverse RNA motifs at atomic resolution.

  11. Determination of ensemble-average pairwise root mean-square deviation from experimental B-factors.

    PubMed

    Kuzmanic, Antonija; Zagrovic, Bojan

    2010-03-03

    Root mean-square deviation (RMSD) after roto-translational least-squares fitting is a measure of global structural similarity of macromolecules used commonly. On the other hand, experimental x-ray B-factors are used frequently to study local structural heterogeneity and dynamics in macromolecules by providing direct information about root mean-square fluctuations (RMSF) that can also be calculated from molecular dynamics simulations. We provide a mathematical derivation showing that, given a set of conservative assumptions, a root mean-square ensemble-average of an all-against-all distribution of pairwise RMSD for a single molecular species, $\langle \mathrm{RMSD}^2 \rangle^{1/2}$, is directly related to average B-factors ($\langle B \rangle$) and $\langle \mathrm{RMSF}^2 \rangle^{1/2}$. We show this relationship and explore its limits of validity on a heterogeneous ensemble of structures taken from molecular dynamics simulations of villin headpiece generated using distributed-computing techniques and the Folding@Home cluster. Our results provide a basis for quantifying global structural diversity of macromolecules in crystals directly from x-ray experiments, and we show this on a large set of structures taken from the Protein Data Bank. In particular, we show that the ensemble-average pairwise backbone RMSD for a microscopic ensemble underlying a typical protein x-ray structure is approximately 1.1 Å, under the assumption that the principal contribution to experimental B-factors is conformational variability. 2010 Biophysical Society. Published by Elsevier Inc. All rights reserved.
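
    The abstract above states that the pairwise RMSD is related to average B-factors and RMSF without giving the formula. The Python sketch below shows how such a relation can arise under two textbook assumptions that are labeled here as assumptions, not as the paper's derivation: the isotropic relation B = (8π²/3)·RMSF², and structures that fluctuate independently about a common mean without refitting, so that the mean squared pairwise deviation is twice the mean squared fluctuation. Function names and the Gaussian test ensemble are hypothetical.

```python
import math
import random


def bfactor_from_rmsf_sq(rmsf_sq):
    """Isotropic B-factor from the mean squared fluctuation: B = (8 pi^2 / 3) * RMSF^2."""
    return 8.0 * math.pi ** 2 / 3.0 * rmsf_sq


def rmsd_from_bfactor(b_avg):
    """Root mean-square pairwise RMSD implied by an average B-factor.

    Assumes <RMSD^2> = 2 <RMSF^2>, i.e. independent structures about a
    common mean, so that <RMSD^2>^(1/2) = sqrt(3 <B> / (4 pi^2)).
    """
    return math.sqrt(3.0 * b_avg / (4.0 * math.pi ** 2))


def simulate_pairwise_rmsd(n_structures, n_atoms, sigma, seed=0):
    """Root mean-square of all pairwise RMSDs for a synthetic Gaussian ensemble."""
    rng = random.Random(seed)
    # Each structure is a flat list of 3 * n_atoms displacements from the mean.
    structures = [
        [rng.gauss(0.0, sigma) for _ in range(3 * n_atoms)]
        for _ in range(n_structures)
    ]
    total, n_pairs = 0.0, 0
    for i in range(n_structures):
        for j in range(i + 1, n_structures):
            sq = sum((a - b) ** 2 for a, b in zip(structures[i], structures[j]))
            total += sq / n_atoms  # squared RMSD of this pair
            n_pairs += 1
    return math.sqrt(total / n_pairs)


if __name__ == "__main__":
    sigma = 0.5  # per-coordinate fluctuation, so RMSF^2 = 3 * sigma^2
    b = bfactor_from_rmsf_sq(3.0 * sigma ** 2)
    print(rmsd_from_bfactor(b), simulate_pairwise_rmsd(60, 50, sigma))
```

    Under this simplified relation, a pairwise RMSD of 1.1 Å corresponds to an average B-factor of roughly 16 Å²; the derivation in the paper rests on its own stated assumptions and may differ in detail.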

  12. Determination of Ensemble-Average Pairwise Root Mean-Square Deviation from Experimental B-Factors

    PubMed Central

    Kuzmanic, Antonija; Zagrovic, Bojan

    2010-01-01

    Root mean-square deviation (RMSD) after roto-translational least-squares fitting is a measure of global structural similarity of macromolecules used commonly. On the other hand, experimental x-ray B-factors are used frequently to study local structural heterogeneity and dynamics in macromolecules by providing direct information about root mean-square fluctuations (RMSF) that can also be calculated from molecular dynamics simulations. We provide a mathematical derivation showing that, given a set of conservative assumptions, a root mean-square ensemble-average of an all-against-all distribution of pairwise RMSD for a single molecular species, $\langle \mathrm{RMSD}^2 \rangle^{1/2}$, is directly related to average B-factors ($\langle B \rangle$) and $\langle \mathrm{RMSF}^2 \rangle^{1/2}$. We show this relationship and explore its limits of validity on a heterogeneous ensemble of structures taken from molecular dynamics simulations of villin headpiece generated using distributed-computing techniques and the Folding@Home cluster. Our results provide a basis for quantifying global structural diversity of macromolecules in crystals directly from x-ray experiments, and we show this on a large set of structures taken from the Protein Data Bank. In particular, we show that the ensemble-average pairwise backbone RMSD for a microscopic ensemble underlying a typical protein x-ray structure is ∼1.1 Å, under the assumption that the principal contribution to experimental B-factors is conformational variability. PMID:20197040

  13. Modelling dynamics in protein crystal structures by ensemble refinement

    PubMed Central

    Burnley, B Tom; Afonine, Pavel V; Adams, Paul D; Gros, Piet

    2012-01-01

    Single-structure models derived from X-ray data do not adequately account for the inherent, functionally important dynamics of protein molecules. We generated ensembles of structures by time-averaged refinement, where local molecular vibrations were sampled by molecular-dynamics (MD) simulation whilst global disorder was partitioned into an underlying overall translation–libration–screw (TLS) model. Modeling of 20 protein datasets at 1.1–3.1 Å resolution reduced cross-validated Rfree values by 0.3–4.9%, indicating that ensemble models fit the X-ray data better than single structures. The ensembles revealed that, while most proteins display a well-ordered core, some proteins exhibit a ‘molten core’ likely supporting functionally important dynamics in ligand binding, enzyme activity and protomer assembly. Order–disorder changes in HIV protease indicate a mechanism of entropy compensation for ordering the catalytic residues upon ligand binding by disordering specific core residues. Thus, ensemble refinement extracts dynamical details from the X-ray data that allow a more comprehensive understanding of structure–dynamics–function relationships. DOI: http://dx.doi.org/10.7554/eLife.00311.001 PMID:23251785

  14. Ensemble and Single-Molecule Studies on Fluorescence Quenching in Transition Metal Bipyridine-Complexes

    PubMed Central

    Brox, Dominik; Kiel, Alexander; Wörner, Svenja Johanna; Pernpointner, Markus; Comba, Peter; Martin, Bodo; Herten, Dirk-Peter

    2013-01-01

    Beyond their use in analytical chemistry fluorescent probes continuously gain importance because of recent applications of single-molecule fluorescence spectroscopy to monitor elementary reaction steps. In this context, we characterized quenching of a fluorescent probe by different metal ions with fluorescence spectroscopy in the bulk and at the single-molecule level. We apply a quantitative model to explain deviations from existing standard models for fluorescence quenching. The model is based on a reversible transition from a bright to a dim state upon binding of the metal ion. We use the model to estimate the stability constants of complexes with different metal ions and the change of the relative quantum yield of different reporter dye labels. We found ensemble data to agree widely with results from single-molecule experiments. Our data indicates a mechanism involving close molecular contact of dye and quenching moiety which we also found in molecular dynamics simulations. We close the manuscript with a discussion of possible mechanisms based on Förster distances and electrochemical potentials which renders photo-induced electron transfer to be more likely than Förster resonance energy transfer. PMID:23483966

  15. Self-consistent implementation of ensemble density functional theory method for multiple strongly correlated electron pairs

    DOE PAGES

    Filatov, Michael; Liu, Fang; Kim, Kwang S.; ...

    2016-12-22

    Here, the spin-restricted ensemble-referenced Kohn-Sham (REKS) method is based on an ensemble representation of the density and is capable of correctly describing the non-dynamic electron correlation stemming from (near-)degeneracy of several electronic configurations. The existing REKS methodology describes systems with two electrons in two fractionally occupied orbitals. In this work, the REKS methodology is extended to treat systems with four fractionally occupied orbitals accommodating four electrons, and a self-consistent implementation of the REKS(4,4) method with simultaneous optimization of the orbitals and their fractional occupation numbers is reported. The new method is applied to a number of molecular systems where simultaneous dissociation of several chemical bonds takes place, as well as to the singlet ground states of organic tetraradicals 2,4-didehydrometaxylylene and 1,4,6,9-spiro[4.4]nonatetrayl.

  16. Molecular Dynamics Simulations and XAFS (MD-XAFS)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Schenter, Gregory K.; Fulton, John L.

    2017-01-20

    MD-XAFS (Molecular Dynamics X-ray Absorption Fine Structure) makes the connection between simulation techniques that generate an ensemble of molecular configurations and the direct signal observed from X-ray measurement.

  17. CABS-flex predictions of protein flexibility compared with NMR ensembles

    PubMed Central

    Jamroz, Michal; Kolinski, Andrzej; Kmiecik, Sebastian

    2014-01-01

    Motivation: Identification of flexible regions of protein structures is important for understanding of their biological functions. Recently, we have developed a fast approach for predicting protein structure fluctuations from a single protein model: the CABS-flex. CABS-flex was shown to be an efficient alternative to conventional all-atom molecular dynamics (MD). In this work, we evaluate CABS-flex and MD predictions by comparison with protein structural variations within NMR ensembles. Results: Based on a benchmark set of 140 proteins, we show that the relative fluctuations of protein residues obtained from CABS-flex are well correlated to those of NMR ensembles. On average, this correlation is stronger than that between MD and NMR ensembles. In conclusion, CABS-flex is useful and complementary to MD in predicting protein regions that undergo conformational changes as well as the extent of such changes. Availability and implementation: The CABS-flex is freely available to all users at http://biocomp.chem.uw.edu.pl/CABSflex. Contact: sekmi@chem.uw.edu.pl Supplementary information: Supplementary data are available at Bioinformatics online. PMID:24735558

  18. CABS-flex predictions of protein flexibility compared with NMR ensembles.

    PubMed

    Jamroz, Michal; Kolinski, Andrzej; Kmiecik, Sebastian

    2014-08-01

    Identification of flexible regions of protein structures is important for understanding of their biological functions. Recently, we have developed a fast approach for predicting protein structure fluctuations from a single protein model: the CABS-flex. CABS-flex was shown to be an efficient alternative to conventional all-atom molecular dynamics (MD). In this work, we evaluate CABS-flex and MD predictions by comparison with protein structural variations within NMR ensembles. Based on a benchmark set of 140 proteins, we show that the relative fluctuations of protein residues obtained from CABS-flex are well correlated to those of NMR ensembles. On average, this correlation is stronger than that between MD and NMR ensembles. In conclusion, CABS-flex is useful and complementary to MD in predicting protein regions that undergo conformational changes as well as the extent of such changes. The CABS-flex is freely available to all users at http://biocomp.chem.uw.edu.pl/CABSflex. Contact: sekmi@chem.uw.edu.pl. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.

  19. From a structural average to the conformational ensemble of a DNA bulge

    PubMed Central

    Shi, Xuesong; Beauchamp, Kyle A.; Harbury, Pehr B.; Herschlag, Daniel

    2014-01-01

    Direct experimental measurements of conformational ensembles are critical for understanding macromolecular function, but traditional biophysical methods do not directly report the solution ensemble of a macromolecule. Small-angle X-ray scattering interferometry has the potential to overcome this limitation by providing the instantaneous distance distribution between pairs of gold-nanocrystal probes conjugated to a macromolecule in solution. Our X-ray interferometry experiments reveal an increasing bend angle of DNA duplexes with bulges of one, three, and five adenosine residues, consistent with previous FRET measurements, and further reveal an increasingly broad conformational ensemble with increasing bulge length. The distance distributions for the AAA bulge duplex (3A-DNA) with six different Au-Au pairs provide strong evidence against a simple elastic model in which fluctuations occur about a single conformational state. Instead, the measured distance distributions suggest a 3A-DNA ensemble with multiple conformational states predominantly across a region of conformational space with bend angles between 24 and 85 degrees and characteristic bend directions and helical twists and displacements. Additional X-ray interferometry experiments revealed perturbations to the ensemble from changes in ionic conditions and the bulge sequence, effects that can be understood in terms of electrostatic and stacking contributions to the ensemble and that demonstrate the sensitivity of X-ray interferometry. Combining X-ray interferometry ensemble data with molecular dynamics simulations gave atomic-level models of representative conformational states and of the molecular interactions that may shape the ensemble, and fluorescence measurements with 2-aminopurine-substituted 3A-DNA provided initial tests of these atomistic models. More generally, X-ray interferometry will provide powerful benchmarks for testing and developing computational methods. PMID:24706812

  20. Coupling molecular spin centers to microwave resonators: steps towards the implementation of molecular qubits for hybrid quantum circuits

    NASA Astrophysics Data System (ADS)

    Bonizzoni, Claudio; Ghirri, Alberto; Affronte, Marco

    Hybrid spin-photon quantum bits can be obtained in the strong coupling regime between microwave photons and a spin ensemble, where coherent exchange of photons is realized. Molecular spin systems, thanks to their tailorable magnetic properties, are considered promising candidates for hybrid qubits. We present an experimental study of the coupling regimes between a high critical temperature YBCO superconducting resonator and different molecular spin ensembles. Three mononuclear compounds, (PPh4)2[Cu(mnt)2], [ErPc2]-TBA+ and Dy(trensal), and two organic radicals, DPPH and PyBTM, are studied. Strong coupling is found in radicals thanks to exchange narrowing. Possible strategies to achieve strong coupling with mononuclear compounds are discussed, and several hints for the design of molecular spins are given.

  1. Control Mechanisms of Photoisomerization in Protonated Schiff Bases.

    PubMed

    Vuković, Lela; Burmeister, Carl F; Král, Petr; Groenhof, Gerrit

    2013-03-21

    We performed ab initio excited-state molecular dynamics simulations of a gas-phase photoexcited protonated Schiff base (C1-N2═C3-C4═C5-C6) to search for control mechanisms of its photoisomerization. The excited molecule twists by ∼90° around either the N2C3 bond or the C4C5 bond and relaxes to the ground electronic state through a conical intersection with either a trans or cis outcome. We show that a large initial distortion of several dihedral angles and a specific normal vibrational mode combining pyramidalization and double-bond twisting can lead to a preferential rotation of atoms around the C4C5 bond. We also show that selective pretwisting of several dihedral angles in the initial ground state thermal ensemble (by analogy to a protein pocket) can significantly increase the fraction of photoreactive (cis → trans) trajectories. We demonstrate that new ensembles with higher degrees of control over the photoisomerization reaction can be obtained by a computational directed evolution approach on the ensembles of molecules with the pretwisted geometries.

  2. Simulating Energy Relaxation in Pump-Probe Vibrational Spectroscopy of Hydrogen-Bonded Liquids.

    PubMed

    Dettori, Riccardo; Ceriotti, Michele; Hunger, Johannes; Melis, Claudio; Colombo, Luciano; Donadio, Davide

    2017-03-14

    We introduce a nonequilibrium molecular dynamics simulation approach, based on the generalized Langevin equation, to study vibrational energy relaxation in pump-probe spectroscopy. A colored noise thermostat is used to selectively excite a set of vibrational modes, leaving the other modes nearly unperturbed, to mimic the effect of a monochromatic laser pump. Energy relaxation is probed by analyzing the evolution of the system after excitation in the microcanonical ensemble, thus providing direct information about the energy redistribution paths at the molecular level and their time scale. The method is applied to hydrogen-bonded molecular liquids, specifically deuterated methanol and water, providing a robust picture of energy relaxation at the molecular scale.

  3. Coherent coupling between Vanadyl Phthalocyanine spin ensemble and microwave photons: towards integration of molecular spin qubits into quantum circuits.

    PubMed

    Bonizzoni, C; Ghirri, A; Atzori, M; Sorace, L; Sessoli, R; Affronte, M

    2017-10-12

    Electron spins are ideal two-level systems that may couple with microwave photons so that, under specific conditions, coherent spin-photon states can be realized. This represents a fundamental step for the transfer and the manipulation of quantum information. Along with spin impurities in solids, molecular spins in concentrated phases have recently shown coherent dynamics under microwave stimuli. Here we show that it is possible to obtain a high cooperativity regime between a molecular Vanadyl Phthalocyanine (VOPc) spin ensemble and a high quality factor superconducting YBa2Cu3O7 (YBCO) coplanar resonator at 0.5 K. This demonstrates that molecular spin centers can be successfully integrated in hybrid quantum devices.

  4. An Evaluation of Explicit Receptor Flexibility in Molecular Docking Using Molecular Dynamics and Torsion Angle Molecular Dynamics.

    PubMed

    Armen, Roger S; Chen, Jianhan; Brooks, Charles L

    2009-10-13

    Incorporating receptor flexibility into molecular docking should improve results for flexible proteins. However, the incorporation of explicit all-atom flexibility with molecular dynamics for the entire protein chain may also introduce significant error and "noise" that could decrease docking accuracy and deteriorate the ability of a scoring function to rank native-like poses. We address this apparent paradox by comparing the success of several flexible receptor models in cross-docking and multiple receptor ensemble docking for p38α mitogen-activated protein (MAP) kinase. Explicit all-atom receptor flexibility has been incorporated into a CHARMM-based molecular docking method (CDOCKER) using both molecular dynamics (MD) and torsion angle molecular dynamics (TAMD) for the refinement of predicted protein-ligand binding geometries. These flexible receptor models have been evaluated, and the accuracy and efficiency of TAMD sampling is directly compared to MD sampling. Several flexible receptor models are compared, encompassing flexible side chains, flexible loops, multiple flexible backbone segments, and treatment of the entire chain as flexible. We find that although including side chain and some backbone flexibility is required for improved docking accuracy as expected, docking accuracy also diminishes as additional and unnecessary receptor flexibility is included into the conformational search space. Ensemble docking results demonstrate that including protein flexibility leads to improved agreement with binding data for 227 active compounds. This comparison also demonstrates that a flexible receptor model enriches high affinity compound identification without significantly increasing the number of false positives from low affinity compounds.

  5. An Evaluation of Explicit Receptor Flexibility in Molecular Docking Using Molecular Dynamics and Torsion Angle Molecular Dynamics

    PubMed Central

    Armen, Roger S.; Chen, Jianhan; Brooks, Charles L.

    2009-01-01

    Incorporating receptor flexibility into molecular docking should improve results for flexible proteins. However, the incorporation of explicit all-atom flexibility with molecular dynamics for the entire protein chain may also introduce significant error and “noise” that could decrease docking accuracy and deteriorate the ability of a scoring function to rank native-like poses. We address this apparent paradox by comparing the success of several flexible receptor models in cross-docking and multiple receptor ensemble docking for p38α mitogen-activated protein (MAP) kinase. Explicit all-atom receptor flexibility has been incorporated into a CHARMM-based molecular docking method (CDOCKER) using both molecular dynamics (MD) and torsion angle molecular dynamics (TAMD) for the refinement of predicted protein-ligand binding geometries. These flexible receptor models have been evaluated, and the accuracy and efficiency of TAMD sampling is directly compared to MD sampling. Several flexible receptor models are compared, encompassing flexible side chains, flexible loops, multiple flexible backbone segments, and treatment of the entire chain as flexible. We find that although including side chain and some backbone flexibility is required for improved docking accuracy as expected, docking accuracy also diminishes as additional and unnecessary receptor flexibility is included into the conformational search space. Ensemble docking results demonstrate that including protein flexibility leads to improved agreement with binding data for 227 active compounds. This comparison also demonstrates that a flexible receptor model enriches high affinity compound identification without significantly increasing the number of false positives from low affinity compounds. PMID:20160879

  6. Asynchronous Replica Exchange Software for Grid and Heterogeneous Computing.

    PubMed

    Gallicchio, Emilio; Xia, Junchao; Flynn, William F; Zhang, Baofeng; Samlalsingh, Sade; Mentes, Ahmet; Levy, Ronald M

    2015-11-01

    Parallel replica exchange sampling is an extended ensemble technique often used to accelerate the exploration of the conformational ensemble of atomistic molecular simulations of chemical systems. Inter-process communication and coordination requirements have historically discouraged the deployment of replica exchange on distributed and heterogeneous resources. Here we describe the architecture of a software (named ASyncRE) for performing asynchronous replica exchange molecular simulations on volunteered computing grids and heterogeneous high performance clusters. The asynchronous replica exchange algorithm on which the software is based avoids centralized synchronization steps and the need for direct communication between remote processes. It allows molecular dynamics threads to progress at different rates and enables parameter exchanges among arbitrary sets of replicas independently from other replicas. ASyncRE is written in Python following a modular design conducive to extensions to various replica exchange schemes and molecular dynamics engines. Applications of the software for the modeling of association equilibria of supramolecular and macromolecular complexes on BOINC campus computational grids and on the CPU/MIC heterogeneous hardware of the XSEDE Stampede supercomputer are illustrated. They show the ability of ASyncRE to utilize large grids of desktop computers running the Windows, MacOS, and/or Linux operating systems as well as collections of high performance heterogeneous hardware devices.
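
    Replica exchange, as described above, swaps states between replicas according to an acceptance rule. The Python sketch below shows the standard Metropolis criterion for temperature replica exchange, min(1, exp[(β_i − β_j)(E_i − E_j)]). It is a generic illustration only: it is not the ASyncRE API, and the function names and the choice of kcal/mol units are assumptions.

```python
import math
import random

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def exchange_probability(energy_i, temp_i, energy_j, temp_j):
    """Metropolis acceptance probability for swapping two temperature replicas.

    energy_i is the potential energy of the configuration currently held
    at temp_i, and energy_j the one held at temp_j (kcal/mol and K).
    """
    beta_i = 1.0 / (K_B * temp_i)
    beta_j = 1.0 / (K_B * temp_j)
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)


def attempt_exchange(energy_i, temp_i, energy_j, temp_j, rng=random):
    """Return True if the swap is accepted for one random draw."""
    return rng.random() < exchange_probability(energy_i, temp_i, energy_j, temp_j)
```

    A swap that moves the lower-energy configuration to the colder replica is always accepted; the reverse swap is accepted only with a Boltzmann-weighted probability. An asynchronous scheme applies the same kind of test to whichever replicas happen to be idle, rather than to all replicas at a synchronized step.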

  7. A molecular ensemble in the rER for procollagen maturation.

    PubMed

    Ishikawa, Yoshihiro; Bächinger, Hans Peter

    2013-11-01

    Extracellular matrix (ECM) proteins create structural frameworks in tissues such as bone, skin, tendon and cartilage etc. These connective tissues play important roles in the development and homeostasis of organs. Collagen is the most abundant ECM protein and represents one third of all proteins in humans. The biosynthesis of ECM proteins occurs in the rough endoplasmic reticulum (rER). This review describes the current understanding of the biosynthesis and folding of procollagens, which are the precursor molecules of collagens, in the rER. Multiple folding enzymes and molecular chaperones are required for procollagen to establish specific posttranslational modifications, and facilitate folding and transport to the cell surface. Thus, this molecular ensemble in the rER contributes to ECM maturation and to the development and homeostasis of tissues. Mutations in this ensemble are likely candidates for connective tissue disorders. This article is part of a Special Issue entitled: Functional and structural diversity of endoplasmic reticulum. Copyright © 2013 Elsevier B.V. All rights reserved.

  8. Molecular dynamics simulations using temperature-enhanced essential dynamics replica exchange.

    PubMed

    Kubitzki, Marcus B; de Groot, Bert L

    2007-06-15

    Today's standard molecular dynamics simulations of moderately sized biomolecular systems at full atomic resolution are typically limited to the nanosecond timescale and therefore suffer from limited conformational sampling. Efficient ensemble-preserving algorithms like replica exchange (REX) may alleviate this problem somewhat but are still computationally prohibitive due to the large number of degrees of freedom involved. Aiming at increased sampling efficiency, we present a novel simulation method combining the ideas of essential dynamics and REX. Unlike standard REX, in each replica only a selection of essential collective modes of a subsystem of interest (essential subspace) is coupled to a higher temperature, with the remainder of the system staying at a reference temperature, T0. This selective excitation along with the replica framework permits efficient approximate ensemble-preserving conformational sampling and allows much larger temperature differences between replicas, thereby considerably enhancing sampling efficiency. Ensemble properties and sampling performance of the method are discussed using dialanine and guanylin test systems, with multi-microsecond molecular dynamics simulations of these test systems serving as references.

  9. Rapid sampling of local minima in protein energy surface and effective reduction through a multi-objective filter

    PubMed Central

    2013-01-01

    Background Many problems in protein modeling require obtaining a discrete representation of the protein conformational space as an ensemble of conformations. In ab-initio structure prediction, in particular, where the goal is to predict the native structure of a protein chain given its amino-acid sequence, the ensemble needs to satisfy energetic constraints. Given the thermodynamic hypothesis, an effective ensemble contains low-energy conformations which are similar to the native structure. The high-dimensionality of the conformational space and the ruggedness of the underlying energy surface currently make it very difficult to obtain such an ensemble. Recent studies have proposed that Basin Hopping is a promising probabilistic search framework to obtain a discrete representation of the protein energy surface in terms of local minima. Basin Hopping performs a series of structural perturbations followed by energy minimizations with the goal of hopping between nearby energy minima. This approach has been shown to be effective in obtaining conformations near the native structure for small systems. Recent work by us has extended this framework to larger systems through employment of the molecular fragment replacement technique, resulting in rapid sampling of large ensembles. Methods This paper investigates the algorithmic components in Basin Hopping to both understand and control their effect on the sampling of near-native minima. Realizing that such an ensemble is reduced before further refinement in full ab-initio protocols, we take an additional step and analyze the quality of the ensemble retained by ensemble reduction techniques. We propose a novel multi-objective technique based on the Pareto front to filter the ensemble of sampled local minima. 
    Results and conclusions We show that controlling the magnitude of the perturbation allows directly controlling the distance between consecutively-sampled local minima and, in turn, steering the exploration towards conformations near the native structure. For the minimization step, we show that the addition of Metropolis Monte Carlo-based minimization is no more effective than a simple greedy search. Finally, we show that the size of the ensemble of sampled local minima can be effectively and efficiently reduced by a multi-objective filter to obtain a simpler representation of the probed energy surface. PMID:24564970
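
The idea of a Pareto-front filter can be sketched as follows. This is a generic non-dominated filter under the assumption that every objective is minimized; the abstract does not say which objectives the authors use, so the two-component score tuples below are purely hypothetical.

```python
def dominates(a, b):
    """True if score tuple a is no worse than b in every objective and
    strictly better in at least one (all objectives are minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (objective_1, objective_2) scores for five sampled minima.
scores = [(1, 5), (2, 2), (3, 3), (5, 1), (4, 4)]
front = pareto_front(scores)
```

This quadratic-time version is adequate for small ensembles; a sort-based sweep would be preferable for very large ones.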

  10. Car-Parrinello molecular dynamics study of the intramolecular vibrational mode-sensitive double proton-transfer mechanisms in porphycene.

    PubMed

    Walewski, Łukasz; Waluk, Jacek; Lesyng, Bogdan

    2010-02-18

    Car-Parrinello molecular dynamics simulations were carried out to help interpret proton-transfer processes observed experimentally in porphycene under thermodynamic equilibrium conditions (NVT ensemble) as well as during selective, nonequilibrium vibrational excitations of the molecular scaffold (NVE ensemble). In the NVT ensemble, the population of the trans form in the gas phase at 300 K is 96.5%, and of the cis-1 form is 3.5%, in agreement with experimental data. Approximately 70% of the proton-transfer events are asynchronous double proton transfers. According to the high resolution simulation data they consist of two single transfer events that rapidly take place one after the other. The average time-period between the two consecutive jumps is 220 fs. The gas phase reaction rate estimate at 300 K is 3.6 ps, which is comparable to experimentally determined rates. The NVE ensemble nonequilibrium ab initio MD simulations, which correspond to selective vibrational excitations of the molecular scaffold generated with high resolution laser spectroscopy techniques, exhibit an enhancing property of the 182 cm^-1 vibrational mode and an inhibiting property of the 114 cm^-1 one. Both of them influence the proton-transfer rate, in qualitative agreement with experimental findings. Our ab initio simulations provide new predictions regarding the influence of double-mode vibrational excitations on proton-transfer processes. They can help in setting up future programmable spectroscopic experiments for the proton-transfer translocations.
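
As a worked illustration of what the reported NVT populations imply, the sketch below converts the 96.5% / 3.5% trans/cis-1 split at 300 K into a free energy difference using the standard two-state relation ΔG = -RT ln(p_cis/p_trans). The calculation is not taken from the paper, and the function name is an assumption.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol K)


def free_energy_difference(p_a, p_b, temperature):
    """Free energy of state b relative to state a, in kJ/mol, from the
    equilibrium populations of a two-state system."""
    return -R * temperature * math.log(p_b / p_a)


# Populations quoted in the abstract: trans 96.5%, cis-1 3.5%, at 300 K.
dg_cis_minus_trans = free_energy_difference(0.965, 0.035, 300.0)
print(f"{dg_cis_minus_trans:.2f} kJ/mol")
```

The result is a little over 8 kJ/mol in favor of the trans form, that is, a few multiples of RT at 300 K.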

  11. Generalized Green's function molecular dynamics for canonical ensemble simulations

    NASA Astrophysics Data System (ADS)

    Coluci, V. R.; Dantas, S. O.; Tewary, V. K.

    2018-05-01

    The need for small integration time steps (~1 fs) in conventional molecular dynamics simulations is an important issue that inhibits the study of physical, chemical, and biological systems in real timescales. Additionally, to simulate those systems in contact with a thermal bath, thermostating techniques are usually applied. In this work, we generalize the Green's function molecular dynamics technique to allow simulations within the canonical ensemble. By applying this technique to one-dimensional systems, we were able to correctly describe important thermodynamic properties such as the temperature fluctuations, the temperature distribution, and the velocity autocorrelation function. We show that the proposed technique also allows the use of time steps one order of magnitude larger than those typically used in conventional molecular dynamics simulations. We expect that this technique can be used in long-timescale molecular dynamics simulations.

  12. An Integrated In Silico Method to Discover Novel Rock1 Inhibitors: Multi-Complex-Based Pharmacophore, Molecular Dynamics Simulation and Hybrid Protocol Virtual Screening.

    PubMed

    Chen, Haining; Li, Sijia; Hu, Yajiao; Chen, Guo; Jiang, Qinglin; Tong, Rongsheng; Zang, Zhihe; Cai, Lulu

    2016-01-01

    Rho-associated, coiled-coil containing protein kinase 1 (ROCK1) is an important regulator of focal adhesion, actomyosin contraction and cell motility. In this manuscript, a combination of the multi-complex-based pharmacophore (MCBP), molecular dynamics simulation and a hybrid protocol of a virtual screening method, comprising multipharmacophore-based virtual screening (PBVS) and ensemble docking-based virtual screening (DBVS) methods, was used for retrieving novel ROCK1 inhibitors from the natural products database embedded in the ZINC database. Ten hit compounds were selected from the hit compounds, and five compounds were tested experimentally. Thus, these results may provide valuable information for further discovery of more novel ROCK1 inhibitors.

  13. Stochastic dynamics of small ensembles of non-processive molecular motors: The parallel cluster model

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Erdmann, Thorsten; Albert, Philipp J.; Schwarz, Ulrich S.

    2013-11-07

    Non-processive molecular motors have to work together in ensembles in order to generate appreciable levels of force or movement. In skeletal muscle, for example, hundreds of myosin II molecules cooperate in thick filaments. In non-muscle cells, by contrast, small groups with few tens of non-muscle myosin II motors contribute to essential cellular processes such as transport, shape changes, or mechanosensing. Here we introduce a detailed and analytically tractable model for this important situation. Using a three-state crossbridge model for the myosin II motor cycle and exploiting the assumptions of fast power stroke kinetics and equal load sharing between motors in equivalent states, we reduce the stochastic reaction network to a one-step master equation for the binding and unbinding dynamics (parallel cluster model) and derive the rules for ensemble movement. We find that for constant external load, ensemble dynamics is strongly shaped by the catch bond character of myosin II, which leads to an increase of the fraction of bound motors under load and thus to firm attachment even for small ensembles. This adaptation to load results in a concave force-velocity relation described by a Hill relation. For external load provided by a linear spring, myosin II ensembles dynamically adjust themselves towards an isometric state with constant average position and load. The dynamics of the ensembles is now determined mainly by the distribution of motors over the different kinds of bound states. For increasing stiffness of the external spring, there is a sharp transition beyond which myosin II can no longer perform the power stroke. Slow unbinding from the pre-power-stroke state protects the ensembles against detachment.
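
To illustrate what a one-step master equation for binding and unbinding looks like in practice, the sketch below computes the stationary distribution of the number of bound motors for a generic birth-death process. The rate functions are placeholders supplied by the caller. The constant, load-independent rates in the example are a simplifying assumption and do not reproduce the catch-bond kinetics of the parallel cluster model.

```python
def stationary_distribution(n_motors, binding_rate, unbinding_rate):
    """Stationary probabilities p(0..n_motors) of a one-step master equation.

    binding_rate(i)   : total rate for the transition i -> i + 1
    unbinding_rate(i) : total rate for the transition i -> i - 1
    Uses the detailed-balance recursion p(i+1) / p(i) = g(i) / r(i+1).
    """
    weights = [1.0]
    for i in range(n_motors):
        weights.append(weights[-1] * binding_rate(i) / unbinding_rate(i + 1))
    total = sum(weights)
    return [w / total for w in weights]


# Illustrative assumption: N = 2 motors with constant per-motor rates.
N, K_ON, K_OFF = 2, 1.0, 1.0
p_bound = stationary_distribution(
    N,
    lambda i: (N - i) * K_ON,  # each unbound motor can bind
    lambda i: i * K_OFF,       # each bound motor can unbind
)
```

With constant per-motor rates the result is a binomial distribution. Load-dependent unbinding rates can be passed in through the same interface.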

  14. From deep TLS validation to ensembles of atomic models built from elemental motions

    DOE PAGES

    Urzhumtsev, Alexandre; Afonine, Pavel V.; Van Benschoten, Andrew H.; ...

    2015-07-28

    The translation–libration–screw model first introduced by Cruickshank, Schomaker and Trueblood describes the concerted motions of atomic groups. Using TLS models can improve the agreement between calculated and experimental diffraction data. Because the T, L and S matrices describe a combination of atomic vibrations and librations, TLS models can also potentially shed light on molecular mechanisms involving correlated motions. However, this use of TLS models in mechanistic studies is hampered by the difficulties in translating the results of refinement into molecular movement or a structural ensemble. To convert the matrices into a constituent molecular movement, the matrix elements must satisfy several conditions. Refining the T, L and S matrix elements as independent parameters without taking these conditions into account may result in matrices that do not represent concerted molecular movements. Here, a mathematical framework and the computational tools to analyze TLS matrices, resulting in either explicit decomposition into descriptions of the underlying motions or a report of broken conditions, are described. The description of valid underlying motions can then be output as a structural ensemble. All methods are implemented as part of the PHENIX project.

  15. Leveraging Gibbs Ensemble Molecular Dynamics and Hybrid Monte Carlo/Molecular Dynamics for Efficient Study of Phase Equilibria.

    PubMed

    Gartner, Thomas E; Epps, Thomas H; Jayaraman, Arthi

    2016-11-08

    We describe an extension of the Gibbs ensemble molecular dynamics (GEMD) method for studying phase equilibria. Our modifications to GEMD allow for direct control over particle transfer between phases and improve the method's numerical stability. Additionally, we found that the modified GEMD approach had advantages in computational efficiency in comparison to a hybrid Monte Carlo (MC)/MD Gibbs ensemble scheme in the context of the single component Lennard-Jones fluid. We note that this increase in computational efficiency does not compromise the close agreement of phase equilibrium results between the two methods. However, numerical instabilities in the GEMD scheme hamper GEMD's use near the critical point. We propose that the computationally efficient GEMD simulations can be used to map out the majority of the phase window, with hybrid MC/MD used as a follow up for conditions under which GEMD may be unstable (e.g., near-critical behavior). In this manner, we can capitalize on the contrasting strengths of these two methods to enable the efficient study of phase equilibria for systems that present challenges for a purely stochastic GEMC method, such as dense or low temperature systems, and/or those with complex molecular topologies.

  16. A virtual pebble game to ensemble average graph rigidity.

    PubMed

    González, Luis C; Wang, Hui; Livesay, Dennis R; Jacobs, Donald J

    2015-01-01

    The body-bar Pebble Game (PG) algorithm is commonly used to calculate network rigidity properties in proteins and polymeric materials. To account for fluctuating interactions such as hydrogen bonds, an ensemble of constraint topologies are sampled, and average network properties are obtained by averaging PG characterizations. At a simpler level of sophistication, Maxwell constraint counting (MCC) provides a rigorous lower bound for the number of internal degrees of freedom (DOF) within a body-bar network, and it is commonly employed to test if a molecular structure is globally under-constrained or over-constrained. MCC is a mean field approximation (MFA) that ignores spatial fluctuations of distance constraints by replacing the actual molecular structure by an effective medium that has distance constraints globally distributed with perfect uniform density. The Virtual Pebble Game (VPG) algorithm is a MFA that retains spatial inhomogeneity in the density of constraints on all length scales. Network fluctuations due to distance constraints that may be present or absent based on binary random dynamic variables are suppressed by replacing all possible constraint topology realizations with the probabilities that distance constraints are present. The VPG algorithm is isomorphic to the PG algorithm, where integers for counting "pebbles" placed on vertices or edges in the PG map to real numbers representing the probability to find a pebble. In the VPG, edges are assigned pebble capacities, and pebble movements become a continuous flow of probability within the network. Comparisons between the VPG and average PG results over a test set of proteins and disordered lattices demonstrate the VPG quantitatively estimates the ensemble average PG results well. The VPG performs about 20% faster than one PG, and it provides a pragmatic alternative to averaging PG rigidity characteristics over an ensemble of constraint topologies. 
    The utility of the VPG falls in between the most accurate but slowest method of ensemble averaging over hundreds to thousands of independent PG runs, and the fastest but least accurate MCC.
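
Maxwell constraint counting, the simplest of the three methods compared above, can be sketched in a few lines. Each rigid body contributes six degrees of freedom, each bar removes at most one, and six are reserved for global rigid-body motion. The second function, which replaces the bar count by a sum of presence probabilities, is only an illustrative assumption about a mean-field variant and is not the VPG algorithm.

```python
def mcc_internal_dof(n_bodies, n_bars):
    """Maxwell lower bound on the internal degrees of freedom of a
    body-bar network in three dimensions."""
    return max(0, 6 * n_bodies - 6 - n_bars)


def mcc_internal_dof_mean_field(n_bodies, bar_probabilities):
    """Same count with each fluctuating bar weighted by the probability
    that it is present (hypothetical mean-field variant)."""
    expected_bars = sum(bar_probabilities)
    return max(0.0, 6 * n_bodies - 6 - expected_bars)


# Two bodies joined by five bars keep one internal degree of freedom (a hinge);
# a sixth bar can lock the pair.
hinge = mcc_internal_dof(2, 5)
locked = mcc_internal_dof(2, 6)
```

Because the count is global, it cannot distinguish a network that is over-constrained in one region and floppy in another, which is the limitation the PG and VPG algorithms address.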

  17. Machine learning approaches to evaluate correlation patterns in allosteric signaling: A case study of the PDZ2 domain

    NASA Astrophysics Data System (ADS)

    Botlani, Mohsen; Siddiqui, Ahnaf; Varma, Sameer

    2018-06-01

    Many proteins are regulated by dynamic allostery wherein regulator-induced changes in structure are comparable with thermal fluctuations. Consequently, understanding their mechanisms requires assessment of relationships between and within conformational ensembles of different states. Here we show how machine learning based approaches can be used to simplify this high-dimensional data mining task and also obtain mechanistic insight. In particular, we use these approaches to investigate two fundamental questions in dynamic allostery. First, how do regulators modify inter-site correlations in conformational fluctuations (Cij)? Second, how are regulator-induced shifts in conformational ensembles at two different sites in a protein related to each other? We address these questions in the context of the human protein tyrosine phosphatase 1E's PDZ2 domain, which is a model protein for studying dynamic allostery. We use molecular dynamics to generate conformational ensembles of the PDZ2 domain in both the regulator-bound and regulator-free states. The employed protocol reproduces methyl deuterium order parameters from NMR. Results from unsupervised clustering of Cij combined with flow analyses of weighted graphs of Cij show that regulator binding significantly alters the global signaling network in the protein; however, not by altering the spatial arrangement of strongly interacting amino acid clusters but by modifying the connectivity between clusters. Additionally, we find that regulator-induced shifts in conformational ensembles, which we evaluate by repartitioning ensembles using supervised learning, are, in fact, correlated. This correlation Δij is less extensive compared to Cij, but in contrast to Cij, Δij depends inversely on the distance from the regulator binding site. Assuming that Δij is an indicator of the transduction of the regulatory signal leads to the conclusion that the regulatory signal weakens with distance from the regulatory site. 
    Overall, this work provides new approaches to analyze high-dimensional molecular simulation data and also presents applications that yield new insight into dynamic allostery.

  18. An Effective Antifreeze Protein Predictor with Ensemble Classifiers and Comprehensive Sequence Descriptors.

    PubMed

    Yang, Runtao; Zhang, Chengjin; Gao, Rui; Zhang, Lina

    2015-09-07

    Antifreeze proteins (AFPs) play a pivotal role in the antifreeze effect of overwintering organisms. They have a wide range of applications in numerous fields, such as improving the production of crops and the quality of frozen foods. Accurate identification of AFPs may provide important clues to decipher the underlying mechanisms of AFPs in ice-binding and to facilitate the selection of the most appropriate AFPs for several applications. Based on an ensemble learning technique, this study proposes an AFP identification system called AFP-Ensemble. In this system, random forest classifiers are trained by different training subsets and then aggregated into a consensus classifier by majority voting. The resulting predictor yields a sensitivity of 0.892, a specificity of 0.940, an accuracy of 0.938 and a balanced accuracy of 0.916 on an independent dataset, which are far better than the results obtained by previous methods. These results reveal that AFP-Ensemble is an effective and promising predictor for large-scale determination of AFPs. The detailed feature analysis in this study may give useful insights into the molecular mechanisms of AFP-ice interactions and provide guidance for the related experimental validation. A web server has been designed to implement the proposed method.
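
The aggregation step and the reported balanced accuracy can be illustrated with a short sketch. The vote-combining function is a generic majority vote, not the AFP-Ensemble implementation, and the example labels are hypothetical. The balanced accuracy is taken here as the mean of sensitivity and specificity.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by the most base classifiers.
    Ties go to the label that appears first in the input."""
    return Counter(predictions).most_common(1)[0][0]


def balanced_accuracy(sensitivity, specificity):
    """Mean of sensitivity and specificity."""
    return (sensitivity + specificity) / 2.0


# Hypothetical votes from five base classifiers for one sequence (1 = AFP).
consensus = majority_vote([1, 0, 1, 1, 0])

# Values quoted in the abstract.
ba = balanced_accuracy(0.892, 0.940)
```

The mean of the quoted sensitivity and specificity reproduces the balanced accuracy of 0.916 given in the abstract.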

  19. Equilibrating metal-oxide cluster ensembles for oxidation reactions using oxygen in water

    Treesearch

    Ira A. Weinstock; Elena M. G. Barbuzzi; Michael W. Wemple; Jennifer J. Cowan; Richard S. Reiner; Dan M. Sonnen; Robert A. Heintz; James S. Bond; Craig L. Hill

    2001-01-01

    Although many enzymes can readily and selectively use oxygen in water (the most familiar and attractive of all oxidants and solvents, respectively), the design of synthetic catalysts for selective water-based oxidation processes utilizing molecular oxygen remains a daunting task. Particularly problematic is the fact that oxidation of substrates by O2 involves radical...

  20. Isobaric molecular dynamics version of the generalized replica exchange method (gREM): Liquid–vapor equilibrium

    DOE PAGES

    Malolepsza, Edyta; Secor, Maxim; Keyes, Tom

    2015-09-23

    A prescription for sampling isobaric generalized ensembles with molecular dynamics is presented and applied to the generalized replica exchange method (gREM), which was designed for simulating first-order phase transitions. The properties of the isobaric gREM ensemble are discussed and a study is presented of the liquid-vapor equilibrium of the guest molecules given for gas hydrate formation with the mW water model. As a result, phase diagrams, critical parameters, and a law of corresponding states are obtained.

  1. ms2: A molecular simulation tool for thermodynamic properties, release 3.0

    NASA Astrophysics Data System (ADS)

    Rutkai, Gábor; Köster, Andreas; Guevara-Carrion, Gabriela; Janzen, Tatjana; Schappals, Michael; Glass, Colin W.; Bernreuther, Martin; Wafai, Amer; Stephan, Simon; Kohns, Maximilian; Reiser, Steffen; Deublein, Stephan; Horsch, Martin; Hasse, Hans; Vrabec, Jadran

    2017-12-01

    A new version release (3.0) of the molecular simulation tool ms2 (Deublein et al., 2011; Glass et al., 2014) is presented. Version 3.0 of ms2 features two additional ensembles, i.e. microcanonical (NVE) and isobaric-isoenthalpic (NpH), various Helmholtz energy derivatives in the NVE ensemble, thermodynamic integration as a method for calculating the chemical potential, the osmotic pressure for calculating the activity of solvents, the six Maxwell-Stefan diffusion coefficients of quaternary mixtures, statistics for sampling hydrogen bonds, smooth-particle mesh Ewald summation as well as the ability to carry out molecular dynamics runs for an arbitrary number of state points in a single program execution.

  2. Communication and the emergence of collective behavior in living organisms: a quantum approach.

    PubMed

    Bischof, Marco; Del Giudice, Emilio

    2013-01-01

    Intermolecular interactions within living organisms have been found to occur not as individual independent events but as a part of a collective array of interconnected events. The problem of the emergence of this collective dynamics and of the correlated biocommunication therefore arises. In the present paper we review the proposals given within the paradigm of modern molecular biology and those given by some holistic approaches to biology. In recent times, the collective behavior of ensembles of microscopic units (atoms/molecules) has been addressed in the conceptual framework of Quantum Field Theory. The possibility of producing physical states where all the components of the ensemble move in unison has been recognized. In such cases, electromagnetic fields trapped within the ensemble appear. In the present paper we present a scheme based on Quantum Field Theory where molecules are able to move in phase-correlated unison among them and with a self-produced electromagnetic field. Experimental corroboration of this scheme is presented. Some consequences for future biological developments are discussed.

  3. Communication and the Emergence of Collective Behavior in Living Organisms: A Quantum Approach

    PubMed Central

    Bischof, Marco; Del Giudice, Emilio

    2013-01-01

    Intermolecular interactions within living organisms have been found to occur not as individual independent events but as a part of a collective array of interconnected events. The problem of the emergence of this collective dynamics and of the correlated biocommunication therefore arises. In the present paper we review the proposals given within the paradigm of modern molecular biology and those given by some holistic approaches to biology. In recent times, the collective behavior of ensembles of microscopic units (atoms/molecules) has been addressed in the conceptual framework of Quantum Field Theory. The possibility of producing physical states where all the components of the ensemble move in unison has been recognized. In such cases, electromagnetic fields trapped within the ensemble appear. In the present paper we present a scheme based on Quantum Field Theory where molecules are able to move in phase-correlated unison among them and with a self-produced electromagnetic field. Experimental corroboration of this scheme is presented. Some consequences for future biological developments are discussed. PMID:24288611

  4. Harnessing Reversible Electronic Energy Transfer: From Molecular Dyads to Molecular Machines.

    PubMed

    Denisov, Sergey A; Yu, Shinlin; Pozzo, Jean-Luc; Jonusauskas, Gediminas; McClenaghan, Nathan D

    2016-06-17

    Reversible electronic energy transfer (REET) may be instilled in bi-/multichromophoric molecule-based systems, following photoexcitation, upon judicious structural integration of matched chromophores. This leads to a new set of photophysical properties for the ensemble, which can be fully characterized by steady-state and time-resolved spectroscopic methods. Herein, we take a comprehensive look at progress in the development of this type of supermolecule in the last five years, which has seen systems evolve from covalently tethered dyads to synthetic molecular machines, exemplified by two different pseudorotaxanes. Indeed, REET holds promise in the control of movement in molecular machines, their assembly/disassembly, as well as in charge separation.

  5. Equipartition terms in transition path ensemble: Insights from molecular dynamics simulations of alanine dipeptide.

    PubMed

    Li, Wenjin

    2018-02-28

    Transition path ensemble consists of reactive trajectories and possesses all the information necessary for the understanding of the mechanism and dynamics of important condensed phase processes. However, quantitative description of the properties of the transition path ensemble is far from being established. Here, with numerical calculations on a model system, the equipartition terms defined in thermal equilibrium were for the first time estimated in the transition path ensemble. It was not surprising to observe that the energy was not equally distributed among all the coordinates. However, the energies distributed on a pair of conjugated coordinates remained equal. Higher energies were observed to be distributed on several coordinates, which are highly coupled to the reaction coordinate, while the rest were almost equally distributed. In addition, the ensemble-averaged energy on each coordinate as a function of time was also quantified. These quantitative analyses on energy distributions provided new insights into the transition path ensemble.

  6. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Calabrese, Gabriele; Corfdir, Pierre; Gao, Guanhui

    We demonstrate the self-assembled growth of vertically aligned GaN nanowire ensembles on a flexible Ti foil by plasma-assisted molecular beam epitaxy. The analysis of single nanowires by transmission electron microscopy reveals that they are single crystalline. Low-temperature photoluminescence spectroscopy demonstrates that in comparison to standard GaN nanowires grown on Si, the nanowires prepared on the Ti foil exhibit an equivalent crystalline perfection, a higher density of basal-plane stacking faults, but a reduced density of inversion domain boundaries. The room-temperature photoluminescence spectrum of the nanowire ensemble is not influenced or degraded by the bending of the substrate. The present results pave the way for the fabrication of flexible optoelectronic devices based on GaN nanowires on metal foils.

  7. Generalized ensemble method applied to study systems with strong first order transitions

    DOE PAGES

    Malolepsza, E.; Kim, J.; Keyes, T.

    2015-09-28

    At strong first-order phase transitions, the entropy versus energy or, at constant pressure, enthalpy, exhibits convex behavior, and the statistical temperature curve correspondingly exhibits an S-loop or back-bending. In the canonical and isothermal-isobaric ensembles, with temperature as the control variable, the probability density functions become bimodal with peaks localized outside of the S-loop region. Inside, states are unstable, and as a result simulation of equilibrium phase coexistence becomes impossible. To overcome this problem, a method was proposed by Kim, Keyes and Straub, where optimally designed generalized ensemble sampling was combined with replica exchange, and denoted generalized replica exchange method (gREM). This new technique uses parametrized effective sampling weights that lead to a unimodal energy distribution, transforming unstable states into stable ones. In the present study, the gREM, originally developed as a Monte Carlo algorithm, was implemented to work with molecular dynamics in an isobaric ensemble and coded into LAMMPS, a highly optimized open source molecular simulation package. Lastly, the method is illustrated in a study of the very strong solid/liquid transition in water.

  8. Generalized ensemble method applied to study systems with strong first order transitions

    NASA Astrophysics Data System (ADS)

    Małolepsza, E.; Kim, J.; Keyes, T.

    2015-09-01

    At strong first-order phase transitions, the entropy versus energy or, at constant pressure, enthalpy, exhibits convex behavior, and the statistical temperature curve correspondingly exhibits an S-loop or back-bending. In the canonical and isothermal-isobaric ensembles, with temperature as the control variable, the probability density functions become bimodal with peaks localized outside of the S-loop region. Inside, states are unstable, and as a result simulation of equilibrium phase coexistence becomes impossible. To overcome this problem, a method was proposed by Kim, Keyes and Straub [1], where optimally designed generalized ensemble sampling was combined with replica exchange, and denoted generalized replica exchange method (gREM). This new technique uses parametrized effective sampling weights that lead to a unimodal energy distribution, transforming unstable states into stable ones. In the present study, the gREM, originally developed as a Monte Carlo algorithm, was implemented to work with molecular dynamics in an isobaric ensemble and coded into LAMMPS, a highly optimized open source molecular simulation package. The method is illustrated in a study of the very strong solid/liquid transition in water.

  9. Coherent coupling of a superconducting flux qubit to an electron spin ensemble in diamond.

    PubMed

    Zhu, Xiaobo; Saito, Shiro; Kemp, Alexander; Kakuyanagi, Kosuke; Karimoto, Shin-ichi; Nakano, Hayato; Munro, William J; Tokura, Yasuhiro; Everitt, Mark S; Nemoto, Kae; Kasu, Makoto; Mizuochi, Norikazu; Semba, Kouichi

    2011-10-12

    During the past decade, research into superconducting quantum bits (qubits) based on Josephson junctions has made rapid progress. Many foundational experiments have been performed, and superconducting qubits are now considered one of the most promising systems for quantum information processing. However, the experimentally reported coherence times are likely to be insufficient for future large-scale quantum computation. A natural solution to this problem is a dedicated engineered quantum memory based on atomic and molecular systems. The question of whether coherent quantum coupling is possible between such natural systems and a single macroscopic artificial atom has attracted considerable attention since the first demonstration of macroscopic quantum coherence in Josephson junction circuits. Here we report evidence of coherent strong coupling between a single macroscopic superconducting artificial atom (a flux qubit) and an ensemble of electron spins in the form of nitrogen-vacancy colour centres in diamond. Furthermore, we have observed coherent exchange of a single quantum of energy between a flux qubit and a macroscopic ensemble consisting of about 3 × 10^7 such colour centres. This provides a foundation for future quantum memories and hybrid devices coupling microwave and optical systems.

  10. Conformational and functional analysis of molecular dynamics trajectories by Self-Organising Maps

    PubMed Central

    2011-01-01

    Background Molecular dynamics (MD) simulations are powerful tools to investigate the conformational dynamics of proteins that is often a critical element of their function. Identification of functionally relevant conformations is generally done clustering the large ensemble of structures that are generated. Recently, Self-Organising Maps (SOMs) were reported performing more accurately and providing more consistent results than traditional clustering algorithms in various data mining problems. We present a novel strategy to analyse and compare conformational ensembles of protein domains using a two-level approach that combines SOMs and hierarchical clustering. Results The conformational dynamics of the α-spectrin SH3 protein domain and six single mutants were analysed by MD simulations. The Cα's Cartesian coordinates of conformations sampled in the essential space were used as input data vectors for SOM training, then complete linkage clustering was performed on the SOM prototype vectors. A specific protocol to optimize a SOM for structural ensembles was proposed: the optimal SOM was selected by means of a Taguchi experimental design plan applied to different data sets, and the optimal sampling rate of the MD trajectory was selected. The proposed two-level approach was applied to single trajectories of the SH3 domain independently as well as to groups of them at the same time. The results demonstrated the potential of this approach in the analysis of large ensembles of molecular structures: the possibility of producing a topological mapping of the conformational space in a simple 2D visualisation, as well as of effectively highlighting differences in the conformational dynamics directly related to biological functions. Conclusions The use of a two-level approach combining SOMs and hierarchical clustering for conformational analysis of structural ensembles of proteins was proposed. 
    It can easily be extended to other study cases and to conformational ensembles from other sources. PMID:21569575

  11. Intracellular applications of fluorescence correlation spectroscopy: prospects for neuroscience.

    PubMed

    Kim, Sally A; Schwille, Petra

    2003-10-01

    Based on time-averaging fluctuation analysis of small fluorescent molecular ensembles in equilibrium, fluorescence correlation spectroscopy has recently been applied to investigate processes in the intracellular milieu. The exquisite sensitivity of fluorescence correlation spectroscopy provides access to a multitude of measurement parameters (rates of diffusion, local concentration, states of aggregation and molecular interactions) in real time with fast temporal and high spatial resolution. The introduction of dual-color cross-correlation, imaging, two-photon excitation, and coincidence analysis coupled with fluorescence correlation spectroscopy has expanded the utility of the technique to encompass a wide range of promising applications in living cells that may provide unprecedented insight into understanding the molecular mechanisms of intracellular neurobiological processes.

  12. Ensemble MD simulations restrained via crystallographic data: Accurate structure leads to accurate dynamics

    PubMed Central

    Xue, Yi; Skrynnikov, Nikolai R

    2014-01-01

    Currently, the best existing molecular dynamics (MD) force fields cannot accurately reproduce the global free-energy minimum which realizes the experimental protein structure. As a result, long MD trajectories tend to drift away from the starting coordinates (e.g., crystallographic structures). To address this problem, we have devised a new simulation strategy aimed at protein crystals. An MD simulation of protein crystal is essentially an ensemble simulation involving multiple protein molecules in a crystal unit cell (or a block of unit cells). To ensure that average protein coordinates remain correct during the simulation, we introduced crystallography-based restraints into the MD protocol. Because these restraints are aimed at the ensemble-average structure, they have only minimal impact on conformational dynamics of the individual protein molecules. So long as the average structure remains reasonable, the proteins move in a native-like fashion as dictated by the original force field. To validate this approach, we have used the data from solid-state NMR spectroscopy, which is the orthogonal experimental technique uniquely sensitive to protein local dynamics. The new method has been tested on the well-established model protein, ubiquitin. The ensemble-restrained MD simulations produced lower crystallographic R factors than conventional simulations; they also led to more accurate predictions for crystallographic temperature factors, solid-state chemical shifts, and backbone order parameters. The predictions for 15N R1 relaxation rates are at least as accurate as those obtained from conventional simulations. Taken together, these results suggest that the presented trajectories may be among the most realistic protein MD simulations ever reported. In this context, the ensemble restraints based on high-resolution crystallographic data can be viewed as protein-specific empirical corrections to the standard force fields. PMID:24452989

  13. Charge transfer excitations from exact and approximate ensemble Kohn-Sham theory

    NASA Astrophysics Data System (ADS)

    Gould, Tim; Kronik, Leeor; Pittalis, Stefano

    2018-05-01

    By studying the lowest excitations of an exactly solvable one-dimensional soft-Coulomb molecular model, we show that components of Kohn-Sham ensembles can be used to describe charge transfer processes. Furthermore, we compute the approximate excitation energies obtained by using the exact ensemble densities in the recently formulated ensemble Hartree-exchange theory [T. Gould and S. Pittalis, Phys. Rev. Lett. 119, 243001 (2017)]. Remarkably, our results show that triplet excitations are accurately reproduced across a dissociation curve in all cases tested, even in systems where ground state energies are poor due to strong static correlations. Singlet excitations exhibit larger deviations from exact results but are still reproduced semi-quantitatively.

  14. New technologies for examining the role of neuronal ensembles in drug addiction and fear.

    PubMed

    Cruz, Fabio C; Koya, Eisuke; Guez-Barber, Danielle H; Bossert, Jennifer M; Lupica, Carl R; Shaham, Yavin; Hope, Bruce T

    2013-11-01

    Correlational data suggest that learned associations are encoded within neuronal ensembles. However, it has been difficult to prove that neuronal ensembles mediate learned behaviours because traditional pharmacological and lesion methods, and even newer cell type-specific methods, affect both activated and non-activated neurons. In addition, previous studies on synaptic and molecular alterations induced by learning did not distinguish between behaviourally activated and non-activated neurons. Here, we describe three new approaches--Daun02 inactivation, FACS sorting of activated neurons and Fos-GFP transgenic rats--that have been used to selectively target and study activated neuronal ensembles in models of conditioned drug effects and relapse. We also describe two new tools--Fos-tTA transgenic mice and inactivation of CREB-overexpressing neurons--that have been used to study the role of neuronal ensembles in conditioned fear.

  15. Superconducting molybdenum-rhenium electrodes for single-molecule transport studies

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gaudenzi, R.; Island, J. O.; Bruijckere, J. de

    2015-06-01

    We demonstrate that electronic transport through single molecules or molecular ensembles, commonly based on gold (Au) electrodes, can be extended to superconducting electrodes by combining gold with molybdenum-rhenium (MoRe). This combination induces proximity-effect superconductivity in the gold to temperatures of at least 4.6 K and magnetic fields of 6 T, improving on previously reported aluminum based superconducting nanojunctions. As a proof of concept, we show three-terminal superconductive transport measurements through an individual Fe4 single-molecule magnet.

  16. Steric interactions lead to collective tilting motion in the ribosome during mRNA-tRNA translocation

    NASA Astrophysics Data System (ADS)

    Nguyen, Kien; Whitford, Paul C.

    2016-02-01

    Translocation of mRNA and tRNA through the ribosome is associated with large-scale rearrangements of the head domain in the 30S ribosomal subunit. To elucidate the relationship between 30S head dynamics and mRNA-tRNA displacement, we apply molecular dynamics simulations using an all-atom structure-based model. Here we provide a statistical analysis of 250 spontaneous transitions between the A/P-P/E and P/P-E/E ensembles. Consistent with structural studies, the ribosome samples a chimeric ap/P-pe/E intermediate, where the 30S head is rotated ~18°. It then transiently populates a previously unreported intermediate ensemble, which is characterized by a ~10° tilt of the head. To identify the origins of head tilting, we analyse 781 additional simulations in which specific steric features are perturbed. These calculations show that head tilting may be attributed to specific steric interactions between tRNA and the 30S subunit (PE loop and protein S13). Taken together, this study demonstrates how molecular structure can give rise to large-scale collective rearrangements.

  17. Ensemble docking to difficult targets in early-stage drug discovery: Methodology and application to fibroblast growth factor 23.

    PubMed

    Velazquez, Hector A; Riccardi, Demian; Xiao, Zhousheng; Quarles, Leigh Darryl; Yates, Charless Ryan; Baudry, Jerome; Smith, Jeremy C

    2018-02-01

    Ensemble docking is now commonly used in early-stage in silico drug discovery and can be used to attack difficult problems such as finding lead compounds which can disrupt protein-protein interactions. We give an example of this methodology here, as applied to fibroblast growth factor 23 (FGF23), a protein hormone that is responsible for regulating phosphate homeostasis. The first small-molecule antagonists of FGF23 were recently discovered by combining ensemble docking with extensive experimental target validation data (Science Signaling, 9, 2016, ra113). Here, we provide a detailed account of how ensemble-based high-throughput virtual screening was used to identify the antagonist compounds discovered in reference (Science Signaling, 9, 2016, ra113). Moreover, we perform further calculations, redocking those antagonist compounds identified in reference (Science Signaling, 9, 2016, ra113) that performed well on drug-likeness filters, to predict possible binding regions. These predicted binding modes are rescored with the molecular mechanics Poisson-Boltzmann surface area (MM/PBSA) approach to calculate the most likely binding site. Our findings suggest that the antagonist compounds antagonize FGF23 through the disruption of protein-protein interactions between FGF23 and fibroblast growth factor receptor (FGFR). © 2017 John Wiley & Sons A/S.

  18. DNA origami as biocompatible surface to match single-molecule and ensemble experiments

    PubMed Central

    Gietl, Andreas; Holzmeister, Phil; Grohmann, Dina; Tinnefeld, Philip

    2012-01-01

    Single-molecule experiments on immobilized molecules allow unique insights into the dynamics of molecular machines and enzymes as well as their interactions. The immobilization, however, can invoke perturbation to the activity of biomolecules causing incongruities between single molecule and ensemble measurements. Here we introduce the recently developed DNA origami as a platform to transfer ensemble assays to the immobilized single molecule level without changing the nano-environment of the biomolecules. The idea is a stepwise transfer of common functional assays first to the surface of a DNA origami, which can be checked at the ensemble level, and then to the microscope glass slide for single-molecule inquiry using the DNA origami as a transfer platform. We studied the structural flexibility of a DNA Holliday junction and the TATA-binding protein (TBP)-induced bending of DNA both on freely diffusing molecules and attached to the origami structure by fluorescence resonance energy transfer. This resulted in highly congruent data sets demonstrating that the DNA origami does not influence the functionality of the biomolecule. Single-molecule data collected from surface-immobilized biomolecule-loaded DNA origami are in very good agreement with data from solution measurements supporting the fact that the DNA origami can be used as biocompatible surface in many fluorescence-based measurements. PMID:22523083

  19. Fast Computation of Solvation Free Energies with Molecular Density Functional Theory: Thermodynamic-Ensemble Partial Molar Volume Corrections.

    PubMed

    Sergiievskyi, Volodymyr P; Jeanmairet, Guillaume; Levesque, Maximilien; Borgis, Daniel

    2014-06-05

    Molecular density functional theory (MDFT) offers an efficient implicit-solvent method to estimate molecule solvation free-energies, while conserving a fully molecular representation of the solvent. Even within a second-order approximation for the free-energy functional, the so-called homogeneous reference fluid approximation, we show that the hydration free-energies computed for a data set of 500 organic compounds are of similar quality as those obtained from molecular dynamics free-energy perturbation simulations, with a computer cost reduced by 2-3 orders of magnitude. This requires introducing the proper partial volume correction to transform the results from the grand canonical to the isobaric-isothermal ensemble that is pertinent to experiments. We show that this correction can be extended to 3D-RISM calculations, giving a sound theoretical justification to empirical partial molar volume corrections that have been proposed recently.

  20. Adaptively restrained molecular dynamics in LAMMPS

    NASA Astrophysics Data System (ADS)

    Kant Singh, Krishna; Redon, Stephane

    2017-07-01

    Adaptively restrained molecular dynamics (ARMD) is a recently introduced particle simulation method that switches positional degrees of freedom on and off during simulation in order to speed up calculations. In the NVE ensemble, ARMD allows users to trade between precision and speed while, in the NVT ensemble, it makes it possible to compute statistical averages faster. Despite the conceptual simplicity of the approach, however, integrating it in existing molecular dynamics packages is non-trivial, in particular since implemented potentials should a priori be rewritten to take advantage of frozen particles and achieve a speed-up. In this paper, we present novel algorithms for integrating ARMD in LAMMPS, a popular multi-purpose molecular simulation package. In particular, we demonstrate how to enable ARMD in LAMMPS without having to re-implement all available force fields. The proposed algorithms are assessed on four different benchmarks, and we show that they allow us to speed up simulations by up to one order of magnitude.

  1. Molecular beacon anchored onto a graphene oxide substrate

    NASA Astrophysics Data System (ADS)

    Darbandi, Arash; Datta, Debopam; Patel, Krunal; Lin, Gary; Stroscio, Michael A.; Dutta, Mitra

    2017-09-01

    In this article, we report a graphene oxide-based nanosensor incorporating semiconductor quantum dots linked to DNA-aptamers that functions as a ‘turn-off’ fluorescent nanosensor for detection of low concentrations of analytes. A specific demonstration of this turn-off aptasensor is presented for the case of the detection of mercury (II) ions. In this system, ensembles of aptamer-based quantum-dot sensors are anchored onto graphene oxide (GO) flakes which provide a platform for analyte detection in the vicinity of GO. Herein, the operation of this ensemble-based nanosensor is demonstrated for mercury ions: upon addition of mercury, quenching of the emission intensity from the quantum dots is observed due to resonance energy transfer between the quantum dots and the gold nanoparticle connected via a mercury target aptamer. A key result is that the usually dominant effect of quenching of the quantum dot due to close proximity to the GO can be reduced to negligible levels by using a linker molecule in conjunction with the aptamer-based nanosensor. The effect of ionic concentration of the background matrix on the emission intensity was also investigated. The sensor system is found to be highly selective towards mercury and exhibits a linear behavior (r² > 0.99) in the nanomolar concentration range. The detection limit of the sensor towards mercury with no GO present was found to be 16.5 nM. With GO attached to the molecular beacon via 14-base, 35-base, and 51-base long linker DNA, the detection limit was found to be 38.4 nM, 9.45 nM, and 11.38 nM, respectively.

  2. Disentangling polydispersity in the PCNA−p15PAF complex, a disordered, transient and multivalent macromolecular assembly

    PubMed Central

    Cordeiro, Tiago N.; Chen, Po-chia; De Biasio, Alfredo; Sibille, Nathalie; Blanco, Francisco J.; Hub, Jochen S.; Crehuet, Ramon

    2017-01-01

    The intrinsically disordered p15PAF regulates DNA replication and repair when interacting with the Proliferating Cell Nuclear Antigen (PCNA) sliding clamp. As many interactions between disordered proteins and globular partners involved in signaling and regulation, the complex between p15PAF and trimeric PCNA is of low affinity, forming a transient complex that is difficult to characterize at a structural level due to its inherent polydispersity. We have determined the structure, conformational fluctuations, and relative population of the five species that coexist in solution by combining small-angle X-ray scattering (SAXS) with molecular modelling. By using explicit ensemble descriptions for the individual species, built using integrative approaches and molecular dynamics (MD) simulations, we collectively interpreted multiple SAXS profiles as population-weighted thermodynamic mixtures. The analysis demonstrates that the N-terminus of p15PAF penetrates the PCNA ring and emerges on the back face. This observation substantiates the role of p15PAF as a drag regulating PCNA processivity during DNA repair. Our study reveals the power of ensemble-based approaches to decode structural, dynamic, and thermodynamic information from SAXS data. This strategy paves the way for deciphering the structural bases of flexible, transient and multivalent macromolecular assemblies involved in pivotal biological processes. PMID:28180305
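
    The idea of reading a measured profile as a population-weighted mixture can be written as I_mix(q) = sum_k p_k I_k(q). The sketch below illustrates only that weighted sum, assuming each species profile is already sampled on the same q grid; the name `mixture_profile` and the input layout are hypothetical, and the fitting of populations to data described in the abstract is not reproduced.

```python
def mixture_profile(populations, species_profiles):
    """Population-weighted average of per-species scattering profiles.

    populations      -- non-negative relative populations, one per species
    species_profiles -- list of intensity lists, all on the same q grid
                        (assumed layout, for illustration only)
    """
    if len(populations) != len(species_profiles) or not populations:
        raise ValueError("need one population per species profile")
    if any(p < 0 for p in populations):
        raise ValueError("populations must be non-negative")
    total = sum(populations)
    if total <= 0:
        raise ValueError("at least one population must be positive")
    n_q = len(species_profiles[0])
    if any(len(profile) != n_q for profile in species_profiles):
        raise ValueError("all profiles must share the same q grid")
    fractions = [p / total for p in populations]  # normalize to sum to 1
    return [
        sum(f * profile[j] for f, profile in zip(fractions, species_profiles))
        for j in range(n_q)
    ]
```

    For example, two species with equal populations and profiles [2.0, 4.0] and [4.0, 8.0] give the mixture [3.0, 6.0].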

  3. Coherently coupling distinct spin ensembles through a high critical temperature superconducting resonator

    NASA Astrophysics Data System (ADS)

    Ghirri, Alberto; Bonizzoni, Claudio; Troiani, Filippo; Affronte, Marco

    The problem of coupling remote ensembles of two-level systems through cavity photons is revisited by using molecular spin centers and a high critical temperature superconducting coplanar resonator. By using PyBTM organic radicals, we achieved the strong coupling regime with values of the cooperativity reaching 4300 at 2 K. We show that up to three distinct spin ensembles are simultaneously coupled through the resonator mode. The ensembles are made physically distinguishable by chemically varying the g-factor and by exploiting the inhomogeneities of the applied magnetic field. The coherent mixing of the spin and field modes is demonstrated by the observed multiple anticrossing, along with the simulations performed within the input-output formalism, and quantified by suitable entropic measures.

  4. Recognition of adenosine monophosphate and H2PO4- using zinc ensemble of new hexaphenylbenzene derivative: potential bioprobe and multichannel keypad system.

    PubMed

    Bhalla, Vandana; Vij, Varun; Kumar, Manoj; Sharma, Parduman Raj; Kaur, Tandeep

    2012-02-17

    Zinc ensemble of hexaphenylbenzene derivative 3 exhibits sensitive response toward adenosine monophosphate (AMP) and H2PO4- ions. Further, the application of derivative 3 as a multichannel molecular keypad could be realized in the presence of inputs of Zn2+ ions, H2PO4- ions, and AMP.

  5. Conformational Heterogeneity of Unbound Proteins Enhances Recognition in Protein-Protein Encounters.

    PubMed

    Pallara, Chiara; Rueda, Manuel; Abagyan, Ruben; Fernández-Recio, Juan

    2016-07-12

    To understand cellular processes at the molecular level we need to improve our knowledge of protein-protein interactions, from a structural, mechanistic, and energetic point of view. Current theoretical studies and computational docking simulations show that protein dynamics plays a key role in protein association and support the need for including protein flexibility in modeling protein interactions. Assuming the conformational selection binding mechanism, in which the unbound state can sample bound conformers, one possible strategy to include flexibility in docking predictions would be the use of conformational ensembles originated from unbound protein structures. Here we present an exhaustive computational study about the use of precomputed unbound ensembles in the context of protein docking, performed on a set of 124 cases of the Protein-Protein Docking Benchmark 3.0. Conformational ensembles were generated by conformational optimization and refinement with MODELLER and by short molecular dynamics trajectories with AMBER. We identified those conformers providing optimal binding and investigated the role of protein conformational heterogeneity in protein-protein recognition. Our results show that a restricted conformational refinement can generate conformers with better binding properties and improve docking encounters in medium-flexible cases. For more flexible cases, a more extended conformational sampling based on Normal Mode Analysis was proven helpful. We found that successful conformers provide better energetic complementarity to the docking partners, which is compatible with recent views of binding association. In addition to the mechanistic considerations, these findings could be exploited for practical docking predictions of improved efficiency.

  6. On the statistical equivalence of restrained-ensemble simulations with the maximum entropy method

    PubMed Central

    Roux, Benoît; Weare, Jonathan

    2013-01-01

    An issue of general interest in computer simulations is to incorporate information from experiments into a structural model. An important caveat in pursuing this goal is to avoid corrupting the resulting model with spurious and arbitrary biases. While the problem of biasing thermodynamic ensembles can be formulated rigorously using the maximum entropy method introduced by Jaynes, the approach can be cumbersome in practical applications with the need to determine multiple unknown coefficients iteratively. A popular alternative strategy to incorporate the information from experiments is to rely on restrained-ensemble molecular dynamics simulations. However, the fundamental validity of this computational strategy remains in question. Here, it is demonstrated that the statistical distribution produced by restrained-ensemble simulations is formally consistent with the maximum entropy method of Jaynes. This clarifies the underlying conditions under which restrained-ensemble simulations will yield results that are consistent with the maximum entropy method. PMID:23464140
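
    For a single observable, the maximum entropy correction takes the form of weights w_i proportional to exp(-lambda * f_i), with the coefficient lambda chosen so that the reweighted average matches the experimental value. The sketch below solves for lambda by bisection; it is an illustration of that general form only, not the procedure of the cited paper. The name `maxent_weights`, the search bracket and the tolerance are assumptions.

```python
import math


def maxent_weights(values, target, lo=-50.0, hi=50.0, tol=1e-10):
    """Weights w_i ~ exp(-lam * f_i) whose weighted mean of f equals target.

    values -- observable f_i evaluated on each sampled structure
    target -- experimental average to reproduce
    lo, hi -- assumed bracket for the coefficient lam (illustrative choice)
    """
    if not min(values) < target < max(values):
        raise ValueError("target must lie strictly inside the sampled range")

    def weights(lam):
        # Subtract the smallest exponent argument to avoid overflow.
        shift = min(lam * v for v in values)
        raw = [math.exp(-(lam * v - shift)) for v in values]
        total = sum(raw)
        return [r / total for r in raw]

    def average(lam):
        return sum(w * v for w, v in zip(weights(lam), values))

    # The weighted average decreases monotonically as lam increases,
    # so a simple bisection on lam is sufficient.
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if average(mid) > target:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return weights(0.5 * (lo + hi))
```

    With several observables there is one coefficient per observable and the search becomes multidimensional, which is the iterative determination the abstract describes as cumbersome.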

  7. New technologies for examining neuronal ensembles in drug addiction and fear

    PubMed Central

    Cruz, Fabio C.; Koya, Eisuke; Guez-Barber, Danielle H.; Bossert, Jennifer M.; Lupica, Carl R.; Shaham, Yavin; Hope, Bruce T.

    2015-01-01

    Correlational data suggest that learned associations are encoded within neuronal ensembles. However, it has been difficult to prove that neuronal ensembles mediate learned behaviours because traditional pharmacological and lesion methods, and even newer cell type-specific methods, affect both activated and non-activated neurons. Additionally, previous studies on synaptic and molecular alterations induced by learning did not distinguish between behaviourally activated and non-activated neurons. Here, we describe three new approaches—Daun02 inactivation, FACS sorting of activated neurons and c-fos-GFP transgenic rats — that have been used to selectively target and study activated neuronal ensembles in models of conditioned drug effects and relapse. We also describe two new tools — c-fos-tTA mice and inactivation of CREB-overexpressing neurons — that have been used to study the role of neuronal ensembles in conditioned fear. PMID:24088811

  8. From Reactor to Rheology in LDPE Modeling

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Read, Daniel J.; Das, Chinmay; Auhl, Dietmar

    2008-07-07

    In recent years the association between molecular structure and linear rheology has been established and well-understood through the tube concept and its extensions for well-characterized materials (e.g. McLeish, Adv. Phys. 2002). However, for industrial branched polymeric material at processing conditions this piece of information is missing. A large number of phenomenological models have been developed to describe the nonlinear response of polymers. But none of these models takes into account the underlying molecular structure, leading to a fitting procedure with arbitrary fitting parameters. The goal of applied molecular rheology is a predictive scheme that runs in its entirety from the molecular structure from the reactor to the non-linear rheology of the resin. In our approach, we use a model for the industrial reactor to explicitly generate the molecular structure ensemble of LDPEs (Tobita, J. Polym. Sci. B 2001), which is consistent with the analytical information. We calculate the linear rheology of the LDPE ensemble with the use of a tube model for branched polymers (Das et al., J. Rheol. 2006). We then separate the contribution of the stress decay into a large number of pompom modes (McLeish et al., J. Rheol. 1998 and Inkson et al., J. Rheol. 1999) with the stretch time and the priority variables corresponding to the actual ensemble of molecules involved. This multimode pompom model allows us to predict the nonlinear properties without any fitting parameter. We present and analyze our results in comparison with experimental data on industrial materials.

  9. Accessing protein conformational ensembles using room-temperature X-ray crystallography

    PubMed Central

    Fraser, James S.; van den Bedem, Henry; Samelson, Avi J.; Lang, P. Therese; Holton, James M.; Echols, Nathaniel; Alber, Tom

    2011-01-01

    Modern protein crystal structures are based nearly exclusively on X-ray data collected at cryogenic temperatures (generally 100 K). The cooling process is thought to introduce little bias in the functional interpretation of structural results, because cryogenic temperatures minimally perturb the overall protein backbone fold. In contrast, here we show that flash cooling biases previously hidden structural ensembles in protein crystals. By analyzing available data for 30 different proteins using new computational tools for electron-density sampling, model refinement, and molecular packing analysis, we found that crystal cryocooling remodels the conformational distributions of more than 35% of side chains and eliminates packing defects necessary for functional motions. In the signaling switch protein, H-Ras, an allosteric network consistent with fluctuations detected in solution by NMR was uncovered in the room-temperature, but not the cryogenic, electron-density maps. These results expose a bias in structural databases toward smaller, overpacked, and unrealistically unique models. Monitoring room-temperature conformational ensembles by X-ray crystallography can reveal motions crucial for catalysis, ligand binding, and allosteric regulation. PMID:21918110

  10. Refining Markov state models for conformational dynamics using ensemble-averaged data and time-series trajectories

    NASA Astrophysics Data System (ADS)

    Matsunaga, Y.; Sugita, Y.

    2018-06-01

    A data-driven modeling scheme is proposed for conformational dynamics of biomolecules based on molecular dynamics (MD) simulations and experimental measurements. In this scheme, an initial Markov State Model (MSM) is constructed from MD simulation trajectories, and then, the MSM parameters are refined using experimental measurements through machine learning techniques. The second step can reduce the bias of MD simulation results due to inaccurate force-field parameters. Either time-series trajectories or ensemble-averaged data are available as a training data set in the scheme. Using a coarse-grained model of a dye-labeled polyproline-20, we compare the performance of machine learning estimations from the two types of training data sets. Machine learning from time-series data could provide the equilibrium populations of conformational states as well as their transition probabilities. It estimates hidden conformational states in more robust ways compared to that from ensemble-averaged data although there are limitations in estimating the transition probabilities between minor states. We discuss how to use the machine learning scheme for various experimental measurements including single-molecule time-series trajectories.
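
    The first step of the scheme, building an initial Markov State Model from trajectories, can be illustrated by counting transitions between discrete states and row-normalizing the counts. This is a minimal sketch under assumptions: the trajectory is already discretized into integer state labels, and the names `transition_matrix` and `stationary_distribution` are hypothetical. The refinement against experimental data described in the abstract is not shown.

```python
def transition_matrix(dtraj, n_states, lag=1):
    """Row-normalized transition probabilities estimated at a given lag.

    dtraj -- sequence of integer state labels in range(n_states)
    """
    if lag < 1 or lag >= len(dtraj):
        raise ValueError("lag must be between 1 and len(dtraj) - 1")
    counts = [[0] * n_states for _ in range(n_states)]
    for a, b in zip(dtraj[:-lag], dtraj[lag:]):
        counts[a][b] += 1
    matrix = []
    for i, row in enumerate(counts):
        total = sum(row)
        if total == 0:
            # State with no outgoing counts: use a self-loop so the row sums to 1.
            matrix.append([1.0 if j == i else 0.0 for j in range(n_states)])
        else:
            matrix.append([c / total for c in row])
    return matrix


def stationary_distribution(matrix, iterations=1000):
    """Equilibrium populations by repeated application of the matrix."""
    n = len(matrix)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(pi[i] * matrix[i][j] for i in range(n)) for j in range(n)]
    return pi
```

    The power iteration assumes the chain is connected and aperiodic; production MSM tools use more careful estimators and validation than this sketch.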

  11. The flexible C-terminal arm of the Lassa arenavirus Z-protein mediates interactions with multiple binding partners.

    PubMed

    May, Eric R; Armen, Roger S; Mannan, Aristotle M; Brooks, Charles L

    2010-08-01

    The arenavirus genome encodes for a Z-protein, which contains a RING domain that coordinates two zinc ions, and has been identified as having several functional roles at various stages of the virus life cycle. Z-protein binds to multiple host proteins and has been directly implicated in the promotion of viral budding, repression of mRNA translation, and apoptosis of infected cells. Using homology models of the Z-protein from Lassa strain arenavirus, replica exchange molecular dynamics (MD) was used to refine the structures, which were then subsequently clustered. Population-weighted ensembles of low-energy cluster representatives were predicted based upon optimal agreement of the chemical shifts computed with the SPARTA program with the experimental NMR chemical shifts. A member of the refined ensemble was identified to be a potential binder of budding factor Tsg101 based on its correspondence to the structure of the HIV-1 Gag late domain when bound to Tsg101. Members of these ensembles were docked against the crystal structure of human eIF4E translation initiation factor. Two plausible binding modes emerged based upon their agreement with experimental observation, favorable interaction energies and stability during MD trajectories. Mutations to Z are proposed that would either inhibit both binding mechanisms or selectively inhibit only one mode. The C-terminal domain conformation of the most populated member of the representative ensemble shielded protein-binding recognition motifs for Tsg101 and eIF4E and represents the most populated state free in solution. We propose that C-terminal flexibility is key for mediating the different functional states of the Z-protein. (c) 2010 Wiley-Liss, Inc.

  12. A Factor Graph Approach to Automated GO Annotation

    PubMed Central

    Spetale, Flavio E.; Tapia, Elizabeth; Krsticevic, Flavia; Roda, Fernando; Bulacio, Pilar

    2016-01-01

    As the volume of genomic data grows, computational methods become essential for providing a first glimpse into gene annotations. Automated Gene Ontology (GO) annotation methods based on hierarchical ensemble classification techniques are particularly interesting when interpretability of annotation results is a main concern. In these methods, raw GO-term predictions computed by base binary classifiers are leveraged by checking the consistency of predefined GO relationships. Both formal leveraging strategies, with main focus on annotation precision, and heuristic alternatives, with main focus on scalability issues, have been described in the literature. In this contribution, a factor graph approach to the hierarchical ensemble formulation of the automated GO annotation problem is presented. In this formal framework, a core factor graph is first built based on the GO structure and then enriched to take into account the noisy nature of GO-term predictions. Hence, starting from raw GO-term predictions, an iterative message passing algorithm between nodes of the factor graph is used to compute marginal probabilities of target GO-terms. Evaluations on Saccharomyces cerevisiae, Arabidopsis thaliana and Drosophila melanogaster protein sequences from the GO Molecular Function domain showed significant improvements over competing approaches, even when protein sequences were naively characterized by their physicochemical and secondary structure properties or when loose noisy annotation datasets were considered. Based on these promising results and using Arabidopsis thaliana annotation data, we extend our approach to the identification of most promising molecular function annotations for a set of proteins of unknown function in Solanum lycopersicum. PMID:26771463

  13. A Factor Graph Approach to Automated GO Annotation.

    PubMed

    Spetale, Flavio E; Tapia, Elizabeth; Krsticevic, Flavia; Roda, Fernando; Bulacio, Pilar

    2016-01-01

    As the volume of genomic data grows, computational methods become essential for providing a first glimpse into gene annotations. Automated Gene Ontology (GO) annotation methods based on hierarchical ensemble classification techniques are particularly interesting when interpretability of annotation results is a main concern. In these methods, raw GO-term predictions computed by base binary classifiers are leveraged by checking the consistency of predefined GO relationships. Both formal leveraging strategies, with main focus on annotation precision, and heuristic alternatives, with main focus on scalability issues, have been described in the literature. In this contribution, a factor graph approach to the hierarchical ensemble formulation of the automated GO annotation problem is presented. In this formal framework, a core factor graph is first built based on the GO structure and then enriched to take into account the noisy nature of GO-term predictions. Hence, starting from raw GO-term predictions, an iterative message passing algorithm between nodes of the factor graph is used to compute marginal probabilities of target GO-terms. Evaluations on Saccharomyces cerevisiae, Arabidopsis thaliana and Drosophila melanogaster protein sequences from the GO Molecular Function domain showed significant improvements over competing approaches, even when protein sequences were naively characterized by their physicochemical and secondary structure properties or when loose noisy annotation datasets were considered. Based on these promising results and using Arabidopsis thaliana annotation data, we extend our approach to the identification of most promising molecular function annotations for a set of proteins of unknown function in Solanum lycopersicum.

  14. Ensembles generated from crystal structures of single distant homologues solve challenging molecular-replacement cases in AMPLE.

    PubMed

    Rigden, Daniel J; Thomas, Jens M H; Simkovic, Felix; Simpkin, Adam; Winn, Martyn D; Mayans, Olga; Keegan, Ronan M

    2018-03-01

    Molecular replacement (MR) is the predominant route to solution of the phase problem in macromolecular crystallography. Although routine in many cases, it becomes more effortful and often impossible when the available experimental structures typically used as search models are only distantly homologous to the target. Nevertheless, with current powerful MR software, relatively small core structures shared between the target and known structure, of 20-40% of the overall structure for example, can succeed as search models where they can be isolated. Manual sculpting of such small structural cores is rarely attempted and is dependent on the crystallographer's expertise and understanding of the protein family in question. Automated search-model editing has previously been performed on the basis of sequence alignment, in order to eliminate, for example, side chains or loops that are not present in the target, or on the basis of structural features (e.g. solvent accessibility) or crystallographic parameters (e.g. B factors). Here, based on recent work demonstrating a correlation between evolutionary conservation and protein rigidity/packing, novel automated ways to derive edited search models from a given distant homologue over a range of sizes are presented. A variety of structure-based metrics, many readily obtained from online webservers, can be fed to the MR pipeline AMPLE to produce search models that succeed with a set of test cases where expertly manually edited comparators, further processed in diverse ways with MrBUMP, fail. Further significant performance gains result when the structure-based distance geometry method CONCOORD is used to generate ensembles from the distant homologue. To our knowledge, this is the first such approach whereby a single structure is meaningfully transformed into an ensemble for the purposes of MR. Additional cases further demonstrate the advantages of the approach. 
    CONCOORD is freely available and computationally inexpensive, so these novel methods offer readily available new routes to solve difficult MR cases.

  15. Ensembles generated from crystal structures of single distant homologues solve challenging molecular-replacement cases in AMPLE

    PubMed Central

    Simpkin, Adam; Mayans, Olga; Keegan, Ronan M.

    2018-01-01

    Molecular replacement (MR) is the predominant route to solution of the phase problem in macromolecular crystallography. Although routine in many cases, it becomes more effortful and often impossible when the available experimental structures typically used as search models are only distantly homologous to the target. Nevertheless, with current powerful MR software, relatively small core structures shared between the target and known structure, of 20–40% of the overall structure for example, can succeed as search models where they can be isolated. Manual sculpting of such small structural cores is rarely attempted and is dependent on the crystallographer’s expertise and understanding of the protein family in question. Automated search-model editing has previously been performed on the basis of sequence alignment, in order to eliminate, for example, side chains or loops that are not present in the target, or on the basis of structural features (e.g. solvent accessibility) or crystallographic parameters (e.g. B factors). Here, based on recent work demonstrating a correlation between evolutionary conservation and protein rigidity/packing, novel automated ways to derive edited search models from a given distant homologue over a range of sizes are presented. A variety of structure-based metrics, many readily obtained from online webservers, can be fed to the MR pipeline AMPLE to produce search models that succeed with a set of test cases where expertly manually edited comparators, further processed in diverse ways with MrBUMP, fail. Further significant performance gains result when the structure-based distance geometry method CONCOORD is used to generate ensembles from the distant homologue. To our knowledge, this is the first such approach whereby a single structure is meaningfully transformed into an ensemble for the purposes of MR. Additional cases further demonstrate the advantages of the approach. 
    CONCOORD is freely available and computationally inexpensive, so these novel methods offer readily available new routes to solve difficult MR cases. PMID:29533226

  16. An Improved Ensemble of Random Vector Functional Link Networks Based on Particle Swarm Optimization with Double Optimization Strategy

    PubMed Central

    Ling, Qing-Hua; Song, Yu-Qing; Han, Fei; Yang, Dan; Huang, De-Shuang

    2016-01-01

    For ensemble learning, how to select and combine the candidate classifiers are two key issues which influence the performance of the ensemble system dramatically. Random vector functional link networks (RVFL) without direct input-to-output links is one of suitable base-classifiers for ensemble systems because of its fast learning speed, simple structure and good generalization performance. In this paper, to obtain a more compact ensemble system with improved convergence performance, an improved ensemble of RVFL based on attractive and repulsive particle swarm optimization (ARPSO) with double optimization strategy is proposed. In the proposed method, ARPSO is applied to select and combine the candidate RVFL. As for using ARPSO to select the optimal base RVFL, ARPSO considers both the convergence accuracy on the validation data and the diversity of the candidate ensemble system to build the RVFL ensembles. In the process of combining RVFL, the ensemble weights corresponding to the base RVFL are initialized by the minimum norm least-square method and then further optimized by ARPSO. Finally, a few redundant RVFL is pruned, and thus the more compact ensemble of RVFL is obtained. Moreover, in this paper, theoretical analysis and justification on how to prune the base classifiers on classification problem is presented, and a simple and practically feasible strategy for pruning redundant base classifiers on both classification and regression problems is proposed. Since the double optimization is performed on the basis of the single optimization, the ensemble of RVFL built by the proposed method outperforms that built by some single optimization methods. Experiment results on function approximation and classification problems verify that the proposed method could improve its convergence accuracy as well as reduce the complexity of the ensemble system. PMID:27835638

  17. An Improved Ensemble of Random Vector Functional Link Networks Based on Particle Swarm Optimization with Double Optimization Strategy.

    PubMed

    Ling, Qing-Hua; Song, Yu-Qing; Han, Fei; Yang, Dan; Huang, De-Shuang

    2016-01-01

    For ensemble learning, how to select and combine the candidate classifiers are two key issues which influence the performance of the ensemble system dramatically. Random vector functional link networks (RVFL) without direct input-to-output links is one of suitable base-classifiers for ensemble systems because of its fast learning speed, simple structure and good generalization performance. In this paper, to obtain a more compact ensemble system with improved convergence performance, an improved ensemble of RVFL based on attractive and repulsive particle swarm optimization (ARPSO) with double optimization strategy is proposed. In the proposed method, ARPSO is applied to select and combine the candidate RVFL. As for using ARPSO to select the optimal base RVFL, ARPSO considers both the convergence accuracy on the validation data and the diversity of the candidate ensemble system to build the RVFL ensembles. In the process of combining RVFL, the ensemble weights corresponding to the base RVFL are initialized by the minimum norm least-square method and then further optimized by ARPSO. Finally, a few redundant RVFL is pruned, and thus the more compact ensemble of RVFL is obtained. Moreover, in this paper, theoretical analysis and justification on how to prune the base classifiers on classification problem is presented, and a simple and practically feasible strategy for pruning redundant base classifiers on both classification and regression problems is proposed. Since the double optimization is performed on the basis of the single optimization, the ensemble of RVFL built by the proposed method outperforms that built by some single optimization methods. Experiment results on function approximation and classification problems verify that the proposed method could improve its convergence accuracy as well as reduce the complexity of the ensemble system.

  18. On the predictability of outliers in ensemble forecasts

    NASA Astrophysics Data System (ADS)

    Siegert, S.; Bröcker, J.; Kantz, H.

    2012-03-01

    In numerical weather prediction, ensembles are used to retrieve probabilistic forecasts of future weather conditions. We consider events where the verification is smaller than the smallest, or larger than the largest ensemble member of a scalar ensemble forecast. These events are called outliers. In a statistically consistent K-member ensemble, outliers should occur with a base rate of 2/(K+1). In operational ensembles this base rate tends to be higher. We study the predictability of outlier events in terms of the Brier Skill Score and find that forecast probabilities can be calculated which are more skillful than the unconditional base rate. This is shown analytically for statistically consistent ensembles. Using logistic regression, forecast probabilities for outlier events in an operational ensemble are calculated. These probabilities exhibit positive skill which is quantitatively similar to the analytical results. Possible causes of these results as well as their consequences for ensemble interpretation are discussed.
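    The 2/(K+1) base rate follows from exchangeability: if the verification is statistically indistinguishable from the K members, its rank among the K+1 values is uniform, and two of those K+1 ranks (lowest and highest) are outliers. The sketch below checks this numerically. It assumes, purely for illustration, that members and verification are independent standard normal draws; the function names are hypothetical.

```python
import random


def outlier_base_rate(k):
    """Analytical outlier rate for a statistically consistent k-member ensemble."""
    return 2.0 / (k + 1)


def simulated_outlier_rate(k, trials=20000, seed=0):
    """Monte Carlo estimate: members and verification share one distribution."""
    rng = random.Random(seed)
    outliers = 0
    for _ in range(trials):
        members = [rng.gauss(0.0, 1.0) for _ in range(k)]
        verification = rng.gauss(0.0, 1.0)
        # An outlier falls below the smallest or above the largest member.
        if verification < min(members) or verification > max(members):
            outliers += 1
    return outliers / trials
```

    For a 9-member ensemble the analytical rate is 0.2, and the simulated estimate should land close to it; an operational ensemble with a higher observed rate would indicate a departure from statistical consistency.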

  19. Ensembler: Enabling High-Throughput Molecular Simulations at the Superfamily Scale.

    PubMed

    Parton, Daniel L; Grinaway, Patrick B; Hanson, Sonya M; Beauchamp, Kyle A; Chodera, John D

    2016-06-01

    The rapidly expanding body of available genomic and protein structural data provides a rich resource for understanding protein dynamics with biomolecular simulation. While computational infrastructure has grown rapidly, simulations on an omics scale are not yet widespread, primarily because software infrastructure to enable simulations at this scale has not kept pace. It should now be possible to study protein dynamics across entire (super)families, exploiting both available structural biology data and conformational similarities across homologous proteins. Here, we present a new tool for enabling high-throughput simulation in the genomics era. Ensembler takes any set of sequences-from a single sequence to an entire superfamily-and shepherds them through various stages of modeling and refinement to produce simulation-ready structures. This includes comparative modeling to all relevant PDB structures (which may span multiple conformational states of interest), reconstruction of missing loops, addition of missing atoms, culling of nearly identical structures, assignment of appropriate protonation states, solvation in explicit solvent, and refinement and filtering with molecular simulation to ensure stable simulation. The output of this pipeline is an ensemble of structures ready for subsequent molecular simulations using computer clusters, supercomputers, or distributed computing projects like Folding@home. Ensembler thus automates much of the time-consuming process of preparing protein models suitable for simulation, while allowing scalability up to entire superfamilies. A particular advantage of this approach can be found in the construction of kinetic models of conformational dynamics-such as Markov state models (MSMs)-which benefit from a diverse array of initial configurations that span the accessible conformational states to aid sampling. 
    We demonstrate the power of this approach by constructing models for all catalytic domains in the human tyrosine kinase family, using all available kinase catalytic domain structures from any organism as structural templates. Ensembler is free and open source software licensed under the GNU General Public License (GPL) v2. It is compatible with Linux and OS X. The latest release can be installed via the conda package manager, and the latest source can be downloaded from https://github.com/choderalab/ensembler.

  20. Argumentation Based Joint Learning: A Novel Ensemble Learning Approach

    PubMed Central

    Xu, Junyi; Yao, Li; Li, Le

    2015-01-01

    Recently, ensemble learning methods have been widely used to improve classification performance in machine learning. In this paper, we present a novel ensemble learning method: argumentation based multi-agent joint learning (AMAJL), which integrates ideas from multi-agent argumentation, ensemble learning, and association rule mining. In AMAJL, argumentation technology is introduced as an ensemble strategy to integrate multiple base classifiers and generate a high performance ensemble classifier. We design an argumentation framework named Arena as a communication platform for knowledge integration. Through argumentation based joint learning, high quality individual knowledge can be extracted, and thus a refined global knowledge base can be generated and used independently for classification. We perform numerous experiments on multiple public datasets using AMAJL and other benchmark methods. The results demonstrate that our method can effectively extract high quality knowledge for ensemble classifier and improve the performance of classification. PMID:25966359

  1. Ensemble-based docking: From hit discovery to metabolism and toxicity predictions

    DOE PAGES

    Evangelista, Wilfredo; Weir, Rebecca; Ellingson, Sally; ...

    2016-07-29

    The use of ensemble-based docking for the exploration of biochemical pathways and toxicity prediction of drug candidates is described. We describe the computational engineering work necessary to enable large ensemble docking campaigns on supercomputers. We show examples where ensemble-based docking has significantly increased the number and the diversity of validated drug candidates. Finally, we illustrate how ensemble-based docking can be extended beyond hit discovery and toward providing a structural basis for the prediction of metabolism and off-target binding relevant to pre-clinical and clinical trials.

  2. High strength films from oriented, hydrogen-bonded "graphamid" 2D polymer molecular ensembles.

    PubMed

    Sandoz-Rosado, Emil; Beaudet, Todd D; Andzelm, Jan W; Wetzel, Eric D

    2018-02-27

    The linear polymer poly(p-phenylene terephthalamide), better known by its tradename Kevlar, is an icon of modern materials science due to its remarkable strength, stiffness, and environmental resistance. Here, we propose a new two-dimensional (2D) polymer, "graphamid", that closely resembles Kevlar in chemical structure, but is mechanically advantaged by virtue of its 2D structure. Using atomistic calculations, we show that graphamid comprises covalently-bonded sheets bridged by a high population of strong intermolecular hydrogen bonds. Molecular and micromechanical calculations predict that these strong intermolecular interactions allow stiff, high strength (6-8 GPa), and tough films from ensembles of finite graphamid molecules. In contrast, traditional 2D materials like graphene have weak intermolecular interactions, leading to ensembles of low strength (0.1-0.5 GPa) and brittle fracture behavior. These results suggest that hydrogen-bonded 2D polymers like graphamid would be transformative in enabling scalable, lightweight, high performance polymer films of unprecedented mechanical performance.

  3. Locally Weighted Ensemble Clustering.

    PubMed

    Huang, Dong; Wang, Chang-Dong; Lai, Jian-Huang

    2018-05-01

    Due to its ability to combine multiple base clusterings into a probably better and more robust clustering, the ensemble clustering technique has been attracting increasing attention in recent years. Despite the significant success, one limitation to most of the existing ensemble clustering methods is that they generally treat all base clusterings equally regardless of their reliability, which makes them vulnerable to low-quality base clusterings. Although some efforts have been made to (globally) evaluate and weight the base clusterings, yet these methods tend to view each base clustering as an individual and neglect the local diversity of clusters inside the same base clustering. It remains an open problem how to evaluate the reliability of clusters and exploit the local diversity in the ensemble to enhance the consensus performance, especially, in the case when there is no access to data features or specific assumptions on data distribution. To address this, in this paper, we propose a novel ensemble clustering approach based on ensemble-driven cluster uncertainty estimation and local weighting strategy. In particular, the uncertainty of each cluster is estimated by considering the cluster labels in the entire ensemble via an entropic criterion. A novel ensemble-driven cluster validity measure is introduced, and a locally weighted co-association matrix is presented to serve as a summary for the ensemble of diverse clusters. With the local diversity in ensembles exploited, two novel consensus functions are further proposed. Extensive experiments on a variety of real-world datasets demonstrate the superiority of the proposed approach over the state-of-the-art.
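    A minimal sketch of the idea, under stated assumptions: each cluster's uncertainty is taken as the summed entropy of how its members are split by every base clustering, and that uncertainty is turned into a weight with an exponential decay controlled by a parameter `theta`. The abstract does not give the exact formulas, so the weighting function, `theta`, and all function names here are illustrative and should not be read as the authors' implementation.

```python
from collections import Counter
from math import exp, log2


def cluster_entropy(members, labels):
    """Entropy of how one cluster's members are split by another clustering."""
    counts = Counter(labels[i] for i in members)
    n = len(members)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def weighted_coassociation(base_clusterings, theta=0.5):
    """Co-association matrix in which each cluster votes with its own weight."""
    n = len(base_clusterings[0])
    m = len(base_clusterings)
    matrix = [[0.0] * n for _ in range(n)]
    for labels in base_clusterings:
        clusters = {}
        for i, label in enumerate(labels):
            clusters.setdefault(label, []).append(i)
        for members in clusters.values():
            # Uncertainty of this cluster with respect to the whole ensemble.
            uncertainty = sum(cluster_entropy(members, other)
                              for other in base_clusterings)
            weight = exp(-uncertainty / (theta * m))  # assumed weighting form
            for i in members:
                for j in members:
                    matrix[i][j] += weight / m
    return matrix


# Three hypothetical base clusterings of four objects.
M = weighted_coassociation([[0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 1, 1]])
```

    Here objects 2 and 3 co-occur in every base clustering and receive the largest entry, while the pair (0, 1) is discounted because the third clustering splits it. A consensus function would then cluster the objects using this matrix as a similarity measure.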

  4. Clustering molecular dynamics trajectories for optimizing docking experiments.

    PubMed

    De Paris, Renata; Quevedo, Christian V; Ruiz, Duncan D; Norberto de Souza, Osmar; Barros, Rodrigo C

    2015-01-01

    Molecular dynamics simulations of protein receptors have become an attractive tool for rational drug discovery. However, the high computational cost of employing molecular dynamics trajectories in virtual screening of large repositories threatens the feasibility of this task. Computational intelligence techniques have been applied in this context, with the ultimate goal of reducing the overall computational cost so the task can become feasible. Particularly, clustering algorithms have been widely used as a means to reduce the dimensionality of molecular dynamics trajectories. In this paper, we develop a novel methodology for clustering entire trajectories using structural features from the substrate-binding cavity of the receptor in order to optimize docking experiments on a cloud-based environment. The resulting partition was selected based on three clustering validity criteria, and it was further validated by analyzing the interactions between 20 ligands and a fully flexible receptor (FFR) model containing a 20 ns molecular dynamics simulation trajectory. Our proposed methodology shows that taking into account features of the substrate-binding cavity as input for the k-means algorithm is a promising technique for accurately selecting ensembles of representative structures tailored to a specific ligand.
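    The selection step can be sketched as follows: run k-means on per-frame feature vectors and keep, for each cluster, the frame closest to its centroid. This is a simplified illustration, not the authors' pipeline. The two-dimensional feature vectors stand in for unspecified cavity descriptors, the initial centroids are simply the first k frames (an assumption made to keep the example deterministic), and the function names are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_representatives(frames, k, iterations=50):
    """Return sorted indices of the frame nearest each k-means centroid."""
    centroids = [tuple(f) for f in frames[:k]]  # deterministic initialisation
    assignment = None
    for _ in range(iterations):
        new_assignment = [
            min(range(k), key=lambda c: squared_distance(f, centroids[c]))
            for f in frames
        ]
        if new_assignment == assignment:
            break  # converged: no frame changed cluster
        assignment = new_assignment
        for c in range(k):
            members = [f for f, a in zip(frames, assignment) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(col) / len(members)
                                     for col in zip(*members))
    representatives = []
    for c in range(k):
        indices = [i for i, a in enumerate(assignment) if a == c]
        if indices:
            representatives.append(
                min(indices,
                    key=lambda i: squared_distance(frames[i], centroids[c])))
    return sorted(representatives)


# Six hypothetical frames described by two cavity features each.
frames = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1),
          (5.0, 5.0), (5.1, 4.8), (4.9, 5.2)]
selected = kmeans_representatives(frames, k=2)
```

    Docking would then be run only against the selected frames rather than the full trajectory, which is where the reduction in computational cost comes from.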

  5. simulation of the DNA force-extension curve

    NASA Astrophysics Data System (ADS)

    Shinaberry, Gregory; Mikhaylov, Ivan; Balaeff, Alexander

    A molecular dynamics simulation study of the force-extension curve of double-stranded DNA is presented. Extended simulations of the DNA at multiple points along the force-extension curve are conducted with DNA end-to-end length constrained at each point. The calculated force-extension curve qualitatively reproduces the experimental one. The DNA conformational ensemble at each extension shows that the famous plateau of the force-extension curve results from B-DNA melting, whereas the formation of the earlier-predicted novel DNA conformation called 'zip-DNA' takes place at extensions past the plateau. An extensive analysis of the DNA conformational ensemble in terms of base configuration, backbone configuration, solvent interaction energy, etc., is conducted in order to elucidate the physical origin of DNA elasticity and the main interactions responsible for the shape of the force-extension curve.

  6. Arc expression identifies the lateral amygdala fear memory trace

    PubMed Central

    Gouty-Colomer, L A; Hosseini, B; Marcelo, I M; Schreiber, J; Slump, D E; Yamaguchi, S; Houweling, A R; Jaarsma, D; Elgersma, Y; Kushner, S A

    2016-01-01

    Memories are encoded within sparsely distributed neuronal ensembles. However, the defining cellular properties of neurons within a memory trace remain incompletely understood. Using a fluorescence-based Arc reporter, we were able to visually identify the distinct subset of lateral amygdala (LA) neurons activated during auditory fear conditioning. We found that Arc-expressing neurons have enhanced intrinsic excitability and are preferentially recruited into newly encoded memory traces. Furthermore, synaptic potentiation of thalamic inputs to the LA during fear conditioning is learning-specific, postsynaptically mediated and highly localized to Arc-expressing neurons. Taken together, our findings validate the immediate-early gene Arc as a molecular marker for the LA neuronal ensemble recruited during fear learning. Moreover, these results establish a model of fear memory formation in which intrinsic excitability determines neuronal selection, whereas learning-related encoding is governed by synaptic plasticity. PMID:25802982

  7. Simulation's Ensemble is Better Than Ensemble Simulation

    NASA Astrophysics Data System (ADS)

    Yan, X.

    2017-12-01

    Yan Xiaodong, State Key Laboratory of Earth Surface Processes and Resource Ecology (ESPRE), Beijing Normal University, 19 Xinjiekouwai Street, Haidian District, Beijing 100875, China. Email: yxd@bnu.edu.cn. A dynamical system is simulated from an initial state. However, initial-state data are highly uncertain, which leads to uncertainty in the simulation. Therefore, simulation based on multiple possible initial states has been widely used in atmospheric science and has been shown to lower this uncertainty; it is named simulation's ensemble because the multiple simulation results are fused. In ecology, individual-based model simulation (forest gap models, for example) can be regarded as simulation's ensemble, compared with community-based simulation (most ecosystem models). In this talk, we will address the advantages of individual-based simulation and even their ensembles.

  8. Automatized Assessment of Protective Group Reactivity: A Step Toward Big Reaction Data Analysis.

    PubMed

    Lin, Arkadii I; Madzhidov, Timur I; Klimchuk, Olga; Nugmanov, Ramil I; Antipin, Igor S; Varnek, Alexandre

    2016-11-28

    We report a new method to assess protective groups (PGs) reactivity as a function of reaction conditions (catalyst, solvent) using raw reaction data. It is based on an intuitive similarity principle for chemical reactions: similar reactions proceed under similar conditions. Technically, reaction similarity can be assessed using the Condensed Graph of Reaction (CGR) approach representing an ensemble of reactants and products as a single molecular graph, i.e., as a pseudomolecule for which molecular descriptors or fingerprints can be calculated. CGR-based in-house tools were used to process data for 142,111 catalytic hydrogenation reactions extracted from the Reaxys database. Our results reveal some contradictions with famous Greene's Reactivity Charts based on manual expert analysis. Models developed in this study show high accuracy (ca. 90%) for predicting optimal experimental conditions of protective group deprotection.

  9. GENESIS: a hybrid-parallel and multi-scale molecular dynamics simulator with enhanced sampling algorithms for biomolecular and cellular simulations.

    PubMed

    Jung, Jaewoon; Mori, Takaharu; Kobayashi, Chigusa; Matsunaga, Yasuhiro; Yoda, Takao; Feig, Michael; Sugita, Yuji

    2015-07-01

    GENESIS (Generalized-Ensemble Simulation System) is a new software package for molecular dynamics (MD) simulations of macromolecules. It has two MD simulators, called ATDYN and SPDYN. ATDYN is parallelized based on an atomic decomposition algorithm for the simulations of all-atom force-field models as well as coarse-grained Go-like models. SPDYN is highly parallelized based on a domain decomposition scheme, allowing large-scale MD simulations on supercomputers. Hybrid schemes combining OpenMP and MPI are used in both simulators to target modern multicore computer architectures. Key advantages of GENESIS are (1) the highly parallel performance of SPDYN for very large biological systems consisting of more than one million atoms and (2) the availability of various REMD algorithms (T-REMD, REUS, multi-dimensional REMD for both all-atom and Go-like models under the NVT, NPT, NPAT, and NPγT ensembles). The former is achieved by a combination of the midpoint cell method and the efficient three-dimensional Fast Fourier Transform algorithm, where the domain decomposition space is shared in real-space and reciprocal-space calculations. Other features in SPDYN, such as avoiding concurrent memory access, reducing communication times, and usage of parallel input/output files, also contribute to the performance. We show the REMD simulation results of a mixed (POPC/DMPC) lipid bilayer as a real application using GENESIS. GENESIS is released as free software under the GPLv2 licence and can be easily modified for the development of new algorithms and molecular models. WIREs Comput Mol Sci 2015, 5:310-323. doi: 10.1002/wcms.1220.

  10. Residue-level global and local ensemble-ensemble comparisons of protein domains.

    PubMed

    Clark, Sarah A; Tronrud, Dale E; Karplus, P Andrew

    2015-09-01

    Many methods of protein structure generation such as NMR-based solution structure determination and template-based modeling do not produce a single model, but an ensemble of models consistent with the available information. Current strategies for comparing ensembles lose information because they use only a single representative structure. Here, we describe the ENSEMBLATOR and its novel strategy to directly compare two ensembles containing the same atoms to identify significant global and local backbone differences between them on per-atom and per-residue levels, respectively. The ENSEMBLATOR has four components: eePREP (ee for ensemble-ensemble), which selects atoms common to all models; eeCORE, which identifies atoms belonging to a cutoff-distance dependent common core; eeGLOBAL, which globally superimposes all models using the defined core atoms and calculates for each atom the two intraensemble variations, the interensemble variation, and the closest approach of members of the two ensembles; and eeLOCAL, which performs a local overlay of each dipeptide and, using a novel measure of local backbone similarity, reports the same four variations as eeGLOBAL. The combination of eeGLOBAL and eeLOCAL analyses identifies the most significant differences between ensembles. We illustrate the ENSEMBLATOR's capabilities by showing how using it to analyze NMR ensembles and to compare NMR ensembles with crystal structures provides novel insights compared to published studies. One of these studies leads us to suggest that a "consistency check" of NMR-derived ensembles may be a useful analysis step for NMR-based structure determinations in general. The ENSEMBLATOR 1.0 is available as a first generation tool to carry out ensemble-ensemble comparisons. © 2015 The Protein Society.

  11. Residue-level global and local ensemble-ensemble comparisons of protein domains

    PubMed Central

    Clark, Sarah A; Tronrud, Dale E; Andrew Karplus, P

    2015-01-01

    Many methods of protein structure generation such as NMR-based solution structure determination and template-based modeling do not produce a single model, but an ensemble of models consistent with the available information. Current strategies for comparing ensembles lose information because they use only a single representative structure. Here, we describe the ENSEMBLATOR and its novel strategy to directly compare two ensembles containing the same atoms to identify significant global and local backbone differences between them on per-atom and per-residue levels, respectively. The ENSEMBLATOR has four components: eePREP (ee for ensemble-ensemble), which selects atoms common to all models; eeCORE, which identifies atoms belonging to a cutoff-distance dependent common core; eeGLOBAL, which globally superimposes all models using the defined core atoms and calculates for each atom the two intraensemble variations, the interensemble variation, and the closest approach of members of the two ensembles; and eeLOCAL, which performs a local overlay of each dipeptide and, using a novel measure of local backbone similarity, reports the same four variations as eeGLOBAL. The combination of eeGLOBAL and eeLOCAL analyses identifies the most significant differences between ensembles. We illustrate the ENSEMBLATOR's capabilities by showing how using it to analyze NMR ensembles and to compare NMR ensembles with crystal structures provides novel insights compared to published studies. One of these studies leads us to suggest that a “consistency check” of NMR-derived ensembles may be a useful analysis step for NMR-based structure determinations in general. The ENSEMBLATOR 1.0 is available as a first generation tool to carry out ensemble-ensemble comparisons. PMID:26032515

  12. Forces and stress in second order Møller-Plesset perturbation theory for condensed phase systems within the resolution-of-identity Gaussian and plane waves approach

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Del Ben, Mauro, E-mail: mauro.delben@chem.uzh.ch; Hutter, Jürg, E-mail: hutter@chem.uzh.ch; VandeVondele, Joost, E-mail: Joost.VandeVondele@mat.ethz.ch

    The forces acting on the atoms as well as the stress tensor are crucial ingredients for calculating the structural and dynamical properties of systems in the condensed phase. Here, these derivatives of the total energy are evaluated for the second-order Møller-Plesset perturbation energy (MP2) in the framework of the resolution of identity Gaussian and plane waves method, in a way that is fully consistent with how the total energy is computed. This consistency is non-trivial, given the different ways employed to compute Coulomb, exchange, and canonical four center integrals, and allows, for example, for energy conserving dynamics in various ensembles. Based on this formalism, a massively parallel algorithm has been developed for finite and extended systems. The designed parallel algorithm displays, with respect to the system size, cubic, quartic, and quintic requirements, respectively, for the memory, communication, and computation. All these requirements are reduced with an increasing number of processes, and the measured performance shows excellent parallel scalability and efficiency up to thousands of nodes. Additionally, the computationally more demanding quintic scaling steps can be accelerated by employing graphics processing units (GPUs) showing, for large systems, a gain of almost a factor two compared to the standard central processing unit-only case. In this way, the evaluation of the derivatives of the RI-MP2 energy can be performed within a few minutes for systems containing hundreds of atoms and thousands of basis functions. With good time to solution, the implementation thus opens the possibility to perform molecular dynamics (MD) simulations in various ensembles (microcanonical ensemble and isobaric-isothermal ensemble) at the MP2 level of theory. Geometry optimization, full cell relaxation, and energy conserving MD simulations have been performed for a variety of molecular crystals including NH3, CO2, formic acid, and benzene.

  13. The evolution of replicators.

    PubMed Central

    Szathmáry, E

    2000-01-01

    Replicators of interest in chemistry, biology and culture are briefly surveyed from a conceptual point of view. Systems with limited heredity have only a limited evolutionary potential because the number of available types is too low. Chemical cycles, such as the formose reaction, are holistic replicators since replication is not based on the successive addition of modules. Replicator networks consisting of catalytic molecules (such as reflexively autocatalytic sets of proteins, or reproducing lipid vesicles) are hypothetical ensemble replicators, and their functioning rests on attractors of their dynamics. Ensemble replicators suffer from the paradox of specificity: while their abstract feasibility seems to require a high number of molecular types, the harmful effect of side reactions calls for a small system size. No satisfactory solution to this problem is known. Phenotypic replicators do not pass on their genotypes, only some aspects of the phenotype are transmitted. Phenotypic replicators with limited heredity include genetic membranes, prions and simple memetic systems. Memes in human culture are unlimited hereditary, phenotypic replicators, based on language. The typical path of evolution goes from limited to unlimited heredity, and from attractor-based to modular (digital) replicators. PMID:11127914

  14. The evolution of replicators.

    PubMed

    Szathmáry, E

    2000-11-29

    Replicators of interest in chemistry, biology and culture are briefly surveyed from a conceptual point of view. Systems with limited heredity have only a limited evolutionary potential because the number of available types is too low. Chemical cycles, such as the formose reaction, are holistic replicators since replication is not based on the successive addition of modules. Replicator networks consisting of catalytic molecules (such as reflexively autocatalytic sets of proteins, or reproducing lipid vesicles) are hypothetical ensemble replicators, and their functioning rests on attractors of their dynamics. Ensemble replicators suffer from the paradox of specificity: while their abstract feasibility seems to require a high number of molecular types, the harmful effect of side reactions calls for a small system size. No satisfactory solution to this problem is known. Phenotypic replicators do not pass on their genotypes, only some aspects of the phenotype are transmitted. Phenotypic replicators with limited heredity include genetic membranes, prions and simple memetic systems. Memes in human culture are unlimited hereditary, phenotypic replicators, based on language. The typical path of evolution goes from limited to unlimited heredity, and from attractor-based to modular (digital) replicators.

  15. Stochastic dynamics and mechanosensitivity of myosin II minifilaments

    NASA Astrophysics Data System (ADS)

    Albert, Philipp J.; Erdmann, Thorsten; Schwarz, Ulrich S.

    2014-09-01

    Tissue cells are in a state of permanent mechanical tension that is maintained mainly by myosin II minifilaments, which are bipolar assemblies of tens of myosin II molecular motors contracting actin networks and bundles. Here we introduce a stochastic model for myosin II minifilaments as two small myosin II motor ensembles engaging in a stochastic tug-of-war. Each of the two ensembles is described by the parallel cluster model that allows us to use exact stochastic simulations and at the same time to keep important molecular details of the myosin II cross-bridge cycle. Our simulation and analytical results reveal a strong dependence of myosin II minifilament dynamics on environmental stiffness that is reminiscent of the cellular response to substrate stiffness. For small stiffness, minifilaments form transient crosslinks exerting short spikes of force with negligible mean. For large stiffness, minifilaments form near permanent crosslinks exerting a mean force which hardly depends on environmental elasticity. This functional switch arises because dissociation after the power stroke is suppressed by force (catch bonding) and because ensembles can no longer perform the power stroke at large forces. Symmetric myosin II minifilaments perform a random walk with an effective diffusion constant which decreases with increasing ensemble size, as demonstrated for rigid substrates with an analytical treatment.

  16. Ensemble control of Kondo screening in molecular adsorbates

    DOE PAGES

    Maughan, Bret; Zahl, Percy; Sutter, Peter; ...

    2017-04-06

    Switching the magnetic properties of organic semiconductors on a metal surface has thus far largely been limited to molecule-by-molecule tip-induced transformations in scanned probe experiments. Here we demonstrate with molecular resolution that collective control of activated Kondo screening can be achieved in thin-films of the organic semiconductor titanyl phthalocyanine on Cu(110) to obtain tunable concentrations of Kondo impurities. Using low-temperature scanning tunneling microscopy and spectroscopy, we show that a thermally activated molecular distortion dramatically shifts surface–molecule coupling and enables ensemble-level control of Kondo screening in the interfacial spin system. This is accompanied by the formation of a temperature-dependent Abrikosov–Suhl–Kondo resonance in the local density of states of the activated molecules. This enables coverage-dependent control over activation to the Kondo screening state. Finally, our study thus advances the versatility of molecular switching for Kondo physics and opens new avenues for scalable bottom-up tailoring of the electronic structure and magnetic texture of organic semiconductor interfaces at the nanoscale.

  17. How Accurate Are Transition States from Simulations of Enzymatic Reactions?

    PubMed Central

    2015-01-01

    The rate expression of traditional transition state theory (TST) assumes no recrossing of the transition state (TS) and thermal quasi-equilibrium between the ground state and the TS. Currently, it is not well understood to what extent these assumptions influence the nature of the activated complex obtained in traditional TST-based simulations of processes in the condensed phase in general and in enzymes in particular. Here we scrutinize these assumptions by characterizing the TSs for hydride transfer catalyzed by the enzyme Escherichia coli dihydrofolate reductase obtained using various simulation approaches. Specifically, we compare the TSs obtained with common TST-based methods and a dynamics-based method. Using a recently developed accurate hybrid quantum mechanics/molecular mechanics potential, we find that the TST-based and dynamics-based methods give considerably different TS ensembles. This discrepancy, which could be due to equilibrium solvation effects and the nature of the reaction coordinate employed and its motion, raises major questions about how to interpret the TSs determined by common simulation methods. We conclude that further investigation is needed to characterize the impact of various TST assumptions on the TS phase-space ensemble and on the reaction kinetics. PMID:24860275

  18. Time-course, negative-stain electron microscopy-based analysis for investigating protein-protein interactions at the single-molecule level.

    PubMed

    Nogal, Bartek; Bowman, Charles A; Ward, Andrew B

    2017-11-24

    Several biophysical approaches are available to study protein-protein interactions. Most approaches are conducted in bulk solution, and are therefore limited to an average measurement of the ensemble of molecular interactions. Here, we show how single-particle EM can enrich our understanding of protein-protein interactions at the single-molecule level and potentially capture states that are unobservable with ensemble methods because they are below the limit of detection or not conducted on an appropriate time scale. Using the HIV-1 envelope glycoprotein (Env) and its interaction with receptor CD4-binding site neutralizing antibodies as a model system, we both corroborate ensemble kinetics-derived parameters and demonstrate how time-course EM can further dissect stoichiometric states of complexes that are not readily observable with other methods. Visualization of the kinetics and stoichiometry of Env-antibody complexes demonstrated the applicability of our approach to qualitatively and semi-quantitatively differentiate two highly similar neutralizing antibodies. Furthermore, implementation of machine-learning techniques for sorting class averages of these complexes into discrete subclasses of particles helped reduce human bias. Our data provide proof of concept that single-particle EM can be used to generate a "visual" kinetic profile that should be amenable to studying many other protein-protein interactions, is relatively simple and complementary to well-established biophysical approaches. Moreover, our method provides critical insights into broadly neutralizing antibody recognition of Env, which may inform vaccine immunogen design and immunotherapeutic development. © 2017 by The American Society for Biochemistry and Molecular Biology, Inc.

  19. Ensemble-Biased Metadynamics: A Molecular Simulation Method to Sample Experimental Distributions

    PubMed Central

    Marinelli, Fabrizio; Faraldo-Gómez, José D.

    2015-01-01

    We introduce an enhanced-sampling method for molecular dynamics (MD) simulations referred to as ensemble-biased metadynamics (EBMetaD). The method biases a conventional MD simulation to sample a molecular ensemble that is consistent with one or more probability distributions known a priori, e.g., experimental intramolecular distance distributions obtained by double electron-electron resonance or other spectroscopic techniques. To this end, EBMetaD adds an adaptive biasing potential throughout the simulation that discourages sampling of configurations inconsistent with the target probability distributions. The bias introduced is the minimum necessary to fulfill the target distributions, i.e., EBMetaD satisfies the maximum-entropy principle. Unlike other methods, EBMetaD does not require multiple simulation replicas or the introduction of Lagrange multipliers, and is therefore computationally efficient and straightforward in practice. We demonstrate the performance and accuracy of the method for a model system as well as for spin-labeled T4 lysozyme in explicit water, and show how EBMetaD reproduces three double electron-electron resonance distance distributions concurrently within a few tens of nanoseconds of simulation time. EBMetaD is integrated in the open-source PLUMED plug-in (www.plumed-code.org), and can be therefore readily used with multiple MD engines. PMID:26083917

  20. Reliable oligonucleotide conformational ensemble generation in explicit solvent for force field assessment using reservoir replica exchange molecular dynamics simulations

    PubMed Central

    Henriksen, Niel M.; Roe, Daniel R.; Cheatham, Thomas E.

    2013-01-01

    Molecular dynamics force field development and assessment requires a reliable means for obtaining a well-converged conformational ensemble of a molecule in both a time-efficient and cost-effective manner. This remains a challenge for RNA because its rugged energy landscape results in slow conformational sampling and accurate results typically require explicit solvent which increases computational cost. To address this, we performed both traditional and modified replica exchange molecular dynamics simulations on a test system (alanine dipeptide) and an RNA tetramer known to populate A-form-like conformations in solution (single-stranded rGACC). A key focus is on providing the means to demonstrate that convergence is obtained, for example by investigating replica RMSD profiles and/or detailed ensemble analysis through clustering. We found that traditional replica exchange simulations still require prohibitive time and resource expenditures, even when using GPU accelerated hardware, and our results are not well converged even at 2 microseconds of simulation time per replica. In contrast, a modified version of replica exchange, reservoir replica exchange in explicit solvent, showed much better convergence and proved to be both a cost-effective and reliable alternative to the traditional approach. We expect this method will be attractive for future research that requires quantitative conformational analysis from explicitly solvated simulations. PMID:23477537
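
    The exchange step that underlies temperature replica exchange can be illustrated with a short sketch. It shows only the standard Metropolis swap criterion between two replicas, not the reservoir variant described in the abstract; the function names and the choice of kcal/mol units are assumptions made for illustration.

```python
import math
import random

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K); unit choice is an assumption


def exchange_probability(energy_i, temp_i, energy_j, temp_j):
    """Metropolis acceptance probability for swapping replicas i and j.

    Standard temperature replica exchange criterion:
    p = min(1, exp[(beta_i - beta_j) * (E_i - E_j)]).
    """
    beta_i = 1.0 / (K_B * temp_i)
    beta_j = 1.0 / (K_B * temp_j)
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)


def attempt_exchange(energy_i, temp_i, energy_j, temp_j, rng=random):
    """Return True if the swap is accepted for one random draw."""
    p = exchange_probability(energy_i, temp_i, energy_j, temp_j)
    return rng.random() < p


if __name__ == "__main__":
    # Colder replica (300 K) has the lower energy, so the swap is only sometimes accepted.
    print(exchange_probability(-100.0, 300.0, -90.0, 320.0))
```

    A swap is always accepted when the colder replica currently holds the higher potential energy, which is what lets low-energy configurations migrate toward low temperatures.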

  1. Reliable oligonucleotide conformational ensemble generation in explicit solvent for force field assessment using reservoir replica exchange molecular dynamics simulations.

    PubMed

    Henriksen, Niel M; Roe, Daniel R; Cheatham, Thomas E

    2013-04-18

    Molecular dynamics force field development and assessment requires a reliable means for obtaining a well-converged conformational ensemble of a molecule in both a time-efficient and cost-effective manner. This remains a challenge for RNA because its rugged energy landscape results in slow conformational sampling and accurate results typically require explicit solvent which increases computational cost. To address this, we performed both traditional and modified replica exchange molecular dynamics simulations on a test system (alanine dipeptide) and an RNA tetramer known to populate A-form-like conformations in solution (single-stranded rGACC). A key focus is on providing the means to demonstrate that convergence is obtained, for example, by investigating replica RMSD profiles and/or detailed ensemble analysis through clustering. We found that traditional replica exchange simulations still require prohibitive time and resource expenditures, even when using GPU accelerated hardware, and our results are not well converged even at 2 μs of simulation time per replica. In contrast, a modified version of replica exchange, reservoir replica exchange in explicit solvent, showed much better convergence and proved to be both a cost-effective and reliable alternative to the traditional approach. We expect this method will be attractive for future research that requires quantitative conformational analysis from explicitly solvated simulations.

  2. Guidelines for the analysis of free energy calculations

    PubMed Central

    Klimovich, Pavel V.; Shirts, Michael R.; Mobley, David L.

    2015-01-01

    Free energy calculations based on molecular dynamics (MD) simulations show considerable promise for applications ranging from drug discovery to prediction of physical properties and structure-function studies. But these calculations are still difficult and tedious to analyze, and best practices for analysis are not well defined or propagated. Essentially, each group analyzing these calculations needs to decide how to conduct the analysis and, usually, develop its own analysis tools. Here, we review and recommend best practices for analysis yielding reliable free energies from molecular simulations. Additionally, we provide a Python tool, alchemical-analysis.py, freely available on GitHub at https://github.com/choderalab/pymbar-examples, that implements the analysis practices reviewed here for several reference simulation packages, which can be adapted to handle data from other packages. Both this review and the tool cover analysis of alchemical calculations generally, including free energy estimates via both thermodynamic integration and free energy perturbation-based estimators. Our Python tool also handles output from multiple types of free energy calculations, including expanded ensemble and Hamiltonian replica exchange, as well as standard fixed ensemble calculations. We also survey a range of statistical and graphical ways of assessing the quality of the data and free energy estimates, and provide prototypes of these in our tool. We hope these tools and discussion will serve as a foundation for more standardization of and agreement on best practices for analysis of free energy calculations. PMID:25808134
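
    The two estimator families named in the abstract can be sketched in a few lines. This is a minimal illustration, not the alchemical-analysis.py tool or the pymbar API: the function names, the trapezoidal quadrature, and the kcal/mol units are assumptions chosen for the example.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K); unit choice is an assumption


def ti_trapezoid(lambdas, mean_dudl):
    """Thermodynamic integration: integrate <dU/dlambda> over lambda (trapezoid rule)."""
    total = 0.0
    for k in range(len(lambdas) - 1):
        width = lambdas[k + 1] - lambdas[k]
        total += 0.5 * width * (mean_dudl[k] + mean_dudl[k + 1])
    return total


def exp_estimate(delta_u, temperature):
    """Free energy perturbation (exponential averaging): dG = -kT ln <exp(-dU/kT)>.

    delta_u holds energy differences U_target - U_sampled for each frame.
    A log-sum-exp shift keeps the average numerically stable.
    """
    kt = K_B * temperature
    scaled = [-du / kt for du in delta_u]
    shift = max(scaled)
    mean_exp = sum(math.exp(s - shift) for s in scaled) / len(scaled)
    return -kt * (shift + math.log(mean_exp))


if __name__ == "__main__":
    print(ti_trapezoid([0.0, 0.5, 1.0], [0.0, 1.0, 2.0]))
    print(exp_estimate([1.8, 2.1, 2.0, 1.9], 300.0))
```

    Neither function estimates uncertainties or removes correlated samples, both of which a real analysis would need.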

  3. Long-term ensemble forecast of snowmelt inflow into the Cheboksary Reservoir under two different weather scenarios

    NASA Astrophysics Data System (ADS)

    Gelfan, Alexander; Moreydo, Vsevolod; Motovilov, Yury; Solomatine, Dimitri P.

    2018-04-01

    A long-term forecasting ensemble methodology, applied to water inflows into the Cheboksary Reservoir (Russia), is presented. The methodology is based on a version of the semi-distributed hydrological model ECOMAG (ECOlogical Model for Applied Geophysics) that allows for the calculation of an ensemble of inflow hydrographs using two different sets of weather ensembles for the lead time period: observed weather data, constructed on the basis of the Ensemble Streamflow Prediction methodology (ESP-based forecast), and synthetic weather data, simulated by a multi-site weather generator (WG-based forecast). We have studied the following: (1) whether there is any advantage of the developed ensemble forecasts in comparison with the currently issued operational forecasts of water inflow into the Cheboksary Reservoir, and (2) whether there is any noticeable improvement in probabilistic forecasts when using the WG-simulated ensemble compared to the ESP-based ensemble. We have found that for a 35-year period beginning from the reservoir filling in 1982, both continuous and binary model-based ensemble forecasts (issued in the deterministic form) outperform the operational forecasts of the April-June inflow volume actually used and, additionally, provide acceptable forecasts of additional water regime characteristics besides the inflow volume. We have also demonstrated that the model performance measures (in the verification period) obtained from the WG-based probabilistic forecasts, which are based on a large number of possible weather scenarios, appeared to be more statistically reliable than the corresponding measures calculated from the ESP-based forecasts based on the observed weather scenarios.
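
    The ESP idea of driving one model from a fixed initial state with many weather traces can be sketched as follows. The toy inflow model, its melt factor, and all function names are hypothetical stand-ins; they are not ECOMAG or the authors' weather generator.

```python
def toy_inflow_volume(initial_storage, weather_trace, melt_factor=0.8):
    """Hypothetical stand-in for a hydrological model run over the lead time.

    Returns a seasonal inflow volume from the initial basin state plus a
    fixed fraction of the summed weather forcing.
    """
    return initial_storage + melt_factor * sum(weather_trace)


def esp_ensemble(initial_storage, weather_traces):
    """Run the model once per weather trace, all from the same initial state."""
    return [toy_inflow_volume(initial_storage, trace) for trace in weather_traces]


def exceedance_probability(ensemble, threshold):
    """Empirical probability that the forecast volume exceeds a threshold."""
    return sum(1 for volume in ensemble if volume > threshold) / len(ensemble)


if __name__ == "__main__":
    # Each trace stands for one historical year (ESP) or one generated scenario (WG).
    traces = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [0.0, 0.0]]
    ensemble = esp_ensemble(10.0, traces)
    print(exceedance_probability(ensemble, 15.0))
```

    Swapping the list of historical traces for a larger set of generated traces changes only the input, which is the sense in which the two forecast types share one methodology.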

  4. Chemomimesis and Molecular Darwinism in Action: From Abiotic Generation of Nucleobases to Nucleosides and RNA.

    PubMed

    Saladino, Raffaele; Šponer, Judit E; Šponer, Jiří; Costanzo, Giovanna; Pino, Samanta; Di Mauro, Ernesto

    2018-06-20

    Molecular Darwinian evolution is an intrinsic property of reacting pools of molecules resulting in the adaptation of the system to changing conditions. It has no a priori aim. From the point of view of the origin of life, Darwinian selection behavior, when spontaneously emerging in the ensembles of molecules composing prebiotic pools, initiates subsequent evolution of increasingly complex and innovative chemical information. On the conservation side, it is a posteriori observed that numerous biological processes are based on prebiotically promptly made compounds, as proposed by the concept of Chemomimesis. Molecular Darwinian evolution and Chemomimesis are principles acting in balanced cooperation in the frame of Systems Chemistry. The one-pot synthesis of nucleosides in radical chemistry conditions is possibly a telling example of the operation of these principles. Other indications of similar cases of molecular evolution can be found among biogenic processes.

  5. Mixture models for protein structure ensembles.

    PubMed

    Hirsch, Michael; Habeck, Michael

    2008-10-01

    Protein structure ensembles provide important insight into the dynamics and function of a protein and contain information that is not captured with a single static structure. However, it is not clear a priori to what extent the variability within an ensemble is caused by internal structural changes. Additional variability results from overall translations and rotations of the molecule. And most experimental data do not provide information to relate the structures to a common reference frame. To report meaningful values of intrinsic dynamics, structural precision, conformational entropy, etc., it is therefore important to disentangle local from global conformational heterogeneity. We consider the task of disentangling local from global heterogeneity as an inference problem. We use probabilistic methods to infer from the protein ensemble missing information on reference frames and stable conformational sub-states. To this end, we model a protein ensemble as a mixture of Gaussian probability distributions of either entire conformations or structural segments. We learn these models from a protein ensemble using the expectation-maximization algorithm. Our first model can be used to find multiple conformers in a structure ensemble. The second model partitions the protein chain into locally stable structural segments or core elements and less structured regions typically found in loops. Both models are simple to implement and contain only a single free parameter: the number of conformers or structural segments. Our models can be used to analyse experimental ensembles, molecular dynamics trajectories and conformational change in proteins. The Python source code for protein ensemble analysis is available from the authors upon request.
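
    The expectation-maximization fit of a Gaussian mixture can be illustrated on a one-dimensional toy problem. This sketch is not the authors' code: it ignores superposition and three-dimensional coordinates entirely, and the function names, initial variances, and iteration count are assumptions.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(data, initial_means, n_iter=50, min_var=1e-6):
    """Fit a 1D Gaussian mixture by EM; the number of components is the free parameter."""
    k = len(initial_means)
    means = list(initial_means)
    variances = [1.0] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in data:
            dens = [weights[j] * normal_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means, and variances from responsibilities.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, min_var)  # floor avoids collapse onto one point
    return weights, means, variances


if __name__ == "__main__":
    samples = [-0.2, -0.1, 0.0, 0.1, 0.2, 4.8, 4.9, 5.0, 5.1, 5.2]
    print(fit_gmm_1d(samples, [1.0, 4.0]))
```

    In the toy data the two clusters play the role of two conformers; the fitted weights then estimate their populations.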

  6. Molecular Dynamics Simulations of a Cyclic DP-240 Amylose Fragment in a Periodic Cell: Glass Transition Temperature and Water Diffusion

    USDA-ARS's Scientific Manuscript database

    Molecular dynamics simulations using AMB06C, an in-house carbohydrate force field, (NPT ensembles, 1 atm) were carried out on a periodic cell that contained a cyclic-DP-240 amylose fragment and TIP3P water molecules. Molecular conformation and movement of the amylose fragment and water molecules at ...

  7. Modeling and enhanced sampling of molecular systems with smooth and nonlinear data-driven collective variables

    NASA Astrophysics Data System (ADS)

    Hashemian, Behrooz; Millán, Daniel; Arroyo, Marino

    2013-12-01

    Collective variables (CVs) are low-dimensional representations of the state of a complex system, which help us rationalize molecular conformations and sample free energy landscapes with molecular dynamics simulations. Given their importance, there is need for systematic methods that effectively identify CVs for complex systems. In recent years, nonlinear manifold learning has shown its ability to automatically characterize molecular collective behavior. Unfortunately, these methods fail to provide a differentiable function mapping high-dimensional configurations to their low-dimensional representation, as required in enhanced sampling methods. We introduce a methodology that, starting from an ensemble representative of molecular flexibility, builds smooth and nonlinear data-driven collective variables (SandCV) from the output of nonlinear manifold learning algorithms. We demonstrate the method with a standard benchmark molecule, alanine dipeptide, and show how it can be non-intrusively combined with off-the-shelf enhanced sampling methods, here the adaptive biasing force method. We illustrate how enhanced sampling simulations with SandCV can explore regions that were poorly sampled in the original molecular ensemble. We further explore the transferability of SandCV from a simpler system, alanine dipeptide in vacuum, to a more complex system, alanine dipeptide in explicit water.

  8. Modeling and enhanced sampling of molecular systems with smooth and nonlinear data-driven collective variables.

    PubMed

    Hashemian, Behrooz; Millán, Daniel; Arroyo, Marino

    2013-12-07

    Collective variables (CVs) are low-dimensional representations of the state of a complex system, which help us rationalize molecular conformations and sample free energy landscapes with molecular dynamics simulations. Given their importance, there is need for systematic methods that effectively identify CVs for complex systems. In recent years, nonlinear manifold learning has shown its ability to automatically characterize molecular collective behavior. Unfortunately, these methods fail to provide a differentiable function mapping high-dimensional configurations to their low-dimensional representation, as required in enhanced sampling methods. We introduce a methodology that, starting from an ensemble representative of molecular flexibility, builds smooth and nonlinear data-driven collective variables (SandCV) from the output of nonlinear manifold learning algorithms. We demonstrate the method with a standard benchmark molecule, alanine dipeptide, and show how it can be non-intrusively combined with off-the-shelf enhanced sampling methods, here the adaptive biasing force method. We illustrate how enhanced sampling simulations with SandCV can explore regions that were poorly sampled in the original molecular ensemble. We further explore the transferability of SandCV from a simpler system, alanine dipeptide in vacuum, to a more complex system, alanine dipeptide in explicit water.

  9. Improving the accuracy of protein stability predictions with multistate design using a variety of backbone ensembles.

    PubMed

    Davey, James A; Chica, Roberto A

    2014-05-01

    Multistate computational protein design (MSD) with backbone ensembles approximating conformational flexibility can predict higher quality sequences than single-state design with a single fixed backbone. However, it is currently unclear what characteristics of backbone ensembles are required for the accurate prediction of protein sequence stability. In this study, we aimed to improve the accuracy of protein stability predictions made with MSD by using a variety of backbone ensembles to recapitulate the experimentally measured stability of 85 Streptococcal protein G domain β1 sequences. Ensembles tested here include an NMR ensemble as well as those generated by molecular dynamics (MD) simulations, by Backrub motions, and by PertMin, a new method that we developed involving the perturbation of atomic coordinates followed by energy minimization. MSD with the PertMin ensembles resulted in the most accurate predictions by providing the highest number of stable sequences in the top 25, and by correctly binning sequences as stable or unstable with the highest success rate (≈90%) and the lowest number of false positives. The performance of PertMin ensembles is due to the fact that their members closely resemble the input crystal structure and have low potential energy. Conversely, the NMR ensemble as well as those generated by MD simulations at 500 or 1000 K reduced prediction accuracy due to their low structural similarity to the crystal structure. The ensembles tested herein thus represent on- or off-target models of the native protein fold and could be used in future studies to design for desired properties other than stability. Copyright © 2013 Wiley Periodicals, Inc.

  10. Predicting cancer-relevant proteins using an improved molecular similarity ensemble approach.

    PubMed

    Zhou, Bin; Sun, Qi; Kong, De-Xin

    2016-05-31

    In this study, we proposed an improved algorithm for identifying proteins relevant to cancer. The algorithm was named two-layer molecular similarity ensemble approach (TL-SEA). We applied TL-SEA to analyzing the correlation between anticancer compounds (against cell lines K562, MCF7 and A549) and active compounds against separate target proteins listed in BindingDB. Several associations between cancer types and related proteins were revealed using this chemoinformatics approach. An analysis of the literature showed that 26 of 35 predicted proteins were correlated with cancer cell proliferation, apoptosis or differentiation. Additionally, interactions between proteins in BindingDB and anticancer chemicals were also predicted. We discuss the roles of the most important predicted proteins in cancer biology and conclude that TL-SEA could be a useful tool for inferring novel proteins involved in cancer and revealing underlying molecular mechanisms.

  11. Loss of conformational entropy in protein folding calculated using realistic ensembles and its implications for NMR-based calculations

    PubMed Central

    Baxa, Michael C.; Haddadian, Esmael J.; Jumper, John M.; Freed, Karl F.; Sosnick, Tobin R.

    2014-01-01

    The loss of conformational entropy is a major contribution in the thermodynamics of protein folding. However, accurate determination of the quantity has proven challenging. We calculate this loss using molecular dynamics simulations of both the native protein and a realistic denatured state ensemble. For ubiquitin, the total change in entropy is TΔS_Total = 1.4 kcal⋅mol⁻¹ per residue at 300 K with only 20% from the loss of side-chain entropy. Our analysis exhibits mixed agreement with prior studies because of the use of more accurate ensembles and contributions from correlated motions. Buried side chains lose only a factor of 1.4 in the number of conformations available per rotamer upon folding (Ω_U/Ω_N). The entropy loss for helical and sheet residues differs due to the smaller motions of helical residues (TΔS_helix−sheet = 0.5 kcal⋅mol⁻¹), a property not fully reflected in the amide N-H and carbonyl C=O bond NMR order parameters. The results have implications for the thermodynamics of folding and binding, including estimates of solvent ordering and microscopic entropies obtained from NMR. PMID:25313044
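
    To put the factor of 1.4 in energy units, the Boltzmann relation ΔS = R ln(Ω_U/Ω_N) can be evaluated at 300 K. This is a back-of-the-envelope sketch that treats the factor as a ratio of equally probable states, which is an assumption and not necessarily how the authors computed their entropies; the function name is hypothetical.

```python
import math

R_KCAL = 0.0019872041  # gas constant in kcal/(mol*K)


def entropy_term_from_state_ratio(omega_ratio, temperature):
    """Return T*dS = R*T*ln(omega_ratio) in kcal/mol for equally probable states."""
    return R_KCAL * temperature * math.log(omega_ratio)


if __name__ == "__main__":
    # Ratio of unfolded to native conformations per rotamer, as quoted in the abstract.
    print(round(entropy_term_from_state_ratio(1.4, 300.0), 3))
```

    Under that assumption the factor corresponds to roughly 0.2 kcal/mol at 300 K.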

  12. A Sidekick for Membrane Simulations: Automated Ensemble Molecular Dynamics Simulations of Transmembrane Helices

    PubMed Central

    Hall, Benjamin A; Halim, Khairul Abd; Buyan, Amanda; Emmanouil, Beatrice; Sansom, Mark S P

    2016-01-01

    The interactions of transmembrane (TM) α-helices with the phospholipid membrane and with one another are central to understanding the structure and stability of integral membrane proteins. These interactions may be analysed via coarse-grained molecular dynamics (CGMD) simulations. To obtain statistically meaningful analysis of TM helix interactions, large (N ca. 100) ensembles of CGMD simulations are needed. To facilitate the running and analysis of such ensembles of simulations we have developed Sidekick, an automated pipeline software for performing high throughput CGMD simulations of α-helical peptides in lipid bilayer membranes. Through an end-to-end approach, which takes as input a helix sequence and outputs analytical metrics derived from CGMD simulations, we are able to predict the orientation and likelihood of insertion into a lipid bilayer of a given helix of a family of helix sequences. We illustrate this software via analysis of insertion into a membrane of short hydrophobic TM helices containing a single cationic arginine residue positioned at different positions along the length of the helix. From analysis of these ensembles of simulations we estimate apparent energy barriers to insertion which are comparable to experimentally determined values. In a second application we use CGMD simulations to examine self-assembly of dimers of TM helices from the ErbB1 receptor tyrosine kinase, and analyse the numbers of simulation repeats necessary to obtain convergence of simple descriptors of the mode of packing of the two helices within a dimer. Our approach offers a proof-of-principle platform for the further employment of automation in large ensemble CGMD simulations of membrane proteins. PMID:26580541

  13. Sidekick for Membrane Simulations: Automated Ensemble Molecular Dynamics Simulations of Transmembrane Helices.

    PubMed

    Hall, Benjamin A; Halim, Khairul Bariyyah Abd; Buyan, Amanda; Emmanouil, Beatrice; Sansom, Mark S P

    2014-05-13

    The interactions of transmembrane (TM) α-helices with the phospholipid membrane and with one another are central to understanding the structure and stability of integral membrane proteins. These interactions may be analyzed via coarse grained molecular dynamics (CGMD) simulations. To obtain statistically meaningful analysis of TM helix interactions, large (N ca. 100) ensembles of CGMD simulations are needed. To facilitate the running and analysis of such ensembles of simulations, we have developed Sidekick, an automated pipeline software for performing high throughput CGMD simulations of α-helical peptides in lipid bilayer membranes. Through an end-to-end approach, which takes as input a helix sequence and outputs analytical metrics derived from CGMD simulations, we are able to predict the orientation and likelihood of insertion into a lipid bilayer of a given helix of a family of helix sequences. We illustrate this software via analyses of insertion into a membrane of short hydrophobic TM helices containing a single cationic arginine residue positioned at different positions along the length of the helix. From analyses of these ensembles of simulations, we estimate apparent energy barriers to insertion which are comparable to experimentally determined values. In a second application, we use CGMD simulations to examine the self-assembly of dimers of TM helices from the ErbB1 receptor tyrosine kinase and analyze the numbers of simulation repeats necessary to obtain convergence of simple descriptors of the mode of packing of the two helices within a dimer. Our approach offers a proof-of-principle platform for the further employment of automation in large ensemble CGMD simulations of membrane proteins.

  14. Deciphering neuronal population codes for acute thermal pain

    NASA Astrophysics Data System (ADS)

    Chen, Zhe; Zhang, Qiaosheng; Phuong Sieu Tong, Ai; Manders, Toby R.; Wang, Jing

    2017-06-01

    Objective. Pain is defined as an unpleasant sensory and emotional experience associated with actual or potential tissue damage, or described in terms of such damage. Current pain research mostly focuses on molecular and synaptic changes at the spinal and peripheral levels. However, a complete understanding of pain mechanisms requires the physiological study of the neocortex. Our goal is to apply a neural decoding approach to read out the onset of acute thermal pain signals, which can be used for brain-machine interface. Approach. We used micro wire arrays to record ensemble neuronal activities from the primary somatosensory cortex (S1) and anterior cingulate cortex (ACC) in freely behaving rats. We further investigated neural codes for acute thermal pain at both single-cell and population levels. To detect the onset of acute thermal pain signals, we developed a novel latent state-space framework to decipher the sorted or unsorted S1 and ACC ensemble spike activities, which reveal information about the onset of pain signals. Main results. The state space analysis allows us to uncover a latent state process that drives the observed ensemble spike activity, and to further detect the ‘neuronal threshold’ for acute thermal pain on a single-trial basis. Our method achieved good detection performance in sensitivity and specificity. In addition, our results suggested that an optimal strategy for detecting the onset of acute thermal pain signals may be based on combined evidence from S1 and ACC population codes. Significance. Our study is the first to detect the onset of acute pain signals based on neuronal ensemble spike activity. It is important from a mechanistic viewpoint as it relates to the significance of S1 and ACC activities in the regulation of the acute pain onset.

  15. Triazole-based Zn²⁺-specific molecular marker for fluorescence bioimaging.

    PubMed

    Sinha, Sougata; Mukherjee, Trinetra; Mathew, Jomon; Mukhopadhyay, Subhra K; Ghosh, Subrata

    2014-04-25

    Fluorescence bioimaging potential, both in vitro and in vivo, of a yellow emissive triazole-based molecular marker has been investigated and demonstrated. Three different kinds of cells, viz Bacillus thuringiensis, Candida albicans, and Tecoma stans pollen grains, were used to investigate the intracellular zinc imaging potential of 1 (in vitro studies). Fluorescence imaging of translocation of zinc through the stem of a small herb, Peperomia pellucida, which has a transparent stem, proved the in vivo bioimaging capability of 1. This approach will enable screening of the cell permeability and biostability of a newly developed probe. Similarly, the current method for detection and localization of zinc in Gram seed sprouts could be an easy and potential alternative to the existing analytical methods to investigate the efficiency of various strategies applied for increasing zinc content in cereal crops. The probe-zinc ensemble has efficiently been applied for detecting phosphate-based biomolecules. Copyright © 2014 Elsevier B.V. All rights reserved.

  16. Genetic programming based ensemble system for microarray data classification.

    PubMed

    Liu, Kun-Hong; Tong, Muchenxuan; Xie, Shu-Tong; Yee Ng, Vincent To

    2015-01-01

    Recently, more and more machine learning techniques have been applied to microarray data analysis. The aim of this study is to propose a genetic programming (GP) based new ensemble system (named GPES), which can be used to effectively classify different types of cancers. Decision trees are deployed as base classifiers in this ensemble framework with three operators: Min, Max, and Average. Each individual of the GP is an ensemble system, and they become more and more accurate in the evolutionary process. The feature selection technique and balanced subsampling technique are applied to increase the diversity in each ensemble system. The final ensemble committee is selected by a forward search algorithm, which is shown to be capable of fitting data automatically. The performance of GPES is evaluated using five binary class and six multiclass microarray datasets, and results show that the algorithm can achieve better results in most cases compared with some other ensemble systems. By using elaborate base classifiers or applying other sampling techniques, the performance of GPES may be further improved.
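
    The three combination operators named in the abstract (Min, Max, and Average) can be illustrated with a minimal sketch. This is not the GPES implementation: the score matrix, the `fuse` helper, and the choice of fusing per-class scores are assumptions made only to show how the operators can lead to different decisions.

```python
from statistics import mean

# Hypothetical per-class scores from three base classifiers for one sample.
# Each inner list holds one classifier's scores for classes 0, 1 and 2.
base_scores = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.5, 0.4, 0.1],
]

# The three operators mentioned in the abstract, applied class by class.
OPERATORS = {"min": min, "max": max, "average": mean}


def fuse(scores, operator):
    """Combine base-classifier scores per class; return (label, fused scores)."""
    combine = OPERATORS[operator]
    # zip(*scores) regroups the matrix so each column holds one class.
    fused = [combine(column) for column in zip(*scores)]
    # The predicted label is the class with the largest fused score.
    label = max(range(len(fused)), key=fused.__getitem__)
    return label, fused


for name in OPERATORS:
    label, fused = fuse(base_scores, name)
    print(name, label, [round(v, 3) for v in fused])
```

    On this toy input the Min operator selects class 1, whereas Max and Average both select class 0, which is one reason the choice of operator matters for ensemble diversity.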

  17. Genetic Programming Based Ensemble System for Microarray Data Classification

    PubMed Central

    Liu, Kun-Hong; Tong, Muchenxuan; Xie, Shu-Tong; Yee Ng, Vincent To

    2015-01-01

    Recently, more and more machine learning techniques have been applied to microarray data analysis. The aim of this study is to propose a genetic programming (GP) based new ensemble system (named GPES), which can be used to effectively classify different types of cancers. Decision trees are deployed as base classifiers in this ensemble framework with three operators: Min, Max, and Average. Each individual of the GP is an ensemble system, and they become more and more accurate in the evolutionary process. The feature selection technique and balanced subsampling technique are applied to increase the diversity in each ensemble system. The final ensemble committee is selected by a forward search algorithm, which is shown to be capable of fitting data automatically. The performance of GPES is evaluated using five binary class and six multiclass microarray datasets, and results show that the algorithm can achieve better results in most cases compared with some other ensemble systems. By using elaborate base classifiers or applying other sampling techniques, the performance of GPES may be further improved. PMID:25810748

  18. Minimalist ensemble algorithms for genome-wide protein localization prediction.

    PubMed

    Lin, Jhih-Rong; Mondal, Ananda Mohan; Liu, Rong; Hu, Jianjun

    2012-07-03

    Computational prediction of protein subcellular localization can greatly help to elucidate its functions. Despite the existence of dozens of protein localization prediction algorithms, the prediction accuracy and coverage are still low. Several ensemble algorithms have been proposed to improve the prediction performance, which usually include as many as 10 or more individual localization algorithms. However, their performance is still limited by the running complexity and redundancy among individual prediction algorithms. This paper proposed a novel method for rational design of minimalist ensemble algorithms for practical genome-wide protein subcellular localization prediction. The algorithm is based on combining a feature selection based filter and a logistic regression classifier. Using a novel concept of contribution scores, we analyzed issues of algorithm redundancy, consensus mistakes, and algorithm complementarity in designing ensemble algorithms. We applied the proposed minimalist logistic regression (LR) ensemble algorithm to two genome-wide datasets of Yeast and Human and compared its performance with current ensemble algorithms. Experimental results showed that the minimalist ensemble algorithm can achieve high prediction accuracy with only 1/3 to 1/2 of individual predictors of current ensemble algorithms, which greatly reduces computational complexity and running time. It was found that the high performance ensemble algorithms are usually composed of the predictors that together cover most of available features. Compared to the best individual predictor, our ensemble algorithm improved the prediction accuracy from AUC score of 0.558 to 0.707 for the Yeast dataset and from 0.628 to 0.646 for the Human dataset. Compared with popular weighted voting based ensemble algorithms, our classifier-based ensemble algorithms achieved much better performance without suffering from inclusion of too many individual predictors. 
    We proposed a method for rational design of minimalist ensemble algorithms using feature selection and classifiers. The proposed minimalist ensemble algorithm based on logistic regression can achieve equal or better prediction performance while using only half or one-third of individual predictors compared to other ensemble algorithms. The results also suggested that meta-predictors that take advantage of a variety of features by combining individual predictors tend to achieve the best performance. The LR ensemble server and related benchmark datasets are available at http://mleg.cse.sc.edu/LRensemble/cgi-bin/predict.cgi.

  19. Minimalist ensemble algorithms for genome-wide protein localization prediction

    PubMed Central

    2012-01-01

    Background Computational prediction of protein subcellular localization can greatly help to elucidate its functions. Despite the existence of dozens of protein localization prediction algorithms, the prediction accuracy and coverage are still low. Several ensemble algorithms have been proposed to improve the prediction performance, which usually include as many as 10 or more individual localization algorithms. However, their performance is still limited by the running complexity and redundancy among individual prediction algorithms. Results This paper proposed a novel method for rational design of minimalist ensemble algorithms for practical genome-wide protein subcellular localization prediction. The algorithm is based on combining a feature selection based filter and a logistic regression classifier. Using a novel concept of contribution scores, we analyzed issues of algorithm redundancy, consensus mistakes, and algorithm complementarity in designing ensemble algorithms. We applied the proposed minimalist logistic regression (LR) ensemble algorithm to two genome-wide datasets of Yeast and Human and compared its performance with current ensemble algorithms. Experimental results showed that the minimalist ensemble algorithm can achieve high prediction accuracy with only 1/3 to 1/2 of individual predictors of current ensemble algorithms, which greatly reduces computational complexity and running time. It was found that the high performance ensemble algorithms are usually composed of the predictors that together cover most of available features. Compared to the best individual predictor, our ensemble algorithm improved the prediction accuracy from AUC score of 0.558 to 0.707 for the Yeast dataset and from 0.628 to 0.646 for the Human dataset. Compared with popular weighted voting based ensemble algorithms, our classifier-based ensemble algorithms achieved much better performance without suffering from inclusion of too many individual predictors. 
    Conclusions We proposed a method for rational design of minimalist ensemble algorithms using feature selection and classifiers. The proposed minimalist ensemble algorithm based on logistic regression can achieve equal or better prediction performance while using only half or one-third of individual predictors compared to other ensemble algorithms. The results also suggested that meta-predictors that take advantage of a variety of features by combining individual predictors tend to achieve the best performance. The LR ensemble server and related benchmark datasets are available at http://mleg.cse.sc.edu/LRensemble/cgi-bin/predict.cgi. PMID:22759391

  20. Transport Phenomena of Water in Molecular Fluidic Channels

    PubMed Central

    Vo, Truong Quoc; Kim, BoHung

    2016-01-01

    In molecular-level fluidic transport, where the discrete characteristics of a molecular system are not negligible (in contrast to a continuum description), the response of the molecular water system might still be similar to the continuum description if the time and ensemble averages satisfy the ergodic hypothesis and the scale of the average is enough to recover the classical thermodynamic properties. However, even in such cases, the continuum description breaks down on the material interfaces. In short, molecular-level liquid flows exhibit substantially different physics from classical fluid transport theories because of (i) the interface/surface force field, (ii) thermal/velocity slip, (iii) the discreteness of fluid molecules at the interface and (iv) local viscosity. Therefore, in this study, we present the result of our investigations using molecular dynamics (MD) simulations with continuum-based energy equations and check the validity and limitations of the continuum hypothesis. Our study shows that when the continuum description is subjected to the proper treatment of the interface effects via modified boundary conditions, the so-called continuum-based modified-analytical solutions, they can adequately predict nanoscale fluid transport phenomena. The findings in this work have broad effects in overcoming current limitations in modeling/predicting the fluid behaviors of molecular fluidic devices. PMID:27650138

  1. Ensemble-based docking: From hit discovery to metabolism and toxicity predictions.

    PubMed

    Evangelista, Wilfredo; Weir, Rebecca L; Ellingson, Sally R; Harris, Jason B; Kapoor, Karan; Smith, Jeremy C; Baudry, Jerome

    2016-10-15

    This paper describes and illustrates the use of ensemble-based docking, i.e., using a collection of protein structures in docking calculations for hit discovery, the exploration of biochemical pathways and toxicity prediction of drug candidates. We describe the computational engineering work necessary to enable large ensemble docking campaigns on supercomputers. We show examples where ensemble-based docking has significantly increased the number and the diversity of validated drug candidates. Finally, we illustrate how ensemble-based docking can be extended beyond hit discovery and toward providing a structural basis for the prediction of metabolism and off-target binding relevant to pre-clinical and clinical trials. Copyright © 2016 Elsevier Ltd. All rights reserved.

  2. Historeceptomic Fingerprints for Drug-Like Compounds.

    PubMed

    Shmelkov, Evgeny; Grigoryan, Arsen; Swetnam, James; Xin, Junyang; Tivon, Doreen; Shmelkov, Sergey V; Cardozo, Timothy

    2015-01-01

    Most drugs exert their beneficial and adverse effects through their combined action on several different molecular targets (polypharmacology). The true molecular fingerprint of the direct action of a drug has two components: the ensemble of all the receptors upon which a drug acts and their level of expression in organs/tissues. Conversely, the fingerprint of the adverse effects of a drug may derive from its action in bystander tissues. The ensemble of targets is almost always only partially known. Here we describe an approach improving upon and integrating both components: in silico identification of a more comprehensive ensemble of targets for any drug weighted by the expression of those receptors in relevant tissues. Our system combines more than 300,000 experimentally determined bioactivity values from the ChEMBL database and 4.2 billion molecular docking scores. We integrated these scores with gene expression data for human receptors across a panel of human tissues to produce drug-specific tissue-receptor (historeceptomics) scores. A statistical model was designed to identify significant scores, which define an improved fingerprint representing the unique activity of any drug. These multi-dimensional historeceptomic fingerprints describe, in a novel, intuitive, and easy to interpret style, the holistic, in vivo picture of the mechanism of any drug's action. Valuable applications in drug discovery and personalized medicine, including the identification of molecular signatures for drugs with polypharmacologic modes of action, detection of tissue-specific adverse effects of drugs, matching molecular signatures of a disease to drugs, target identification for bioactive compounds with unknown receptors, and hypothesis generation for drug/compound phenotypes may be enabled by this approach. The system has been deployed at drugable.org for access through a user-friendly web site.

  3. Toward canonical ensemble distribution from self-guided Langevin dynamics simulation

    NASA Astrophysics Data System (ADS)

    Wu, Xiongwu; Brooks, Bernard R.

    2011-04-01

    This work derives a quantitative description of the conformational distribution in self-guided Langevin dynamics (SGLD) simulations. SGLD simulations employ guiding forces calculated from local average momentums to enhance low-frequency motion. This enhancement in low-frequency motion dramatically accelerates conformational search efficiency, but also induces certain perturbations in conformational distribution. Through the local averaging, we separate properties of molecular systems into low-frequency and high-frequency portions. The guiding force effect on the conformational distribution is quantitatively described using these low-frequency and high-frequency properties. This quantitative relation provides a way to convert between a canonical ensemble and a self-guided ensemble. Using example systems, we demonstrated how to utilize the relation to obtain canonical ensemble properties and conformational distributions from SGLD simulations. This development makes SGLD not only an efficient approach for conformational searching, but also an accurate means for conformational sampling.
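
    The separation into low-frequency and high-frequency portions by local averaging can be sketched as follows. This is an illustration under stated assumptions, not the SGLD code: the exponentially weighted running average, the function names, and the test signal are all hypothetical choices, and the actual guiding-force expressions are not reproduced here.

```python
import math


def local_average(values, dt, t_local):
    """Running local average with averaging time t_local (assumed EMA form)."""
    weight = dt / t_local
    current = values[0]  # start the average at the first sample
    averaged = []
    for v in values:
        # Blend the previous average with the newest sample.
        current = (1.0 - weight) * current + weight * v
        averaged.append(current)
    return averaged


def split_frequencies(values, dt, t_local):
    """Return (low, high) where low is the local average and high the remainder."""
    low = local_average(values, dt, t_local)
    high = [v - l for v, l in zip(values, low)]
    return low, high


# Hypothetical signal: one slow oscillation plus one fast oscillation.
dt = 0.01
times = [i * dt for i in range(2000)]
signal = [
    math.sin(2 * math.pi * 0.1 * t) + 0.3 * math.sin(2 * math.pi * 20.0 * t)
    for t in times
]

low, high = split_frequencies(signal, dt, t_local=0.2)
print(round(max(low), 3), round(max(high), 3))
```

    By construction the two portions sum back to the original signal at every step; a longer `t_local` pushes more of the motion into the high-frequency portion.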

  4. Charge Transport and Phase Behavior of Imidazolium-Based Ionic Liquid Crystals from Fully Atomistic Simulations.

    PubMed

    Quevillon, Michael J; Whitmer, Jonathan K

    2018-01-02

    Ionic liquid crystals occupy an intriguing middle ground between room-temperature ionic liquids and mesostructured liquid crystals. Here, we examine a non-polarizable, fully atomistic model of the 1-alkyl-3-methylimidazolium nitrate family using molecular dynamics in the constant pressure-constant temperature ensemble. These materials exhibit a distinct "smectic" liquid phase, characterized by layers formed by the molecules, which separate the ionic and aliphatic moieties. In particular, we discuss the implications this layering may have for electrolyte applications.

  5. Molecular-Scale Electronics: From Concept to Function.

    PubMed

    Xiang, Dong; Wang, Xiaolong; Jia, Chuancheng; Lee, Takhee; Guo, Xuefeng

    2016-04-13

    Creating functional electrical circuits using individual or ensemble molecules, often termed "molecular-scale electronics", not only meets the increasing technical demands of the miniaturization of traditional Si-based electronic devices, but also provides an ideal window for exploring the intrinsic properties of materials at the molecular level. This Review covers the major advances with the most general applicability and emphasizes new insights into the development of efficient platform methodologies for building reliable molecular electronic devices with desired functionalities through the combination of programmed bottom-up self-assembly and sophisticated top-down device fabrication. First, we summarize a number of different approaches to forming molecular-scale junctions and discuss various experimental techniques for examining these nanoscale circuits in detail. We then give a full introduction to characterization techniques and theoretical simulations for molecular electronics. Third, we highlight the major contributions and new concepts of integrating molecular functionalities into electrical circuits. Finally, we provide a critical discussion of limitations and main challenges that still exist for the development of molecular electronics. These analyses should be valuable for deeply understanding charge transport through molecular junctions, the device fabrication process, and the roadmap for future practical molecular electronics.

  6. GENESIS: a hybrid-parallel and multi-scale molecular dynamics simulator with enhanced sampling algorithms for biomolecular and cellular simulations

    PubMed Central

    Jung, Jaewoon; Mori, Takaharu; Kobayashi, Chigusa; Matsunaga, Yasuhiro; Yoda, Takao; Feig, Michael; Sugita, Yuji

    2015-01-01

    GENESIS (Generalized-Ensemble Simulation System) is a new software package for molecular dynamics (MD) simulations of macromolecules. It has two MD simulators, called ATDYN and SPDYN. ATDYN is parallelized based on an atomic decomposition algorithm for the simulations of all-atom force-field models as well as coarse-grained Go-like models. SPDYN is highly parallelized based on a domain decomposition scheme, allowing large-scale MD simulations on supercomputers. Hybrid schemes combining OpenMP and MPI are used in both simulators to target modern multicore computer architectures. Key advantages of GENESIS are (1) the highly parallel performance of SPDYN for very large biological systems consisting of more than one million atoms and (2) the availability of various REMD algorithms (T-REMD, REUS, multi-dimensional REMD for both all-atom and Go-like models under the NVT, NPT, NPAT, and NPγT ensembles). The former is achieved by a combination of the midpoint cell method and the efficient three-dimensional Fast Fourier Transform algorithm, where the domain decomposition space is shared in real-space and reciprocal-space calculations. Other features in SPDYN, such as avoiding concurrent memory access, reducing communication times, and usage of parallel input/output files, also contribute to the performance. We show the REMD simulation results of a mixed (POPC/DMPC) lipid bilayer as a real application using GENESIS. GENESIS is released as free software under the GPLv2 licence and can be easily modified for the development of new algorithms and molecular models. WIREs Comput Mol Sci 2015, 5:310–323. doi: 10.1002/wcms.1220 PMID:26753008

  7. Plasticity of the Binding Site of Renin: Optimized Selection of Protein Structures for Ensemble Docking.

    PubMed

    Strecker, Claas; Meyer, Bernd

    2018-05-29

    Protein flexibility poses a major challenge to docking of potential ligands in that the binding site can adopt different shapes. Docking algorithms usually keep the protein rigid and only allow the ligand to be treated as flexible. However, a wrong assessment of the shape of the binding pocket can prevent a ligand from adopting a correct pose. Ensemble docking is a simple yet promising method to solve this problem: Ligands are docked into multiple structures, and the results are subsequently merged. Selection of protein structures is a significant factor for this approach. In this work we perform a comprehensive and comparative study evaluating the impact of structure selection on ensemble docking. We perform ensemble docking with several crystal structures and with structures derived from molecular dynamics simulations of renin, an attractive target for antihypertensive drugs. Here, 500 ns of MD simulations revealed binding site shapes not found in any available crystal structure. We evaluate the importance of structure selection for ensemble docking by comparing binding pose prediction, ability to rank actives above nonactives (screening utility), and scoring accuracy. As a result, for ensemble definition k-means clustering appears to be better suited than hierarchical clustering with average linkage. The best performing ensemble consists of four crystal structures and is able to reproduce the native ligand poses better than any individual crystal structure. Moreover, this ensemble outperforms 88% of all individual crystal structures in terms of screening utility as well as scoring accuracy. Similarly, ensembles of MD-derived structures perform on average better than 75% of any individual crystal structure in terms of scoring accuracy at all inspected ensemble sizes.
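
    The merge step of ensemble docking ("ligands are docked into multiple structures, and the results are subsequently merged") can be sketched as below. The abstract does not state the merge rule, so keeping the best score per ligand is an assumption, and the structure names, ligand names, and scores are invented for illustration.

```python
# Hypothetical docking scores (more negative = better) for each ligand
# against each receptor structure in the ensemble.
scores = {
    "xtal_A": {"lig1": -7.2, "lig2": -5.1, "lig3": -6.0},
    "xtal_B": {"lig1": -6.5, "lig2": -8.3, "lig3": -5.9},
    "md_snapshot_1": {"lig1": -7.9, "lig2": -6.0, "lig3": -4.8},
}


def merge_best(scores_by_structure):
    """Keep, for every ligand, its best score and the structure that gave it."""
    best = {}
    for structure, ligand_scores in scores_by_structure.items():
        for ligand, score in ligand_scores.items():
            # A lower (more negative) score replaces the stored one.
            if ligand not in best or score < best[ligand][0]:
                best[ligand] = (score, structure)
    return best


def rank(best):
    """Order ligands from best to worst merged score."""
    return sorted(best, key=lambda ligand: best[ligand][0])


best = merge_best(scores)
ranking = rank(best)
for ligand in ranking:
    print(ligand, best[ligand])
```

    Other merge rules (for example, averaging scores over structures) would fit the same interface; the best-score rule is used here only because it is the simplest to state.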

  8. Thermostating extended Lagrangian Born-Oppenheimer molecular dynamics.

    PubMed

    Martínez, Enrique; Cawkwell, Marc J; Voter, Arthur F; Niklasson, Anders M N

    2015-04-21

    Extended Lagrangian Born-Oppenheimer molecular dynamics is developed and analyzed for applications in canonical (NVT) simulations. Three different approaches are considered: the Nosé and Andersen thermostats and Langevin dynamics. We have tested the temperature distribution under different conditions of self-consistent field (SCF) convergence and time step and compared the results to analytical predictions. We find that the simulations based on the extended Lagrangian Born-Oppenheimer framework provide accurate canonical distributions even under approximate SCF convergence, often requiring only a single diagonalization per time step, whereas regular Born-Oppenheimer formulations exhibit unphysical fluctuations unless a sufficiently high degree of convergence is reached at each time step. The thermostated extended Lagrangian framework thus offers an accurate approach to sample processes in the canonical ensemble at a fraction of the computational cost of regular Born-Oppenheimer molecular dynamics simulations.

  9. MSEBAG: a dynamic classifier ensemble generation based on `minimum-sufficient ensemble' and bagging

    NASA Astrophysics Data System (ADS)

    Chen, Lei; Kamel, Mohamed S.

    2016-01-01

    In this paper, we propose a dynamic classifier system, MSEBAG, which is characterised by searching for the 'minimum-sufficient ensemble' and bagging at the ensemble level. It adopts an 'over-generation and selection' strategy and aims to achieve a good bias-variance trade-off. In the training phase, MSEBAG first searches for the 'minimum-sufficient ensemble', which maximises the in-sample fitness with the minimal number of base classifiers. Then, starting from the 'minimum-sufficient ensemble', a backward stepwise algorithm is employed to generate a collection of ensembles. The objective is to create a collection of ensembles with a descending fitness on the data, as well as a descending complexity in the structure. MSEBAG dynamically selects the ensembles from the collection for the decision aggregation. The extended adaptive aggregation (EAA) approach, a bagging-style algorithm performed at the ensemble level, is employed for this task. EAA searches for the competent ensembles using a score function, which takes into consideration both the in-sample fitness and the confidence of the statistical inference, and averages the decisions of the selected ensembles to label the test pattern. The experimental results show that the proposed MSEBAG outperforms the benchmarks on average.

  10. Clustering Molecular Dynamics Trajectories for Optimizing Docking Experiments

    PubMed Central

    De Paris, Renata; Quevedo, Christian V.; Ruiz, Duncan D.; Norberto de Souza, Osmar; Barros, Rodrigo C.

    2015-01-01

    Molecular dynamics simulations of protein receptors have become an attractive tool for rational drug discovery. However, the high computational cost of employing molecular dynamics trajectories in virtual screening of large repositories threatens the feasibility of this task. Computational intelligence techniques have been applied in this context, with the ultimate goal of reducing the overall computational cost so the task can become feasible. Particularly, clustering algorithms have been widely used as a means to reduce the dimensionality of molecular dynamics trajectories. In this paper, we develop a novel methodology for clustering entire trajectories using structural features from the substrate-binding cavity of the receptor in order to optimize docking experiments on a cloud-based environment. The resulting partition was selected based on three clustering validity criteria, and it was further validated by analyzing the interactions between 20 ligands and a fully flexible receptor (FFR) model containing a 20 ns molecular dynamics simulation trajectory. Our proposed methodology shows that taking into account features of the substrate-binding cavity as input for the k-means algorithm is a promising technique for accurately selecting ensembles of representative structures tailored to a specific ligand. PMID:25873944
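
    Selecting representative structures with k-means can be sketched in a few lines. This is a generic Lloyd's-algorithm illustration, not the authors' pipeline: the two-dimensional "cavity features", the fixed initial centroids, and the rule of taking the frame nearest each centroid as the representative are all assumptions.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, centroids, iterations=20):
    """Plain Lloyd's algorithm; returns (centroids, cluster index per point)."""
    centroids = [tuple(c) for c in centroids]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        assignment = [
            min(range(len(centroids)), key=lambda k: sq_dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, a in zip(points, assignment) if a == k]
            if members:
                updated.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                updated.append(old)  # keep an empty cluster's centroid in place
        if updated == centroids:
            break
        centroids = updated
    return centroids, assignment


def representatives(points, centroids, assignment):
    """Index of the point closest to each centroid within its own cluster."""
    reps = []
    for k, c in enumerate(centroids):
        members = [i for i, a in enumerate(assignment) if a == k]
        reps.append(min(members, key=lambda i: sq_dist(points[i], c)))
    return reps


# Hypothetical per-frame cavity descriptors (two features per MD frame).
frames = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
centroids, assignment = kmeans(frames, centroids=[frames[0], frames[3]])
reps = representatives(frames, centroids, assignment)
print(assignment, reps)
```

    The representative frames would then be the receptor conformations passed on to docking, which reduces the number of structures that must be screened.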

  11. Project fires. Volume 2: Protective ensemble performance standards, phase 1B

    NASA Astrophysics Data System (ADS)

    Abeles, F. J.

    1980-05-01

    The design of the prototype protective ensemble was finalized. Prototype ensembles were fabricated and then subjected to a series of qualification tests which were based upon the protective ensemble performance standards (PEPS) requirements. Engineering drawings and purchase specifications were prepared for the new protective ensemble.

  12. HLPI-Ensemble: Prediction of human lncRNA-protein interactions based on ensemble strategy.

    PubMed

    Hu, Huan; Zhang, Li; Ai, Haixin; Zhang, Hui; Fan, Yetian; Zhao, Qi; Liu, Hongsheng

    2018-03-27

    LncRNA plays an important role in many biological processes and in disease progression by binding to related proteins. However, the experimental methods for studying lncRNA-protein interactions are time-consuming and expensive. Although there are a few models designed to predict ncRNA-protein interactions, they all have some common drawbacks that limit their predictive performance. In this study, we present a model called HLPI-Ensemble designed specifically for human lncRNA-protein interactions. HLPI-Ensemble adopts the ensemble strategy based on three mainstream machine learning algorithms of Support Vector Machines (SVM), Random Forests (RF) and Extreme Gradient Boosting (XGB) to generate HLPI-SVM Ensemble, HLPI-RF Ensemble and HLPI-XGB Ensemble, respectively. The results of 10-fold cross-validation show that HLPI-SVM Ensemble, HLPI-RF Ensemble and HLPI-XGB Ensemble achieved AUCs of 0.95, 0.96 and 0.96, respectively, in the test dataset. Furthermore, we compared the performance of the HLPI-Ensemble models with the previous models on an external validation dataset. The results show that the false positives (FPs) of HLPI-Ensemble models are much lower than those of the previous models, and other evaluation indicators of HLPI-Ensemble models are also higher than those of the previous models. It is further shown that HLPI-Ensemble models are superior in predicting human lncRNA-protein interaction compared with previous models. The HLPI-Ensemble is publicly available at: http://ccsipb.lnu.edu.cn/hlpiensemble/ .
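
    For readers unfamiliar with the AUC values quoted above, the metric can be computed directly from its pairwise definition. The sketch below is generic and is not taken from HLPI-Ensemble; the `roc_auc` helper and the labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive example receives
    a higher score than a randomly chosen negative one (ties count as 0.5).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical interaction labels (1 = interacting pair) and model scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
print(round(roc_auc(labels, scores), 3))
```

    In this toy case eight of the nine positive-negative pairs are ordered correctly, so the AUC is 8/9. The quadratic loop is adequate for small validation folds; a rank-based formula would be preferable for large datasets.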

  13. Structural Elements in the Gαs and Gαq C Termini That Mediate Selective G Protein-coupled Receptor (GPCR) Signaling.

    PubMed

    Semack, Ansley; Sandhu, Manbir; Malik, Rabia U; Vaidehi, Nagarajan; Sivaramakrishnan, Sivaraj

    2016-08-19

    Although the importance of the C terminus of the α subunit of the heterotrimeric G protein in G protein-coupled receptor (GPCR)-G protein pairing is well established, the structural basis of selective interactions remains unknown. Here, we combine live cell FRET-based measurements and molecular dynamics simulations of the interaction between the GPCR and a peptide derived from the C terminus of the Gα subunit (Gα peptide) to dissect the molecular mechanisms of G protein selectivity. We observe a direct link between Gα peptide binding and stabilization of the GPCR conformational ensemble. We find that cognate and non-cognate Gα peptides show deep and shallow binding, respectively, and in distinct orientations within the GPCR. Binding of the cognate Gα peptide stabilizes the agonist-bound GPCR conformational ensemble resulting in favorable binding energy and lower flexibility of the agonist-GPCR pair. We identify three hot spot residues (Gαs/Gαq-Gln-384/Leu-349, Gln-390/Glu-355, and Glu-392/Asn-357) that contribute to selective interactions between the β2-adrenergic receptor (β2-AR)-Gαs and V1A receptor (V1AR)-Gαq. The Gαs and Gαq peptides adopt different orientations in β2-AR and V1AR, respectively. The β2-AR/Gαs peptide interface is dominated by electrostatic interactions, whereas the V1AR/Gαq peptide interactions are predominantly hydrophobic. Interestingly, our study reveals a role for both favorable and unfavorable interactions in G protein selection. Residue Glu-355 in Gαq prevents this peptide from interacting strongly with β2-AR. Mutagenesis to the Gαs counterpart (E355Q) imparts a cognate-like interaction. Overall, our study highlights the synergy in molecular dynamics and FRET-based approaches to dissect the structural basis of selective G protein interactions. © 2016 by The American Society for Biochemistry and Molecular Biology, Inc.

  14. Efficient and Unbiased Sampling of Biomolecular Systems in the Canonical Ensemble: A Review of Self-Guided Langevin Dynamics

    PubMed Central

    Wu, Xiongwu; Damjanovic, Ana; Brooks, Bernard R.

    2013-01-01

    This review provides a comprehensive description of the self-guided Langevin dynamics (SGLD) and the self-guided molecular dynamics (SGMD) methods and their applications. Example systems are included to provide guidance on optimal application of these methods in simulation studies. SGMD/SGLD has enhanced ability to overcome energy barriers and accelerate rare events to affordable time scales. It has been demonstrated that with moderate parameters, SGLD can routinely cross energy barriers of 20 kT at a rate that molecular dynamics (MD) or Langevin dynamics (LD) crosses 10 kT barriers. The core of these methods is the use of local averages of forces and momenta in a direct manner that can preserve the canonical ensemble. The use of such local averages results in methods where low frequency motion “borrows” energy from high frequency degrees of freedom when a barrier is approached and then returns that excess energy after a barrier is crossed. This self-guiding effect also results in an accelerated diffusion to enhance conformational sampling efficiency. The resulting ensemble with SGLD deviates in a small way from the canonical ensemble, and that deviation can be corrected with either an on-the-fly or a post processing reweighting procedure that provides an excellent canonical ensemble for systems with a limited number of accelerated degrees of freedom. Since reweighting procedures are generally not size extensive, a newer method, SGLDfp, uses local averages of both momenta and forces to preserve the ensemble without reweighting. The SGLDfp approach is size extensive and can be used to accelerate low frequency motion in large systems, or in systems with explicit solvent where solvent diffusion is also to be enhanced. Since these methods are direct and straightforward, they can be used in conjunction with many other sampling methods or free energy methods by simply replacing the integration of degrees of freedom that are normally sampled by MD or LD. PMID:23913991

  15. Combining Structural Modeling with Ensemble Machine Learning to Accurately Predict Protein Fold Stability and Binding Affinity Effects upon Mutation

    PubMed Central

    Garcia Lopez, Sebastian; Kim, Philip M.

    2014-01-01

    Advances in sequencing have led to a rapid accumulation of mutations, some of which are associated with diseases. However, to draw mechanistic conclusions, a biochemical understanding of these mutations is necessary. For coding mutations, accurate prediction of significant changes in either the stability of proteins or their affinity to their binding partners is required. Traditional methods have used semi-empirical force fields, while newer methods employ machine learning of sequence and structural features. Here, we show how combining both of these approaches leads to a marked boost in accuracy. We introduce ELASPIC, a novel ensemble machine learning approach that is able to predict stability effects upon mutation in both, domain cores and domain-domain interfaces. We combine semi-empirical energy terms, sequence conservation, and a wide variety of molecular details with a Stochastic Gradient Boosting of Decision Trees (SGB-DT) algorithm. The accuracy of our predictions surpasses existing methods by a considerable margin, achieving correlation coefficients of 0.77 for stability, and 0.75 for affinity predictions. Notably, we integrated homology modeling to enable proteome-wide prediction and show that accurate prediction on modeled structures is possible. Lastly, ELASPIC showed significant differences between various types of disease-associated mutations, as well as between disease and common neutral mutations. Unlike pure sequence-based prediction methods that try to predict phenotypic effects of mutations, our predictions unravel the molecular details governing the protein instability, and help us better understand the molecular causes of diseases. PMID:25243403

  16. Accelerating Monte Carlo molecular simulations by reweighting and reconstructing Markov chains: Extrapolation of canonical ensemble averages and second derivatives to different temperature and density conditions

    NASA Astrophysics Data System (ADS)

    Kadoura, Ahmad; Sun, Shuyu; Salama, Amgad

    2014-08-01

    Accurate determination of thermodynamic properties of petroleum reservoir fluids is of great interest to many applications, especially in petroleum engineering and chemical engineering. Molecular simulation has many appealing features, especially its requirement of fewer tuned parameters but yet better predicting capability; however it is well known that molecular simulation is very CPU expensive, as compared to equation of state approaches. We have recently introduced an efficient thermodynamically consistent technique to regenerate rapidly Monte Carlo Markov Chains (MCMCs) at different thermodynamic conditions from the existing data points that have been pre-computed with expensive classical simulation. This technique can speed up the simulation more than a million times, making the regenerated molecular simulation almost as fast as equation of state approaches. In this paper, this technique is first briefly reviewed and then numerically investigated in its capability of predicting ensemble averages of primary quantities at different neighboring thermodynamic conditions to the original simulated MCMCs. Moreover, this extrapolation technique is extended to predict second derivative properties (e.g. heat capacity and fluid compressibility). The method works by reweighting and reconstructing generated MCMCs in canonical ensemble for Lennard-Jones particles. In this paper, system's potential energy, pressure, isochoric heat capacity and isothermal compressibility along isochors, isotherms and paths of changing temperature and density from the original simulated points were extrapolated. Finally, an optimized set of Lennard-Jones parameters (ε, σ) for single site models were proposed for methane, nitrogen and carbon monoxide.
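
    The reweighting idea in the abstract above can be illustrated with the textbook single-state estimator, in which samples drawn at one inverse temperature are reweighted by exp(-(β_new - β_old)·U) to estimate an average at a neighboring temperature. The sketch below is an illustration under that assumption only: it is not the authors' chain-reconstruction procedure, it covers temperature but not density extrapolation, and the function names are hypothetical.

```python
import math


def reweight_average(observable, potential_energy, beta_old, beta_new):
    """Estimate <A> at beta_new from samples generated at beta_old (same N and V)."""
    d_beta = beta_new - beta_old
    exponents = [-d_beta * u for u in potential_energy]
    # Subtract the largest exponent so that exp() cannot overflow.
    shift = max(exponents)
    weights = [math.exp(e - shift) for e in exponents]
    norm = sum(weights)
    return sum(w * a for w, a in zip(weights, observable)) / norm


def configurational_heat_capacity(potential_energy, beta, k_boltzmann=1.0):
    """Configurational part of C_v from energy fluctuations: k_B * beta^2 * Var(U)."""
    n = len(potential_energy)
    mean_u = sum(potential_energy) / n
    var_u = sum((u - mean_u) ** 2 for u in potential_energy) / n
    return k_boltzmann * beta ** 2 * var_u
```

    Reweighting of this kind is only reliable when the energy distributions at the two temperatures overlap well, which is why the abstract speaks of neighboring thermodynamic conditions.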

  17. A scalable and accurate method for classifying protein-ligand binding geometries using a MapReduce approach.

    PubMed

    Estrada, T; Zhang, B; Cicotti, P; Armen, R S; Taufer, M

    2012-07-01

    We present a scalable and accurate method for classifying protein-ligand binding geometries in molecular docking. Our method is a three-step process: the first step encodes the geometry of a three-dimensional (3D) ligand conformation into a single 3D point in the space; the second step builds an octree by assigning an octant identifier to every single point in the space under consideration; and the third step performs an octree-based clustering on the reduced conformation space and identifies the most dense octant. We adapt our method for MapReduce and implement it in Hadoop. The load-balancing, fault-tolerance, and scalability in MapReduce allow screening of very large conformation spaces not approachable with traditional clustering methods. We analyze results for docking trials for 23 protein-ligand complexes for HIV protease, 21 protein-ligand complexes for Trypsin, and 12 protein-ligand complexes for P38alpha kinase. We also analyze cross docking trials for 24 ligands, each docking into 24 protein conformations of the HIV protease, and receptor ensemble docking trials for 24 ligands, each docking in a pool of HIV protease receptors. Our method demonstrates significant improvement over energy-only scoring for the accurate identification of native ligand geometries in all these docking assessments. The advantages of our clustering approach make it attractive for complex applications in real-world drug design efforts. We demonstrate that our method is particularly useful for clustering docking results using a minimal ensemble of representative protein conformational states (receptor ensemble docking), which is now a common strategy to address protein flexibility in molecular docking. Copyright © 2012 Elsevier Ltd. All rights reserved.
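
    The second and third steps described above (assigning octant identifiers and finding the densest octant) can be sketched in a few lines. This is a single-machine illustration, not the MapReduce/Hadoop implementation of the paper; it assumes each ligand conformation has already been encoded as a 3D point, and the function names and the digit-string identifier format are hypothetical.

```python
from collections import Counter


def octant_id(point, lower, upper, depth):
    """Return a string of digits 0-7 locating `point` in an octree of the given depth."""
    lo = list(lower)
    hi = list(upper)
    digits = []
    for _ in range(depth):
        index = 0
        for axis in range(3):
            mid = 0.5 * (lo[axis] + hi[axis])
            if point[axis] >= mid:
                index |= 1 << axis  # upper half along this axis
                lo[axis] = mid
            else:
                hi[axis] = mid
        digits.append(str(index))
    return "".join(digits)


def densest_octant(points, lower, upper, depth):
    """Return (octant identifier, point count) for the most populated octant."""
    counts = Counter(octant_id(p, lower, upper, depth) for p in points)
    return counts.most_common(1)[0]
```

    Because each point's identifier is computed independently and the counts are a simple aggregation, the same logic maps naturally onto a map step (identifier per point) and a reduce step (count per identifier).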

  18. Binding Modes of Teixobactin to Lipid II: Molecular Dynamics Study.

    PubMed

    Liu, Yang; Liu, Yaxin; Chan-Park, Mary B; Mu, Yuguang

    2017-12-08

    Teixobactin (TXB) is a newly discovered antibiotic targeting the bacterial cell wall precursor Lipid II (L II). In the present work, four binding modes of TXB on L II were identified by a contact-map based clustering method. The highly flexible binary complex ensemble was generated by parallel tempering metadynamics simulation in a well-tempered ensemble (PTMetaD-WTE). In agreement with experimental findings, the pyrophosphate group and the attached first sugar subunit of L II are found to be the minimal motif for stable TXB binding. Three of the four binding modes involve the ring structure of TXB and have relatively higher binding affinities, indicating the importance of the ring motif of TXB in L II recognition. TXB-L II complexes with a ratio of 2:1 are also predicted with configurations such that the ring motif of two TXB molecules bound to the pyrophosphate-MurNAc moiety and the glutamic acid residue of one L II, respectively. Our findings disclose that the ring motif of TXB is critical to L II binding and novel antibiotics can be designed based on its mimetics.

  19. Diffraction-Based Density Restraints for Membrane and Membrane-Peptide Molecular Dynamics Simulations

    PubMed Central

    Benz, Ryan W.; Nanda, Hirsh; Castro-Román, Francisco; White, Stephen H.; Tobias, Douglas J.

    2006-01-01

    We have recently shown that current molecular dynamics (MD) atomic force fields are not yet able to produce lipid bilayer structures that agree with experimentally-determined structures within experimental errors. Because of the many advantages offered by experimentally validated simulations, we have developed a novel restraint method for membrane MD simulations that uses experimental diffraction data. The restraints, introduced into the MD force field, act upon specified groups of atoms to restrain their mean positions and widths to values determined experimentally. The method was first tested using a simple liquid argon system, and then applied to a neat dioleoylphosphatidylcholine (DOPC) bilayer at 66% relative humidity and to the same bilayer containing the peptide melittin. Application of experiment-based restraints to the transbilayer double-bond and water distributions of neat DOPC bilayers led to distributions that agreed with the experimental values. Based upon the experimental structure, the restraints improved the simulated structure in some regions while introducing larger differences in others, as might be expected from imperfect force fields. For the DOPC-melittin system, the experimental transbilayer distribution of melittin was used as a restraint. The addition of the peptide caused perturbations of the simulated bilayer structure, but which were larger than observed experimentally. The melittin distribution of the simulation could be fit accurately to a Gaussian with parameters close to the observed ones, indicating that the restraints can be used to produce an ensemble of membrane-bound peptide conformations that are consistent with experiments. Such ensembles pave the way for understanding peptide-bilayer interactions at the atomic level. PMID:16950837

  20. Modified Nose-Hoover thermostat for solid state for constant temperature molecular dynamics simulation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chen, Wen-Hwa, E-mail: whchen@pme.nthu.edu.tw; National Applied Research Laboratories, Taipei 10622, Taiwan, ROC; Wu, Chun-Hung

    2011-07-10

    Nose-Hoover (NH) thermostat methods incorporated with molecular dynamics (MD) simulation have been widely used to simulate the instantaneous system temperature and feedback energy in a canonical ensemble. The method simply relates the kinetic energy to the system temperature via the particles' momenta based on the ideal gas law. However, when used in a tightly bound system such as solids, the method may suffer from deriving a lower system temperature and potentially inducing early breaking of atomic bonds at relatively high temperature due to the neglect of the effect of the potential energy of atoms based on solid state physics. In this paper, a modified NH thermostat method is proposed for solid system. The method takes into account the contribution of phonons by virtue of the vibrational energy of lattice and the zero-point energy, derived based on the Debye theory. Proof of the equivalence of the method and the canonical ensemble is first made. The modified NH thermostat is tested on different gold nanocrystals to characterize their melting point and constant volume specific heat, and also their size and temperature dependence. Results show that the modified NH method can give much more comparable results to both the literature experimental and theoretical data than the standard NH. Most importantly, the present model is the only one, among the six thermostat algorithms under comparison, that can accurately reproduce the experimental data and also the T^3-law at temperature below the Debye temperature, where the specific heat of a solid at constant volume is proportional to the cube of temperature.
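
    The kinetic-energy-to-temperature relation that the standard NH thermostat relies on is the equipartition estimate T = 2K / (3 N k_B). A minimal sketch of that standard estimator follows; it is not the modified thermostat proposed in the paper, it assumes 3N unconstrained degrees of freedom, and the function name and reduced units (k_B = 1 by default) are illustrative choices.

```python
def kinetic_temperature(momenta, masses, k_boltzmann=1.0):
    """Instantaneous temperature from momenta via equipartition, T = 2K / (3 N k_B)."""
    twice_kinetic = sum(
        (px * px + py * py + pz * pz) / m
        for (px, py, pz), m in zip(momenta, masses)
    )
    degrees_of_freedom = 3 * len(masses)  # no constraints or removed center-of-mass motion
    return twice_kinetic / (degrees_of_freedom * k_boltzmann)
```

    The estimator uses only momenta, which is the limitation the abstract points to for tightly bound solids, where potential (vibrational) energy contributions are neglected.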

  1. Project FIRES [Firefighters' Integrated Response Equipment System]. Volume 2: Protective Ensemble Performance Standards, Phase 1B

    NASA Technical Reports Server (NTRS)

    Abeles, F. J.

    1980-01-01

    The design of the prototype protective ensemble was finalized. Prototype ensembles were fabricated and then subjected to a series of qualification tests which were based upon the protective ensemble performance standards PEPS requirements. Engineering drawings and purchase specifications were prepared for the new protective ensemble.

  2. Fireball as the result of self-organization of an ensemble of diamagnetic electron-ion nanoparticles in molecular gas

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lopasov, V. P., E-mail: lopas@iao.ru

    The conditions for dissipative self-organization of a fireball (FB) in a molecular gas by means of a regular correction of an elastic collision of water and nitrogen molecules by the field of a coherent bi-harmonic light wave (BLW) are presented. The BLW field is generated due to conversion of energy of a linear lightning discharge into light energy. A FB consists of two components: an ensemble of optically active diamagnetic electron-ion nanoparticles and a standing wave of elliptical polarization (SWEP). It is shown that the FB lifetime depends on the energies accumulated by nanoparticles and the SWEP field and on the stability of self-oscillations of the energy between nanoparticles and SWEP.

  3. Preparation of a pure molecular quantum gas.

    PubMed

    Herbig, Jens; Kraemer, Tobias; Mark, Michael; Weber, Tino; Chin, Cheng; Nägerl, Hanns-Christoph; Grimm, Rudolf

    2003-09-12

    An ultracold molecular quantum gas is created by application of a magnetic field sweep across a Feshbach resonance to a Bose-Einstein condensate of cesium atoms. The ability to separate the molecules from the atoms permits direct imaging of the pure molecular sample. Magnetic levitation enables study of the dynamics of the ensemble on extended time scales. We measured ultralow expansion energies in the range of a few nanokelvin for a sample of 3000 molecules. Our observations are consistent with the presence of a macroscopic molecular matter wave.

  4. Guidelines for the analysis of free energy calculations.

    PubMed

    Klimovich, Pavel V; Shirts, Michael R; Mobley, David L

    2015-05-01

    Free energy calculations based on molecular dynamics simulations show considerable promise for applications ranging from drug discovery to prediction of physical properties and structure-function studies. But these calculations are still difficult and tedious to analyze, and best practices for analysis are not well defined or propagated. Essentially, each group analyzing these calculations needs to decide how to conduct the analysis and, usually, develop its own analysis tools. Here, we review and recommend best practices for analysis yielding reliable free energies from molecular simulations. Additionally, we provide a Python tool, alchemical-analysis.py, freely available on GitHub as part of the pymbar package (located at http://github.com/choderalab/pymbar), that implements the analysis practices reviewed here for several reference simulation packages, which can be adapted to handle data from other packages. Both this review and the tool cover analysis of alchemical calculations generally, including free energy estimates via both thermodynamic integration and free energy perturbation-based estimators. Our Python tool also handles output from multiple types of free energy calculations, including expanded ensemble and Hamiltonian replica exchange, as well as standard fixed ensemble calculations. We also survey a range of statistical and graphical ways of assessing the quality of the data and free energy estimates, and provide prototypes of these in our tool. We hope this tool and discussion will serve as a foundation for more standardization of and agreement on best practices for analysis of free energy calculations.
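
    The two estimator families named above can be illustrated in their simplest textbook forms: thermodynamic integration as a trapezoidal integral of the mean dU/dλ over λ, and free energy perturbation as Zwanzig exponential averaging. The sketch below is plain Python written for illustration; it does not use or reproduce the interface of alchemical-analysis.py or pymbar, and the function names are hypothetical.

```python
import math


def thermodynamic_integration(lambdas, mean_dudl):
    """Trapezoidal estimate of the integral of <dU/dlambda> over lambda."""
    total = 0.0
    for i in range(1, len(lambdas)):
        width = lambdas[i] - lambdas[i - 1]
        total += 0.5 * (mean_dudl[i] + mean_dudl[i - 1]) * width
    return total


def exponential_averaging(delta_u, beta):
    """Zwanzig estimate: dF = -(1/beta) * ln < exp(-beta * dU) >_0."""
    exponents = [-beta * du for du in delta_u]
    # Log-sum-exp with a shift keeps the average numerically stable.
    shift = max(exponents)
    mean_exp = sum(math.exp(e - shift) for e in exponents) / len(exponents)
    return -(shift + math.log(mean_exp)) / beta
```

    Exponential averaging is dominated by rare low-energy samples, one reason analysis guidelines of this kind generally emphasize checking sample overlap and comparing several estimators.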

  5. Enantioselectivity in Candida antarctica lipase B: A molecular dynamics study

    PubMed Central

    Raza, Sami; Fransson, Linda; Hult, Karl

    2001-01-01

    A major problem in predicting the enantioselectivity of an enzyme toward substrate molecules is that even high selectivity toward one substrate enantiomer over the other corresponds to a very small difference in free energy. However, total free energies in enzyme-substrate systems are very large and fluctuate significantly because of general protein motion. Candida antarctica lipase B (CALB), a serine hydrolase, displays enantioselectivity toward secondary alcohols. Here, we present a modeling study where the aim has been to develop a molecular dynamics-based methodology for the prediction of enantioselectivity in CALB. The substrates modeled (seven in total) were 3-methyl-2-butanol with various aliphatic carboxylic acids and also 2-butanol, as well as 3,3-dimethyl-2-butanol with octanoic acid. The tetrahedral reaction intermediate was used as a model of the transition state. Investigative analyses were performed on ensembles of nonminimized structures and focused on the potential energies of a number of subsets within the modeled systems to determine which specific regions are important for the prediction of enantioselectivity. One category of subset was based on atoms that make up the core structural elements of the transition state. We considered that a more favorable energetic conformation of such a subset should relate to a greater likelihood for catalysis to occur, thus reflecting higher selectivity. The results of this study conveyed that the use of this type of subset was viable for the analysis of structural ensembles and yielded good predictions of enantioselectivity. PMID:11266619
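
    The opening point, that even high enantioselectivity corresponds to a small free energy difference, can be made concrete with the standard relation ΔΔG‡ = RT ln E between the enantiomeric ratio E and the difference in activation free energy. The short calculation below assumes that relation; the function name and the example values (E = 100, 298.15 K) are illustrative and not taken from the study.

```python
import math

GAS_CONSTANT = 8.314462618  # J / (mol K)


def ddg_from_enantiomeric_ratio(e_ratio, temperature_k):
    """Activation free energy difference (J/mol) implied by an enantiomeric ratio E."""
    return GAS_CONSTANT * temperature_k * math.log(e_ratio)


# An enzyme with E = 100 at room temperature: roughly 11 kJ/mol.
example_kj_per_mol = ddg_from_enantiomeric_ratio(100.0, 298.15) / 1000.0
```

    A difference of about 11 kJ/mol is small compared with the total energies of an enzyme-substrate system and their fluctuations, which is the difficulty the abstract describes.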

  6. Towards an improved ensemble precipitation forecast: A probabilistic post-processing approach

    NASA Astrophysics Data System (ADS)

    Khajehei, Sepideh; Moradkhani, Hamid

    2017-03-01

    Recently, ensemble post-processing (EPP) has become a commonly used approach for reducing the uncertainty in forcing data and hence hydrologic simulation. The procedure was introduced to build ensemble precipitation forecasts based on the statistical relationship between observations and forecasts. More specifically, the approach relies on a transfer function that is developed based on a bivariate joint distribution between the observations and the simulations in the historical period. The transfer function is used to post-process the forecast. In this study, we propose a Bayesian EPP approach based on copula functions (COP-EPP) to improve the reliability of the precipitation ensemble forecast. Evaluation of the copula-based method is carried out by comparing the performance of the generated ensemble precipitation with the outputs from an existing procedure, i.e. mixed type meta-Gaussian distribution. Monthly precipitation from Climate Forecast System Reanalysis (CFS) and gridded observation from Parameter-Elevation Relationships on Independent Slopes Model (PRISM) have been employed to generate the post-processed ensemble precipitation. Deterministic and probabilistic verification frameworks are utilized in order to evaluate the outputs from the proposed technique. Distribution of seasonal precipitation for the generated ensemble from the copula-based technique is compared to the observation and raw forecasts for three sub-basins located in the Western United States. Results show that both techniques are successful in producing reliable and unbiased ensemble forecast, however, the COP-EPP demonstrates considerable improvement in the ensemble forecast in both deterministic and probabilistic verification, in particular in characterizing the extreme events in wet seasons.

  7. Correlation of chemical shifts predicted by molecular dynamics simulations for partially disordered proteins.

    PubMed

    Karp, Jerome M; Eryilmaz, Ertan; Cowburn, David

    2015-01-01

    There has been a longstanding interest in being able to accurately predict NMR chemical shifts from structural data. Recent studies have focused on using molecular dynamics (MD) simulation data as input for improved prediction. Here we examine the accuracy of chemical shift prediction for intein systems, which have regions of intrinsic disorder. We find that using MD simulation data as input for chemical shift prediction does not consistently improve prediction accuracy over use of a static X-ray crystal structure. This appears to result from the complex conformational ensemble of the disordered protein segments. We show that using accelerated molecular dynamics (aMD) simulations improves chemical shift prediction, suggesting that methods which better sample the conformational ensemble like aMD are more appropriate tools for use in chemical shift prediction for proteins with disordered regions. Moreover, our study suggests that data accurately reflecting protein dynamics must be used as input for chemical shift prediction in order to correctly predict chemical shifts in systems with disorder.

  8. Skill of Global Raw and Postprocessed Ensemble Predictions of Rainfall over Northern Tropical Africa

    NASA Astrophysics Data System (ADS)

    Vogel, Peter; Knippertz, Peter; Fink, Andreas H.; Schlueter, Andreas; Gneiting, Tilmann

    2018-04-01

    Accumulated precipitation forecasts are of high socioeconomic importance for agriculturally dominated societies in northern tropical Africa. In this study, we analyze the performance of nine operational global ensemble prediction systems (EPSs) relative to climatology-based forecasts for 1 to 5-day accumulated precipitation based on the monsoon seasons 2007-2014 for three regions within northern tropical Africa. To assess the full potential of raw ensemble forecasts across spatial scales, we apply state-of-the-art statistical postprocessing methods in form of Bayesian Model Averaging (BMA) and Ensemble Model Output Statistics (EMOS), and verify against station and spatially aggregated, satellite-based gridded observations. Raw ensemble forecasts are uncalibrated, unreliable, and underperform relative to climatology, independently of region, accumulation time, monsoon season, and ensemble. Differences between raw ensemble and climatological forecasts are large, and partly stem from poor prediction for low precipitation amounts. BMA and EMOS postprocessed forecasts are calibrated, reliable, and strongly improve on the raw ensembles, but - somewhat disappointingly - typically do not outperform climatology. Most EPSs exhibit slight improvements over the period 2007-2014, but overall have little added value compared to climatology. We suspect that the parametrization of convection is a potential cause for the sobering lack of ensemble forecast skill in a region dominated by mesoscale convective systems.

  9. A further step toward an optimal ensemble of classifiers for peptide classification, a case study: HIV protease.

    PubMed

    Nanni, Loris; Lumini, Alessandra

    2009-01-01

    This work has two aims: to propose a novel method for building an ensemble of classifiers for peptide classification based on substitution matrices, and to show the importance of selecting a proper set of parameters for the classifiers that make up the ensemble. The HIV-1 protease cleavage site prediction problem is studied here. Results obtained with a blind testing protocol are reported; comparison with other state-of-the-art approaches based on ensembles of classifiers quantifies the performance improvement obtained by the proposed systems. The simulation based on experimentally determined protease cleavage data has demonstrated the success of these new ensemble algorithms. It is particularly interesting that, even though the HIV-1 protease cleavage site prediction problem is considered linearly separable, the best performance is obtained using an ensemble of non-linear classifiers.

  10. Application of the maximum entropy principle to determine ensembles of intrinsically disordered proteins from residual dipolar couplings.

    PubMed

    Sanchez-Martinez, M; Crehuet, R

    2014-12-21

    We present a method based on the maximum entropy principle that can re-weight an ensemble of protein structures based on data from residual dipolar couplings (RDCs). The RDCs of intrinsically disordered proteins (IDPs) provide information on the secondary structure elements present in an ensemble; however even two sets of RDCs are not enough to fully determine the distribution of conformations, and the force field used to generate the structures has a pervasive influence on the refined ensemble. Two physics-based coarse-grained force fields, Profasi and Campari, are able to predict the secondary structure elements present in an IDP, but even after including the RDC data, the re-weighted ensembles differ between both force fields. Thus the spread of IDP ensembles highlights the need for better force fields. We distribute our algorithm in an open-source Python code.
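
    Maximum entropy reweighting of an ensemble can be illustrated with a one-observable toy case: starting from uniform weights, the least-biased weights matching a target average have the form w_i ∝ exp(-λ f_i), with λ chosen so that the weighted average equals the target. The sketch below assumes a single scalar observable, no experimental error model, and a target strictly between the smallest and largest sampled values; it is not the authors' RDC-based algorithm, and all names are hypothetical.

```python
import math


def maxent_weights(values, lam):
    """Normalized weights w_i proportional to exp(-lam * f_i), from a uniform prior."""
    exponents = [-lam * v for v in values]
    shift = max(exponents)  # avoid overflow in exp()
    raw = [math.exp(e - shift) for e in exponents]
    norm = sum(raw)
    return [r / norm for r in raw]


def solve_lambda(values, target, lo=-50.0, hi=50.0, iterations=200):
    """Bisection for lam; the weighted average decreases monotonically as lam grows."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        weights = maxent_weights(values, mid)
        average = sum(w * v for w, v in zip(weights, values))
        if average > target:
            lo = mid  # average too high: a larger lam is needed
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

    Because the result is a reweighting of the structures already generated, it inherits any bias of the underlying force field, consistent with the abstract's observation that the refined ensembles still differ between force fields.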

  11. Simulating adsorptive expansion of zeolites: application to biomass-derived solutions in contact with silicalite.

    PubMed

    Santander, Julian E; Tsapatsis, Michael; Auerbach, Scott M

    2013-04-16

    We have constructed and applied an algorithm to simulate the behavior of zeolite frameworks during liquid adsorption. We applied this approach to compute the adsorption isotherms of furfural-water and hydroxymethyl furfural (HMF)-water mixtures adsorbing in silicalite zeolite at 300 K for comparison with experimental data. We modeled these adsorption processes under two different statistical mechanical ensembles: the grand canonical (V-Nz-μg-T or GC) ensemble keeping volume fixed, and the P-Nz-μg-T (osmotic) ensemble allowing volume to fluctuate. To optimize accuracy and efficiency, we compared pure Monte Carlo (MC) sampling to hybrid MC-molecular dynamics (MD) simulations. For the external furfural-water and HMF-water phases, we assumed the ideal solution approximation and employed a combination of tabulated data and extended ensemble simulations for computing solvation free energies. We found that MC sampling in the V-Nz-μg-T ensemble (i.e., standard GCMC) does a poor job of reproducing both the Henry's law regime and the saturation loadings of these systems. Hybrid MC-MD sampling of the V-Nz-μg-T ensemble, which includes framework vibrations at fixed total volume, provides better results in the Henry's law region, but this approach still does not reproduce experimental saturation loadings. Pure MC sampling of the osmotic ensemble was found to approach experimental saturation loadings more closely, whereas hybrid MC-MD sampling of the osmotic ensemble quantitatively reproduces such loadings because the MC-MD approach naturally allows for locally anisotropic volume changes wherein some pores expand whereas others contract.

  12. Predicting conformational ensembles and genome-wide transcription factor binding sites from DNA sequences.

    PubMed

    Andrabi, Munazah; Hutchins, Andrew Paul; Miranda-Saavedra, Diego; Kono, Hidetoshi; Nussinov, Ruth; Mizuguchi, Kenji; Ahmad, Shandar

    2017-06-22

    DNA shape is emerging as an important determinant of transcription factor binding beyond just the DNA sequence. The only tool for large scale DNA shape estimates, DNAshape was derived from Monte-Carlo simulations and predicts four broad and static DNA shape features, Propeller twist, Helical twist, Minor groove width and Roll. The contributions of other shape features e.g. Shift, Slide and Opening cannot be evaluated using DNAshape. Here, we report a novel method DynaSeq, which predicts molecular dynamics-derived ensembles of a more exhaustive set of DNA shape features. We compared the DNAshape and DynaSeq predictions for the common features and applied both to predict the genome-wide binding sites of 1312 TFs available from protein interaction quantification (PIQ) data. The results indicate a good agreement between the two methods for the common shape features and point to advantages in using DynaSeq. Predictive models employing ensembles from individual conformational parameters revealed that base-pair opening - known to be important in strand separation - was the best predictor of transcription factor-binding sites (TFBS) followed by features employed by DNAshape. Of note, TFBS could be predicted not only from the features at the target motif sites, but also from those as far as 200 nucleotides away from the motif.

  13. Thermostating extended Lagrangian Born-Oppenheimer molecular dynamics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Martínez, Enrique; Cawkwell, Marc J.; Voter, Arthur F.

    Here, Extended Lagrangian Born-Oppenheimer molecular dynamics is developed and analyzed for applications in canonical (NVT) simulations. Three different approaches are considered: the Nosé and Andersen thermostats and Langevin dynamics. We have tested the temperature distribution under different conditions of self-consistent field (SCF) convergence and time step and compared the results to analytical predictions. We find that the simulations based on the extended Lagrangian Born-Oppenheimer framework provide accurate canonical distributions even under approximate SCF convergence, often requiring only a single diagonalization per time step, whereas regular Born-Oppenheimer formulations exhibit unphysical fluctuations unless a sufficiently high degree of convergence is reached at each time step. Lastly, the thermostated extended Lagrangian framework thus offers an accurate approach to sample processes in the canonical ensemble at a fraction of the computational cost of regular Born-Oppenheimer molecular dynamics simulations.

  14. Thermostating extended Lagrangian Born-Oppenheimer molecular dynamics

    DOE PAGES

    Martínez, Enrique; Cawkwell, Marc J.; Voter, Arthur F.; ...

    2015-04-21

    Here, Extended Lagrangian Born-Oppenheimer molecular dynamics is developed and analyzed for applications in canonical (NVT) simulations. Three different approaches are considered: the Nosé and Andersen thermostats and Langevin dynamics. We have tested the temperature distribution under different conditions of self-consistent field (SCF) convergence and time step and compared the results to analytical predictions. We find that the simulations based on the extended Lagrangian Born-Oppenheimer framework provide accurate canonical distributions even under approximate SCF convergence, often requiring only a single diagonalization per time step, whereas regular Born-Oppenheimer formulations exhibit unphysical fluctuations unless a sufficiently high degree of convergence is reached at each time step. Lastly, the thermostated extended Lagrangian framework thus offers an accurate approach to sample processes in the canonical ensemble at a fraction of the computational cost of regular Born-Oppenheimer molecular dynamics simulations.

  15. RNA sequencing from neural ensembles activated during fear conditioning in the mouse temporal association cortex

    PubMed Central

    Cho, Jin-Hyung; Huang, Ben S.; Gray, Jesse M.

    2016-01-01

    The stable formation of remote fear memories is thought to require neuronal gene induction in cortical ensembles that are activated during learning. However, the set of genes expressed specifically in these activated ensembles is not known; knowledge of such transcriptional profiles may offer insights into the molecular program underlying stable memory formation. Here we use RNA-Seq to identify genes whose expression is enriched in activated cortical ensembles labeled during associative fear learning. We first establish that mouse temporal association cortex (TeA) is required for remote recall of auditory fear memories. We then perform RNA-Seq in TeA neurons that are labeled by the activity reporter Arc-dVenus during learning. We identify 944 genes with enriched expression in Arc-dVenus+ neurons. These genes include markers of L2/3, L5b, and L6 excitatory neurons but not glial or inhibitory markers, confirming Arc-dVenus to be an excitatory neuron-specific but non-layer-specific activity reporter. Cross comparisons to other transcriptional profiles show that 125 of the enriched genes are also activity-regulated in vitro or induced by visual stimulus in the visual cortex, suggesting that they may be induced generally in the cortex in an experience-dependent fashion. Prominent among the enriched genes are those encoding potassium channels that down-regulate neuronal activity, suggesting the possibility that part of the molecular program induced by fear conditioning may initiate homeostatic plasticity. PMID:27557751

  16. Differences in β-strand Populations of Monomeric Aβ40 and Aβ42

    PubMed Central

    Ball, K. Aurelia; Phillips, Aaron H.; Wemmer, David E.; Head-Gordon, Teresa

    2013-01-01

    Using homonuclear 1H NOESY spectra, with chemical shifts, 3JHNHα scalar couplings, residual dipolar couplings, and 1H-15N NOEs, we have optimized and validated the conformational ensembles of the amyloid-β 1–40 (Aβ40) and amyloid-β 1–42 (Aβ42) peptides generated by molecular dynamics simulations. We find that both peptides have a diverse set of secondary structure elements including turns, helices, and antiparallel and parallel β-strands. The most significant difference in the structural ensembles of the two peptides is the type of β-hairpins and β-strands they populate. We find that Aβ42 forms a major antiparallel β-hairpin involving the central hydrophobic cluster residues (16–21) with residues 29–36, compatible with known amyloid fibril forming regions, whereas Aβ40 forms an alternative but less populated antiparallel β-hairpin between the central hydrophobic cluster and residues 9–13, that sometimes forms a β-sheet by association with residues 35–37. Furthermore, we show that the two additional C-terminal residues of Aβ42, in particular Ile-41, directly control the differences in the β-strand content found between the Aβ40 and Aβ42 structural ensembles. Integrating the experimental and theoretical evidence accumulated over the last decade, it is now possible to present monomeric structural ensembles of Aβ40 and Aβ42 consistent with available information that produce a plausible molecular basis for why Aβ42 exhibits greater fibrillization rates than Aβ40. PMID:23790380

  17. WESTPA: An interoperable, highly scalable software package for weighted ensemble simulation and analysis

    PubMed Central

    Zwier, Matthew C.; Adelman, Joshua L.; Kaus, Joseph W.; Pratt, Adam J.; Wong, Kim F.; Rego, Nicholas B.; Suárez, Ernesto; Lettieri, Steven; Wang, David W.; Grabe, Michael; Zuckerman, Daniel M.; Chong, Lillian T.

    2015-01-01

    The weighted ensemble (WE) path sampling approach orchestrates an ensemble of parallel calculations with intermittent communication to enhance the sampling of rare events, such as molecular associations or conformational changes in proteins or peptides. Trajectories are replicated and pruned in a way that focuses computational effort on under-explored regions of configuration space while maintaining rigorous kinetics. To enable the simulation of rare events at any scale (e.g. atomistic, cellular), we have developed an open-source, interoperable, and highly scalable software package for the execution and analysis of WE simulations: WESTPA (The Weighted Ensemble Simulation Toolkit with Parallelization and Analysis). WESTPA scales to thousands of CPU cores and includes a suite of analysis tools that have been implemented in a massively parallel fashion. The software has been designed to interface conveniently with any dynamics engine and has already been used with a variety of molecular dynamics (e.g. GROMACS, NAMD, OpenMM, AMBER) and cell-modeling packages (e.g. BioNetGen, MCell). WESTPA has been in production use for over a year, and its utility has been demonstrated for a broad set of problems, ranging from atomically detailed host-guest associations to non-spatial chemical kinetics of cellular signaling networks. The following describes the design and features of WESTPA, including the facilities it provides for running WE simulations, storing and analyzing WE simulation data, as well as examples of input and output. PMID:26392815
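
    The replicate-and-prune step described above can be illustrated with a minimal sketch. This is not WESTPA's API: the function name `resample_bin`, the `(position, weight)` tuple layout, and the per-bin `target` count are assumptions made for illustration, and weights are assumed positive with `target >= 1`.

```python
import random

def resample_bin(walkers, target, rng):
    """Split/merge (position, weight) walkers in one bin until len == target.

    Hypothetical helper, not part of WESTPA. Total weight is conserved,
    which is what keeps the weighted statistics unbiased.
    """
    walkers = list(walkers)
    if not walkers:
        return walkers
    # Split: replace the heaviest walker with two copies of half its weight.
    while len(walkers) < target:
        i = max(range(len(walkers)), key=lambda k: walkers[k][1])
        pos, w = walkers.pop(i)
        walkers += [(pos, w / 2.0), (pos, w / 2.0)]
    # Merge: combine the two lightest walkers; the survivor is chosen with
    # probability proportional to weight and carries the summed weight.
    while len(walkers) > target:
        walkers.sort(key=lambda pw: pw[1])
        (p1, w1), (p2, w2) = walkers[0], walkers[1]
        keep = p1 if rng.random() < w1 / (w1 + w2) else p2
        walkers = [(keep, w1 + w2)] + walkers[2:]
    return walkers

# Two walkers are replicated up to four; the total weight stays at 1.0.
split = resample_bin([(0.1, 0.5), (0.2, 0.5)], target=4, rng=random.Random(0))
```

    In a full simulation this step would presumably run once per bin after each short burst of dynamics, so that under-explored bins gain walkers while crowded bins lose them.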

  18. Nonequilibrium electromagnetics: Local and macroscopic fields and constitutive relationships

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Baker-Jarvis, James; Kabos, Pavel; Holloway, Christopher L.

    We study the electrodynamics of materials using a Liouville-Hamiltonian-based statistical-mechanical theory. Our goal is to develop electrodynamics from an ensemble-average viewpoint that is valid for microscopic and nonequilibrium systems at molecular to submolecular scales. This approach is not based on a Taylor series expansion of the charge density to obtain the multipoles. Instead, expressions of the molecular multipoles are used in an inverse problem to obtain the averaging statistical-density function that is used to obtain the macroscopic fields. The advantages of this method are that the averaging function is constructed in a self-consistent manner and the molecules can either be treated as point multipoles or contain more microstructure. Expressions for the local and macroscopic fields are obtained, and evolution equations for the constitutive parameters are developed. We derive equations for the local field as functions of the applied, polarization, magnetization, strain density, and macroscopic fields.

  19. Structural insights into human microsomal epoxide hydrolase by combined homology modeling, molecular dynamics simulations, and molecular docking calculations.

    PubMed

    Saenz-Méndez, Patricia; Katz, Aline; Pérez-Kempner, María Lucía; Ventura, Oscar N; Vázquez, Marta

    2017-04-01

    A new homology model of human microsomal epoxide hydrolase was derived based on multiple templates. The model obtained was fully evaluated, including MD simulations and ensemble-based docking, showing that the quality of the structure is better than that of the only previously known model. In particular, a catalytic triad was clearly identified, in agreement with the experimental information available. Analysis of intermediates in the enzymatic mechanism led to the identification of key residues for substrate binding, stereoselectivity, and intermediate stabilization during the reaction. We have also confirmed the role of the oxyanion hole and the conserved motif (HGXP) in epoxide hydrolases, in excellent agreement with known experimental and computational data on similar systems. The model obtained is the first one that fully agrees with all the experimental observations on the system. Proteins 2017; 85:720-730. © 2016 Wiley Periodicals, Inc. © 2017 Wiley Periodicals, Inc.

  20. SVM and SVM Ensembles in Breast Cancer Prediction.

    PubMed

    Huang, Min-Wei; Chen, Chih-Wen; Lin, Wei-Chao; Ke, Shih-Wen; Tsai, Chih-Fong

    2017-01-01

    Breast cancer is an all too common disease in women, making how to effectively predict it an active research problem. A number of statistical and machine learning techniques have been employed to develop various breast cancer prediction models. Among them, support vector machines (SVM) have been shown to outperform many related techniques. To construct the SVM classifier, it is first necessary to decide the kernel function, and different kernel functions can result in different prediction performance. However, there have been very few studies focused on examining the prediction performances of SVM based on different kernel functions. Moreover, it is unknown whether SVM classifier ensembles which have been proposed to improve the performance of single classifiers can outperform single SVM classifiers in terms of breast cancer prediction. Therefore, the aim of this paper is to fully assess the prediction performance of SVM and SVM ensembles over small and large scale breast cancer datasets. The classification accuracy, ROC, F-measure, and computational times of training SVM and SVM ensembles are compared. The experimental results show that linear kernel based SVM ensembles based on the bagging method and RBF kernel based SVM ensembles with the boosting method can be the better choices for a small scale dataset, where feature selection should be performed in the data pre-processing stage. For a large scale dataset, RBF kernel based SVM ensembles based on boosting perform better than the other classifiers.

  1. SVM and SVM Ensembles in Breast Cancer Prediction

    PubMed Central

    Huang, Min-Wei; Chen, Chih-Wen; Lin, Wei-Chao; Ke, Shih-Wen; Tsai, Chih-Fong

    2017-01-01

    Breast cancer is an all too common disease in women, making how to effectively predict it an active research problem. A number of statistical and machine learning techniques have been employed to develop various breast cancer prediction models. Among them, support vector machines (SVM) have been shown to outperform many related techniques. To construct the SVM classifier, it is first necessary to decide the kernel function, and different kernel functions can result in different prediction performance. However, there have been very few studies focused on examining the prediction performances of SVM based on different kernel functions. Moreover, it is unknown whether SVM classifier ensembles which have been proposed to improve the performance of single classifiers can outperform single SVM classifiers in terms of breast cancer prediction. Therefore, the aim of this paper is to fully assess the prediction performance of SVM and SVM ensembles over small and large scale breast cancer datasets. The classification accuracy, ROC, F-measure, and computational times of training SVM and SVM ensembles are compared. The experimental results show that linear kernel based SVM ensembles based on the bagging method and RBF kernel based SVM ensembles with the boosting method can be the better choices for a small scale dataset, where feature selection should be performed in the data pre-processing stage. For a large scale dataset, RBF kernel based SVM ensembles based on boosting perform better than the other classifiers. PMID:28060807

  2. Spatial and Spin Symmetry Breaking in Semidefinite-Programming-Based Hartree-Fock Theory.

    PubMed

    Nascimento, Daniel R; DePrince, A Eugene

    2018-05-08

    The Hartree-Fock problem was recently recast as a semidefinite optimization over the space of rank-constrained two-body reduced-density matrices (RDMs) [Phys. Rev. A 2014, 89, 010502(R)]. This formulation of the problem transfers the nonconvexity of the Hartree-Fock energy functional to the rank constraint on the two-body RDM. We consider an equivalent optimization over the space of positive semidefinite one-electron RDMs (1-RDMs) that retains the nonconvexity of the Hartree-Fock energy expression. The optimized 1-RDM satisfies ensemble N-representability conditions, and ensemble spin-state conditions may be imposed as well. The spin-state conditions place additional linear and nonlinear constraints on the 1-RDM. We apply this RDM-based approach to several molecular systems and explore its spatial (point group) and spin (Ŝ² and Ŝ₃) symmetry breaking properties. When imposing Ŝ² and Ŝ₃ symmetry but relaxing point group symmetry, the procedure often locates spatial-symmetry-broken solutions that are difficult to identify using standard algorithms. For example, the RDM-based approach yields a smooth, spatial-symmetry-broken potential energy curve for the well-known Be-H₂ insertion pathway. We also demonstrate numerically that, upon relaxation of Ŝ² and Ŝ₃ symmetry constraints, the RDM-based approach is equivalent to real-valued generalized Hartree-Fock theory.

  3. Heterogeneous Ensemble Combination Search Using Genetic Algorithm for Class Imbalanced Data Classification.

    PubMed

    Haque, Mohammad Nazmul; Noman, Nasimul; Berretta, Regina; Moscato, Pablo

    2016-01-01

    Classification of datasets with imbalanced sample distributions has always been a challenge. In general, a popular approach for enhancing classification performance is the construction of an ensemble of classifiers. However, the performance of an ensemble is dependent on the choice of constituent base classifiers. Therefore, we propose a genetic algorithm-based search method for finding the optimum combination from a pool of base classifiers to form a heterogeneous ensemble. The algorithm, called GA-EoC, utilises 10-fold cross-validation on training data for evaluating the quality of each candidate ensemble. In order to combine the base classifiers' decisions into the ensemble's output, we used the simple and widely used majority voting approach. The proposed algorithm, along with the random sub-sampling approach to balance the class distribution, has been used for classifying class-imbalanced datasets. Additionally, if a feature set was not available, we used the (α, β)-k Feature Set method to select a better subset of features for classification. We have tested GA-EoC with three benchmarking datasets from the UCI Machine Learning repository, one Alzheimer's disease dataset and a subset of the PubFig database of Columbia University. In general, the performance of the proposed method on the chosen datasets is robust and better than that of the constituent base classifiers and many other well-known ensembles. Based on our empirical study we claim that a genetic algorithm is a superior and reliable approach to heterogeneous ensemble construction and we expect that the proposed GA-EoC would perform consistently in other cases.
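
    The majority-voting combination mentioned in the abstract can be sketched in a few lines. This is an illustration only, not the GA-EoC code: the function name `majority_vote` and the list-of-lists input layout are assumptions, and the tie-breaking rule (first label seen wins) is a choice of this sketch rather than something the abstract specifies.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine base-classifier outputs by majority vote.

    predictions: one list of predicted labels per classifier, all of the
    same length (hypothetical layout chosen for this sketch).
    """
    combined = []
    for votes in zip(*predictions):
        # most_common(1) returns the label with the highest count; ties go
        # to the label that was encountered first among the voters.
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined

# Three classifiers voting on three samples.
labels = majority_vote([[1, 0, 0], [1, 1, 0], [0, 1, 0]])
```

    A search procedure such as a genetic algorithm would then score each candidate subset of classifiers by the cross-validated quality of this combined output.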

  4. Computational Amide I Spectroscopy for Refinement of Disordered Peptide Ensembles: Maximum Entropy and Related Approaches

    NASA Astrophysics Data System (ADS)

    Reppert, Michael; Tokmakoff, Andrei

    The structural characterization of intrinsically disordered peptides (IDPs) presents a challenging biophysical problem. Extreme heterogeneity and rapid conformational interconversion make traditional methods difficult to interpret. Due to its ultrafast (ps) shutter speed, Amide I vibrational spectroscopy has received considerable interest as a novel technique to probe IDP structure and dynamics. Historically, Amide I spectroscopy has been limited to delivering global secondary structural information. More recently, however, the method has been adapted to study structure at the local level through incorporation of isotope labels into the protein backbone at specific amide bonds. Thanks to the acute sensitivity of Amide I frequencies to local electrostatic interactions (particularly hydrogen bonds), spectroscopic data on isotope-labeled residues report directly on local peptide conformation. Quantitative information can be extracted using electrostatic frequency maps, which translate molecular dynamics trajectories into Amide I spectra for comparison with experiment. Here we present our recent efforts in the development of a rigorous approach to incorporating Amide I spectroscopic restraints into refined molecular dynamics structural ensembles using maximum entropy and related approaches. By combining force field predictions with experimental spectroscopic data, we construct refined structural ensembles for a family of short, strongly disordered, elastin-like peptides in aqueous solution.

  5. Enabling grand-canonical Monte Carlo: extending the flexibility of GROMACS through the GromPy python interface module.

    PubMed

    Pool, René; Heringa, Jaap; Hoefling, Martin; Schulz, Roland; Smith, Jeremy C; Feenstra, K Anton

    2012-05-05

    We report on a Python interface to the GROMACS molecular simulation package, GromPy (available at https://github.com/GromPy). This application programming interface (API) uses the ctypes Python module that allows function calls to shared libraries, for example, written in C. To the best of our knowledge, this is the first reported interface to the GROMACS library that uses direct library calls. GromPy can be used for extending the current GROMACS simulation and analysis modes. In this work, we demonstrate that the interface enables hybrid Monte-Carlo/molecular dynamics (MD) simulations in the grand-canonical ensemble, a simulation mode that is currently not implemented in GROMACS. For this application, the interplay between GromPy and GROMACS requires only minor modifications of the GROMACS source code, not affecting the operation, efficiency, and performance of the GROMACS applications. We validate the grand-canonical application against MD in the canonical ensemble by comparison of equations of state. The results of the grand-canonical simulations are in complete agreement with MD in the canonical ensemble. The Python overhead of the grand-canonical scheme is minimal. Copyright © 2012 Wiley Periodicals, Inc.
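
    The ctypes mechanism that the abstract refers to can be shown with a small example. It calls `strlen` from the C standard library rather than any GROMACS function, since the GromPy and GROMACS library entry points are not given in the abstract; a POSIX system is assumed.

```python
import ctypes
import ctypes.util

# Locate the C standard library. The name differs between platforms, and
# find_library may return None; on POSIX, CDLL(None) then falls back to the
# symbols already loaded into the running process.
libc_path = ctypes.util.find_library("c")
libc = ctypes.CDLL(libc_path) if libc_path else ctypes.CDLL(None)

# Declare the C signature so ctypes converts arguments and results correctly.
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

length = libc.strlen(b"GROMACS")  # direct call into the shared library
```

    Declaring `argtypes` and `restype` for every wrapped function is the part that matters most in practice, because ctypes otherwise assumes C `int` for return values.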

  6. An Ensemble-Based Smoother with Retrospectively Updated Weights for Highly Nonlinear Systems

    NASA Technical Reports Server (NTRS)

    Chin, T. M.; Turmon, M. J.; Jewell, J. B.; Ghil, M.

    2006-01-01

    Monte Carlo computational methods have been introduced into data assimilation for nonlinear systems in order to alleviate the computational burden of updating and propagating the full probability distribution. By propagating an ensemble of representative states, algorithms like the ensemble Kalman filter (EnKF) and the resampled particle filter (RPF) rely on the existing modeling infrastructure to approximate the distribution based on the evolution of this ensemble. This work presents an ensemble-based smoother that is applicable to the Monte Carlo filtering schemes like EnKF and RPF. At the minor cost of retrospectively updating a set of weights for ensemble members, this smoother has demonstrated superior capabilities in state tracking for two highly nonlinear problems: the double-well potential and trivariate Lorenz systems. The algorithm does not require retrospective adaptation of the ensemble members themselves, and it is thus suited to a streaming operational mode. The accuracy of the proposed backward-update scheme in estimating non-Gaussian distributions is evaluated by comparison to the more accurate estimates provided by a Markov chain Monte Carlo algorithm.

  7. Charge Transport and Phase Behavior of Imidazolium-Based Ionic Liquid Crystals from Fully Atomistic Simulations

    PubMed Central

    2018-01-01

    Ionic liquid crystals occupy an intriguing middle ground between room-temperature ionic liquids and mesostructured liquid crystals. Here, we examine a non-polarizable, fully atomistic model of the 1-alkyl-3-methylimidazolium nitrate family using molecular dynamics in the constant pressure–constant temperature ensemble. These materials exhibit a distinct “smectic” liquid phase, characterized by layers formed by the molecules, which separate the ionic and aliphatic moieties. In particular, we discuss the implications this layering may have for electrolyte applications. PMID:29301305

  8. g_contacts: Fast contact search in bio-molecular ensemble data

    NASA Astrophysics Data System (ADS)

    Blau, Christian; Grubmüller, Helmut

    2013-12-01

    Short-range interatomic interactions govern many bio-molecular processes. Therefore, identifying close interaction partners in ensemble data is an essential task in structural biology and computational biophysics. A contact search can be cast as a typical range search problem for which efficient algorithms have been developed. However, none of those has yet been adapted to the context of macromolecular ensembles, particularly in a molecular dynamics (MD) framework. Here a set-decomposition algorithm is implemented which detects all contacting atoms or residues in at most O(N log N) run-time, in contrast to the O(N²) complexity of a brute-force approach.

    Program summary
    Catalogue identifier: AEQA_v1_0
    Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEQA_v1_0.html
    Program obtainable from: CPC Program Library, Queen’s University, Belfast, N. Ireland
    Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html
    No. of lines in distributed program, including test data, etc.: 8945
    No. of bytes in distributed program, including test data, etc.: 981604
    Distribution format: tar.gz
    Programming language: C99
    Computer: PC
    Operating system: Linux
    RAM: ≈ size of input frame
    Classification: 3, 4.14
    External routines: Gromacs 4.6 [1]
    Nature of problem: Finding atoms or residues that are closer to one another than a given cut-off.
    Solution method: Excluding distant atoms from distance calculations by decomposing the given set of atoms into disjoint subsets.
    Running time: ≤ O(N log N)

    References: [1] S. Pronk, S. Pall, R. Schulz, P. Larsson, P. Bjelkmar, R. Apostolov, M. R. Shirts, J. C. Smith, P. M. Kasson, D. van der Spoel, B. Hess and E. Lindahl, Gromacs 4.5: a high-throughput and highly parallel open source molecular simulation toolkit, Bioinformatics 29 (7) (2013).
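
    The idea of excluding distant atoms before computing distances can be sketched as follows. This is not the g_contacts set-decomposition algorithm, which the abstract does not spell out; it swaps in a uniform cell grid, a common alternative in which each atom is compared only with atoms in its own and neighbouring cells. All function names are hypothetical, and Python 3.8+ is assumed for `math.dist`.

```python
from collections import defaultdict
from itertools import product
import math

def contacts_grid(coords, cutoff):
    """Return sorted index pairs (i, j), i < j, closer than `cutoff`."""
    # Bin atoms into cubic cells of edge length `cutoff`.
    cells = defaultdict(list)
    for idx, (x, y, z) in enumerate(coords):
        key = (math.floor(x / cutoff), math.floor(y / cutoff), math.floor(z / cutoff))
        cells[key].append(idx)
    pairs = set()
    for (cx, cy, cz), members in cells.items():
        # Only the 27 surrounding cells can hold atoms within the cutoff.
        for dx, dy, dz in product((-1, 0, 1), repeat=3):
            for j in cells.get((cx + dx, cy + dy, cz + dz), ()):
                for i in members:
                    if i < j and math.dist(coords[i], coords[j]) < cutoff:
                        pairs.add((i, j))
    return sorted(pairs)

def contacts_brute(coords, cutoff):
    """Reference O(N^2) search used to check the grid version."""
    n = len(coords)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if math.dist(coords[i], coords[j]) < cutoff]

example = contacts_grid([(0, 0, 0), (0.5, 0, 0), (5, 5, 5)], 1.0)
```

    For roughly uniform atom densities the grid version does work proportional to the number of atoms, whereas the brute-force version checks every pair.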

  9. Quantifying polypeptide conformational space: sensitivity to conformation and ensemble definition.

    PubMed

    Sullivan, David C; Lim, Carmay

    2006-08-24

    Quantifying the density of conformations over phase space (the conformational distribution) is needed to model important macromolecular processes such as protein folding. In this work, we quantify the conformational distribution for a simple polypeptide (N-mer polyalanine) using the cumulative distribution function (CDF), which gives the probability that two randomly selected conformations are separated by less than a "conformational" distance and whose inverse gives conformation counts as a function of conformational radius. An important finding is that the conformation counts obtained by the CDF inverse depend critically on the assignment of a conformation's distance span and the ensemble (e.g., unfolded state model): varying ensemble and conformation definition (1 → 2 Å) varies the CDF-based conformation counts for Ala₅₀ from 10¹¹ to 10⁶⁹. In particular, relatively short molecular dynamics (MD) relaxation of Ala₅₀'s random-walk ensemble reduces the number of conformers from 10⁵⁵ to 10¹⁴ (using a 1 Å root-mean-square-deviation radius conformation definition), pointing to potential disconnections in comparing the results from simplified models of unfolded proteins with those from all-atom MD simulations. Explicit waters are found to roughen the landscape considerably. Under some common conformation definitions, the results herein provide (i) an upper limit to the number of accessible conformations that compose unfolded states of proteins, (ii) the optimal clustering radius/conformation radius for counting conformations for a given energy and solvent model, (iii) a means of comparing various studies, and (iv) an assessment of the applicability of random search in protein folding.

  10. Independent Metrics for Protein Backbone and Side-Chain Flexibility: Time Scales and Effects of Ligand Binding.

    PubMed

    Fuchs, Julian E; Waldner, Birgit J; Huber, Roland G; von Grafenstein, Susanne; Kramer, Christian; Liedl, Klaus R

    2015-03-10

    Conformational dynamics are central for understanding biomolecular structure and function, since biological macromolecules are inherently flexible at room temperature and in solution. Computational methods are nowadays capable of providing valuable information on the conformational ensembles of biomolecules. However, analysis tools and intuitive metrics that capture dynamic information from in silico generated structural ensembles are limited. In standard work-flows, flexibility in a conformational ensemble is represented through residue-wise root-mean-square fluctuations or B-factors following a global alignment. Consequently, these approaches relying on global alignments discard valuable information on local dynamics. Results inherently depend on global flexibility, residue size, and connectivity. In this study we present a novel approach for capturing positional fluctuations based on multiple local alignments instead of one single global alignment. The method captures local dynamics within a structural ensemble independent of residue type by splitting individual local and global degrees of freedom of protein backbone and side-chains. Dependence on residue type and size in the side-chains is removed via normalization with the B-factors of the isolated residue. As a test case, we demonstrate its application to a molecular dynamics simulation of bovine pancreatic trypsin inhibitor (BPTI) on the millisecond time scale. This allows for illustrating different time scales of backbone and side-chain flexibility. Additionally, we demonstrate the effects of ligand binding on side-chain flexibility of three serine proteases. We expect our new methodology for quantifying local flexibility to be helpful in unraveling local changes in biomolecular dynamics.

  11. A comparison of breeding and ensemble transform vectors for global ensemble generation

    NASA Astrophysics Data System (ADS)

    Deng, Guo; Tian, Hua; Li, Xiaoli; Chen, Jing; Gong, Jiandong; Jiao, Meiyan

    2012-02-01

    To compare the initial perturbation techniques using breeding vectors and ensemble transform vectors, three ensemble prediction systems using both initial perturbation methods but with different ensemble member sizes based on the spectral model T213/L31 are constructed at the National Meteorological Center, China Meteorological Administration (NMC/CMA). A series of ensemble verification scores such as forecast skill of the ensemble mean, ensemble resolution, and ensemble reliability are introduced to identify the most important attributes of ensemble forecast systems. The results indicate that the ensemble transform technique is superior to the breeding vector method in light of the evaluation of the anomaly correlation coefficient (ACC), which is a deterministic attribute of the ensemble mean; the root-mean-square error (RMSE) and spread, which are probabilistic attributes; and the continuous ranked probability score (CRPS) and its decomposition. The advantage of the ensemble transform approach is attributed to the orthogonality among its ensemble perturbations as well as its consistency with the data assimilation system. Therefore, this study may serve as a reference for configuration of the best ensemble prediction system to be used in operation.
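
    Two of the verification scores named above, spread and CRPS, can be written down compactly for a scalar forecast. The sketch below uses the standard empirical-ensemble form CRPS = mean|x_i − y| − ½ mean|x_i − x_j|; the function names are hypothetical, and nothing here reproduces the NMC/CMA verification code or the CRPS decomposition used in the study.

```python
import math

def ensemble_mean(members):
    return sum(members) / len(members)

def ensemble_spread(members):
    """Sample standard deviation of the members about the ensemble mean."""
    m = ensemble_mean(members)
    return math.sqrt(sum((x - m) ** 2 for x in members) / (len(members) - 1))

def crps_ensemble(members, obs):
    """CRPS of the empirical ensemble distribution for one scalar observation."""
    n = len(members)
    # Mean absolute error of the members against the observation.
    term1 = sum(abs(x - obs) for x in members) / n
    # Half the mean absolute difference between all member pairs.
    term2 = sum(abs(a - b) for a in members for b in members) / (2.0 * n * n)
    return term1 - term2

score = crps_ensemble([0.0, 2.0], 1.0)
```

    With a single member the CRPS reduces to the absolute error, which is why it is often read as a probabilistic generalization of that score.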

  12. Quantifying Nucleic Acid Ensembles with X-ray Scattering Interferometry.

    PubMed

    Shi, Xuesong; Bonilla, Steve; Herschlag, Daniel; Harbury, Pehr

    2015-01-01

    The conformational ensemble of a macromolecule is the complete description of the macromolecule's solution structures and can reveal important aspects of macromolecular folding, recognition, and function. However, most experimental approaches determine an average or predominant structure, or follow transitions between states that each can only be described by an average structure. Ensembles have been extremely difficult to experimentally characterize. We present the unique advantages and capabilities of a new biophysical technique, X-ray scattering interferometry (XSI), for probing and quantifying structural ensembles. XSI measures the interference of scattered waves from two heavy metal probes attached site specifically to a macromolecule. A Fourier transform of the interference pattern gives the fractional abundance of different probe separations directly representing the multiple conformation states populated by the macromolecule. These probe-probe distance distributions can then be used to define the structural ensemble of the macromolecule. XSI provides accurate, calibrated distance in a model-independent fashion with angstrom scale sensitivity in distances. XSI data can be compared in a straightforward manner to atomic coordinates determined experimentally or predicted by molecular dynamics simulations. We describe the conceptual framework for XSI and provide a detailed protocol for carrying out an XSI experiment. © 2015 Elsevier Inc. All rights reserved.

  13. Highly Disordered Amyloid-β Monomer Probed by Single-Molecule FRET and MD Simulation.

    PubMed

    Meng, Fanjie; Bellaiche, Mathias M J; Kim, Jae-Yeol; Zerze, Gül H; Best, Robert B; Chung, Hoi Sung

    2018-02-27

    Monomers of amyloid-β (Aβ) protein are known to be disordered, but there is considerable controversy over the existence of residual or transient conformations that can potentially promote oligomerization and fibril formation. We employed single-molecule Förster resonance energy transfer (FRET) spectroscopy with site-specific dye labeling using an unnatural amino acid and molecular dynamics simulations to investigate conformations and dynamics of Aβ isoforms with 40 (Aβ40) and 42 residues (Aβ42). The FRET efficiency distributions of both proteins measured in phosphate-buffered saline at room temperature show a single peak with very similar FRET efficiencies, indicating there is apparently only one state. 2D FRET efficiency-donor lifetime analysis reveals, however, that there is a broad distribution of rapidly interconverting conformations. Using nanosecond fluorescence correlation spectroscopy, we measured the timescale of the fluctuations between these conformations to be ∼35 ns, similar to that of disordered proteins. These results suggest that both Aβ40 and Aβ42 populate an ensemble of rapidly reconfiguring unfolded states, with no long-lived conformational state distinguishable from that of the disordered ensemble. To gain molecular-level insights into these observations, we performed molecular dynamics simulations with a force field optimized to describe disordered proteins. We find, as in experiments, that both peptides populate configurations consistent with random polymer chains, with the vast majority of conformations lacking significant secondary structure, giving rise to very similar ensemble-averaged FRET efficiencies. Published by Elsevier Inc.

  14. Practical implementation of a particle filter data assimilation approach to estimate initial hydrologic conditions and initialize medium-range streamflow forecasts

    NASA Astrophysics Data System (ADS)

    Clark, E.; Wood, A.; Nijssen, B.; Newman, A. J.; Mendoza, P. A.

    2016-12-01

    The System for Hydrometeorological Applications, Research and Prediction (SHARP), developed at the National Center for Atmospheric Research (NCAR), University of Washington, U.S. Army Corps of Engineers, and U.S. Bureau of Reclamation, is a fully automated ensemble prediction system for short-term to seasonal applications. It incorporates uncertainty in initial hydrologic conditions (IHCs) and in hydrometeorological predictions. In this implementation, IHC uncertainty is estimated by propagating an ensemble of 100 plausible temperature and precipitation time series through the Sacramento/Snow-17 model. The forcing ensemble explicitly accounts for measurement and interpolation uncertainties in the development of gridded meteorological forcing time series. The resulting ensemble of derived IHCs exhibits a broad range of possible soil moisture and snow water equivalent (SWE) states. To select the IHCs that are most consistent with the observations, we employ a particle filter (PF) that weights IHC ensemble members based on observations of streamflow and SWE. These particles are then used to initialize ensemble precipitation and temperature forecasts downscaled from the Global Ensemble Forecast System (GEFS), generating a streamflow forecast ensemble. We test this method in two basins in the Pacific Northwest that are important for water resources management: 1) the Green River upstream of Howard Hanson Dam, and 2) the South Fork Flathead River upstream of Hungry Horse Dam. The first of these is characterized by mixed snow and rain, while the second is snow-dominated. The PF-based forecasts are compared to forecasts based on a single IHC (corresponding to median streamflow) paired with the full GEFS ensemble, and 2) the full IHC ensemble, without filtering, paired with the full GEFS ensemble. 
In addition to assessing improvements in the spread of IHCs, we perform a hindcast experiment to evaluate the utility of PF-based data assimilation on streamflow forecasts at 1- to 7-day lead times.
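
    The weighting step described above can be illustrated with a minimal sketch. It assumes a Gaussian observation error and a single streamflow observation; the function names, the error model, and the sample values are hypothetical and are not taken from SHARP.

```python
import math
import random

def particle_weights(simulated, observed, obs_sigma):
    """Normalized Gaussian-likelihood weight for each ensemble member."""
    log_w = [-0.5 * ((s - observed) / obs_sigma) ** 2 for s in simulated]
    max_log = max(log_w)  # subtract the maximum for numerical stability
    w = [math.exp(lw - max_log) for lw in log_w]
    total = sum(w)
    return [x / total for x in w]

def resample(members, weights, n, rng):
    """Draw n members with probability proportional to their weights."""
    return rng.choices(members, weights=weights, k=n)

# Hypothetical simulated streamflow for five IHC ensemble members.
sim_flow = [80.0, 95.0, 100.0, 110.0, 140.0]
weights = particle_weights(sim_flow, observed=102.0, obs_sigma=10.0)
rng = random.Random(0)
selected = resample(list(range(len(sim_flow))), weights, n=5, rng=rng)
```

    Members whose simulated flow lies closest to the observation receive the largest weights, so they are drawn most often when initializing the forecast ensemble.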

  15. A New Outer Galaxy Molecular Cloud Catalog: Applications to Galactic Structure

    NASA Astrophysics Data System (ADS)

    Kerton, C. R.; Brunt, C. M.; Pomerleau, C.

    2001-12-01

    We have generated a new molecular cloud catalog from a reprocessed version of the Five College Radio Astronomy Observatory (FCRAO) Outer Galaxy Survey (OGS) of 12CO (J=1--0) emission. The catalog has been used to develop a technique that uses the observed angular size-linewidth relation (ASLWR) as a distance indicator to molecular cloud ensembles. The new technique is a promising means to map out the large-scale structure of our Galaxy using the new high spatial dynamic range CO surveys currently available. The catalog was created using a two-stage object-identification algorithm. We first identified contiguous emission structures of a specified minimum number of pixels above a specified temperature threshold. Each structure so defined was then examined and localized emission enhancements within each structure were identified as separate objects. The resulting cloud catalog contains basic data on 14595 objects. From the OGS we identified twenty-three cloud ensembles. For each, bisector fits to angular size vs. linewidth plots were made. The fits vary in a systematic way that allows a calibration of the fit parameters with distance to be made. Our derived distances to the ensembles are consistent with the distance to the Perseus Arm, and the accurate radial velocity measurements available from the same data are in accord with the known non-circular motions at the location of the Perseus Arm. The ASLWR method was also successfully applied to data from the Boston University/FCRAO Galactic Ring Survey (GRS) of 13CO(J=1--0) emission. Based upon our experience with the GRS and OGS, the ASLWR technique should be usable in any data set with sufficient spatial dynamic range to allow it to be properly calibrated. C.P. participated in this study through the Women in Engineering and Science (WES) program of NRC Canada. The Dominion Radio Astrophysical Observatory is a National Facility operated by the National Research Council. 
The Canadian Galactic Plane Survey is a Canadian project with international partners, and is supported by the Natural Sciences and Engineering Research Council (NSERC).

  16. Using the fast Fourier transform in binding free energy calculations.

    PubMed

    Nguyen, Trung Hai; Zhou, Huan-Xiang; Minh, David D L

    2018-04-30

    According to implicit ligand theory, the standard binding free energy is an exponential average of the binding potential of mean force (BPMF), an exponential average of the interaction energy between the unbound ligand ensemble and a rigid receptor. Here, we use the fast Fourier transform (FFT) to efficiently evaluate BPMFs by calculating interaction energies when rigid ligand configurations from the unbound ensemble are discretely translated across rigid receptor conformations. Results for standard binding free energies between T4 lysozyme and 141 small organic molecules are in good agreement with previous alchemical calculations based on (1) a flexible complex ( R≈0.9 for 24 systems) and (2) flexible ligand with multiple rigid receptor configurations ( R≈0.8 for 141 systems). While the FFT is routinely used for molecular docking, to our knowledge this is the first time that the algorithm has been used for rigorous binding free energy calculations. © 2017 Wiley Periodicals, Inc.
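
    As a rough illustration of the idea, the sketch below evaluates an interaction energy for every discrete translation of a ligand grid over a receptor grid with one FFT cross-correlation, then forms an exponential average. The grids, the periodic boundary treatment, the kT value, and the function names are assumptions made for this example; the paper's actual energy terms and the restriction of translations to the binding site are not reproduced.

```python
import numpy as np

def translation_energies(receptor_grid, ligand_grid):
    """Energy E[t] = sum_r receptor[r + t] * ligand[r] for all periodic shifts t."""
    fr = np.fft.fftn(receptor_grid)
    fl = np.fft.fftn(ligand_grid)
    # Cross-correlation theorem: multiply by the complex conjugate of the ligand FFT.
    return np.real(np.fft.ifftn(fr * np.conj(fl)))

def exponential_average(energies, kT):
    """-kT * ln< exp(-E/kT) >, shifted by the minimum energy for stability."""
    e = np.asarray(energies, dtype=float).ravel()
    e_min = e.min()
    return e_min - kT * np.log(np.mean(np.exp(-(e - e_min) / kT)))

rng = np.random.default_rng(0)
receptor = rng.normal(size=(8, 8, 8))  # hypothetical receptor potential grid
ligand = np.zeros((8, 8, 8))           # hypothetical ligand charge grid
ligand[0, 0, 0] = 1.0
ligand[1, 0, 0] = -0.5
energies = translation_energies(receptor, ligand)
bpmf_like = exponential_average(energies, kT=0.6)
```

    A single pair of forward transforms and one inverse transform replaces an explicit loop over all translations, which is the source of the efficiency gain.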

  17. Direct observation of narrow mid-infrared plasmon linewidths of single metal oxide nanocrystals

    DOE PAGES

    Johns, Robert W.; Bechtel, Hans A.; Runnerstrom, Evan L.; ...

    2016-05-13

    Infrared-responsive doped metal oxide nanocrystals are an emerging class of plasmonic materials whose localized surface plasmon resonances (LSPR) can be resonant with molecular vibrations. This presents a distinctive opportunity to manipulate light-matter interactions to redirect chemical or spectroscopic outcomes through the strong local electric fields they generate. Here we report a technique for measuring single nanocrystal absorption spectra of doped metal oxide nanocrystals, revealing significant spectral inhomogeneity in their mid-infrared LSPRs. Our analysis suggests dopant incorporation is heterogeneous beyond expectation based on a statistical distribution of dopants. The broad ensemble linewidths typically observed in these materials result primarily from sample heterogeneity and not from strong electronic damping associated with lossy plasmonic materials. In fact, single nanocrystal spectra reveal linewidths as narrow as 600 cm⁻¹ in aluminium-doped zinc oxide, a value less than half the ensemble linewidth and markedly less than homogeneous linewidths of gold nanospheres.

  18. The search for a hippocampal engram.

    PubMed

    Mayford, Mark

    2014-01-05

    Understanding the molecular and cellular changes that underlie memory, the engram, requires the identification, isolation and manipulation of the neurons involved. This presents a major difficulty for complex forms of memory, for example hippocampus-dependent declarative memory, where the participating neurons are likely to be sparse, anatomically distributed and unique to each individual brain and learning event. In this paper, I discuss several new approaches to this problem. In vivo calcium imaging techniques provide a means of assessing the activity patterns of large numbers of neurons over long periods of time with precise anatomical identification. This provides important insight into how the brain represents complex information and how this is altered with learning. The development of techniques for the genetic modification of neural ensembles based on their natural, sensory-evoked, activity along with optogenetics allows direct tests of the coding function of these ensembles. These approaches provide a new methodological framework in which to examine the mechanisms of complex forms of learning at the level of the neurons involved in a specific memory.

  19. The search for a hippocampal engram

    PubMed Central

    Mayford, Mark

    2014-01-01

    Understanding the molecular and cellular changes that underlie memory, the engram, requires the identification, isolation and manipulation of the neurons involved. This presents a major difficulty for complex forms of memory, for example hippocampus-dependent declarative memory, where the participating neurons are likely to be sparse, anatomically distributed and unique to each individual brain and learning event. In this paper, I discuss several new approaches to this problem. In vivo calcium imaging techniques provide a means of assessing the activity patterns of large numbers of neurons over long periods of time with precise anatomical identification. This provides important insight into how the brain represents complex information and how this is altered with learning. The development of techniques for the genetic modification of neural ensembles based on their natural, sensory-evoked, activity along with optogenetics allows direct tests of the coding function of these ensembles. These approaches provide a new methodological framework in which to examine the mechanisms of complex forms of learning at the level of the neurons involved in a specific memory. PMID:24298162

  20. Improving ensemble decision tree performance using Adaboost and Bagging

    NASA Astrophysics Data System (ADS)

    Hasan, Md. Rajib; Siraj, Fadzilah; Sainin, Mohd Shamrie

    2015-12-01

    Ensemble classifier systems are considered among the most promising approaches to medical data classification, and the performance of a decision tree classifier can be increased by the ensemble method, as it has been shown to perform better than single classifiers. However, in an ensemble setting the performance depends on the selection of a suitable base classifier. This research employed two prominent ensemble methods, namely Adaboost and Bagging, with base classifiers such as Random Forest, Random Tree, j48, j48grafts and Logistic Model Regression (LMT) that were selected independently. The empirical study shows that the performance varies when different base classifiers are selected, and in some cases overfitting was also noted. The evidence shows that ensemble decision tree classifiers using Adaboost and Bagging improve the performance on the selected medical data sets.
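
    For readers unfamiliar with Bagging, the following is a minimal sketch of the technique using one-level decision trees (stumps) as a stand-in base classifier. It is not the authors' setup: the base learners named in the abstract are not used, and the toy data and all function names are invented for illustration.

```python
import random
from collections import Counter

def fit_stump(X, y):
    """Best single-feature threshold rule (a one-level decision tree)."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for left, right in ((0, 1), (1, 0)):
                pred = [left if row[j] <= thr else right for row in X]
                err = sum(p != t for p, t in zip(pred, y))
                if best is None or err < best[0]:
                    best = (err, j, thr, left, right)
    return best[1:]

def predict_stump(stump, row):
    j, thr, left, right = stump
    return left if row[j] <= thr else right

def bagging_fit(X, y, n_estimators, rng):
    """Train each stump on a bootstrap resample of the training set."""
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        models.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return models

def bagging_predict(models, row):
    """Combine the base classifiers by majority vote."""
    votes = Counter(predict_stump(m, row) for m in models)
    return votes.most_common(1)[0][0]

# Toy, made-up data: the label is 1 when the first feature exceeds 5.
X = [[1, 7], [2, 3], [3, 9], [4, 1], [6, 2], [7, 8], [8, 4], [9, 6]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
models = bagging_fit(X, y, n_estimators=25, rng=random.Random(1))
preds = [bagging_predict(models, row) for row in X]
```

    Swapping `fit_stump` for a different base learner changes the ensemble's behavior, which is the dependence on base classifier choice that the abstract reports.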

  1. Molecular dynamics in principal component space.

    PubMed

    Michielssens, Servaas; van Erp, Titus S; Kutzner, Carsten; Ceulemans, Arnout; de Groot, Bert L

    2012-07-26

    A molecular dynamics algorithm in principal component space is presented. It is demonstrated that sampling can be improved without changing the ensemble by assigning masses to the principal components proportional to the inverse square root of the eigenvalues. The setup of the simulation requires no prior knowledge of the system; a short initial MD simulation to extract the eigenvectors and eigenvalues suffices. Independent measures indicated a 6-7 times faster sampling compared to a regular molecular dynamics simulation.

  2. Ensemble-based modeling and rigidity decomposition of allosteric interaction networks and communication pathways in cyclin-dependent kinases: Differentiating kinase clients of the Hsp90-Cdc37 chaperone

    PubMed Central

    Stetz, Gabrielle; Tse, Amanda

    2017-01-01

    The overarching goal of delineating molecular principles underlying differentiation of protein kinase clients and chaperone-based modulation of kinase activity is fundamental to understanding activity of many oncogenic kinases that require chaperoning of Hsp70 and Hsp90 systems to attain a functionally competent active form. Despite structural similarities and common activation mechanisms shared by cyclin-dependent kinase (CDK) proteins, members of this family can exhibit vastly different chaperone preferences. The molecular determinants underlying chaperone dependencies of protein kinases are not fully understood as structurally similar kinases may often elicit distinct regulatory responses to the chaperone. The regulatory divergences observed for members of CDK family are of particular interest as functional diversification among these kinases may be related to variations in chaperone dependencies and can be exploited in drug discovery of personalized therapeutic agents. In this work, we report the results of a computational investigation of several members of CDK family (CDK5, CDK6, CDK9) that represented a broad repertoire of chaperone dependencies—from nonclient CDK5, to weak client CDK6, and strong client CDK9. By using molecular simulations of multiple crystal structures we characterized conformational ensembles and collective dynamics of CDK proteins. We found that the elevated dynamics of CDK9 can trigger imbalances in cooperative collective motions and reduce stability of the active fold, thus creating a cascade of favorable conditions for chaperone intervention. The ensemble-based modeling of residue interaction networks and community analysis determined how differences in modularity of allosteric networks and topography of communication pathways can be linked with the client status of CDK proteins. 
This analysis unveiled depleted modularity of the allosteric network in CDK9 that alters distribution of communication pathways and leads to impaired signaling in the client kinase. According to our results, these network features may uniquely define chaperone dependencies of CDK clients. The perturbation response scanning and rigidity decomposition approaches identified regulatory hotspots that mediate differences in stability and cooperativity of allosteric interaction networks in the CDK structures. By combining these synergistic approaches, our study revealed dynamic and network signatures that can differentiate kinase clients and rationalize subtle divergences in the activation mechanisms of CDK family members. The therapeutic implications of these results are illustrated by identifying structural hotspots of pathogenic mutations that preferentially target regions of the increased flexibility to enable modulation of activation changes. Our study offers a network-based perspective on dynamic kinase mechanisms and drug design by unravelling relationships between protein kinase dynamics, allosteric communications and chaperone dependencies. PMID:29095844

  3. Molecular dynamics-based refinement and validation for sub-5 Å cryo-electron microscopy maps.

    PubMed

    Singharoy, Abhishek; Teo, Ivan; McGreevy, Ryan; Stone, John E; Zhao, Jianhua; Schulten, Klaus

    2016-07-07

    Two structure determination methods, based on the molecular dynamics flexible fitting (MDFF) paradigm, are presented that resolve sub-5 Å cryo-electron microscopy (EM) maps with either single structures or ensembles of such structures. The methods, denoted cascade MDFF and resolution exchange MDFF, sequentially re-refine a search model against a series of maps of progressively higher resolutions, which ends with the original experimental resolution. Application of sequential re-refinement enables MDFF to achieve a radius of convergence of ~25 Å demonstrated with the accurate modeling of β-galactosidase and TRPV1 proteins at 3.2 Å and 3.4 Å resolution, respectively. The MDFF refinements uniquely offer map-model validation and B-factor determination criteria based on the inherent dynamics of the macromolecules studied, captured by means of local root mean square fluctuations. The MDFF tools described are available to researchers through an easy-to-use and cost-effective cloud computing resource on Amazon Web Services.

  4. Skin lesion computational diagnosis of dermoscopic images: Ensemble models based on input feature manipulation.

    PubMed

    Oliveira, Roberta B; Pereira, Aledir S; Tavares, João Manuel R S

    2017-10-01

    The number of deaths worldwide due to melanoma has risen in recent times, in part because melanoma is the most aggressive type of skin cancer. Computational systems have been developed to assist dermatologists in early diagnosis of skin cancer, or even to monitor skin lesions. However, there still remains a challenge to improve classifiers for the diagnosis of such skin lesions. The main objective of this article is to evaluate different ensemble classification models based on input feature manipulation to diagnose skin lesions. Input feature manipulation processes are based on feature subset selections from shape properties, colour variation and texture analysis to generate diversity for the ensemble models. Three subset selection models are presented here: (1) a subset selection model based on specific feature groups, (2) a correlation-based subset selection model, and (3) a subset selection model based on feature selection algorithms. Each ensemble classification model is generated using an optimum-path forest classifier and integrated with a majority voting strategy. The proposed models were applied on a set of 1104 dermoscopic images using a cross-validation procedure. The best results were obtained by the first ensemble classification model that generates a feature subset ensemble based on specific feature groups. The skin lesion diagnosis computational system achieved 94.3% accuracy, 91.8% sensitivity and 96.7% specificity. The input feature manipulation process based on specific feature subsets generated the greatest diversity for the ensemble classification model with very promising results. Copyright © 2017 Elsevier B.V. All rights reserved.

  5. Ensemble Classifier Strategy Based on Transient Feature Fusion in Electronic Nose

    NASA Astrophysics Data System (ADS)

    Bagheri, Mohammad Ali; Montazer, Gholam Ali

    2011-09-01

    In this paper, we test the performance of several ensembles of classifiers in which each base learner has been trained on a different type of extracted feature. Experimental results show the potential benefits introduced by the usage of simple ensemble classification systems for the integration of different types of transient features.

  6. An Effective Approach for Clustering InhA Molecular Dynamics Trajectory Using Substrate-Binding Cavity Features

    PubMed Central

    Ruiz, Duncan D. A.; Norberto de Souza, Osmar

    2015-01-01

    Protein receptor conformations, obtained from molecular dynamics (MD) simulations, have become a promising treatment of its explicit flexibility in molecular docking experiments applied to drug discovery and development. However, incorporating the entire ensemble of MD conformations in docking experiments to screen large candidate compound libraries is currently an unfeasible task. Clustering algorithms have been widely used as a means to reduce such ensembles to a manageable size. Most studies investigate different algorithms using pairwise Root-Mean Square Deviation (RMSD) values for all, or part of the MD conformations. Nevertheless, the RMSD only may not be the most appropriate gauge to cluster conformations when the target receptor has a plastic active site, since they are influenced by changes that occur on other parts of the structure. Hence, we have applied two partitioning methods (k-means and k-medoids) and four agglomerative hierarchical methods (Complete linkage, Ward’s, Unweighted Pair Group Method and Weighted Pair Group Method) to analyze and compare the quality of partitions between a data set composed of properties from an enzyme receptor substrate-binding cavity and two data sets created using different RMSD approaches. Ensembles of representative MD conformations were generated by selecting a medoid of each group from all partitions analyzed. We investigated the performance of our new method for evaluating binding conformation of drug candidates to the InhA enzyme, which were performed by cross-docking experiments between a 20 ns MD trajectory and 20 different ligands. Statistical analyses showed that the novel ensemble, which is represented by only 0.48% of the MD conformations, was able to reproduce 75% of all dynamic behaviors within the binding cavity for the docking experiments performed. 
Moreover, this new approach not only outperforms the other two RMSD-clustering solutions, but it also shows to be a promising strategy to distill biologically relevant information from MD trajectories, especially for docking purposes. PMID:26218832

  7. An Effective Approach for Clustering InhA Molecular Dynamics Trajectory Using Substrate-Binding Cavity Features.

    PubMed

    De Paris, Renata; Quevedo, Christian V; Ruiz, Duncan D A; Norberto de Souza, Osmar

    2015-01-01

    Protein receptor conformations, obtained from molecular dynamics (MD) simulations, have become a promising treatment of its explicit flexibility in molecular docking experiments applied to drug discovery and development. However, incorporating the entire ensemble of MD conformations in docking experiments to screen large candidate compound libraries is currently an unfeasible task. Clustering algorithms have been widely used as a means to reduce such ensembles to a manageable size. Most studies investigate different algorithms using pairwise Root-Mean Square Deviation (RMSD) values for all, or part of the MD conformations. Nevertheless, the RMSD only may not be the most appropriate gauge to cluster conformations when the target receptor has a plastic active site, since they are influenced by changes that occur on other parts of the structure. Hence, we have applied two partitioning methods (k-means and k-medoids) and four agglomerative hierarchical methods (Complete linkage, Ward's, Unweighted Pair Group Method and Weighted Pair Group Method) to analyze and compare the quality of partitions between a data set composed of properties from an enzyme receptor substrate-binding cavity and two data sets created using different RMSD approaches. Ensembles of representative MD conformations were generated by selecting a medoid of each group from all partitions analyzed. We investigated the performance of our new method for evaluating binding conformation of drug candidates to the InhA enzyme, which were performed by cross-docking experiments between a 20 ns MD trajectory and 20 different ligands. Statistical analyses showed that the novel ensemble, which is represented by only 0.48% of the MD conformations, was able to reproduce 75% of all dynamic behaviors within the binding cavity for the docking experiments performed. 
Moreover, this new approach not only outperforms the other two RMSD-clustering solutions, but it also shows to be a promising strategy to distill biologically relevant information from MD trajectories, especially for docking purposes.
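
    The step of reducing each cluster to a single representative conformation can be sketched as follows. This assumes cluster labels are already available from some clustering method and that a pairwise distance matrix has been computed; the one-dimensional feature values and function names are hypothetical.

```python
def medoid(cluster, distance):
    """Index of the member with the smallest total distance to its cluster."""
    return min(cluster, key=lambda i: sum(distance[i][j] for j in cluster))

def representative_ensemble(labels, distance):
    """Pick one medoid per cluster label from a pairwise distance matrix."""
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    return {lab: medoid(members, distance) for lab, members in clusters.items()}

# Hypothetical 1-D cavity feature per MD frame; distance is the absolute difference.
features = [0.0, 0.1, 0.3, 5.0, 5.2, 5.1]
distance = [[abs(a - b) for b in features] for a in features]
labels = [0, 0, 0, 1, 1, 1]  # assumed output of k-medoids or hierarchical clustering
reps = representative_ensemble(labels, distance)
```

    Here frames 1 and 5 are selected, since each has the smallest summed distance to the other frames in its cluster.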

  8. Temperature for a dynamic spin ensemble

    NASA Astrophysics Data System (ADS)

    Ma, Pui-Wai; Dudarev, S. L.; Semenov, A. A.; Woo, C. H.

    2010-09-01

    In molecular dynamics simulations, temperature is evaluated, via the equipartition principle, by computing the mean kinetic energy of atoms. There is no similar recipe yet for evaluating temperature of a dynamic system of interacting spins. By solving semiclassical Langevin spin-dynamics equations, and applying the fluctuation-dissipation theorem, we derive an equation for the temperature of a spin ensemble, expressed in terms of dynamic spin variables. The fact that definitions for the kinetic and spin temperatures are fully consistent is illustrated using large-scale spin dynamics and spin-lattice dynamics simulations.
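
    The kinetic (equipartition) temperature mentioned in the first sentence can be written as a short sketch. It assumes SI units and 3N degrees of freedom with no correction for constraints or removed center-of-mass motion; the spin-temperature expression derived in the paper is not reproduced here, and the function name is illustrative.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K

def kinetic_temperature(masses, velocities):
    """Equipartition estimate from (3/2) N k_B T = sum_i (1/2) m_i |v_i|^2."""
    twice_ke = sum(
        m * (vx * vx + vy * vy + vz * vz)
        for m, (vx, vy, vz) in zip(masses, velocities)
    )
    return twice_ke / (3 * len(masses) * K_B)

# Hypothetical single atom whose speed corresponds to exactly 300 K.
mass = 1.0e-26  # kg
speed = math.sqrt(3 * K_B * 300.0 / mass)
temperature = kinetic_temperature([mass], [(speed, 0.0, 0.0)])
```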

  9. Shallow cumuli ensemble statistics for development of a stochastic parameterization

    NASA Astrophysics Data System (ADS)

    Sakradzija, Mirjana; Seifert, Axel; Heus, Thijs

    2014-05-01

    According to a conventional deterministic approach to the parameterization of moist convection in numerical atmospheric models, a given large scale forcing produces a unique response from the unresolved convective processes. This representation leaves out the small-scale variability of convection: as is known from empirical studies of deep and shallow convective cloud ensembles, there is a whole distribution of sub-grid states corresponding to a given large scale forcing. Moreover, this distribution gets broader with the increasing model resolution. This behavior is also consistent with our theoretical understanding of a coarse-grained nonlinear system. We propose an approach to represent the variability of the unresolved shallow-convective states, including the dependence of the sub-grid states distribution spread and shape on the model horizontal resolution. Starting from the Gibbs canonical ensemble theory, Craig and Cohen (2006) developed a theory for the fluctuations in a deep convective ensemble. The micro-states of a deep convective cloud ensemble are characterized by the cloud-base mass flux, which, according to the theory, is exponentially distributed (Boltzmann distribution). Following their work, we study the shallow cumulus ensemble statistics and the distribution of the cloud-base mass flux. We employ a Large-Eddy Simulation model (LES) and a cloud tracking algorithm, followed by a conditional sampling of clouds at the cloud base level, to retrieve the information about the individual cloud life cycles and the cloud ensemble as a whole. In the case of shallow cumulus cloud ensemble, the distribution of micro-states is a generalized exponential distribution. Based on the empirical and theoretical findings, a stochastic model has been developed to simulate the shallow convective cloud ensemble and to test the convective ensemble theory. 
The stochastic model simulates a compound random process, with the number of convective elements drawn from a Poisson distribution, and cloud properties sub-sampled from a generalized ensemble distribution. We study the role of the different cloud subtypes in a shallow convective ensemble and how the diverse cloud properties and cloud lifetimes affect the system macro-state. To what extent does the cloud-base mass flux distribution deviate from the simple Boltzmann distribution and how does it affect the results from the stochastic model? Is the memory, provided by the finite lifetime of individual clouds, of importance for the ensemble statistics? We also test for the minimal information given as an input to the stochastic model, able to reproduce the ensemble mean statistics and the variability in a convective ensemble. An important property of the resulting distribution of the sub-grid convective states is its scale-adaptivity - the smaller the grid-size, the broader the compound distribution of the sub-grid states.
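
    A compound random process of this kind can be sketched in a few lines. The example below assumes a plain exponential distribution for the per-cloud mass flux (the simpler Boltzmann case, not the generalized distribution found for shallow cumuli) and ignores cloud lifetimes; the parameter values and function names are illustrative only.

```python
import math
import random
import statistics

def sample_poisson(lam, rng):
    """Knuth's multiplication method; adequate for small to moderate lam."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def total_mass_flux(mean_count, mean_flux, rng):
    """Sum of exponentially distributed fluxes over a Poisson number of clouds."""
    n = sample_poisson(mean_count, rng)
    return sum(rng.expovariate(1.0 / mean_flux) for _ in range(n))

def relative_spread(mean_count, mean_flux, n_draws, rng):
    """Standard deviation divided by mean of the total mass flux in a grid cell."""
    draws = [total_mass_flux(mean_count, mean_flux, rng) for _ in range(n_draws)]
    return statistics.pstdev(draws) / statistics.mean(draws)

rng = random.Random(42)
spread_small_cell = relative_spread(5.0, 1.0, 2000, rng)   # few clouds per cell
spread_large_cell = relative_spread(50.0, 1.0, 2000, rng)  # many clouds per cell
```

    For this simplified model the relative spread scales as the square root of 2 divided by the mean cloud count, so a smaller grid cell containing fewer clouds yields a broader distribution, in line with the scale-adaptivity described in the abstract.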

  10. PomBase: a comprehensive online resource for fission yeast

    PubMed Central

    Wood, Valerie; Harris, Midori A.; McDowall, Mark D.; Rutherford, Kim; Vaughan, Brendan W.; Staines, Daniel M.; Aslett, Martin; Lock, Antonia; Bähler, Jürg; Kersey, Paul J.; Oliver, Stephen G.

    2012-01-01

    PomBase (www.pombase.org) is a new model organism database established to provide access to comprehensive, accurate, and up-to-date molecular data and biological information for the fission yeast Schizosaccharomyces pombe to effectively support both exploratory and hypothesis-driven research. PomBase encompasses annotation of genomic sequence and features, comprehensive manual literature curation and genome-wide data sets, and supports sophisticated user-defined queries. The implementation of PomBase integrates a Chado relational database that houses manually curated data with Ensembl software that supports sequence-based annotation and web access. PomBase will provide user-friendly tools to promote curation by experts within the fission yeast community. This will make a key contribution to shaping its content and ensuring its comprehensiveness and long-term relevance. PMID:22039153

  11. Functional Nanopores: A Solid-state Concept for Artificial Reaction Compartments and Molecular Factories.

    PubMed

    Puebla-Hellmann, Gabriel; Mayor, Marcel; Lörtscher, Emanuel

    2016-01-01

    On the road towards the long-term goal of the NCCR Molecular Systems Engineering to create artificial molecular factories, we aim at introducing a compartmentalization strategy based on solid-state silicon technology targeting zeptoliter reaction volumes and simultaneous electrical contact to ensembles of well-oriented molecules. This approach allows the probing of molecular building blocks under a controlled environment prior to their use in a complex molecular factory. Furthermore, these ultra-sensitive electrical conductance measurements allow molecular responses to a variety of external triggers to be used as sensing and feedback mechanisms. So far, we demonstrate the proof-of-concept by electrically contacting self-assembled mono-layers of alkane-dithiols as an established test system. Here, the molecular films are laterally constrained by a circular dielectric confinement, forming a so-called 'nanopore'. Device yields above 85% are consistently achieved down to sub-50 nm nanopore diameters. This generic platform will be extended to create distributed, cascaded reactors with individually addressable reaction sites, including interconnecting micro-fluidic channels for electrochemical communication among nanopores and sensing sites for reaction control and feedback. In this scientific outlook, we will sketch how such a solid-state nanopore concept can be used to study various aspects of molecular compounds tailored for operation in a molecular factory.

  12. Improving Classification Performance through an Advanced Ensemble Based Heterogeneous Extreme Learning Machines.

    PubMed

    Abuassba, Adnan O M; Zhang, Dezheng; Luo, Xiong; Shaheryar, Ahmad; Ali, Hazrat

    2017-01-01

    Extreme Learning Machine (ELM) is a fast-learning algorithm for a single-hidden layer feedforward neural network (SLFN). It often has good generalization performance. However, there are chances that it might overfit the training data due to having more hidden nodes than needed. To address the generalization performance, we use a heterogeneous ensemble approach. We propose an Advanced ELM Ensemble (AELME) for classification, which includes Regularized-ELM, L2-norm-optimized ELM (ELML2), and Kernel-ELM. The ensemble is constructed by training a randomly chosen ELM classifier on a subset of training data selected through random resampling. The proposed AELM-Ensemble is evolved by employing an objective function of increasing diversity and accuracy among the final ensemble. Finally, the class label of unseen data is predicted using majority vote approach. Splitting the training data into subsets and incorporation of heterogeneous ELM classifiers result in higher prediction accuracy, better generalization, and a lower number of base classifiers, as compared to other models (Adaboost, Bagging, Dynamic ELM ensemble, data splitting ELM ensemble, and ELM ensemble). The validity of AELME is confirmed through classification on several real-world benchmark datasets.

  13. Improving Classification Performance through an Advanced Ensemble Based Heterogeneous Extreme Learning Machines

    PubMed Central

    Abuassba, Adnan O. M.; Ali, Hazrat

    2017-01-01

    Extreme Learning Machine (ELM) is a fast-learning algorithm for a single-hidden layer feedforward neural network (SLFN). It often has good generalization performance. However, there are chances that it might overfit the training data due to having more hidden nodes than needed. To address the generalization performance, we use a heterogeneous ensemble approach. We propose an Advanced ELM Ensemble (AELME) for classification, which includes Regularized-ELM, L2-norm-optimized ELM (ELML2), and Kernel-ELM. The ensemble is constructed by training a randomly chosen ELM classifier on a subset of training data selected through random resampling. The proposed AELM-Ensemble is evolved by employing an objective function of increasing diversity and accuracy among the final ensemble. Finally, the class label of unseen data is predicted using majority vote approach. Splitting the training data into subsets and incorporation of heterogeneous ELM classifiers result in higher prediction accuracy, better generalization, and a lower number of base classifiers, as compared to other models (Adaboost, Bagging, Dynamic ELM ensemble, data splitting ELM ensemble, and ELM ensemble). The validity of AELME is confirmed through classification on several real-world benchmark datasets. PMID:28546808

  14. Multimillion atom simulations of dynamics of oxidation of an aluminum nanoparticle and nanoindentation on ceramics.

    PubMed

    Vashishta, Priya; Kalia, Rajiv K; Nakano, Aiichiro

    2006-03-02

    We have developed a first-principles-based hierarchical simulation framework, which seamlessly integrates (1) a quantum mechanical description based on the density functional theory (DFT), (2) multilevel molecular dynamics (MD) simulations based on a reactive force field (ReaxFF) that describes chemical reactions and polarization, a nonreactive force field that employs dynamic atomic charges, and an effective force field (EFF), and (3) an atomistically informed continuum model to reach macroscopic length scales. For scalable hierarchical simulations, we have developed parallel linear-scaling algorithms for (1) DFT calculation based on a divide-and-conquer algorithm on adaptive multigrids, (2) chemically reactive MD based on a fast ReaxFF (F-ReaxFF) algorithm, and (3) EFF-MD based on a space-time multiresolution MD (MRMD) algorithm. On 1920 Intel Itanium2 processors, we have demonstrated 1.4 million atom (0.12 trillion grid points) DFT, 0.56 billion atom F-ReaxFF, and 18.9 billion atom MRMD calculations, with parallel efficiency as high as 0.953. Through the use of these algorithms, multimillion atom MD simulations have been performed to study the oxidation of an aluminum nanoparticle. Structural and dynamic correlations in the oxide region are calculated as well as the evolution of charges, surface oxide thickness, diffusivities of atoms, and local stresses. In the microcanonical ensemble, the oxidizing reaction becomes explosive in both molecular and atomic oxygen environments, due to the enormous energy release associated with Al-O bonding. In the canonical ensemble, an amorphous oxide layer of a thickness of approximately 40 angstroms is formed after 466 ps, in good agreement with experiments. Simulations have been performed to study nanoindentation on crystalline, amorphous, and nanocrystalline silicon nitride and silicon carbide. 
Simulation on nanocrystalline silicon carbide reveals unusual deformation mechanisms in brittle nanophase materials, due to coexistence of brittle grains and soft amorphous-like grain boundary phases. Simulations predict a crossover from intergranular continuous deformation to intragrain discrete deformation at a critical indentation depth.

  15. An ensemble of dynamic neural network identifiers for fault detection and isolation of gas turbine engines.

    PubMed

    Amozegar, M; Khorasani, K

    2016-04-01

    In this paper, a new approach for Fault Detection and Isolation (FDI) of gas turbine engines is proposed by developing an ensemble of dynamic neural network identifiers. For health monitoring of the gas turbine engine, its dynamics is first identified by constructing three separate or individual dynamic neural network architectures. Specifically, a dynamic multi-layer perceptron (MLP), a dynamic radial-basis function (RBF) neural network, and a dynamic support vector machine (SVM) are trained to individually identify and represent the gas turbine engine dynamics. Next, three ensemble-based techniques are developed to represent the gas turbine engine dynamics, namely, two heterogeneous ensemble models and one homogeneous ensemble model. It is first shown that all ensemble approaches do significantly improve the overall performance and accuracy of the developed system identification scheme when compared to each of the stand-alone solutions. The best selected stand-alone model (i.e., the dynamic RBF network) and the best selected ensemble architecture (i.e., the heterogeneous ensemble) in terms of their performances in achieving an accurate system identification are then selected for solving the FDI task. The required residual signals are generated by using both a single model-based solution and an ensemble-based solution under various gas turbine engine health conditions. Our extensive simulation studies demonstrate that the fault detection and isolation task achieved by using the residuals that are obtained from the dynamic ensemble scheme results in a significantly more accurate and reliable performance as illustrated through detailed quantitative confusion matrix analysis and comparative studies. Copyright © 2016 Elsevier Ltd. All rights reserved.

  16. A J-modulated protonless NMR experiment characterizes the conformational ensemble of the intrinsically disordered protein WIP.

    PubMed

    Rozentur-Shkop, Eva; Goobes, Gil; Chill, Jordan H

    2016-12-01

    Intrinsically disordered proteins (IDPs) are multi-conformational polypeptides that lack a single stable three-dimensional structure. It has become increasingly clear that the versatile IDPs play key roles in a multitude of biological processes, and, given their flexible nature, NMR is a leading method to investigate IDP behavior on the molecular level. Here we present an IDP-tailored J-modulated experiment designed to monitor changes in the conformational ensemble characteristic of IDPs by accurately measuring backbone one- and two-bond J(15N, 13Cα) couplings. This concept was realized using a unidirectional (H)NCO 13C-detected experiment suitable for poor spectral dispersion and optimized for maximum coverage of amino acid types. To demonstrate the utility of this approach we applied it to the disordered actin-binding N-terminal domain of WASp interacting protein (WIP), a ubiquitous key modulator of cytoskeletal changes in a range of biological systems. One- and two-bond J(15N, 13Cα) couplings were acquired for WIP residues 2-65 at various temperatures, and in denaturing and crowding environments. Under native conditions fitted J-couplings identified in the WIP conformational ensemble a propensity for extended conformation at residues 16-23 and 45-60, and a helical tendency at residues 28-42. These findings are consistent with a previous study based upon chemical shift and RDC data and confirm that the WIP 2-65 conformational ensemble is biased towards the structure assumed by this fragment in its actin-bound form. The effects of environmental changes upon this ensemble were readily apparent in the J-coupling data, which reflected a significant decrease in structural propensity at higher temperatures, in the presence of 8 M urea, and under the influence of a bacterial cell lysate. The latter suggests that crowding can cause protein unfolding through protein-protein interactions that stabilize the unfolded state. We conclude that J-couplings are a useful measurable in characterizing structural ensembles in IDPs, and that the proposed experiment provides a practical method for accurately performing such measurements, once again emphasizing the power of NMR in studying IDP behavior.

  17. EMPIRE and pyenda: Two ensemble-based data assimilation systems written in Fortran and Python

    NASA Astrophysics Data System (ADS)

    Geppert, Gernot; Browne, Phil; van Leeuwen, Peter Jan; Merker, Claire

    2017-04-01

    We present and compare the features of two ensemble-based data assimilation frameworks, EMPIRE and pyenda. Both frameworks allow models to be coupled to the assimilation codes using the Message Passing Interface (MPI), leading to extremely efficient and fast coupling between models and the data-assimilation codes. The Fortran-based system EMPIRE (Employing Message Passing Interface for Researching Ensembles) is optimized for parallel, high-performance computing. It currently includes a suite of data assimilation algorithms, including variants of the ensemble Kalman filter and several particle filters. EMPIRE is targeted at models of all kinds of complexity and has been coupled to several geoscience models, e.g., the Lorenz-63 model, a barotropic vorticity model, the general circulation model HadCM3, the ocean model NEMO, and the land-surface model JULES. The Python-based system pyenda (Python Ensemble Data Assimilation) allows Fortran- and Python-based models to be used for data assimilation. Models can be coupled either using MPI or by using a Python interface. Using Python allows quick prototyping, and pyenda is aimed at small to medium scale models. pyenda currently includes variants of the ensemble Kalman filter and has been coupled to the Lorenz-63 model, an advection-based precipitation nowcasting scheme, and the dynamic global vegetation model JSBACH.

  18. Determination of the conformational ensemble of the TAR RNA by X-ray scattering interferometry

    PubMed Central

    Walker, Peter

    2017-01-01

    The conformational ensembles of structured RNAs are crucial for biological function, but they remain difficult to elucidate experimentally. We demonstrate with HIV-1 TAR RNA that X-ray scattering interferometry (XSI) can be used to determine RNA conformational ensembles. XSI is based on site-specifically labeling RNA with pairs of heavy atom probes, and precisely measuring the distribution of inter-probe distances that arise from a heterogeneous mixture of RNA solution structures. We show that the XSI-based model of the TAR RNA ensemble closely resembles an independent model derived from NMR-RDC data. Further, we show how the TAR RNA ensemble changes shape at different salt concentrations. Finally, we demonstrate that a single hybrid model of the TAR RNA ensemble simultaneously fits both the XSI and NMR-RDC data sets and show that XSI can be combined with NMR-RDC to further improve the quality of the determined ensemble. The results suggest that XSI-RNA will be a powerful approach for characterizing the solution conformational ensembles of RNAs and RNA-protein complexes under diverse solution conditions. PMID:28108663

  19. Discrete post-processing of total cloud cover ensemble forecasts

    NASA Astrophysics Data System (ADS)

    Hemri, Stephan; Haiden, Thomas; Pappenberger, Florian

    2017-04-01

    This contribution presents an approach to post-process ensemble forecasts for the discrete and bounded weather variable of total cloud cover. Two methods for discrete statistical post-processing of ensemble predictions are tested. The first approach is based on multinomial logistic regression, the second involves a proportional odds logistic regression model. Applying them to total cloud cover raw ensemble forecasts from the European Centre for Medium-Range Weather Forecasts improves forecast skill significantly. Based on station-wise post-processing of raw ensemble total cloud cover forecasts for a global set of 3330 stations over the period from 2007 to early 2014, the more parsimonious proportional odds logistic regression model proved to slightly outperform the multinomial logistic regression model. Reference Hemri, S., Haiden, T., & Pappenberger, F. (2016). Discrete post-processing of total cloud cover ensemble forecasts. Monthly Weather Review 144, 2565-2577.
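
    As a rough illustration of the proportional odds form mentioned above (not the authors' fitted model): the cumulative probabilities P(Y <= k | x) = logistic(theta_k - beta.x) share a single slope vector across all categories, and the category probabilities follow by differencing. The thresholds, coefficient, and predictor value below are made-up numbers chosen only to make the sketch run.

```python
import math

def logistic(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))

def proportional_odds_probs(thresholds, beta, x):
    """Return P(Y = k | x) for k = 0..len(thresholds).

    thresholds must be increasing; beta is one slope vector shared by
    every category, which is what makes the model parsimonious.
    """
    eta = sum(b * xi for b, xi in zip(beta, x))
    # Cumulative probabilities P(Y <= k | x); the last category closes at 1.
    cumulative = [logistic(t - eta) for t in thresholds] + [1.0]
    probs = []
    previous = 0.0
    for c in cumulative:
        probs.append(c - previous)
        previous = c
    return probs

# Hypothetical values, for illustration only.
thresholds = [-1.0, 0.0, 1.5]
beta = [0.8]
x = [0.5]
probs = proportional_odds_probs(thresholds, beta, x)
```

    With K categories this form needs K-1 thresholds plus one slope vector, whereas a multinomial logistic model needs K-1 full coefficient sets.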

  20. Overlapped Partitioning for Ensemble Classifiers of P300-Based Brain-Computer Interfaces

    PubMed Central

    Onishi, Akinari; Natsume, Kiyohisa

    2014-01-01

    A P300-based brain-computer interface (BCI) enables a wide range of people to control devices that improve their quality of life. Ensemble classifiers with naive partitioning were recently applied to the P300-based BCI and these classification performances were assessed. However, they were usually trained on a large amount of training data (e.g., 15300). In this study, we evaluated ensemble linear discriminant analysis (LDA) classifiers with a newly proposed overlapped partitioning method using 900 training data. In addition, the classification performances of the ensemble classifier with naive partitioning and a single LDA classifier were compared. One of three conditions for dimension reduction was applied: the stepwise method, principal component analysis (PCA), or none. The results show that an ensemble stepwise LDA (SWLDA) classifier with overlapped partitioning achieved a better performance than the commonly used single SWLDA classifier and an ensemble SWLDA classifier with naive partitioning. This result implies that the performance of the SWLDA is improved by overlapped partitioning and the ensemble classifier with overlapped partitioning requires less training data than that with naive partitioning. This study contributes towards reducing the required amount of training data and achieving better classification performance. PMID:24695550
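
    The abstract does not spell out the partitioning rule, so the following is only a sketch under assumptions: naive partitioning is taken to mean disjoint blocks of the training set, and overlapped partitioning is taken to mean larger windows that share samples with their neighbours. The function names, the number of partitions (5), and the window size (360) are hypothetical; only the total of 900 training samples comes from the abstract.

```python
def naive_partitions(n_samples, n_parts):
    """Split indices 0..n_samples-1 into n_parts disjoint contiguous blocks."""
    size = n_samples // n_parts
    return [list(range(i * size, (i + 1) * size)) for i in range(n_parts)]

def overlapped_partitions(n_samples, n_parts, part_size):
    """Return n_parts windows of part_size indices with evenly spaced starts.

    Windows wrap around the end of the data, so every partition has the
    same size and neighbouring partitions share samples (assumed scheme).
    """
    step = n_samples // n_parts
    return [
        [(i * step + j) % n_samples for j in range(part_size)]
        for i in range(n_parts)
    ]

naive = naive_partitions(900, 5)             # 5 blocks of 180 samples each
overlapped = overlapped_partitions(900, 5, 360)  # 5 windows of 360 samples each
```

    Under this reading, each base classifier in the overlapped scheme sees twice as many samples as in the naive scheme from the same 900-sample pool, which is one way the same ensemble size could be trained with less data overall.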

  1. Overlapped partitioning for ensemble classifiers of P300-based brain-computer interfaces.

    PubMed

    Onishi, Akinari; Natsume, Kiyohisa

    2014-01-01

    A P300-based brain-computer interface (BCI) enables a wide range of people to control devices that improve their quality of life. Ensemble classifiers with naive partitioning were recently applied to the P300-based BCI and these classification performances were assessed. However, they were usually trained on a large amount of training data (e.g., 15300). In this study, we evaluated ensemble linear discriminant analysis (LDA) classifiers with a newly proposed overlapped partitioning method using 900 training data. In addition, the classification performances of the ensemble classifier with naive partitioning and a single LDA classifier were compared. One of three conditions for dimension reduction was applied: the stepwise method, principal component analysis (PCA), or none. The results show that an ensemble stepwise LDA (SWLDA) classifier with overlapped partitioning achieved a better performance than the commonly used single SWLDA classifier and an ensemble SWLDA classifier with naive partitioning. This result implies that the performance of the SWLDA is improved by overlapped partitioning and the ensemble classifier with overlapped partitioning requires less training data than that with naive partitioning. This study contributes towards reducing the required amount of training data and achieving better classification performance.

  2. EnsembleGraph: Interactive Visual Analysis of Spatial-Temporal Behavior for Ensemble Simulation Data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shu, Qingya; Guo, Hanqi; Che, Limei

    We present a novel visualization framework, EnsembleGraph, for analyzing ensemble simulation data, in order to help scientists understand behavior similarities between ensemble members over space and time. A graph-based representation is used to visualize individual spatiotemporal regions with similar behaviors, which are extracted by hierarchical clustering algorithms. A user interface with multiple-linked views is provided, which enables users to explore, locate, and compare regions that have similar behaviors between ensemble members, and then to investigate and analyze the selected regions in detail. The driving application of this paper is the study of regional emission influences on tropospheric ozone, which is based on ensemble simulations conducted with different anthropogenic emission absences using the MOZART-4 (model of ozone and related tracers, version 4) model. We demonstrate the effectiveness of our method by visualizing the MOZART-4 ensemble simulation data and evaluating the relative regional emission influences on tropospheric ozone concentrations. Positive feedback from domain experts and two case studies demonstrate the efficiency of our method.

  3. Thermostatted molecular dynamics: How to avoid the Toda demon hidden in Nose-Hoover dynamics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Holian, B.L.; Voter, A.F.; Ravelo, R.

    The Nose-Hoover thermostat, which is often used in the hope of modifying molecular dynamics trajectories in order to achieve canonical-ensemble averages, has hidden in it a Toda "demon," which can give rise to unwanted, noncanonical undulations in the instantaneous kinetic temperature. We show how these long-lived oscillations arise from insufficient coupling of the thermostat to the atoms, and give straightforward, practical procedures for avoiding this weak-coupling pathology in isothermal molecular dynamics simulations.
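
    For orientation, the Nose-Hoover equations of motion in their common textbook form are given below; the notation is an assumption and is not taken from the report itself. Here ζ is the thermostat friction variable, Q its fictitious "mass", N_f the number of degrees of freedom, and T the target temperature. A large Q corresponds to weak coupling between the thermostat and the atoms, the regime the abstract identifies as the source of the long-lived oscillations.

```latex
\begin{aligned}
\dot{q}_i &= \frac{p_i}{m_i}, \\
\dot{p}_i &= F_i - \zeta\, p_i, \\
\dot{\zeta} &= \frac{1}{Q}\left( \sum_i \frac{p_i^2}{m_i} - N_f k_B T \right).
\end{aligned}
```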

  4. Recognition of multiple imbalanced cancer types based on DNA microarray data using ensemble classifiers.

    PubMed

    Yu, Hualong; Hong, Shufang; Yang, Xibei; Ni, Jun; Dan, Yuanyuan; Qin, Bin

    2013-01-01

    DNA microarray technology can measure the activities of tens of thousands of genes simultaneously, which provides an efficient way to diagnose cancer at the molecular level. Although this strategy has attracted significant research attention, most studies neglect an important problem, namely, that most DNA microarray datasets are skewed, which causes traditional learning algorithms to produce inaccurate results. Some studies have considered this problem, yet they focus only on binary-class problems. In this paper, we address the multiclass imbalanced classification problem, as encountered in cancer DNA microarray data, by using ensemble learning. We utilized a one-against-all coding strategy to transform the multiclass problem into multiple binary-class problems, to each of which we applied feature subspace, an evolved version of random subspace that generates multiple diverse training subsets. Next, we introduced one of two different correction technologies, namely, decision threshold adjustment or random undersampling, into each training subset to alleviate the damage of class imbalance. Specifically, a support vector machine was used as the base classifier, and a novel voting rule called counter voting was presented for making the final decision. Experimental results on eight skewed multiclass cancer microarray datasets indicate that, unlike many traditional classification approaches, our methods are insensitive to class imbalance.
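
    Two of the steps named above, one-against-all coding and random undersampling, can be sketched as follows. This is a minimal illustration with made-up labels, not the authors' implementation; the function names are hypothetical, and feature subspace, SVM training, and the counter voting rule are not shown.

```python
import random

def one_against_all(labels, positive_class):
    """Recode multiclass labels as 1 (positive_class) versus 0 (all others)."""
    return [1 if y == positive_class else 0 for y in labels]

def random_undersample(binary_labels, rng):
    """Return sorted sample indices with the majority class randomly
    reduced to the size of the minority class."""
    pos = [i for i, y in enumerate(binary_labels) if y == 1]
    neg = [i for i, y in enumerate(binary_labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    kept = rng.sample(majority, len(minority))
    return sorted(minority + kept)

# Made-up, skewed three-class label vector.
labels = ["A"] * 8 + ["B"] * 3 + ["C"] * 1
rng = random.Random(0)  # fixed seed so the sketch is repeatable

binary_b = one_against_all(labels, "B")      # 3 positives, 9 negatives
balanced_idx = random_undersample(binary_b, rng)  # 3 positives + 3 negatives
```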

  5. ClustENM: ENM-Based Sampling of Essential Conformational Space at Full Atomic Resolution

    PubMed Central

    Kurkcuoglu, Zeynep; Bahar, Ivet; Doruker, Pemra

    2016-01-01

    Accurate sampling of conformational space and, in particular, the transitions between functional substates has been a challenge in molecular dynamics (MD) simulations of large biomolecular systems. We developed an Elastic Network Model (ENM)-based computational method, ClustENM, for sampling large conformational changes of biomolecules with various sizes and oligomerization states. ClustENM is an iterative method that combines ENM with energy minimization and clustering steps. It is an unbiased technique, which requires only an initial structure as input, and no information about the target conformation. To test the performance of ClustENM, we applied it to six biomolecular systems: adenylate kinase (AK), calmodulin, p38 MAP kinase, HIV-1 reverse transcriptase (RT), triosephosphate isomerase (TIM), and the 70S ribosomal complex. The generated ensembles of conformers determined at atomic resolution show good agreement with experimental data (979 structures resolved by X-ray and/or NMR) and encompass the subspaces covered in independent MD simulations for TIM, p38, and RT. ClustENM emerges as a computationally efficient tool for characterizing the conformational space of large systems at atomic detail, in addition to generating a representative ensemble of conformers that can be advantageously used in simulating substrate/ligand-binding events. PMID:27494296

  6. Ensemble density variational methods with self- and ghost-interaction-corrected functionals

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pastorczak, Ewa; Pernal, Katarzyna, E-mail: pernalk@gmail.com

    2014-05-14

    Ensemble density functional theory (DFT) offers a way of predicting excited-state energies of atomic and molecular systems without referring to a density response function. Despite significant theoretical work, practical applications of the proposed approximations have been scarce and they do not allow for a fair judgement of the potential usefulness of ensemble DFT with available functionals. In the paper, we investigate two forms of ensemble density functionals formulated within the ensemble DFT framework: the Gross, Oliveira, and Kohn (GOK) functional proposed by Gross et al. [Phys. Rev. A 37, 2809 (1988)] alongside the orbital-dependent eDFT form of the functional introduced by Nagy [J. Phys. B 34, 2363 (2001)] (the acronym eDFT proposed in analogy to eHF – ensemble Hartree-Fock method). Local and semi-local ground-state density functionals are employed in both approaches. Approximate ensemble density functionals contain not only spurious self-interaction but also the so-called ghost-interaction, which has no counterpart in ground-state DFT. We propose how to correct the GOK functional for both kinds of interactions in approximations that go beyond the exact-exchange functional. Numerical applications lead to the conclusion that functionals free of the ghost-interaction by construction, i.e., eDFT, yield much more reliable results than the approximate self- and ghost-interaction-corrected GOK functional. Additionally, a local density functional corrected for self-interaction and employed in the eDFT framework yields excitation energies of an accuracy comparable to that of the uncorrected semi-local eDFT functional.

  7. Understanding the Structural Ensembles of a Highly Extended Disordered Protein

    PubMed Central

    Daughdrill, Gary W.; Kashtanov, Stepan; Stancik, Amber; Hill, Shannon E.; Helms, Gregory; Muschol, Martin

    2013-01-01

    Developing a comprehensive description of the equilibrium structural ensembles for intrinsically disordered proteins (IDPs) is essential to understanding their function. The p53 transactivation domain (p53TAD) is an IDP that interacts with multiple protein partners and contains numerous phosphorylation sites. Multiple techniques were used to investigate the equilibrium structural ensemble of p53TAD in its native and chemically unfolded states. The results from these experiments show that the native state of p53TAD has dimensions similar to a classical random coil while the chemically unfolded state is more extended. To investigate the molecular properties responsible for this behavior, a novel algorithm that generates diverse and unbiased structural ensembles of IDPs was developed. This algorithm was used to generate a large pool of plausible p53TAD structures that were reweighted to identify a subset of structures with the best fit to small angle X-ray scattering data. High weight structures in the native state ensemble show features that are localized to protein binding sites and regions with high proline content. The features localized to the protein binding sites are mostly eliminated in the chemically unfolded ensemble, while the regions with high proline content remain relatively unaffected. Data from NMR experiments support these results, showing that residues from the protein binding sites experience larger environmental changes upon unfolding by urea than regions with high proline content. This behavior is consistent with the urea-induced exposure of nonpolar and aromatic side-chains in the protein binding sites that are partially excluded from solvent in the native state ensemble. PMID:21979461

  8. Performance of multiple docking and refinement methods in the pose prediction D3R prospective Grand Challenge 2016

    NASA Astrophysics Data System (ADS)

    Fradera, Xavier; Verras, Andreas; Hu, Yuan; Wang, Deping; Wang, Hongwu; Fells, James I.; Armacost, Kira A.; Crespo, Alejandro; Sherborne, Brad; Wang, Huijun; Peng, Zhengwei; Gao, Ying-Duo

    2018-01-01

    We describe the performance of multiple pose prediction methods for the D3R 2016 Grand Challenge. The pose prediction challenge includes 36 ligands, which represent 4 chemotypes and some miscellaneous structures against the FXR ligand binding domain. In this study we use a mix of fully automated methods as well as human-guided methods with considerations of both the challenge data and publicly available data. The methods include ensemble docking, colony entropy pose prediction, target selection by molecular similarity, molecular dynamics guided pose refinement, and pose selection by visual inspection. We evaluated the success of our predictions by method, chemotype, and relevance of publicly available data. For the overall data set, ensemble docking, visual inspection, and molecular dynamics guided pose prediction performed the best with overall mean RMSDs of 2.4, 2.2, and 2.2 Å respectively. For several individual challenge molecules, the best performing method is evaluated in light of that particular ligand. We also describe the protein, ligand, and public information data preparations that are typical of our binding mode prediction workflow.

  9. Variation is function: Are single cell differences functionally important?: Testing the hypothesis that single cell variation is required for aggregate function.

    PubMed

    Dueck, Hannah; Eberwine, James; Kim, Junhyong

    2016-02-01

    There is a growing appreciation of the extent of transcriptome variation across individual cells of the same cell type. While expression variation may be a byproduct of, for example, dynamic or homeostatic processes, here we consider whether single-cell molecular variation per se might be crucial for population-level function. Under this hypothesis, molecular variation indicates a diversity of hidden functional capacities within an ensemble of identical cells, and this functional diversity facilitates collective behavior that would be inaccessible to a homogenous population. In reviewing this topic, we explore possible functions that might be carried by a heterogeneous ensemble of cells; however, this question has proven difficult to test, both because methods to manipulate molecular variation are limited and because it is complicated to define, and measure, population-level function. We consider several possible methods to further pursue the hypothesis that variation is function through the use of comparative analysis and novel experimental techniques. © 2015 The Authors. BioEssays published by WILEY Periodicals, Inc.

  10. An ensemble of SVM classifiers based on gene pairs.

    PubMed

    Tong, Muchenxuan; Liu, Kun-Hong; Xu, Chungui; Ju, Wenbin

    2013-07-01

    In this paper, a genetic algorithm (GA) based ensemble support vector machine (SVM) classifier built on gene pairs (GA-ESP) is proposed. The SVMs (base classifiers of the ensemble system) are trained on different informative gene pairs. These gene pairs are selected by the top scoring pair (TSP) criterion. Each of these pairs projects the original microarray expression onto a 2-D space. Extensive permutation of gene pairs may reveal more useful information and potentially lead to an ensemble classifier with satisfactory accuracy and interpretability. GA is further applied to select an optimized combination of base classifiers. The effectiveness of the GA-ESP classifier is evaluated on both binary-class and multi-class datasets. Copyright © 2013 Elsevier Ltd. All rights reserved.
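
    The top scoring pair (TSP) criterion used above to pick gene pairs is commonly defined, for genes i and j, as the absolute difference between the two classes in the frequency with which gene i is expressed below gene j. The sketch below computes that score on a made-up toy matrix; the function names and data are hypothetical, and the SVM training and GA-based selection of base classifiers are not shown.

```python
from itertools import combinations

def tsp_score(expr, labels, i, j):
    """|P(x_i < x_j | class 0) - P(x_i < x_j | class 1)| for genes i and j.

    expr is a list of samples, each a list of expression values;
    labels holds the class (0 or 1) of each sample.
    """
    def freq(cls):
        rows = [row for row, y in zip(expr, labels) if y == cls]
        return sum(1 for row in rows if row[i] < row[j]) / len(rows)
    return abs(freq(0) - freq(1))

def top_pairs(expr, labels, n_pairs):
    """Return the n_pairs highest-scoring (score, i, j) tuples."""
    n_genes = len(expr[0])
    scored = [
        (tsp_score(expr, labels, i, j), i, j)
        for i, j in combinations(range(n_genes), 2)
    ]
    scored.sort(key=lambda t: -t[0])
    return scored[:n_pairs]

# Toy data: 4 samples x 3 genes; the ordering of genes 0 and 1 flips between classes.
expr = [
    [1.0, 5.0, 3.0],
    [2.0, 6.0, 0.5],
    [7.0, 1.0, 3.1],
    [8.0, 2.0, 0.4],
]
labels = [0, 0, 1, 1]
best = top_pairs(expr, labels, 1)
```

    Each selected pair then defines the 2-D projection on which one base classifier would be trained.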

  11. Probing molecular choreography through single-molecule biochemistry.

    PubMed

    van Oijen, Antoine M; Dixon, Nicholas E

    2015-12-01

    Single-molecule approaches are having a dramatic impact on views of how proteins work. The ability to observe molecular properties at the single-molecule level allows characterization of subpopulations and acquisition of detailed kinetic information that would otherwise be hidden in the averaging over an ensemble of molecules. In this Perspective, we discuss how such approaches have successfully been applied to in vitro-reconstituted systems of increasing complexity.

  12. The construction of general basis functions in reweighting ensemble dynamics simulations: Reproduce equilibrium distribution in complex systems from multiple short simulation trajectories

    NASA Astrophysics Data System (ADS)

    Zhang, Chuan-Biao; Ming, Li; Xin, Zhou

    2015-12-01

    Ensemble simulations, which use multiple short independent trajectories from dispersive initial conformations, rather than a single long trajectory as used in traditional simulations, are expected to sample complex systems such as biomolecules much more efficiently. The re-weighted ensemble dynamics (RED) is designed to combine these short trajectories to reconstruct the global equilibrium distribution. In the RED, a number of conformational functions, named basis functions, are applied to relate these trajectories to each other; a detailed-balance-based linear equation is then built, whose solution provides the weights of these trajectories in the equilibrium distribution. Thus, the sufficient and efficient selection of basis functions is critical to the practical application of RED. Here, we review and present a few possible ways to generally construct basis functions for applying the RED in complex molecular systems. In particular, for systems with little a priori knowledge, one can generally use the root mean squared deviation (RMSD) among conformations to split the whole conformational space into a set of cells, then use the RMSD-based cell functions as basis functions. We demonstrate the application of the RED in typical systems, including a two-dimensional toy model, the lattice Potts model, and a short peptide system. The results indicate that the RED with these constructions of basis functions not only samples complex systems more efficiently, but also provides a general way to understand the metastable structure of conformational space. Project supported by the National Natural Science Foundation of China (Grant No. 11175250).

  13. The potential of radar-based ensemble forecasts for flash-flood early warning in the southern Swiss Alps

    NASA Astrophysics Data System (ADS)

    Liechti, K.; Panziera, L.; Germann, U.; Zappa, M.

    2013-10-01

    This study explores the limits of radar-based forecasting for hydrological runoff prediction. Two novel radar-based ensemble forecasting chains for flash-flood early warning are investigated in three catchments in the southern Swiss Alps and set in relation to deterministic discharge forecasts for the same catchments. The first radar-based ensemble forecasting chain is driven by NORA (Nowcasting of Orographic Rainfall by means of Analogues), an analogue-based heuristic nowcasting system to predict orographic rainfall for the following eight hours. The second ensemble forecasting system evaluated is REAL-C2, where the numerical weather prediction model COSMO-2 is initialised with 25 different initial conditions derived from a four-day nowcast with the radar ensemble REAL. Additionally, three deterministic forecasting chains were analysed. The performance of these five flash-flood forecasting systems was analysed for 1389 h between June 2007 and December 2010 for which NORA forecasts were issued, due to the presence of orographic forcing. A clear preference was found for the ensemble approach. Discharge forecasts perform better when forced by NORA and REAL-C2 rather than by deterministic weather radar data. Moreover, it was observed that using an ensemble of initial conditions at the forecast initialisation, as in REAL-C2, significantly improved the forecast skill. These forecasts also perform better than forecasts forced by ensemble rainfall forecasts (NORA) initialised from a single initial condition of the hydrological model. Thus the best results were obtained with the REAL-C2 forecasting chain. However, for regions where REAL cannot be produced, NORA might be an option for forecasting events triggered by orographic precipitation.

  14. Supramolecularly Engineered Circular Bivalent Aptamer for Enhanced Functional Protein Delivery.

    PubMed

    Jiang, Ying; Pan, Xiaoshu; Chang, Jin; Niu, Weijia; Hou, Weijia; Kuai, Hailan; Zhao, Zilong; Liu, Ji; Wang, Ming; Tan, Weihong

    2018-06-06

    Circular bivalent aptamers (cb-apt) comprise an emerging class of chemically engineered aptamers with substantially improved stability and molecular recognition ability. Its therapeutic application, however, is challenged by the lack of functional modules to control the interactions of cb-apt with therapeutics. We present the design of a β-cyclodextrin-modified cb-apt (cb-apt-βCD) and its supramolecular interaction with molecular therapeutics via host-guest chemistry for targeted intracellular delivery. The supramolecular ensemble exhibits high serum stability and enhanced intracellular delivery efficiency compared to a monomeric aptamer. The cb-apt-βCD ensemble delivers green fluorescent protein into targeted cells with efficiency as high as 80%, or cytotoxic saporin to efficiently inhibit tumor cell growth. The strategy of conjugating βCD to cb-apt, and subsequently modulating the supramolecular chemistry of cb-apt-βCD, provides a general platform to expand and diversify the function of aptamers, enabling new biological and therapeutic applications.

  15. The role of ensemble-based statistics in variational assimilation of cloud-affected observations from infrared imagers

    NASA Astrophysics Data System (ADS)

    Hacker, Joshua; Vandenberghe, Francois; Jung, Byoung-Jo; Snyder, Chris

    2017-04-01

    Effective assimilation of cloud-affected radiance observations from space-borne imagers, with the aim of improving cloud analysis and forecasting, has proven to be difficult. Large observation biases, nonlinear observation operators, and non-Gaussian innovation statistics present many challenges. Ensemble-variational data assimilation (EnVar) systems offer the benefits of flow-dependent background error statistics from an ensemble, and the ability of variational minimization to handle nonlinearity. The specific benefits of ensemble statistics, relative to static background errors more commonly used in variational systems, have not been quantified for the problem of assimilating cloudy radiances. A simple experiment framework is constructed with a regional NWP model and operational variational data assimilation system, to provide the basis understanding the importance of ensemble statistics in cloudy radiance assimilation. Restricting the observations to those corresponding to clouds in the background forecast leads to innovations that are more Gaussian. The number of large innovations is reduced compared to the more general case of all observations, but not eliminated. The Huber norm is investigated to handle the fat tails of the distributions, and allow more observations to be assimilated without the need for strict background checks that eliminate them. Comparing assimilation using only ensemble background error statistics with assimilation using only static background error statistics elucidates the importance of the ensemble statistics. Although the cost functions in both experiments converge to similar values after sufficient outer-loop iterations, the resulting cloud water, ice, and snow content are greater in the ensemble-based analysis. The subsequent forecasts from the ensemble-based analysis also retain more condensed water species, indicating that the local environment is more supportive of clouds. 
In this presentation we provide details that explain the apparent benefit from using ensembles for cloudy radiance assimilation in an EnVar context.
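
    The Huber norm mentioned in this abstract can be illustrated with a minimal sketch. The threshold `delta` and the sample innovations below are illustrative assumptions, not values from the study; the point is only that the penalty grows quadratically for small innovations and linearly for large ones, so fat-tailed outliers contribute less to the cost than under a purely quadratic term.

```python
def huber(residual, delta=1.0):
    """Huber penalty: quadratic for |r| <= delta, linear beyond it."""
    a = abs(residual)
    if a <= delta:
        return 0.5 * a * a
    return delta * (a - 0.5 * delta)


# Hypothetical normalized innovations (observation minus background).
innovations = [0.2, -0.5, 1.0, 4.0, -10.0]

# Cost under a purely quadratic penalty versus the Huber penalty.
quadratic_cost = sum(0.5 * r * r for r in innovations)
huber_cost = sum(huber(r, delta=1.0) for r in innovations)
```

    With these made-up numbers the two largest innovations dominate the quadratic cost, while under the Huber penalty they are down-weighted rather than rejected outright.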

  16. Creating "Intelligent" Ensemble Averages Using a Process-Based Framework

    NASA Astrophysics Data System (ADS)

    Baker, Noel; Taylor, Patrick

    2014-05-01

    The CMIP5 archive contains future climate projections from over 50 models provided by dozens of modeling centers from around the world. Individual model projections, however, are subject to biases created by structural model uncertainties. As a result, ensemble averaging of multiple models is used to add value to individual model projections and construct a consensus projection. Previous reports for the IPCC establish climate change projections based on an equal-weighted average of all model projections. However, individual models reproduce certain climate processes better than other models. Should models be weighted based on performance? Unequal ensemble averages have previously been constructed using a variety of mean state metrics. What metrics are most relevant for constraining future climate projections? This project develops a framework for systematically testing metrics in models to identify optimal metrics for unequal weighting multi-model ensembles. The intention is to produce improved ("intelligent") unequal-weight ensemble averages. A unique aspect of this project is the construction and testing of climate process-based model evaluation metrics. A climate process-based metric is defined as a metric based on the relationship between two physically related climate variables—e.g., outgoing longwave radiation and surface temperature. Several climate process metrics are constructed using high-quality Earth radiation budget data from NASA's Clouds and Earth's Radiant Energy System (CERES) instrument in combination with surface temperature data sets. It is found that regional values of tested quantities can vary significantly when comparing the equal-weighted ensemble average and an ensemble weighted using the process-based metric. 
Additionally, this study investigates the dependence of the metric weighting scheme on the climate state using a combination of model simulations including a non-forced preindustrial control experiment, historical simulations, and several radiative forcing Representative Concentration Pathway (RCP) scenarios. Ultimately, the goal of the framework is to advise better methods for ensemble averaging models and create better climate predictions.
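
    The contrast between equal-weighted and unequal-weighted ensemble averages can be sketched as follows. The inverse-error weighting rule, the function names, and all numbers are assumptions made for illustration; the abstract does not state how the process-based metric is converted into weights.

```python
def ensemble_mean(projections, weights=None):
    """Weighted mean of model projections; equal weights if none are given."""
    if weights is None:
        weights = [1.0] * len(projections)
    total = sum(weights)
    return sum(w * p for w, p in zip(weights, projections)) / total


def skill_weights(metric_errors):
    """Hypothetical rule: weight each model by the inverse of its metric error."""
    return [1.0 / e for e in metric_errors]


# Made-up projections from three models and their errors on some metric.
projections = [2.0, 3.0, 4.0]
metric_errors = [0.5, 1.0, 2.0]

equal_weighted = ensemble_mean(projections)
metric_weighted = ensemble_mean(projections, skill_weights(metric_errors))
```

    In this toy case the model with the smallest metric error pulls the weighted average toward its own projection.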

  17. Heterogeneous Ensemble Combination Search Using Genetic Algorithm for Class Imbalanced Data Classification

    PubMed Central

    Haque, Mohammad Nazmul; Noman, Nasimul; Berretta, Regina; Moscato, Pablo

    2016-01-01

    Classification of datasets with imbalanced sample distributions has always been a challenge. In general, a popular approach for enhancing classification performance is the construction of an ensemble of classifiers. However, the performance of an ensemble is dependent on the choice of constituent base classifiers. Therefore, we propose a genetic algorithm-based search method for finding the optimum combination from a pool of base classifiers to form a heterogeneous ensemble. The algorithm, called GA-EoC, utilises 10-fold cross-validation on training data for evaluating the quality of each candidate ensemble. In order to combine the base classifiers’ decisions into the ensemble’s output, we used the simple and widely used majority voting approach. The proposed algorithm, along with the random sub-sampling approach to balance the class distribution, has been used for classifying class-imbalanced datasets. Additionally, if a feature set was not available, we used the (α, β) − k Feature Set method to select a better subset of features for classification. We have tested GA-EoC with three benchmarking datasets from the UCI-Machine Learning repository, one Alzheimer’s disease dataset and a subset of the PubFig database of Columbia University. In general, the performance of the proposed method on the chosen datasets is robust and better than that of the constituent base classifiers and many other well-known ensembles. Based on our empirical study we claim that a genetic algorithm is a superior and reliable approach to heterogeneous ensemble construction and we expect that the proposed GA-EoC would perform consistently in other cases. PMID:26764911
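
    Majority voting over a selected subset of base classifiers can be sketched as below. This is not the GA-EoC algorithm itself: the 0/1 mask stands in for one candidate that a genetic algorithm might evaluate, and the predictions and labels are made up for illustration.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most common label among the votes."""
    return Counter(labels).most_common(1)[0][0]


def ensemble_accuracy(mask, predictions, truth):
    """Accuracy of the majority vote over classifiers selected by a 0/1 mask."""
    chosen = [p for keep, p in zip(mask, predictions) if keep]
    votes = [majority_vote(column) for column in zip(*chosen)]
    return sum(v == t for v, t in zip(votes, truth)) / len(truth)


# Hypothetical true labels and per-classifier predictions on five samples.
truth = [1, 0, 1, 1, 0]
predictions = [
    [1, 0, 1, 0, 0],  # classifier A
    [1, 1, 1, 1, 0],  # classifier B
    [0, 0, 1, 1, 0],  # classifier C
    [0, 1, 0, 0, 1],  # classifier D
]

# Score one candidate ensemble (A, B, C) and one single classifier (A).
accuracy_abc = ensemble_accuracy([1, 1, 1, 0], predictions, truth)
accuracy_a = ensemble_accuracy([1, 0, 0, 0], predictions, truth)
```

    A search procedure would score many such masks and keep the best; in a real setting the score would come from cross-validation rather than a single labelled set.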

  18. Ensemble-based flash-flood modelling: Taking into account hydrodynamic parameters and initial soil moisture uncertainties

    NASA Astrophysics Data System (ADS)

    Edouard, Simon; Vincendon, Béatrice; Ducrocq, Véronique

    2018-05-01

    Intense precipitation events in the Mediterranean often lead to devastating flash floods (FF). FF modelling is affected by several kinds of uncertainties, and Hydrological Ensemble Prediction Systems (HEPS) are designed to take those uncertainties into account. The major source of uncertainty comes from rainfall forcing, and convective-scale meteorological ensemble prediction systems can manage it for forecasting purposes. But other sources are related to the hydrological modelling part of the HEPS. This study focuses on the uncertainties arising from the hydrological model parameters and initial soil moisture, with the aim of designing an ensemble-based version of a hydrological model dedicated to simulations of fast-responding Mediterranean rivers, the ISBA-TOP coupled system. The first step consists in identifying the parameters that have the strongest influence on FF simulations by assuming perfect precipitation. A sensitivity study is carried out first using a synthetic framework and then for several real events and several catchments. Perturbation methods varying the most sensitive parameters as well as initial soil moisture allow designing an ensemble-based version of ISBA-TOP. The first results of this system on some real events are presented. The direct perspective of this work will be to drive this ensemble-based version with the members of a convective-scale meteorological ensemble prediction system to design a complete HEPS for FF forecasting.

  19. Skill of Ensemble Seasonal Probability Forecasts

    NASA Astrophysics Data System (ADS)

    Smith, Leonard A.; Binter, Roman; Du, Hailiang; Niehoerster, Falk

    2010-05-01

    In operational forecasting, the computational complexity of large simulation models is, ideally, justified by enhanced performance over simpler models. We will consider probability forecasts and contrast the skill of ENSEMBLES-based seasonal probability forecasts of interest to the finance sector (specifically temperature forecasts for Nino 3.4 and the Atlantic Main Development Region (MDR)). The ENSEMBLES model simulations will be contrasted against forecasts from statistical models based on the observations (climatological distributions) and empirical dynamics based on the observations but conditioned on the current state (dynamical climatology). For some start dates, individual ENSEMBLES models yield significant skill even at a lead-time of 14 months. The nature of this skill is discussed, and chances of application are noted. Questions surrounding the interpretation of probability forecasts based on these multi-model ensemble simulations are then considered; the distributions considered are formed by kernel dressing the ensemble and blending with the climatology. The sources of apparent (RMS) skill in distributions based on multi-model simulations are discussed, and it is demonstrated that the inclusion of "zero-skill" models in the long range can improve Root-Mean-Square-Error scores, casting some doubt on the common justification for the claim that all models should be included in forming an operational probability forecast. It is argued that the rational response varies with lead time.
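
    Kernel dressing with climatological blending can be sketched as a mixture density. The Gaussian kernel, the Gaussian climatology, the blending weight `alpha`, the kernel width, and the member values are all assumptions chosen for illustration; the abstract does not specify them.

```python
import math


def gaussian_pdf(y, mean, sigma):
    """Density of a normal distribution with the given mean and width."""
    z = (y - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def dressed_blend_pdf(y, members, sigma, alpha, clim_mean, clim_sigma):
    """Blend a kernel-dressed ensemble with a climatological density."""
    dressed = sum(gaussian_pdf(y, m, sigma) for m in members) / len(members)
    return alpha * dressed + (1.0 - alpha) * gaussian_pdf(y, clim_mean, clim_sigma)


# Hypothetical ensemble members (e.g. temperatures in degrees C).
members = [26.5, 27.0, 27.2, 27.9]
step = 0.01
grid = [20.0 + step * i for i in range(1501)]  # 20.0 to 35.0

density = [dressed_blend_pdf(y, members, 0.4, 0.7, 27.0, 1.0) for y in grid]

# Riemann-sum check that the blended forecast is a proper density.
total_probability = sum(density) * step
```

    Setting `alpha` to zero recovers the climatological forecast, which is one way to see why blending guards against overconfident ensembles at long lead times.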

  20. Monte Carlo replica-exchange based ensemble docking of protein conformations.

    PubMed

    Zhang, Zhe; Ehmann, Uwe; Zacharias, Martin

    2017-05-01

    A replica-exchange Monte Carlo (REMC) ensemble docking approach has been developed that allows efficient exploration of protein-protein docking geometries. In addition to Monte Carlo steps in translation and orientation of binding partners, possible conformational changes upon binding are included based on Monte Carlo selection of protein conformations stored as ordered pregenerated conformational ensembles. The conformational ensembles of each binding partner protein were generated by three different approaches starting from the unbound partner protein structure, with a range spanning a root mean square deviation of 1-2.5 Å with respect to the unbound structure. Because MC sampling is performed to select appropriate partner conformations on the fly, the approach is not limited by the number of conformations in the ensemble compared to ensemble docking of each conformer pair in ensemble cross docking. Although only a fraction of generated conformers was in closer agreement with the bound structure, the REMC ensemble docking approach achieved improved docking results compared to REMC docking with only the unbound partner structures or using docking energy minimization methods. The approach has significant potential for further improvement in combination with more realistic structural ensembles and better docking scoring functions. Proteins 2017; 85:924-937. © 2016 Wiley Periodicals, Inc. © 2017 Wiley Periodicals, Inc.
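
    Monte Carlo selection of a conformer from an ordered, pregenerated ensemble can be sketched with a Metropolis rule. This is a simplified single-replica illustration under assumed energies and an assumed neighbour-step proposal; it omits the replica exchange and the rigid-body moves described in the abstract and is not the authors' implementation.

```python
import math
import random


def metropolis_accept(delta_energy, kT, rng):
    """Standard Metropolis criterion for a proposed move."""
    if delta_energy <= 0.0:
        return True
    return rng.random() < math.exp(-delta_energy / kT)


def sample_conformer(energies, kT, steps, seed=0):
    """Walk over conformer indices, proposing a neighbouring conformer each step."""
    rng = random.Random(seed)
    index = 0  # assumed: position 0 holds the unbound structure
    visits = [0] * len(energies)
    for _ in range(steps):
        proposal = index + rng.choice((-1, 1))
        # Proposals outside the ensemble are rejected, keeping the walk symmetric.
        if 0 <= proposal < len(energies):
            if metropolis_accept(energies[proposal] - energies[index], kT, rng):
                index = proposal
        visits[index] += 1
    return visits


# Hypothetical docking energies for four ordered conformers.
energies = [0.0, -1.0, -3.0, -0.5]
visits = sample_conformer(energies, kT=1.0, steps=20000)
```

    Because only one conformer pair is evaluated per step, the cost per step does not grow with the size of the ensemble, which is the property the abstract highlights.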

  1. Active classifier selection for RGB-D object categorization using a Markov random field ensemble method

    NASA Astrophysics Data System (ADS)

    Durner, Maximilian; Márton, Zoltán; Hillenbrand, Ulrich; Ali, Haider; Kleinsteuber, Martin

    2017-03-01

    In this work, a new ensemble method for the task of category recognition in different environments is presented. The focus is on service robotic perception in an open environment, where the robot's task is to recognize previously unseen objects of predefined categories, based on training on a public dataset. We propose an ensemble learning approach to be able to flexibly combine complementary sources of information (different state-of-the-art descriptors computed on color and depth images), based on a Markov Random Field (MRF). By exploiting its specific characteristics, the MRF ensemble method can also be executed as a Dynamic Classifier Selection (DCS) system. In the experiments, the committee- and topology-dependent performance boost of our ensemble is shown. Despite reduced computational costs and using less information, our strategy performs on the same level as common ensemble approaches. Finally, the impact of large differences between datasets is analyzed.

  2. Replica exchange and expanded ensemble simulations as Gibbs sampling: simple improvements for enhanced mixing.

    PubMed

    Chodera, John D; Shirts, Michael R

    2011-11-21

    The widespread popularity of replica exchange and expanded ensemble algorithms for simulating complex molecular systems in chemistry and biophysics has generated much interest in discovering new ways to enhance the phase space mixing of these protocols in order to improve sampling of uncorrelated configurations. Here, we demonstrate how both of these classes of algorithms can be considered as special cases of Gibbs sampling within a Markov chain Monte Carlo framework. Gibbs sampling is a well-studied scheme in the field of statistical inference in which different random variables are alternately updated from conditional distributions. While the update of the conformational degrees of freedom by Metropolis Monte Carlo or molecular dynamics unavoidably generates correlated samples, we show how judicious updating of the thermodynamic state indices--corresponding to thermodynamic parameters such as temperature or alchemical coupling variables--can substantially increase mixing while still sampling from the desired distributions. We show how state update methods in common use can lead to suboptimal mixing, and present some simple, inexpensive alternatives that can increase mixing of the overall Markov chain, reducing simulation times necessary to obtain estimates of the desired precision. These improved schemes are demonstrated for several common applications, including an alchemical expanded ensemble simulation, parallel tempering, and multidimensional replica exchange umbrella sampling.
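
    The idea of updating the thermodynamic state index by Gibbs sampling can be sketched for a temperature-based expanded ensemble. Here the state index k is drawn from its conditional distribution given the current configuration, p(k | x) proportional to exp(g_k - beta_k U(x)). The inverse temperatures, log weights, and energy value are illustrative assumptions.

```python
import math
import random


def state_conditional(potential_energy, betas, log_weights):
    """Conditional p(k | x), proportional to exp(g_k - beta_k * U(x))."""
    logits = [g - b * potential_energy for b, g in zip(betas, log_weights)]
    top = max(logits)  # subtract the maximum for numerical stability
    unnormalized = [math.exp(v - top) for v in logits]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def gibbs_update_state(potential_energy, betas, log_weights, rng):
    """Draw a new state index directly from its conditional distribution."""
    probabilities = state_conditional(potential_energy, betas, log_weights)
    return rng.choices(range(len(betas)), weights=probabilities, k=1)[0]


# Hypothetical two-state example: U(x) = 2.0, two inverse temperatures.
probabilities = state_conditional(2.0, [1.0, 0.5], [0.0, 0.0])
new_state = gibbs_update_state(2.0, [1.0, 0.5], [0.0, 0.0], random.Random(0))
```

    Sampling the index from the full conditional, rather than proposing only neighbouring states, is one of the simple changes that can improve mixing over the state space.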

  3. Coherent Spin Control at the Quantum Level in an Ensemble-Based Optical Memory.

    PubMed

    Jobez, Pierre; Laplane, Cyril; Timoney, Nuala; Gisin, Nicolas; Ferrier, Alban; Goldner, Philippe; Afzelius, Mikael

    2015-06-12

    Long-lived quantum memories are essential components of a long-standing goal of remote distribution of entanglement in quantum networks. These can be realized by storing the quantum states of light as single-spin excitations in atomic ensembles. However, spin states are often subjected to different dephasing processes that limit the storage time, which in principle could be overcome using spin-echo techniques. Theoretical studies suggest this to be challenging due to unavoidable spontaneous emission noise in ensemble-based quantum memories. Here, we demonstrate spin-echo manipulation of a mean spin excitation of 1 in a large solid-state ensemble, generated through storage of a weak optical pulse. After a storage time of about 1 ms we optically read-out the spin excitation with a high signal-to-noise ratio. Our results pave the way for long-duration optical quantum storage using spin-echo techniques for any ensemble-based memory.

  4. Action of Molecular Switches in GPCRs - Theoretical and Experimental Studies

    PubMed Central

    Trzaskowski, B; Latek, D; Yuan, S; Ghoshdastider, U; Debinski, A; Filipek, S

    2012-01-01

    G protein coupled receptors (GPCRs), also called 7TM receptors, form a huge superfamily of membrane proteins that, upon activation by extracellular agonists, pass the signal to the cell interior. Ligands can bind either to extracellular N-terminus and loops (e.g. glutamate receptors) or to the binding site within transmembrane helices (Rhodopsin-like family). They are all activated by agonists although a spontaneous auto-activation of an empty receptor can also be observed. Biochemical and crystallographic methods together with molecular dynamics simulations and other theoretical techniques provided models of the receptor activation based on the action of so-called “molecular switches” buried in the receptor structure. They are changed by agonists but also by inverse agonists evoking an ensemble of activation states leading toward different activation pathways. Switches discovered so far include the ionic lock switch, the 3-7 lock switch, the tyrosine toggle switch linked with the nPxxy motif in TM7, and the transmission switch. The latter one was proposed instead of the tryptophan rotamer toggle switch because no change of the rotamer was observed in structures of activated receptors. The global toggle switch suggested earlier consisting of a vertical rigid motion of TM6, seems also to be implausible based on the recent crystal structures of GPCRs with agonists. Theoretical and experimental methods (crystallography, NMR, specific spectroscopic methods like FRET/BRET but also single-molecule-force-spectroscopy) are currently used to study the effect of ligands on the receptor structure, location of stable structural segments/domains of GPCRs, and to answer the still open question on how ligands are binding: either via ensemble of conformational receptor states or rather via induced fit mechanisms. 
On the other hand the structural investigations of homo- and heterodimers and higher oligomers revealed the mechanism of allosteric signal transmission and receptor activation that could lead to design highly effective and selective allosteric or ago-allosteric drugs. PMID:22300046

  5. Action of molecular switches in GPCRs--theoretical and experimental studies.

    PubMed

    Trzaskowski, B; Latek, D; Yuan, S; Ghoshdastider, U; Debinski, A; Filipek, S

    2012-01-01

    G protein coupled receptors (GPCRs), also called 7TM receptors, form a huge superfamily of membrane proteins that, upon activation by extracellular agonists, pass the signal to the cell interior. Ligands can bind either to extracellular N-terminus and loops (e.g. glutamate receptors) or to the binding site within transmembrane helices (Rhodopsin-like family). They are all activated by agonists although a spontaneous auto-activation of an empty receptor can also be observed. Biochemical and crystallographic methods together with molecular dynamics simulations and other theoretical techniques provided models of the receptor activation based on the action of so-called "molecular switches" buried in the receptor structure. They are changed by agonists but also by inverse agonists evoking an ensemble of activation states leading toward different activation pathways. Switches discovered so far include the ionic lock switch, the 3-7 lock switch, the tyrosine toggle switch linked with the nPxxy motif in TM7, and the transmission switch. The latter one was proposed instead of the tryptophan rotamer toggle switch because no change of the rotamer was observed in structures of activated receptors. The global toggle switch suggested earlier consisting of a vertical rigid motion of TM6, seems also to be implausible based on the recent crystal structures of GPCRs with agonists. Theoretical and experimental methods (crystallography, NMR, specific spectroscopic methods like FRET/BRET but also single-molecule-force-spectroscopy) are currently used to study the effect of ligands on the receptor structure, location of stable structural segments/domains of GPCRs, and to answer the still open question on how ligands are binding: either via ensemble of conformational receptor states or rather via induced fit mechanisms. 
On the other hand the structural investigations of homo- and heterodimers and higher oligomers revealed the mechanism of allosteric signal transmission and receptor activation that could lead to design highly effective and selective allosteric or ago-allosteric drugs.

  6. Geometric integrator for simulations in the canonical ensemble

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tapias, Diego, E-mail: diego.tapias@nucleares.unam.mx; Sanders, David P., E-mail: dpsanders@ciencias.unam.mx; Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02139

    2016-08-28

    We introduce a geometric integrator for molecular dynamics simulations of physical systems in the canonical ensemble that preserves the invariant distribution in equations arising from the density dynamics algorithm, with any possible type of thermostat. Our integrator thus constitutes a unified framework that allows the study and comparison of different thermostats and of their influence on the equilibrium and non-equilibrium (thermo-)dynamic properties of a system. To show the validity and the generality of the integrator, we implement it with a second-order, time-reversible method and apply it to the simulation of a Lennard-Jones system with three different thermostats, obtaining good conservation of the geometrical properties and recovering the expected thermodynamic results. Moreover, to show the advantage of our geometric integrator over a non-geometric one, we compare the results with those obtained by using the non-geometric Gear integrator, which is frequently used to perform simulations in the canonical ensemble. The non-geometric integrator induces a drift in the invariant quantity, while our integrator has no such drift, thus ensuring that the system is effectively sampling the correct ensemble.

  7. An efficient ensemble learning method for gene microarray classification.

    PubMed

    Osareh, Alireza; Shadgar, Bita

    2013-01-01

    Gene microarray analysis and classification have proven to be an effective approach to the diagnosis of diseases and cancers. However, it has also been shown that basic classification techniques have intrinsic drawbacks in achieving accurate gene classification and cancer diagnosis. On the other hand, classifier ensembles have received increasing attention in various applications. Here, we address the gene classification issue using the RotBoost ensemble methodology. This method is a combination of the Rotation Forest and AdaBoost techniques, which in turn preserves both desirable features of an ensemble architecture, that is, accuracy and diversity. To select a concise subset of informative genes, five different feature selection algorithms are considered. To assess the efficiency of RotBoost, other nonensemble/ensemble techniques including Decision Trees, Support Vector Machines, Rotation Forest, AdaBoost, and Bagging are also deployed. Experimental results have revealed that the combination of the fast correlation-based feature selection method with the ICA-based RotBoost ensemble is highly effective for gene classification. In fact, the proposed method can create ensemble classifiers which outperform not only the classifiers produced by conventional machine learning but also the classifiers generated by two widely used conventional ensemble learning methods, that is, Bagging and AdaBoost.

  8. Determination of the conformational ensemble of the TAR RNA by X-ray scattering interferometry.

    PubMed

    Shi, Xuesong; Walker, Peter; Harbury, Pehr B; Herschlag, Daniel

    2017-05-05

    The conformational ensembles of structured RNAs are crucial for biological function, but they remain difficult to elucidate experimentally. We demonstrate with HIV-1 TAR RNA that X-ray scattering interferometry (XSI) can be used to determine RNA conformational ensembles. XSI is based on site-specifically labeling RNA with pairs of heavy atom probes, and precisely measuring the distribution of inter-probe distances that arise from a heterogeneous mixture of RNA solution structures. We show that the XSI-based model of the TAR RNA ensemble closely resembles an independent model derived from NMR-RDC data. Further, we show how the TAR RNA ensemble changes shape at different salt concentrations. Finally, we demonstrate that a single hybrid model of the TAR RNA ensemble simultaneously fits both the XSI and NMR-RDC data sets and show that XSI can be combined with NMR-RDC to further improve the quality of the determined ensemble. The results suggest that XSI-RNA will be a powerful approach for characterizing the solution conformational ensembles of RNAs and RNA-protein complexes under diverse solution conditions. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  9. Specialized Dynamical Properties of Promiscuous Residues Revealed by Simulated Conformational Ensembles

    PubMed Central

    2013-01-01

    The ability to interact with different partners is one of the most important features in proteins. Proteins that bind a large number of partners (hubs) have been often associated with intrinsic disorder. However, many examples exist of hubs with an ordered structure, and evidence of a general mechanism promoting promiscuity in ordered proteins is still elusive. An intriguing hypothesis is that promiscuous binding sites have specific dynamical properties, distinct from the rest of the interface and pre-existing in the protein isolated state. Here, we present the first comprehensive study of the intrinsic dynamics of promiscuous residues in a large protein data set. Different computational methods, from coarse-grained elastic models to geometry-based sampling methods and to full-atom Molecular Dynamics simulations, were used to generate conformational ensembles for the isolated proteins. The flexibility and dynamic correlations of interface residues with a different degree of binding promiscuity were calculated and compared considering side chain and backbone motions, the latter both on a local and on a global scale. The study revealed that (a) promiscuous residues tend to be more flexible than nonpromiscuous ones, (b) this additional flexibility has a higher degree of organization, and (c) evolutionary conservation and binding promiscuity have opposite effects on intrinsic dynamics. Findings on simulated ensembles were also validated on ensembles of experimental structures extracted from the Protein Data Bank (PDB). Additionally, the low occurrence of single nucleotide polymorphisms observed for promiscuous residues indicated a tendency to preserve binding diversity at these positions. A case study on two ubiquitin-like proteins exemplifies how binding promiscuity in evolutionary related proteins can be modulated by the fine-tuning of the interface dynamics. 
The interplay between promiscuity and flexibility highlighted here can inspire new directions in protein–protein interaction prediction and design methods. PMID:24250278

  10. Conformational Fluctuations in G-Protein-Coupled Receptors

    NASA Astrophysics Data System (ADS)

    Brown, Michael F.

    2014-03-01

    G-protein-coupled receptors (GPCRs) comprise almost 50% of pharmaceutical drug targets, where rhodopsin is an important prototype and occurs naturally in a lipid membrane. Rhodopsin photoactivation entails 11-cis to all-trans isomerization of the retinal cofactor, yielding an equilibrium between inactive Meta-I and active Meta-II states. Two important questions are: (1) Is rhodopsin a simple two-state switch? Or (2) does isomerization of retinal unlock an activated conformational ensemble? For an ensemble-based activation mechanism (EAM) a role for conformational fluctuations is clearly indicated. Solid-state NMR data together with theoretical molecular dynamics (MD) simulations detect increased local mobility of retinal after light activation. Resultant changes in local dynamics of the cofactor initiate large-scale fluctuations of transmembrane helices that expose recognition sites for the signal-transducing G-protein. Time-resolved FTIR studies and electronic spectroscopy further show the conformational ensemble is strongly biased by the membrane lipid composition, as well as pH and osmotic pressure. A new flexible surface model (FSM) describes how the curvature stress field of the membrane governs the energetics of active rhodopsin, due to the spontaneous monolayer curvature of the lipids. Furthermore, influences of osmotic pressure dictate that a large number of bulk water molecules are implicated in rhodopsin activation. Around 60 bulk water molecules activate rhodopsin, which is much larger than the number of structural waters seen in X-ray crystallography, or inferred from studies of bulk hydrostatic pressure. Conformational selection and promoting vibrational motions of rhodopsin lead to activation of the G-protein (transducin). Our biophysical data give a paradigm shift in understanding GPCR activation.
The new view is: dynamics and conformational fluctuations involve an ensemble of substates that activate the cognate G-protein in the amplified visual response.

  11. A Metascalable Computing Framework for Large Spatiotemporal-Scale Atomistic Simulations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nomura, K; Seymour, R; Wang, W

    2009-02-17

    A metascalable (or 'design once, scale on new architectures') parallel computing framework has been developed for large spatiotemporal-scale atomistic simulations of materials based on spatiotemporal data locality principles, which is expected to scale on emerging multipetaflops architectures. The framework consists of: (1) an embedded divide-and-conquer (EDC) algorithmic framework based on spatial locality to design linear-scaling algorithms for high complexity problems; (2) a space-time-ensemble parallel (STEP) approach based on temporal locality to predict long-time dynamics, while introducing multiple parallelization axes; and (3) a tunable hierarchical cellular decomposition (HCD) parallelization framework to map these O(N) algorithms onto a multicore cluster based on hybrid implementation combining message passing and critical section-free multithreading. The EDC-STEP-HCD framework exposes maximal concurrency and data locality, thereby achieving: (1) inter-node parallel efficiency well over 0.95 for 218 billion-atom molecular-dynamics and 1.68 trillion electronic-degrees-of-freedom quantum-mechanical simulations on 212,992 IBM BlueGene/L processors (superscalability); (2) high intra-node, multithreading parallel efficiency (nanoscalability); and (3) nearly perfect time/ensemble parallel efficiency (eon-scalability). The spatiotemporal scale covered by MD simulation on a sustained petaflops computer per day (i.e. petaflops · day of computing) is estimated as NT = 2.14 (e.g. N = 2.14 million atoms for T = 1 microseconds).

  12. Creating "Intelligent" Climate Model Ensemble Averages Using a Process-Based Framework

    NASA Astrophysics Data System (ADS)

    Baker, N. C.; Taylor, P. C.

    2014-12-01

    The CMIP5 archive contains future climate projections from over 50 models provided by dozens of modeling centers from around the world. Individual model projections, however, are subject to biases created by structural model uncertainties. As a result, ensemble averaging of multiple models is often used to add value to model projections: consensus projections have been shown to consistently outperform individual models. Previous reports for the IPCC establish climate change projections based on an equal-weighted average of all model projections. However, certain models reproduce climate processes better than other models. Should models be weighted based on performance? Unequal ensemble averages have previously been constructed using a variety of mean state metrics. What metrics are most relevant for constraining future climate projections? This project develops a framework for systematically testing metrics in models to identify optimal metrics for unequal weighting multi-model ensembles. A unique aspect of this project is the construction and testing of climate process-based model evaluation metrics. A climate process-based metric is defined as a metric based on the relationship between two physically related climate variables—e.g., outgoing longwave radiation and surface temperature. Metrics are constructed using high-quality Earth radiation budget data from NASA's Clouds and Earth's Radiant Energy System (CERES) instrument and surface temperature data sets. It is found that regional values of tested quantities can vary significantly when comparing weighted and unweighted model ensembles. For example, one tested metric weights the ensemble by how well models reproduce the time-series probability distribution of the cloud forcing component of reflected shortwave radiation. 
The weighted ensemble for this metric indicates lower simulated precipitation (up to 0.7 mm/day) in tropical regions than the unweighted ensemble: since CMIP5 models have been shown to overproduce precipitation, this result could indicate that the metric is effective in identifying models which simulate more realistic precipitation. Ultimately, the goal of the framework is to identify performance metrics for advising better methods for ensemble averaging models and create better climate predictions.

  13. DNA Walkers as Transport Vehicles of Nanoparticles Along a Carbon Nanotube Track.

    PubMed

    Pan, Jing; Cha, Tae-Gon; Chen, Haorong; Li, Feiran; Choi, Jong Hyun

    2017-01-01

    DNA-based molecular motors are synthetic analogs of naturally occurring protein motors. Typical DNA walkers are constructed from synthetic short DNA strands and are powered by various free energy changes during hybridization reactions. Due to the constraints set by their small physical dimension and slow kinetics, most DNA walkers are characterized by ensemble measurements that result in averaged kinetics data. Here we present a synthetic DNA walker system that exploits the extraordinary physicochemical properties of nanomaterials and the functionalities of DNA molecules, which enables real-time control and monitoring of single-DNA walkers over an extended period.

  14. "Chemical transformers" from nanoparticle ensembles operated with logic.

    PubMed

    Motornov, Mikhail; Zhou, Jian; Pita, Marcos; Gopishetty, Venkateshwarlu; Tokarev, Ihor; Katz, Evgeny; Minko, Sergiy

    2008-09-01

    The pH-responsive nanoparticles were coupled with information-processing enzyme-based systems to yield "smart" signal-responsive hybrid systems with built-in Boolean logic. The enzyme systems performed AND/OR logic operations, transducing biochemical input signals into reversible structural changes (signal-directed self-assembly) of the nanoparticle assemblies, thus resulting in the processing and amplification of the biochemical signals. The hybrid system mimics biological systems in effective processing of complex biochemical information, resulting in reversible changes of the self-assembled structures of the nanoparticles. The bioinspired approach to the nanostructured morphing materials could be used in future self-assembled molecular robotic systems.

  15. Intelligent Ensemble Forecasting System of Stock Market Fluctuations Based on Symetric and Asymetric Wavelet Functions

    NASA Astrophysics Data System (ADS)

    Lahmiri, Salim; Boukadoum, Mounir

    2015-08-01

    We present a new ensemble system for stock market returns prediction in which the continuous wavelet transform (CWT) is used to analyze return series and backpropagation neural networks (BPNNs) are used for processing CWT-based coefficients, determining the optimal ensemble weights, and providing final forecasts. Particle swarm optimization (PSO) is used for finding optimal weights and biases for each BPNN. To capture symmetry/asymmetry in the underlying data, three wavelet functions with different shapes are adopted. The proposed ensemble system was tested on three Asian stock markets: the Hang Seng, KOSPI, and Taiwan stock market data. Three statistical metrics were used to evaluate the forecasting accuracy: mean of absolute errors (MAE), root mean of squared errors (RMSE), and mean of absolute deviations (MAD). Experimental results showed that the proposed ensemble system outperformed the individual CWT-ANN models, each with a different wavelet function. In addition, the proposed ensemble system outperformed the conventional autoregressive moving average process. As a result, the proposed ensemble system is suitable for capturing symmetry/asymmetry in financial data fluctuations for better prediction accuracy.
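
    Two of the accuracy metrics named above have standard definitions and can be sketched directly. The return values below are hypothetical, and MAD is left out because the abstract does not say which definition of "mean of absolute deviations" is used.

```python
import math

def mae(actual, predicted):
    """Mean of absolute errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean of squared errors."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )

# Hypothetical daily returns and forecasts (not data from the study).
actual = [0.010, -0.020, 0.005, 0.015]
forecast = [0.012, -0.015, 0.000, 0.010]

mae_value = mae(actual, forecast)
rmse_value = rmse(actual, forecast)
print(f"MAE={mae_value:.5f}, RMSE={rmse_value:.5f}")
```

    RMSE is never smaller than MAE on the same errors, so a large gap between the two points to a few large misses rather than a uniform error level.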

  16. Novel forecasting approaches using combination of machine learning and statistical models for flood susceptibility mapping.

    PubMed

    Shafizadeh-Moghadam, Hossein; Valavi, Roozbeh; Shahabi, Himan; Chapi, Kamran; Shirzadi, Ataollah

    2018-07-01

    In this research, eight individual machine learning and statistical models are implemented and compared, and based on their results, seven ensemble models for flood susceptibility assessment are introduced. The individual models included artificial neural networks, classification and regression trees, flexible discriminant analysis, generalized linear model, generalized additive model, boosted regression trees, multivariate adaptive regression splines, and maximum entropy, and the ensemble models were Ensemble Model committee averaging (EMca), Ensemble Model confidence interval Inferior (EMciInf), Ensemble Model confidence interval Superior (EMciSup), Ensemble Model to estimate the coefficient of variation (EMcv), Ensemble Model to estimate the mean (EMmean), Ensemble Model to estimate the median (EMmedian), and Ensemble Model based on weighted mean (EMwmean). The data set covered 201 flood events in the Haraz watershed (Mazandaran province in Iran) and 10,000 randomly selected non-occurrence points. Among the individual models, boosted regression trees achieved the highest Area Under the Receiver Operating Characteristic (AUROC) value (0.975), and the generalized linear model the lowest (0.642). The proposed EMmedian resulted in the highest accuracy (0.976) among all models. Despite the outstanding performance of some models, variability among the predictions of individual models was considerable. Therefore, to reduce uncertainty and create more generalizable, more stable, and less sensitive models, ensemble forecasting approaches, and the EMmedian in particular, are recommended for flood susceptibility assessment. Copyright © 2018 Elsevier Ltd. All rights reserved.
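
    Several of the combination rules listed above can be sketched for a single map location. This is an illustrative reading of the rule names, not the study's code: the probabilities are hypothetical, the 0.5 committee threshold is an assumption, and only the scores 0.975 and 0.642 come from the abstract.

```python
import statistics

def combine(probs, scores, threshold=0.5):
    """Combine per-model flood probabilities for one location.

    probs  : predicted probabilities, one per individual model
    scores : evaluation scores (e.g. AUROC) used as weights for the weighted mean
    """
    mean = sum(probs) / len(probs)
    return {
        "EMmean": mean,
        "EMmedian": statistics.median(probs),
        # Weighted mean: models with higher scores count more.
        "EMwmean": sum(s * p for s, p in zip(scores, probs)) / sum(scores),
        # Committee averaging: share of models voting "flood" after thresholding.
        "EMca": sum(p >= threshold for p in probs) / len(probs),
        # Coefficient of variation: spread of the models relative to their mean.
        "EMcv": statistics.pstdev(probs) / mean,
    }

probs = [0.9, 0.8, 0.2, 0.7, 0.85]         # hypothetical model outputs
scores = [0.975, 0.9, 0.642, 0.8, 0.95]    # first and third values from the abstract

result = combine(probs, scores)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

    The median ignores the single dissenting model entirely, which is one plausible reason a median-based ensemble can be less sensitive than a mean-based one.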

  17. The Oral Tradition in the Sankofa Drum and Dance Ensemble: Student Perceptions

    ERIC Educational Resources Information Center

    Hess, Juliet

    2009-01-01

    The Sankofa Drum and Dance Ensemble is a Ghanaian drum and dance ensemble that focusses on music in the Ewe tradition. It is based in an elementary school in the Greater Toronto Area and consists of students in Grade 4 through Grade 8. Students in the ensemble study Ghanaian traditional Ewe drumming and dancing in the oral tradition. Nine students…

  18. Computer simulation of surface and film processes

    NASA Technical Reports Server (NTRS)

    Tiller, W. A.; Halicioglu, M. T.

    1984-01-01

    All the investigations performed employed, in one way or another, a computer simulation technique based on atomistic-level considerations. In general, three types of simulation methods were used for modeling systems with discrete particles that interact via well-defined potential functions: molecular dynamics (a general method for solving the classical equations of motion of a model system); Monte Carlo (the use of a Markov chain ensemble averaging technique to model equilibrium properties of a system); and molecular statics (which provides properties of a system at T = 0 K). The effects of three-body forces on the vibrational frequencies of triatomic clusters were investigated. The multilayer relaxation phenomena for low-index planes of an fcc crystal were also analyzed as a function of the three-body interactions. Various surface properties for Si and SiC systems were calculated. Results obtained from static simulation calculations for slip formation were presented. The more elaborate molecular dynamics calculations on the propagation of cracks in two-dimensional systems were outlined.
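
    The Markov chain ensemble averaging mentioned above can be illustrated with a Metropolis Monte Carlo sketch. The system here is an assumption made for brevity: a single particle in a one-dimensional harmonic well rather than the many-body potentials of the report, chosen because its exact average is known (<x^2> = kT/k).

```python
import math
import random

def metropolis_average(observable, energy, kT=1.0, steps=100_000,
                       step_size=1.5, seed=0):
    """Estimate a canonical ensemble average with the Metropolis algorithm."""
    rng = random.Random(seed)
    x = 0.0
    e = energy(x)
    total = 0.0
    for _ in range(steps):
        trial = x + rng.uniform(-step_size, step_size)
        e_trial = energy(trial)
        # Accept downhill moves always, uphill moves with Boltzmann probability.
        if e_trial <= e or rng.random() < math.exp(-(e_trial - e) / kT):
            x, e = trial, e_trial
        # The current state is counted whether or not the move was accepted.
        total += observable(x)
    return total / steps

spring_constant = 1.0  # hypothetical force constant, arbitrary units

def harmonic(x):
    return 0.5 * spring_constant * x * x

mean_x2 = metropolis_average(lambda x: x * x, harmonic, kT=1.0)
print(f"<x^2> estimate: {mean_x2:.3f} (exact value kT/k = 1.0)")
```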

  19. Solution NMR Refinement of a Metal Ion Bound Protein Using Metal Ion Inclusive Restrained Molecular Dynamics Methods

    PubMed Central

    Chakravorty, Dhruva K.; Wang, Bing; Lee, Chul Won; Guerra, Alfredo J.; Giedroc, David P.; Merz, Kenneth M.

    2013-01-01

    Correctly calculating the structure of metal coordination sites in a protein during the process of nuclear magnetic resonance (NMR) structure determination and refinement continues to be a challenging task. In this study, we present an accurate and convenient means by which to include metal ions in the NMR structure determination process using molecular dynamics (MD) constrained by NMR-derived data to obtain a realistic and physically viable description of the metal binding site(s). This method provides the framework to accurately portray the metal ions and its binding residues in a pseudo-bond or dummy-cation like approach, and is validated by quantum mechanical/molecular mechanical (QM/MM) MD calculations constrained by NMR-derived data. To illustrate this approach, we refine the zinc coordination complex structure of the zinc sensing transcriptional repressor protein Staphylococcus aureus CzrA, generating over 130 ns of MD and QM/MM MD NMR-data compliant sampling. In addition to refining the first coordination shell structure of the Zn(II) ion, this protocol benefits from being performed in a periodically replicated solvation environment including long-range electrostatics. We determine that unrestrained (not based on NMR data) MD simulations correlated to the NMR data in a time-averaged ensemble. The accurate solution structure ensemble of the metal-bound protein accurately describes the role of conformational dynamics in allosteric regulation of DNA binding by zinc and serves to validate our previous unrestrained MD simulations of CzrA. This methodology has potentially broad applicability in the structure determination of metal ion bound proteins, protein folding and metal template protein-design studies. PMID:23609042

  20. Linking Well-Tempered Metadynamics Simulations with Experiments

    PubMed Central

    Barducci, Alessandro; Bonomi, Massimiliano; Parrinello, Michele

    2010-01-01

    Linking experiments with the atomistic resolution provided by molecular dynamics simulations can shed light on the structure and dynamics of protein-disordered states. The sampling limitations of classical molecular dynamics can be overcome using metadynamics, which is based on the introduction of a history-dependent bias on a small number of suitably chosen collective variables. Even if such bias distorts the probability distribution of the other degrees of freedom, the equilibrium Boltzmann distribution can be reconstructed using a recently developed reweighting algorithm. Quantitative comparison with experimental data is thus possible. Here we show the potential of this combined approach by characterizing the conformational ensemble explored by a 13-residue helix-forming peptide by means of a well-tempered metadynamics/parallel tempering approach and comparing the reconstructed nuclear magnetic resonance scalar couplings with experimental data. PMID:20441734
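
    For orientation, the history-dependent bias of well-tempered metadynamics is usually written as below. The symbols are not defined in the abstract and follow the standard formulation: s denotes the collective variables, \omega the initial deposition rate, \Delta T the bias temperature parameter, F(s) the free energy, and C a constant. The reweighting algorithm itself is not reproduced here.

```latex
% Rate at which bias is deposited at the current position s(t)
\dot{V}(s,t) = \omega \,
  \exp\!\left(-\frac{V(s,t)}{k_B \,\Delta T}\right)
  \delta\big(s - s(t)\big)

% Long-time limit: the bias converges to a scaled free energy
V(s, t \to \infty) = -\frac{\Delta T}{T + \Delta T}\, F(s) + C
```

    Because the deposition rate decays where bias has already accumulated, the collective variables end up sampled as if at the higher temperature T + \Delta T, which is why a reweighting step is needed to recover Boltzmann averages at T.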

  1. Multimodel hydrological ensemble forecasts for the Baskatong catchment in Canada using the TIGGE database.

    NASA Astrophysics Data System (ADS)

    Tito Arandia Martinez, Fabian

    2014-05-01

    Adequate uncertainty assessment is an important issue in hydrological modelling. An important issue for hydropower producers is to obtain ensemble forecasts that truly capture the uncertainty linked to upcoming streamflows. If properly assessed, this uncertainty can lead to optimal reservoir management and energy production (e.g., [1]). The meteorological inputs to the hydrological model account for an important part of the total uncertainty in streamflow forecasting. Since the creation of the THORPEX initiative and the TIGGE database, meteorological ensemble forecasts from nine agencies throughout the world have been made available. This allows for hydrological ensemble forecasts based on multiple meteorological ensemble forecasts. Consequently, both the uncertainty linked to the architecture of the meteorological model and the uncertainty linked to the initial condition of the atmosphere can be accounted for. The main objective of this work is to show that a weighted combination of meteorological ensemble forecasts based on different atmospheric models can lead to improved hydrological ensemble forecasts, for horizons from one to ten days. This experiment is performed for the Baskatong watershed, a head subcatchment of the Gatineau watershed in the province of Quebec, in Canada. The Baskatong watershed is of great importance for hydropower production, as it comprises the main reservoir for the Gatineau watershed, on which there are six hydropower plants managed by Hydro-Québec. Since the 1970s, Hydro-Québec has been using pseudo-ensemble forecasts based on deterministic meteorological forecasts to which variability derived from past forecasting errors is added. We use a combination of meteorological ensemble forecasts from different models (precipitation and temperature) as the main inputs for the hydrological model HSAMI ([2]).
The meteorological ensembles from eight of the nine agencies available through TIGGE are weighted according to their individual performance and combined to form a grand ensemble. Results show that the hydrological forecasts derived from the grand ensemble perform better than the pseudo-ensemble forecasts actually used operationally at Hydro-Québec. References: [1] M. Verbunt, A. Walser, J. Gurtz et al., "Probabilistic flood forecasting with a limited-area ensemble prediction system: Selected case studies," Journal of Hydrometeorology, vol. 8, no. 4, pp. 897-909, Aug, 2007. [2] N. Evora, Valorisation des prévisions météorologiques d'ensemble, Institut de recherche d'Hydro-Québec, 2005. [3] V. Fortin, Le modèle météo-apport HSAMI: historique, théorie et application, Institut de recherche d'Hydro-Québec, 2000.

  2. Unsupervised Learning in an Ensemble of Spiking Neural Networks Mediated by ITDP.

    PubMed

    Shim, Yoonsik; Philippides, Andrew; Staras, Kevin; Husbands, Phil

    2016-10-01

    We propose a biologically plausible architecture for unsupervised ensemble learning in a population of spiking neural network classifiers. A mixture of experts type organisation is shown to be effective, with the individual classifier outputs combined via a gating network whose operation is driven by input timing dependent plasticity (ITDP). The ITDP gating mechanism is based on recent experimental findings. An abstract, analytically tractable model of the ITDP driven ensemble architecture is derived from a logical model based on the probabilities of neural firing events. A detailed analysis of this model provides insights that allow it to be extended into a full, biologically plausible, computational implementation of the architecture which is demonstrated on a visual classification task. The extended model makes use of a style of spiking network, first introduced as a model of cortical microcircuits, that is capable of Bayesian inference, effectively performing expectation maximization. The unsupervised ensemble learning mechanism, based around such spiking expectation maximization (SEM) networks whose combined outputs are mediated by ITDP, is shown to perform the visual classification task well and to generalize to unseen data. The combined ensemble performance is significantly better than that of the individual classifiers, validating the ensemble architecture and learning mechanisms. The properties of the full model are analysed in the light of extensive experiments with the classification task, including an investigation into the influence of different input feature selection schemes and a comparison with a hierarchical STDP based ensemble architecture.

  3. Force Sensor Based Tool Condition Monitoring Using a Heterogeneous Ensemble Learning Model

    PubMed Central

    Wang, Guofeng; Yang, Yinwei; Li, Zhimeng

    2014-01-01

    Tool condition monitoring (TCM) plays an important role in improving machining efficiency and guaranteeing workpiece quality. In order to realize reliable recognition of the tool condition, a robust classifier needs to be constructed to depict the relationship between tool wear states and sensory information. However, because of the complexity of the machining process and the uncertainty of the tool wear evolution, it is hard for a single classifier to fit all the collected samples without sacrificing generalization ability. In this paper, heterogeneous ensemble learning is proposed to realize tool condition monitoring in which the support vector machine (SVM), hidden Markov model (HMM) and radius basis function (RBF) are selected as base classifiers and a stacking ensemble strategy is further used to reflect the relationship between the outputs of these base classifiers and tool wear states. Based on the heterogeneous ensemble learning classifier, an online monitoring system is constructed in which the harmonic features are extracted from force signals and a minimal redundancy and maximal relevance (mRMR) algorithm is utilized to select the most prominent features. To verify the effectiveness of the proposed method, a titanium alloy milling experiment was carried out and samples with different tool wear states were collected to build the proposed heterogeneous ensemble learning classifier. Moreover, the homogeneous ensemble learning model and majority voting strategy are also adopted to make a comparison. The analysis and comparison results show that the proposed heterogeneous ensemble learning classifier performs better in both classification accuracy and stability. PMID:25405514

  4. Force sensor based tool condition monitoring using a heterogeneous ensemble learning model.

    PubMed

    Wang, Guofeng; Yang, Yinwei; Li, Zhimeng

    2014-11-14

    Tool condition monitoring (TCM) plays an important role in improving machining efficiency and guaranteeing workpiece quality. In order to realize reliable recognition of the tool condition, a robust classifier needs to be constructed to depict the relationship between tool wear states and sensory information. However, because of the complexity of the machining process and the uncertainty of the tool wear evolution, it is hard for a single classifier to fit all the collected samples without sacrificing generalization ability. In this paper, heterogeneous ensemble learning is proposed to realize tool condition monitoring in which the support vector machine (SVM), hidden Markov model (HMM) and radius basis function (RBF) are selected as base classifiers and a stacking ensemble strategy is further used to reflect the relationship between the outputs of these base classifiers and tool wear states. Based on the heterogeneous ensemble learning classifier, an online monitoring system is constructed in which the harmonic features are extracted from force signals and a minimal redundancy and maximal relevance (mRMR) algorithm is utilized to select the most prominent features. To verify the effectiveness of the proposed method, a titanium alloy milling experiment was carried out and samples with different tool wear states were collected to build the proposed heterogeneous ensemble learning classifier. Moreover, the homogeneous ensemble learning model and majority voting strategy are also adopted to make a comparison. The analysis and comparison results show that the proposed heterogeneous ensemble learning classifier performs better in both classification accuracy and stability.

  5. Unsupervised Learning in an Ensemble of Spiking Neural Networks Mediated by ITDP

    PubMed Central

    Staras, Kevin

    2016-01-01

    We propose a biologically plausible architecture for unsupervised ensemble learning in a population of spiking neural network classifiers. A mixture of experts type organisation is shown to be effective, with the individual classifier outputs combined via a gating network whose operation is driven by input timing dependent plasticity (ITDP). The ITDP gating mechanism is based on recent experimental findings. An abstract, analytically tractable model of the ITDP driven ensemble architecture is derived from a logical model based on the probabilities of neural firing events. A detailed analysis of this model provides insights that allow it to be extended into a full, biologically plausible, computational implementation of the architecture which is demonstrated on a visual classification task. The extended model makes use of a style of spiking network, first introduced as a model of cortical microcircuits, that is capable of Bayesian inference, effectively performing expectation maximization. The unsupervised ensemble learning mechanism, based around such spiking expectation maximization (SEM) networks whose combined outputs are mediated by ITDP, is shown to perform the visual classification task well and to generalize to unseen data. The combined ensemble performance is significantly better than that of the individual classifiers, validating the ensemble architecture and learning mechanisms. The properties of the full model are analysed in the light of extensive experiments with the classification task, including an investigation into the influence of different input feature selection schemes and a comparison with a hierarchical STDP based ensemble architecture. PMID:27760125

  6. An empirical study of ensemble-based semi-supervised learning approaches for imbalanced splice site datasets.

    PubMed

    Stanescu, Ana; Caragea, Doina

    2015-01-01

    Recent biochemical advances have led to inexpensive, time-efficient production of massive volumes of raw genomic data. Traditional machine learning approaches to genome annotation typically rely on large amounts of labeled data. The process of labeling data can be expensive, as it requires domain knowledge and expert involvement. Semi-supervised learning approaches that can make use of unlabeled data, in addition to small amounts of labeled data, can help reduce the costs associated with labeling. In this context, we focus on the problem of predicting splice sites in a genome using semi-supervised learning approaches. This is a challenging problem, due to the highly imbalanced distribution of the data, i.e., small number of splice sites as compared to the number of non-splice sites. To address this challenge, we propose to use ensembles of semi-supervised classifiers, specifically self-training and co-training classifiers. Our experiments on five highly imbalanced splice site datasets, with positive to negative ratios of 1-to-99, showed that the ensemble-based semi-supervised approaches represent a good choice, even when the amount of labeled data consists of less than 1% of all training data. In particular, we found that ensembles of co-training and self-training classifiers that dynamically balance the set of labeled instances during the semi-supervised iterations show improvements over the corresponding supervised ensemble baselines. In the presence of limited amounts of labeled data, ensemble-based semi-supervised approaches can successfully leverage the unlabeled data to enhance supervised ensembles learned from highly imbalanced data distributions. Given that such distributions are common for many biological sequence classification problems, our work can be seen as a stepping stone towards more sophisticated ensemble-based approaches to biological sequence annotation in a semi-supervised framework.

  7. An empirical study of ensemble-based semi-supervised learning approaches for imbalanced splice site datasets

    PubMed Central

    2015-01-01

    Background: Recent biochemical advances have led to inexpensive, time-efficient production of massive volumes of raw genomic data. Traditional machine learning approaches to genome annotation typically rely on large amounts of labeled data. The process of labeling data can be expensive, as it requires domain knowledge and expert involvement. Semi-supervised learning approaches that can make use of unlabeled data, in addition to small amounts of labeled data, can help reduce the costs associated with labeling. In this context, we focus on the problem of predicting splice sites in a genome using semi-supervised learning approaches. This is a challenging problem, due to the highly imbalanced distribution of the data, i.e., small number of splice sites as compared to the number of non-splice sites. To address this challenge, we propose to use ensembles of semi-supervised classifiers, specifically self-training and co-training classifiers. Results: Our experiments on five highly imbalanced splice site datasets, with positive to negative ratios of 1-to-99, showed that the ensemble-based semi-supervised approaches represent a good choice, even when the amount of labeled data consists of less than 1% of all training data. In particular, we found that ensembles of co-training and self-training classifiers that dynamically balance the set of labeled instances during the semi-supervised iterations show improvements over the corresponding supervised ensemble baselines. Conclusions: In the presence of limited amounts of labeled data, ensemble-based semi-supervised approaches can successfully leverage the unlabeled data to enhance supervised ensembles learned from highly imbalanced data distributions. Given that such distributions are common for many biological sequence classification problems, our work can be seen as a stepping stone towards more sophisticated ensemble-based approaches to biological sequence annotation in a semi-supervised framework. PMID:26356316

  8. A Random Forest-based ensemble method for activity recognition.

    PubMed

    Feng, Zengtao; Mo, Lingfei; Li, Meng

    2015-01-01

    This paper presents a multi-sensor ensemble approach to human physical activity (PA) recognition, using random forest. We designed an ensemble learning algorithm, which integrates several independent Random Forest classifiers based on different sensor feature sets to build a more stable, more accurate and faster classifier for human activity recognition. To evaluate the algorithm, PA data collected from the PAMAP (Physical Activity Monitoring for Aging People), which is a standard, publicly available database, was utilized to train and test. The experimental results show that the algorithm is able to correctly recognize 19 PA types with an accuracy of 93.44%, while the training is faster than others. The ensemble classifier system based on the RF (Random Forest) algorithm can achieve high recognition accuracy and fast calculation.

  9. Application of an Ensemble Smoother to Precipitation Assimilation

    NASA Technical Reports Server (NTRS)

    Zhang, Sara; Zupanski, Dusanka; Hou, Arthur; Zupanski, Milija

    2008-01-01

    Assimilation of precipitation in a global modeling system poses a special challenge in that the observation operators for precipitation processes are highly nonlinear. In the variational approach, substantial development work and model simplifications are required to include precipitation-related physical processes in the tangent linear model and its adjoint. An ensemble based data assimilation algorithm "Maximum Likelihood Ensemble Smoother (MLES)" has been developed to explore the ensemble representation of the precipitation observation operator with nonlinear convection and large-scale moist physics. An ensemble assimilation system based on the NASA GEOS-5 GCM has been constructed to assimilate satellite precipitation data within the MLES framework. The configuration of the smoother takes the time dimension into account for the relationship between state variables and observable rainfall. The full nonlinear forward model ensembles are used to represent components involving the observation operator and its transpose. Several assimilation experiments using satellite precipitation observations have been carried out to investigate the effectiveness of the ensemble representation of the nonlinear observation operator and the data impact of assimilating rain retrievals from the TMI and SSM/I sensors. Preliminary results show that this ensemble assimilation approach is capable of extracting information from nonlinear observations to improve the analysis and forecast if ensemble size is adequate, and a suitable localization scheme is applied. In addition to a dynamically consistent precipitation analysis, the assimilation system produces a statistical estimate of the analysis uncertainty.

  10. Mixture EMOS model for calibrating ensemble forecasts of wind speed.

    PubMed

    Baran, S; Lerch, S

    2016-03-01

    Ensemble model output statistics (EMOS) is a statistical tool for post-processing forecast ensembles of weather variables obtained from multiple runs of numerical weather prediction models in order to produce calibrated predictive probability density functions. The EMOS predictive probability density function is given by a parametric distribution with parameters depending on the ensemble forecasts. We propose an EMOS model for calibrating wind speed forecasts based on weighted mixtures of truncated normal (TN) and log-normal (LN) distributions where model parameters and component weights are estimated by optimizing the values of proper scoring rules over a rolling training period. The new model is tested on wind speed forecasts of the 50-member European Centre for Medium-range Weather Forecasts ensemble, the 11-member Aire Limitée Adaptation dynamique Développement International-Hungary Ensemble Prediction System ensemble of the Hungarian Meteorological Service, and the eight-member University of Washington mesoscale ensemble, and its predictive performance is compared with that of various benchmark EMOS models based on single parametric families and combinations thereof. The results indicate improved calibration of probabilistic forecasts and improved accuracy of point forecasts in comparison with the raw ensemble and climatological forecasts. The mixture EMOS model significantly outperforms the TN and LN EMOS methods; moreover, it provides better calibrated forecasts than the TN-LN combination model and offers an increased flexibility while avoiding covariate selection problems. © 2016 The Authors. Environmetrics published by John Wiley & Sons Ltd.
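
    The shape of a TN-LN mixture predictive density can be sketched as follows. This is an illustration under stated assumptions: the parameter values are hypothetical, the truncation point is taken to be zero, and the step that links parameters to the ensemble members and fits them with a scoring rule is left out.

```python
import math

def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def tn_pdf(x, mu, sigma):
    """Normal density truncated from below at zero (wind speed is non-negative)."""
    if x < 0:
        return 0.0
    return norm_pdf((x - mu) / sigma) / (sigma * norm_cdf(mu / sigma))

def ln_pdf(x, mu_log, sigma_log):
    """Log-normal density with location and scale given on the log scale."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu_log) / sigma_log
    return norm_pdf(z) / (x * sigma_log)

def mixture_pdf(x, w, mu, sigma, mu_log, sigma_log):
    """Weighted mixture: w * TN + (1 - w) * LN."""
    return w * tn_pdf(x, mu, sigma) + (1.0 - w) * ln_pdf(x, mu_log, sigma_log)

# Hypothetical parameters (wind speed in m/s); not fitted to any data.
params = dict(w=0.6, mu=5.0, sigma=2.0, mu_log=1.5, sigma_log=0.4)

# Trapezoidal check that the mixture integrates to one over [0, 60].
h = 0.01
grid = [i * h for i in range(6001)]
values = [mixture_pdf(x, **params) for x in grid]
total = h * (sum(values) - 0.5 * (values[0] + values[-1]))
print(f"integral of mixture density: {total:.4f}")
```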

  11. Algorithms that Defy the Gravity of Learning Curve

    DTIC Science & Technology

    2017-04-28

    three nearest neighbour-based anomaly detectors, i.e., an ensemble of nearest neighbours, a recent nearest neighbour-based ensemble method called iNNE...streams. Note that the change in sample size does not alter the geometrical data characteristics discussed here. 3.1 Experimental Methodology ...need to be answered. 3.6 Comparison with conventional ensemble methods Given the theoretical results, the third aim of this project (i.e., identify the

  12. Sequential ensemble-based optimal design for parameter estimation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Man, Jun; Zhang, Jiangjiang; Li, Weixuan

    2016-10-01

    The ensemble Kalman filter (EnKF) has been widely used in parameter estimation for hydrological models. The focus of most previous studies was to develop more efficient analysis (estimation) algorithms. On the other hand, it is intuitively understandable that a well-designed sampling (data-collection) strategy should provide more informative measurements and subsequently improve the parameter estimation. In this work, a Sequential Ensemble-based Optimal Design (SEOD) method, coupled with EnKF, information theory and sequential optimal design, is proposed to improve the performance of parameter estimation. Based on the first-order and second-order statistics, different information metrics including the Shannon entropy difference (SD), degrees of freedom for signal (DFS) and relative entropy (RE) are used to design the optimal sampling strategy, respectively. The effectiveness of the proposed method is illustrated by synthetic one-dimensional and two-dimensional unsaturated flow case studies. It is shown that the designed sampling strategies can provide more accurate parameter estimation and state prediction compared with conventional sampling strategies. Optimal sampling designs based on various information metrics perform similarly in our cases. The effect of ensemble size on the optimal design is also investigated. Overall, larger ensemble size improves the parameter estimation and convergence of optimal sampling strategy. Although the proposed method is applied to unsaturated flow problems in this study, it can be equally applied in any other hydrological problems.
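
    A single EnKF parameter update and one possible information metric can be sketched for a scalar case. Everything specific here is an assumption for illustration: the perturbed-observation form of the filter, the linear forward model y = 3*theta, the numbers, and the Gaussian approximation used for the entropy difference. The SEOD design loop itself is not shown.

```python
import math
import random
import statistics

def enkf_update(prior, predicted_obs, observation, obs_var, rng):
    """Stochastic (perturbed-observation) EnKF update of a scalar parameter."""
    n = len(prior)
    mean_p = sum(prior) / n
    mean_y = sum(predicted_obs) / n
    cov_py = sum((p - mean_p) * (y - mean_y)
                 for p, y in zip(prior, predicted_obs)) / (n - 1)
    var_y = sum((y - mean_y) ** 2 for y in predicted_obs) / (n - 1)
    gain = cov_py / (var_y + obs_var)  # scalar Kalman gain
    noise_sd = math.sqrt(obs_var)
    return [p + gain * (observation + rng.gauss(0.0, noise_sd) - y)
            for p, y in zip(prior, predicted_obs)]

def gaussian_entropy_difference(prior, posterior):
    """Entropy reduction if both ensembles are treated as Gaussian."""
    return 0.5 * math.log(statistics.variance(prior)
                          / statistics.variance(posterior))

rng = random.Random(42)
prior = [rng.gauss(2.0, 1.0) for _ in range(500)]  # hypothetical prior ensemble
predicted = [3.0 * theta for theta in prior]       # hypothetical forward model
posterior = enkf_update(prior, predicted, observation=7.5, obs_var=0.25, rng=rng)

sd = gaussian_entropy_difference(prior, posterior)
print(f"posterior mean={statistics.mean(posterior):.3f}, "
      f"entropy difference={sd:.3f}")
```

    In a design setting, a quantity of this kind would be evaluated for each candidate measurement so that the most informative one can be selected.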

  13. Effect of the explicit flexibility of the InhA enzyme from Mycobacterium tuberculosis in molecular docking simulations.

    PubMed

    Cohen, Elisangela M L; Machado, Karina S; Cohen, Marcelo; de Souza, Osmar Norberto

    2011-12-22

    Protein/receptor explicit flexibility has recently become an important feature of molecular docking simulations. Taking the flexibility into account brings the docking simulation closer to the receptors' real behaviour in its natural environment. Several approaches have been developed to address this problem. Among them, modelling the full flexibility as an ensemble of snapshots derived from a molecular dynamics simulation (MD) of the receptor has proved very promising. Despite its potential, however, only a few studies have employed this method to probe its effect in molecular docking simulations. We hereby use ensembles of snapshots obtained from three different MD simulations of the InhA enzyme from M. tuberculosis (Mtb), the wild-type (InhA_wt), InhA_I16T, and InhA_I21V mutants to model their explicit flexibility, and to systematically explore their effect in docking simulations with three different InhA inhibitors, namely, ethionamide (ETH), triclosan (TCL), and pentacyano(isoniazid)ferrate(II) (PIF). The use of fully-flexible receptor (FFR) models of InhA_wt, InhA_I16T, and InhA_I21V mutants in docking simulation with the inhibitors ETH, TCL, and PIF revealed significant differences in the way they interact as compared to the rigid, InhA crystal structure (PDB ID: 1ENY). In the latter, only up to five receptor residues interact with the three different ligands. Conversely, in the FFR models this number grows up to an astonishing 80 different residues. The comparison between the rigid crystal structure and the FFR models showed that the inclusion of explicit flexibility, despite the limitations of the FFR models employed in this study, accounts in a substantial manner to the induced fit expected when a protein/receptor and ligand approach each other to interact in the most favourable manner. 
Protein/receptor explicit flexibility, or FFR models, represented as an ensemble of MD simulation snapshots, can lead to a more realistic representation of the induced fit effect expected in the encounter and proper docking of receptors to ligands. The FFR models of InhA explicitly characterize the overall movements of the amino acid residues in helices, strands, loops, and turns, allowing the ligand to properly accommodate itself in the receptor's binding site. Utilization of the intrinsic flexibility of Mtb's InhA enzyme and its mutants in virtual screening via molecular docking simulation may provide a novel platform to guide the rational or dynamical-structure-based drug design of novel inhibitors for Mtb's InhA. We have produced a short video sequence of each ligand (ETH, TCL and PIF) docked to the FFR models of InhA_wt. These videos are available at http://www.inf.pucrs.br/~osmarns/LABIO/Videos_Cohen_et_al_19_07_2011.htm.

  14. Novel layered clustering-based approach for generating ensemble of classifiers.

    PubMed

    Rahman, Ashfaqur; Verma, Brijesh

    2011-05-01

    This paper introduces a novel concept for creating an ensemble of classifiers. The concept is based on generating an ensemble of classifiers through clustering of data at multiple layers. The ensemble classifier model generates a set of alternative clusterings of a dataset at different layers by randomly initializing the clustering parameters and trains a set of base classifiers on the patterns in different clusters at different layers. A test pattern is classified by first finding the appropriate cluster at each layer and then using the corresponding base classifier. The decisions obtained at different layers are fused into a final verdict using majority voting. As the base classifiers are trained on overlapping patterns at different layers, the proposed approach achieves diversity among the individual classifiers. Identification of difficult-to-classify patterns through clustering, as well as achievement of diversity through layering, leads to better classification results, as evidenced by the experimental results.
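
    The layered procedure described above can be sketched roughly as follows. This is an illustration only, not the authors' implementation: the plain k-means routine, the majority-label "base classifier" used inside each cluster, and all function names are assumptions chosen to keep the example self-contained.

```python
import random
from collections import Counter

def nearest(point, centers):
    """Index of the center closest to point (squared Euclidean distance)."""
    return min(range(len(centers)),
               key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centers[i])))

def kmeans(points, k, rng, iters=10):
    """Plain k-means with random initial centers drawn from the data."""
    centers = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        # Keep the old center if a cluster ends up empty.
        centers = [tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[i]
                   for i, g in enumerate(groups)]
    return centers

def train_layers(points, labels, k, n_layers, seed=0):
    """One layer = one randomly initialised clustering plus one decision per cluster.
    The stand-in base classifier is simply the majority label of the cluster."""
    rng = random.Random(seed)
    overall = Counter(labels).most_common(1)[0][0]
    layers = []
    for _ in range(n_layers):
        centers = kmeans(points, k, rng)
        votes = [Counter() for _ in range(k)]
        for p, y in zip(points, labels):
            votes[nearest(p, centers)][y] += 1
        cluster_label = [v.most_common(1)[0][0] if v else overall for v in votes]
        layers.append((centers, cluster_label))
    return layers

def predict(layers, point):
    """Find the cluster in every layer, collect each layer's decision, then majority-vote."""
    decisions = [cluster_label[nearest(point, centers)] for centers, cluster_label in layers]
    return Counter(decisions).most_common(1)[0][0]

# Toy data: two well-separated groups.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
labels = ["a", "a", "a", "a", "b", "b", "b", "b"]
layers = train_layers(points, labels, k=2, n_layers=5)
```

    In a realistic setting the per-cluster majority label would be replaced by a trained classifier; the layering and voting logic would stay the same.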

  15. Generic Learning-Based Ensemble Framework for Small Sample Size Face Recognition in Multi-Camera Networks.

    PubMed

    Zhang, Cuicui; Liang, Xuefeng; Matsuyama, Takashi

    2014-12-08

    Multi-camera networks have gained great interest in video-based surveillance systems for security monitoring, access control, etc. Person re-identification is an essential and challenging task in multi-camera networks, which aims to determine if a given individual has already appeared over the camera network. Individual recognition often uses faces as a trial and requires a large number of samples during the training phase. This is difficult to fulfill due to the limitation of the camera hardware system and the unconstrained image capturing conditions. Conventional face recognition algorithms often encounter the "small sample size" (SSS) problem arising from the small number of training samples compared to the high dimensionality of the sample space. To overcome this problem, interest in the combination of multiple base classifiers has sparked research efforts in ensemble methods. However, existing ensemble methods still leave two questions open: (1) how to define diverse base classifiers from the small data; (2) how to avoid the diversity/accuracy dilemma occurring during ensemble. To address these problems, this paper proposes a novel generic learning-based ensemble framework, which augments the small data by generating new samples based on a generic distribution and introduces a tailored 0-1 knapsack algorithm to alleviate the diversity/accuracy dilemma. More diverse base classifiers can be generated from the expanded face space, and more appropriate base classifiers are selected for ensemble. Extensive experimental results on four benchmarks demonstrate the higher ability of our system to cope with the SSS problem compared to the state-of-the-art system.

  16. Generic Learning-Based Ensemble Framework for Small Sample Size Face Recognition in Multi-Camera Networks

    PubMed Central

    Zhang, Cuicui; Liang, Xuefeng; Matsuyama, Takashi

    2014-01-01

    Multi-camera networks have gained great interest in video-based surveillance systems for security monitoring, access control, etc. Person re-identification is an essential and challenging task in multi-camera networks, which aims to determine if a given individual has already appeared over the camera network. Individual recognition often uses faces as a trial and requires a large number of samples during the training phase. This is difficult to fulfill due to the limitation of the camera hardware system and the unconstrained image capturing conditions. Conventional face recognition algorithms often encounter the “small sample size” (SSS) problem arising from the small number of training samples compared to the high dimensionality of the sample space. To overcome this problem, interest in the combination of multiple base classifiers has sparked research efforts in ensemble methods. However, existing ensemble methods still leave two questions open: (1) how to define diverse base classifiers from the small data; (2) how to avoid the diversity/accuracy dilemma occurring during ensemble. To address these problems, this paper proposes a novel generic learning-based ensemble framework, which augments the small data by generating new samples based on a generic distribution and introduces a tailored 0–1 knapsack algorithm to alleviate the diversity/accuracy dilemma. More diverse base classifiers can be generated from the expanded face space, and more appropriate base classifiers are selected for ensemble. Extensive experimental results on four benchmarks demonstrate the higher ability of our system to cope with the SSS problem compared to the state-of-the-art system. PMID:25494350

  17. A new strategy for snow-cover mapping using remote sensing data and ensemble based systems techniques

    NASA Astrophysics Data System (ADS)

    Roberge, S.; Chokmani, K.; De Sève, D.

    2012-04-01

    The snow cover plays an important role in the hydrological cycle of Quebec (Eastern Canada). Consequently, evaluating its spatial extent interests the authorities responsible for the management of water resources, especially hydropower companies. The main objective of this study is the development of a snow-cover mapping strategy using remote sensing data and ensemble-based systems techniques. Planned to be tested in a near real-time operational mode, this snow-cover mapping strategy has the advantage of providing the probability that a pixel is snow covered, along with its uncertainty. Ensemble systems are made of two key components. First, a method is needed to build an ensemble of classifiers that is as diverse as possible. Second, an approach is required to combine the outputs of the individual classifiers that make up the ensemble in such a way that correct decisions are amplified and incorrect ones are cancelled out. In this study, we demonstrate the potential of ensemble systems for snow-cover mapping using remote sensing data. The chosen classifier is a sequential thresholds algorithm using NOAA-AVHRR data adapted to conditions over Eastern Canada. Its special feature is the use of a combination of six sequential thresholds varying according to the day in the winter season. Two versions of the snow-cover mapping algorithm have been developed: one is specific for autumn (from October 1st to December 31st) and the other for spring (from March 16th to May 31st). In order to build the ensemble-based system, different versions of the algorithm are created by randomly varying its parameters. One hundred of these versions are included in the ensemble. The probability that a pixel is snow, no-snow or cloud covered corresponds to the number of votes for that class among all classifiers.
The overall performance of ensemble based mapping is compared to the overall performance of the chosen classifier, and also with ground observations at meteorological stations.
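
    A minimal sketch of the voting scheme described above, under stated assumptions: the two thresholds, their nominal values, the pixel fields and the perturbation range are all hypothetical placeholders, since the abstract does not list the six AVHRR thresholds actually used.

```python
import random
from collections import Counter

# Hypothetical nominal thresholds (not taken from the study).
NOMINAL = {"cloud_temp_k": 255.0, "snow_reflectance": 0.40}

def make_member(rng, spread=0.05):
    """One ensemble member: nominal thresholds perturbed by up to +/- spread (relative)."""
    return {name: value * (1.0 + rng.uniform(-spread, spread))
            for name, value in NOMINAL.items()}

def classify(pixel, thresholds):
    """Toy sequential-threshold rule: test for cloud first, then snow, else no-snow."""
    if pixel["temp_k"] < thresholds["cloud_temp_k"]:
        return "cloud"
    if pixel["reflectance"] > thresholds["snow_reflectance"]:
        return "snow"
    return "no-snow"

def class_probabilities(pixel, members):
    """Fraction of ensemble members voting for each class."""
    votes = Counter(classify(pixel, m) for m in members)
    return {label: votes[label] / len(members) for label in ("snow", "no-snow", "cloud")}

rng = random.Random(42)
ensemble = [make_member(rng) for _ in range(100)]  # one hundred perturbed versions

clear_snow = class_probabilities({"temp_k": 280.0, "reflectance": 0.80}, ensemble)
borderline = class_probabilities({"temp_k": 280.0, "reflectance": 0.40}, ensemble)
```

    A pixel far from every threshold receives a unanimous vote, while a pixel close to a threshold splits the vote, which is how the ensemble exposes classification uncertainty.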

  18. Use of ultraviolet-fluorescence-based simulation in evaluation of personal protective equipment worn for first assessment and care of a patient with suspected high-consequence infectious disease.

    PubMed

    Hall, S; Poller, B; Bailey, C; Gregory, S; Clark, R; Roberts, P; Tunbridge, A; Poran, V; Evans, C; Crook, B

    2018-06-01

    Variations currently exist across the UK in the choice of personal protective equipment (PPE) used by healthcare workers when caring for patients with suspected high-consequence infectious diseases (HCIDs). The aim was to test the protection afforded to healthcare workers by current PPE ensembles during assessment of a suspected HCID case, and to provide an evidence base to justify proposal of a unified PPE ensemble for healthcare workers across the UK. One 'basic level' (enhanced precautions) PPE ensemble and five 'suspected case' PPE ensembles were evaluated in volunteer trials using 'Violet', an ultraviolet-fluorescence-based simulation exercise to visualize exposure/contamination events. Contamination was photographed and mapped. There were 147 post-simulation and 31 post-doffing contamination events, from a maximum of 980, when evaluating the basic level of PPE. Therefore, this PPE ensemble did not afford adequate protection, primarily due to direct contamination of exposed areas of the skin. For the five suspected case ensembles, 1584 post-simulation contamination events were recorded, from a maximum of 5110. Twelve post-doffing contamination events were also observed (face, two events; neck, one event; forearm, one event; lower legs, eight events). All suspected case PPE ensembles either had post-doffing contamination events or other significant disadvantages to their use. This identified the need to design a unified PPE ensemble and doffing procedure, incorporating the most protective PPE considered for each body area. This work has been presented to, and reviewed by, key stakeholders to decide on a proposed unified ensemble, subject to further evaluation. Crown Copyright © 2018. Published by Elsevier Ltd. All rights reserved.

  19. The Ensemble Canon

    NASA Technical Reports Server (NTRS)

    Mittman, David S

    2011-01-01

    Ensemble is an open architecture for the development, integration, and deployment of mission operations software. Fundamentally, it is an adaptation of the Eclipse Rich Client Platform (RCP), a widespread, stable, and supported framework for component-based application development. By capitalizing on the maturity and availability of the Eclipse RCP, Ensemble offers a low-risk, politically neutral path towards a tighter integration of operations tools. The Ensemble project is a highly successful, ongoing collaboration among NASA Centers. Since 2004, the Ensemble project has supported the development of mission operations software for NASA's Exploration Systems, Science, and Space Operations Directorates.

  20. Project FIRES. Volume 1: Program Overview and Summary, Phase 1B

    NASA Technical Reports Server (NTRS)

    Abeles, F. J.

    1980-01-01

    Overall performance requirements and evaluation methods for firefighters protective equipment were established and published as the Protective Ensemble Performance Standards (PEPS). Current firefighters protective equipment was tested and evaluated against the PEPS requirements, and the preliminary design of a prototype protective ensemble was performed. In phase 1B, the design of the prototype ensemble was finalized. Prototype ensembles were fabricated and then subjected to a series of qualification tests which were based upon the PEPS requirements. Engineering drawings and purchase specifications were prepared for the new protective ensemble.

  1. AceCloud: Molecular Dynamics Simulations in the Cloud.

    PubMed

    Harvey, M J; De Fabritiis, G

    2015-05-26

    We present AceCloud, an on-demand service for molecular dynamics simulations. AceCloud is designed to facilitate the secure execution of large ensembles of simulations on an external cloud computing service (currently Amazon Web Services). The AceCloud client, integrated into the ACEMD molecular dynamics package, provides an easy-to-use interface that abstracts all aspects of interaction with the cloud services. This gives the user the experience that all simulations are running on their local machine, minimizing the learning curve typically associated with the transition to using high performance computing services.

  2. Nationwide validation of ensemble streamflow forecasts from the Hydrologic Ensemble Forecast Service (HEFS) of the U.S. National Weather Service

    NASA Astrophysics Data System (ADS)

    Lee, H. S.; Liu, Y.; Ward, J.; Brown, J.; Maestre, A.; Herr, H.; Fresch, M. A.; Wells, E.; Reed, S. M.; Jones, E.

    2017-12-01

    The National Weather Service's (NWS) Office of Water Prediction (OWP) recently launched a nationwide effort to verify streamflow forecasts from the Hydrologic Ensemble Forecast Service (HEFS) for a majority of forecast locations across the 13 River Forecast Centers (RFCs). Known as the HEFS Baseline Validation (BV), the project involves a joint effort between the OWP and the RFCs. It aims to provide a geographically consistent, statistically robust validation, and a benchmark to guide the operational implementation of the HEFS, inform practical applications, such as impact-based decision support services, and to provide an objective framework for evaluating strategic investments in the HEFS. For the BV, HEFS hindcasts are issued once per day on a 12Z cycle for the period of 1985-2015 with a forecast horizon of 30 days. For the first two weeks, the hindcasts are forced with precipitation and temperature ensemble forecasts from the Global Ensemble Forecast System of the National Centers for Environmental Prediction, and by resampled climatology for the remaining period. The HEFS-generated ensemble streamflow hindcasts are verified using the Ensemble Verification System. Skill is assessed relative to streamflow hindcasts generated from NWS' current operational system, namely climatology-based Ensemble Streamflow Prediction. In this presentation, we summarize the results and findings to date.
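
    Assessing skill "relative to" a reference forecast is commonly expressed as a skill score. The sketch below is an assumption-laden illustration: the abstract does not name the verification metrics, so the continuous ranked probability score (CRPS) is used here only as a familiar choice for ensemble forecasts, and the function names are hypothetical.

```python
def crps_ensemble(members, observation):
    """Sample CRPS of one ensemble forecast against one observation (lower is better)."""
    m = len(members)
    accuracy = sum(abs(x - observation) for x in members) / m
    spread = sum(abs(xi - xj) for xi in members for xj in members) / (2 * m * m)
    return accuracy - spread

def skill_score(forecast_scores, reference_scores):
    """1 - mean(forecast score) / mean(reference score); positive means better than reference."""
    mean_f = sum(forecast_scores) / len(forecast_scores)
    mean_r = sum(reference_scores) / len(reference_scores)
    return 1.0 - mean_f / mean_r

# Toy example: a sharper ensemble versus a wider, climatology-like reference ensemble.
forecast = crps_ensemble([1.0, 2.0, 3.0], observation=2.0)
reference = crps_ensemble([0.0, 2.0, 4.0], observation=2.0)
skill = skill_score([forecast], [reference])
```

    In practice the scores would be averaged over many hindcast dates and locations before the ratio is taken.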

  3. Efficient ensemble system based on the copper binding motif for highly sensitive and selective detection of cyanide ions in 100% aqueous solutions by fluorescent and colorimetric changes.

    PubMed

    Jung, Kwan Ho; Lee, Keun-Hyeung

    2015-09-15

    A peptide-based ensemble for the detection of cyanide ions in 100% aqueous solutions was designed on the basis of the copper binding motif. 7-Nitro-2,1,3-benzoxadiazole-labeled tripeptide (NBD-SSH, NBD-SerSerHis) formed the ensemble with Cu²⁺, leading to a change in the color of the solution from yellow to orange and a complete decrease of fluorescence emission. The ensemble (NBD-SSH-Cu²⁺) sensitively and selectively detected a low concentration of cyanide ions in 100% aqueous solutions by a colorimetric change as well as a fluorescent change. The addition of cyanide ions instantly removed Cu²⁺ from the ensemble (NBD-SSH-Cu²⁺) in 100% aqueous solutions, resulting in a color change of the solution from orange to yellow and a "turn-on" fluorescent response. The detection limits for cyanide ions were lower than the maximum allowable level of cyanide ions in drinking water set by the World Health Organization. The peptide-based ensemble system is expected to be a practical way to detect submicromolar concentrations of cyanide ions in 100% aqueous solutions.

  4. JuPOETs: a constrained multiobjective optimization approach to estimate biochemical model ensembles in the Julia programming language.

    PubMed

    Bassen, David M; Vilkhovoy, Michael; Minot, Mason; Butcher, Jonathan T; Varner, Jeffrey D

    2017-01-25

    Ensemble modeling is a promising approach for obtaining robust predictions and coarse grained population behavior in deterministic mathematical models. Ensemble approaches address model uncertainty by using parameter or model families instead of single best-fit parameters or fixed model structures. Parameter ensembles can be selected based upon simulation error, along with other criteria such as diversity or steady-state performance. Simulations using parameter ensembles can estimate confidence intervals on model variables, and robustly constrain model predictions, despite having many poorly constrained parameters. In this software note, we present a multiobjective based technique to estimate parameter or model ensembles, the Pareto Optimal Ensemble Technique in the Julia programming language (JuPOETs). JuPOETs integrates simulated annealing with Pareto optimality to estimate ensembles on or near the optimal tradeoff surface between competing training objectives. We demonstrate JuPOETs on a suite of multiobjective problems, including test functions with parameter bounds and system constraints as well as for the identification of a proof-of-concept biochemical model with four conflicting training objectives. JuPOETs identified optimal or near optimal solutions approximately six-fold faster than a corresponding implementation in Octave for the suite of test functions. For the proof-of-concept biochemical model, JuPOETs produced an ensemble of parameters that captured the mean of the training data for conflicting data sets, while simultaneously estimating parameter sets that performed well on each of the individual objective functions. JuPOETs is a promising approach for the estimation of parameter and model ensembles using multiobjective optimization. JuPOETs can be adapted to solve many problem types, including mixed binary and continuous variable types, bilevel optimization problems and constrained problems without altering the base algorithm.
JuPOETs is open source, available under an MIT license, and can be installed using the Julia package manager from the JuPOETs GitHub repository.
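
    The idea of keeping parameter sets "on or near the optimal tradeoff surface" can be illustrated with a simple Pareto ranking. This is a generic sketch written in Python for illustration; it is not the JuPOETs API, and the function names and the `max_rank` cutoff are assumptions.

```python
def dominates(a, b):
    """True if error vector a is no worse than b in every objective and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_rank(errors):
    """Rank of each candidate = number of candidates that dominate it (0 = on the front)."""
    return [sum(dominates(other, e) for other in errors) for e in errors]

def near_optimal(errors, max_rank=1):
    """Indices of candidates on or near the tradeoff surface (rank <= max_rank)."""
    return [i for i, rank in enumerate(pareto_rank(errors)) if rank <= max_rank]

# Toy error vectors for two competing training objectives (lower is better).
errors = [(1, 4), (2, 2), (4, 1), (3, 3), (5, 5)]
ranks = pareto_rank(errors)
ensemble_members = near_optimal(errors)
```

    A search procedure such as simulated annealing would generate the candidate parameter sets; the ranking above only decides which of them are retained in the ensemble.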

  5. Potentiometric Titrations for Measuring the Capacitance of Colloidal Photodoped ZnO Nanocrystals.

    PubMed

    Brozek, Carl K; Hartstein, Kimberly H; Gamelin, Daniel R

    2016-08-24

    Colloidal semiconductor nanocrystals offer a unique opportunity to bridge molecular and bulk semiconductor redox phenomena. Here, potentiometric titration is demonstrated as a method for quantifying the Fermi levels and charging potentials of free-standing colloidal n-type ZnO nanocrystals possessing between 0 and 20 conduction-band electrons per nanocrystal, corresponding to carrier densities between 0 and 1.2 × 10²⁰ cm⁻³. Potentiometric titration of colloidal semiconductor nanocrystals has not been described previously, and little precedent exists for analogous potentiometric titration of any soluble reductants involving so many electrons. Linear changes in Fermi level vs charge-carrier density are observed for each ensemble of nanocrystals, with slopes that depend on the nanocrystal size. Analysis indicates that the ensemble nanocrystal capacitance is governed by classical surface electrical double layers, showing no evidence of quantum contributions. Systematic shifts in the Fermi level are also observed with specific changes in the identity of the charge-compensating countercation. As a simple and contactless alternative to more common thin-film-based voltammetric techniques, potentiometric titration offers a powerful new approach for quantifying the redox properties of colloidal semiconductor nanocrystals.

  6. Ocean Predictability and Uncertainty Forecasts Using Local Ensemble Transfer Kalman Filter (LETKF)

    NASA Astrophysics Data System (ADS)

    Wei, M.; Hogan, P. J.; Rowley, C. D.; Smedstad, O. M.; Wallcraft, A. J.; Penny, S. G.

    2017-12-01

    Ocean predictability and uncertainty are studied with an ensemble system that has been developed based on the US Navy's operational HYCOM using the Local Ensemble Transfer Kalman Filter (LETKF) technology. One of the advantages of this method is that the best possible initial analysis states for the HYCOM forecasts are provided by the LETKF, which assimilates operational observations using an ensemble method. The background covariance during this assimilation process is implicitly supplied by the ensemble, avoiding the difficult task of developing tangent linear and adjoint models for 4D-Var out of HYCOM with its complicated hybrid isopycnal vertical coordinate. The flow-dependent background covariance from the ensemble will be an indispensable part of the next-generation hybrid 4D-Var/ensemble data assimilation system. The predictability and uncertainty of the ocean forecasts are studied initially for the Gulf of Mexico. The results are compared with another ensemble system using the Ensemble Transfer (ET) method, which has been used in the Navy's operational center. The advantages and disadvantages are discussed.
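
    To make "background covariance supplied by the ensemble" concrete, the sketch below computes the sample covariance of a small set of state vectors. It is a toy illustration only: it omits the localization and transform steps of the LETKF, and the function name and data are assumptions.

```python
def ensemble_covariance(members):
    """Sample covariance of ensemble state vectors (list of equal-length lists)."""
    n = len(members)
    dim = len(members[0])
    mean = [sum(m[i] for m in members) / n for i in range(dim)]
    # Perturbations: each member's deviation from the ensemble mean.
    perts = [[m[i] - mean[i] for i in range(dim)] for m in members]
    return [[sum(p[i] * p[j] for p in perts) / (n - 1) for j in range(dim)]
            for i in range(dim)]

# Three members of a two-variable state (e.g. two neighbouring grid points).
members = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
cov = ensemble_covariance(members)
```

    Because the covariance is rebuilt from the current ensemble at every cycle, it follows the flow, which is what makes it flow-dependent; no tangent linear or adjoint model is required to obtain it.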

  7. Discrete Fourier Transform-Based Multivariate Image Analysis: Application to Modeling of Aromatase Inhibitory Activity.

    PubMed

    Barigye, Stephen J; Freitas, Matheus P; Ausina, Priscila; Zancan, Patricia; Sola-Penna, Mauro; Castillo-Garit, Juan A

    2018-02-12

    We recently generalized the formerly alignment-dependent multivariate image analysis applied to quantitative structure-activity relationships (MIA-QSAR) method through the application of the discrete Fourier transform (DFT), allowing for its application to noncongruent and structurally diverse chemical compound data sets. Here we report the first practical application of this method in the screening of molecular entities of therapeutic interest, with human aromatase inhibitory activity as the case study. We developed an ensemble classification model based on the two-dimensional (2D) DFT MIA-QSAR descriptors, with which we screened the NCI Diversity Set V (1593 compounds) and obtained 34 chemical compounds with possible aromatase inhibitory activity. These compounds were docked into the aromatase active site, and the 10 most promising compounds were selected for in vitro experimental validation. Of these compounds, 7419 (nonsteroidal) and 89 201 (steroidal) demonstrated satisfactory antiproliferative and aromatase inhibitory activities. The obtained results suggest that the 2D-DFT MIA-QSAR method may be useful in ligand-based virtual screening of new molecular entities of therapeutic utility.

  8. Molecular dynamics-based refinement and validation for sub-5 Å cryo-electron microscopy maps

    PubMed Central

    Singharoy, Abhishek; Teo, Ivan; McGreevy, Ryan; Stone, John E; Zhao, Jianhua; Schulten, Klaus

    2016-01-01

    Two structure determination methods, based on the molecular dynamics flexible fitting (MDFF) paradigm, are presented that resolve sub-5 Å cryo-electron microscopy (EM) maps with either single structures or ensembles of such structures. The methods, denoted cascade MDFF and resolution exchange MDFF, sequentially re-refine a search model against a series of maps of progressively higher resolutions, which ends with the original experimental resolution. Application of sequential re-refinement enables MDFF to achieve a radius of convergence of ~25 Å demonstrated with the accurate modeling of β-galactosidase and TRPV1 proteins at 3.2 Å and 3.4 Å resolution, respectively. The MDFF refinements uniquely offer map-model validation and B-factor determination criteria based on the inherent dynamics of the macromolecules studied, captured by means of local root mean square fluctuations. The MDFF tools described are available to researchers through an easy-to-use and cost-effective cloud computing resource on Amazon Web Services. DOI: http://dx.doi.org/10.7554/eLife.16105.001 PMID:27383269

  9. An adaptive Gaussian process-based iterative ensemble smoother for data assimilation

    NASA Astrophysics Data System (ADS)

    Ju, Lei; Zhang, Jiangjiang; Meng, Long; Wu, Laosheng; Zeng, Lingzao

    2018-05-01

    Accurate characterization of subsurface hydraulic conductivity is vital for modeling of subsurface flow and transport. The iterative ensemble smoother (IES) has been proposed to estimate the heterogeneous parameter field. As a Monte Carlo-based method, IES requires a relatively large ensemble size to guarantee its performance. To improve the computational efficiency, we propose an adaptive Gaussian process (GP)-based iterative ensemble smoother (GPIES) in this study. At each iteration, the GP surrogate is adaptively refined by adding a few new base points chosen from the updated parameter realizations. Then the sensitivity information between model parameters and measurements is calculated from a large number of realizations generated by the GP surrogate with virtually no computational cost. Since the original model evaluations are only required for base points, whose number is much smaller than the ensemble size, the computational cost is significantly reduced. The applicability of GPIES in estimating heterogeneous conductivity is evaluated by the saturated and unsaturated flow problems, respectively. Without sacrificing estimation accuracy, GPIES achieves about an order of magnitude of speed-up compared with the standard IES. Although subsurface flow problems are considered in this study, the proposed method can be equally applied to other hydrological models.

  10. Multi-model ensembles for assessment of flood losses and associated uncertainty

    NASA Astrophysics Data System (ADS)

    Figueiredo, Rui; Schröter, Kai; Weiss-Motz, Alexander; Martina, Mario L. V.; Kreibich, Heidi

    2018-05-01

    Flood loss modelling is a crucial part of risk assessments. However, it is subject to large uncertainty that is often neglected. Most models available in the literature are deterministic, providing only single point estimates of flood loss, and large disparities tend to exist among them. Adopting any one such model in a risk assessment context is likely to lead to inaccurate loss estimates and sub-optimal decision-making. In this paper, we propose the use of multi-model ensembles to address these issues. This approach, which has been applied successfully in other scientific fields, is based on the combination of different model outputs with the aim of improving the skill and usefulness of predictions. We first propose a model rating framework to support ensemble construction, based on a probability tree of model properties, which establishes relative degrees of belief between candidate models. Using 20 flood loss models in two test cases, we then construct numerous multi-model ensembles, based both on the rating framework and on a stochastic method, differing in terms of participating members, ensemble size and model weights. We evaluate the performance of ensemble means, as well as their probabilistic skill and reliability. Our results demonstrate that well-designed multi-model ensembles represent a pragmatic approach to consistently obtain more accurate flood loss estimates and reliable probability distributions of model uncertainty.
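
    Combining several model outputs with weights that reflect relative degrees of belief can be sketched as a weighted mean with a weighted spread. The loss values and weights below are made up for illustration, and the function name is an assumption; the paper's rating framework for deriving the weights is not reproduced here.

```python
def weighted_ensemble(estimates, weights):
    """Weighted ensemble mean and weighted standard deviation of model estimates."""
    total = sum(weights)
    w = [x / total for x in weights]  # normalise weights so they sum to 1
    mean = sum(wi * e for wi, e in zip(w, estimates))
    variance = sum(wi * (e - mean) ** 2 for wi, e in zip(w, estimates))
    return mean, variance ** 0.5

# Three hypothetical loss estimates for one building, with relative degrees of belief.
mean_loss, spread = weighted_ensemble([100.0, 200.0, 300.0], [1.0, 2.0, 1.0])
```

    The spread gives a first indication of model uncertainty that a single deterministic estimate cannot provide.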

  11. Structural Ensemble of CD4 Cytoplasmic Tail (402-419) Reveals a Nearly Flat Free-Energy Landscape with Local α-Helical Order in Aqueous Solution.

    PubMed

    Ahalawat, Navjeet; Arora, Simran; Murarka, Rajesh K

    2015-08-27

    The human cluster determinant 4 (CD4), expressed primarily on the surface of T helper cells, serves as a coreceptor in T-cell receptor recognition of MHC II antigen complexes. Besides its cellular functions, CD4 serves as a primary receptor of human immunodeficiency virus (HIV) type 1. The cytoplasmic tail of CD4 (residues 402-419) is known to be involved in direct interaction with the HIV-1 proteins Vpu and Nef. These two viral accessory proteins (Vpu and Nef) downregulate CD4 in HIV-1 infected cells by multiple strategies and make the body susceptible to all forms of infections. In this work, we carried out extensive replica exchange molecular dynamics simulations in explicit water with three popular protein force fields, Amber ff99SB, Amber ff99SB*-ILDN, and CHARMM36, to characterize the equilibrium conformational ensemble of CD4-tail (402-419) and further validated the simulated ensembles with known NMR data. We found that ff99SB*-ILDN gives a better description of the structural ensemble of this peptide compared with ff99SB and CHARMM36. The peptide adopts multiple distinct conformations with varying degrees of residual secondary structure. In particular, we observed 28, 7, and 5% average α-helical, β-strand, and 3₁₀-helix content, respectively, for ff99SB*-ILDN. The peptide chain shows a tendency toward cooperative helix formation, seeding at residues 407-410 and subsequently extending toward both ends of the chain. Furthermore, we constructed a Markov state model (MSM) from large-scale molecular dynamics simulations to study the dynamics of transitions between different metastable states explored by this peptide. The mean first passage times computed from the MSM indicate rapid interconversion of these states, and the time scales of transitions range from several nanoseconds to hundreds of microseconds.
Our results show good agreement with experimental data and could help to understand the key molecular mechanisms of T-cell activation and HIV-mediated receptor interference.

  12. Glyph-based analysis of multimodal directional distributions in vector field ensembles

    NASA Astrophysics Data System (ADS)

    Jarema, Mihaela; Demir, Ismail; Kehrer, Johannes; Westermann, Rüdiger

    2015-04-01

    Ensemble simulations are increasingly often performed in the geosciences in order to study the uncertainty and variability of model predictions. Describing ensemble data by mean and standard deviation can be misleading in case of multimodal distributions. We present first results of a glyph-based visualization of multimodal directional distributions in 2D and 3D vector ensemble data. Directional information on the circle/sphere is modeled using mixtures of probability density functions (pdfs), which enables us to characterize the distributions with relatively few parameters. The resulting mixture models are represented by 2D and 3D lobular glyphs showing direction, spread and strength of each principal mode of the distributions. A 3D extension of our approach is realized by means of an efficient GPU rendering technique. We demonstrate our method in the context of ensemble weather simulations.
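
    The remark that a mean can mislead for multimodal directional data can be checked with basic circular statistics. The sketch below computes the mean direction and the mean resultant length of a set of angles; the sample angles are invented for illustration, and the mixture-model fitting used in the paper is not reproduced.

```python
import math

def mean_resultant(angles_deg):
    """Circular mean direction (degrees) and mean resultant length (0..1) of a set of angles."""
    n = len(angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg) / n
    s = sum(math.sin(math.radians(a)) for a in angles_deg) / n
    direction = math.degrees(math.atan2(s, c)) % 360.0
    return direction, math.hypot(c, s)

# Unimodal ensemble: members agree on a direction close to 90 degrees.
uni_dir, uni_len = mean_resultant([80.0, 90.0, 100.0])

# Bimodal ensemble: half the members point to 0 degrees, half to 180 degrees.
bi_dir, bi_len = mean_resultant([0.0, 0.0, 180.0, 180.0])
```

    For the bimodal case the resultant length collapses to roughly zero, so the reported mean direction carries no information even though each mode is perfectly sharp. This is the situation in which describing each principal mode separately becomes useful.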

  13. Ensembl comparative genomics resources.

    PubMed

    Herrero, Javier; Muffato, Matthieu; Beal, Kathryn; Fitzgerald, Stephen; Gordon, Leo; Pignatelli, Miguel; Vilella, Albert J; Searle, Stephen M J; Amode, Ridwan; Brent, Simon; Spooner, William; Kulesha, Eugene; Yates, Andrew; Flicek, Paul

    2016-01-01

    Evolution provides the unifying framework with which to understand biology. The coherent investigation of genic and genomic data often requires comparative genomics analyses based on whole-genome alignments, sets of homologous genes and other relevant datasets in order to evaluate and answer evolutionary-related questions. However, the complexity and computational requirements of producing such data are substantial: this has led to only a small number of reference resources that are used for most comparative analyses. The Ensembl comparative genomics resources are one such reference set that facilitates comprehensive and reproducible analysis of chordate genome data. Ensembl computes pairwise and multiple whole-genome alignments from which large-scale synteny, per-base conservation scores and constrained elements are obtained. Gene alignments are used to define Ensembl Protein Families, GeneTrees and homologies for both protein-coding and non-coding RNA genes. These resources are updated frequently and have a consistent informatics infrastructure and data presentation across all supported species. Specialized web-based visualizations are also available including synteny displays, collapsible gene tree plots, a gene family locator and different alignment views. The Ensembl comparative genomics infrastructure is extensively reused for the analysis of non-vertebrate species by other projects including Ensembl Genomes and Gramene and much of the information here is relevant to these projects. The consistency of the annotation across species and the focus on vertebrates makes Ensembl an ideal system to perform and support vertebrate comparative genomic analyses. We use robust software and pipelines to produce reference comparative data and make it freely available. Database URL: http://www.ensembl.org. © The Author(s) 2016. Published by Oxford University Press.

  14. Ensembl comparative genomics resources

    PubMed Central

    Muffato, Matthieu; Beal, Kathryn; Fitzgerald, Stephen; Gordon, Leo; Pignatelli, Miguel; Vilella, Albert J.; Searle, Stephen M. J.; Amode, Ridwan; Brent, Simon; Spooner, William; Kulesha, Eugene; Yates, Andrew; Flicek, Paul

    2016-01-01

    Evolution provides the unifying framework with which to understand biology. The coherent investigation of genic and genomic data often requires comparative genomics analyses based on whole-genome alignments, sets of homologous genes and other relevant datasets in order to evaluate and answer evolutionary-related questions. However, the complexity and computational requirements of producing such data are substantial: this has led to only a small number of reference resources that are used for most comparative analyses. The Ensembl comparative genomics resources are one such reference set that facilitates comprehensive and reproducible analysis of chordate genome data. Ensembl computes pairwise and multiple whole-genome alignments from which large-scale synteny, per-base conservation scores and constrained elements are obtained. Gene alignments are used to define Ensembl Protein Families, GeneTrees and homologies for both protein-coding and non-coding RNA genes. These resources are updated frequently and have a consistent informatics infrastructure and data presentation across all supported species. Specialized web-based visualizations are also available including synteny displays, collapsible gene tree plots, a gene family locator and different alignment views. The Ensembl comparative genomics infrastructure is extensively reused for the analysis of non-vertebrate species by other projects including Ensembl Genomes and Gramene and much of the information here is relevant to these projects. The consistency of the annotation across species and the focus on vertebrates makes Ensembl an ideal system to perform and support vertebrate comparative genomic analyses. We use robust software and pipelines to produce reference comparative data and make it freely available. Database URL: http://www.ensembl.org. PMID:26896847

  15. Large unbalanced credit scoring using Lasso-logistic regression ensemble.

    PubMed

    Wang, Hong; Xu, Qingsong; Zhou, Lifeng

    2015-01-01

    Recently, various ensemble learning methods with different base classifiers have been proposed for credit scoring problems. However, for various reasons, there has been little research using logistic regression as the base classifier. In this paper, given large unbalanced data, we consider the plausibility of ensemble learning using regularized logistic regression as the base classifier to deal with credit scoring problems. In this research, the data is first balanced and diversified by clustering and bagging algorithms. Then we apply a Lasso-logistic regression learning ensemble to evaluate the credit risks. We show that the proposed algorithm outperforms popular credit scoring models such as decision tree, Lasso-logistic regression and random forests in terms of AUC and F-measure. We also provide two importance measures for the proposed model to identify important variables in the data.
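    The balance-then-bag pipeline described above can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: random undersampling of the majority class stands in for their clustering step, the L1-penalized logistic regression is fitted by plain proximal gradient descent, and all function names, hyperparameters and the synthetic data are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(w, t):
    # Proximal operator of the L1 (Lasso) penalty.
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0


def fit_lasso_logistic(X, y, lam=0.01, lr=0.1, epochs=300):
    # L1-regularized logistic regression via proximal gradient descent.
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                gw[j] += err * xi[j]
            gb += err
        w = [soft_threshold(wj - lr * gj / n, lr * lam) for wj, gj in zip(w, gw)]
        b -= lr * gb / n  # the intercept is not penalized
    return w, b


def predict_proba(model, x):
    w, b = model
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def fit_balanced_bagging(X, y, n_models=10, seed=0, **fit_kwargs):
    # Each bag: bootstrap of the minority class plus an equally sized
    # random subsample of the majority class (assumed balancing scheme).
    rng = random.Random(seed)
    pos = [i for i, yi in enumerate(y) if yi == 1]
    neg = [i for i, yi in enumerate(y) if yi == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    models = []
    for _ in range(n_models):
        idx = [rng.choice(minority) for _ in minority]
        idx += rng.sample(majority, len(minority))
        models.append(
            fit_lasso_logistic([X[i] for i in idx], [y[i] for i in idx], **fit_kwargs)
        )
    return models


def ensemble_proba(models, x):
    # Average the predicted probabilities of all base classifiers.
    return sum(predict_proba(m, x) for m in models) / len(models)


# Synthetic unbalanced data: 200 negatives, 20 positives.
# Feature 0 is informative, feature 1 is pure noise.
rng = random.Random(1)
X = [[rng.gauss(-1.0, 0.5), rng.gauss(0.0, 1.0)] for _ in range(200)]
y = [0] * 200
X += [[rng.gauss(1.0, 0.5), rng.gauss(0.0, 1.0)] for _ in range(20)]
y += [1] * 20
models = fit_balanced_bagging(X, y, n_models=10, seed=0)
```

    Averaging probabilities across bags is one common way to combine the base classifiers; the abstract does not state which combination rule the authors used.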

  16. Thermodynamic-ensemble independence of solvation free energy.

    PubMed

    Chong, Song-Ho; Ham, Sihyun

    2015-02-10

    Solvation free energy is the fundamental thermodynamic quantity in solution chemistry. Recently, it has been suggested that the partial molar volume correction is necessary to convert the solvation free energy determined in different thermodynamic ensembles. Here, we demonstrate ensemble-independence of the solvation free energy on general thermodynamic grounds. Theoretical estimates of the solvation free energy based on the canonical or grand-canonical ensemble are pertinent to experiments carried out under constant pressure without any conversion.

  17. A novel hybrid decomposition-and-ensemble model based on CEEMD and GWO for short-term PM2.5 concentration forecasting

    NASA Astrophysics Data System (ADS)

    Niu, Mingfei; Wang, Yufang; Sun, Shaolong; Li, Yongwu

    2016-06-01

    To enhance prediction reliability and accuracy, a hybrid model based on the promising principle of "decomposition and ensemble" and a recently proposed meta-heuristic called grey wolf optimizer (GWO) is introduced for daily PM2.5 concentration forecasting. Compared with existing PM2.5 forecasting methods, this proposed model has improved the prediction accuracy and hit rates of directional prediction. The proposed model involves three main steps, i.e., decomposing the original PM2.5 series into several intrinsic mode functions (IMFs) via complementary ensemble empirical mode decomposition (CEEMD) for simplifying the complex data; individually predicting each IMF with support vector regression (SVR) optimized by GWO; integrating all predicted IMFs for the ensemble result as the final prediction by another SVR optimized by GWO. Seven benchmark models, including single artificial intelligence (AI) models, other decomposition-ensemble models with different decomposition methods and models with the same decomposition-ensemble method but optimized by different algorithms, are considered to verify the superiority of the proposed hybrid model. The empirical study indicates that the proposed hybrid decomposition-ensemble model is remarkably superior to all considered benchmark models for its higher prediction accuracy and hit rates of directional prediction.

  18. JEnsembl: a version-aware Java API to Ensembl data systems.

    PubMed

    Paterson, Trevor; Law, Andy

    2012-11-01

    The Ensembl Project provides release-specific Perl APIs for efficient high-level programmatic access to data stored in various Ensembl database schema. Although Perl scripts are perfectly suited for processing large volumes of text-based data, Perl is not ideal for developing large-scale software applications nor embedding in graphical interfaces. The provision of a novel Java API would facilitate type-safe, modular, object-orientated development of new Bioinformatics tools with which to access, analyse and visualize Ensembl data. The JEnsembl API implementation provides basic data retrieval and manipulation functionality from the Core, Compara and Variation databases for all species in Ensembl and EnsemblGenomes and is a platform for the development of a richer API to Ensembl datasources. The JEnsembl architecture uses a text-based configuration module to provide evolving, versioned mappings from database schema to code objects. A single installation of the JEnsembl API can therefore simultaneously and transparently connect to current and previous database instances (such as those in the public archive) thus facilitating better analysis repeatability and allowing 'through time' comparative analyses to be performed. Project development, released code libraries, Maven repository and documentation are hosted at SourceForge (http://jensembl.sourceforge.net).

  19. Using THz Spectroscopy, Evolutionary Network Analysis Methods, and MD Simulation to Map the Evolution of Allosteric Communication Pathways in c-Type Lysozymes.

    PubMed

    Woods, Kristina N; Pfeffer, Juergen

    2016-01-01

    It is now widely accepted that protein function is intimately tied with the navigation of energy landscapes. In this framework, a protein sequence is not described by a distinct structure but rather by an ensemble of conformations. And it is through this ensemble that evolution is able to modify a protein's function by altering its landscape. Hence, the evolution of protein functions involves selective pressures that adjust the sampling of the conformational states. In this work, we focus on elucidating the evolutionary pathway that shaped the function of individual proteins that make up the mammalian c-type lysozyme subfamily. Using both experimental and computational methods, we map out specific intermolecular interactions that direct the sampling of conformational states and accordingly, also underlie shifts in the landscape that are directly connected with the formation of novel protein functions. By contrasting three representative proteins in the family we identify molecular mechanisms that are associated with the selectivity of enhanced antimicrobial properties and consequently, divergent protein function. Namely, we link the extent of localized fluctuations involving the loop separating helices A and B with shifts in the equilibrium of the ensemble of conformational states that mediate interdomain coupling and concurrently moderate substrate binding affinity. This work reveals unique insights into the molecular level mechanisms that promote the progression of interactions that connect the immune response to infection with the nutritional properties of lactation, while also providing a deeper understanding about how evolving energy landscapes may define present-day protein function. © The Author 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  20. Discrete Molecular Dynamics Can Predict Helical Prestructured Motifs in Disordered Proteins

    PubMed Central

    Han, Kyou-Hoon; Dokholyan, Nikolay V.; Tompa, Péter; Kalmár, Lajos; Hegedűs, Tamás

    2014-01-01

    Intrinsically disordered proteins (IDPs) lack a stable tertiary structure, but their short binding regions termed Pre-Structured Motifs (PreSMo) can form transient secondary structure elements in solution. Although disordered proteins are crucial in many biological processes and designing strategies to modulate their function is highly important, both experimental and computational tools to describe their conformational ensembles and the initial steps of folding are sparse. Here we report that discrete molecular dynamics (DMD) simulations combined with the replica exchange (RX) method efficiently sample the conformational space and detect regions populating α-helical conformational states in disordered protein regions. While the available computational methods predict secondary structural propensities in IDPs based on the observation of protein-protein interactions, our ab initio method rests on physical principles of protein folding and dynamics. We show that RX-DMD predicts α-PreSMos with high confidence confirmed by comparison to experimental NMR data. Moreover, the method can also dissect α-PreSMos in close vicinity to each other and indicate helix stability. Importantly, simulations with disordered regions forming helices in X-ray structures of complexes indicate that a preformed helix is frequently the binding element itself, while in other cases it may have a role in initiating the binding process. Our results indicate that RX-DMD provides a breakthrough in the structural and dynamical characterization of disordered proteins by generating the structural ensembles of IDPs even when experimental data are not available. PMID:24763499

  1. Comparison of MM/GBSA calculations based on explicit and implicit solvent simulations.

    PubMed

    Godschalk, Frithjof; Genheden, Samuel; Söderhjelm, Pär; Ryde, Ulf

    2013-05-28

    Molecular mechanics with generalised Born and surface area solvation (MM/GBSA) is a popular method to calculate the free energy of the binding of ligands to proteins. It involves molecular dynamics (MD) simulations with an explicit solvent of the protein-ligand complex to give a set of snapshots for which energies are calculated with an implicit solvent. This change in the solvation method (explicit → implicit) would strictly require that the energies are reweighted with the implicit-solvent energies, which is normally not done. In this paper we calculate MM/GBSA energies with two generalised Born models for snapshots generated by the same methods or by explicit-solvent simulations for five synthetic N-acetyllactosamine derivatives binding to galectin-3. We show that the resulting energies are very different both in absolute and relative terms, showing that the change in the solvent model is far from innocent and that standard MM/GBSA is not a consistent method. The ensembles generated with the various solvent models are quite different with root-mean-square deviations of 1.2-1.4 Å. The ensembles can be converted to each other by performing short MD simulations with the new method, but the convergence is slow, showing mean absolute differences in the calculated energies of 6-7 kJ mol⁻¹ after 2 ps simulations. Minimisations show even slower convergence and there are strong indications that the energies obtained from minimised structures are different from those obtained by MD.
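    The reweighting step mentioned above (snapshots sampled with one energy function, averages wanted for another) is commonly written as a Boltzmann-weighted average. The sketch below shows that standard estimator in general form; it is not taken from the paper, and the function name, argument names and the default kT (about 2.494 kJ/mol, corresponding to 300 K) are assumptions chosen for illustration.

```python
import math


def reweighted_mean(values, u_sampled, u_target, kT=2.494):
    """Estimate the target-ensemble average of `values`.

    values    : observable evaluated on each snapshot
    u_sampled : energy of each snapshot under the sampling model
    u_target  : energy of each snapshot under the target model
    kT        : thermal energy in the same units as the energies
                (default: roughly 300 K in kJ/mol, an assumed setting)
    """
    du = [ut - us for ut, us in zip(u_target, u_sampled)]
    shift = min(du)  # subtract the minimum to avoid overflow in exp
    weights = [math.exp(-(d - shift) / kT) for d in du]
    total = sum(weights)
    return sum(w * a for w, a in zip(weights, values)) / total
```

    When the two energy functions differ only by a constant on every snapshot, all weights are equal and the estimator reduces to the plain average. When a few snapshots dominate the weights, the estimate becomes statistically unreliable.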

  2. Reparameterization of All-Atom Dipalmitoylphosphatidylcholine Lipid Parameters Enables Simulation of Fluid Bilayers at Zero Tension

    PubMed Central

    Sonne, Jacob; Jensen, Morten Ø.; Hansen, Flemming Y.; Hemmingsen, Lars; Peters, Günther H.

    2007-01-01

    Molecular dynamics simulations of dipalmitoylphosphatidylcholine (DPPC) lipid bilayers using the CHARMM27 force field in the tensionless isothermal-isobaric (NPT) ensemble give highly ordered, gel-like bilayers with an area per lipid of ∼48 Å². To obtain fluid (Lα) phase properties of DPPC bilayers represented by the CHARMM energy function in this ensemble, we reparameterized the atomic partial charges in the lipid headgroup and upper parts of the acyl chains. The new charges were determined from the electron structure using both the Mulliken method and the restricted electrostatic potential fitting method. We tested the derived charges in molecular dynamics simulations of a fully hydrated DPPC bilayer. Only the simulation with the new restricted electrostatic potential charges shows significant improvements compared with simulations using the original CHARMM27 force field resulting in an area per lipid of 60.4 ± 0.1 Å². Compared to the 48 Å², the new value of 60.4 Å² is in fair agreement with the experimental value of 64 Å². In addition, the simulated order parameter profile and electron density profile are in satisfactory agreement with experimental data. Thus, the biologically more interesting fluid phase of DPPC bilayers can now be simulated in all-atom simulations in the NPT ensemble by employing our modified CHARMM27 force field. PMID:17400696

  3. Converging free energies of binding in cucurbit[7]uril and octa-acid host-guest systems from SAMPL4 using expanded ensemble simulations

    NASA Astrophysics Data System (ADS)

    Monroe, Jacob I.; Shirts, Michael R.

    2014-04-01

    Molecular containers such as cucurbit[7]uril (CB7) and the octa-acid (OA) host are ideal simplified model test systems for optimizing and analyzing methods for computing free energies of binding intended for use with biologically relevant protein-ligand complexes. To this end, we have performed initially blind free energy calculations to determine the free energies of binding for ligands of both the CB7 and OA hosts. A subset of the selected guest molecules were those included in the SAMPL4 prediction challenge. Using expanded ensemble simulations in the dimension of coupling host-guest intermolecular interactions, we are able to show that our estimates in most cases can be demonstrated to fully converge and that the errors in our estimates are due almost entirely to the assigned force field parameters and the choice of environmental conditions used to model experiment. We confirm the convergence through the use of alternative simulation methodologies and thermodynamic pathways, analyzing sampled conformations, and directly observing changes of the free energy with respect to simulation time. Our results demonstrate the benefits of enhanced sampling of multiple local free energy minima made possible by the use of expanded ensemble molecular dynamics and may indicate the presence of significant problems with current transferable force fields for organic molecules when used for calculating binding affinities, especially in non-protein chemistries.

  4. Converging free energies of binding in cucurbit[7]uril and octa-acid host-guest systems from SAMPL4 using expanded ensemble simulations.

    PubMed

    Monroe, Jacob I; Shirts, Michael R

    2014-04-01

    Molecular containers such as cucurbit[7]uril (CB7) and the octa-acid (OA) host are ideal simplified model test systems for optimizing and analyzing methods for computing free energies of binding intended for use with biologically relevant protein-ligand complexes. To this end, we have performed initially blind free energy calculations to determine the free energies of binding for ligands of both the CB7 and OA hosts. A subset of the selected guest molecules were those included in the SAMPL4 prediction challenge. Using expanded ensemble simulations in the dimension of coupling host-guest intermolecular interactions, we are able to show that our estimates in most cases can be demonstrated to fully converge and that the errors in our estimates are due almost entirely to the assigned force field parameters and the choice of environmental conditions used to model experiment. We confirm the convergence through the use of alternative simulation methodologies and thermodynamic pathways, analyzing sampled conformations, and directly observing changes of the free energy with respect to simulation time. Our results demonstrate the benefits of enhanced sampling of multiple local free energy minima made possible by the use of expanded ensemble molecular dynamics and may indicate the presence of significant problems with current transferable force fields for organic molecules when used for calculating binding affinities, especially in non-protein chemistries.

  5. Fluorescence quenching by TEMPO: a sub-30 Å single-molecule ruler.

    PubMed

    Zhu, Peizhi; Clamme, Jean-Pierre; Deniz, Ashok A

    2005-11-01

    A series of DNA molecules labeled with 5-carboxytetramethylrhodamine (5-TAMRA) and the small nitroxide radical TEMPO were synthesized and tested to investigate whether the intramolecular quenching efficiency can be used to measure short intramolecular distances in small ensemble and single-molecule experiments. In combination with distance calculations using molecular mechanics modeling, the experimental results from steady-state ensemble fluorescence and fluorescence correlation spectroscopy measurements both show an exponential decrease in the quenching rate constant with the dye-quencher distance in the 10-30 Å range. The results demonstrate that TEMPO-5-TAMRA fluorescence quenching is a promising method to measure short distance changes within single biomolecules.

  6. Navigating ligand protein binding free energy landscapes: universality and diversity of protein folding and molecular recognition mechanisms

    NASA Astrophysics Data System (ADS)

    Verkhivker, Gennady M.; Rejto, Paul A.; Bouzida, Djamal; Arthurs, Sandra; Colson, Anthony B.; Freer, Stephan T.; Gehlhaar, Daniel K.; Larson, Veda; Luty, Brock A.; Marrone, Tami; Rose, Peter W.

    2001-03-01

    Thermodynamic and kinetic aspects of ligand-protein binding are studied for the methotrexate-dihydrofolate reductase system from the binding free energy profile constructed as a function of the order parameter. Thermodynamic stability of the native complex and a cooperative transition to the unique native structure suggest the nucleation kinetic mechanism at the equilibrium transition temperature. Structural properties of the transition state ensemble and the ensemble of nucleation conformations are determined by kinetic simulations of the transmission coefficient and ligand-protein association pathways. Structural analysis of the transition states and the nucleation conformations reconciles different views on the nucleation mechanism in protein folding.

  7. Competitive Learning Neural Network Ensemble Weighted by Predicted Performance

    ERIC Educational Resources Information Center

    Ye, Qiang

    2010-01-01

    Ensemble approaches have been shown to enhance classification by combining the outputs from a set of voting classifiers. Diversity in error patterns among base classifiers promotes ensemble performance. Multi-task learning is an important characteristic for Neural Network classifiers. Introducing a secondary output unit that receives different…

  8. Encompassing receptor flexibility in virtual screening using ensemble docking-based hybrid QSAR: discovery of novel phytochemicals for BACE1 inhibition.

    PubMed

    Chakraborty, Sandipan; Ramachandran, Balaji; Basu, Soumalee

    2014-10-01

    Mimicking receptor flexibility during receptor-ligand binding is a challenging task in computational drug design since it is associated with a large increase in the conformational search space. In the present study, we have devised an in silico design strategy incorporating receptor flexibility in virtual screening to identify potential lead compounds as inhibitors for flexible proteins. We have considered BACE1 (β-secretase), a key target protease from a therapeutic perspective for Alzheimer's disease, as the highly flexible receptor. The protein undergoes significant conformational transitions from open to closed form upon ligand binding, which makes it a difficult target for inhibitor design. We have designed a hybrid structure-activity model containing both ligand-based descriptors and energetic descriptors obtained from molecular docking based on a dataset of structurally diverse BACE1 inhibitors. An ensemble of receptor conformations has been used in the docking study, further improving the prediction ability of the model. The designed model, which shows significant prediction ability judged by several statistical parameters, has been used to screen an in-house-developed 3-D structural library of 731 phytochemicals. 24 highly potent, novel BACE1 inhibitors with predicted activity (Ki) ≤ 50 nM have been identified. Detailed analysis reveals pharmacophoric features of these novel inhibitors required to inhibit BACE1.

  9. Ensembles of satellite aerosol retrievals based on three AATSR algorithms within aerosol_cci

    NASA Astrophysics Data System (ADS)

    Kosmale, Miriam; Popp, Thomas

    2016-04-01

    Ensemble techniques are widely used in the modelling community, combining different modelling results in order to reduce uncertainties. This approach could also be adapted to satellite measurements. Aerosol_cci is an ESA-funded project in which most of the European aerosol retrieval groups work together. The different algorithms are homogenized as far as it makes sense, but remain essentially different. Datasets are compared with ground-based measurements and with each other. Three AATSR algorithms (Swansea University aerosol retrieval, ADV aerosol retrieval by FMI and Oxford aerosol retrieval ORAC) provide 17-year global aerosol records within this project. Each of these algorithms also provides uncertainty information at pixel level. Within the presented work, an ensemble of the three AATSR algorithms is constructed. The advantage over each single algorithm is the higher spatial coverage due to more measurement pixels per gridbox. Validation against ground-based AERONET measurements still shows a good correlation for the ensemble, compared to the single algorithms. Annual mean maps show the global aerosol distribution, based on a combination of the three aerosol algorithms. In addition, pixel-level uncertainties of each algorithm are used for weighting the contributions, in order to reduce the uncertainty of the ensemble. Results of different versions of the ensembles for aerosol optical depth will be presented and discussed. The results are validated against ground-based AERONET measurements. A higher spatial coverage on a daily basis allows better results in annual mean maps. The benefit of using pixel-level uncertainties is analysed.
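    One way to picture the uncertainty-weighted combination described above is inverse-variance weighting of the per-algorithm retrievals in a pixel or gridbox. The abstract does not specify the weighting scheme, so the formula, the assumption of independent errors, and every name in this sketch are illustrative assumptions rather than the project's method.

```python
def weighted_ensemble(values, sigmas):
    """Combine per-algorithm retrievals for one pixel.

    values : retrieved quantity per algorithm (None where no retrieval exists)
    sigmas : 1-sigma uncertainty per algorithm (None where missing)
    Returns (ensemble_value, ensemble_sigma), or (None, None) if no
    algorithm delivered a usable retrieval.
    """
    pairs = [
        (v, s)
        for v, s in zip(values, sigmas)
        if v is not None and s is not None and s > 0
    ]
    if not pairs:
        return None, None
    # Assumed scheme: weight each retrieval by 1 / sigma^2.
    weights = [1.0 / (s * s) for _, s in pairs]
    total = sum(weights)
    mean = sum(w * v for w, (v, _) in zip(weights, pairs)) / total
    # Combined uncertainty, valid only if the errors are independent.
    return mean, (1.0 / total) ** 0.5
```

    A pixel is covered as soon as any one algorithm reports a value, which mirrors the coverage gain the abstract attributes to the ensemble. Errors of algorithms that share the same instrument are unlikely to be fully independent, so the combined uncertainty here should be read as optimistic.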

  10. KinImmerse: Macromolecular VR for NMR ensembles

    PubMed Central

    Block, Jeremy N; Zielinski, David J; Chen, Vincent B; Davis, Ian W; Vinson, E Claire; Brady, Rachael; Richardson, Jane S; Richardson, David C

    2009-01-01

    Background In molecular applications, virtual reality (VR) and immersive virtual environments have generally been used and valued for the visual and interactive experience – to enhance intuition and communicate excitement – rather than as part of the actual research process. In contrast, this work develops a software infrastructure for research use and illustrates such use on a specific case. Methods The Syzygy open-source toolkit for VR software was used to write the KinImmerse program, which translates the molecular capabilities of the kinemage graphics format into software for display and manipulation in the DiVE (Duke immersive Virtual Environment) or other VR system. KinImmerse is supported by the flexible display construction and editing features in the KiNG kinemage viewer and it implements new forms of user interaction in the DiVE. Results In addition to molecular visualizations and navigation, KinImmerse provides a set of research tools for manipulation, identification, co-centering of multiple models, free-form 3D annotation, and output of results. The molecular research test case analyzes the local neighborhood around an individual atom within an ensemble of nuclear magnetic resonance (NMR) models, enabling immersive visual comparison of the local conformation with the local NMR experimental data, including target curves for residual dipolar couplings (RDCs). Conclusion The promise of KinImmerse for production-level molecular research in the DiVE is shown by the locally co-centered RDC visualization developed there, which gave new insights now being pursued in wider data analysis. PMID:19222844

  11. Highly selective and sensitive macrocycle-based dinuclear foldamer for fluorometric and colorimetric sensing of citrate in water.

    PubMed

    Rhaman, Md Mhahabubur; Hasan, Mohammad H; Alamgir, Azmain; Xu, Lihua; Powell, Douglas R; Wong, Bryan M; Tandon, Ritesh; Hossain, Md Alamgir

    2018-01-10

    The selective detection of citrate anions is essential for various biological functions in living systems. A quantitative assessment of citrate is required for the diagnosis of various diseases in the human body; however, it is extremely challenging to develop efficient fluorescence and color-detecting molecular probes for sensing citrate in water. Herein, we report a macrocycle-based dinuclear foldamer (1) assembled with eosin Y (EY) that has been studied for anion binding by fluorescence and colorimetric techniques in water at neutral pH. Results from the fluorescence titrations reveal that the 1·EY ensemble strongly binds citrate anions, showing remarkable selectivity over a wide range of inorganic and carboxylate anions. The addition of citrate anions to the 1·EY adduct led to a large fluorescence enhancement, displaying a detectable color change under both visible and UV light in water up to 2 μmol. The biocompatibility of 1·EY as an intracellular carrier in a biological system was evaluated on primary human foreskin fibroblast (HF) cells, showing an excellent cell viability. The strong binding properties of the ensemble allow it to be used as a highly sensitive, detective probe for biologically relevant citrate anions in various applications.

  12. Classifier ensemble construction with rotation forest to improve medical diagnosis performance of machine learning algorithms.

    PubMed

    Ozcift, Akin; Gulten, Arif

    2011-12-01

    Improving the accuracy of machine learning algorithms is vital in designing high-performance computer-aided diagnosis (CADx) systems. Research has shown that the performance of a base classifier might be enhanced by ensemble classification strategies. In this study, we construct rotation forest (RF) ensemble classifiers of 30 machine learning algorithms to evaluate their classification performances using Parkinson's, diabetes and heart disease datasets from the literature. In the experiments, first the feature dimension of the three datasets is reduced using the correlation-based feature selection (CFS) algorithm. Second, classification performances of 30 machine learning algorithms are calculated for the three datasets. Third, 30 classifier ensembles are constructed based on the RF algorithm to assess performances of the respective classifiers with the same disease data. All the experiments are carried out with a leave-one-out validation strategy and the performances of the 60 algorithms are evaluated using three metrics: classification accuracy (ACC), kappa error (KE) and area under the receiver operating characteristic (ROC) curve (AUC). Base classifiers achieved average accuracies of 72.15%, 77.52% and 84.43% for the diabetes, heart and Parkinson's datasets, respectively. The RF classifier ensembles produced average accuracies of 74.47%, 80.49% and 87.13% for the respective diseases. RF, a newly proposed classifier ensemble algorithm, might be used to improve the accuracy of miscellaneous machine learning algorithms to design advanced CADx systems. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.
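    The evaluation protocol above (leave-one-out validation scored by accuracy and a kappa metric) can be sketched as follows. The sketch assumes that the kappa metric corresponds to Cohen's kappa statistic, which the abstract does not confirm, and it uses a one-dimensional nearest-neighbour rule purely as a hypothetical stand-in for any base classifier. Rotation forest itself is not implemented here.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    # Fraction of predictions that match the true labels.
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    # Agreement corrected for the agreement expected by chance.
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        return 0.0  # kappa is undefined here; 0.0 is returned by convention
    return (p_o - p_e) / (1.0 - p_e)


def leave_one_out(X, y, fit_predict):
    # Hold out each sample once, train on the rest, predict the held-out one.
    preds = []
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        preds.append(fit_predict(train_X, train_y, X[i]))
    return preds


def nearest_neighbor(train_X, train_y, x):
    # Hypothetical base classifier: label of the closest training point (1-D).
    j = min(range(len(train_X)), key=lambda k: abs(train_X[k] - x))
    return train_y[j]
```

    For example, `leave_one_out([0.0, 0.1, 1.0, 1.1], [0, 0, 1, 1], nearest_neighbor)` yields one prediction per sample, which can then be passed to `accuracy` and `cohen_kappa` together with the true labels.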

  13. Optimal Superpositioning of Flexible Molecule Ensembles

    PubMed Central

    Gapsys, Vytautas; de Groot, Bert L.

    2013-01-01

    Analysis of the internal dynamics of a biological molecule requires the successful removal of overall translation and rotation. Particularly for flexible or intrinsically disordered peptides, this is a challenging task due to the absence of a well-defined reference structure that could be used for superpositioning. In this work, we started the analysis with a widely known formulation of an objective for the problem of superimposing a set of multiple molecules as variance minimization over an ensemble. A negative effect of this superpositioning method is the introduction of ambiguous rotations, where different rotation matrices may be applied to structurally similar molecules. We developed two algorithms to resolve the suboptimal rotations. The first approach minimizes the variance together with the distance of a structure to a preceding molecule in the ensemble. The second algorithm seeks for minimal variance together with the distance to the nearest neighbors of each structure. The newly developed methods were applied to molecular-dynamics trajectories and normal-mode ensembles of the Aβ peptide, RS peptide, and lysozyme. These new (to our knowledge) superpositioning methods combine the benefits of variance and distance between nearest-neighbor(s) minimization, providing a solution for the analysis of intrinsic motions of flexible molecules and resolving ambiguous rotations. PMID:23332072
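    As background for the superpositioning problem described above, the sketch below shows the basic pairwise operation that such methods build on: removing overall translation and rotation by a least-squares fit of one structure onto a reference. It is deliberately reduced to two dimensions, where the optimal rotation angle has a closed form, and it does not implement the ensemble variance minimization or the two algorithms from the paper. Function names are hypothetical.

```python
import math


def center(points):
    # Remove overall translation by moving the centroid to the origin.
    n = len(points)
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    return [(x - cx, y - cy) for x, y in points]


def superpose_2d(mobile, reference):
    """Least-squares fit of `mobile` onto `reference` (2-D points, same order).

    Returns the fitted, centered mobile coordinates and the RMSD to the
    centered reference.
    """
    m, r = center(mobile), center(reference)
    # The angle minimizing the summed squared distances maximizes
    # sum_i r_i . R(theta) m_i, which gives theta = atan2(s, c).
    s = sum(mx * ry - my * rx for (mx, my), (rx, ry) in zip(m, r))
    c = sum(mx * rx + my * ry for (mx, my), (rx, ry) in zip(m, r))
    theta = math.atan2(s, c)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    fitted = [(cos_t * x - sin_t * y, sin_t * x + cos_t * y) for x, y in m]
    sq = sum((fx - rx) ** 2 + (fy - ry) ** 2 for (fx, fy), (rx, ry) in zip(fitted, r))
    return fitted, math.sqrt(sq / len(r))
```

    For a flexible molecule the choice of reference strongly affects the fitted orientation, which is the difficulty the abstract raises; in three dimensions the rotation is usually obtained from a singular value decomposition or a quaternion method instead of a single angle.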

  14. Are Charge-State Distributions a Reliable Tool Describing Molecular Ensembles of Intrinsically Disordered Proteins by Native MS?

    NASA Astrophysics Data System (ADS)

    Natalello, Antonino; Santambrogio, Carlo; Grandori, Rita

    2017-01-01

    Native mass spectrometry (MS) has become a central tool of structural proteomics, but its applicability to the peculiar class of intrinsically disordered proteins (IDPs) is still an object of debate. IDPs lack an ordered tridimensional structure and are characterized by high conformational plasticity. Since they represent valuable targets for cancer and neurodegeneration research, there is an urgent need for methodological advances for description of the conformational ensembles populated by these proteins in solution. However, structural rearrangements during electrospray-ionization (ESI) or after the transfer to the gas phase could affect data obtained by native ESI-MS. In particular, charge-state distributions (CSDs) are affected by protein conformation inside ESI droplets, while ion mobility (IM) reflects protein conformation in the gas phase. This review focuses on the available evidence relating IDP solution ensembles with CSDs, trying to summarize cases of apparent consistency or discrepancy. The protein-specificity of ionization patterns and their responses to ligands and buffer conditions suggests that CSDs are imprinted to protein structural features also in the case of IDPs. Nevertheless, it seems that these proteins are more easily affected by electrospray conditions, leading in some cases to rearrangements of the conformational ensembles.

  15. Are Charge-State Distributions a Reliable Tool Describing Molecular Ensembles of Intrinsically Disordered Proteins by Native MS?

    PubMed

    Natalello, Antonino; Santambrogio, Carlo; Grandori, Rita

    2017-01-01

    Native mass spectrometry (MS) has become a central tool of structural proteomics, but its applicability to the peculiar class of intrinsically disordered proteins (IDPs) is still object of debate. IDPs lack an ordered tridimensional structure and are characterized by high conformational plasticity. Since they represent valuable targets for cancer and neurodegeneration research, there is an urgent need of methodological advances for description of the conformational ensembles populated by these proteins in solution. However, structural rearrangements during electrospray-ionization (ESI) or after the transfer to the gas phase could affect data obtained by native ESI-MS. In particular, charge-state distributions (CSDs) are affected by protein conformation inside ESI droplets, while ion mobility (IM) reflects protein conformation in the gas phase. This review focuses on the available evidence relating IDP solution ensembles with CSDs, trying to summarize cases of apparent consistency or discrepancy. The protein-specificity of ionization patterns and their responses to ligands and buffer conditions suggests that CSDs are imprinted to protein structural features also in the case of IDPs. Nevertheless, it seems that these proteins are more easily affected by electrospray conditions, leading in some cases to rearrangements of the conformational ensembles.

  16. Nonlinear intrinsic variables and state reconstruction in multiscale simulations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dsilva, Carmeline J., E-mail: cdsilva@princeton.edu; Talmon, Ronen, E-mail: ronen.talmon@yale.edu; Coifman, Ronald R., E-mail: coifman@math.yale.edu

    2013-11-14

    Finding informative low-dimensional descriptions of high-dimensional simulation data (like the ones arising in molecular dynamics or kinetic Monte Carlo simulations of physical and chemical processes) is crucial to understanding physical phenomena, and can also dramatically assist in accelerating the simulations themselves. In this paper, we discuss and illustrate the use of nonlinear intrinsic variables (NIV) in the mining of high-dimensional multiscale simulation data. In particular, we focus on the way NIV allows us to functionally merge different simulation ensembles, and different partial observations of these ensembles, as well as to infer variables not explicitly measured. The approach relies on certain simple features of the underlying process variability to filter out measurement noise and systematically recover a unique reference coordinate frame. We illustrate the approach through two distinct sets of atomistic simulations: a stochastic simulation of an enzyme reaction network exhibiting both fast and slow time scales, and a molecular dynamics simulation of alanine dipeptide in explicit water.

  17. Nonlinear intrinsic variables and state reconstruction in multiscale simulations

    NASA Astrophysics Data System (ADS)

    Dsilva, Carmeline J.; Talmon, Ronen; Rabin, Neta; Coifman, Ronald R.; Kevrekidis, Ioannis G.

    2013-11-01

    Finding informative low-dimensional descriptions of high-dimensional simulation data (like the ones arising in molecular dynamics or kinetic Monte Carlo simulations of physical and chemical processes) is crucial to understanding physical phenomena, and can also dramatically assist in accelerating the simulations themselves. In this paper, we discuss and illustrate the use of nonlinear intrinsic variables (NIV) in the mining of high-dimensional multiscale simulation data. In particular, we focus on the way NIV allows us to functionally merge different simulation ensembles, and different partial observations of these ensembles, as well as to infer variables not explicitly measured. The approach relies on certain simple features of the underlying process variability to filter out measurement noise and systematically recover a unique reference coordinate frame. We illustrate the approach through two distinct sets of atomistic simulations: a stochastic simulation of an enzyme reaction network exhibiting both fast and slow time scales, and a molecular dynamics simulation of alanine dipeptide in explicit water.

  18. Time-course, negative-stain electron microscopy–based analysis for investigating protein–protein interactions at the single-molecule level

    PubMed Central

    Nogal, Bartek; Bowman, Charles A.; Ward, Andrew B.

    2017-01-01

    Several biophysical approaches are available to study protein–protein interactions. Most approaches are conducted in bulk solution, and are therefore limited to an average measurement of the ensemble of molecular interactions. Here, we show how single-particle EM can enrich our understanding of protein–protein interactions at the single-molecule level and potentially capture states that are unobservable with ensemble methods because they are below the limit of detection or not conducted on an appropriate time scale. Using the HIV-1 envelope glycoprotein (Env) and its interaction with receptor CD4-binding site neutralizing antibodies as a model system, we both corroborate ensemble kinetics-derived parameters and demonstrate how time-course EM can further dissect stoichiometric states of complexes that are not readily observable with other methods. Visualization of the kinetics and stoichiometry of Env–antibody complexes demonstrated the applicability of our approach to qualitatively and semi-quantitatively differentiate two highly similar neutralizing antibodies. Furthermore, implementation of machine-learning techniques for sorting class averages of these complexes into discrete subclasses of particles helped reduce human bias. Our data provide proof of concept that single-particle EM can be used to generate a “visual” kinetic profile that should be amenable to studying many other protein–protein interactions, is relatively simple and complementary to well-established biophysical approaches. Moreover, our method provides critical insights into broadly neutralizing antibody recognition of Env, which may inform vaccine immunogen design and immunotherapeutic development. PMID:28972148

  19. Ensemble-based evaluation for protein structure models.

    PubMed

    Jamroz, Michal; Kolinski, Andrzej; Kihara, Daisuke

    2016-06-15

    Comparing protein tertiary structures is a fundamental procedure in structural biology and protein bioinformatics. Structure comparison is important particularly for evaluating computational protein structure models. Most of the model structure evaluation methods perform rigid body superimposition of a structure model to its crystal structure and measure the difference of the corresponding residue or atom positions between them. However, these methods neglect intrinsic flexibility of proteins by treating the native structure as a rigid molecule. Because different parts of proteins have different levels of flexibility, for example, exposed loop regions are usually more flexible than the core region of a protein structure, disagreement of a model to the native needs to be evaluated differently depending on the flexibility of residues in a protein. We propose a score named FlexScore for comparing protein structures that consider flexibility of each residue in the native state of proteins. Flexibility information may be extracted from experiments such as NMR or molecular dynamics simulation. FlexScore considers an ensemble of conformations of a protein described as a multivariate Gaussian distribution of atomic displacements and compares a query computational model with the ensemble. We compare FlexScore with other commonly used structure similarity scores over various examples. FlexScore agrees with experts' intuitive assessment of computational models and provides information of practical usefulness of models. Availability and implementation: https://bitbucket.org/mjamroz/flexscore Contact: dkihara@purdue.edu Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  20. Ensemble-based evaluation for protein structure models

    PubMed Central

    Jamroz, Michal; Kolinski, Andrzej; Kihara, Daisuke

    2016-01-01

    Motivation: Comparing protein tertiary structures is a fundamental procedure in structural biology and protein bioinformatics. Structure comparison is important particularly for evaluating computational protein structure models. Most of the model structure evaluation methods perform rigid body superimposition of a structure model to its crystal structure and measure the difference of the corresponding residue or atom positions between them. However, these methods neglect intrinsic flexibility of proteins by treating the native structure as a rigid molecule. Because different parts of proteins have different levels of flexibility, for example, exposed loop regions are usually more flexible than the core region of a protein structure, disagreement of a model to the native needs to be evaluated differently depending on the flexibility of residues in a protein. Results: We propose a score named FlexScore for comparing protein structures that consider flexibility of each residue in the native state of proteins. Flexibility information may be extracted from experiments such as NMR or molecular dynamics simulation. FlexScore considers an ensemble of conformations of a protein described as a multivariate Gaussian distribution of atomic displacements and compares a query computational model with the ensemble. We compare FlexScore with other commonly used structure similarity scores over various examples. FlexScore agrees with experts’ intuitive assessment of computational models and provides information of practical usefulness of models. Availability and implementation: https://bitbucket.org/mjamroz/flexscore Contact: dkihara@purdue.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27307633

  1. Multiple-instance ensemble learning for hyperspectral images

    NASA Astrophysics Data System (ADS)

    Ergul, Ugur; Bilgin, Gokhan

    2017-10-01

    An ensemble framework for multiple-instance (MI) learning (MIL) is introduced for use in hyperspectral images (HSIs), inspired by the bagging (bootstrap aggregation) method in ensemble learning. Ensemble-based bagging is performed by a small percentage of training samples, and MI bags are formed by a local windowing process with variable window sizes on selected instances. In addition to bootstrap aggregation, random subspace is another method used to diversify base classifiers. The proposed method is implemented using four MIL classification algorithms. The classifier model learning phase is carried out with MI bags, and the estimation phase is performed over single-test instances. In the experimental part of the study, two different HSIs that have ground-truth information are used, and comparative results are demonstrated with state-of-the-art classification methods. In general, the MI ensemble approach produces more compact results in terms of both diversity and error compared to equipollent non-MIL algorithms.
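
    The two diversification steps named in the abstract above, bagging over a small fraction of the training samples and random-subspace selection of features, can be sketched as follows. This is only an illustration under stated assumptions: the function names, the default fractions, and the plurality vote are hypothetical and not taken from the paper, and the MI bag construction by local windowing is not covered.

```python
import random
from collections import Counter


def draw_bag_plans(n_samples, n_features, n_estimators,
                   sample_fraction=0.1, n_sub_features=2, seed=0):
    """Return one (row_indices, feature_indices) pair per base classifier.

    All names and defaults here are hypothetical illustration values.
    """
    rng = random.Random(seed)
    bag_size = max(1, round(sample_fraction * n_samples))
    plans = []
    for _ in range(n_estimators):
        # Bagging: draw training rows with replacement.
        rows = [rng.randrange(n_samples) for _ in range(bag_size)]
        # Random subspace: draw feature columns without replacement.
        cols = sorted(rng.sample(range(n_features), n_sub_features))
        plans.append((rows, cols))
    return plans


def majority_vote(labels):
    """Combine base-classifier predictions by plurality (an assumed rule)."""
    return Counter(labels).most_common(1)[0][0]


plans = draw_bag_plans(n_samples=200, n_features=10, n_estimators=5)
```

    Each base classifier would then be trained only on the rows and feature columns listed in its plan, so that the ensemble members differ in both the samples and the features they see.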

  2. Quantum Ensemble Classification: A Sampling-Based Learning Control Approach.

    PubMed

    Chen, Chunlin; Dong, Daoyi; Qi, Bo; Petersen, Ian R; Rabitz, Herschel

    2017-06-01

    Quantum ensemble classification (QEC) has significant applications in discrimination of atoms (or molecules), separation of isotopes, and quantum information extraction. However, quantum mechanics forbids deterministic discrimination among nonorthogonal states. The classification of inhomogeneous quantum ensembles is very challenging, since there exist variations in the parameters characterizing the members within different classes. In this paper, we recast QEC as a supervised quantum learning problem. A systematic classification methodology is presented by using a sampling-based learning control (SLC) approach for quantum discrimination. The classification task is accomplished via simultaneously steering members belonging to different classes to their corresponding target states (e.g., mutually orthogonal states). First, a new discrimination method is proposed for two similar quantum systems. Then, an SLC method is presented for QEC. Numerical results demonstrate the effectiveness of the proposed approach for the binary classification of two-level quantum ensembles and the multiclass classification of multilevel quantum ensembles.

  3. Characterizing RNA ensembles from NMR data with kinematic models

    PubMed Central

    Fonseca, Rasmus; Pachov, Dimitar V.; Bernauer, Julie; van den Bedem, Henry

    2014-01-01

    Functional mechanisms of biomolecules often manifest themselves precisely in transient conformational substates. Researchers have long sought to structurally characterize dynamic processes in non-coding RNA, combining experimental data with computer algorithms. However, adequate exploration of conformational space for these highly dynamic molecules, starting from static crystal structures, remains challenging. Here, we report a new conformational sampling procedure, KGSrna, which can efficiently probe the native ensemble of RNA molecules in solution. We found that KGSrna ensembles accurately represent the conformational landscapes of 3D RNA encoded by NMR proton chemical shifts. KGSrna resolves motionally averaged NMR data into structural contributions; when coupled with residual dipolar coupling data, a KGSrna ensemble revealed a previously uncharacterized transient excited state of the HIV-1 trans-activation response element stem–loop. Ensemble-based interpretations of averaged data can aid in formulating and testing dynamic, motion-based hypotheses of functional mechanisms in RNAs with broad implications for RNA engineering and therapeutic intervention. PMID:25114056

  4. A target recognition method for maritime surveillance radars based on hybrid ensemble selection

    NASA Astrophysics Data System (ADS)

    Fan, Xueman; Hu, Shengliang; He, Jingbo

    2017-11-01

    In order to improve the generalisation ability of the maritime surveillance radar, a novel ensemble selection technique, termed Optimisation and Dynamic Selection (ODS), is proposed. During the optimisation phase, the non-dominated sorting genetic algorithm II for multi-objective optimisation is used to find the Pareto front, i.e. a set of ensembles of classifiers representing different tradeoffs between the classification error and diversity. During the dynamic selection phase, the meta-learning method is used to predict whether a candidate ensemble is competent enough to classify a query instance based on three different aspects, namely, feature space, decision space and the extent of consensus. The classification performance and time complexity of ODS are compared against nine other ensemble methods using a self-built full polarimetric high resolution range profile data-set. The experimental results clearly show the effectiveness of ODS. In addition, the influence of the selection of diversity measures is studied concurrently.

  5. Hybrid Data Assimilation without Ensemble Filtering

    NASA Technical Reports Server (NTRS)

    Todling, Ricardo; Akkraoui, Amal El

    2014-01-01

    The Global Modeling and Assimilation Office is preparing to upgrade its three-dimensional variational system to a hybrid approach in which the ensemble is generated using a square-root ensemble Kalman filter (EnKF) and the variational problem is solved using the Grid-point Statistical Interpolation system. As in most EnKF applications, we found it necessary to employ a combination of multiplicative and additive inflations to compensate for sampling and modeling errors, respectively; to maintain the small-member ensemble solution close to the variational solution, we also found it necessary to re-center the members of the ensemble about the variational analysis. During tuning of the filter we have found re-centering and additive inflation to play a considerably larger role than expected, particularly in a dual-resolution context when the variational analysis is run at higher resolution than the ensemble. This led us to consider a hybrid strategy in which the members of the ensemble are generated by simply converting the variational analysis to the resolution of the ensemble and applying additive inflation, thus bypassing the EnKF. Comparisons of this so-called filter-free hybrid procedure with an EnKF-based hybrid procedure and a control non-hybrid, traditional scheme show both hybrid strategies to provide equally significant improvement over the control; more interestingly, the filter-free procedure was found to give qualitatively similar results to the EnKF-based procedure.
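
    The three ensemble operations mentioned in the abstract above (multiplicative inflation, additive inflation, and re-centering about an analysis) can be sketched on plain state vectors. This is a toy illustration, not the operational scheme: the function names are hypothetical, and drawing additive perturbations as independent Gaussian noise with standard deviation `sigma` is an assumption made here for simplicity.

```python
import random


def ensemble_mean(members):
    """Component-wise mean over a list of equally long state vectors."""
    n = len(members)
    return [sum(column) / n for column in zip(*members)]


def inflate_multiplicative(members, factor):
    """Scale each member's deviation from the ensemble mean by `factor`."""
    mean = ensemble_mean(members)
    return [[m + factor * (x - m) for x, m in zip(member, mean)]
            for member in members]


def inflate_additive(members, sigma, rng):
    """Add a random perturbation to every component (Gaussian is an assumption)."""
    return [[x + rng.gauss(0.0, sigma) for x in member] for member in members]


def recenter(members, center):
    """Shift the ensemble so that its mean equals `center`."""
    mean = ensemble_mean(members)
    return [[x - m + c for x, m, c in zip(member, mean, center)]
            for member in members]


# Filter-free idea in miniature: copy the analysis, then perturb the copies.
rng = random.Random(0)
analysis = [1.0, 2.0, 3.0]  # hypothetical variational analysis
members = inflate_additive([list(analysis) for _ in range(8)], 0.5, rng)
recentered = recenter(members, analysis)
inflated = inflate_multiplicative(recentered, 1.5)
```

    Re-centering and multiplicative inflation both leave the ensemble mean at the analysis; only the spread around it changes.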

  6. Mass Conservation and Positivity Preservation with Ensemble-type Kalman Filter Algorithms

    NASA Technical Reports Server (NTRS)

    Janjic, Tijana; McLaughlin, Dennis B.; Cohn, Stephen E.; Verlaan, Martin

    2013-01-01

    Maintaining conservative physical laws numerically has long been recognized as being important in the development of numerical weather prediction (NWP) models. In the broader context of data assimilation, concerted efforts to maintain conservation laws numerically and to understand the significance of doing so have begun only recently. In order to enforce physically based conservation laws of total mass and positivity in the ensemble Kalman filter, we incorporate constraints to ensure that the filter ensemble members and the ensemble mean conserve mass and remain nonnegative through measurement updates. We show that the analysis steps of the ensemble transform Kalman filter (ETKF) algorithm and the ensemble Kalman filter (EnKF) algorithm can conserve the mass integral, but do not preserve positivity. Further, if localization is applied or if negative values are simply set to zero, then the total mass is not conserved either. In order to ensure mass conservation, a projection matrix that corrects for localization effects is constructed. In order to maintain both mass conservation and positivity preservation through the analysis step, we construct a data assimilation algorithm based on quadratic programming and ensemble Kalman filtering. Mass and positivity are both preserved by formulating the filter update as a set of quadratic programming problems that incorporate constraints. Some simple numerical experiments indicate that this approach can have a significant positive impact on the posterior ensemble distribution, giving results that are more physically plausible both for individual ensemble members and for the ensemble mean. The results show clear improvements in both analyses and forecasts, particularly in the presence of localized features. Behavior of the algorithm is also tested in the presence of model error.
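
    The point that zeroing negative values breaks mass conservation, while a constrained quadratic program can preserve both properties, can be illustrated with the simplest such program: the Euclidean projection of an updated state onto the set of nonnegative vectors with a prescribed total mass. This is a deliberately simplified sketch, not the authors' algorithm (which constrains the filter update itself); the function name and the sample state are hypothetical.

```python
def project_nonneg_fixed_mass(y, mass):
    """Minimize ||x - y||^2 subject to x_i >= 0 and sum(x) == mass.

    Uses the standard sort-based projection onto a scaled simplex.
    """
    if mass <= 0:
        raise ValueError("mass must be positive")
    ordered = sorted(y, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for j, value in enumerate(ordered, start=1):
        cumulative += value
        candidate = (cumulative - mass) / j
        if value - candidate > 0:
            theta = candidate  # the last j that passes gives the shift
    return [max(v - theta, 0.0) for v in y]


# Hypothetical updated state with unphysical negative entries.
updated = [0.5, -0.2, 0.9, -0.1, 0.4]
mass = sum(updated)

clipped = [max(v, 0.0) for v in updated]            # positivity only
projected = project_nonneg_fixed_mass(updated, mass)  # positivity and mass
```

    Here `clipped` is nonnegative but its sum exceeds the original mass, whereas `projected` is nonnegative and sums to `mass`.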

  7. Simulation of wave packet tunneling of interacting identical particles

    NASA Astrophysics Data System (ADS)

    Lozovik, Yu. E.; Filinov, A. V.; Arkhipov, A. S.

    2003-02-01

    We demonstrate a different method of simulation of nonstationary quantum processes, considering the tunneling of two interacting identical particles, represented by wave packets. The method of quantum molecular dynamics (WMD) used here is based on the Wigner representation of quantum mechanics. In the context of this method, ensembles of classical trajectories are used to solve the quantum Wigner-Liouville equation. These classical trajectories obey Hamiltonian-like equations, where the effective potential consists of the usual classical term and the quantum term, which depends on the Wigner function and its derivatives. The quantum term is calculated using the local distribution of trajectories in phase space; therefore, classical trajectories are not independent, contrary to classical molecular dynamics. The developed WMD method takes into account the influence of exchange and interaction between particles. The role of direct and exchange interactions in tunneling is analyzed. The tunneling times for interacting particles are calculated.

  8. Linking well-tempered metadynamics simulations with experiments.

    PubMed

    Barducci, Alessandro; Bonomi, Massimiliano; Parrinello, Michele

    2010-05-19

    Linking experiments with the atomistic resolution provided by molecular dynamics simulations can shed light on the structure and dynamics of protein-disordered states. The sampling limitations of classical molecular dynamics can be overcome using metadynamics, which is based on the introduction of a history-dependent bias on a small number of suitably chosen collective variables. Even if such bias distorts the probability distribution of the other degrees of freedom, the equilibrium Boltzmann distribution can be reconstructed using a recently developed reweighting algorithm. Quantitative comparison with experimental data is thus possible. Here we show the potential of this combined approach by characterizing the conformational ensemble explored by a 13-residue helix-forming peptide by means of a well-tempered metadynamics/parallel tempering approach and comparing the reconstructed nuclear magnetic resonance scalar couplings with experimental data. Copyright (c) 2010 Biophysical Society. Published by Elsevier Inc. All rights reserved.

  9. Mechanochemical models of processive molecular motors

    NASA Astrophysics Data System (ADS)

    Lan, Ganhui; Sun, Sean X.

    2012-05-01

    Motor proteins are the molecular engines powering the living cell. These nanometre-sized molecules convert chemical energy, both enthalpic and entropic, into useful mechanical work. High resolution single molecule experiments can now observe motor protein movement with increasing precision. The emerging data must be combined with structural and kinetic measurements to develop a quantitative mechanism. This article describes a modelling framework where quantitative understanding of motor behaviour can be developed based on the protein structure. The framework is applied to myosin motors, with emphasis on how synchrony between motor domains gives rise to processive unidirectional movement. The modelling approach shows that the elasticity of protein domains is important in regulating motor function. Simple models of protein domain elasticity are presented. The framework can be generalized to other motor systems, or an ensemble of motors such as muscle contraction. Indeed, for hundreds of myosins, our framework can be reduced to the Huxley-Simmons description of muscle movement in the mean-field limit.

  10. Thermal density functional theory, ensemble density functional theory, and potential functional theory for warm dense matter

    NASA Astrophysics Data System (ADS)

    Pribram-Jones, Aurora

    Warm dense matter (WDM) is a high energy phase between solids and plasmas, with characteristics of both. It is present in the centers of giant planets, within the earth's core, and on the path to ignition of inertial confinement fusion. The high temperatures and pressures of warm dense matter lead to complications in its simulation, as both classical and quantum effects must be included. One of the most successful simulation methods is density functional theory-molecular dynamics (DFT-MD). Despite great success in a diverse array of applications, DFT-MD remains computationally expensive and it neglects the explicit temperature dependence of electron-electron interactions known to exist within exact DFT. Finite-temperature density functional theory (FT DFT) is an extension of the wildly successful ground-state DFT formalism via thermal ensembles, broadening its quantum mechanical treatment of electrons to include systems at non-zero temperatures. Exact mathematical conditions have been used to predict the behavior of approximations in limiting conditions and to connect FT DFT to the ground-state theory. An introduction to FT DFT is given within the context of ensemble DFT and the larger field of DFT is discussed for context. Ensemble DFT is used to describe ensembles of ground-state and excited systems. Exact conditions in ensemble DFT and the performance of approximations depend on ensemble weights. Using an inversion method, exact Kohn-Sham ensemble potentials are found and compared to approximations. The symmetry eigenstate Hartree-exchange approximation is in good agreement with exact calculations because of its inclusion of an ensemble derivative discontinuity. Since ensemble weights in FT DFT are temperature-dependent Fermi weights, this insight may help develop approximations well-suited to both ground-state and FT DFT. 
    A novel, highly efficient approach to free energy calculations, finite-temperature potential functional theory, is derived, which has the potential to transform the simulation of warm dense matter. As a semiclassical method, it connects the normally disparate regimes of cold condensed matter physics and hot plasma physics. This orbital-free approach captures the smooth classical density envelope and quantum density oscillations that are both crucial to accurate modeling of materials where temperature and pressure effects are influential.

  11. Fluorescent Binary Ensemble Based on Pyrene Derivative and Sodium Dodecyl Sulfate Assemblies as a Chemical Tongue for Discriminating Metal Ions and Brand Water.

    PubMed

    Zhang, Lijun; Huang, Xinyan; Cao, Yuan; Xin, Yunhong; Ding, Liping

    2017-12-22

    Enormous effort has been put into the detection and recognition of various heavy metal ions due to their involvement in serious environmental pollution and many major diseases. The present work has developed a single fluorescent sensor ensemble that can distinguish and identify a variety of heavy metal ions. A pyrene-based fluorophore (PB) containing a metal ion receptor group was specially designed and synthesized. Anionic surfactant sodium dodecyl sulfate (SDS) assemblies can effectively adjust its fluorescence behavior. The selected binary ensemble based on PB/SDS assemblies can exhibit multiple emission bands and provide wavelength-based cross-reactive responses to a series of metal ions to realize pattern recognition ability. The combination of surfactant assembly modulation and the receptor for metal ions empowers the present sensor ensemble with strong discrimination power, which could well differentiate 13 metal ions, including Cu²⁺, Co²⁺, Ni²⁺, Cr³⁺, Hg²⁺, Fe³⁺, Zn²⁺, Cd²⁺, Al³⁺, Pb²⁺, Ca²⁺, Mg²⁺, and Ba²⁺. Moreover, this single sensing ensemble could be further applied for identifying different brands of drinking water.

  12. Small-angle neutron scattering study of a monoclonal antibody using free-energy constraints.

    PubMed

    Clark, Nicholas J; Zhang, Hailiang; Krueger, Susan; Lee, Hyo Jin; Ketchem, Randal R; Kerwin, Bruce; Kanapuram, Sekhar R; Treuheit, Michael J; McAuley, Arnold; Curtis, Joseph E

    2013-11-14

    Monoclonal antibodies (mAbs) contain hinge-like regions that enable structural flexibility of globular domains that have a direct effect on biological function. A subclass of mAbs, IgG2, have several interchain disulfide bonds in the hinge region that could potentially limit structural flexibility of the globular domains and affect the overall configuration space available to the mAb. We have characterized human IgG2 mAb in solution via small-angle neutron scattering (SANS) and interpreted the scattering data using atomistic models. Molecular Monte Carlo combined with molecular dynamics simulations of a model mAb indicate that a wide range of structural configurations are plausible, spanning radius of gyration values from ∼39 to ∼55 Å. Structural ensembles and representative single structure solutions were derived by comparison of theoretical SANS profiles of mAb models to experimental SANS data. Additionally, molecular mechanical and solvation free-energy calculations were carried out on the ensemble of best-fitting mAb structures. The results of this study indicate that low-resolution techniques like small-angle scattering combined with atomistic molecular simulations with free-energy analysis may be helpful to determine the types of intramolecular interactions that influence function and could lead to deleterious changes to mAb structure. This methodology will be useful to analyze small-angle scattering data of many macromolecular systems.

  13. Large Unbalanced Credit Scoring Using Lasso-Logistic Regression Ensemble

    PubMed Central

    Wang, Hong; Xu, Qingsong; Zhou, Lifeng

    2015-01-01

    Recently, various ensemble learning methods with different base classifiers have been proposed for credit scoring problems. However, for various reasons, there has been little research using logistic regression as the base classifier. In this paper, given large unbalanced data, we consider the plausibility of ensemble learning using regularized logistic regression as the base classifier to deal with credit scoring problems. In this research, the data is first balanced and diversified by clustering and bagging algorithms. Then we apply a Lasso-logistic regression learning ensemble to evaluate the credit risks. We show that the proposed algorithm outperforms popular credit scoring models such as decision tree, Lasso-logistic regression and random forests in terms of AUC and F-measure. We also provide two importance measures for the proposed model to identify important variables in the data. PMID:25706988

  14. Ocean state and uncertainty forecasts using HYCOM with Local Ensemble Transfer Kalman Filter (LETKF)

    NASA Astrophysics Data System (ADS)

    Wei, Mozheng; Hogan, Pat; Rowley, Clark; Smedstad, Ole-Martin; Wallcraft, Alan; Penny, Steve

    2017-04-01

    An ensemble forecast system based on the US Navy's operational HYCOM using Local Ensemble Transfer Kalman Filter (LETKF) technology has been developed for ocean state and uncertainty forecasts. One of the advantages is that the best possible initial analysis states for the HYCOM forecasts are provided by the LETKF, which assimilates the operational observations using an ensemble method. The background covariance during this assimilation process is supplied by the ensemble, which avoids the difficulty of developing tangent linear and adjoint models for 4D-VAR from the complicated hybrid isopycnal vertical coordinate in HYCOM. Another advantage is that the ensemble system provides a valuable uncertainty estimate corresponding to every state forecast from HYCOM. Uncertainty forecasts have proven to be critical for downstream users and managers to make more scientifically sound decisions in the numerical prediction community. In addition, the ensemble mean is generally more accurate and skilful than a single traditional deterministic forecast with the same resolution. We will introduce the ensemble system design and setup, present some results from a 30-member ensemble experiment, and discuss scientific, technical and computational issues and challenges, such as covariance localization, inflation, model-related uncertainties and sensitivity to the ensemble size.

  15. Optimizing inhomogeneous spin ensembles for quantum memory

    NASA Astrophysics Data System (ADS)

    Bensky, Guy; Petrosyan, David; Majer, Johannes; Schmiedmayer, Jörg; Kurizki, Gershon

    2012-07-01

    We propose a method to maximize the fidelity of quantum memory implemented by a spectrally inhomogeneous spin ensemble. The method is based on preselecting the optimal spectral portion of the ensemble by judiciously designed pulses. This leads to significant improvement of the transfer and storage of quantum information encoded in the microwave or optical field.

  16. Conservation of Mass and Preservation of Positivity with Ensemble-Type Kalman Filter Algorithms

    NASA Technical Reports Server (NTRS)

    Janjic, Tijana; Mclaughlin, Dennis; Cohn, Stephen E.; Verlaan, Martin

    2014-01-01

    This paper considers the incorporation of constraints to enforce physically based conservation laws in the ensemble Kalman filter. In particular, constraints are used to ensure that the ensemble members and the ensemble mean conserve mass and remain nonnegative through measurement updates. In certain situations filtering algorithms such as the ensemble Kalman filter (EnKF) and ensemble transform Kalman filter (ETKF) yield updated ensembles that conserve mass but are negative, even though the actual states must be nonnegative. In such situations, if negative values are set to zero or a log transform is introduced, the total mass will not be conserved. In this study, mass and positivity are both preserved by formulating the filter update as a set of quadratic programming problems that incorporate non-negativity constraints. Simple numerical experiments indicate that this approach can have a significant positive impact on the posterior ensemble distribution, giving results that are more physically plausible both for individual ensemble members and for the ensemble mean. In two examples, an update that includes a non-negativity constraint is able to properly describe the transport of a sharp feature (e.g., a triangle or cone). A number of implementation questions still need to be addressed, particularly the need to develop a computationally efficient quadratic programming update for large ensembles.
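
    As an illustration of why clipping breaks mass conservation and how a constrained update can repair it, the sketch below solves the simplest quadratic program of this kind: the Euclidean projection of a state vector onto the set of nonnegative vectors with a fixed total mass. This is an assumption made for illustration only; the abstract does not state the objective or weighting used in the paper's quadratic programs, and the function name is hypothetical.

```python
def project_nonnegative_fixed_mass(values, mass):
    """Euclidean projection onto {x : x_i >= 0, sum(x) == mass}.

    Solves min ||x - values||^2 subject to the two constraints using the
    standard sort-based method for projecting onto a scaled simplex.
    """
    if mass <= 0:
        raise ValueError("mass must be positive")
    ordered = sorted(values, reverse=True)
    cumulative = 0.0
    shift = 0.0
    for count, value in enumerate(ordered, start=1):
        cumulative += value
        candidate = (cumulative - mass) / count
        # Keep the shift from the largest prefix whose entries all stay positive.
        if value - candidate > 0:
            shift = candidate
    return [max(v - shift, 0.0) for v in values]


# Illustrative state with total mass 1.0 and one negative entry.
state = [0.6, 0.5, -0.1]
clipped = [max(v, 0.0) for v in state]                 # naive fix: mass changes
projected = project_nonnegative_fixed_mass(state, 1.0)  # constrained fix
```

    Here the clipped state sums to about 1.1, while the projected state is nonnegative and still sums to 1.0.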

  17. Assessing the impact of land use change on hydrology by ensemble modeling (LUCHEM) III: Scenario analysis

    USGS Publications Warehouse

    Huisman, J.A.; Breuer, L.; Bormann, H.; Bronstert, A.; Croke, B.F.W.; Frede, H.-G.; Graff, T.; Hubrechts, L.; Jakeman, A.J.; Kite, G.; Lanini, J.; Leavesley, G.; Lettenmaier, D.P.; Lindstrom, G.; Seibert, J.; Sivapalan, M.; Viney, N.R.; Willems, P.

    2009-01-01

    An ensemble of 10 hydrological models was applied to the same set of land use change scenarios. There was general agreement about the direction of changes in the mean annual discharge and 90% discharge percentile predicted by the ensemble members, although a considerable range in the magnitude of predictions for the scenarios and catchments under consideration was obvious. Differences in the magnitude of the increase were attributed to the different mean annual actual evapotranspiration rates for each land use type. The ensemble of model runs was further analyzed with deterministic and probabilistic ensemble methods. The deterministic ensemble method based on a trimmed mean resulted in a single somewhat more reliable scenario prediction. The probabilistic reliability ensemble averaging (REA) method allowed a quantification of the model structure uncertainty in the scenario predictions. It was concluded that the use of a model ensemble has greatly increased our confidence in the reliability of the model predictions. © 2008 Elsevier Ltd.
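
    A trimmed mean of the kind mentioned above can be sketched as follows. The trimming fraction and the ten discharge values are hypothetical; the abstract does not say how many members were discarded at each end.

```python
def trimmed_mean(values, trim_fraction=0.1):
    """Mean of the values left after dropping a fraction from each tail."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 <= trim_fraction < 0.5:
        raise ValueError("trim_fraction must be in [0, 0.5)")
    ordered = sorted(values)
    drop = int(len(ordered) * trim_fraction)  # members removed per tail
    kept = ordered[drop:len(ordered) - drop]
    return sum(kept) / len(kept)


# Hypothetical predictions from a 10-member ensemble (illustrative numbers).
members = [95.0, 98.0, 100.0, 101.0, 102.0, 103.0, 104.0, 105.0, 107.0, 150.0]
plain = sum(members) / len(members)
robust = trimmed_mean(members, trim_fraction=0.1)
```

    With these numbers the single outlying member pulls the plain mean to 106.5, while the trimmed mean stays at 102.5.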

  18. Multi-objective optimization for generating a weighted multi-model ensemble

    NASA Astrophysics Data System (ADS)

    Lee, H.

    2017-12-01

    Many studies have demonstrated that multi-model ensembles generally show better skill than each ensemble member. When generating weighted multi-model ensembles, the first step is measuring the performance of individual model simulations using observations. There is a consensus on the assignment of weighting factors based on a single evaluation metric. When considering only one evaluation metric, the weighting factor for each model is proportional to a performance score or inversely proportional to an error for the model. While this conventional approach can provide appropriate combinations of multiple models, the approach confronts a big challenge when there are multiple metrics under consideration. When considering multiple evaluation metrics, it is obvious that a simple averaging of multiple performance scores or model ranks does not address the trade-off problem between conflicting metrics. So far, there seems to be no best method to generate weighted multi-model ensembles based on multiple performance metrics. The current study applies the multi-objective optimization, a mathematical process that provides a set of optimal trade-off solutions based on a range of evaluation metrics, to combining multiple performance metrics for the global climate models and their dynamically downscaled regional climate simulations over North America and generating a weighted multi-model ensemble. NASA satellite data and the Regional Climate Model Evaluation System (RCMES) software toolkit are used for assessment of the climate simulations. Overall, the performance of each model differs markedly with strong seasonal dependence. Because of the considerable variability across the climate simulations, it is important to evaluate models systematically and make future projections by assigning optimized weighting factors to the models with relatively good performance. Our results indicate that the optimally weighted multi-model ensemble always shows better performance than an arithmetic ensemble mean and may provide reliable future projections.
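
    The conventional single-metric weighting described in this abstract, with weights inversely proportional to each model's error, can be sketched as below. It does not implement the multi-objective optimization the study proposes. The function names and the error values are hypothetical.

```python
def inverse_error_weights(errors):
    """Return weights proportional to 1/error, normalized to sum to 1."""
    if any(e <= 0 for e in errors):
        raise ValueError("errors must be strictly positive")
    inverse = [1.0 / e for e in errors]
    total = sum(inverse)
    return [v / total for v in inverse]


def weighted_ensemble_mean(predictions, weights):
    """Combine one prediction per model into a single weighted value."""
    return sum(p * w for p, w in zip(predictions, weights))


# Hypothetical RMSE values and predictions for three models.
rmse = [1.0, 2.0, 4.0]
weights = inverse_error_weights(rmse)
combined = weighted_ensemble_mean([10.0, 12.0, 16.0], weights)
```

    The model with the smallest error receives the largest weight. With several conflicting metrics there is no single error to invert, which is the gap the abstract points to.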

  19. Rethinking the Default Construction of Multimodel Climate Ensembles

    DOE PAGES

    Rauser, Florian; Gleckler, Peter; Marotzke, Jochem

    2015-07-21

    Here, we discuss the current code of practice in the climate sciences to routinely create climate model ensembles as ensembles of opportunity from the newest phase of the Coupled Model Intercomparison Project (CMIP). We give a two-step argument to rethink this process. First, the differences between generations of ensembles corresponding to different CMIP phases in key climate quantities are not large enough to warrant an automatic separation into generational ensembles for CMIP3 and CMIP5. Second, we suggest that climate model ensembles cannot continue to be mere ensembles of opportunity but should always be based on a transparent scientific decision process. If ensembles can be constrained by observation, then they should be constructed as target ensembles that are specifically tailored to a physical question. If model ensembles cannot be constrained by observation, then they should be constructed as cross-generational ensembles, including all available model data to enhance structural model diversity and to better sample the underlying uncertainties. To facilitate this, CMIP should guide the necessarily ongoing process of updating experimental protocols for the evaluation and documentation of coupled models. Finally, with an emphasis on easy access to model data and facilitating the filtering of climate model data across all CMIP generations and experiments, our community could return to the underlying idea of using model data ensembles to improve uncertainty quantification, evaluation, and cross-institutional exchange.

  20. An Ensemble System Based on Hybrid EGARCH-ANN with Different Distributional Assumptions to Predict S&P 500 Intraday Volatility

    NASA Astrophysics Data System (ADS)

    Lahmiri, S.; Boukadoum, M.

    2015-10-01

    Accurate forecasting of stock market volatility is an important issue in portfolio risk management. In this paper, an ensemble system for stock market volatility is presented. It is composed of three different models that hybridize the exponential generalized autoregressive conditional heteroscedasticity (EGARCH) process and the artificial neural network trained with the backpropagation algorithm (BPNN) to forecast stock market volatility under normal, Student's t, and generalized error distribution (GED) assumptions separately. The goal is to design an ensemble system in which each hybrid model is capable of capturing normality, excess skewness, or excess kurtosis in the data, so that the models complement one another. The performance of each EGARCH-BPNN and the ensemble system is evaluated by the closeness of the volatility forecasts to realized volatility. Based on mean absolute error and mean of squared errors, the experimental results show that the proposed ensemble model used to capture normality, skewness, and kurtosis in data is more accurate than the individual EGARCH-BPNN models in forecasting the S&P 500 intraday volatility based on one- and five-minute time-horizon data.

  1. Towards the Operational Ensemble-based Data Assimilation System for the Wave Field at the National Weather Service

    NASA Astrophysics Data System (ADS)

    Flampouris, Stylianos; Penny, Steve; Alves, Henrique

    2017-04-01

    The National Centers for Environmental Prediction (NCEP) of the National Oceanic and Atmospheric Administration (NOAA) provides the operational wave forecast for the US National Weather Service (NWS). As part of continuing efforts to improve the forecast, NCEP is developing an ensemble-based data assimilation system based on the local ensemble transform Kalman filter (LETKF), the existing operational global wave ensemble system (GWES), and satellite and in-situ observations. While the LETKF was designed for atmospheric applications (Hunt et al. 2007) and has been adapted for several ocean models (e.g. Penny 2016), this is the first time it has been applied to ocean wave assimilation. This new wave assimilation system provides a global estimate of the surface sea state and its approximate uncertainty. It achieves this by analyzing the 21-member ensemble of the significant wave height provided by GWES every 6 h. Observations from four altimeters and all the available in-situ measurements are used in this analysis. The analysis of the significant wave height is used for initializing the next forecasting cycle; the data assimilation system is currently being tested for operational use.

  2. Application of the AMPLE cluster-and-truncate approach to NMR structures for molecular replacement

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bibby, Jaclyn; Keegan, Ronan M.; Mayans, Olga

    2013-11-01

    Processing of NMR structures for molecular replacement by AMPLE works well. AMPLE is a program developed for clustering and truncating ab initio protein structure predictions into search models for molecular replacement. Here, it is shown that its core cluster-and-truncate methods also work well for processing NMR ensembles into search models. Rosetta remodelling helps to extend success to NMR structures bearing low sequence identity or high structural divergence from the target protein. Potential future routes to improved performance are considered and practical, general guidelines on using AMPLE are provided.

  3. General framework for constraints in molecular dynamics simulations

    NASA Astrophysics Data System (ADS)

    Kneller, Gerald R.

    2017-06-01

    The article presents a theoretical framework for molecular dynamics simulations of complex systems subject to any combination of holonomic and non-holonomic constraints. Using the concept of constrained inverse matrices both the particle accelerations and the associated constraint forces can be determined from given external forces and kinematical conditions. The formalism enables in particular the construction of explicit kinematical conditions which lead to the well-known Nosé-Hoover type equations of motion for the simulation of non-standard molecular dynamics ensembles. Illustrations are given for a few examples and an outline is presented for a numerical implementation of the method.

  4. Ensemble Pruning for Glaucoma Detection in an Unbalanced Data Set.

    PubMed

    Adler, Werner; Gefeller, Olaf; Gul, Asma; Horn, Folkert K; Khan, Zardad; Lausen, Berthold

    2016-12-07

    Random forests are successful classifier ensemble methods consisting of typically 100 to 1000 classification trees. Ensemble pruning techniques reduce the computational cost, especially the memory demand, of random forests by reducing the number of trees without relevant loss of performance or even with increased performance of the sub-ensemble. The application to the problem of an early detection of glaucoma, a severe eye disease with low prevalence, based on topographical measurements of the eye background faces specific challenges. We examine the performance of ensemble pruning strategies for glaucoma detection in an unbalanced data situation. The data set consists of 102 topographical features of the eye background of 254 healthy controls and 55 glaucoma patients. We compare the area under the receiver operating characteristic curve (AUC), and the Brier score on the total data set, in the majority class, and in the minority class of pruned random forest ensembles obtained with strategies based on the prediction accuracy of greedily grown sub-ensembles, the uncertainty weighted accuracy, and the similarity between single trees. To validate the findings and to examine the influence of the prevalence of glaucoma in the data set, we additionally perform a simulation study with lower prevalences of glaucoma. In glaucoma classification all three pruning strategies lead to improved AUC and smaller Brier scores on the total data set with sub-ensembles as small as 30 to 80 trees compared to the classification results obtained with the full ensemble consisting of 1000 trees. In the simulation study, we were able to show that the prevalence of glaucoma is a critical factor and lower prevalence decreases the performance of our pruning strategies. The memory demand for glaucoma classification in an unbalanced data situation based on random forests could effectively be reduced by the application of pruning strategies without loss of performance in a population with increased risk of glaucoma.
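
    The Brier score used above as an evaluation measure is the mean squared difference between predicted probabilities and binary outcomes, so smaller is better. A minimal sketch follows, with hypothetical probabilities and labels rather than data from the study.

```python
def brier_score(probabilities, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if len(probabilities) != len(outcomes) or not probabilities:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - y) ** 2 for p, y in zip(probabilities, outcomes)]
    return sum(squared) / len(squared)


# Hypothetical predicted glaucoma probabilities and true labels (1 = glaucoma).
predicted = [0.9, 0.2, 0.7]
observed = [1, 0, 1]
score = brier_score(predicted, observed)
```

    Restricting the inputs to cases from one class gives the per-class Brier scores that the abstract reports separately for the majority and minority classes.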

  5. Single Turnover at Molecular Polymerization Catalysts Reveals Spatiotemporally Resolved Reactions.

    PubMed

    Easter, Quinn T; Blum, Suzanne A

    2017-10-23

    Multiple active individual molecular ruthenium catalysts have been pinpointed within growing polynorbornene, thereby revealing information on the reaction dynamics and location that is unavailable through traditional ensemble experiments. This is the first single-turnover imaging of a molecular catalyst by fluorescence microscopy and allows detection of individual monomer reactions at an industrially important molecular ruthenium ring-opening metathesis polymerization (ROMP) catalyst under synthetically relevant conditions (e.g. unmodified industrial catalyst, ambient pressure, condensed phase, ca. 0.03 m monomer). These results further establish the key fundamentals of this imaging technique for characterizing the reactivity and location of active molecular catalysts even when they are the minor components. © 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.

  6. Coherent coupling of molecular resonators with a microcavity mode

    NASA Astrophysics Data System (ADS)

    Shalabney, A.; George, J.; Hutchison, J.; Pupillo, G.; Genet, C.; Ebbesen, T. W.

    2015-01-01

    The optical hybridization of the electronic states in strongly coupled molecule-cavity systems has revealed unique properties, such as lasing, room temperature polariton condensation and the modification of excited electronic landscapes involved in molecular isomerization. Here we show that molecular vibrational modes of the electronic ground state can also be coherently coupled with a microcavity mode at room temperature, given the low vibrational thermal occupation factors associated with molecular vibrations, and the collective coupling of a large ensemble of molecules immersed within the cavity-mode volume. This enables the enhancement of the collective Rabi-exchange rate with respect to the single-oscillator coupling strength. The possibility of inducing large shifts in the vibrational frequency of selected molecular bonds should have immediate consequences for chemistry.

  7. JEnsembl: a version-aware Java API to Ensembl data systems

    PubMed Central

    Paterson, Trevor; Law, Andy

    2012-01-01

    Motivation: The Ensembl Project provides release-specific Perl APIs for efficient high-level programmatic access to data stored in various Ensembl database schema. Although Perl scripts are perfectly suited for processing large volumes of text-based data, Perl is not ideal for developing large-scale software applications nor embedding in graphical interfaces. The provision of a novel Java API would facilitate type-safe, modular, object-orientated development of new Bioinformatics tools with which to access, analyse and visualize Ensembl data. Results: The JEnsembl API implementation provides basic data retrieval and manipulation functionality from the Core, Compara and Variation databases for all species in Ensembl and EnsemblGenomes and is a platform for the development of a richer API to Ensembl datasources. The JEnsembl architecture uses a text-based configuration module to provide evolving, versioned mappings from database schema to code objects. A single installation of the JEnsembl API can therefore simultaneously and transparently connect to current and previous database instances (such as those in the public archive) thus facilitating better analysis repeatability and allowing ‘through time’ comparative analyses to be performed. Availability: Project development, released code libraries, Maven repository and documentation are hosted at SourceForge (http://jensembl.sourceforge.net). Contact: jensembl-develop@lists.sf.net, andy.law@roslin.ed.ac.uk, trevor.paterson@roslin.ed.ac.uk PMID:22945789

  8. An Integrated Scenario Ensemble-Based Framework for Hurricane Evacuation Modeling: Part 2-Hazard Modeling.

    PubMed

    Blanton, Brian; Dresback, Kendra; Colle, Brian; Kolar, Randy; Vergara, Humberto; Hong, Yang; Leonardo, Nicholas; Davidson, Rachel; Nozick, Linda; Wachtendorf, Tricia

    2018-04-25

    Hurricane track and intensity can change rapidly in unexpected ways, thus making predictions of hurricanes and related hazards uncertain. This inherent uncertainty often translates into suboptimal decision-making outcomes, such as unnecessary evacuation. Representing this uncertainty is thus critical in evacuation planning and related activities. We describe a physics-based hazard modeling approach that (1) dynamically accounts for the physical interactions among hazard components and (2) captures hurricane evolution uncertainty using an ensemble method. This loosely coupled model system provides a framework for probabilistic water inundation and wind speed levels for a new, risk-based approach to evacuation modeling, described in a companion article in this issue. It combines the Weather Research and Forecasting (WRF) meteorological model, the Coupled Routing and Excess STorage (CREST) hydrologic model, and the ADvanced CIRCulation (ADCIRC) storm surge, tide, and wind-wave model to compute inundation levels and wind speeds for an ensemble of hurricane predictions. Perturbations to WRF's initial and boundary conditions and different model physics/parameterizations generate an ensemble of storm solutions, which are then used to drive the coupled hydrologic + hydrodynamic models. Hurricane Isabel (2003) is used as a case study to illustrate the ensemble-based approach. The inundation, river runoff, and wind hazard results are strongly dependent on the accuracy of the mesoscale meteorological simulations, which improves with decreasing lead time to hurricane landfall. The ensemble envelope brackets the observed behavior while providing "best-case" and "worst-case" scenarios for the subsequent risk-based evacuation model. © 2018 Society for Risk Analysis.

  9. Theory of retrieving orientation-resolved molecular information using time-domain rotational coherence spectroscopy

    NASA Astrophysics Data System (ADS)

    Wang, Xu; Le, Anh-Thu; Zhou, Zhaoyan; Wei, Hui; Lin, C. D.

    2017-08-01

    We provide a unified theoretical framework for recently emerging experiments that retrieve fixed-in-space molecular information through time-domain rotational coherence spectroscopy. Unlike a previous approach by Makhija et al. (V. Makhija et al., arXiv:1611.06476), our method can be applied to the retrieval of both real-valued (e.g., ionization yield) and complex-valued (e.g., induced dipole moment) molecular response information. It is also a direct retrieval method without using iterations. We also demonstrate that experimental parameters, such as the fluence of the aligning laser pulse and the rotational temperature of the molecular ensemble, can be quite accurately determined using a statistical method.

  10. Input Decimated Ensembles

    NASA Technical Reports Server (NTRS)

    Tumer, Kagan; Oza, Nikunj C.; Clancy, Daniel (Technical Monitor)

    2001-01-01

    Using an ensemble of classifiers instead of a single classifier has been shown to improve generalization performance in many pattern recognition problems. However, the extent of such improvement depends greatly on the amount of correlation among the errors of the base classifiers. Therefore, reducing those correlations while keeping the classifiers' performance levels high is an important area of research. In this article, we explore input decimation (ID), a method which selects feature subsets for their ability to discriminate among the classes and uses them to decouple the base classifiers. We provide a summary of the theoretical benefits of correlation reduction, along with results of our method on two underwater sonar data sets, three benchmarks from the Proben1/UCI repositories, and two synthetic data sets. The results indicate that input decimated ensembles (IDEs) outperform ensembles whose base classifiers use all the input features; randomly selected subsets of features; and features created using principal components analysis, on a wide range of domains.

  11. Comparing ensemble learning methods based on decision tree classifiers for protein fold recognition.

    PubMed

    Bardsiri, Mahshid Khatibi; Eftekhari, Mahdi

    2014-01-01

    In this paper, some methods for ensemble learning of protein fold recognition based on a decision tree (DT) are compared and contrasted against each other over three datasets taken from the literature. According to previously reported studies, the features of the datasets are divided into some groups. Then, for each of these groups, three ensemble classifiers, namely, random forest, rotation forest and AdaBoost.M1 are employed. Also, some fusion methods are introduced for combining the ensemble classifiers obtained in the previous step. After this step, three classifiers are produced based on the combination of classifiers of types random forest, rotation forest and AdaBoost.M1. Finally, the three different classifiers achieved are combined to make an overall classifier. Experimental results show that the overall classifier obtained by the genetic algorithm (GA) weighting fusion method, is the best one in comparison to previously applied methods in terms of classification accuracy.

  12. Mitosis detection using generic features and an ensemble of cascade adaboosts.

    PubMed

    Tek, F Boray

    2013-01-01

    Mitosis count is one of the factors that pathologists use to assess the risk of metastasis and the survival of patients affected by breast cancer. We investigate the application of a set of generic features and an ensemble of cascade adaboosts to automated mitosis detection. Calculation of the features relies minimally on object-level descriptions and thus requires minimal segmentation. The proposed work was developed and tested on International Conference on Pattern Recognition (ICPR) 2012 mitosis detection contest data. We plotted receiver operating characteristic curves of true positive versus false positive rates and calculated recall, precision, F-measure, and region overlap ratio measures. We tested our features with two different classifier configurations: 1) an ensemble of single adaboosts, 2) an ensemble of cascade adaboosts. On the ICPR 2012 mitosis detection contest evaluation, the cascade ensemble scored 54, 62.7, and 58, whereas the non-cascade version scored 68, 28.1, and 39.7 for recall, precision, and F-measure, respectively. The most frequently used features in the adaboost classifier rules were a shape-based feature, which counted granularity, and a color-based feature, which relied on Red, Green, and Blue channel statistics. The features that express the granular structure and color variations are found useful for mitosis detection. The ensemble of adaboosts performs better than the individual adaboost classifiers. Moreover, the ensemble of cascaded adaboosts was better than the ensemble of single adaboosts for mitosis detection.

  13. Molecular simulation of disjoining-pressure isotherms for free liquid, Lennard-Jones thin films

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bhatt, Divesh; Newman, John; Radke, C.J.

    2001-10-01

    We present canonical-ensemble molecular-dynamics simulations of disjoining-pressure isotherms in Lennard-Jones free liquid films. Thermodynamics demands that the disjoining pressure is determined uniquely as a function of the chemical potential purely from the phase diagram of the fluid. Our results from molecular dynamics validate this argument. The inverse-sixth-power distance term in the Lennard-Jones intermolecular potential represents van der Waals dispersion forces. Hence, we compare our results with classical Hamaker theory that is based on dispersion forces but assumes a slab geometry for the density profile and completely neglects fluid structure and entropy. We find that the Hamaker constant obtained from our simulations is about an order of magnitude larger than that from classical theory. To investigate the origin of this discrepancy, we calculate the disjoining-pressure isotherm using a density-functional theory relaxing the inherent assumptions in the Hamaker theory and imparting to the fluid an approximate structure. For disjoining pressure as a function of chemical potential, the results of density-functional theory and molecular dynamics are very close. Even for disjoining-pressure isotherms, and the subsequently calculated Hamaker constant, results of the density-functional theory are closer to the molecular-dynamics simulations by about a factor of 4 compared to Hamaker theory.

  14. Fluorescence Quenching by TEMPO: A Sub-30 Å Single-Molecule Ruler

    PubMed Central

    Zhu, Peizhi; Clamme, Jean-Pierre; Deniz, Ashok A.

    2005-01-01

    A series of DNA molecules labeled with 5-carboxytetramethylrhodamine (5-TAMRA) and the small nitroxide radical TEMPO were synthesized and tested to investigate whether the intramolecular quenching efficiency can be used to measure short intramolecular distances in small ensemble and single-molecule experiments. In combination with distance calculations using molecular mechanics modeling, the experimental results from steady-state ensemble fluorescence and fluorescence correlation spectroscopy measurements both show an exponential decrease in the quenching rate constant with the dye-quencher distance in the 10–30 Å range. The results demonstrate that TEMPO-5-TAMRA fluorescence quenching is a promising method to measure short distance changes within single biomolecules. PMID:16199509
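
    The exponential distance dependence reported above is what makes the quenching rate usable as a ruler: if the rate constant follows k(r) = k0 · exp(−β·r), a measured rate can be inverted to a distance. The sketch below assumes that functional form. The prefactor `k0` and decay constant `beta` are hypothetical placeholders, not values from the paper.

```python
import math


def quenching_rate(distance, k0, beta):
    """Rate constant for an assumed exponential decay with distance."""
    return k0 * math.exp(-beta * distance)


def distance_from_rate(rate, k0, beta):
    """Invert the exponential model to recover the dye-quencher distance."""
    if rate <= 0 or k0 <= 0 or beta <= 0:
        raise ValueError("rate, k0 and beta must be positive")
    return -math.log(rate / k0) / beta


# Hypothetical parameters: k0 in 1/s, beta in 1/angstrom, distance in angstrom.
K0, BETA = 1.0e10, 0.3
rate_at_20 = quenching_rate(20.0, K0, BETA)
recovered = distance_from_rate(rate_at_20, K0, BETA)
```

    In practice `k0` and `beta` would have to be calibrated against constructs of known length, as the labeled DNA series in the abstract suggests.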

  15. Impacts of calibration strategies and ensemble methods on ensemble flood forecasting over Lanjiang basin, Southeast China

    NASA Astrophysics Data System (ADS)

    Liu, Li; Xu, Yue-Ping

    2017-04-01

    Ensemble flood forecasting driven by numerical weather prediction products is becoming more commonly used in operational flood forecasting applications. In this study, a hydrological ensemble flood forecasting system based on the Variable Infiltration Capacity (VIC) model and quantitative precipitation forecasts from the TIGGE dataset is constructed for the Lanjiang Basin, Southeast China. The impacts of calibration strategies and ensemble methods on the performance of the system are then evaluated. The hydrological model is optimized with a parallel-programmed ɛ-NSGAII multi-objective algorithm, and two separately parameterized models are determined to simulate daily flows and peak flows, coupled through a modular approach. The results indicate that the ɛ-NSGAII algorithm permits more efficient optimization and a rational determination of parameter settings. The multimodel ensemble streamflow mean is shown to have better skill than the best single-model ensemble mean (ECMWF), and the multimodel ensembles weighted on members and skill scores outperform other multimodel ensembles. For a typical flood event, the flood can be predicted 3-4 days in advance, but the flows in the rising limb can be captured only 1-2 days ahead owing to the flash nature of the floods. With respect to peak flows selected by the Peaks Over Threshold approach, the ensemble means from either single-model or multimodel ensembles generally underestimate the peaks, as the extreme values are smoothed out by the ensemble process.

  16. Lessons from Climate Modeling on the Design and Use of Ensembles for Crop Modeling

    NASA Technical Reports Server (NTRS)

    Wallach, Daniel; Mearns, Linda O.; Ruane, Alexander C.; Roetter, Reimund P.; Asseng, Senthold

    2016-01-01

    Working with ensembles of crop models is a recent but important development in crop modeling which promises to lead to better uncertainty estimates for model projections and predictions, better predictions using the ensemble mean or median, and closer collaboration within the modeling community. There are numerous open questions about the best way to create and analyze such ensembles. Much can be learned from the field of climate modeling, given its much longer experience with ensembles. We draw on that experience to identify questions and make propositions that should help make ensemble modeling with crop models more rigorous and informative. The propositions include defining criteria for acceptance of models in a crop MME, exploring criteria for evaluating the degree of relatedness of models in a MME, studying the effect of number of models in the ensemble, development of a statistical model of model sampling, creation of a repository for MME results, studies of possible differential weighting of models in an ensemble, creation of single model ensembles based on sampling from the uncertainty distribution of parameter values or inputs specifically oriented toward uncertainty estimation, the creation of super ensembles that sample more than one source of uncertainty, the analysis of super ensemble results to obtain information on total uncertainty and the separate contributions of different sources of uncertainty and finally further investigation of the use of the multi-model mean or median as a predictor.

  17. Constructing better classifier ensemble based on weighted accuracy and diversity measure.

    PubMed

    Zeng, Xiaodong; Wong, Derek F; Chao, Lidia S

    2014-01-01

    A weighted accuracy and diversity (WAD) method is presented, a novel measure used to evaluate the quality of the classifier ensemble, assisting in the ensemble selection task. The proposed measure is motivated by a commonly accepted hypothesis; that is, a robust classifier ensemble should not only be accurate but also different from every other member. In fact, accuracy and diversity are mutual restraint factors; that is, an ensemble with high accuracy may have low diversity, and an overly diverse ensemble may negatively affect accuracy. This study proposes a method to find the balance between accuracy and diversity that enhances the predictive ability of an ensemble for unknown data. The quality assessment for an ensemble is performed such that the final score is achieved by computing the harmonic mean of accuracy and diversity, where two weight parameters are used to balance them. The measure is compared to two representative measures, Kappa-Error and GenDiv, and two threshold measures that consider only accuracy or diversity, with two heuristic search algorithms, genetic algorithm, and forward hill-climbing algorithm, in ensemble selection tasks performed on 15 UCI benchmark datasets. The empirical results demonstrate that the WAD measure is superior to others in most cases.
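
    The scoring idea in this abstract, a harmonic mean of accuracy and diversity balanced by two weights, can be sketched with a generic weighted harmonic mean. The exact WAD formula and the way diversity is measured are not given in the abstract, so the function below and its inputs are assumptions for illustration.

```python
def weighted_harmonic_mean(accuracy, diversity, w_acc=1.0, w_div=1.0):
    """Weighted harmonic mean of two scores in (0, 1].

    Returns 0.0 if either score is non-positive, since a harmonic mean
    is dominated by the weaker of its two inputs.
    """
    if w_acc <= 0 or w_div <= 0:
        raise ValueError("weights must be positive")
    if accuracy <= 0 or diversity <= 0:
        return 0.0
    return (w_acc + w_div) / (w_acc / accuracy + w_div / diversity)


# Hypothetical candidate ensembles as (accuracy, diversity) pairs.
accurate_but_uniform = weighted_harmonic_mean(0.9, 0.1)
balanced = weighted_harmonic_mean(0.8, 0.4)
```

    With equal weights the balanced candidate scores higher than the accurate but uniform one, which reflects the trade-off between accuracy and diversity that the abstract describes.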

  18. Constructing Better Classifier Ensemble Based on Weighted Accuracy and Diversity Measure

    PubMed Central

    Chao, Lidia S.

    2014-01-01

    A weighted accuracy and diversity (WAD) method is presented, a novel measure used to evaluate the quality of the classifier ensemble, assisting in the ensemble selection task. The proposed measure is motivated by a commonly accepted hypothesis; that is, a robust classifier ensemble should not only be accurate but also different from every other member. In fact, accuracy and diversity are mutual restraint factors; that is, an ensemble with high accuracy may have low diversity, and an overly diverse ensemble may negatively affect accuracy. This study proposes a method to find the balance between accuracy and diversity that enhances the predictive ability of an ensemble for unknown data. The quality assessment for an ensemble is performed such that the final score is achieved by computing the harmonic mean of accuracy and diversity, where two weight parameters are used to balance them. The measure is compared to two representative measures, Kappa-Error and GenDiv, and two threshold measures that consider only accuracy or diversity, with two heuristic search algorithms, genetic algorithm, and forward hill-climbing algorithm, in ensemble selection tasks performed on 15 UCI benchmark datasets. The empirical results demonstrate that the WAD measure is superior to others in most cases. PMID:24672402

  19. Genetic algorithm based adaptive neural network ensemble and its application in predicting carbon flux

    USGS Publications Warehouse

    Xue, Y.; Liu, S.; Hu, Y.; Yang, J.; Chen, Q.

    2007-01-01

    To improve the accuracy in prediction, Genetic Algorithm based Adaptive Neural Network Ensemble (GA-ANNE) is presented. Intersections are allowed between different training sets based on the fuzzy clustering analysis, which ensures the diversity as well as the accuracy of individual Neural Networks (NNs). Moreover, to improve the accuracy of the adaptive weights of individual NNs, GA is used to optimize the cluster centers. Empirical results in predicting carbon flux of Duke Forest reveal that GA-ANNE can predict the carbon flux more accurately than Radial Basis Function Neural Network (RBFNN), Bagging NN ensemble, and ANNE. © 2007 IEEE.

  20. Ensemble based adaptive over-sampling method for imbalanced data learning in computer aided detection of microaneurysm.

    PubMed

    Ren, Fulong; Cao, Peng; Li, Wei; Zhao, Dazhe; Zaiane, Osmar

    2017-01-01

    Diabetic retinopathy (DR) is a progressive disease, and its detection at an early stage is crucial for saving a patient's vision. An automated screening system for DR can help reduce the chances of complete blindness due to DR while lowering the workload on ophthalmologists. Among the earliest signs of DR are microaneurysms (MAs). However, current schemes for MA detection appear to report many false positives because detection algorithms have high sensitivity. Inevitably, some non-MA structures are labeled as MAs in the initial MA identification step. This is a typical "class imbalance problem". Class imbalanced data has detrimental effects on the performance of conventional classifiers. In this work, we propose an ensemble based adaptive over-sampling algorithm for overcoming the class imbalance problem in the false positive reduction, and we use Boosting, Bagging, and Random subspace as the ensemble framework to improve microaneurysm detection. The ensemble based over-sampling methods we propose combine the strengths of adaptive over-sampling and ensemble learning. The objective of combining ensemble and adaptive over-sampling is to reduce the induction biases introduced by imbalanced data and to enhance the generalization classification performance of extreme learning machines (ELM). Experimental results show that our ASOBoost method has higher area under the ROC curve (AUC) and G-mean values than many existing class imbalance learning methods. Copyright © 2016 Elsevier Ltd. All rights reserved.

  1. Comparison of projection skills of deterministic ensemble methods using pseudo-simulation data generated from multivariate Gaussian distribution

    NASA Astrophysics Data System (ADS)

    Oh, Seok-Geun; Suh, Myoung-Seok

    2017-07-01

    The projection skills of five ensemble methods were analyzed according to simulation skills, training period, and ensemble members, using 198 sets of pseudo-simulation data (PSD) produced by random number generation assuming the simulated temperature of regional climate models. The PSD sets were classified into 18 categories according to the relative magnitude of bias, variance ratio, and correlation coefficient, where each category had 11 sets (including 1 truth set) with 50 samples. The ensemble methods used were as follows: equal weighted averaging without bias correction (EWA_NBC), EWA with bias correction (EWA_WBC), weighted ensemble averaging based on root mean square errors and correlation (WEA_RAC), WEA based on the Taylor score (WEA_Tay), and multivariate linear regression (Mul_Reg). The projection skills of the ensemble methods improved generally as compared with the best member for each category. However, their projection skills are significantly affected by the simulation skills of the ensemble member. The weighted ensemble methods showed better projection skills than non-weighted methods, in particular, for the PSD categories having systematic biases and various correlation coefficients. The EWA_NBC showed considerably lower projection skills than the other methods, in particular, for the PSD categories with systematic biases. Although Mul_Reg showed relatively good skills, it showed strong sensitivity to the PSD categories, training periods, and number of members. On the other hand, the WEA_Tay and WEA_RAC showed relatively superior skills in both the accuracy and reliability for all the sensitivity experiments. This indicates that WEA_Tay and WEA_RAC are applicable even for simulation data with systematic biases, a short training period, and a small number of ensemble members.
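
    The two equal-weighted schemes named in this abstract (EWA_NBC and EWA_WBC) can be sketched as follows. This is a minimal illustration, assuming that bias correction means subtracting each member's mean error over a training period; the abstract does not spell out the exact procedure, and the function and argument names are hypothetical.

```python
from statistics import mean


def ewa_nbc(members):
    """Equal-weighted average of member series, without bias correction."""
    return [mean(values) for values in zip(*members)]


def ewa_wbc(members, train_members, train_truth):
    """Equal-weighted average after removing each member's mean training bias.

    Assumption: the bias is the mean difference between a member's
    training-period values and the truth over the same period.
    """
    corrected = []
    for series, train in zip(members, train_members):
        bias = mean(t - o for t, o in zip(train, train_truth))
        corrected.append([x - bias for x in series])
    return ewa_nbc(corrected)


# Two members with opposite systematic biases of +1 and -1.
train_truth = [0.0, 0.0]
train_members = [[1.0, 1.0], [-1.0, -1.0]]
projection = ewa_wbc([[2.0, 3.0], [0.0, 1.0]], train_members, train_truth)
```

    The weighted schemes in the abstract (WEA_RAC, WEA_Tay) would replace the plain mean with member weights derived from training-period skill; they are not reproduced here because the abstract does not give their weight definitions.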

  2. Charge fluctuations in nanoscale capacitors.

    PubMed

    Limmer, David T; Merlet, Céline; Salanne, Mathieu; Chandler, David; Madden, Paul A; van Roij, René; Rotenberg, Benjamin

    2013-09-06

    The fluctuations of the charge on an electrode contain information on the microscopic correlations within the adjacent fluid and their effect on the electronic properties of the interface. We investigate these fluctuations using molecular dynamics simulations in a constant-potential ensemble with histogram reweighting techniques. This approach offers, in particular, an efficient, accurate, and physically insightful route to the differential capacitance that is broadly applicable. We demonstrate these methods with three different capacitors: pure water between platinum electrodes and a pure as well as a solvent-based organic electrolyte each between graphite electrodes. The total charge distributions with the pure solvent and solvent-based electrolytes are remarkably Gaussian, while in the pure ionic liquid the total charge distribution displays distinct non-Gaussian features, suggesting significant potential-driven changes in the organization of the interfacial fluid.

  3. Charge Fluctuations in Nanoscale Capacitors

    NASA Astrophysics Data System (ADS)

    Limmer, David T.; Merlet, Céline; Salanne, Mathieu; Chandler, David; Madden, Paul A.; van Roij, René; Rotenberg, Benjamin

    2013-09-01

    The fluctuations of the charge on an electrode contain information on the microscopic correlations within the adjacent fluid and their effect on the electronic properties of the interface. We investigate these fluctuations using molecular dynamics simulations in a constant-potential ensemble with histogram reweighting techniques. This approach offers, in particular, an efficient, accurate, and physically insightful route to the differential capacitance that is broadly applicable. We demonstrate these methods with three different capacitors: pure water between platinum electrodes and a pure as well as a solvent-based organic electrolyte each between graphite electrodes. The total charge distributions with the pure solvent and solvent-based electrolytes are remarkably Gaussian, while in the pure ionic liquid the total charge distribution displays distinct non-Gaussian features, suggesting significant potential-driven changes in the organization of the interfacial fluid.

  4. Enhancing the Popular Music Ensemble Workshop and Maximising Student Potential through the Integration of Creativity

    ERIC Educational Resources Information Center

    Hall, Richard

    2015-01-01

    Ensemble work is a key part of any performance-based popular music course and involves students replicating existing music or playing "covers". The creative process in popular music is a collaborative one and the ensemble workshop can be utilised to facilitate active learning and develop musical creativity within a group setting. This is…

  5. Stress-stress fluctuation formula for elastic constants in the NPT ensemble

    NASA Astrophysics Data System (ADS)

    Lips, Dominik; Maass, Philipp

    2018-05-01

    Several fluctuation formulas are available for calculating elastic constants from equilibrium correlation functions in computer simulations, but the ones available for simulations at constant pressure exhibit slow convergence properties and cannot be used for the determination of local elastic constants. To overcome these drawbacks, we derive a stress-stress fluctuation formula in the NPT ensemble based on known expressions in the NVT ensemble. We validate the formula in the NPT ensemble by calculating elastic constants for the simple nearest-neighbor Lennard-Jones crystal and by comparing the results with those obtained in the NVT ensemble. For both local and bulk elastic constants we find an excellent agreement between the simulated data in the two ensembles. To demonstrate the usefulness of the formula, we apply it to determine the elastic constants of a simulated lipid bilayer.

  6. Characterization of the glass transition of water predicted by molecular dynamics simulations using nonpolarizable intermolecular potentials.

    PubMed

    Kreck, Cara A; Mancera, Ricardo L

    2014-02-20

    Molecular dynamics simulations allow detailed study of the experimentally inaccessible liquid state of supercooled water below its homogeneous nucleation temperature and the characterization of the glass transition. Simple, nonpolarizable intermolecular potentials are commonly used in classical molecular dynamics simulations of water and aqueous systems due to their lower computational cost and their ability to reproduce a wide range of properties. Because the quality of these predictions varies between the potentials, the predicted glass transition of water is likely to be influenced by the choice of potential. We have thus conducted an extensive comparative investigation of various three-, four-, five-, and six-point water potentials in both the NPT and NVT ensembles. The T(g) predicted from NPT simulations is strongly correlated with the temperature of minimum density, whereas the maximum in the heat capacity plot corresponds to the minimum in the thermal expansion coefficient. In the NVT ensemble, these points are instead related to the maximum in the internal pressure and the minimum of its derivative, respectively. A detailed analysis of the hydrogen-bonding properties at the glass transition reveals that the extent of hydrogen-bonds lost upon the melting of the glassy state is related to the height of the heat capacity peak and varies between water potentials.

  7. Hybrid Quantum Mechanics/Molecular Mechanics Solvation Scheme for Computing Free Energies of Reactions at Metal-Water Interfaces.

    PubMed

    Faheem, Muhammad; Heyden, Andreas

    2014-08-12

    We report the development of a quantum mechanics/molecular mechanics free energy perturbation (QM/MM-FEP) method for modeling chemical reactions at metal-water interfaces. This novel solvation scheme combines planewave density functional theory (DFT), periodic electrostatic embedded cluster method (PEECM) calculations using Gaussian-type orbitals, and classical molecular dynamics (MD) simulations to obtain a free energy description of a complex metal-water system. We derive a potential of mean force (PMF) of the reaction system within the QM/MM framework. A fixed-size, finite ensemble of MM conformations is used to permit precise evaluation of the PMF of QM coordinates and its gradient defined within this ensemble. Local conformations of adsorbed reaction moieties are optimized using sequential MD-sampling and QM-optimization steps. An approximate reaction coordinate is constructed using a number of interpolated states and the free energy difference between adjacent states is calculated using the QM/MM-FEP method. By avoiding on-the-fly QM calculations and by circumventing the challenges associated with statistical averaging during MD sampling, a computational speedup of multiple orders of magnitude is realized. The method is systematically validated against the results of ab initio QM calculations and demonstrated for C-C cleavage in double-dehydrogenated ethylene glycol on a Pt (111) model surface.

  8. Conformational Ensemble of hIAPP Dimer: Insight into the Molecular Mechanism by which a Green Tea Extract inhibits hIAPP Aggregation

    NASA Astrophysics Data System (ADS)

    Mo, Yuxiang; Lei, Jiangtao; Sun, Yunxiang; Zhang, Qingwen; Wei, Guanghong

    2016-09-01

    Small oligomers formed early during human islet amyloid polypeptide (hIAPP) aggregation are responsible for cell death in Type II diabetes. Epigallocatechin gallate (EGCG), a green tea extract, was found to inhibit hIAPP fibrillation. However, the inhibition mechanism and the conformational distribution of the smallest hIAPP oligomer, the dimer, are mostly unknown. Herein, we performed extensive replica exchange molecular dynamics simulations on the hIAPP dimer with and without EGCG molecules. Extended hIAPP dimer conformations, with a collision cross section value similar to that observed by ion mobility-mass spectrometry, were observed in our simulations. Notably, these dimers adopt a three-stranded antiparallel β-sheet and contain the previously reported β-hairpin amyloidogenic precursor. We find that EGCG binding strongly blocks both the inter-peptide hydrophobic and aromatic-stacking interactions responsible for inter-peptide β-sheet formation and the intra-peptide interaction crucial for β-hairpin formation, thus abolishing the three-stranded β-sheet structures and leading to the formation of coil-rich conformations. Hydrophobic, aromatic-stacking, cation-π and hydrogen-bonding interactions jointly contribute to the EGCG-induced conformational shift. This study provides, at the atomic level, the conformational ensemble of the hIAPP dimer and the molecular mechanism by which EGCG inhibits hIAPP aggregation.

  9. Sampling-based ensemble segmentation against inter-operator variability

    NASA Astrophysics Data System (ADS)

    Huo, Jing; Okada, Kazunori; Pope, Whitney; Brown, Matthew

    2011-03-01

    Inconsistency and a lack of reproducibility are commonly associated with semi-automated segmentation methods. In this study, we developed an ensemble approach to improve reproducibility and applied it to glioblastoma multiforme (GBM) brain tumor segmentation on T1-weighted contrast enhanced MR volumes. The proposed approach combines sampling-based simulations and ensemble segmentation into a single framework; it generates a set of segmentations by perturbing user initialization and user-specified internal parameters, then fuses the set of segmentations into a single consensus result. Three combination algorithms were applied: majority voting, averaging and expectation-maximization (EM). The reproducibility of the proposed framework was evaluated by a controlled experiment on 16 tumor cases from a multicenter drug trial. The ensemble framework had significantly better reproducibility than the individual base Otsu thresholding method (p<.001).
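
    Of the three combination algorithms listed in this abstract, majority voting is the simplest to illustrate. The sketch below fuses several binary segmentation masks into one consensus mask; it assumes masks are equally sized 2-D lists of 0/1 values and that a pixel is kept only when strictly more than half of the masks mark it. The function name and the tie-breaking rule are assumptions, not details from the paper.

```python
def majority_vote(masks):
    """Fuse binary masks pixel by pixel using a strict majority rule."""
    n = len(masks)
    return [
        # zip(*rows) groups the values of one pixel across all masks.
        [1 if 2 * sum(pixel) > n else 0 for pixel in zip(*rows)]
        # zip(*masks) groups the same row across all masks.
        for rows in zip(*masks)
    ]


# Three perturbed segmentations of the same 2x2 region.
consensus = majority_vote([
    [[1, 0], [1, 1]],
    [[1, 1], [0, 1]],
    [[0, 0], [0, 1]],
])
```

    With an even number of masks, the strict rule drops tied pixels; a real pipeline would need to decide how to treat such ties.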

  10. Multi-complexity ensemble measures for gait time series analysis: application to diagnostics, monitoring and biometrics.

    PubMed

    Gavrishchaka, Valeriy; Senyukova, Olga; Davis, Kristina

    2015-01-01

    Previously, we proposed using complementary complexity measures discovered by boosting-like ensemble learning to enhance quantitative indicators dealing with necessarily short physiological time series. We have confirmed the robustness of such multi-complexity measures for heart rate variability analysis, with emphasis on the detection of emerging and intermittent cardiac abnormalities. Recently, we presented preliminary results suggesting that such an ensemble-based approach could also be effective in discovering universal meta-indicators for early detection and convenient monitoring of neurological abnormalities using gait time series. Here, we argue and demonstrate that these multi-complexity ensemble measures for gait time series analysis could have a significantly wider application scope, ranging from diagnostics and early detection of physiological regime change to gait-based biometrics applications.

  11. Atomistic structural ensemble refinement reveals non-native structure stabilizes a sub-millisecond folding intermediate of CheY

    DOE PAGES

    Shi, Jade; Nobrega, R. Paul; Schwantes, Christian; ...

    2017-03-08

    The dynamics of globular proteins can be described in terms of transitions between a folded native state and less-populated intermediates, or excited states, which can play critical roles in both protein folding and function. Excited states are by definition transient species, and therefore are difficult to characterize using current experimental techniques. We report an atomistic model of the excited state ensemble of a stabilized mutant of an extensively studied flavodoxin fold protein CheY. We employed a hybrid simulation and experimental approach in which an aggregate 42 milliseconds of all-atom molecular dynamics were used as an informative prior for the structure of the excited state ensemble. The resulting prior was then refined against small-angle X-ray scattering (SAXS) data employing an established method (EROS). The most striking feature of the resulting excited state ensemble was an unstructured N-terminus stabilized by non-native contacts in a conformation that is topologically simpler than the native state. We then predict incisive single molecule FRET experiments, using these results, as a means of model validation. Our study demonstrates the paradigm of uniting simulation and experiment in a statistical model to study the structure of protein excited states and rationally design validating experiments.

  12. Atomistic structural ensemble refinement reveals non-native structure stabilizes a sub-millisecond folding intermediate of CheY

    NASA Astrophysics Data System (ADS)

    Shi, Jade; Nobrega, R. Paul; Schwantes, Christian; Kathuria, Sagar V.; Bilsel, Osman; Matthews, C. Robert; Lane, T. J.; Pande, Vijay S.

    2017-03-01

    The dynamics of globular proteins can be described in terms of transitions between a folded native state and less-populated intermediates, or excited states, which can play critical roles in both protein folding and function. Excited states are by definition transient species, and therefore are difficult to characterize using current experimental techniques. Here, we report an atomistic model of the excited state ensemble of a stabilized mutant of an extensively studied flavodoxin fold protein CheY. We employed a hybrid simulation and experimental approach in which an aggregate 42 milliseconds of all-atom molecular dynamics were used as an informative prior for the structure of the excited state ensemble. This prior was then refined against small-angle X-ray scattering (SAXS) data employing an established method (EROS). The most striking feature of the resulting excited state ensemble was an unstructured N-terminus stabilized by non-native contacts in a conformation that is topologically simpler than the native state. Using these results, we then predict incisive single molecule FRET experiments as a means of model validation. This study demonstrates the paradigm of uniting simulation and experiment in a statistical model to study the structure of protein excited states and rationally design validating experiments.

  13. Enhanced conformational sampling to visualize a free-energy landscape of protein complex formation

    PubMed Central

    Iida, Shinji; Nakamura, Haruki; Higo, Junichi

    2016-01-01

    We introduce various, recently developed, generalized ensemble methods, which are useful to sample various molecular configurations emerging in the process of protein–protein or protein–ligand binding. The methods introduced here are those that have been or will be applied to biomolecular binding, where the biomolecules are treated as flexible molecules expressed by an all-atom model in an explicit solvent. Sampling produces an ensemble of conformations (snapshots) that are thermodynamically probable at room temperature. Then, projection of those conformations to an abstract low-dimensional space generates a free-energy landscape. As an example, we show a landscape of homo-dimer formation of an endothelin-1-like molecule computed using a generalized ensemble method. The lowest free-energy cluster at room temperature coincided precisely with the experimentally determined complex structure. Two minor clusters were also found in the landscape, which were largely different from the native complex form. Although those clusters were isolated at room temperature, with rising temperature a pathway emerged linking the lowest and second-lowest free-energy clusters, and a further temperature increment connected all the clusters. This exemplifies that the generalized ensemble method is a powerful tool for computing the free-energy landscape, by which one can discuss the thermodynamic stability of clusters and the temperature dependence of the cluster networks. PMID:27288028

  14. Impact of ensemble learning in the assessment of skeletal maturity.

    PubMed

    Cunha, Pedro; Moura, Daniel C; Guevara López, Miguel Angel; Guerra, Conceição; Pinto, Daniela; Ramos, Isabel

    2014-09-01

    The assessment of bone age, or skeletal maturity, is an important task in pediatrics that measures the degree of maturation of children's bones. Currently, there is no standard clinical procedure for assessing bone age, and the most widely used approaches are the Greulich and Pyle and the Tanner and Whitehouse methods. Computer methods have been proposed to automate the process; however, there has been little exploration of how to combine the features of the different parts of the hand, and how to take advantage of ensemble techniques for this purpose. This paper presents a study where the use of ensemble techniques for improving bone age assessment is evaluated. A new computer method was developed that extracts descriptors for each joint of each finger, which are then combined using different ensemble schemes to obtain a final bone age value. Three popular ensemble schemes are explored in this study: bagging, stacking and voting. The best results were achieved by bagging with a rule-based regression (M5P), scoring a mean absolute error of 10.16 months. Results show that ensemble techniques improve the prediction performance of most of the evaluated regression algorithms, always achieving the best results or results comparable to the best. Therefore, the success of the ensemble methods allows us to conclude that their use may improve computer-based bone age assessment, offering a scalable option for utilizing multiple regions of interest and combining their output.

  15. Comparison of numerical weather prediction based deterministic and probabilistic wind resource assessment methods

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, Jie; Draxl, Caroline; Hopson, Thomas

    Numerical weather prediction (NWP) models have been widely used for wind resource assessment. Model runs with higher spatial resolution are generally more accurate, yet extremely computationally expensive. An alternative approach is to use data generated by a low resolution NWP model, in conjunction with statistical methods. In order to analyze the accuracy and computational efficiency of different types of NWP-based wind resource assessment methods, this paper performs a comparison of three deterministic and probabilistic NWP-based wind resource assessment methodologies: (i) a coarse resolution (0.5 degrees x 0.67 degrees) global reanalysis data set, the Modern-Era Retrospective Analysis for Research and Applications (MERRA); (ii) an analog ensemble methodology based on the MERRA, which provides both deterministic and probabilistic predictions; and (iii) a fine resolution (2-km) NWP data set, the Wind Integration National Dataset (WIND) Toolkit, based on the Weather Research and Forecasting model. Results show that: (i) as expected, the analog ensemble and WIND Toolkit perform significantly better than MERRA, confirming their ability to downscale coarse estimates; (ii) the analog ensemble provides the best estimate of the multi-year wind distribution at seven of the nine sites, while the WIND Toolkit is the best at one site; (iii) the WIND Toolkit is more accurate in estimating the distribution of hourly wind speed differences, which characterizes the wind variability, at five of the available sites, with the analog ensemble being best at the remaining four locations; and (iv) the analog ensemble computational cost is negligible, whereas the WIND Toolkit requires large computational resources. Future efforts could focus on the combination of the analog ensemble with intermediate resolution (e.g., 10-15 km) NWP estimates, to considerably reduce the computational burden, while providing accurate deterministic estimates and reliable probabilistic assessments.

  16. Computational predictive models for P-glycoprotein inhibition of in-house chalcone derivatives and drug-bank compounds.

    PubMed

    Ngo, Trieu-Du; Tran, Thanh-Dao; Le, Minh-Tri; Thai, Khac-Minh

    2016-11-01

    The human P-glycoprotein (P-gp) efflux pump is of great interest to medicinal chemists because of its important role in multidrug resistance (MDR). Because of the high polyspecificity of this transmembrane protein and the unavailability of high-resolution X-ray crystal structures, ligand-based and structure-based approaches, namely machine learning, homology modeling, and molecular docking, were combined for this study. In the ligand-based approach, individual two-dimensional quantitative structure-activity relationship models were developed using different machine learning algorithms and subsequently combined into an ensemble model, which showed good performance on both the diverse training set and the validation sets. The applicability domain and the prediction quality of the developed models were also judged using state-of-the-art methods and tools. In the structure-based approach, the P-gp structure and its binding region were predicted for a docking study to determine possible interactions between the ligands and the receptor. Based on these in silico tools, hit compounds for reversing MDR were discovered from the in-house and DrugBank databases through virtual screening using prediction models and molecular docking in an attempt to restore cancer cell sensitivity to cytotoxic drugs.

  17. Peer Connectedness in the Middle School Band Program

    ERIC Educational Resources Information Center

    Rawlings, Jared R.; Stoddard, Sarah A.

    2017-01-01

    Previous research suggests that students participating in school-based musical ensembles are more engaged in school and more likely to connect to their peers in school; however, researchers have not specifically investigated peer connectedness among adolescents in school-based music ensembles. The purpose of this study was to explore middle school…

  18. Assessment in Performance-Based Secondary Music Classes

    ERIC Educational Resources Information Center

    Pellegrino, Kristen; Conway, Colleen M.; Russell, Joshua A.

    2015-01-01

    After sharing research findings about grading and assessment practices in secondary music ensemble classes, we offer examples of commonly used assessment tools (ratings scale, checklist, rubric) for the performance ensemble. Then, we explore the various purposes of assessment in performance-based music courses: (1) to meet state, national, and…

  19. Molecular and cellular heterogeneity: the hallmark of glioblastoma.

    PubMed

    Aum, Diane J; Kim, David H; Beaumont, Thomas L; Leuthardt, Eric C; Dunn, Gavin P; Kim, Albert H

    2014-12-01

    There has been increasing awareness that glioblastoma, which may seem histopathologically similar across many tumors, actually represents a group of molecularly distinct tumors. Emerging evidence suggests that cells even within the same tumor exhibit wide-ranging molecular diversity. Parallel to the discoveries of molecular heterogeneity among tumors and their individual cells, intense investigation of the cellular biology of glioblastoma has revealed that not all cancer cells within a given tumor behave the same. The identification of a subpopulation of brain tumor cells termed "glioblastoma cancer stem cells" or "tumor-initiating cells" has implications for the management of glioblastoma. This focused review will therefore summarize emerging concepts on the molecular and cellular heterogeneity of glioblastoma and emphasize that we should begin to consider each individual glioblastoma to be an ensemble of molecularly distinct subclones that reflect a spectrum of dynamic cell states.

  20. A user credit assessment model based on clustering ensemble for broadband network new media service supervision

    NASA Astrophysics Data System (ADS)

    Liu, Fang; Cao, San-xing; Lu, Rui

    2012-04-01

    This paper proposes a user credit assessment model based on a clustering ensemble, aiming to address the problem of users illegally spreading pirated and pornographic media content on user self-service oriented broadband network new media platforms. The idea is to assess new media user credit by establishing an index system based on user credit behaviors, so that illegal users can be identified from the credit assessment results and the transmission of harmful video and audio on the network can be curbed. The proposed model integrates two complementary advantages: swarm intelligence clustering is suitable for user credit behavior analysis, and K-means clustering can eliminate the scattered users left in the result of swarm intelligence clustering; together they classify all users' credit automatically. Verification experiments on the model's effectiveness were carried out on a standard credit application dataset in the UCI machine learning repository. The statistical results of a comparative experiment with a single swarm intelligence clustering model indicate that the clustering ensemble model has a stronger ability to distinguish creditworthiness, especially in identifying the user clusters with the best and worst credit, which will help operators apply incentive or punitive measures accurately. In addition, compared with the experimental results of a logistic regression based model under the same conditions, the clustering ensemble model is robust and has better prediction accuracy.

  1. Phthalocyanine-nanocarbon ensembles: from discrete molecular and supramolecular systems to hybrid nanomaterials.

    PubMed

    Bottari, Giovanni; de la Torre, Gema; Torres, Tomas

    2015-04-21

    Phthalocyanines (Pcs) are macrocyclic and aromatic compounds that present unique electronic features such as high molar absorption coefficients, rich redox chemistry, and photoinduced energy/electron transfer abilities that can be modulated as a function of the electronic character of their counterparts in donor-acceptor (D-A) ensembles. In this context, carbon nanostructures such as fullerenes, carbon nanotubes (CNTs), and, more recently, graphene are among the most suitable Pc "companions". Pc-C60 ensembles have long been the main actors in this field, due to the commercial availability of C60 and the well-established synthetic methods for its functionalization. As a result, many Pc-C60 architectures have been prepared, featuring different connectivities (covalent or supramolecular), intermolecular interactions (self-organized or molecularly dispersed species), and Pc HOMO/LUMO levels. All these elements provide a versatile toolbox for tuning the photophysical properties in terms of the type of process (photoinduced energy/electron transfer), the nature of the interactions between the electroactive units (through bond or space), and the kinetics of the formation/decay of the photogenerated species. Some recent trends in this field include the preparation of stimuli-responsive multicomponent systems with tunable photophysical properties and highly ordered nanoarchitectures and surface-supported systems showing high charge mobilities. A breakthrough in the Pc-nanocarbon field was the appearance of CNTs and graphene, which opened a new avenue for the preparation of intriguing photoresponsive hybrid ensembles showing light-stimulated charge separation. The scarce solubility of these 1-D and 2-D nanocarbons, together with their lower reactivity with respect to C60 stemming from their less strained sp2 carbon networks, has not proved an insurmountable limitation for the preparation of a variety of Pc-based hybrids. These systems, which show improved solubility and dispersibility features, bring together the unique electronic transport properties of CNTs and graphene with the excellent light-harvesting and tunable redox properties of Pcs. A singular and distinctive feature of these Pc-CNT/graphene (single- or few-layers) hybrid materials is the control of the direction of the photoinduced charge transfer as a result of the band-like electronic structure of these carbon nanoforms and the adjustable electronic levels of Pcs. Moreover, these conjugates present intensified light-harvesting capabilities resulting from the grafting of several chromophores on the same nanocarbon platform. In this Account, recent progress in the construction of covalent and supramolecular Pc-nanocarbon ensembles is summarized, with a particular emphasis on their photoinduced behavior. We believe that the high degree of control achieved in the preparation of Pc-carbon nanostructures, together with the increasing knowledge of the factors governing their photophysics, will allow for the design of next-generation light-fueled electroactive systems. Possible implementation of these Pc-nanocarbons in high performance devices is envisioned, finally turning into reality many of the expectations generated by these materials.

  2. UNRES server for physics-based coarse-grained simulations and prediction of protein structure, dynamics and thermodynamics.

    PubMed

    Czaplewski, Cezary; Karczynska, Agnieszka; Sieradzan, Adam K; Liwo, Adam

    2018-04-30

    A server implementation of the UNRES package (http://www.unres.pl) for coarse-grained simulations of protein structures with the physics-based UNRES model, named the UNRES server, is presented. In contrast to most protein coarse-grained models, owing to its physics-based origin, the UNRES force field can be used in simulations, including those aimed at protein-structure prediction, without ancillary information from structural databases; however, the implementation includes the possibility of using restraints. Local energy minimization, canonical molecular dynamics simulations, replica exchange and multiplexed replica exchange molecular dynamics simulations can be run with the current UNRES server; the latter are suitable for protein-structure prediction. The user-supplied input includes the protein sequence and, optionally, restraints from secondary-structure prediction or small-angle X-ray scattering data, together with the simulation type and parameters, which are selected or typed in. Oligomeric proteins, as well as those containing D-amino-acid residues and disulfide links, can be treated. The output is displayed graphically (minimized structures, trajectories, final models, analysis of trajectory/ensembles); however, all output files can be downloaded by the user. The UNRES server can be freely accessed at http://unres-server.chem.ug.edu.pl.

  3. The Jarzynski identity derived from general Hamiltonian or non-Hamiltonian dynamics reproducing NVT or NPT ensembles

    NASA Astrophysics Data System (ADS)

    Cuendet, Michel A.

    2006-10-01

    The Jarzynski identity (JI) relates nonequilibrium work averages to thermodynamic free energy differences. It was shown in a recent contribution [M. A. Cuendet, Phys. Rev. Lett. 96, 120602 (2006)] that the JI can, in particular, be derived directly from the Nosé-Hoover thermostated dynamics. This statistical mechanical derivation is particularly relevant in the framework of molecular dynamics simulation, because it is based solely on the equations of motion considered and is free of any additional assumptions on system size or bath coupling. Here, this result is generalized to a variety of dynamics, along two directions. On the one hand, specific improved thermostating schemes used in practical applications are treated. These include Nosé-Hoover chains, higher moment thermostats, as well as an isothermal-isobaric scheme yielding the JI in the NPT ensemble. On the other hand, the theoretical generality of the new derivation is explored. Generic dynamics with arbitrary coupling terms and an arbitrary number of thermostating variables, both non-Hamiltonian and Hamiltonian, are shown to imply the JI. In particular, a nonautonomous formulation of the generalized Nosé-Poincaré thermostat is proposed. Finally, general conditions required for the JI derivation are briefly discussed.
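
    As an illustration of the relation this abstract refers to, the Jarzynski identity is commonly written as exp(-beta * dF) = <exp(-beta * W)>, where W is the nonequilibrium work, dF the free energy difference, and beta the inverse temperature. The sketch below is a minimal estimator of dF from a list of work values; the function name and the log-sum-exp shift are illustrative choices, not taken from the cited paper.

```python
import math


def jarzynski_free_energy(work_values, beta):
    """Estimate dF from work samples via exp(-beta*dF) = <exp(-beta*W)>.

    work_values: nonequilibrium work values W (hypothetical input data).
    beta: inverse temperature 1/(k_B * T), in units consistent with W.
    """
    if not work_values:
        raise ValueError("at least one work value is required")
    exponents = [-beta * w for w in work_values]
    # Shift by the largest exponent so that exp() cannot overflow.
    shift = max(exponents)
    mean_exp = sum(math.exp(e - shift) for e in exponents) / len(exponents)
    log_mean = shift + math.log(mean_exp)
    return -log_mean / beta


if __name__ == "__main__":
    # Toy work values; the estimate never exceeds the mean work (Jensen's inequality).
    print(jarzynski_free_energy([0.0, 2.0], beta=1.0))
```

    Because the exponential average is dominated by rare low-work trajectories, an estimator of this kind is usually expected to converge slowly when the work distribution is broad.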

  4. Mutation-Induced Population Shift in the MexR Conformational Ensemble Disengages DNA Binding: A Novel Mechanism for MarR Family Derepression.

    PubMed

    Anandapadamanaban, Madhanagopal; Pilstål, Robert; Andresen, Cecilia; Trewhella, Jill; Moche, Martin; Wallner, Björn; Sunnerhagen, Maria

    2016-08-02

    MexR is a repressor of the MexAB-OprM multidrug efflux pump operon of Pseudomonas aeruginosa, where DNA-binding impairing mutations lead to multidrug resistance (MDR). Surprisingly, the crystal structure of an MDR-conferring MexR mutant R21W (2.19 Å) presented here is closely similar to wild-type MexR. However, our extended analysis, by molecular dynamics and small-angle X-ray scattering, reveals that the mutation stabilizes a ground state that is deficient of DNA binding and is shared by both mutant and wild-type MexR, whereas the DNA-binding state is only transiently reached by the more flexible wild-type MexR. This population shift in the conformational ensemble is effected by mutation-induced allosteric coupling of contact networks that are independent in the wild-type protein. We propose that the MexR-R21W mutant mimics derepression by small-molecule binding to MarR proteins, and that the described allosteric model based on population shifts may also apply to other MarR family members. Copyright © 2016 Elsevier Ltd. All rights reserved.

  5. The E. coli thioredoxin folding mechanism: the key role of the C-terminal helix.

    PubMed

    Vazquez, Diego S; Sánchez, Ignacio E; Garrote, Ana; Sica, Mauricio P; Santos, Javier

    2015-02-01

    In this work, the unfolding mechanism of oxidized Escherichia coli thioredoxin (EcTRX) was investigated experimentally and computationally. We characterized seven point mutants distributed along the C-terminal α-helix (CTH) and the preceding loop. The mutations destabilized the protein against global unfolding while leaving the native structure unchanged. Global analysis of the unfolding kinetics of all variants revealed a linear unfolding route with a high-energy on-pathway intermediate state flanked by two transition state ensembles TSE1 and TSE2. The experiments show that CTH is mainly unfolded in TSE1 and the intermediate and becomes structured in TSE2. Structure-based molecular dynamics are in agreement with these experiments and provide protein-wide structural information on transient states. In our model, EcTRX folding starts with structure formation in the β-sheet, while the protein helices coalesce later. As a whole, our results indicate that the CTH is a critical module in the folding process, restraining a heterogeneous intermediate ensemble into a biologically active native state and providing the native protein with thermodynamic and kinetic stability. Copyright © 2014 Elsevier B.V. All rights reserved.

  6. Equilibrium sampling by reweighting nonequilibrium simulation trajectories

    NASA Astrophysics Data System (ADS)

    Yang, Cheng; Wan, Biao; Xu, Shun; Wang, Yanting; Zhou, Xin

    2016-03-01

    Based on equilibrium molecular simulations, it is usually difficult to efficiently visit the whole conformational space of complex systems, which are separated into some metastable regions by high free energy barriers. Nonequilibrium simulations could enhance transitions among these metastable regions and then be applied to sample equilibrium distributions in complex systems, since the associated nonequilibrium effects can be removed by employing the Jarzynski equality (JE). Here we present such a systematical method, named reweighted nonequilibrium ensemble dynamics (RNED), to efficiently sample equilibrium conformations. The RNED is a combination of the JE and our previous reweighted ensemble dynamics (RED) method. The original JE reproduces equilibrium from lots of nonequilibrium trajectories but requires that the initial distribution of these trajectories is equilibrium. The RED reweights many equilibrium trajectories from an arbitrary initial distribution to get the equilibrium distribution, whereas the RNED has both advantages of the two methods, reproducing equilibrium from lots of nonequilibrium simulation trajectories with an arbitrary initial conformational distribution. We illustrated the application of the RNED in a toy model and in a Lennard-Jones fluid to detect its liquid-solid phase coexistence. The results indicate that the RNED sufficiently extends the application of both the original JE and the RED in equilibrium sampling of complex systems.

  7. Equilibrium sampling by reweighting nonequilibrium simulation trajectories.

    PubMed

    Yang, Cheng; Wan, Biao; Xu, Shun; Wang, Yanting; Zhou, Xin

    2016-03-01

    Based on equilibrium molecular simulations, it is usually difficult to efficiently visit the whole conformational space of complex systems, which are separated into some metastable regions by high free energy barriers. Nonequilibrium simulations could enhance transitions among these metastable regions and then be applied to sample equilibrium distributions in complex systems, since the associated nonequilibrium effects can be removed by employing the Jarzynski equality (JE). Here we present such a systematical method, named reweighted nonequilibrium ensemble dynamics (RNED), to efficiently sample equilibrium conformations. The RNED is a combination of the JE and our previous reweighted ensemble dynamics (RED) method. The original JE reproduces equilibrium from lots of nonequilibrium trajectories but requires that the initial distribution of these trajectories is equilibrium. The RED reweights many equilibrium trajectories from an arbitrary initial distribution to get the equilibrium distribution, whereas the RNED has both advantages of the two methods, reproducing equilibrium from lots of nonequilibrium simulation trajectories with an arbitrary initial conformational distribution. We illustrated the application of the RNED in a toy model and in a Lennard-Jones fluid to detect its liquid-solid phase coexistence. The results indicate that the RNED sufficiently extends the application of both the original JE and the RED in equilibrium sampling of complex systems.

  8. Conformational heterogeneity in the Hsp70 chaperone-substrate ensemble identified from analysis of NMR-detected titration data.

    PubMed

    Sekhar, Ashok; Nagesh, Jayashree; Rosenzweig, Rina; Kay, Lewis E

    2017-11-01

    The Hsp70 chaperone system plays a critical role in cellular homeostasis by binding to client protein molecules. We have recently shown by methyl-TROSY NMR methods that the Escherichia coli Hsp70, DnaK, can form multiple bound complexes with a small client protein, hTRF1. In an effort to characterize the interactions further we report here the results of an NMR-based titration study of hTRF1 and DnaK, where both molecular components are monitored simultaneously, leading to a binding model. A central finding is the formation of a previously undetected 3:1 hTRF1-DnaK complex, suggesting that under heat shock conditions, DnaK might be able to protect cytosolic proteins whose net concentrations would exceed that of the chaperone. Moreover, these results provide new insight into the heterogeneous ensemble of complexes formed by DnaK chaperones and further emphasize the unique role of NMR spectroscopy in obtaining information about individual events in a complex binding scheme by exploiting a large number of probes that report uniquely on distinct binding processes. © 2017 The Protein Society.

  9. O-Acetyl Side-Chains in Monosaccharides: Redundant NMR Spin-Couplings and Statistical Models for Acetate Ester Conformational Analysis.

    PubMed

    Turney, Toby; Pan, Qingfeng; Sernau, Luke; Carmichael, Ian; Zhang, Wenhui; Wang, Xiaocong; Woods, Robert J; Serianni, Anthony S

    2017-01-12

    α- and β-d-glucopyranose monoacetates 1-3 were prepared with selective ¹³C enrichment in the O-acetyl side-chain, and ensembles of ¹³C-¹H and ¹³C-¹³C NMR spin-couplings (J-couplings) were measured involving the labeled carbons. Density functional theory (DFT) was applied to a set of model structures to determine which J-couplings are sensitive to rotation of the ester bond θ. Eight J-couplings (¹JCC, ²JCH, ²JCC, ³JCH, and ³JCC) were found to be sensitive to θ, and four equations were parametrized to allow quantitative interpretations of experimental J-values. Inspection of J-coupling ensembles in 1-3 showed that O-acetyl side-chain conformation depends on molecular context, with flanking groups playing a dominant role in determining the properties of θ in solution. To quantify these effects, ensembles of J-couplings containing four values were used to determine the precision and accuracy of several 2-parameter statistical models of rotamer distributions across θ in 1-3. The statistical method used to generate these models has been encoded in a newly developed program, MA'AT, which is available for public use. These models were compared to O-acetyl side-chain behavior observed in a representative sample of crystal structures, and in molecular dynamics (MD) simulations of O-acetylated model structures. While the functional form of the model had little effect on the precision of the calculated mean of θ in 1-3, platykurtic models were found to give more precise estimates of the width of the distribution about the mean (expressed as circular standard deviations). Validation of these 2-parameter models to interpret ensembles of redundant J-couplings using the O-acetyl system as a test case enables future extension of the approach to other flexible elements in saccharides, such as glycosidic linkage conformation.

  10. Molecular Dynamics Simulation of Membranes and a Transmembrane Helix

    NASA Astrophysics Data System (ADS)

    Duong, Tap Ha; Mehler, Ernest L.; Weinstein, Harel

    1999-05-01

    Three molecular dynamics (MD) simulations of 1.5-ns length were carried out on fully hydrated patches of dimyristoyl phosphatidylcholine (DMPC) bilayers in the liquid-crystalline phase. The simulations were performed using different ensembles and electrostatic conditions: a microcanonical ensemble or constant pressure-temperature ensemble, with or without truncated electrostatic interactions. Calculated properties of the membrane patches from the three different protocols were compared to available data from experiments. These data include the resulting overall geometrical dimensions, the order characteristics of the lipid hydrocarbon chains, as well as various measures of the conformations of the polar head groups. The comparisons indicate that the simulation carried out within the microcanonical ensemble with truncated electrostatic interactions yielded results closest to the experimental data, provided that the initial equilibration phase preceding the production run was sufficiently long. The effects of embedding a non-ideal helical protein domain in the membrane patch were studied with the same MD protocols. This simulation was carried out for 2.5 ns. The protein domain corresponds to the seventh transmembrane segment (TMS7) of the human serotonin 5HT2A receptor. The peptide is composed of two α-helical segments linked by a hinge domain around a perturbing Asn-Pro motif that produces at the end of the simulation a kink angle of nearly 80° between the two helices. Several aspects of the TMS7 structure, such as the bending angle, backbone Φ and Ψ torsion angles, the intramolecular hydrogen bonds, and the overall conformation, were found to be very similar to those determined by NMR for the corresponding transmembrane segment of the tachykinin NK-1 receptor. In general, the simulations were found to yield structural and dynamic characteristics that are in good agreement with experiment. These findings support the application of simulation methods to the study of the complex biomolecular systems at the membrane interface of cells.

  11. NWS Operational Requirements for Ensemble-Based Hydrologic Forecasts

    NASA Astrophysics Data System (ADS)

    Hartman, R. K.

    2008-12-01

    Ensemble-based hydrologic forecasts have been developed and issued by National Weather Service (NWS) staff at River Forecast Centers (RFCs) for many years. Used principally for long-range water supply forecasts, these forecasts have traditionally considered only the uncertainty associated with weather and climate. As technology and societal expectations of resource managers increase, the use of and desire for risk-based decision support tools have also increased. These tools require forecast information that includes reliable uncertainty estimates across all time and space domains. The development of reliable uncertainty estimates associated with hydrologic forecasts is being actively pursued within the United States and internationally. This presentation will describe the challenges, components, and requirements for operational hydrologic ensemble-based forecasts from the perspective of a NOAA/NWS River Forecast Center.

  12. Verification of Ensemble Forecasts for the New York City Operations Support Tool

    NASA Astrophysics Data System (ADS)

    Day, G.; Schaake, J. C.; Thiemann, M.; Draijer, S.; Wang, L.

    2012-12-01

    The New York City water supply system operated by the Department of Environmental Protection (DEP) serves nine million people. It spans 2,000 square miles across portions of the Catskill, Delaware, and Croton watersheds, and it includes nineteen reservoirs and three controlled lakes. DEP is developing an Operations Support Tool (OST) to support its water supply operations and planning activities. OST includes historical and real-time data, a model of the water supply system complete with operating rules, and lake water quality models developed to evaluate alternatives for managing turbidity in the New York City Catskill reservoirs. OST will enable DEP to manage turbidity in its unfiltered system while satisfying its primary objective of meeting the City's water supply needs, in addition to considering secondary objectives of maintaining ecological flows, supporting fishery and recreation releases, and mitigating downstream flood peaks. The current version of OST relies on statistical forecasts of flows in the system based on recent observed flows. To improve short-term decision making, plans are being made to transition to National Weather Service (NWS) ensemble forecasts based on hydrologic models that account for short-term weather forecast skill, longer-term climate information, as well as the hydrologic state of the watersheds and recent observed flows. To ensure that the ensemble forecasts are unbiased and that the ensemble spread reflects the actual uncertainty of the forecasts, a statistical model has been developed to post-process the NWS ensemble forecasts to account for hydrologic model error as well as any inherent bias and uncertainty in initial model states, meteorological data and forecasts. The post-processor is designed to produce adjusted ensemble forecasts that are consistent with the DEP historical flow sequences that were used to develop the system operating rules. A set of historical hindcasts that is representative of the real-time ensemble forecasts is needed to verify that the post-processed forecasts are unbiased, statistically reliable, and preserve the skill inherent in the "raw" NWS ensemble forecasts. A verification procedure and set of metrics will be presented that provide an objective assessment of ensemble forecasts. The procedure will be applied to both raw ensemble hindcasts and to post-processed ensemble hindcasts. The verification metrics will be used to validate proper functioning of the post-processor and to provide a benchmark for comparison of different types of forecasts. For example, current NWS ensemble forecasts are based on climatology, using each historical year to generate a forecast trace. The NWS Hydrologic Ensemble Forecast System (HEFS) under development will utilize output from both the National Oceanic and Atmospheric Administration (NOAA) Global Ensemble Forecast System (GEFS) and the Climate Forecast System (CFS). Incorporating short-term meteorological forecasts and longer-term climate forecast information should provide sharper, more accurate forecasts. Hindcasts from HEFS will enable New York City to generate verification results to validate the new forecasts and further fine-tune system operating rules. Project verification results will be presented for different watersheds across a range of seasons, lead times, and flow levels to assess the quality of the current ensemble forecasts.

  13. Calibration and validation of coarse-grained models of atomic systems: application to semiconductor manufacturing

    NASA Astrophysics Data System (ADS)

    Farrell, Kathryn; Oden, J. Tinsley

    2014-07-01

    Coarse-grained models of atomic systems, created by aggregating groups of atoms into molecules to reduce the number of degrees of freedom, have been used for decades in important scientific and technological applications. In recent years, interest in developing a more rigorous theory for coarse graining and in assessing the predictivity of coarse-grained models has arisen. In this work, Bayesian methods for the calibration and validation of coarse-grained models of atomistic systems in thermodynamic equilibrium are developed. For specificity, only configurational models of systems in canonical ensembles are considered. Among major challenges in validating coarse-grained models are (1) the development of validation processes that lead to information essential in establishing confidence in the model's ability to predict key quantities of interest and (2), above all, the determination of the coarse-grained model itself; that is, the characterization of the molecular architecture, the choice of interaction potentials and thus parameters, which best fit available data. The all-atom model is treated as the "ground truth," and it provides the basis with respect to which properties of the coarse-grained model are compared. This base all-atom model is characterized by an appropriate statistical mechanics framework in this work by canonical ensembles involving only configurational energies. The all-atom model thus supplies data for Bayesian calibration and validation methods for the molecular model. To address challenge (1), we develop priors based on the maximum entropy principle and likelihood functions based on Gaussian approximations of the uncertainties in the parameter-to-observation error. To address challenge (2), we introduce the notion of model plausibilities as a means for model selection. This methodology provides a powerful approach toward constructing coarse-grained models which are most plausible for given all-atom data. We demonstrate the theory and methods through applications to representative atomic structures and we discuss extensions to the validation process for molecular models of polymer structures encountered in certain semiconductor nanomanufacturing processes. The powerful method of model plausibility as a means for selecting interaction potentials for coarse-grained models is discussed in connection with a coarse-grained hexane molecule. Discussions of how all-atom information is used to construct priors are contained in an appendix.

  14. Effect of Data Assimilation Parameters on The Optimized Surface CO2 Flux in Asia

    NASA Astrophysics Data System (ADS)

    Kim, Hyunjung; Kim, Hyun Mee; Kim, Jinwoong; Cho, Chun-Ho

    2018-02-01

    In this study, CarbonTracker, an inverse modeling system based on the ensemble Kalman filter, was used to evaluate the effects of data assimilation parameters (assimilation window length and ensemble size) on the estimation of surface CO2 fluxes in Asia. Several experiments with different parameters were conducted, and the results were verified using CO2 concentration observations. The assimilation window lengths tested were 3, 5, 7, and 10 weeks, and the ensemble sizes were 100, 150, and 300. Therefore, a total of 12 experiments using combinations of these parameters were conducted. The experimental period was from January 2006 to December 2009. Differences between the optimized surface CO2 fluxes of the experiments were largest in the Eurasian Boreal (EB) area, followed by Eurasian Temperate (ET) and Tropical Asia (TA), and were larger in boreal summer than in boreal winter. Over Asia as a whole, the effect of ensemble size on the optimized biosphere flux is larger than the effect of the assimilation window length, but their relative importance varies by region. The optimized biosphere flux was more sensitive to the assimilation window length in EB, whereas it was sensitive to the ensemble size as well as the assimilation window length in ET. The larger the ensemble size and the shorter the assimilation window length, the larger the uncertainty (i.e., spread of ensemble) of optimized surface CO2 fluxes. A 10-week assimilation window with an ensemble size of 300 was the optimal configuration for CarbonTracker in the Asian region based on several verifications using CO2 concentration measurements.

  15. DNA under Force: Mechanics, Electrostatics, and Hydration.

    PubMed

    Li, Jingqiang; Wijeratne, Sithara S; Qiu, Xiangyun; Kiang, Ching-Hwa

    2015-02-25

    Quantifying the basic intra- and inter-molecular forces of DNA has helped us to better understand and further predict the behavior of DNA. Single-molecule techniques elucidate the mechanics of DNA under applied external forces, sometimes under extreme forces. On the other hand, ensemble studies of DNA molecular forces allow us to extend our understanding of DNA molecules under other forces, such as electrostatic and hydration forces. Using a variety of techniques, we can gain a comprehensive understanding of DNA molecular forces, which is crucial in unraveling the complex DNA functions in living cells as well as in designing systems that utilize the unique properties of DNA in nanotechnology.

  16. Molecular simulations of a CO2/CO mixture in MIL-127

    NASA Astrophysics Data System (ADS)

    Chokbunpiam, Tatiya; Fritzsche, Siegfried; Parasuk, Vudhichai; Caro, Jürgen; Assabumrungrat, Suttichai

    2018-03-01

    Adsorption and diffusion of an equimolar feed mixture of CO2 and CO in MIL-127 at three different temperatures and pressures up to 12 bar were investigated by molecular simulations. The adsorption was simulated using Gibbs-Ensemble Monte Carlo (GEMC). The structure of the adsorbed phase and the diffusion in the MIL were investigated using Molecular Dynamics (MD) simulations. The adsorption selectivity of MIL-127 for CO2 over CO at 233 K was about 15. When combining adsorption and diffusion selectivities, a membrane selectivity of about 12 is predicted. For higher temperatures, both adsorption and diffusion selectivity are found to be smaller.

  17. Effects of ensemble and summary displays on interpretations of geospatial uncertainty data.

    PubMed

    Padilla, Lace M; Ruginski, Ian T; Creem-Regehr, Sarah H

    2017-01-01

    Ensemble and summary displays are two widely used methods to represent visual-spatial uncertainty; however, there is disagreement about which is the most effective technique to communicate uncertainty to the general public. Visualization scientists create ensemble displays by plotting multiple data points on the same Cartesian coordinate plane. Despite their use in scientific practice, it is more common in public presentations to use visualizations of summary displays, which scientists create by plotting statistical parameters of the ensemble members. While prior work has demonstrated that viewers make different decisions when viewing summary and ensemble displays, it is unclear what components of the displays lead to diverging judgments. This study aims to compare the salience of visual features - or visual elements that attract bottom-up attention - as one possible source of diverging judgments made with ensemble and summary displays in the context of hurricane track forecasts. We report that salient visual features of both ensemble and summary displays influence participant judgment. Specifically, we find that salient features of summary displays of geospatial uncertainty can be misunderstood as displaying size information. Further, salient features of ensemble displays evoke judgments that are indicative of accurate interpretations of the underlying probability distribution of the ensemble data. However, when participants use ensemble displays to make point-based judgments, they may overweight individual ensemble members in their decision-making process. We propose that ensemble displays are a promising alternative to summary displays in a geospatial context but that decisions about visualization methods should be informed by the viewer's task.

  18. Verifying and Postprocessing the Ensemble Spread-Error Relationship

    NASA Astrophysics Data System (ADS)

    Hopson, Tom; Knievel, Jason; Liu, Yubao; Roux, Gregory; Wu, Wanli

    2013-04-01

    With the increased utilization of ensemble forecasts in weather and hydrologic applications, there is a need to verify their benefit over less expensive deterministic forecasts. One such potential benefit of ensemble systems is their capacity to forecast their own forecast error through the ensemble spread-error relationship. The paper begins by revisiting the limitations of the Pearson correlation alone in assessing this relationship. Next, we introduce two new metrics to consider in assessing the utility of an ensemble's varying dispersion. We argue there are two aspects of an ensemble's dispersion that should be assessed. First, and perhaps more fundamentally: is there enough variability in the ensemble's dispersion to justify the maintenance of an expensive ensemble prediction system (EPS), irrespective of whether the EPS is well-calibrated or not? To diagnose this, the factor that controls the theoretical upper limit of the spread-error correlation can be useful. Secondly, does the variable dispersion of an ensemble relate to variable expectation of forecast error? Representing the spread-error correlation in relation to its theoretical limit can provide a simple diagnostic of this attribute. A context for these concepts is provided by assessing two operational ensembles: 30-member Western US temperature forecasts for the U.S. Army Test and Evaluation Command and 51-member Brahmaputra River flow forecasts of the Climate Forecast and Applications Project for Bangladesh. Both of these systems utilize a postprocessing technique based on quantile regression (QR) under a step-wise forward selection framework leading to ensemble forecasts with both good reliability and sharpness. In addition, the methodology utilizes the ensemble's ability to self-diagnose forecast instability to produce calibrated forecasts with informative skill-spread relationships. We will describe both ensemble systems briefly, review the steps used to calibrate the ensemble forecast, and present verification statistics using error-spread metrics, along with figures from operational ensemble forecasts before and after calibration.
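
As a rough illustration of the spread-error relationship discussed in this abstract, the sketch below computes the Pearson correlation between ensemble spread and the absolute error of the ensemble mean. It is not the authors' metric: the function names, the synthetic data, and the choice of standard deviation as the spread measure are all assumptions made for this example.

```python
import math
import random
import statistics


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spread_error_correlation(ensembles, observations):
    """Correlate ensemble spread with the absolute error of the ensemble mean."""
    spreads = [statistics.stdev(members) for members in ensembles]
    errors = [abs(statistics.fmean(members) - obs)
              for members, obs in zip(ensembles, observations)]
    return pearson(spreads, errors)


# Synthetic toy data (an assumption for illustration): truth and members are
# drawn from the same distribution, whose width changes from case to case.
rng = random.Random(0)
ensembles, observations = [], []
for _ in range(2000):
    sigma = rng.uniform(0.5, 3.0)
    ensembles.append([rng.gauss(0.0, sigma) for _ in range(30)])
    observations.append(rng.gauss(0.0, sigma))

r = spread_error_correlation(ensembles, observations)
print(f"spread-error correlation: {r:.2f}")
```

In this toy setup the ensemble is calibrated by construction, yet the correlation stays well below 1, which is one way to see why the raw Pearson value alone can be hard to interpret.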

  19. Analog-Based Postprocessing of Navigation-Related Hydrological Ensemble Forecasts

    NASA Astrophysics Data System (ADS)

    Hemri, S.; Klein, B.

    2017-11-01

    Inland waterway transport benefits from probabilistic forecasts of water levels because they allow the ship load to be optimized and, hence, transport costs to be minimized. Probabilistic state-of-the-art hydrologic ensemble forecasts inherit biases and dispersion errors from the atmospheric ensemble forecasts they are driven with. The use of statistical postprocessing techniques like ensemble model output statistics (EMOS) allows for a reduction of these systematic errors by fitting a statistical model based on training data. In this study, training periods for EMOS are selected based on forecast analogs, i.e., historical forecasts that are similar to the forecast to be verified. Due to the strong autocorrelation of water levels, forecast analogs have to be selected based on entire forecast hydrographs in order to guarantee similar hydrograph shapes. Custom-tailored measures of similarity for forecast hydrographs comprise hydrological series distance (SD), the hydrological matching algorithm (HMA), and dynamic time warping (DTW). Verification against observations reveals that EMOS forecasts for water level at three gauges along the river Rhine with training periods selected based on SD, HMA, and DTW compare favorably with reference EMOS forecasts, which are based on either seasonal training periods or on training periods obtained by dividing the hydrological forecast trajectories into runoff regimes.

  20. An evaluation of soil water outlooks for winter wheat in south-eastern Australia

    NASA Astrophysics Data System (ADS)

    Western, A. W.; Dassanayake, K. B.; Perera, K. C.; Alves, O.; Young, G.; Argent, R.

    2015-12-01

    Abstract: Soil moisture is a key limiting resource for rain-fed cropping in Australian broad-acre cropping zones. Seasonal rainfall and temperature outlooks are standard operational services offered by the Australian Bureau of Meteorology and are routinely used to support agricultural decisions. This presentation examines the performance of proposed soil water seasonal outlooks in the context of wheat cropping in south-eastern Australia (autumn planting, late spring harvest). We used weather ensembles simulated by the Predictive Ocean-Atmosphere Model for Australia (POAMA) as input to the Agricultural Production Simulator (APSIM) to construct ensemble soil water "outlooks" at twenty sites. Hindcasts were made over a 33-year period using the 33 POAMA ensemble members. The overall modelling flow involved: 1. Downscaling of the daily weather series (rainfall, minimum and maximum temperature, humidity, radiation) from the ~250 km POAMA grid scale to a local weather station using quantile-quantile correction. This was based on a 33-year observation record extracted from the SILO data drill product. 2. Using APSIM to produce soil water ensembles from the downscaled weather ensembles. A warm-up period of 5 years of observed weather was followed by a 9-month hindcast period based on each ensemble member. 3. The soil water ensembles were summarized by estimating the proportion of outlook ensembles in each climatological tercile, where the climatology was constructed using APSIM and observed weather from the 33 years of hindcasts at the relevant site. 4. The soil water outlooks were evaluated for different lead times and months using a "truth" run of APSIM based on observed weather. Outlooks generally have some useful forecast skill for lead times of up to two to three months, except in late spring, in line with current useful lead times for rainfall outlooks. Better performance was found in summer and autumn, when vegetation cover and water use are low.
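
Step 3 of the modelling flow above (the proportion of ensemble members in each climatological tercile) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the sample numbers, and the use of Python's default quantile method are assumptions.

```python
import statistics


def tercile_proportions(climatology, ensemble):
    """Fraction of ensemble members in the lower, middle and upper
    climatological tercile."""
    # Two cut points split the climatology into three equally likely bins.
    lower_cut, upper_cut = statistics.quantiles(climatology, n=3)
    below = sum(1 for v in ensemble if v < lower_cut)
    above = sum(1 for v in ensemble if v > upper_cut)
    middle = len(ensemble) - below - above
    n = len(ensemble)
    return below / n, middle / n, above / n


# Hypothetical numbers: a 33-value climatology and a 6-member outlook.
climatology = list(range(1, 34))
outlook = [5, 10, 12, 20, 25, 30]
proportions = tercile_proportions(climatology, outlook)
print(proportions)
```

An outlook that matches climatology puts roughly one third of its members in each bin; a shift of the proportions away from one third is what carries the forecast signal.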

  1. Ensemble data assimilation in the Red Sea: sensitivity to ensemble selection and atmospheric forcing

    NASA Astrophysics Data System (ADS)

    Toye, Habib; Zhan, Peng; Gopalakrishnan, Ganesh; Kartadikaria, Aditya R.; Huang, Huang; Knio, Omar; Hoteit, Ibrahim

    2017-07-01

    We present our efforts to build an ensemble data assimilation and forecasting system for the Red Sea. The system consists of the high-resolution Massachusetts Institute of Technology general circulation model (MITgcm) to simulate ocean circulation and of the Data Assimilation Research Testbed (DART) for ensemble data assimilation. DART has been configured to integrate all members of an ensemble adjustment Kalman filter (EAKF) in parallel, based on which we adapted the ensemble operations in DART to use an invariant ensemble, i.e., an ensemble Optimal Interpolation (EnOI) algorithm. This approach requires only a single forward model integration in the forecast step and therefore saves substantial computational cost. To deal with the strong seasonal variability of the Red Sea, the EnOI ensemble is then seasonally selected from a climatology of long-term model outputs. Observations of remote sensing sea surface height (SSH) and sea surface temperature (SST) are assimilated every 3 days. Real-time atmospheric fields from the National Centers for Environmental Prediction (NCEP) and the European Centre for Medium-Range Weather Forecasts (ECMWF) are used as forcing in different assimilation experiments. We investigate the behaviors of the EAKF and (seasonal-) EnOI and compare their performances for assimilating and forecasting the circulation of the Red Sea. We further assess the sensitivity of the assimilation system to various filtering parameters (ensemble size, inflation) and atmospheric forcing.

  2. Elucidating Ligand-Modulated Conformational Landscape of GPCRs Using Cloud-Computing Approaches.

    PubMed

    Shukla, Diwakar; Lawrenz, Morgan; Pande, Vijay S

    2015-01-01

    G-protein-coupled receptors (GPCRs) are a versatile family of membrane-bound signaling proteins. Despite the recent successes in obtaining crystal structures of GPCRs, much needs to be learned about the conformational changes associated with their activation. Furthermore, the mechanism by which ligands modulate the activation of GPCRs has remained elusive. Molecular simulations provide a way of obtaining a detailed atomistic description of GPCR activation dynamics. However, simulating GPCR activation is challenging due to the long timescales involved and the associated challenge of gaining insights from the "Big" simulation datasets. Here, we demonstrate how cloud-computing approaches have been used to tackle these challenges and obtain insights into the activation mechanism of GPCRs. In particular, we review the use of Markov state model (MSM)-based sampling algorithms for sampling milliseconds of dynamics of a major drug target, the G-protein-coupled receptor β2-AR. MSMs of agonist and inverse agonist-bound β2-AR reveal multiple activation pathways and how ligands function via modulation of the ensemble of activation pathways. We target this ensemble of conformations with computer-aided drug design approaches, with the goal of designing drugs that interact more closely with diverse receptor states, for overall increased efficacy and specificity. We conclude by discussing how cloud-based approaches present a powerful and broadly available tool for routinely studying complex biological systems. © 2015 Elsevier Inc. All rights reserved.

  3. The FTMap family of web servers for determining and characterizing ligand binding hot spots of proteins

    PubMed Central

    Kozakov, Dima; Grove, Laurie E.; Hall, David R.; Bohnuud, Tanggis; Mottarella, Scott; Luo, Lingqi; Xia, Bing; Beglov, Dmitri; Vajda, Sandor

    2016-01-01

    FTMap is a computational mapping server that identifies binding hot spots of macromolecules, i.e., regions of the surface with major contributions to the ligand binding free energy. To use FTMap, users submit a protein, DNA, or RNA structure in PDB format. FTMap samples billions of positions of small organic molecules used as probes and scores the probe poses using a detailed energy expression. Regions that bind clusters of multiple probe types identify the binding hot spots, in good agreement with experimental data. FTMap serves as the basis for other servers, namely FTSite to predict ligand binding sites, FTFlex to account for side chain flexibility, FTMap/param to parameterize additional probes, and FTDyn to map ensembles of protein structures. Applications include determining druggability of proteins, identifying ligand moieties that are most important for binding, finding the most bound-like conformation in ensembles of unliganded protein structures, and providing input for fragment-based drug design. FTMap is more accurate than classical mapping methods such as GRID and MCSS, and is much faster than the more recent approaches to protein mapping based on mixed molecular dynamics. Using 16 probe molecules, the FTMap server finds the hot spots of an average-size protein in less than an hour. Since FTFlex performs mapping for all low-energy conformers of side chains in the binding site, its completion time is proportionately longer. PMID:25855957

  4. Adaptive correction of ensemble forecasts

    NASA Astrophysics Data System (ADS)

    Pelosi, Anna; Battista Chirico, Giovanni; Van den Bergh, Joris; Vannitsem, Stephane

    2017-04-01

    Forecasts from numerical weather prediction (NWP) models often suffer from both systematic and non-systematic errors. These are present in both deterministic and ensemble forecasts, and originate from various sources such as model error and subgrid variability. Statistical post-processing techniques can partly remove such errors, which is particularly important when NWP outputs concerning surface weather variables are employed for site-specific applications. Many different post-processing techniques have been developed. For deterministic forecasts, adaptive methods such as the Kalman filter are often used, which sequentially post-process the forecasts by continuously updating the correction parameters as new ground observations become available. These methods are especially valuable when long training data sets do not exist. For ensemble forecasts, well-known techniques are ensemble model output statistics (EMOS) and so-called "member-by-member" approaches (MBM). Here, we introduce a new adaptive post-processing technique for ensemble predictions. The proposed method is a sequential Kalman filtering technique that fully exploits the information content of the ensemble. One correction equation is retrieved and applied to all members; however, the parameters of the regression equations are retrieved by exploiting the second-order statistics of the forecast ensemble. We compare our new method with two other techniques: a simple method that makes use of a running bias correction of the ensemble mean, and an MBM post-processing approach that rescales the ensemble mean and spread, based on minimization of the Continuous Ranked Probability Score (CRPS). We perform a verification study for the region of Campania in southern Italy. We use two years (2014-2015) of daily meteorological observations of 2-meter temperature and 10-meter wind speed from 18 ground-based automatic weather stations distributed across the region, comparing them with the corresponding COSMO-LEPS ensemble forecasts. Deterministic verification scores (e.g., mean absolute error, bias) and probabilistic scores (e.g., CRPS) are used to evaluate the post-processing techniques. We conclude that the new adaptive method outperforms the simpler running bias correction. The proposed adaptive method often outperforms the MBM method in removing bias. The MBM method has the advantage of correcting the ensemble spread, although it needs more training data.
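
The simplest of the three techniques compared in this abstract, a running bias correction of the ensemble mean, might look roughly like the sketch below. The window length, the function name, and the toy data are assumptions made for illustration; the abstract does not specify how the running bias is computed, and this is not the authors' Kalman filter method.

```python
import statistics
from collections import deque


def running_bias_correct(forecasts, observations, window=10):
    """Shift every member by the mean error of the ensemble mean over the
    last `window` days. Day t uses only errors from days before t."""
    past_errors = deque(maxlen=window)
    corrected = []
    for members, obs in zip(forecasts, observations):
        bias = statistics.fmean(past_errors) if past_errors else 0.0
        corrected.append([m - bias for m in members])
        # The observation becomes available only after the forecast is issued.
        past_errors.append(statistics.fmean(members) - obs)
    return corrected


# Hypothetical data: a 3-member ensemble with a constant warm bias of 2.0.
observations = [10.0, 11.0, 12.0, 13.0]
forecasts = [[o + 1.0, o + 2.0, o + 3.0] for o in observations]
corrected = running_bias_correct(forecasts, observations, window=3)
print(corrected)
```

Because the same shift is applied to all members, the ensemble spread is left unchanged, which is consistent with the abstract's remark that correcting the spread is an advantage of the MBM approach.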

  5. Gold nanoclusters-Cu(2+) ensemble-based fluorescence turn-on and real-time assay for acetylcholinesterase activity and inhibitor screening.

    PubMed

    Sun, Jian; Yang, Xiurong

    2015-12-15

    Based on the specific binding of Cu(2+) ions to the 11-mercaptoundecanoic acid (11-MUA)-protected AuNCs with intense orange-red emission, we have proposed and constructed a novel fluorescent nanomaterials-metal ions ensemble at a nonfluorescence off-state. Subsequently, an AuNCs@11-MUA-Cu(2+) ensemble-based fluorescent chemosensor, which is amenable to convenient, sensitive, selective, turn-on and real-time assay of acetylcholinesterase (AChE), could be developed by using acetylthiocholine (ATCh) as the substrate. Herein, the sensing ensemble solution exhibits a remarkable fluorescence enhancement in the presence of AChE and ATCh, where AChE hydrolyzes its active substrate ATCh into thiocholine (TCh), and then TCh captures Cu(2+) from the ensemble, accompanied by the conversion from fluorescence off-state to on-state of the AuNCs. AChE activity could be detected at levels below 0.05 mU/mL, with a good linear range from 0.05 to 2.5 mU/mL. Our proposed fluorescence assay can be utilized to evaluate the AChE activity quantitatively in real biological samples, and furthermore to screen inhibitors of AChE. As far as we know, the present study has reported the first analytical proposal for sensing AChE activity in real time by using a fluorescent nanomaterials-Cu(2+) ensemble or focusing on the Cu(2+)-triggered fluorescence quenching/recovery. This strategy paves a new avenue for exploring the biosensing applications of fluorescent AuNCs, and presents the prospect of the AuNCs@11-MUA-Cu(2+) ensemble as a versatile enzyme activity assay platform by means of other appropriate substrates/analytes. Copyright © 2015 Elsevier B.V. All rights reserved.

  6. Evaluating Alignment of Shapes by Ensemble Visualization

    PubMed Central

    Raj, Mukund; Mirzargar, Mahsa; Preston, J. Samuel; Kirby, Robert M.; Whitaker, Ross T.

    2016-01-01

    The visualization of variability in surfaces embedded in 3D, which is a type of ensemble uncertainty visualization, provides a means of understanding the underlying distribution of a collection or ensemble of surfaces. Although ensemble visualization for isosurfaces has been described in the literature, we conduct an expert-based evaluation of various ensemble visualization techniques in a particular medical imaging application: the construction of atlases or templates from a population of images. In this work, we extend contour boxplot to 3D, allowing us to evaluate it against an enumeration-style visualization of the ensemble members and other conventional visualizations used by atlas builders, namely examining the atlas image and the corresponding images/data provided as part of the construction process. We present feedback from domain experts on the efficacy of contour boxplot compared to other modalities when used as part of the atlas construction and analysis stages of their work. PMID:26186768

  7. A study of fuzzy logic ensemble system performance on face recognition problem

    NASA Astrophysics Data System (ADS)

    Polyakova, A.; Lipinskiy, L.

    2017-02-01

    Some problems are difficult to solve using a single intelligent information technology (IIT). An ensemble of data mining (DM) techniques is a set of models, each of which is able to solve the problem on its own, but whose combination increases the efficiency of the system as a whole. Using IIT ensembles can improve the reliability and efficiency of the final decision, since an ensemble exploits the diversity of its components. A new method of intelligent information technology ensemble design is considered in this paper. It is based on fuzzy logic and is designed to solve classification and regression problems. The ensemble consists of several data mining algorithms: artificial neural network, support vector machine and decision trees. These algorithms and their ensemble have been tested on face recognition problems. Principal component analysis (PCA) is used for feature selection.

  8. The Road to DLCZ Protocol in Rubidium Ensemble

    NASA Astrophysics Data System (ADS)

    Li, Chang; Pu, Yunfei; Jiang, Nan; Chang, Wei; Zhang, Sheng; Center for Quantum Information, Institute for Interdisciplinary Information Sciences, Tsinghua Univ Team

    2017-04-01

    Quantum communication is a powerful approach to achieving fully secure information transfer. The DLCZ protocol ensures that the photon decays only linearly as the transfer distance increases, which improves the probability of success and shortens the time needed to build up an entangled channel. Apart from that, it provides an advanced idea for building up a quantum internet based on different nodes connected to different sites and to each other. In our laboratory, three sets of laser-cooled Rubidium 87 ensembles have been built. Two of them serve as single-photon emitters, which generate entanglement between the ensemble and a photon. In addition, crossed AODs are used to multiplex and demultiplex the optical circuit so that the ensemble is divided into two hundred 2D sub-memory cells. The third ensemble is used for quantum telecommunication, converting 780 nm photons into telecom-wavelength ones. We have also been building a double-MOT system, which provides more atoms in the ensemble and a larger optical density.

  9. A gold cyano complex in nitromethane: MD simulation and X-ray diffraction.

    PubMed

    Probst, Michael; Injan, Natcha; Megyes, Tünde; Bako, Imre; Balint, Szabolcz; Limtrakul, Jumras; Nazmutdinov, Renat; Mitev, Pavlin D; Hermansson, Kersti

    2012-06-29

    The solvation structure around the dicyanoaurate(I) anion (Au(CN)2−) in a dilute nitromethane (CH3NO2) solution is presented from X-ray diffraction measurements and molecular dynamics simulation (NVT ensemble, 460 nitromethane molecules at room temperature). The simulations are based on a new solute-solvent force-field fitted to a training set of quantum-chemically derived interaction energies. Radial distribution functions from experiment and simulation are in good agreement. The solvation structure has been further elucidated from MD data. Several shells can be identified. We obtain a solvation number of 13-17 nitromethane molecules with a strong preference to be oriented with their methyl groups towards the solute.

  10. A gold cyano complex in nitromethane: MD simulation and X-ray diffraction

    PubMed Central

    Probst, Michael; Injan, Natcha; Megyes, Tünde; Bako, Imre; Balint, Szabolcz; Limtrakul, Jumras; Nazmutdinov, Renat; Mitev, Pavlin D.; Hermansson, Kersti

    2012-01-01

    The solvation structure around the dicyanoaurate(I) anion (Au(CN)2−) in a dilute nitromethane (CH3NO2) solution is presented from X-ray diffraction measurements and molecular dynamics simulation (NVT ensemble, 460 nitromethane molecules at room temperature). The simulations are based on a new solute–solvent force-field fitted to a training set of quantum-chemically derived interaction energies. Radial distribution functions from experiment and simulation are in good agreement. The solvation structure has been further elucidated from MD data. Several shells can be identified. We obtain a solvation number of 13–17 nitromethane molecules with a strong preference to be oriented with their methyl groups towards the solute. PMID:25540462

  11. Correlated electron-nuclear dynamics with conditional wave functions.

    PubMed

    Albareda, Guillermo; Appel, Heiko; Franco, Ignacio; Abedi, Ali; Rubio, Angel

    2014-08-22

    The molecular Schrödinger equation is rewritten in terms of nonunitary equations of motion for the nuclei (or electrons) that depend parametrically on the configuration of an ensemble of generally defined electronic (or nuclear) trajectories. This scheme is exact and does not rely on the tracing out of degrees of freedom. Hence, the use of trajectory-based statistical techniques can be exploited to circumvent the calculation of the computationally demanding Born-Oppenheimer potential-energy surfaces and nonadiabatic coupling elements. The concept of the potential-energy surface is restored by establishing a formal connection with the exact factorization of the full wave function. This connection is used to gain insight from a simplified form of the exact propagation scheme.

  12. Density distribution function of a self-gravitating isothermal compressible turbulent fluid in the context of molecular clouds ensembles

    NASA Astrophysics Data System (ADS)

    Donkov, Sava; Stefanov, Ivan Z.

    2018-03-01

    We have set ourselves the task of obtaining the probability distribution function of the mass density of a self-gravitating isothermal compressible turbulent fluid from its physics. We have done this in the context of a new notion: the molecular clouds ensemble. We have applied a new approach that takes into account the fractal nature of the fluid. Using the medium equations, under the assumption of steady state, we show that the total energy per unit mass is an invariant with respect to the fractal scales. As a next step we obtain a non-linear integral equation for the dimensionless scale Q which is the third root of the integral of the probability distribution function. It is solved approximately up to the leading-order term in the series expansion. We obtain two solutions. They are power-law distributions with different slopes: the first one is -1.5 at low densities, corresponding to an equilibrium between all energies at a given scale, and the second one is -2 at high densities, corresponding to a free fall at small scales.

  13. A virtual-system coupled multicanonical molecular dynamics simulation: Principles and applications to free-energy landscape of protein-protein interaction with an all-atom model in explicit solvent

    NASA Astrophysics Data System (ADS)

    Higo, Junichi; Umezawa, Koji; Nakamura, Haruki

    2013-05-01

    We propose a novel generalized ensemble method, a virtual-system coupled multicanonical molecular dynamics (V-McMD), to enhance conformational sampling of biomolecules expressed by an all-atom model in an explicit solvent. In this method, a virtual system, of which physical quantities can be set arbitrarily, is coupled with the biomolecular system, which is the target to be studied. This method was applied to a system of an Endothelin-1 derivative, KR-CSH-ET1, known to form an antisymmetric homodimer at room temperature. V-McMD was performed starting from a configuration in which two KR-CSH-ET1 molecules were mutually distant in an explicit solvent. The lowest free-energy state (the most thermally stable state) at room temperature coincides with the experimentally determined native complex structure. This state was separated from other non-native minor clusters by a free-energy barrier, although the barrier disappeared with elevated temperature. V-McMD produced a canonical ensemble faster than a conventional McMD method.

  14. Optimal control theory with continuously distributed target states: An application to NaK

    NASA Astrophysics Data System (ADS)

    Kaiser, Andreas; May, Volkhard

    2006-01-01

    Laser pulse control of molecular dynamics is studied theoretically by using optimal control theory. The control theory is extended to target states which are distributed in time as well as in a space of parameters which are responsible for a change of individual molecular properties. This generalized treatment of a control task is first applied to wave packet formation in randomly oriented diatomic systems. For an ensemble of NaK molecules which are not aligned, the control yield decreases drastically when compared with an aligned ensemble. Second, we demonstrate for NaK the maximization of the probe pulse transient absorption in a pump-probe scheme with an optimized pump pulse. These computations suggest an overall optical control scheme, whereby a flexible technique is suggested to form particular wave packets in the excited state potential energy surface. In particular, it is shown that considerable wave packet localization at the turning points of the first-excited Σ-state potential energy surfaces of NaK may be achieved. The dependency of the control yield on the probe pulse parameters is also discussed.

  15. Stresses and elastic constants of crystalline sodium, from molecular dynamics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Schiferl, S.K.

    1985-02-01

    The stresses and the elastic constants of bcc sodium are calculated by molecular dynamics (MD) for temperatures up to T = 340 K. The total adiabatic potential of a system of sodium atoms is represented by a pseudopotential model. The resulting expression has two terms: a large, strictly volume-dependent potential, plus a sum over ion pairs of a small, volume-dependent two-body potential. The stresses and the elastic constants are given as strain derivatives of the Helmholtz free energy. The resulting expressions involve canonical ensemble averages (and fluctuation averages) of the position and volume derivatives of the potential. An ensemble correction relates the results to MD equilibrium averages. Evaluation of the potential and its derivatives requires the calculation of integrals with infinite upper limits of integration, and integrand singularities. Methods for calculating these integrals and estimating the effects of integration errors are developed. A method is given for choosing initial conditions that relax quickly to a desired equilibrium state. Statistical methods developed earlier for MD data are extended to evaluate uncertainties in fluctuation averages, and to test for symmetry. 45 refs., 10 figs., 4 tabs.

  16. Kinetic rate constant prediction supports the conformational selection mechanism of protein binding.

    PubMed

    Moal, Iain H; Bates, Paul A

    2012-01-01

    The prediction of protein-protein kinetic rate constants provides a fundamental test of our understanding of molecular recognition, and will play an important role in the modeling of complex biological systems. In this paper, a feature selection and regression algorithm is applied to mine a large set of molecular descriptors and construct simple models for association and dissociation rate constants using empirical data. Using separate test data for validation, the predicted rate constants can be combined to calculate binding affinity with accuracy matching that of state-of-the-art empirical free energy functions. The models show that the rate of association is linearly related to the proportion of unbound proteins in the bound conformational ensemble relative to the unbound conformational ensemble, indicating that the binding partners must adopt a geometry near to that of the bound state prior to binding. Mirroring the conformational selection and population shift mechanism of protein binding, the models provide a strong separate line of evidence for the preponderance of this mechanism in protein-protein binding, complementing structural and theoretical studies.

  17. Modality-Driven Classification and Visualization of Ensemble Variance

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bensema, Kevin; Gosink, Luke; Obermaier, Harald

    Advances in computational power now enable domain scientists to address conceptual and parametric uncertainty by running simulations multiple times in order to sufficiently sample the uncertain input space. While this approach helps address conceptual and parametric uncertainties, the ensemble datasets produced by this technique present a special challenge to visualization researchers as the ensemble dataset records a distribution of possible values for each location in the domain. Contemporary visualization approaches that rely solely on summary statistics (e.g., mean and variance) cannot convey the detailed information encoded in ensemble distributions that is paramount to ensemble analysis; summary statistics provide no information about modality classification and modality persistence. To address this problem, we propose a novel technique that classifies high-variance locations based on the modality of the distribution of ensemble predictions. Additionally, we develop a set of confidence metrics to inform the end-user of the quality of fit between the distribution at a given location and its assigned class. We apply a similar method to time-varying ensembles to illustrate the relationship between peak variance and bimodal or multimodal behavior. These classification schemes enable a deeper understanding of the behavior of the ensemble members by distinguishing between distributions that can be described by a single tendency and distributions which reflect divergent trends in the ensemble.

  18. On the Likely Utility of Hybrid Weights Optimized for Variances in Hybrid Error Covariance Models

    NASA Astrophysics Data System (ADS)

    Satterfield, E.; Hodyss, D.; Kuhl, D.; Bishop, C. H.

    2017-12-01

    Because of imperfections in ensemble data assimilation schemes, one cannot assume that the ensemble covariance is equal to the true error covariance of a forecast. Previous work demonstrated how information about the distribution of true error variances given an ensemble sample variance can be revealed from an archive of (observation-minus-forecast, ensemble-variance) data pairs. Here, we derive a simple and intuitively compelling formula to obtain the mean of this distribution of true error variances given an ensemble sample variance from (observation-minus-forecast, ensemble-variance) data pairs produced by a single run of a data assimilation system. This formula takes the form of a Hybrid weighted average of the climatological forecast error variance and the ensemble sample variance. Here, we test the extent to which these readily obtainable weights can be used to rapidly optimize the covariance weights used in Hybrid data assimilation systems that employ weighted averages of static covariance models and flow-dependent ensemble based covariance models. Univariate data assimilation and multi-variate cycling ensemble data assimilation are considered. In both cases, it is found that our computationally efficient formula gives Hybrid weights that closely approximate the optimal weights found through the simple but computationally expensive process of testing every plausible combination of weights.
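
To make the "weighted average" and "testing every plausible combination of weights" in this abstract concrete, here is a toy scalar sketch. It is not the authors' formula: it assumes the two weights sum to one, treats squared errors as direct estimates of the true forecast error variance (ignoring observation error), and uses hypothetical names and data throughout.

```python
def hybrid_variance(weight, clim_var, ens_var):
    """Weighted average of a static (climatological) and an ensemble variance."""
    return weight * clim_var + (1.0 - weight) * ens_var


def brute_force_weight(sq_errors, ens_vars, clim_var, steps=100):
    """Try every weight on a regular grid in [0, 1] and keep the one whose
    hybrid variance best matches the squared errors (least squares)."""
    best_w, best_loss = 0.0, float("inf")
    for i in range(steps + 1):
        w = i / steps
        loss = sum((hybrid_variance(w, clim_var, s) - e) ** 2
                   for s, e in zip(ens_vars, sq_errors))
        if loss < best_loss:
            best_w, best_loss = w, loss
    return best_w


# Hypothetical archive, built so that the "true" climatological weight is 0.3.
clim_var = 4.0
ens_vars = [1.0, 2.5, 3.0, 5.0, 6.5, 8.0]
sq_errors = [hybrid_variance(0.3, clim_var, s) for s in ens_vars]
best = brute_force_weight(sq_errors, ens_vars, clim_var)
print(best)
```

In a real system each candidate weight would require rerunning the data assimilation cycle, which is the expense that a closed-form estimate of the weights is meant to avoid.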

  19. Next generation extended Lagrangian first principles molecular dynamics

    NASA Astrophysics Data System (ADS)

    Niklasson, Anders M. N.

    2017-08-01

    Extended Lagrangian Born-Oppenheimer molecular dynamics [A. M. N. Niklasson, Phys. Rev. Lett. 100, 123004 (2008)] is formulated for general Hohenberg-Kohn density-functional theory and compared with the extended Lagrangian framework of first principles molecular dynamics by Car and Parrinello [Phys. Rev. Lett. 55, 2471 (1985)]. It is shown how extended Lagrangian Born-Oppenheimer molecular dynamics overcomes several shortcomings of regular, direct Born-Oppenheimer molecular dynamics, while improving or maintaining important features of Car-Parrinello simulations. The accuracy of the electronic degrees of freedom in extended Lagrangian Born-Oppenheimer molecular dynamics, with respect to the exact Born-Oppenheimer solution, is of second-order in the size of the integration time step and of fourth order in the potential energy surface. Improved stability over recent formulations of extended Lagrangian Born-Oppenheimer molecular dynamics is achieved by generalizing the theory to finite temperature ensembles, using fractional occupation numbers in the calculation of the inner-product kernel of the extended harmonic oscillator that appears as a preconditioner in the electronic equations of motion. Material systems that normally exhibit slow self-consistent field convergence can be simulated using integration time steps of the same order as in direct Born-Oppenheimer molecular dynamics, but without the requirement of an iterative, non-linear electronic ground-state optimization prior to the force evaluations and without a systematic drift in the total energy. In combination with proposed low-rank and on the fly updates of the kernel, this formulation provides an efficient and general framework for quantum-based Born-Oppenheimer molecular dynamics simulations.

  20. Next generation extended Lagrangian first principles molecular dynamics.

    PubMed

    Niklasson, Anders M N

    2017-08-07

    Extended Lagrangian Born-Oppenheimer molecular dynamics [A. M. N. Niklasson, Phys. Rev. Lett. 100, 123004 (2008)] is formulated for general Hohenberg-Kohn density-functional theory and compared with the extended Lagrangian framework of first principles molecular dynamics by Car and Parrinello [Phys. Rev. Lett. 55, 2471 (1985)]. It is shown how extended Lagrangian Born-Oppenheimer molecular dynamics overcomes several shortcomings of regular, direct Born-Oppenheimer molecular dynamics, while improving or maintaining important features of Car-Parrinello simulations. The accuracy of the electronic degrees of freedom in extended Lagrangian Born-Oppenheimer molecular dynamics, with respect to the exact Born-Oppenheimer solution, is of second-order in the size of the integration time step and of fourth order in the potential energy surface. Improved stability over recent formulations of extended Lagrangian Born-Oppenheimer molecular dynamics is achieved by generalizing the theory to finite temperature ensembles, using fractional occupation numbers in the calculation of the inner-product kernel of the extended harmonic oscillator that appears as a preconditioner in the electronic equations of motion. Material systems that normally exhibit slow self-consistent field convergence can be simulated using integration time steps of the same order as in direct Born-Oppenheimer molecular dynamics, but without the requirement of an iterative, non-linear electronic ground-state optimization prior to the force evaluations and without a systematic drift in the total energy. In combination with proposed low-rank and on the fly updates of the kernel, this formulation provides an efficient and general framework for quantum-based Born-Oppenheimer molecular dynamics simulations.

  1. Improving consensus structure by eliminating averaging artifacts

    PubMed Central

    KC, Dukka B

    2009-01-01

    Background Common structural biology methods (i.e., NMR and molecular dynamics) often produce ensembles of molecular structures. Consequently, averaging of 3D coordinates of molecular structures (proteins and RNA) is a frequent approach to obtain a consensus structure that is representative of the ensemble. However, when the structures are averaged, artifacts can result in unrealistic local geometries, including unphysical bond lengths and angles. Results Herein, we describe a method to derive representative structures while limiting the number of artifacts. Our approach is based on a Monte Carlo simulation technique that drives a starting structure (an extended or a 'close-by' structure) towards the 'averaged structure' using a harmonic pseudo energy function. To assess the performance of the algorithm, we applied our approach to Cα models of 1364 proteins generated by the TASSER structure prediction algorithm. The average RMSD of the refined model from the native structure for the set becomes worse by a mere 0.08 Å compared to the average RMSD of the averaged structures from the native structure (3.28 Å for refined structures and 3.36 Å for the averaged structures). However, the percentage of atoms involved in clashes is greatly reduced (from 63% to 1%); in fact, the majority of the refined proteins had zero clashes. Moreover, a small number (38) of refined structures resulted in lower RMSD to the native protein versus the averaged structure. Finally, compared to PULCHRA [1], our approach produces representative structure of similar RMSD quality, but with much fewer clashes. Conclusion The benchmarking results demonstrate that our approach for removing averaging artifacts can be very beneficial for the structural biology community. Furthermore, the same approach can be applied to almost any problem where averaging of 3D coordinates is performed. Namely, structure averaging is also commonly performed in RNA secondary prediction [2], which could also benefit from our approach. PMID:19267905
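
    The averaging artifact described above is easy to reproduce on a toy case. The sketch below averages two made-up two-atom models whose bond points in different directions and shows that the averaged bond is shorter than either input bond. The harmonic pseudo energy is written in an assumed form (a spring constant times the summed squared distance to the averaged coordinates); the abstract does not give the exact function or constant used by the authors.

```python
import math


def average_structures(structures):
    """Coordinate-wise mean of several models with identical atom order."""
    n_models = len(structures)
    n_atoms = len(structures[0])
    return [
        tuple(sum(s[i][k] for s in structures) / n_models for k in range(3))
        for i in range(n_atoms)
    ]


def bond_lengths(coords):
    """Distances between consecutive atoms in a trace."""
    return [math.dist(a, b) for a, b in zip(coords, coords[1:])]


def harmonic_pseudo_energy(coords, target, k=1.0):
    """Assumed form: k * sum of squared distances to the target coordinates."""
    return k * sum(math.dist(a, b) ** 2 for a, b in zip(coords, target))


# Two hypothetical models with a 3.8 Å bond pointing along x and along y.
model_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
model_b = [(0.0, 0.0, 0.0), (0.0, 3.8, 0.0)]
averaged = average_structures([model_a, model_b])
averaged_bond = bond_lengths(averaged)[0]  # shorter than 3.8 Å
```

    A Monte Carlo refinement of the kind the abstract describes would accept or reject moves of a physically sensible starting structure using an energy like this one, so that the result approaches the average without inheriting its distorted geometry.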

  2. Cluster ensemble based on Random Forests for genetic data.

    PubMed

    Alhusain, Luluah; Hafez, Alaaeldin M

    2017-01-01

    Clustering plays a crucial role in several application domains, such as bioinformatics. In bioinformatics, clustering has been extensively used as an approach for detecting interesting patterns in genetic data. One application is population structure analysis, which aims to group individuals into subpopulations based on shared genetic variations, such as single nucleotide polymorphisms. Advances in DNA sequencing technology have made it possible to obtain genetic datasets of exceptional size. Genetic data usually contain hundreds of thousands of genetic markers genotyped for thousands of individuals, making an efficient means for handling such data desirable. Random Forests (RFs) has emerged as an efficient algorithm capable of handling high-dimensional data. RFs provides a proximity measure that can capture different levels of co-occurring relationships between variables. RFs has been widely considered a supervised learning method, although it can be converted into an unsupervised learning method. Therefore, an RF-derived proximity measure combined with a clustering technique may be well suited for determining the underlying structure of unlabeled data. This paper proposes RFcluE, a cluster ensemble approach for determining the underlying structure of genetic data based on RFs. The approach comprises a cluster ensemble framework to combine multiple runs of RF clustering. Experiments were conducted on a high-dimensional, real genetic dataset to evaluate the proposed approach. The experiments included an examination of the impact of parameter changes, a comparison of RFcluE performance against other clustering methods, and an assessment of the relationship between the diversity and quality of the ensemble and its effect on RFcluE performance. RFcluE addresses the problem of population structure analysis, and the experiments demonstrate the effectiveness of the approach. The paper also illustrates that applying a cluster ensemble approach, combining multiple RF clusterings, produces more robust and higher-quality results as a consequence of feeding the ensemble with diverse views of high-dimensional genetic data obtained through bagging and random subspace, the two key features of the RF algorithm.
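
    Combining multiple clustering runs can be sketched with a co-association matrix, a common generic cluster ensemble device. This is an assumption made for illustration: the abstract does not state which consensus function RFcluE uses. The function names and the 0.5 threshold are hypothetical.

```python
def co_association(labelings):
    """Fraction of runs in which each pair of items shares a cluster."""
    n = len(labelings[0])
    runs = len(labelings)
    counts = [[0] * n for _ in range(n)]
    for labels in labelings:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / runs for c in row] for row in counts]


def consensus_clusters(matrix, threshold=0.5):
    """Connected components of the graph linking pairs above the threshold."""
    n = len(matrix)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and matrix[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Three made-up clustering runs over four individuals; label names differ
# between runs, which the co-association matrix is insensitive to.
runs = [[0, 0, 1, 1], [1, 1, 0, 0], [0, 0, 0, 1]]
consensus = consensus_clusters(co_association(runs))
```

    The third run disagrees with the other two about one individual, yet the consensus still recovers the two groups that the majority of runs support.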

  3. A desirability-based multi objective approach for the virtual screening discovery of broad-spectrum anti-gastric cancer agents

    PubMed Central

    Sánchez-Rodríguez, Aminael; Tejera, Eduardo; Cruz-Monteagudo, Maykel; Borges, Fernanda; Cordeiro, M. Natália D. S.; Le-Thi-Thu, Huong; Pham-The, Hai

    2018-01-01

    Gastric cancer is the third leading cause of cancer-related mortality worldwide and despite advances in prevention, diagnosis and therapy, it is still regarded as a global health concern. The efficacy of the therapies for gastric cancer is limited by a poor response to currently available therapeutic regimens. One of the reasons that may explain these poor clinical outcomes is the highly heterogeneous nature of this disease. In this sense, it is essential to discover new molecular agents capable of targeting various gastric cancer subtypes simultaneously. Here, we present a multi-objective approach for the ligand-based virtual screening discovery of chemical compounds simultaneously active against the gastric cancer cell lines AGS, NCI-N87 and SNU-1. The proposed approach relies on a novel methodology based on the development of ensemble models for the bioactivity prediction against each individual gastric cancer cell line. The methodology includes the aggregation of one ensemble per cell line using a desirability-based algorithm into virtual screening protocols. Our research leads to the proposal of a multi-targeted virtual screening protocol able to achieve high enrichment of known chemicals with anti-gastric cancer activity. Specifically, our results indicate that, using the proposed protocol, it is possible to retrieve almost 20 times more multi-targeted compounds in the first 1% of the ranked list than what is expected from a uniform distribution of the active ones in the virtual screening database. More importantly, the proposed protocol attains an outstanding initial enrichment of known multi-targeted anti-gastric cancer agents. PMID:29420638
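
    Two quantities in this abstract lend themselves to a short sketch: an overall desirability that merges one score per cell line, and the enrichment of actives in the top fraction of a ranked list. The geometric mean used below is a common way to combine desirabilities and is an assumption here, since the abstract does not specify the aggregation formula. All data are made up.

```python
import math


def overall_desirability(desirabilities):
    """Geometric mean of per-objective desirabilities, each in [0, 1]."""
    if any(d == 0 for d in desirabilities):
        return 0.0
    log_sum = sum(math.log(d) for d in desirabilities)
    return math.exp(log_sum / len(desirabilities))


def enrichment_factor(ranked_is_active, fraction):
    """Hit rate in the top fraction divided by the hit rate in the full list."""
    n_total = len(ranked_is_active)
    n_top = max(1, int(n_total * fraction))
    hits_top = sum(ranked_is_active[:n_top])
    hits_all = sum(ranked_is_active)
    return (hits_top / n_top) / (hits_all / n_total)


# Hypothetical ranked list of 200 compounds with 4 actives, 2 of them on top.
ranked = [True, True] + [False] * 100 + [True, True] + [False] * 96
ef_top_1_percent = enrichment_factor(ranked, 0.01)
```

    An enrichment factor of about 20 in the first 1%, as reported above, means the top of the list holds roughly twenty times as many actives as a random ordering would place there.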

  4. Causal network in a deafferented non-human primate brain.

    PubMed

    Balasubramanian, Karthikeyan; Takahashi, Kazutaka; Hatsopoulos, Nicholas G

    2015-01-01

    De-afferented/efferented neural ensembles can undergo causal changes when interfaced to neuroprosthetic devices. These changes occur via recruitment or isolation of neurons, alterations in functional connectivity within the ensemble and/or changes in the role of neurons, i.e., excitatory/inhibitory. In this work, emergence of a causal network and changes in the dynamics are demonstrated for a deafferented brain region exposed to BMI (brain-machine interface) learning. The BMI was controlling a robot for reach-and-grasp behavior. The motor cortical regions used for the BMI were deafferented due to chronic amputation, and ensembles of neurons were decoded for velocity control of the multi-DOF robot. A generalized linear model-framework based Granger causality (GLM-GC) technique was used in estimating the ensemble connectivity. Model selection was based on the AIC (Akaike Information Criterion).

  5. An ensemble forecast of the South China Sea monsoon

    NASA Astrophysics Data System (ADS)

    Krishnamurti, T. N.; Tewari, Mukul; Bensman, Ed; Han, Wei; Zhang, Zhan; Lau, William K. M.

    1999-05-01

    This paper presents a generalized ensemble forecast procedure for the tropical latitudes. Here we propose an empirical orthogonal function-based procedure for the definition of a seven-member ensemble. The wind and the temperature fields are perturbed over the global tropics. Although the forecasts are made over the global belt with a high-resolution model, the emphasis of this study is on a South China Sea monsoon. This domain of the South China Sea includes the passage of Tropical Storm Gary, which moved eastwards north of the Philippines. The ensemble forecast handled the precipitation of this storm reasonably well. A global model at the resolution Triangular Truncation 126 waves is used to carry out these seven forecasts. The evaluation of the ensemble of forecasts is carried out via standard root mean square errors of the precipitation and the wind fields. The ensemble average is shown to have a higher skill compared to a control experiment, which was a first analysis based on operational data sets over both the global tropical and South China Sea domain. All of these experiments were subjected to physical initialization which provides a spin-up of the model rain close to that obtained from satellite and gauge-based estimates. The results furthermore show that inherently much higher skill resides in the forecast precipitation fields if they are averaged over area elements of the order of 4° latitude by 4° longitude squares.
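
    The evaluation described above, comparing the root mean square error of the ensemble average with that of individual forecasts, can be sketched as follows. The numbers are made up and the two-member toy ensemble stands in for the seven members used in the study.

```python
import math


def ensemble_mean(members):
    """Point-wise average over ensemble members."""
    n = len(members)
    return [sum(values) / n for values in zip(*members)]


def rmse(forecast, observed):
    """Root mean square error between a forecast and observations."""
    squared = [(f - o) ** 2 for f, o in zip(forecast, observed)]
    return math.sqrt(sum(squared) / len(observed))


# Hypothetical rainfall values at three grid points.
observed = [1.0, 2.0, 3.0]
members = [[2.0, 3.0, 4.0], [0.0, 1.0, 2.0]]
mean_error = rmse(ensemble_mean(members), observed)
member_errors = [rmse(m, observed) for m in members]
```

    In this toy case the member errors have opposite sign and cancel exactly in the mean; real ensembles only cancel part of the error.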

  6. Creating ensembles of decision trees through sampling

    DOEpatents

    Kamath, Chandrika; Cantu-Paz, Erick

    2005-08-30

    A system for decision tree ensembles that includes a module to read the data, a module to sort the data, a module to evaluate a potential split of the data according to some criterion using a random sample of the data, a module to split the data, and a module to combine multiple decision trees in ensembles. The decision tree method is based on statistical sampling techniques and includes the steps of reading the data; sorting the data; evaluating a potential split according to some criterion using a random sample of the data, splitting the data, and combining multiple decision trees in ensembles.
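
    The step of evaluating a potential split on a random sample of the data can be sketched as below. The Gini impurity criterion, the midpoint candidate thresholds, and all function names are assumptions made for illustration; the record only says the split is evaluated "according to some criterion".

```python
import random


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def split_score(rows, threshold):
    """Size-weighted impurity of splitting (x, label) rows at a threshold."""
    left = [y for x, y in rows if x <= threshold]
    right = [y for x, y in rows if x > threshold]
    return (len(left) * gini(left) + len(right) * gini(right)) / len(rows)


def best_split_on_sample(rows, sample_size, seed=0):
    """Pick the best threshold using only a random sample of the rows."""
    rng = random.Random(seed)
    sample = rng.sample(rows, min(sample_size, len(rows)))
    xs = sorted({x for x, _ in sample})
    candidates = [(a + b) / 2 for a, b in zip(xs, xs[1:])]
    if not candidates:
        return None
    return min(candidates, key=lambda t: split_score(sample, t))


# Made-up one-feature data set: class 0 below 50, class 1 from 50 upward.
rows = [(float(i), 0 if i < 50 else 1) for i in range(100)]
threshold = best_split_on_sample(rows, sample_size=30, seed=1)
```

    Different seeds give different samples and therefore slightly different thresholds, which is one way such a scheme produces the varied trees that an ensemble combines.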

  7. Implicit ligand theory for relative binding free energies

    NASA Astrophysics Data System (ADS)

    Nguyen, Trung Hai; Minh, David D. L.

    2018-03-01

    Implicit ligand theory enables noncovalent binding free energies to be calculated based on an exponential average of the binding potential of mean force (BPMF)—the binding free energy between a flexible ligand and rigid receptor—over a precomputed ensemble of receptor configurations. In the original formalism, receptor configurations were drawn from or reweighted to the apo ensemble. Here we show that BPMFs averaged over a holo ensemble yield binding free energies relative to the reference ligand that specifies the ensemble. When using receptor snapshots from an alchemical simulation with a single ligand, the new statistical estimator outperforms the original.
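
    The exponential average mentioned above can be written as -kT ln of the mean of exp(-B/kT) over receptor snapshots, where B is the BPMF for each snapshot. The sketch below computes that quantity with a log-sum-exp shift for numerical stability. It omits any standard-state correction, and the temperature, units (kcal/mol), and function name are assumptions.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def exponential_average(bpmfs, temperature=300.0):
    """Return -kT * ln(mean(exp(-B / kT))) for a list of BPMF values."""
    kt = K_B * temperature
    scaled = [-b / kt for b in bpmfs]
    shift = max(scaled)  # subtract the largest exponent to avoid overflow
    mean_exp = sum(math.exp(s - shift) for s in scaled) / len(scaled)
    return -kt * (shift + math.log(mean_exp))


# Hypothetical BPMFs (kcal/mol) for two receptor snapshots.
estimate = exponential_average([-10.0, 0.0])
```

    Because the average is exponential, the most favorable snapshot dominates: the estimate lies close to -10 kcal/mol rather than at the arithmetic mean of -5 kcal/mol.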

  8. Embedded feature ranking for ensemble MLP classifiers.

    PubMed

    Windeatt, Terry; Duangsoithong, Rakkrit; Smith, Raymond

    2011-06-01

    A feature ranking scheme for multilayer perceptron (MLP) ensembles is proposed, along with a stopping criterion based upon the out-of-bootstrap estimate. To solve multi-class problems feature ranking is combined with modified error-correcting output coding. Experimental results on benchmark data demonstrate the versatility of the MLP base classifier in removing irrelevant features.

  9. Clustering-Based Ensemble Learning for Activity Recognition in Smart Homes

    PubMed Central

    Jurek, Anna; Nugent, Chris; Bi, Yaxin; Wu, Shengli

    2014-01-01

    Application of sensor-based technology within activity monitoring systems is becoming a popular technique within the smart environment paradigm. Nevertheless, the use of such an approach generates complex constructs of data, which subsequently requires the use of intricate activity recognition techniques to automatically infer the underlying activity. This paper explores a cluster-based ensemble method as a new solution for the purposes of activity recognition within smart environments. With this approach activities are modelled as collections of clusters built on different subsets of features. A classification process is performed by assigning a new instance to its closest cluster from each collection. Two different sensor data representations have been investigated, namely numeric and binary. Following the evaluation of the proposed methodology it has been demonstrated that the cluster-based ensemble method can be successfully applied as a viable option for activity recognition. Results following exposure to data collected from a range of activities indicated that the ensemble method had the ability to perform with accuracies of 94.2% and 97.5% for numeric and binary data, respectively. These results outperformed a range of single classifiers considered as benchmarks. PMID:25014095
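
    The classification step described above, assigning a new instance to its closest cluster in each collection, can be sketched as follows. Each collection is built on its own subset of features. Combining the per-collection labels by majority vote is an assumption, as the abstract does not say how the individual decisions are merged; the centroids and activity labels are made up.

```python
from collections import Counter


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify(instance, collections):
    """Vote over the closest cluster found in each collection.

    Each collection is (feature_indices, [(centroid, label), ...]).
    """
    votes = []
    for feature_indices, clusters in collections:
        projected = [instance[i] for i in feature_indices]
        _, label = min(clusters, key=lambda c: squared_distance(projected, c[0]))
        votes.append(label)
    return Counter(votes).most_common(1)[0][0]


# Three hypothetical collections over different pairs of sensor features.
toy_clusters = [((0.0, 0.0), "sleep"), ((1.0, 1.0), "cook")]
collections = [((0, 1), toy_clusters), ((1, 2), toy_clusters), ((0, 2), toy_clusters)]
predicted = classify((0.9, 0.8, 0.7), collections)
```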

  10. Clustering-based ensemble learning for activity recognition in smart homes.

    PubMed

    Jurek, Anna; Nugent, Chris; Bi, Yaxin; Wu, Shengli

    2014-07-10

    Application of sensor-based technology within activity monitoring systems is becoming a popular technique within the smart environment paradigm. Nevertheless, the use of such an approach generates complex constructs of data, which subsequently requires the use of intricate activity recognition techniques to automatically infer the underlying activity. This paper explores a cluster-based ensemble method as a new solution for the purposes of activity recognition within smart environments. With this approach activities are modelled as collections of clusters built on different subsets of features. A classification process is performed by assigning a new instance to its closest cluster from each collection. Two different sensor data representations have been investigated, namely numeric and binary. Following the evaluation of the proposed methodology it has been demonstrated that the cluster-based ensemble method can be successfully applied as a viable option for activity recognition. Results following exposure to data collected from a range of activities indicated that the ensemble method had the ability to perform with accuracies of 94.2% and 97.5% for numeric and binary data, respectively. These results outperformed a range of single classifiers considered as benchmarks.

  11. Village Building Identification Based on Ensemble Convolutional Neural Networks

    PubMed Central

    Guo, Zhiling; Chen, Qi; Xu, Yongwei; Shibasaki, Ryosuke; Shao, Xiaowei

    2017-01-01

    In this study, we present the Ensemble Convolutional Neural Network (ECNN), an elaborate CNN frame formulated based on ensembling state-of-the-art CNN models, to identify village buildings from open high-resolution remote sensing (HRRS) images. First, to optimize and mine the capability of CNN for village mapping and to ensure compatibility with our classification targets, a few state-of-the-art models were carefully optimized and enhanced based on a series of rigorous analyses and evaluations. Second, rather than directly implementing building identification by using these models, we exploited most of their advantages by ensembling their feature extractor parts into a stronger model called ECNN based on the multiscale feature learning method. Finally, the generated ECNN was applied to a pixel-level classification frame to implement object identification. The proposed method can serve as a viable tool for village building identification with high accuracy and efficiency. The experimental results obtained from the test area in Savannakhet province, Laos, prove that the proposed ECNN model significantly outperforms existing methods, improving overall accuracy from 96.64% to 99.26%, and kappa from 0.57 to 0.86. PMID:29084154
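
    The abstract reports both overall accuracy and kappa. Cohen's kappa corrects accuracy for the agreement expected by chance, which is why a classifier can reach high accuracy on an imbalanced scene (few building pixels) while its kappa stays modest. The sketch below computes both from a confusion matrix; the matrix is made up and is not the paper's data.

```python
def accuracy_and_kappa(confusion):
    """Overall accuracy and Cohen's kappa from a square confusion matrix."""
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(k)) / total
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion) for i in range(k)
    ) / (total * total)
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical pixel counts: rows are true classes, columns are predictions.
# Class 0 is background (dominant), class 1 is building (rare).
toy_confusion = [[950, 20], [20, 10]]
toy_accuracy, toy_kappa = accuracy_and_kappa(toy_confusion)
```

    Here accuracy is 96% while kappa is only about 0.31, because predicting background everywhere would already be right 97% of the time.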

  12. Progressive freezing of interacting spins in isolated finite magnetic ensembles

    NASA Astrophysics Data System (ADS)

    Bhattacharya, Kakoli; Dupuis, Veronique; Le-Roy, Damien; Deb, Pritam

    2017-02-01

    Self-organization of magnetic nanoparticles into secondary nanostructures provides an innovative way for designing functional nanomaterials with novel properties, different from the constituent primary nanoparticles as well as their bulk counterparts. Collective magnetic properties of such complex closed packing of magnetic nanoparticles make them more appealing than the individual magnetic nanoparticles in many technological applications. This work reports the collective magnetic behaviour of magnetic ensembles comprising single domain Fe3O4 nanoparticles. The present work reveals that the ensemble formation is based on the re-orientation and attachment of the nanoparticles in an iso-oriented fashion at the mesoscale regime. Comprehensive dc magnetic measurements show the prevalence of strong interparticle interactions in the ensembles. Due to the close range organization of primary Fe3O4 nanoparticles in the ensemble, the spins of the individual nanoparticles interact through dipolar interactions as realized from remnant magnetization measurements. Signature of super spin glass like behaviour in the ensembles is observed in the memory studies carried out in field cooled conditions. Progressive freezing of spins in the ensembles is corroborated from the Vogel-Fulcher fit of the susceptibility data. Dynamic scaling of relaxation reasserted slow spin dynamics substantiating cluster spin glass like behaviour in the ensembles.
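
    The Vogel-Fulcher law referred to above gives the relaxation time as tau = tau0 * exp(Ea / (kB * (T - T0))), where T0 accounts for interparticle interactions and the law reduces to the Arrhenius form when T0 is zero. The sketch below evaluates it; the parameter values are hypothetical and are not taken from the paper.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def vogel_fulcher_tau(temperature, tau0, activation_energy_ev, t0):
    """Relaxation time from the Vogel-Fulcher law, valid for temperature > t0."""
    if temperature <= t0:
        raise ValueError("temperature must exceed t0")
    exponent = activation_energy_ev / (K_B_EV * (temperature - t0))
    return tau0 * math.exp(exponent)


# Hypothetical parameters: tau0 = 1 ns, Ea = 0.01 eV, T0 = 50 K.
tau_warm = vogel_fulcher_tau(100.0, 1e-9, 0.01, 50.0)
tau_cold = vogel_fulcher_tau(60.0, 1e-9, 0.01, 50.0)
```

    The relaxation time grows steeply as the temperature approaches T0 from above, which is the slowing down that a Vogel-Fulcher fit of susceptibility data is meant to capture.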

  13. Evidence for Dynamic Chemical Kinetics at Individual Molecular Ruthenium Catalysts.

    PubMed

    Easter, Quinn T; Blum, Suzanne A

    2018-02-05

    Catalytic cycles are typically depicted as possessing time-invariant steps with fixed rates. Yet the true behavior of individual catalysts with respect to time is unknown, hidden by the ensemble averaging inherent to bulk measurements. Evidence is presented for variable chemical kinetics at individual catalysts, with a focus on ring-opening metathesis polymerization catalyzed by the second-generation Grubbs' ruthenium catalyst. Fluorescence microscopy is used to probe the chemical kinetics of the reaction because the technique possesses sufficient sensitivity for the detection of single chemical reactions. Insertion reactions in submicron regions likely occur at groups of many (not single) catalysts, yet not so many that their unique kinetic behavior is ensemble averaged. © 2018 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.

  14. Statistical thermodynamics of clustered populations.

    PubMed

    Matsoukas, Themis

    2014-08-01

    We present a thermodynamic theory for a generic population of M individuals distributed into N groups (clusters). We construct the ensemble of all distributions with fixed M and N, introduce a selection functional that embodies the physics that governs the population, and obtain the distribution that emerges in the scaling limit as the most probable among all distributions consistent with the given physics. We develop the thermodynamics of the ensemble and establish a rigorous mapping to regular thermodynamics. We treat the emergence of a so-called giant component as a formal phase transition and show that the criteria for its emergence are entirely analogous to the equilibrium conditions in molecular systems. We demonstrate the theory by an analytic model and confirm the predictions by Monte Carlo simulation.

  15. Vapor-liquid phase equilibria of water modelled by a Kim-Gordon potential

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Maerzke, K A; McGrath, M J; Kuo, I W

    2009-03-16

    Gibbs ensemble Monte Carlo simulations were carried out to investigate the properties of a frozen-electron-density (or Kim-Gordon, KG) model of water along the vapor-liquid coexistence curve. Because of its theoretical basis, such a KG model provides for seamless coupling to Kohn-Sham density functional theory for use in mixed quantum mechanics/molecular mechanics (QM/MM) implementations. The Gibbs ensemble simulations indicate rather limited transferability of such a simple KG model to other state points. Specifically, a KG model that was parameterized by Barker and Sprik to the properties of liquid water at 300 K yields saturated vapor pressures and a critical temperature that are significantly under- and over-estimated, respectively.

  16. Exploiting ensemble learning for automatic cataract detection and grading.

    PubMed

    Yang, Ji-Jiang; Li, Jianqiang; Shen, Ruifang; Zeng, Yang; He, Jian; Bi, Jing; Li, Yong; Zhang, Qinyan; Peng, Lihui; Wang, Qing

    2016-02-01

    Cataract is defined as a lenticular opacity presenting usually with poor visual acuity. It is one of the most common causes of visual impairment worldwide. Early diagnosis demands the expertise of trained healthcare professionals, which may present a barrier to early intervention due to underlying costs. To date, studies reported in the literature utilize a single learning model for retinal image classification in grading cataract severity. We present an ensemble learning based approach as a means to improving diagnostic accuracy. Three independent feature sets, i.e., wavelet-, sketch-, and texture-based features, are extracted from each fundus image. For each feature set, two base learning models, i.e., Support Vector Machine and Back Propagation Neural Network, are built. Then, the ensemble methods, majority voting and stacking, are investigated to combine the multiple base learning models for final fundus image classification. Empirical experiments are conducted for cataract detection (two-class task, i.e., cataract or non-cataractous) and cataract grading (four-class task, i.e., non-cataractous, mild, moderate or severe) tasks. The best performance of the ensemble classifier is 93.2% and 84.5% in terms of the correct classification rates for cataract detection and grading tasks, respectively. The results demonstrate that the ensemble classifier outperforms the single learning model significantly, which also illustrates the effectiveness of the proposed approach. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  17. An evaluation of noise reduction algorithms for particle-based fluid simulations in multi-scale applications

    NASA Astrophysics Data System (ADS)

    Zimoń, M. J.; Prosser, R.; Emerson, D. R.; Borg, M. K.; Bray, D. J.; Grinberg, L.; Reese, J. M.

    2016-11-01

    Filtering of particle-based simulation data can lead to reduced computational costs and enable more efficient information transfer in multi-scale modelling. This paper compares the effectiveness of various signal processing methods to reduce numerical noise and capture the structures of nano-flow systems. In addition, a novel combination of these algorithms is introduced, showing the potential of hybrid strategies to improve further the de-noising performance for time-dependent measurements. The methods were tested on velocity and density fields, obtained from simulations performed with molecular dynamics and dissipative particle dynamics. Comparisons between the algorithms are given in terms of performance, quality of the results and sensitivity to the choice of input parameters. The results provide useful insights on strategies for the analysis of particle-based data and the reduction of computational costs in obtaining ensemble solutions.

  18. Assessing a local ensemble Kalman filter: perfect model experiments with the National Centers for Environmental Prediction global model

    NASA Astrophysics Data System (ADS)

    Szunyogh, Istvan; Kostelich, Eric J.; Gyarmati, G.; Patil, D. J.; Hunt, Brian R.; Kalnay, Eugenia; Ott, Edward; Yorke, James A.

    2005-08-01

    The accuracy and computational efficiency of the recently proposed local ensemble Kalman filter (LEKF) data assimilation scheme are investigated on a state-of-the-art operational numerical weather prediction model using simulated observations. The model selected for this purpose is the T62 horizontal- and 28-level vertical-resolution version of the Global Forecast System (GFS) of the National Centers for Environmental Prediction. The performance of the data assimilation system is assessed for different configurations of the LEKF scheme. It is shown that a modest size (40-member) ensemble is sufficient to track the evolution of the atmospheric state with high accuracy. For this ensemble size, the computational time per analysis is less than 9 min on a cluster of PCs. The analyses are extremely accurate in the mid-latitude storm track regions. The largest analysis errors, which are typically much smaller than the observational errors, occur where parametrized physical processes play important roles. Because these are also the regions where model errors are expected to be the largest, limitations of a real-data implementation of the ensemble-based Kalman filter may be easily mistaken for model errors. In light of these results, the importance of testing the ensemble-based Kalman filter data assimilation systems on simulated observations is stressed.
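
    The core of any ensemble-based Kalman filter is an update that pulls the ensemble toward an observation in proportion to the ensemble variance. The sketch below shows a generic scalar square-root form for a directly observed variable. It is not the LEKF, which performs such updates locally over regions of a full model state; the function name and toy numbers are assumptions.

```python
def scalar_ensemble_update(ensemble, observation, obs_err_var):
    """Update a scalar ensemble with one direct observation.

    The mean moves by gain * (observation - mean) and the deviations are
    scaled by sqrt(1 - gain), so the variance becomes (1 - gain) * variance.
    """
    n = len(ensemble)
    mean = sum(ensemble) / n
    variance = sum((x - mean) ** 2 for x in ensemble) / (n - 1)
    gain = variance / (variance + obs_err_var)
    new_mean = mean + gain * (observation - mean)
    scale = (1.0 - gain) ** 0.5
    return [new_mean + scale * (x - mean) for x in ensemble]


# Toy ensemble with mean 2 and variance 4, observation 4 with error variance 4.
analysis = scalar_ensemble_update([0.0, 2.0, 4.0], observation=4.0, obs_err_var=4.0)
```

    With equal forecast and observation error variances the gain is 0.5, so the analysis mean lands halfway between the forecast mean and the observation.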

  19. A Bayesian Ensemble Approach for Epidemiological Projections

    PubMed Central

    Lindström, Tom; Tildesley, Michael; Webb, Colleen

    2015-01-01

    Mathematical models are powerful tools for epidemiology and can be used to compare control actions. However, different models and model parameterizations may provide different prediction of outcomes. In other fields of research, ensemble modeling has been used to combine multiple projections. We explore the possibility of applying such methods to epidemiology by adapting Bayesian techniques developed for climate forecasting. We exemplify the implementation with single model ensembles based on different parameterizations of the Warwick model run for the 2001 United Kingdom foot and mouth disease outbreak and compare the efficacy of different control actions. This allows us to investigate the effect that discrepancy among projections based on different modeling assumptions has on the ensemble prediction. A sensitivity analysis showed that the choice of prior can have a pronounced effect on the posterior estimates of quantities of interest, in particular for ensembles with large discrepancy among projections. However, by using a hierarchical extension of the method we show that prior sensitivity can be circumvented. We further extend the method to include a priori beliefs about different modeling assumptions and demonstrate that the effect of this can have different consequences depending on the discrepancy among projections. We propose that the method is a promising analytical tool for ensemble modeling of disease outbreaks. PMID:25927892

  20. An adaptive incremental approach to constructing ensemble classifiers: application in an information-theoretic computer-aided decision system for detection of masses in mammograms.

    PubMed

    Mazurowski, Maciej A; Zurada, Jacek M; Tourassi, Georgia D

    2009-07-01

    Ensemble classifiers have been shown to be efficient in multiple applications. In this article, the authors explore the effectiveness of ensemble classifiers in a case-based computer-aided diagnosis system for detection of masses in mammograms. They evaluate two general ways of constructing subclassifiers by resampling of the available development dataset: Random division and random selection. Furthermore, they discuss the problem of selecting the ensemble size and propose two adaptive incremental techniques that automatically select the size for the problem at hand. All the techniques are evaluated with respect to a previously proposed information-theoretic CAD system (IT-CAD). The experimental results show that the examined ensemble techniques provide a statistically significant improvement (AUC = 0.905 +/- 0.024) in performance as compared to the original IT-CAD system (AUC = 0.865 +/- 0.029). Some of the techniques allow for a notable reduction in the total number of examples stored in the case base (to 1.3% of the original size), which, in turn, results in lower storage requirements and a shorter response time of the system. Among the methods examined in this article, the two proposed adaptive techniques are by far the most effective for this purpose. Furthermore, the authors provide some discussion and guidance for choosing the ensemble parameters.
