Science.gov

Sample records for molecular ensemble based

  1. Cosolvent-Based Molecular Dynamics for Ensemble Docking: Practical Method for Generating Druggable Protein Conformations.

    PubMed

    Uehara, Shota; Tanaka, Shigenori

    2017-04-07

    Protein flexibility is a major hurdle in current structure-based virtual screening (VS). In spite of the recent advances in high-performance computing, protein-ligand docking methods still demand tremendous computational cost to take into account the full degree of protein flexibility. In this context, ensemble docking has proven its utility and efficiency for VS studies, but it still needs a rational and efficient method to select and/or generate multiple protein conformations. Molecular dynamics (MD) simulations are useful to produce distinct protein conformations without abundant experimental structures. In this study, we present a novel strategy that makes use of cosolvent-based molecular dynamics (CMD) simulations for ensemble docking. By mixing small organic molecules into a solvent, CMD can stimulate dynamic protein motions and induce partial conformational changes of binding pocket residues appropriate for the binding of diverse ligands. The present method has been applied to six diverse target proteins and assessed by VS experiments using many actives and decoys of DEKOIS 2.0. The simulation results have revealed that the CMD is beneficial for ensemble docking. Utilizing cosolvent simulation allows the generation of druggable protein conformations, improving the VS performance compared with the use of a single experimental structure or ensemble docking by standard MD with pure water as the solvent.

  2. Algorithms and novel applications based on the isokinetic ensemble. I. Biophysical and path integral molecular dynamics

    NASA Astrophysics Data System (ADS)

    Minary, Peter; Martyna, Glenn J.; Tuckerman, Mark E.

    2003-02-01

    In this paper (Paper I) and a companion paper (Paper II), novel algorithms and applications of the isokinetic ensemble as generated by Gauss' principle of least constraint, pioneered for use with molecular dynamics 20 years ago, are presented for biophysical, path integral, and Car-Parrinello based ab initio molecular dynamics. In Paper I, a new "extended system" version of the isokinetic equations of motion that overcomes the ergodicity problems inherent in the standard approach is developed using a new theory of non-Hamiltonian phase space analysis [M. E. Tuckerman et al., Europhys. Lett. 45, 149 (1999); J. Chem. Phys. 115, 1678 (2001)]. Reversible multiple time step integration schemes for the isokinetic methods, first presented by Zhang [J. Chem. Phys. 106, 6102 (1997)], are reviewed. Next, holonomic constraints are incorporated into the isokinetic methodology for use in fast, efficient biomolecular simulation studies. Model and realistic examples are presented in order to evaluate, critically, the performance of the new isokinetic molecular dynamics schemes. Comparisons are made to the now-standard canonical dynamics method, Nosé-Hoover chain dynamics [G. J. Martyna et al., J. Chem. Phys. 97, 2635 (1992)]. The new isokinetic techniques are found to yield more efficient sampling than the Nosé-Hoover chain method in both path integral molecular dynamics and biophysical molecular dynamics calculations. In Paper II, the use of isokinetic methods in Car-Parrinello based ab initio molecular dynamics calculations is presented.
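
    For reference, the standard (non-extended) Gaussian isokinetic equations of motion that this work builds upon can be written as below. This is a textbook form added here for orientation, not the extended-system version derived in the paper.

```latex
% Gaussian isokinetic dynamics: the kinetic energy is held fixed by a
% Lagrange multiplier obtained from Gauss' principle of least constraint.
\begin{aligned}
\dot{\mathbf{r}}_i &= \frac{\mathbf{p}_i}{m_i}, \qquad
\dot{\mathbf{p}}_i = \mathbf{F}_i - \lambda\,\mathbf{p}_i, \\
\lambda &= \frac{\sum_i \mathbf{F}_i \cdot \mathbf{p}_i / m_i}
                {\sum_i \mathbf{p}_i^2 / m_i},
\qquad \text{so that} \quad
\frac{d}{dt}\sum_i \frac{\mathbf{p}_i^2}{2 m_i} = 0 .
\end{aligned}
```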

  3. Quantum metrology with molecular ensembles

    SciTech Connect

    Schaffry, Marcus; Gauger, Erik M.; Morton, John J. L.; Fitzsimons, Joseph; Benjamin, Simon C.; Lovett, Brendon W.

    2010-10-15

    The field of quantum metrology promises measurement devices that are fundamentally superior to conventional technologies. Specifically, when quantum entanglement is harnessed, the precision achieved is supposed to scale more favorably with the resources employed, such as system size and time required. Here, we consider measurement of magnetic-field strength using an ensemble of spin-active molecules. We identify a third essential resource: the change in ensemble polarization (entropy increase) during the metrology experiment. We find that performance depends crucially on the form of decoherence present; for a plausible dephasing model, we describe a quantum strategy, which can indeed beat the standard strategy.

  4. Algorithms and novel applications based on the isokinetic ensemble. II. Ab initio molecular dynamics

    NASA Astrophysics Data System (ADS)

    Minary, Peter; Martyna, Glenn J.; Tuckerman, Mark E.

    2003-02-01

    In this paper (Paper II), the isokinetic dynamics scheme described in Paper I is combined with the plane-wave based Car-Parrinello (CP) ab initio molecular dynamics (MD) method [R. Car and M. Parrinello, Phys. Rev. Lett. 55, 2471 (1985)] to enable the efficient study of chemical reactions and metallic systems. The Car-Parrinello approach employs "on the fly" electronic structure calculations as a means of generating accurate internuclear forces for use in a molecular dynamics simulation. This is accomplished by the introduction of an extended Lagrangian that contains the electronic orbitals as fictitious dynamical variables (often expressed directly in terms of the expansion coefficients of the orbitals in a particular basis set). Thus, rather than quench the expansion coefficients to obtain the ground state energy and nuclear forces at every time step, the orbitals are "propagated" under conditions that allow them to fluctuate rapidly around their global minimum and, hence, generate an accurate approximation to the nuclear forces as the simulation proceeds. Indeed, the CP technique requires the dynamics of the orbitals to be fast compared to the nuclear degrees of freedom while keeping the fictitious kinetic energy that allows them to be propagated dynamically as small as possible. While these conditions are easy to achieve in many types of systems, difficulties arise in metals and highly exothermic chemical reactions. (Note, the CP dynamics of metals is incorrect because the nuclear motion does not occur on the ground state electronic surface, but it can, nonetheless, provide useful information.) In order to alleviate these difficulties, the isokinetic methods of Paper I are applied to derive isokinetic CP equations of motion. The efficacy of the new isokinetic CPMD method is demonstrated on model and realistic systems. The latter include metallic systems, liquid aluminum, a small silicon sample, the 2×1 reconstruction of the Si(100) surface, and the

  5. Molecular docking to ensembles of protein structures.

    PubMed

    Knegtel, R M; Kuntz, I D; Oshiro, C M

    1997-02-21

    Until recently, applications of molecular docking assumed that the macromolecular receptor exists in a single, rigid conformation. However, structural studies involving different ligands bound to the same target biomolecule frequently reveal modest but significant conformational changes in the target. In this paper, two related methods for molecular docking are described that utilize information on conformational variability from ensembles of experimental receptor structures. One method combines the information into an "energy-weighted average" of the interaction energy between a ligand and each receptor structure. The other method performs the averaging on a structural level, producing a "geometry-weighted average" of the intermolecular force field score used in DOCK 3.5. Both methods have been applied in docking small molecules to ensembles of crystal and solution structures, and we show that experimentally determined binding orientations and computed energies of known ligands can be reproduced accurately. The use of composite grids, when conformationally different protein structures are available, yields an improvement in computational speed for database searches in proportion to the number of structures.
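
    As a rough illustration of the energy-weighted averaging idea, the sketch below combines per-receptor interaction energies for one ligand pose using Boltzmann-like weights. The kT value, the exact weighting form, and the function name are assumptions for illustration; this is not the specific weighting implemented in DOCK 3.5.

```python
import numpy as np

def energy_weighted_score(energies, kT=0.593):
    """Combine per-receptor interaction energies (kcal/mol) for one ligand pose
    into a single ensemble score via Boltzmann-like weights.
    NOTE: illustrative weighting only; the scheme used in DOCK 3.5 may differ."""
    e = np.asarray(energies, dtype=float)
    w = np.exp(-(e - e.min()) / kT)   # favor receptor structures that bind the pose well
    w /= w.sum()
    return float(np.dot(w, e))

# Hypothetical interaction energies of one ligand pose against 4 receptor conformations
print(energy_weighted_score([-32.1, -28.4, -30.7, -25.9]))
```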

  6. Emerging Methods for Ensemble-Based Virtual Screening

    PubMed Central

    Amaro, Rommie E.; Li, Wilfred W.

    2011-01-01

    Ensemble-based virtual screening refers to the use of conformational ensembles from crystal structures, NMR studies, or molecular dynamics simulations. It has gained greater acceptance as advances in the theoretical framework, computational algorithms, and software packages enable simulations at longer time scales. Here we focus on the use of computationally generated conformational ensembles and emerging methods that use these ensembles for discovery, such as the Relaxed Complex Scheme or Dynamic Pharmacophore Model. We also discuss the more rigorous physics-based computational techniques such as accelerated molecular dynamics and thermodynamic integration and their applications in improving conformational sampling or the ranking of virtual screening hits. Finally, technological advances that will help make virtual screening tools more accessible to a wider audience in computer-aided drug design are discussed. PMID:19929833

  7. Preserving the Boltzmann ensemble in replica-exchange molecular dynamics.

    PubMed

    Cooke, Ben; Schmidler, Scott C

    2008-10-28

    We consider the convergence behavior of replica-exchange molecular dynamics (REMD) [Sugita and Okamoto, Chem. Phys. Lett. 314, 141 (1999)] based on properties of the numerical integrators in the underlying isothermal molecular dynamics (MD) simulations. We show that a variety of deterministic algorithms favored by molecular dynamics practitioners for constant-temperature simulation of biomolecules fail to be either measure-invariant or irreducible, and are therefore not ergodic. We then show that REMD using these algorithms also fails to be ergodic. As a result, the entire configuration space may not be explored even in an infinitely long simulation, and the simulation may not converge to the desired equilibrium Boltzmann ensemble. Moreover, our analysis shows that for initial configurations with unfavorable energy, it may be impossible for the system to reach a region surrounding the minimum energy configuration. We demonstrate these failures of REMD algorithms for three small systems: a Gaussian distribution (simple harmonic oscillator dynamics), a bimodal Gaussian mixture distribution, and the alanine dipeptide. Examination of the resulting phase plots and equilibrium configuration densities indicates significant errors in the ensemble generated by REMD simulation. We describe a simple modification to address these failures based on a stochastic hybrid Monte Carlo correction, and prove that this is ergodic.

  8. Molecular modeling of closed circular DNA thermodynamic ensembles.

    PubMed

    Sprous, D; Tan, R K; Harvey, S C

    1996-08-01

    Many modeling studies of supercoiled DNA are based on equilibrium structures from theoretical calculations or energy minimization. Since closed circular DNAs are flexible, it is possible that errors are introduced by calculating properties from a single minimum energy structure, rather than from a complete thermodynamic ensemble. We have investigated this question using molecular dynamics simulations on a low resolution molecular mechanics model in which each base pair is represented by three points (a plane). This allows the inclusion of sequence-dependent variations of tip, inclination, and twist. Three kinds of sequences were tested: (1) homogeneous DNA, in which all base pairs have the helicoidal parameters of an ideal, average B-DNA; (2) random sequence DNA; and (3) curved DNA. We examined the rate of convergence of various structural parameters. Convergence for most of these is slowest for homogeneous sequences, more rapid for random sequences, and most rapid for curved sequences. The most slowly converging parameter is the antipodes profile. In a plasmid with N base pairs (bp), the antipodes distance is the distance d(i, j) from base pair i to base pair j halfway around the plasmid, j = i + N/2. The antipodes profile at time τ is a plot of d(i, j) over the range i = 1, ..., N/2. In a homogeneous plasmid, convergence requires that the antipodes profile averaged over time must be flat. Even in the small plasmids examined here, the average properties of the ensembles were found to differ from those of static equilibrium structures. These effects will be even more dramatic for larger plasmids. Further, average and dynamic properties are affected by both plasmid size and sequence.
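
    A minimal sketch of the antipodes profile as defined above, computed for a toy set of base-pair coordinates; the circular test case and the function name are illustrative only.

```python
import numpy as np

def antipodes_profile(coords):
    """Antipodes profile of a closed circular DNA model.

    coords: (N, 3) array of base-pair positions around the plasmid (N even).
    Returns d[i] = distance from base pair i to base pair i + N/2 (indices modulo N),
    for i = 0 .. N/2 - 1, following the definition in the abstract above.
    """
    coords = np.asarray(coords, dtype=float)
    half = len(coords) // 2
    return np.linalg.norm(coords[:half] - coords[half:half + half], axis=1)

# Toy example: an ideal circle, whose antipodes profile is flat by construction
theta = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
circle = np.stack([np.cos(theta), np.sin(theta), np.zeros_like(theta)], axis=1)
print(antipodes_profile(circle)[:5])   # ~2.0 everywhere (the diameter)
```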

  9. Thermodynamics and kinetics of a molecular motor ensemble.

    PubMed

    Baker, J E; Thomas, D D

    2000-10-01

    If, contrary to conventional models of muscle, it is assumed that molecular forces equilibrate among rather than within molecular motors, an equation of state and an expression for energy output can be obtained for a near-equilibrium, coworking ensemble of molecular motors. These equations predict clear, testable relationships between motor structure, motor biochemistry, and ensemble motor function, and we discuss these relationships in the context of various experimental studies. In this model, net work by molecular motors is performed with the relaxation of a near-equilibrium intermediate step in a motor-catalyzed reaction. The free energy available for work is localized to this step, and the rate at which this free energy is transferred to work is accelerated by the free energy of a motor-catalyzed reaction. This thermodynamic model implicitly deals with a motile cell system as a dynamic network (not a rigid lattice) of molecular motors within which the mechanochemistry of one motor influences and is influenced by the mechanochemistry of other motors in the ensemble.

  10. Thermodynamics and kinetics of a molecular motor ensemble.

    PubMed Central

    Baker, J E; Thomas, D D

    2000-01-01

    If, contrary to conventional models of muscle, it is assumed that molecular forces equilibrate among rather than within molecular motors, an equation of state and an expression for energy output can be obtained for a near-equilibrium, coworking ensemble of molecular motors. These equations predict clear, testable relationships between motor structure, motor biochemistry, and ensemble motor function, and we discuss these relationships in the context of various experimental studies. In this model, net work by molecular motors is performed with the relaxation of a near-equilibrium intermediate step in a motor-catalyzed reaction. The free energy available for work is localized to this step, and the rate at which this free energy is transferred to work is accelerated by the free energy of a motor-catalyzed reaction. This thermodynamic model implicitly deals with a motile cell system as a dynamic network (not a rigid lattice) of molecular motors within which the mechanochemistry of one motor influences and is influenced by the mechanochemistry of other motors in the ensemble. PMID:11023881

  11. Efficient Agent-Based Cluster Ensembles

    NASA Technical Reports Server (NTRS)

    Agogino, Adrian; Tumer, Kagan

    2006-01-01

    Numerous domains ranging from distributed data acquisition to knowledge reuse need to solve the cluster ensemble problem of combining multiple clusterings into a single unified clustering. Unfortunately current non-agent-based cluster combining methods do not work in a distributed environment, are not robust to corrupted clusterings and require centralized access to all original clusterings. Overcoming these issues will allow cluster ensembles to be used in fundamentally distributed and failure-prone domains such as data acquisition from satellite constellations, in addition to domains demanding confidentiality such as combining clusterings of user profiles. This paper proposes an efficient, distributed, agent-based clustering ensemble method that addresses these issues. In this approach each agent is assigned a small subset of the data and votes on which final cluster its data points should belong to. The final clustering is then evaluated by a global utility, computed in a distributed way. This clustering is also evaluated using an agent-specific utility that is shown to be easier for the agents to maximize. Results show that agents using the agent-specific utility can achieve better performance than traditional non-agent based methods and are effective even when up to 50% of the agents fail.

  12. MSEBAG: a dynamic classifier ensemble generation based on 'minimum-sufficient ensemble' and bagging

    NASA Astrophysics Data System (ADS)

    Chen, Lei; Kamel, Mohamed S.

    2016-01-01

    In this paper, we propose a dynamic classifier system, MSEBAG, which is characterised by searching for the 'minimum-sufficient ensemble' and bagging at the ensemble level. It adopts an 'over-generation and selection' strategy and aims to achieve a good bias-variance trade-off. In the training phase, MSEBAG first searches for the 'minimum-sufficient ensemble', which maximises the in-sample fitness with the minimal number of base classifiers. Then, starting from the 'minimum-sufficient ensemble', a backward stepwise algorithm is employed to generate a collection of ensembles. The objective is to create a collection of ensembles with a descending fitness on the data, as well as a descending complexity in the structure. MSEBAG dynamically selects the ensembles from the collection for the decision aggregation. The extended adaptive aggregation (EAA) approach, a bagging-style algorithm performed at the ensemble level, is employed for this task. EAA searches for the competent ensembles using a score function, which takes into consideration both the in-sample fitness and the confidence of the statistical inference, and averages the decisions of the selected ensembles to label the test pattern. The experimental results show that the proposed MSEBAG outperforms the benchmarks on average.

  13. Formulation of Liouville's theorem for grand ensemble molecular simulations

    NASA Astrophysics Data System (ADS)

    Delle Site, Luigi

    2016-02-01

    Liouville's theorem in a grand ensemble, that is for situations where a system is in equilibrium with a reservoir of energy and particles, is a subject that, to our knowledge, has not been explicitly treated in the literature related to molecular simulation. Instead, Liouville's theorem, a central concept for the correct employment of molecular simulation techniques, is implicitly considered only within the framework of systems where the total number of particles is fixed. However, the pressing demand of applied science in treating open systems leads to the question of the existence and possible exact formulation of Liouville's theorem when the number of particles changes during the dynamical evolution of the system. The intention of this paper is to stimulate a debate about this crucial issue for molecular simulation.

  14. Ensemble-based global ocean data assimilation

    NASA Astrophysics Data System (ADS)

    Nadiga, Balasubramanya T.; Casper, W. Riley; Jones, Philip W.

    2013-12-01

    We present results of experiments performing global, ensemble-based, ocean-only data assimilation and assess the utility of such data assimilation in improving model predictions. The POP (Parallel Ocean Program) Ocean General Circulation Model (OGCM) is forced by interannually varying atmospheric fields of version 2 of the Coordinated Ocean Reference Experiment (CORE) data set, and temperature and salinity observations from the World Ocean Database 2009 (WOD09) are assimilated. The assimilation experiments are conducted over a period of about two years starting January 1, 1990, using the framework of the Data Assimilation Research Testbed (DART). We find that an inflation scheme that blends the ensemble-based sample error covariance with a static estimate of ensemble spread is necessary for the assimilations to be effective in the ocean model. We call this Climatology-based Spread Inflation, or CSI for short. The effectiveness of the proposed inflation scheme is investigated in a low-order model; a series of experiments in this context demonstrates its effectiveness. Using a number of diagnostics, we show that the resulting assimilated state of ocean circulation is more realistic: In particular, the sea surface temperature (SST) shows reduced errors with respect to an unassimilated SST data set, and the subsurface temperature shows reduced errors with respect to observations. Finally, towards assessing the utility of assimilations for predictions, we show that the use of an assimilated state as initial condition leads to improved hindcast skill over a significant period of time; that is, when the OGCM is initialized with an assimilated state and run forward, it is better able to predict unassimilated observations of the WOD09 than a control non-assimilating run (≈ 20% reduction in error) over a period of about three months. The loss of skill beyond this period is conjectured to be due, in part, to model error and prevents an improvement in the representation of
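
    A schematic sketch of the kind of spread blending described above: each member's anomaly is rescaled so that the ensemble spread becomes a mix of the sample spread and a static, climatology-based estimate. The blending weight, array shapes, and function name are assumptions; the actual CSI scheme used with DART is not reproduced here.

```python
import numpy as np

def blended_spread_inflation(ensemble, clim_std, alpha=0.5):
    """Blend the sample ensemble spread with a static (climatological) estimate.

    ensemble: (n_members, n_state) array of state vectors.
    clim_std: (n_state,) climatological standard deviation of each state variable.
    alpha:    blending weight (assumed value; the paper does not specify it here).

    Each member's anomaly is rescaled so the ensemble standard deviation becomes
    a convex combination of the sample and climatological spreads.
    """
    mean = ensemble.mean(axis=0)
    anomalies = ensemble - mean
    sample_std = anomalies.std(axis=0, ddof=1)
    target_std = (1.0 - alpha) * sample_std + alpha * clim_std
    scale = np.where(sample_std > 0.0, target_std / sample_std, 1.0)
    return mean + anomalies * scale

rng = np.random.default_rng(0)
ens = rng.normal(size=(20, 5)) * 0.1                      # under-dispersive toy ensemble
inflated = blended_spread_inflation(ens, clim_std=np.ones(5), alpha=0.5)
print(ens.std(axis=0, ddof=1), inflated.std(axis=0, ddof=1))
```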

  15. Online cross-validation-based ensemble learning.

    PubMed

    Benkeser, David; Ju, Cheng; Lendle, Sam; van der Laan, Mark

    2017-05-04

    Online estimators update a current estimate with a new incoming batch of data without having to revisit past data thereby providing streaming estimates that are scalable to big data. We develop flexible, ensemble-based online estimators of an infinite-dimensional target parameter, such as a regression function, in the setting where data are generated sequentially by a common conditional data distribution given summary measures of the past. This setting encompasses a wide range of time-series models and, as special case, models for independent and identically distributed data. Our estimator considers a large library of candidate online estimators and uses online cross-validation to identify the algorithm with the best performance. We show that by basing estimates on the cross-validation-selected algorithm, we are asymptotically guaranteed to perform as well as the true, unknown best-performing algorithm. We provide extensions of this approach including online estimation of the optimal ensemble of candidate online estimators. We illustrate excellent performance of our methods using simulations and a real data example where we make streaming predictions of infectious disease incidence using data from a large database. Copyright © 2017 John Wiley & Sons, Ltd.
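
    A heavily simplified sketch of the online cross-validation idea: each incoming batch is scored by every candidate online learner before it is used for training, and the learner with the lowest running cross-validated loss makes the predictions. The class name, squared-error loss, and use of scikit-learn's partial_fit interface are illustrative assumptions; the cited work also covers optimal convex combinations of the candidates.

```python
import numpy as np
from sklearn.linear_model import SGDRegressor

class OnlineCVSelector:
    """Discrete online selector: score each candidate on the new batch first
    (the online cross-validation step), then train all candidates on it."""

    def __init__(self, learners):
        self.learners = learners
        self.cv_loss = np.zeros(len(learners))
        self.n_seen = 0

    def update(self, X, y):
        if self.n_seen > 0:                       # score on data not yet trained on
            for k, m in enumerate(self.learners):
                self.cv_loss[k] += np.mean((m.predict(X) - y) ** 2)
        for m in self.learners:                   # then train on the batch
            m.partial_fit(X, y)
        self.n_seen += len(y)

    def predict(self, X):
        return self.learners[int(np.argmin(self.cv_loss))].predict(X)

rng = np.random.default_rng(5)
sel = OnlineCVSelector([SGDRegressor(alpha=a) for a in (1e-4, 1e-2, 1.0)])
for _ in range(50):                               # 50 streaming batches
    X = rng.normal(size=(32, 3))
    y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=32)
    sel.update(X, y)
print(sel.cv_loss)
```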

  16. Sequential ensemble-based optimal design for parameter estimation

    SciTech Connect

    Man, Jun; Zhang, Jiangjiang; Li, Weixuan; Zeng, Lingzao; Wu, Laosheng

    2016-10-01

    The ensemble Kalman filter (EnKF) has been widely used in parameter estimation for hydrological models. The focus of most previous studies was to develop more efficient analysis (estimation) algorithms. On the other hand, it is intuitively understandable that a well-designed sampling (data-collection) strategy should provide more informative measurements and subsequently improve the parameter estimation. In this work, a Sequential Ensemble-based Optimal Design (SEOD) method, coupled with EnKF, information theory and sequential optimal design, is proposed to improve the performance of parameter estimation. Based on the first-order and second-order statistics, different information metrics including the Shannon entropy difference (SD), degrees of freedom for signal (DFS) and relative entropy (RE) are used to design the optimal sampling strategy, respectively. The effectiveness of the proposed method is illustrated by synthetic one-dimensional and two-dimensional unsaturated flow case studies. It is shown that the designed sampling strategies can provide more accurate parameter estimation and state prediction compared with conventional sampling strategies. Optimal sampling designs based on various information metrics perform similarly in our cases. The effect of ensemble size on the optimal design is also investigated. Overall, larger ensemble size improves the parameter estimation and convergence of optimal sampling strategy. Although the proposed method is applied to unsaturated flow problems in this study, it can be equally applied in any other hydrological problems.

  17. Argumentation based joint learning: a novel ensemble learning approach.

    PubMed

    Xu, Junyi; Yao, Li; Li, Le

    2015-01-01

    Recently, ensemble learning methods have been widely used to improve classification performance in machine learning. In this paper, we present a novel ensemble learning method: argumentation based multi-agent joint learning (AMAJL), which integrates ideas from multi-agent argumentation, ensemble learning, and association rule mining. In AMAJL, argumentation technology is introduced as an ensemble strategy to integrate multiple base classifiers and generate a high performance ensemble classifier. We design an argumentation framework named Arena as a communication platform for knowledge integration. Through argumentation based joint learning, high quality individual knowledge can be extracted, and thus a refined global knowledge base can be generated and used independently for classification. We perform numerous experiments on multiple public datasets using AMAJL and other benchmark methods. The results demonstrate that our method can effectively extract high quality knowledge for ensemble classifier and improve the performance of classification.

  18. Ensemble control of Kondo screening in molecular adsorbates

    DOE PAGES

    Maughan, Bret; Zahl, Percy; Sutter, Peter; ...

    2017-04-06

    Switching the magnetic properties of organic semiconductors on a metal surface has thus far largely been limited to molecule-by-molecule tip-induced transformations in scanned probe experiments. Here we demonstrate with molecular resolution that collective control of activated Kondo screening can be achieved in thin films of the organic semiconductor titanyl phthalocyanine on Cu(110) to obtain tunable concentrations of Kondo impurities. Using low-temperature scanning tunneling microscopy and spectroscopy, we show that a thermally activated molecular distortion dramatically shifts surface–molecule coupling and enables ensemble-level control of Kondo screening in the interfacial spin system. This is accompanied by the formation of a temperature-dependent Abrikosov–Suhl–Kondo resonance in the local density of states of the activated molecules, which enables coverage-dependent control over activation to the Kondo screening state. Our study thus advances the versatility of molecular switching for Kondo physics and opens new avenues for scalable bottom-up tailoring of the electronic structure and magnetic texture of organic semiconductor interfaces at the nanoscale.

  19. A simple method to improve ensemble-based ozone forecasts

    NASA Astrophysics Data System (ADS)

    Pagowski, M.; Grell, G. A.; McKeen, S. A.; Dévényi, D.; Wilczak, J. M.; Bouchet, V.; Gong, W.; McHenry, J.; Peckham, S.; McQueen, J.; Moffet, R.; Tang, Y.

    2005-04-01

    Forecasts from seven air quality models and ozone data collected over the eastern USA and southern Canada during July and August 2004 are used in creating a simple method to improve ensemble-based forecasts of maximum daily 1-hr and 8-hr averaged ozone concentrations. The method minimizes least-square error of ensemble forecasts by assigning weights for its members. The real-time ozone (O3) forecasts from this ensemble of models are statistically evaluated against the ozone observations collected for the AIRNow database comprising more than 350 stations. Application of this method is shown to significantly improve overall statistics (e.g., bias, root mean square error, and index of agreement) of the weighted ensemble compared to the averaged ensemble or any individual ensemble member. If a sufficient number of observations is available, we recommend that weights be calculated daily; if not, a longer training phase will still provide a positive benefit.
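_
    A minimal sketch of the weighting step described above: given a training window of member forecasts and the corresponding observations, the member weights minimizing the least-squares error of the weighted ensemble come directly from a linear least-squares fit. The synthetic data, number of members, and absence of constraints on the weights are assumptions for illustration.

```python
import numpy as np

def fit_member_weights(forecasts, observations):
    """Least-squares weights for an ensemble of air-quality forecasts.

    forecasts:    (n_samples, n_members) member forecasts (e.g. daily max 8-hr
                  ozone at the monitoring stations during the training period).
    observations: (n_samples,) corresponding observed concentrations.

    Minimizes || forecasts @ w - observations ||^2, the criterion described in
    the abstract; no constraints (e.g. weights summing to one) are imposed here.
    """
    w, *_ = np.linalg.lstsq(forecasts, observations, rcond=None)
    return w

rng = np.random.default_rng(1)
truth = rng.normal(60.0, 15.0, size=200)                        # synthetic ozone, ppb
members = truth[:, None] + rng.normal(0.0, 8.0, size=(200, 7))  # 7 noisy "models"
w = fit_member_weights(members, truth)
print(w, np.abs(members @ w - truth).mean())
```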

  20. Identifying ultrasound and clinical features of breast cancer molecular subtypes by ensemble decision

    PubMed Central

    Zhang, Lei; Li, Jing; Xiao, Yun; Cui, Hao; Du, Guoqing; Wang, Ying; Li, Ziyao; Wu, Tong; Li, Xia; Tian, Jiawei

    2015-01-01

    Breast cancer is molecularly heterogeneous and categorized into four molecular subtypes: Luminal-A, Luminal-B, HER2-amplified and Triple-negative. In this study, we aimed to apply an ensemble decision approach to identify the ultrasound and clinical features related to the molecular subtypes. We collected ultrasound and clinical features from 1,000 breast cancer patients and performed immunohistochemistry on these samples. We used the ensemble decision approach to select unique features and to construct decision models. The decision model for Luminal-A subtype was constructed based on the presence of an echogenic halo and post-acoustic shadowing or indifference. The decision model for Luminal-B subtype was constructed based on the absence of an echogenic halo and vascularity. The decision model for HER2-amplified subtype was constructed based on the presence of post-acoustic enhancement, calcification, vascularity and advanced age. The model for Triple-negative subtype followed two rules. One was based on irregular shape, lobulate margin contour, the absence of calcification and hypovascularity, whereas the other was based on oval shape, hypovascularity and micro-lobulate margin contour. The accuracies of the models were 83.8%, 77.4%, 87.9% and 92.7%, respectively. We identified specific features of each molecular subtype and expanded the scope of ultrasound for making diagnoses using these decision models. PMID:26046791

  1. Coherent Radiative Decay of Molecular Rotations: A Comparative Study of Terahertz-Oriented versus Optically Aligned Molecular Ensembles

    NASA Astrophysics Data System (ADS)

    Damari, Ran; Rosenberg, Dina; Fleischer, Sharly

    2017-07-01

    The decay of field-free rotational dynamics is experimentally studied by two complementary methods: laser-induced molecular alignment and terahertz-field-induced molecular orientation. A comparison between the decay rates of different molecular species at various gas pressures reveals that oriented molecular ensembles decay faster than aligned ensembles. The discrepancy in decay rates is attributed to the coherent radiation emitted by the transiently oriented ensembles and is absent from aligned molecules. The experimental results reveal the dramatic contribution of coherent radiative emission to the observed decay of rotational dynamics and underline a general phenomenon expected whenever field-free coherent dipole oscillations are induced.

  2. Coherent Radiative Decay of Molecular Rotations: A Comparative Study of Terahertz-Oriented versus Optically Aligned Molecular Ensembles.

    PubMed

    Damari, Ran; Rosenberg, Dina; Fleischer, Sharly

    2017-07-21

    The decay of field-free rotational dynamics is experimentally studied by two complementary methods: laser-induced molecular alignment and terahertz-field-induced molecular orientation. A comparison between the decay rates of different molecular species at various gas pressures reveals that oriented molecular ensembles decay faster than aligned ensembles. The discrepancy in decay rates is attributed to the coherent radiation emitted by the transiently oriented ensembles and is absent from aligned molecules. The experimental results reveal the dramatic contribution of coherent radiative emission to the observed decay of rotational dynamics and underline a general phenomenon expected whenever field-free coherent dipole oscillations are induced.

  3. Protein Remote Homology Detection Based on an Ensemble Learning Approach.

    PubMed

    Chen, Junjie; Liu, Bingquan; Huang, Dong

    2016-01-01

    Protein remote homology detection is one of the central problems in bioinformatics. Although some computational methods have been proposed, the problem is still far from being solved. In this paper, an ensemble classifier for protein remote homology detection, called SVM-Ensemble, was proposed with a weighted voting strategy. SVM-Ensemble combined three basic classifiers based on different feature spaces, including Kmer, ACC, and SC-PseAAC. These features consider the characteristics of proteins from various perspectives, incorporating both the sequence composition and the sequence-order information along the protein sequences. Experimental results on a widely used benchmark dataset showed that the proposed SVM-Ensemble can obviously improve the predictive performance for the protein remote homology detection. Moreover, it achieved the best performance and outperformed other state-of-the-art methods.

  4. Quantum Ensemble Classification: A Sampling-Based Learning Control Approach.

    PubMed

    Chen, Chunlin; Dong, Daoyi; Qi, Bo; Petersen, Ian R; Rabitz, Herschel

    2017-06-01

    Quantum ensemble classification (QEC) has significant applications in discrimination of atoms (or molecules), separation of isotopes, and quantum information extraction. However, quantum mechanics forbids deterministic discrimination among nonorthogonal states. The classification of inhomogeneous quantum ensembles is very challenging, since there exist variations in the parameters characterizing the members within different classes. In this paper, we recast QEC as a supervised quantum learning problem. A systematic classification methodology is presented by using a sampling-based learning control (SLC) approach for quantum discrimination. The classification task is accomplished via simultaneously steering members belonging to different classes to their corresponding target states (e.g., mutually orthogonal states). First, a new discrimination method is proposed for two similar quantum systems. Then, an SLC method is presented for QEC. Numerical results demonstrate the effectiveness of the proposed approach for the binary classification of two-level quantum ensembles and the multiclass classification of multilevel quantum ensembles.

  5. Building Ensemble-Based Data Assimilation Systems with Coupled Models

    NASA Astrophysics Data System (ADS)

    Nerger, Lars

    2017-04-01

    We discuss the construction of programs for efficient ensemble data assimilation systems based on a direct connection between a coupled simulation model and ensemble data assimilation software. The strategy allows us to set up a data assimilation program with high flexibility and parallel scalability with only small changes to the model. The direct connection is obtained by first extending the source code of the coupled model so that it is able to run an ensemble of model states. In addition, a filtering step is added using a combination of in-memory access and parallel communication to create an online-coupled ensemble assimilation program. The direct connection avoids the common need to stop and restart a whole coupled model system to perform the assimilation of observations in the analysis step of ensemble-based filter methods like ensemble Kalman or particle filters. Instead, the analysis step is performed in between time steps and is independent of the actual model coupler. This strategy allows us to perform both in-compartment (for weakly coupled assimilation) and cross-compartment (for strongly coupled assimilation) assimilation. The assimilation frequency can be kept flexible, so that assimilation of observations from different compartments can be performed at different time intervals. Using the parallel data assimilation framework (PDAF, http://pdaf.awi.de), the direct connection strategy will be exemplified for the ocean-atmosphere model ECHAM6-FESOM.

  6. Multiphysics ensemble-based modelling of an alpine snowpack

    NASA Astrophysics Data System (ADS)

    Lafaysse, Matthieu; Cluzet, Bertrand; Dumont, Marie; Lejeune, Yves; Vionnet, Vincent; Morin, Samuel

    2017-04-01

    Physically based multilayer snowpack models suffer from various modelling errors. It is necessary to quantify these errors in various applications including ensemble forecasting of snowpack conditions and ensemble assimilation of snowpack observations. We present here the new multi-physical ensemble system ESCROC (Ensemble System Crocus) which describes the uncertainties of snowpack modelling by new representations of different physical processes in the deterministic coupled multi-layer ground/snowpack model SURFEX/ISBA/Crocus, including 3 different options for snow metamorphism among others. This ensemble was driven and evaluated at Col de Porte (1325 m a.s.l., French Alps) over 18 years with a high quality meteorological and snow dataset. 7776 simulations were evaluated separately, accounting for the uncertainties of the evaluation data. The ability of the ensemble to capture the uncertainty associated with modelling errors is assessed with probabilistic tools for snow depth, snow water equivalent, bulk density, albedo and surface temperature. Results show that optimal members of the ESCROC system are able to explain about 2/3 of the total simulation errors. The 3 different options of snow metamorphism can exhibit a similar skill for the evaluated variables, with a high dependency of results on the options chosen for the other physical processes (compaction, liquid water percolation, solar radiation absorption, turbulent fluxes, etc.). ESCROC is a promising system to integrate numerical snow modelling errors in ensemble forecasting and ensemble assimilation systems in support of avalanche hazard forecasting and other snowpack modelling applications. It may benefit from any future improvement in the uncertainty quantification of the modelling of each specific physical process, such as snow metamorphism modelling.

  7. Ensemble Sampling vs. Time Sampling in Molecular Dynamics Simulations of Thermal Conductivity

    SciTech Connect

    Gordiz, Kiarash; Singh, David J.; Henry, Asegun

    2015-01-29

    In this report we compare time sampling and ensemble averaging as two different methods available for phase space sampling. For the comparison, we calculate thermal conductivities of solid argon and silicon structures, using equilibrium molecular dynamics. We introduce two different schemes for the ensemble averaging approach, and show that both can reduce the total simulation time as compared to time averaging. It is also found that velocity rescaling is an efficient mechanism for phase space exploration. Although our methodology is tested using classical molecular dynamics, the ensemble generation approaches may find their greatest utility in computationally expensive simulations such as first principles molecular dynamics. For such simulations, where each time step is costly, time sampling can require long simulation times because each time step must be evaluated sequentially and therefore phase space averaging is achieved through sequential operations. On the other hand, with ensemble averaging, phase space sampling can be achieved through parallel operations, since each ensemble is independent. For this reason, particularly when using massively parallel architectures, ensemble sampling can result in much shorter simulation times and exhibits similar overall computational effort.
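
    The sketch below contrasts the two averaging strategies in the simplest possible form: a time-origin average over one long heat-flux trajectory versus an average over independent, shorter ensemble members. Array shapes, function names, and the omission of the Green-Kubo prefactor (volume, temperature, unit conversions) are deliberate simplifications.

```python
import numpy as np

def hcacf_time_average(flux, n_lags):
    """Heat-flux autocorrelation from a single long trajectory,
    averaged over time origins (the conventional 'time sampling' route)."""
    flux = flux - flux.mean()
    n = len(flux)
    return np.array([np.mean(flux[:n - lag] * flux[lag:]) for lag in range(n_lags)])

def hcacf_ensemble_average(flux_members, n_lags):
    """Autocorrelation averaged over independent (shorter) ensemble members.

    flux_members: (n_members, n_steps) heat-flux samples, one row per run.
    Each member contributes one correlation estimate; members can be generated
    and run in parallel, which is the point made in the abstract."""
    return np.mean([hcacf_time_average(m, n_lags) for m in flux_members], axis=0)

rng = np.random.default_rng(6)
long_run = rng.normal(size=100_000)                 # stand-in for a heat-flux time series
members = rng.normal(size=(20, 5_000))              # 20 independent shorter runs
print(hcacf_time_average(long_run, 5), hcacf_ensemble_average(members, 5))
```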

  8. Ensemble Sampling vs. Time Sampling in Molecular Dynamics Simulations of Thermal Conductivity

    DOE PAGES

    Gordiz, Kiarash; Singh, David J.; Henry, Asegun

    2015-01-29

    In this report we compare time sampling and ensemble averaging as two different methods available for phase space sampling. For the comparison, we calculate thermal conductivities of solid argon and silicon structures, using equilibrium molecular dynamics. We introduce two different schemes for the ensemble averaging approach, and show that both can reduce the total simulation time as compared to time averaging. It is also found that velocity rescaling is an efficient mechanism for phase space exploration. Although our methodology is tested using classical molecular dynamics, the ensemble generation approaches may find their greatest utility in computationally expensive simulations such as first principles molecular dynamics. For such simulations, where each time step is costly, time sampling can require long simulation times because each time step must be evaluated sequentially and therefore phase space averaging is achieved through sequential operations. On the other hand, with ensemble averaging, phase space sampling can be achieved through parallel operations, since each ensemble is independent. For this reason, particularly when using massively parallel architectures, ensemble sampling can result in much shorter simulation times and exhibits similar overall computational effort.

  9. Large margin classifier-based ensemble tracking

    NASA Astrophysics Data System (ADS)

    Wang, Yuru; Liu, Qiaoyuan; Yin, Minghao; Wang, ShengSheng

    2016-07-01

    In recent years, many studies consider visual tracking as a two-class classification problem. The key problem is to construct a classifier with sufficient accuracy in distinguishing the target from its background and sufficient generalization ability in handling new frames. However, variable tracking conditions challenge the existing methods. The difficulty mainly comes from the confused boundary between the foreground and background. This paper handles this difficulty by generalizing the classifier's learning step. By introducing the distribution data of samples, the classifier learns more essential characteristics in discriminating the two classes. Specifically, the samples are represented in a multiscale visual model. For features with different scales, several large margin distribution machines (LDMs) with adaptive kernels are combined in a Bayesian way as a strong classifier. In the learning step, not only the margin distance but also the sample distribution is optimized in order to improve the accuracy and generalization ability. Comprehensive experiments are performed on several challenging video sequences; through parameter analysis and field comparison, the proposed LDM-combined ensemble tracker is demonstrated to perform with sufficient accuracy and generalization ability in handling various typical tracking difficulties.

  10. Verification of the Forecast Errors Based on Ensemble Spread

    NASA Astrophysics Data System (ADS)

    Vannitsem, S.; Van Schaeybroeck, B.

    2014-12-01

    The use of ensemble prediction systems allows for an uncertainty estimation of the forecast. Most end users do not require all the information contained in an ensemble and prefer the use of a single uncertainty measure. This measure is the ensemble spread, which serves to forecast the forecast error. It is, however, unclear how the quality of these forecasts can best be assessed based on spread and forecast error only. The spread-error verification is intricate for two reasons: first, for each probabilistic forecast only one verifying observation is available, and second, the spread is not meant to provide an exact prediction of the error. Despite these facts, several advances were recently made, all based on traditional deterministic verification of the error forecast. In particular, Grimit and Mass (2007) and Hopson (2014) considered in detail the strengths and weaknesses of the spread-error correlation, while Christensen et al. (2014) developed a proper-score extension of the mean squared error. However, due to the strong variance of the error given a certain spread, the error forecast should preferably be considered as probabilistic in nature. In the present work, different probabilistic error models are proposed depending on the spread-error metrics used. Most of these models allow for the discrimination of a perfect forecast from an imperfect one, independent of the underlying ensemble distribution. The new spread-error scores are tested on the ensemble prediction system of the European Centre for Medium-Range Weather Forecasts (ECMWF) over Europe and Africa. References: Christensen, H. M., Moroz, I. M. and Palmer, T. N., 2014: Evaluation of ensemble forecast uncertainty using a new proper score: application to medium-range and seasonal forecasts. In press, Quarterly Journal of the Royal Meteorological Society. Grimit, E. P., and C. F. Mass, 2007: Measuring the ensemble spread-error relationship with a probabilistic approach: Stochastic ensemble results. Mon. Wea. Rev., 135, 203

  11. Representing rainfall uncertainties using radar ensembles: generation of radar based rainfall ensembles for QPE and QPF

    NASA Astrophysics Data System (ADS)

    Sempere-Torres, D.; Llort, X.; Roca, J.; Pegram, G.

    2009-04-01

    In recent years, improved understanding of the physics underlying radar measurements, together with new technological advances, has allowed the radar community to propose better algorithms and methodologies, and significant progress has been achieved in improving Quantitative Precipitation Estimates (QPE) and Quantitative Precipitation Forecasts (QPF) by radar. Thus the study of the 2D uncertainty field associated with these estimates has become an important subject, especially to enhance the use of radar QPE and QPF in hydrological studies, as well as in providing a reference for satellite precipitation measurements. In this context the use of radar-based rainfall ensembles (i.e. equiprobable rainfall field scenarios generated to be compatible with the observations/forecasts and with the inferred structure of the uncertainties) has been seen as an extremely interesting tool to represent their associated uncertainties. The generation of such radar ensembles first requires the full characterization of the 3D field of associated uncertainties (2D spatial plus temporal), since rainfall estimates show an error structure highly correlated in space and time. A full methodology to deal with this kind of radar-based rainfall ensembles is presented. Given a rainfall event, the 2D uncertainty fields associated with the radar estimates are defined for every time step using a benchmark, or reference field, based on the best available estimate of the rainfall field. This benchmark is built using an advanced nonparametric interpolation of a dense rain gauge network that is able to use the spatial structure provided by the radar observations, and is confined to the region in which this combination can be taken as a reference measurement (Velasco-Forero et al. 2008, doi:10.1016/j.advwatres.2008.10.004). Then the spatial and temporal structures of these uncertainty fields are characterized and a methodology to generate consistent multiple realisations of them is used to generate the

  12. A Link-Based Approach to the Cluster Ensemble Problem.

    PubMed

    Iam-On, Natthakan; Boongoen, Tossapon; Garrett, Simon; Price, Chris

    2011-12-01

    Cluster ensembles have recently emerged as a powerful alternative to standard cluster analysis, aggregating several input data clusterings to generate a single output clustering, with improved robustness and stability. From the early work, these techniques held great promise; however, most of them generate the final solution based on incomplete information of a cluster ensemble. The underlying ensemble-information matrix reflects only cluster-data point relations, while those among clusters are generally overlooked. This paper presents a new link-based approach to improve the conventional matrix. It achieves this using the similarity between clusters that are estimated from a link network model of the ensemble. In particular, three new link-based algorithms are proposed for the underlying similarity assessment. The final clustering result is generated from the refined matrix using two different consensus functions of feature-based and graph-based partitioning. This approach is the first to address and explicitly employ the relationship between input partitions, which has not been emphasized by recent studies of matrix refinement. The effectiveness of the link-based approach is empirically demonstrated over 10 data sets (synthetic and real) and three benchmark evaluation measures. The results suggest the new approach is able to efficiently extract information embedded in the input clusterings, and regularly illustrate higher clustering quality in comparison to several state-of-the-art techniques.

  13. Sampling-based ensemble segmentation against inter-operator variability

    NASA Astrophysics Data System (ADS)

    Huo, Jing; Okada, Kazunori; Pope, Whitney; Brown, Matthew

    2011-03-01

    Inconsistency and a lack of reproducibility are commonly associated with semi-automated segmentation methods. In this study, we developed an ensemble approach to improve reproducibility and applied it to glioblastoma multiforme (GBM) brain tumor segmentation on T1-weighted contrast-enhanced MR volumes. The proposed approach combines sampling-based simulations and ensemble segmentation into a single framework; it generates a set of segmentations by perturbing user initialization and user-specified internal parameters, then fuses the set of segmentations into a single consensus result. Three combination algorithms were applied: majority voting, averaging, and expectation-maximization (EM). The reproducibility of the proposed framework was evaluated by a controlled experiment on 16 tumor cases from a multicenter drug trial. The ensemble framework had significantly better reproducibility than the individual base Otsu thresholding method (p < .001).
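
    A minimal sketch of the majority-voting fusion step (one of the three combination rules mentioned above): binary masks produced from perturbed initializations and parameters are stacked, and a pixel is kept if at least half of the runs label it as tumor. The shapes and the 0.5 threshold are illustrative assumptions.

```python
import numpy as np

def majority_vote_fusion(masks):
    """Fuse binary segmentations from perturbed runs into one consensus mask.

    masks: (n_runs, H, W) arrays of 0/1 labels, one per perturbed segmentation.
    Simple majority voting; the paper also evaluates averaging and EM-based fusion.
    """
    masks = np.asarray(masks)
    return (masks.mean(axis=0) >= 0.5).astype(np.uint8)

rng = np.random.default_rng(7)
runs = (rng.random((9, 64, 64)) < 0.3).astype(np.uint8)   # toy stand-in for 9 segmentations
print(majority_vote_fusion(runs).sum())
```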

  14. Wang-Landau Reaction Ensemble Method: Simulation of Weak Polyelectrolytes and General Acid-Base Reactions.

    PubMed

    Landsgesell, Jonas; Holm, Christian; Smiatek, Jens

    2017-02-14

    We present a novel method for the study of weak polyelectrolytes and general acid-base reactions in molecular dynamics and Monte Carlo simulations. The approach combines the advantages of the reaction ensemble and the Wang-Landau sampling method. Deprotonation and protonation reactions are simulated explicitly with the help of the reaction ensemble method, while the accurate sampling of the corresponding phase space is achieved by the Wang-Landau approach. The combination of both techniques provides a sufficient statistical accuracy such that meaningful estimates for the density of states and the partition sum can be obtained. With regard to these estimates, several thermodynamic observables like the heat capacity or reaction free energies can be calculated. We demonstrate that the computation times for the calculation of titration curves with a high statistical accuracy can be significantly decreased when compared to the original reaction ensemble method. The applicability of our approach is validated by the study of weak polyelectrolytes and their thermodynamic properties.
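
    The Wang-Landau ingredient on its own can be illustrated with a toy flat-histogram run on a 1D periodic Ising chain, as sketched below; the reaction ensemble moves (explicit protonation/deprotonation) of the combined method are deliberately omitted, and all parameters are illustrative choices.

```python
import numpy as np

def wang_landau_ising(n_spins=12, ln_f_final=1e-3, flatness=0.8, seed=0):
    """Minimal Wang-Landau estimate of the density of states ln g(E)
    for a 1D periodic Ising chain (J = 1). Only the flat-histogram part of the
    combined method in the abstract is illustrated; reaction moves are omitted."""
    rng = np.random.default_rng(seed)
    spins = rng.choice([-1, 1], size=n_spins)

    def energy(s):
        return int(-np.sum(s * np.roll(s, 1)))

    # allowed energies of the periodic chain (even number of domain walls)
    levels = np.arange(-n_spins, n_spins + 1, 4)
    index = {int(e): i for i, e in enumerate(levels)}
    ln_g = np.zeros(len(levels))
    hist = np.zeros(len(levels))
    ln_f, e = 1.0, energy(spins)

    while ln_f > ln_f_final:
        for _ in range(1000 * n_spins):
            i = rng.integers(n_spins)
            spins[i] *= -1
            e_new = energy(spins)
            # accept with probability min(1, g(E_old)/g(E_new))
            if rng.random() < np.exp(min(0.0, ln_g[index[e]] - ln_g[index[e_new]])):
                e = e_new
            else:
                spins[i] *= -1          # reject: undo the spin flip
            ln_g[index[e]] += ln_f
            hist[index[e]] += 1
        visited = hist[hist > 0]
        if visited.min() > flatness * visited.mean():
            ln_f /= 2.0                 # refine the modification factor
            hist[:] = 0.0
    return levels, ln_g - ln_g.min()

levels, ln_g = wang_landau_ising()
print(dict(zip(levels.tolist(), np.round(ln_g, 2))))
```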

  15. A method of determining RNA conformational ensembles using structure-based calculations of residual dipolar couplings

    NASA Astrophysics Data System (ADS)

    Borkar, Aditi N.; De Simone, Alfonso; Montalvao, Rinaldo W.; Vendruscolo, Michele

    2013-06-01

    We describe a method of determining the conformational fluctuations of RNA based on the incorporation of nuclear magnetic resonance (NMR) residual dipolar couplings (RDCs) as replica-averaged structural restraints in molecular dynamics simulations. In this approach, the alignment tensor required to calculate the RDCs corresponding to a given conformation is estimated from its shape, and multiple replicas of the RNA molecule are simulated simultaneously to reproduce in silico the ensemble-averaging procedure performed in the NMR measurements. We provide initial evidence that with this approach it is possible to determine accurately structural ensembles representing the conformational fluctuations of RNA by applying the reference ensemble test to the trans-activation response element of the human immunodeficiency virus type 1.

  16. Bayesian Energy Landscape Tilting: Towards Concordant Models of Molecular Ensembles

    PubMed Central

    Beauchamp, Kyle A.; Pande, Vijay S.; Das, Rhiju

    2014-01-01

    Predicting biological structure has remained challenging for systems such as disordered proteins that take on myriad conformations. Hybrid simulation/experiment strategies have been undermined by difficulties in evaluating errors from computational model inaccuracies and data uncertainties. Building on recent proposals from maximum entropy theory and nonequilibrium thermodynamics, we address these issues through a Bayesian energy landscape tilting (BELT) scheme for computing Bayesian hyperensembles over conformational ensembles. BELT uses Markov chain Monte Carlo to directly sample maximum-entropy conformational ensembles consistent with a set of input experimental observables. To test this framework, we apply BELT to model trialanine, starting from disagreeing simulations with the force fields ff96, ff99, ff99sbnmr-ildn, CHARMM27, and OPLS-AA. BELT incorporation of limited chemical shift and 3J measurements gives convergent values of the peptide’s α, β, and PPII conformational populations in all cases. As a test of predictive power, all five BELT hyperensembles recover set-aside measurements not used in the fitting and report accurate errors, even when starting from highly inaccurate simulations. BELT’s principled framework thus enables practical predictions for complex biomolecular systems from discordant simulations and sparse data. PMID:24655513
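
    A drastically simplified, non-Bayesian version of the underlying idea is an exponential "tilt" of the conformational weights so that one ensemble-averaged observable matches an experimental value; the full BELT scheme instead samples hyperensembles by MCMC over many observables with error models. Everything in the sketch (single observable, root-finding bracket, names, synthetic data) is an assumption for illustration.

```python
import numpy as np
from scipy.optimize import brentq

def tilt_weights(observable, target, bracket=20.0):
    """Maximum-entropy 'tilt' of conformational weights so the ensemble average
    of a single observable matches a target value: w_i ∝ exp(lambda * x_i),
    with lambda chosen so that <x> equals the target."""
    x = np.asarray(observable, dtype=float)

    def weights(lam):
        log_w = lam * x
        log_w -= log_w.max()            # shift for numerical stability
        w = np.exp(log_w)
        return w / w.sum()

    def gap(lam):
        return float(np.dot(weights(lam), x)) - target

    lam = brentq(gap, -bracket, bracket)   # tilt strength matching the target
    return weights(lam)

rng = np.random.default_rng(2)
x = rng.normal(5.0, 1.0, size=1000)       # e.g. a predicted chemical shift per frame
w = tilt_weights(x, target=5.4)
print(float(np.dot(w, x)))                # ~5.4
```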

  17. Sequential ensemble-based optimal design for parameter estimation

    NASA Astrophysics Data System (ADS)

    Man, Jun; Zhang, Jiangjiang; Li, Weixuan; Zeng, Lingzao; Wu, Laosheng

    2016-10-01

    The ensemble Kalman filter (EnKF) has been widely used in parameter estimation for hydrological models. The focus of most previous studies was to develop more efficient analysis (estimation) algorithms. On the other hand, it is intuitively understandable that a well-designed sampling (data-collection) strategy should provide more informative measurements and subsequently improve the parameter estimation. In this work, a Sequential Ensemble-based Optimal Design (SEOD) method, coupled with EnKF, information theory and sequential optimal design, is proposed to improve the performance of parameter estimation. Based on the first-order and second-order statistics, different information metrics including the Shannon entropy difference (SD), degrees of freedom for signal (DFS) and relative entropy (RE) are used to design the optimal sampling strategy, respectively. The effectiveness of the proposed method is illustrated by synthetic one-dimensional and two-dimensional unsaturated flow case studies. It is shown that the designed sampling strategies can provide more accurate parameter estimation and state prediction compared with conventional sampling strategies. Optimal sampling designs based on various information metrics perform similarly in our cases. The effect of ensemble size on the optimal design is also investigated. Overall, larger ensemble size improves the parameter estimation and convergence of optimal sampling strategy. Although the proposed method is applied to unsaturated flow problems in this study, it can be equally applied in any other hydrological problems.

  18. Current path in light emitting diodes based on nanowire ensembles.

    PubMed

    Limbach, F; Hauswald, C; Lähnemann, J; Wölz, M; Brandt, O; Trampert, A; Hanke, M; Jahn, U; Calarco, R; Geelhaar, L; Riechert, H

    2012-11-23

    Light emitting diodes (LEDs) have been fabricated using ensembles of free-standing (In, Ga)N/GaN nanowires (NWs) grown on Si substrates in the self-induced growth mode by molecular beam epitaxy. Electron-beam-induced current analysis, cathodoluminescence as well as biased μ-photoluminescence spectroscopy, transmission electron microscopy, and electrical measurements indicate that the electroluminescence of such LEDs is governed by the differences in the individual current densities of the single-NW LEDs operated in parallel, i.e. by the inhomogeneity of the current path in the ensemble LED. In addition, the optoelectronic characterization leads to the conclusion that these NWs exhibit N-polarity and that the (In, Ga)N quantum well states in the NWs are subject to a non-vanishing quantum confined Stark effect.

  19. Knowledge-Based Methods To Train and Optimize Virtual Screening Ensembles

    PubMed Central

    2016-01-01

    Ensemble docking can be a successful virtual screening technique that addresses the innate conformational heterogeneity of macromolecular drug targets. Yet, lacking a method to identify a subset of conformational states that effectively segregates active and inactive small molecules, ensemble docking may result in the recommendation of a large number of false positives. Here, three knowledge-based methods that construct structural ensembles for virtual screening are presented. Each method selects ensembles by optimizing an objective function calculated using the receiver operating characteristic (ROC) curve: either the area under the ROC curve (AUC) or a ROC enrichment factor (EF). As the number of receptor conformations, N, becomes large, the methods differ in their asymptotic scaling. Given a set of small molecules with known activities and a collection of target conformations, the most resource-intensive method is guaranteed to find the optimal ensemble but scales as O(2^N). A recursive approximation to the optimal solution scales as O(N^2), and a more severe approximation leads to a faster method that scales linearly, O(N). The techniques are generally applicable to any system, and we demonstrate their effectiveness on the androgen nuclear hormone receptor (AR), cyclin-dependent kinase 2 (CDK2), and the peroxisome proliferator-activated receptor δ (PPAR-δ) drug targets. Conformations that consisted of a crystal structure and molecular dynamics simulation cluster centroids were used to form AR and CDK2 ensembles. Multiple available crystal structures were used to form PPAR-δ ensembles. For each target, we show that the three methods perform similarly to one another on both the training and test sets. PMID:27097522
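
    A generic greedy sketch of the kind of ensemble construction described above: conformations are added one at a time while the training AUC of the min-score ensemble improves, giving roughly O(N^2) cost. This is not the specific recursive or linear algorithm of the paper; the scoring convention, fusion by minimum score, stopping rule, and synthetic data are assumptions.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def greedy_ensemble(score_matrix, labels, max_size=5):
    """Greedy selection of receptor conformations for ensemble docking.

    score_matrix: (n_ligands, n_conformations) docking scores, lower = better.
    labels:       (n_ligands,) 1 for actives, 0 for decoys.
    A ligand's ensemble score is its best (minimum) score over the selected
    conformations; conformations are added while the training AUC improves.
    """
    n_conf = score_matrix.shape[1]
    selected, best_auc = [], 0.0
    while len(selected) < max_size:
        best_j, best_j_auc = None, best_auc
        for j in range(n_conf):
            if j in selected:
                continue
            ens_score = score_matrix[:, selected + [j]].min(axis=1)
            auc = roc_auc_score(labels, -ens_score)   # lower score = more active
            if auc > best_j_auc:
                best_j, best_j_auc = j, auc
        if best_j is None:
            break                                     # no conformation improves the AUC
        selected.append(best_j)
        best_auc = best_j_auc
    return selected, best_auc

rng = np.random.default_rng(3)
scores = rng.normal(size=(500, 12))                   # synthetic docking scores
labels = (rng.random(500) < 0.1).astype(int)          # ~10% actives
scores[labels == 1, :3] -= 1.5                        # actives dock better to confs 0-2
print(greedy_ensemble(scores, labels))
```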

  20. Hurricane ensemble prediction using EOF-based perturbations

    NASA Astrophysics Data System (ADS)

    Zhang, Zhan

    1997-10-01

    In this study, a method to generate perturbations for hurricane ensemble prediction is proposed and examined on five hurricane cases with different kinds of tracks. The model used is the Florida State University Global Spectral Model (FSUGSM) with horizontal spectral resolution of T63 and 14 vertical levels. The method proposed here is based on the premise that (a) model perturbations grow linearly during the first few days of model integration; and (b) in order to make a complete set of ensemble perturbations of hurricane forecasts, both the initial intensity and the initial position of the hurricane need to be perturbed. The initial position of the hurricane is perturbed by displacing it 50 km toward the north, south, east, and west. Fast-growing perturbations are generated by applying EOF analysis to the differences between forecasts started from the regular analysis and from a randomly perturbed analysis. The eigenmode with the largest eigenvalue is then taken as the fast-growing perturbation. The proposed perturbation method has been examined through five hurricane case studies. The results show that EOF-based perturbations are indeed optimal perturbations for hurricane ensemble forecasting compared with the Monte Carlo forecasting method. A comparison has also been made between the control experiment (a single forecast from the regular analysis) and the ensemble experiment. The results show that the predicted hurricane position errors are largely reduced by the ensemble prediction for most of the hurricane cases tested, compared with the control experiment. A higher-horizontal-resolution (T106) run is performed for one hurricane case (Andrew) to compare models of different resolution.
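
    As an illustration of how a fast-growing perturbation can be extracted, the hypothetical sketch below computes the leading EOF of an ensemble of perturbed-minus-control forecast difference fields via an SVD; it is a toy stand-in, not the FSUGSM procedure, and the data are synthetic.

```python
import numpy as np

def leading_eof(diff_fields):
    """Leading EOF of forecast-difference fields.
    diff_fields: (n_members, n_gridpoints), each row a perturbed-minus-control
    forecast difference valid at the same lead time."""
    anomalies = diff_fields - diff_fields.mean(axis=0)
    # first right-singular vector = spatial pattern with the largest variance
    _, s, vt = np.linalg.svd(anomalies, full_matrices=False)
    return vt[0], s[0]**2 / np.sum(s**2)

rng = np.random.default_rng(2)
diffs = rng.normal(size=(20, 1000))        # 20 difference fields on 1000 grid points
eof1, explained = leading_eof(diffs)
print("variance explained by EOF 1:", round(explained, 3))
```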

  1. Ensemble DFT Approach to Excited States of Strongly Correlated Molecular Systems.

    PubMed

    Filatov, Michael

    2016-01-01

    Ensemble density functional theory (DFT) is a novel time-independent formalism for obtaining excitation energies of many-body fermionic systems. A considerable advantage of ensemble DFT over the more common Kohn-Sham (KS) DFT and time-dependent DFT formalisms is that it enables one to account for strong non-dynamic electron correlation in the ground and excited states of molecular systems in a transparent and accurate fashion. Despite its positive aspects, ensemble DFT has not so far found its way into the repertoire of methods of modern computational chemistry, probably because of the perceived lack of practically affordable implementations of the theory. The spin-restricted ensemble-referenced KS (REKS) method is perhaps the first computationally feasible implementation of the ideas behind ensemble DFT which enables one to describe accurately electronic transitions in a wide class of molecular systems, including strongly correlated molecules (biradicals, molecules undergoing bond breaking/formation), extended π-conjugated systems, donor-acceptor charge transfer adducts, etc.

  2. Muscle activation described with a differential equation model for large ensembles of locally coupled molecular motors.

    PubMed

    Walcott, Sam

    2014-10-01

    Molecular motors, by turning chemical energy into mechanical work, are responsible for active cellular processes. Often groups of these motors work together to perform their biological role. Motors in an ensemble are coupled and exhibit complex emergent behavior. Although large motor ensembles can be modeled with partial differential equations (PDEs) by assuming that molecules function independently of their neighbors, this assumption is violated when motors are coupled locally. It is therefore unclear how to describe the ensemble behavior of the locally coupled motors responsible for biological processes such as calcium-dependent skeletal muscle activation. Here we develop a theory to describe locally coupled motor ensembles and apply the theory to skeletal muscle activation. The central idea is that a muscle filament can be divided into two phases: an active and an inactive phase. Dynamic changes in the relative size of these phases are described by a set of linear ordinary differential equations (ODEs). As the dynamics of the active phase are described by PDEs, muscle activation is governed by a set of coupled ODEs and PDEs, building on previous PDE models. With comparison to Monte Carlo simulations, we demonstrate that the theory captures the behavior of locally coupled ensembles. The theory also plausibly describes and predicts muscle experiments from molecular to whole muscle scales, suggesting that a micro- to macroscale muscle model is within reach.
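
    As a toy illustration of the two-phase idea, the sketch below integrates a single linear ODE for the active-phase fraction with constant activation and deactivation rates; the actual model couples such ODEs to PDEs for the crossbridge distributions, and the rates used here are hypothetical.

```python
import numpy as np
from scipy.integrate import solve_ivp

def active_fraction(t_span, k_on, k_off, a0=0.0):
    """da/dt = k_on * (1 - a) - k_off * a for the fraction a(t) of a thin
    filament in the 'active' phase, with constant (calcium-level-dependent)
    rates; the rates below are purely illustrative."""
    rhs = lambda t, a: k_on * (1.0 - a) - k_off * a
    return solve_ivp(rhs, t_span, [a0])

sol = active_fraction((0.0, 1.0), k_on=30.0, k_off=10.0)
# the steady state should approach k_on / (k_on + k_off) = 0.75
print("active fraction at t = 1:", round(sol.y[0, -1], 2))
```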

  3. Human resource recommendation algorithm based on ensemble learning and Spark

    NASA Astrophysics Data System (ADS)

    Cong, Zihan; Zhang, Xingming; Wang, Haoxiang; Xu, Hongjie

    2017-08-01

    To address the problem of “information overload” in the human resources industry, this paper proposes a human resource recommendation algorithm based on ensemble learning. The algorithm considers the characteristics and behaviour of job seekers as well as job features in a realistic business setting. The algorithm first uses two ensemble learning methods, bagging and boosting. The outputs from both learning methods are then merged to form a user interest model, from which job recommendations are generated for users. The algorithm is implemented as a parallelized recommendation system on Spark. A set of experiments has been carried out and analysed. The proposed algorithm achieves significant improvements in accuracy, recall rate and coverage compared with recommendation algorithms such as UserCF and ItemCF.
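
    A minimal, hypothetical sketch of the merging step (using scikit-learn on a single machine rather than Spark, with synthetic features, and a plain average of predicted probabilities standing in for the paper's unspecified merge rule):

```python
import numpy as np
from sklearn.ensemble import BaggingClassifier, GradientBoostingClassifier

# hypothetical job-seeker/job feature matrix and binary interaction labels
rng = np.random.default_rng(3)
X = rng.normal(size=(500, 12))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=500) > 0).astype(int)

bagging = BaggingClassifier(n_estimators=50, random_state=0).fit(X, y)
boosting = GradientBoostingClassifier(n_estimators=100, random_state=0).fit(X, y)

# merge the two learners into a single interest score and recommend the top-k items
interest = 0.5 * bagging.predict_proba(X)[:, 1] + 0.5 * boosting.predict_proba(X)[:, 1]
top_k = np.argsort(interest)[::-1][:10]
print("recommended item indices:", top_k)
```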

  4. Polyphony: superposition independent methods for ensemble-based drug discovery.

    PubMed

    Pitt, William R; Montalvão, Rinaldo W; Blundell, Tom L

    2014-09-30

    Structure-based drug design is an iterative process, following cycles of structural biology, computer-aided design, synthetic chemistry and bioassay. In favorable circumstances, this process can yield hundreds of protein-ligand crystal structures. In addition, molecular dynamics simulations are increasingly being used to further explore the conformational landscape of these complexes. Currently, methods capable of analysing ensembles of crystal structures and MD trajectories are limited and usually rely upon least-squares superposition of coordinates. Novel methodologies are described for the analysis of multiple structures of a protein. Statistical approaches that rely upon residue equivalence, but not superposition, are developed. Tasks that can be performed include the identification of hinge regions, allosteric conformational changes and transient binding sites. The approaches are tested on crystal structures of CDK2 and other CMGC protein kinases and a simulation of p38α. Known relationships between interactions and conformational changes are highlighted, and new ones are revealed. A transient but druggable allosteric pocket in CDK2 is predicted to occur under the CMGC insert. Furthermore, an evolutionarily conserved conformational link from the location of this pocket, via the αEF-αF loop, to phosphorylation sites on the activation loop is discovered. New methodologies are described and validated for the superposition-independent conformational analysis of large collections of structures or simulation snapshots of the same protein. The methodologies are encoded in a Python package called Polyphony, which is released as open source to accompany this paper [http://wrpitt.bitbucket.org/polyphony/].

  5. Stochastic dynamics of small ensembles of non-processive molecular motors: the parallel cluster model.

    PubMed

    Erdmann, Thorsten; Albert, Philipp J; Schwarz, Ulrich S

    2013-11-07

    Non-processive molecular motors have to work together in ensembles in order to generate appreciable levels of force or movement. In skeletal muscle, for example, hundreds of myosin II molecules cooperate in thick filaments. In non-muscle cells, by contrast, small groups with few tens of non-muscle myosin II motors contribute to essential cellular processes such as transport, shape changes, or mechanosensing. Here we introduce a detailed and analytically tractable model for this important situation. Using a three-state crossbridge model for the myosin II motor cycle and exploiting the assumptions of fast power stroke kinetics and equal load sharing between motors in equivalent states, we reduce the stochastic reaction network to a one-step master equation for the binding and unbinding dynamics (parallel cluster model) and derive the rules for ensemble movement. We find that for constant external load, ensemble dynamics is strongly shaped by the catch bond character of myosin II, which leads to an increase of the fraction of bound motors under load and thus to firm attachment even for small ensembles. This adaptation to load results in a concave force-velocity relation described by a Hill relation. For external load provided by a linear spring, myosin II ensembles dynamically adjust themselves towards an isometric state with constant average position and load. The dynamics of the ensembles is now determined mainly by the distribution of motors over the different kinds of bound states. For increasing stiffness of the external spring, there is a sharp transition beyond which myosin II can no longer perform the power stroke. Slow unbinding from the pre-power-stroke state protects the ensembles against detachment.
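
    As a minimal numerical companion to the one-step master equation, the sketch below propagates the distribution of the number of bound motors for constant (load-independent) binding and unbinding rates; the catch-bond and load-sharing features of the parallel cluster model are deliberately omitted, and all rates are hypothetical.

```python
import numpy as np
from scipy.linalg import expm

def bound_motor_distribution(N, k_on, k_off, t):
    """One-step master equation for the number i of bound motors in a cluster
    of N motors, with constant rates (no catch-bond load dependence)."""
    W = np.zeros((N + 1, N + 1))
    for i in range(N + 1):
        if i < N:
            W[i + 1, i] = (N - i) * k_on     # binding:   i -> i + 1
        if i > 0:
            W[i - 1, i] = i * k_off          # unbinding: i -> i - 1
        W[i, i] = -W[:, i].sum()             # columns sum to zero (probability)
    p0 = np.zeros(N + 1)
    p0[0] = 1.0                              # start fully detached
    return expm(W * t) @ p0                  # dp/dt = W p, propagated to time t

p = bound_motor_distribution(N=15, k_on=40.0, k_off=80.0, t=1.0)
print("mean number of bound motors:", round(float(np.arange(16) @ p), 2))
```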

  6. Stochastic dynamics of small ensembles of non-processive molecular motors: The parallel cluster model

    SciTech Connect

    Erdmann, Thorsten; Albert, Philipp J.; Schwarz, Ulrich S.

    2013-11-07

    Non-processive molecular motors have to work together in ensembles in order to generate appreciable levels of force or movement. In skeletal muscle, for example, hundreds of myosin II molecules cooperate in thick filaments. In non-muscle cells, by contrast, small groups with few tens of non-muscle myosin II motors contribute to essential cellular processes such as transport, shape changes, or mechanosensing. Here we introduce a detailed and analytically tractable model for this important situation. Using a three-state crossbridge model for the myosin II motor cycle and exploiting the assumptions of fast power stroke kinetics and equal load sharing between motors in equivalent states, we reduce the stochastic reaction network to a one-step master equation for the binding and unbinding dynamics (parallel cluster model) and derive the rules for ensemble movement. We find that for constant external load, ensemble dynamics is strongly shaped by the catch bond character of myosin II, which leads to an increase of the fraction of bound motors under load and thus to firm attachment even for small ensembles. This adaptation to load results in a concave force-velocity relation described by a Hill relation. For external load provided by a linear spring, myosin II ensembles dynamically adjust themselves towards an isometric state with constant average position and load. The dynamics of the ensembles is now determined mainly by the distribution of motors over the different kinds of bound states. For increasing stiffness of the external spring, there is a sharp transition beyond which myosin II can no longer perform the power stroke. Slow unbinding from the pre-power-stroke state protects the ensembles against detachment.

  7. Stochastic dynamics of small ensembles of non-processive molecular motors: The parallel cluster model

    NASA Astrophysics Data System (ADS)

    Erdmann, Thorsten; Albert, Philipp J.; Schwarz, Ulrich S.

    2013-11-01

    Non-processive molecular motors have to work together in ensembles in order to generate appreciable levels of force or movement. In skeletal muscle, for example, hundreds of myosin II molecules cooperate in thick filaments. In non-muscle cells, by contrast, small groups with few tens of non-muscle myosin II motors contribute to essential cellular processes such as transport, shape changes, or mechanosensing. Here we introduce a detailed and analytically tractable model for this important situation. Using a three-state crossbridge model for the myosin II motor cycle and exploiting the assumptions of fast power stroke kinetics and equal load sharing between motors in equivalent states, we reduce the stochastic reaction network to a one-step master equation for the binding and unbinding dynamics (parallel cluster model) and derive the rules for ensemble movement. We find that for constant external load, ensemble dynamics is strongly shaped by the catch bond character of myosin II, which leads to an increase of the fraction of bound motors under load and thus to firm attachment even for small ensembles. This adaptation to load results in a concave force-velocity relation described by a Hill relation. For external load provided by a linear spring, myosin II ensembles dynamically adjust themselves towards an isometric state with constant average position and load. The dynamics of the ensembles is now determined mainly by the distribution of motors over the different kinds of bound states. For increasing stiffness of the external spring, there is a sharp transition beyond which myosin II can no longer perform the power stroke. Slow unbinding from the pre-power-stroke state protects the ensembles against detachment.

  8. Ensemble polarimetric SAR image classification based on contextual sparse representation

    NASA Astrophysics Data System (ADS)

    Zhang, Lamei; Wang, Xiao; Zou, Bin; Qiao, Zhijun

    2016-05-01

    Polarimetric SAR image interpretation has become one of the most active topics, in which the construction of reasonable and effective image classification techniques is of key importance. Sparse representation describes the data using the most succinct sparse atoms of an over-complete dictionary, and its advantages have also been confirmed in the field of PolSAR classification. However, like any single classifier, it has shortcomings. Ensemble learning is therefore introduced: a number of different learners are trained, and their outputs are combined to obtain more accurate and robust results. This paper therefore presents a polarimetric SAR image classification method based on ensemble learning of sparse representations to achieve optimal classification.
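
    One simple way to realize an ensemble of sparse-representation classifiers, sketched hypothetically below (this is an assumed interpretation, not the authors' method), is to train each member on a random feature subset and combine their class votes; the sparse coding step uses orthogonal matching pursuit.

```python
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

def src_predict(D, atom_labels, x, n_nonzero=10):
    """Sparse-representation classification: code x over dictionary D (columns
    are training samples) and return the class whose atoms reconstruct x best."""
    omp = OrthogonalMatchingPursuit(n_nonzero_coefs=n_nonzero).fit(D, x)
    coef = omp.coef_
    residual = {c: np.linalg.norm(x - D[:, atom_labels == c] @ coef[atom_labels == c])
                for c in np.unique(atom_labels)}
    return min(residual, key=residual.get)

def ensemble_src_predict(D, atom_labels, x, n_learners=5, seed=0):
    """Majority vote over SRC learners, each built on a random feature subset."""
    rng = np.random.default_rng(seed)
    votes = []
    for _ in range(n_learners):
        rows = rng.choice(D.shape[0], size=D.shape[0] // 2, replace=False)
        votes.append(src_predict(D[rows], atom_labels, x[rows]))
    values, counts = np.unique(votes, return_counts=True)
    return values[np.argmax(counts)]

# toy demo: three classes of atoms in a 40-dimensional feature space
rng = np.random.default_rng(4)
atom_labels = np.repeat([0, 1, 2], 20)
D = rng.normal(size=(40, 60)) + atom_labels[None, :]   # class-shifted atoms
x = rng.normal(size=40) + 2.0                          # resembles class 2
print("predicted class:", ensemble_src_predict(D, atom_labels, x))
```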

  9. Ensemble-Based Assimilation of Aerosol Observations in GEOS-5

    NASA Technical Reports Server (NTRS)

    Buchard, V.; Da Silva, A.

    2016-01-01

    MERRA-2 is the latest Aerosol Reanalysis produced at NASA's Global Modeling Assimilation Office (GMAO) from 1979 to present. This reanalysis is based on a version of the GEOS-5 model radiatively coupled to GOCART aerosols and includes assimilation of bias corrected Aerosol Optical Depth (AOD) from AVHRR over ocean, MODIS sensors on both Terra and Aqua satellites, MISR over bright surfaces and AERONET data. In order to assimilate lidar profiles of aerosols, we are updating the aerosol component of our assimilation system to an Ensemble Kalman Filter (EnKF) type of scheme using ensembles generated routinely by the meteorological assimilation. Following the work performed with the first NASA's aerosol reanalysis (MERRAero), we first validate the vertical structure of MERRA-2 aerosol assimilated fields using CALIOP data over regions of particular interest during 2008.

  10. Weighted ensemble based automatic detection of exudates in fundus photographs.

    PubMed

    Prentasic, Pavle; Loncaric, Sven

    2014-01-01

    Diabetic retinopathy (DR) is a visual complication of diabetes, which has become one of the leading causes of preventable blindness in the world. Exudate detection is an important problem in automatic screening systems for detection of diabetic retinopathy using color fundus photographs. In this paper, we present a method for detection of exudates in color fundus photographs, which combines several preprocessing and candidate extraction algorithms to increase the exudate detection accuracy. The first stage of the method consists of an ensemble of several exudate candidate extraction algorithms. In the learning phase, simulated annealing is used to determine weights for combining the results of the ensemble candidate extraction algorithms. The second stage of the method uses a machine learning-based classification for detection of exudate regions. The experimental validation was performed using the DRiDB color fundus image set. The validation has demonstrated that the proposed method achieved higher accuracy in comparison to state-of-the-art methods.
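
    The weight-learning step can be illustrated with a small, hypothetical simulated-annealing loop that adjusts the combination weights of candidate probability maps to maximize an F-score against a ground-truth mask; the schedule, scales and data below are illustrative only, not the authors' settings.

```python
import numpy as np

def f_score(pred, truth):
    tp = np.sum(pred & truth)
    fp = np.sum(pred & ~truth)
    fn = np.sum(~pred & truth)
    return 2.0 * tp / (2.0 * tp + fp + fn + 1e-9)

def anneal_weights(candidate_maps, truth, n_iter=2000, t0=1.0, seed=0):
    """Simulated annealing over the combination weights of candidate probability
    maps; the weighted sum is thresholded at 0.5 and scored against the mask."""
    rng = np.random.default_rng(seed)
    n = len(candidate_maps)

    def score(w):
        combined = sum(wi * m for wi, m in zip(w, candidate_maps))
        return f_score(combined > 0.5, truth)

    current = best = np.full(n, 1.0 / n)
    current_score = best_score = score(current)
    for k in range(n_iter):
        temp = t0 * (1.0 - k / n_iter) + 1e-3
        proposal = np.clip(current + rng.normal(scale=0.1, size=n), 0.0, None)
        proposal /= proposal.sum() + 1e-12
        s = score(proposal)
        if s > current_score or rng.random() < np.exp((s - current_score) / temp):
            current, current_score = proposal, s
            if s > best_score:
                best, best_score = proposal, s
    return best, best_score

# toy maps: noisy versions of a sparse ground-truth exudate mask
rng = np.random.default_rng(5)
truth = rng.random((64, 64)) > 0.9
maps = [np.clip(truth + rng.normal(scale=s, size=truth.shape), 0, 1)
        for s in (0.2, 0.4, 0.6)]
weights, f = anneal_weights(maps, truth)
print("weights:", np.round(weights, 2), "F-score:", round(f, 3))
```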

  11. Protonation states and conformational ensemble in ligand-based QSAR modeling.

    PubMed

    De Benedetti, Pier G

    2013-01-01

    Drug affinity and function depend on the different protonation species (present in the biological context) that generate different conformational ensembles with different structural features and, hence, different physico-chemical properties. In the present review article these strongly interdependent structural features will be considered for their crucial role in ligand-based QSAR modeling and drug design by using quantum chemical electronic/reactivity descriptors and molecular shape description. Some selected and relevant examples illustrate the role of these molecular descriptors, computed on the bioactive protonation states and conformers, as determinant factors in mechanistic/causative QSAR analysis.

  12. 2010 oil spill: trajectory projections based on ensemble drifter analyses

    NASA Astrophysics Data System (ADS)

    Chang, Yu-Lin; Oey, Leo; Xu, Fang-Hua; Lu, Hung-Fu; Fujisaki, Ayumi

    2011-06-01

    An accurate method for long-term (weeks to months) projections of oil spill trajectories based on multi-year ensemble analyses of simulated surface and subsurface ( z = -800 m) drifters released at the northern Gulf of Mexico spill site is demonstrated during the 2010 oil spill. The simulation compares well with satellite images of the actual oil spill, which show that the surface spread of oil was mainly confined to the northern shelf and slope of the Gulf of Mexico, with some (more limited) spreading over the north/northeastern face of the Loop Current, as well as northwestward toward the Louisiana-Texas shelf. At the subsurface, the ensemble projection shows drifters spreading south/southwestward, and this tendency agrees well with ADCP current measurements near the spill site during the months of May-July, which also show southward mean currents. An additional model analysis during the spill period (Apr-Jul/2010) confirms the above ensemble projection. The 2010 analysis indicates that the surface oil spread was predominantly confined to the northern Gulf shelf and slope because the 2010 wind was more southerly than climatology and because a cyclone existed north of the Loop Current, positioned to the south of the spill site.

  13. Molecular dynamics in the isothermal-isobaric ensemble: the requirement of a "shell" molecule. II. Simulation results.

    PubMed

    Uline, Mark J; Corti, David S

    2005-10-22

    The results of a series of constant pressure and temperature molecular-dynamics (MD) simulation studies based on the rigorous shell particle formulation of the isothermal-isobaric (NpT) ensemble are presented. These MD simulations validate the newly proposed constant pressure equations of motion in which a "shell" particle is used to define uniquely the volume of the system [M. J. Uline and D. S. Corti, J. Chem. Phys. (to be published), preceding paper]. Ensemble averages obtained with the new MD NpT algorithm match the ensemble averages obtained using the previously derived shell particle Monte Carlo NpT method [D. S. Corti, Mol. Phys. 100, 1887 (2002)]. In addition, we also verify that the Hoover NpT MD algorithm [W. G. Hoover, Phys. Rev. A 31, 1695 (1985); 34, 2499 (1986)] generates the correct ensemble averages, though only when periodic boundary conditions are employed. The extension of the shell particle MD algorithm to multicomponent systems is also discussed, in which we show for equilibrium properties that the identity of the shell particle is completely arbitrary when periodic boundary conditions are applied. Self-diffusion coefficients determined with the shell particle equations of motion are also identical to those obtained in other ensembles. Finally, since the mass of the shell particle is known, the system itself, and not a piston of arbitrary mass, controls the time scales for internal pressure and volume fluctuations. We therefore consider the effects of the shell particle on the dynamics of the system. Overall, the shell particle MD algorithm is an effective simulation method for studying systems exposed to a constant external pressure and may provide an advantage over other existing constant pressure approaches when developing nonequilibrium MD methods.

  14. Ensemble-based evaluation for protein structure models.

    PubMed

    Jamroz, Michal; Kolinski, Andrzej; Kihara, Daisuke

    2016-06-15

    Comparing protein tertiary structures is a fundamental procedure in structural biology and protein bioinformatics. Structure comparison is important particularly for evaluating computational protein structure models. Most model structure evaluation methods perform rigid-body superimposition of a structure model onto its crystal structure and measure the difference of the corresponding residue or atom positions between them. However, these methods neglect the intrinsic flexibility of proteins by treating the native structure as a rigid molecule. Because different parts of proteins have different levels of flexibility, for example, exposed loop regions are usually more flexible than the core region of a protein structure, disagreement of a model with the native structure needs to be evaluated differently depending on the flexibility of residues in a protein. We propose a score named FlexScore for comparing protein structures that considers the flexibility of each residue in the native state of the protein. Flexibility information may be extracted from experiments such as NMR or from molecular dynamics simulation. FlexScore considers an ensemble of conformations of a protein described as a multivariate Gaussian distribution of atomic displacements and compares a query computational model with the ensemble. We compare FlexScore with other commonly used structure similarity scores over various examples. FlexScore agrees with experts' intuitive assessment of computational models and provides information about the practical usefulness of models. Availability and implementation: https://bitbucket.org/mjamroz/flexscore. Contact: dkihara@purdue.edu. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  15. Binding energy calculations for hevein-carbohydrate interactions using expanded ensemble molecular dynamics simulations

    NASA Astrophysics Data System (ADS)

    Koppisetty, Chaitanya A. K.; Frank, Martin; Lyubartsev, Alexander P.; Nyholm, Per-Georg

    2015-01-01

    Accurate estimation of protein-carbohydrate binding energies using computational methods is a challenging task. Here we report the use of expanded ensemble molecular dynamics (EEMD) simulation with double decoupling for estimating the binding energies of hevein, a plant lectin, with its monosaccharide and disaccharide ligands GlcNAc and (GlcNAc)2, respectively. In addition to the binding energies, the enthalpy and entropy components of the binding energy are also calculated. The estimated binding energies for the hevein-carbohydrate interactions are within ±0.5 kcal/mol of the previously reported experimental binding data. For comparison, binding energies were also estimated using thermodynamic integration and molecular dynamics end-point calculations (MM/GBSA), and the expanded ensemble methodology is seen to be more accurate. To our knowledge, EEMD simulations have not previously been reported for estimating biomolecular binding energies.

  16. Ensemble ROCK Methods and Ensemble SWFM Methods for Clustering of Cross Citrus Accessions Based on Mixed Numerical and Categorical Dataset

    NASA Astrophysics Data System (ADS)

    Alvionita; Sutikno; Suharsono, A.

    2017-03-01

    Cluster analysis is a multivariate analysis technique that reduces data by classifying it. Its main purpose is to classify the objects of observation into groups based on their characteristics. Cluster analysis is used not only for numerical or categorical data but has also been developed for mixed data. Several methods exist for analyzing mixed data, such as ensemble methods and Similarity Weight and Filter Methods (SWFM). There has been considerable research on these methods, but their performance has not been compared. Therefore, this paper compares the performance of the ensemble ROCK clustering method with that of the ensemble SWFM method. These methods are used to cluster cross citrus accessions based on fruit and leaf characteristics, which involve a mixture of numerical and categorical variables. The clustering method with the best performance is determined from the ratio of the standard deviation within groups (SW) to the standard deviation between groups (SB); the method with the best performance has the smallest ratio. From the results, we find that the ensemble ROCK method performs better than the ensemble SWFM method.
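
    A minimal sketch of the SW/SB criterion is given below; the exact definitions used in the paper may differ, so this is an assumed formulation for illustration, applied to synthetic numerical data.

```python
import numpy as np

def sw_sb_ratio(X, labels):
    """Ratio of the average within-cluster standard deviation (SW) to the
    standard deviation of the cluster centroids (SB); a smaller ratio
    indicates more compact, better-separated clusters."""
    clusters = np.unique(labels)
    centroids = np.array([X[labels == c].mean(axis=0) for c in clusters])
    sw = np.mean([X[labels == c].std(axis=0).mean() for c in clusters])
    sb = centroids.std(axis=0).mean()
    return sw / sb

# toy comparison of two candidate partitions of the same data
rng = np.random.default_rng(11)
X = np.vstack([rng.normal(loc=m, size=(30, 4)) for m in (0.0, 3.0, 6.0)])
good = np.repeat([0, 1, 2], 30)
bad = rng.integers(0, 3, size=90)
print("good partition:", round(sw_sb_ratio(X, good), 2),
      "random partition:", round(sw_sb_ratio(X, bad), 2))
```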

  17. Assessing an ensemble docking-based virtual screening strategy for kinase targets by considering protein flexibility.

    PubMed

    Tian, Sheng; Sun, Huiyong; Pan, Peichen; Li, Dan; Zhen, Xuechu; Li, Youyong; Hou, Tingjun

    2014-10-27

    In this study, to accommodate receptor flexibility, based on multiple receptor conformations, a novel ensemble docking protocol was developed by using the naïve Bayesian classification technique, and it was evaluated in terms of the prediction accuracy of docking-based virtual screening (VS) of three important targets in the kinase family: ALK, CDK2, and VEGFR2. First, for each target, the representative crystal structures were selected by structural clustering, and the capability of molecular docking based on each representative structure to discriminate inhibitors from non-inhibitors was examined. Then, for each target, 50 ns molecular dynamics (MD) simulations were carried out to generate an ensemble of the conformations, and multiple representative structures/snapshots were extracted from each MD trajectory by structural clustering. On average, the representative crystal structures outperform the representative structures extracted from MD simulations in terms of the capabilities to separate inhibitors from non-inhibitors. Finally, by using the naïve Bayesian classification technique, an integrated VS strategy was developed to combine the prediction results of molecular docking based on different representative conformations chosen from crystal structures and MD trajectories. It was encouraging to observe that the integrated VS strategy yields better performance than the docking-based VS based on any single rigid conformation. This novel protocol may provide an improvement over existing strategies to search for more diverse and promising active compounds for a target of interest.
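
    The integration step can be illustrated, under the assumption that the per-conformation docking scores are simply used as features, by training a naïve Bayes classifier on a synthetic score matrix; this is a hypothetical stand-in, not the authors' protocol.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score

# scores[i, j]: docking score of compound i in representative conformation j
rng = np.random.default_rng(6)
labels = rng.integers(0, 2, size=300)                        # 1 = known inhibitor
scores = rng.normal(size=(300, 6)) - 0.7 * labels[:, None]   # inhibitors dock better

# naive Bayes integrates the per-conformation scores into one prediction
auc = cross_val_score(GaussianNB(), scores, labels, cv=5, scoring="roc_auc").mean()
print("cross-validated ROC AUC of the integrated model:", round(auc, 3))
```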

  18. Copula Based Post-processing Method for Hydrologic Ensemble Forecast

    NASA Astrophysics Data System (ADS)

    Duan, Q.; Li, W.

    2016-12-01

    Hydrologic forecasts often contain uncertainties from model inputs, model parameters and model structure. Post-processing methods can be applied to the raw hydrologic ensemble forecasts to correct bias and spread errors. Most existing post-processing methods apply the Normal Quantile Transform (NQT) to transform the hydrologic variables to a normal distribution for convenient statistical inference. However, NQT-based algorithms suffer from several problems, such as extrapolation in the back-transform step. In this research, a copula-based post-processing method was developed. The copula function estimates the joint distribution of observation and model forecast directly, and the conditional distribution of the observation given the model forecast can then be obtained without the NQT. The proposed post-processing method was tested and compared with two other popular NQT-based methods, namely the Hydrologic Uncertainty Processor (HUP) and the General Linear Model Post-Processor (GLMPP), using the observation and simulation dataset from the Model Parameter Estimation Experiment (MOPEX) project. The results show that the drawbacks of NQT-based post-processing methods are alleviated in the proposed algorithm. Suitable conditions and suggestions for applying the copula-based post-processing method to hydrologic ensemble forecasts are also provided.
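
    A minimal Gaussian-copula sketch of the post-processing idea is shown below: the margins are handled empirically, the copula correlation is estimated in normal scores, and observation quantiles conditional on a new forecast are read back through the climatological margin. The copula family, names and data here are assumptions for illustration, not the paper's algorithm.

```python
import numpy as np
from scipy.stats import norm, rankdata

def fit_copula_correlation(obs, fcst):
    """Correlation of the Gaussian copula linking observation and forecast,
    estimated from normal scores of the empirical margins."""
    z_obs = norm.ppf(rankdata(obs) / (len(obs) + 1))
    z_fcst = norm.ppf(rankdata(fcst) / (len(fcst) + 1))
    return np.corrcoef(z_obs, z_fcst)[0, 1]

def conditional_quantiles(new_fcst, fcst_clim, obs_clim, rho, probs=(0.1, 0.5, 0.9)):
    """Observation quantiles conditional on a new forecast value, from the
    conditional Gaussian copula and the empirical climatological margins."""
    u_f = (np.searchsorted(np.sort(fcst_clim), new_fcst) + 0.5) / (len(fcst_clim) + 1)
    z_f = norm.ppf(u_f)
    z_q = rho * z_f + np.sqrt(1.0 - rho**2) * norm.ppf(probs)   # conditional normal
    return np.quantile(obs_clim, norm.cdf(z_q))                 # back to obs units

# toy climatology of paired forecasts and observations
rng = np.random.default_rng(7)
fcst = rng.gamma(2.0, 1.0, size=1000)
obs = 0.8 * fcst + rng.gamma(1.0, 0.5, size=1000)
rho = fit_copula_correlation(obs, fcst)
print("10/50/90% quantiles given a forecast of 3.0:",
      np.round(conditional_quantiles(3.0, fcst, obs, rho), 2))
```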

  19. Phthalocyanine-nanocarbon ensembles: from discrete molecular and supramolecular systems to hybrid nanomaterials.

    PubMed

    Bottari, Giovanni; de la Torre, Gema; Torres, Tomas

    2015-04-21

    Phthalocyanines (Pcs) are macrocyclic and aromatic compounds that present unique electronic features such as high molar absorption coefficients, rich redox chemistry, and photoinduced energy/electron transfer abilities that can be modulated as a function of the electronic character of their counterparts in donor-acceptor (D-A) ensembles. In this context, carbon nanostructures such as fullerenes, carbon nanotubes (CNTs), and, more recently, graphene are among the most suitable Pc "companions". Pc-C60 ensembles have long been the main actors in this field, due to the commercial availability of C60 and the well-established synthetic methods for its functionalization. As a result, many Pc-C60 architectures have been prepared, featuring different connectivities (covalent or supramolecular), intermolecular interactions (self-organized or molecularly dispersed species), and Pc HOMO/LUMO levels. All these elements provide a versatile toolbox for tuning the photophysical properties in terms of the type of process (photoinduced energy/electron transfer), the nature of the interactions between the electroactive units (through bond or space), and the kinetics of the formation/decay of the photogenerated species. Some recent trends in this field include the preparation of stimuli-responsive multicomponent systems with tunable photophysical properties and highly ordered nanoarchitectures and surface-supported systems showing high charge mobilities. A breakthrough in the Pc-nanocarbon field was the appearance of CNTs and graphene, which opened a new avenue for the preparation of intriguing photoresponsive hybrid ensembles showing light-stimulated charge separation. The scarce solubility of these 1-D and 2-D nanocarbons, together with their lower reactivity with respect to C60 stemming from their less strained sp(2) carbon networks, has not meant an insurmountable limitation for the preparation of a variety of Pc-based hybrids. These systems, which show improved

  20. Ensemble-based docking: From hit discovery to metabolism and toxicity predictions

    DOE PAGES

    Evangelista, Wilfredo; Weir, Rebecca; Ellingson, Sally; ...

    2016-07-29

    The use of ensemble-based docking for the exploration of biochemical pathways and toxicity prediction of drug candidates is described. We describe the computational engineering work necessary to enable large ensemble docking campaigns on supercomputers. We show examples where ensemble-based docking has significantly increased the number and the diversity of validated drug candidates. Finally, we illustrate how ensemble-based docking can be extended beyond hit discovery and toward providing a structural basis for the prediction of metabolism and off-target binding relevant to pre-clinical and clinical trials.

  1. Ensemble-based docking: From hit discovery to metabolism and toxicity predictions

    SciTech Connect

    Evangelista, Wilfredo; Weir, Rebecca; Ellingson, Sally; Harris, Jason B.; Kapoor, Karan; Smith, Jeremy C.; Baudry, Jerome

    2016-07-29

    The use of ensemble-based docking for the exploration of biochemical pathways and toxicity prediction of drug candidates is described. We describe the computational engineering work necessary to enable large ensemble docking campaigns on supercomputers. We show examples where ensemble-based docking has significantly increased the number and the diversity of validated drug candidates. Finally, we illustrate how ensemble-based docking can be extended beyond hit discovery and toward providing a structural basis for the prediction of metabolism and off-target binding relevant to pre-clinical and clinical trials.

  2. Pedestrian detection based on diverse margin distribution ensemble

    NASA Astrophysics Data System (ADS)

    Cheng, Fanyong; Zhang, Jing; Wen, Cuihong; Li, Zuoyong

    2016-07-01

    This paper studies the impact of the margin distribution on detection performance and proposes the Diverse Margin Distribution Ensemble (DMDE) for pedestrian detection, based on the HOG descriptor. The Large margin Distribution Machine (LDM) introduces the margin mean and margin variance: a large margin mean is associated with strong generalization performance, and a large margin variance is associated with a more balanced detection rate between the two classes. Inspired by this observation, DMDE is proposed to obtain greater robustness and balance for pedestrian detection. It is a blend of an SVM and two LDMs with different parameter orders and can aggregate the merits of the three classifiers. Experimental results show that DMDE is more robust and balanced than a single SVM or LDM for pedestrian detection.

  3. Quantum repeaters based on atomic ensembles and linear optics

    NASA Astrophysics Data System (ADS)

    Sangouard, Nicolas; Simon, Christoph; de Riedmatten, Hugues; Gisin, Nicolas

    2011-01-01

    The distribution of quantum states over long distances is limited by photon loss. Straightforward amplification as in classical telecommunications is not an option in quantum communication because of the no-cloning theorem. This problem could be overcome by implementing quantum repeater protocols, which create long-distance entanglement from shorter-distance entanglement via entanglement swapping. Such protocols require the capacity to create entanglement in a heralded fashion, to store it in quantum memories, and to swap it. One attractive general strategy for realizing quantum repeaters is based on the use of atomic ensembles as quantum memories, in combination with linear optical techniques and photon counting to perform all required operations. Here the theoretical and experimental status quo of this very active field are reviewed. The potentials of different approaches are compared quantitatively, with a focus on the most immediate goal of outperforming the direct transmission of photons.

  4. Memory imperfections in atomic-ensemble-based quantum repeaters

    NASA Astrophysics Data System (ADS)

    Brask, Jonatan Bohr; Sørensen, Anders Søndberg

    2008-07-01

    Quantum repeaters promise to deliver long-distance entanglement overcoming loss in realistic quantum channels. A promising class of repeaters, based on atomic ensemble quantum memories and linear optics, follows the proposal by L.-M. Duan et al., Nature (London) 414, 413 (2001). Here we analyze this protocol in terms of a very general model for the quantum memories employed. We derive analytical expressions for scaling of entanglement with memory imperfections, dark counts, loss, and distance, and we apply our results to two specific quantum memory protocols. Our methods apply to any quantum memory with an interaction Hamiltonian at most quadratic in the mode operators and are in principle extendible to more recent modifications of the original proposal of Duan, Lukin, Cirac, and Zoller.

  5. Uncertainty Visualization in HARDI based on Ensembles of ODFs

    PubMed Central

    Jiao, Fangxiang; Phillips, Jeff M.; Gur, Yaniv; Johnson, Chris R.

    2013-01-01

    In this paper, we propose a new and accurate technique for uncertainty analysis and uncertainty visualization based on fiber orientation distribution function (ODF) glyphs, associated with high angular resolution diffusion imaging (HARDI). Our visualization applies volume rendering techniques to an ensemble of 3D ODF glyphs, which we call SIP functions of diffusion shapes, to capture their variability due to underlying uncertainty. This rendering elucidates the complex heteroscedastic structural variation in these shapes. Furthermore, we quantify the extent of this variation by measuring the fraction of the volume of these shapes, which is consistent across all noise levels, the certain volume ratio. Our uncertainty analysis and visualization framework is then applied to synthetic data, as well as to HARDI human-brain data, to study the impact of various image acquisition parameters and background noise levels on the diffusion shapes. PMID:24466504

  6. Proteomic mass spectra classification using decision tree based ensemble methods.

    PubMed

    Geurts, Pierre; Fillet, Marianne; de Seny, Dominique; Meuwis, Marie-Alice; Malaise, Michel; Merville, Marie-Paule; Wehenkel, Louis

    2005-07-15

    Modern mass spectrometry allows the determination of proteomic fingerprints of body fluids like serum, saliva or urine. These measurements can be used in many medical applications in order to diagnose the current state or predict the evolution of a disease. Recent developments in machine learning allow one to exploit such datasets, characterized by small numbers of very high-dimensional samples. We propose a systematic approach based on decision tree ensemble methods, which is used to automatically determine proteomic biomarkers and predictive models. The approach is validated on two datasets of surface-enhanced laser desorption/ionization time of flight measurements, for the diagnosis of rheumatoid arthritis and inflammatory bowel diseases. The results suggest that the methodology can handle a broad class of similar problems.
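
    A hypothetical sketch of the approach with scikit-learn extremely randomized trees: cross-validated accuracy on a synthetic high-dimensional spectrum matrix, and feature importances as candidate biomarkers. The dataset and settings are illustrative, not those of the study.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import cross_val_score

# hypothetical SELDI-TOF-like dataset: few samples, many m/z intensity features
rng = np.random.default_rng(8)
X = rng.normal(size=(80, 5000))
y = rng.integers(0, 2, size=80)
X[y == 1, :3] += 1.0                          # three weakly informative "peaks"

forest = ExtraTreesClassifier(n_estimators=500, random_state=0)
print("cross-validated accuracy:", round(cross_val_score(forest, X, y, cv=5).mean(), 2))

forest.fit(X, y)
biomarkers = np.argsort(forest.feature_importances_)[::-1][:10]
print("candidate biomarker feature indices:", biomarkers)
```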

  7. Ensembler: Enabling High-Throughput Molecular Simulations at the Superfamily Scale.

    PubMed

    Parton, Daniel L; Grinaway, Patrick B; Hanson, Sonya M; Beauchamp, Kyle A; Chodera, John D

    2016-06-01

    The rapidly expanding body of available genomic and protein structural data provides a rich resource for understanding protein dynamics with biomolecular simulation. While computational infrastructure has grown rapidly, simulations on an omics scale are not yet widespread, primarily because software infrastructure to enable simulations at this scale has not kept pace. It should now be possible to study protein dynamics across entire (super)families, exploiting both available structural biology data and conformational similarities across homologous proteins. Here, we present a new tool for enabling high-throughput simulation in the genomics era. Ensembler takes any set of sequences-from a single sequence to an entire superfamily-and shepherds them through various stages of modeling and refinement to produce simulation-ready structures. This includes comparative modeling to all relevant PDB structures (which may span multiple conformational states of interest), reconstruction of missing loops, addition of missing atoms, culling of nearly identical structures, assignment of appropriate protonation states, solvation in explicit solvent, and refinement and filtering with molecular simulation to ensure stable simulation. The output of this pipeline is an ensemble of structures ready for subsequent molecular simulations using computer clusters, supercomputers, or distributed computing projects like Folding@home. Ensembler thus automates much of the time-consuming process of preparing protein models suitable for simulation, while allowing scalability up to entire superfamilies. A particular advantage of this approach can be found in the construction of kinetic models of conformational dynamics-such as Markov state models (MSMs)-which benefit from a diverse array of initial configurations that span the accessible conformational states to aid sampling. We demonstrate the power of this approach by constructing models for all catalytic domains in the human tyrosine kinase

  8. Ensembler: Enabling High-Throughput Molecular Simulations at the Superfamily Scale

    PubMed Central

    Parton, Daniel L.; Grinaway, Patrick B.; Hanson, Sonya M.; Beauchamp, Kyle A.; Chodera, John D.

    2016-01-01

    The rapidly expanding body of available genomic and protein structural data provides a rich resource for understanding protein dynamics with biomolecular simulation. While computational infrastructure has grown rapidly, simulations on an omics scale are not yet widespread, primarily because software infrastructure to enable simulations at this scale has not kept pace. It should now be possible to study protein dynamics across entire (super)families, exploiting both available structural biology data and conformational similarities across homologous proteins. Here, we present a new tool for enabling high-throughput simulation in the genomics era. Ensembler takes any set of sequences—from a single sequence to an entire superfamily—and shepherds them through various stages of modeling and refinement to produce simulation-ready structures. This includes comparative modeling to all relevant PDB structures (which may span multiple conformational states of interest), reconstruction of missing loops, addition of missing atoms, culling of nearly identical structures, assignment of appropriate protonation states, solvation in explicit solvent, and refinement and filtering with molecular simulation to ensure stable simulation. The output of this pipeline is an ensemble of structures ready for subsequent molecular simulations using computer clusters, supercomputers, or distributed computing projects like Folding@home. Ensembler thus automates much of the time-consuming process of preparing protein models suitable for simulation, while allowing scalability up to entire superfamilies. A particular advantage of this approach can be found in the construction of kinetic models of conformational dynamics—such as Markov state models (MSMs)—which benefit from a diverse array of initial configurations that span the accessible conformational states to aid sampling. We demonstrate the power of this approach by constructing models for all catalytic domains in the human tyrosine

  9. Sidekick for Membrane Simulations: Automated Ensemble Molecular Dynamics Simulations of Transmembrane Helices.

    PubMed

    Hall, Benjamin A; Halim, Khairul Bariyyah Abd; Buyan, Amanda; Emmanouil, Beatrice; Sansom, Mark S P

    2014-05-13

    The interactions of transmembrane (TM) α-helices with the phospholipid membrane and with one another are central to understanding the structure and stability of integral membrane proteins. These interactions may be analyzed via coarse grained molecular dynamics (CGMD) simulations. To obtain statistically meaningful analysis of TM helix interactions, large (N ca. 100) ensembles of CGMD simulations are needed. To facilitate the running and analysis of such ensembles of simulations, we have developed Sidekick, an automated pipeline software for performing high throughput CGMD simulations of α-helical peptides in lipid bilayer membranes. Through an end-to-end approach, which takes as input a helix sequence and outputs analytical metrics derived from CGMD simulations, we are able to predict the orientation and likelihood of insertion into a lipid bilayer of a given helix of a family of helix sequences. We illustrate this software via analyses of insertion into a membrane of short hydrophobic TM helices containing a single cationic arginine residue positioned at different positions along the length of the helix. From analyses of these ensembles of simulations, we estimate apparent energy barriers to insertion which are comparable to experimentally determined values. In a second application, we use CGMD simulations to examine the self-assembly of dimers of TM helices from the ErbB1 receptor tyrosine kinase and analyze the numbers of simulation repeats necessary to obtain convergence of simple descriptors of the mode of packing of the two helices within a dimer. Our approach offers a proof-of-principle platform for the further employment of automation in large ensemble CGMD simulations of membrane proteins.

  10. A Sidekick for Membrane Simulations: Automated Ensemble Molecular Dynamics Simulations of Transmembrane Helices

    PubMed Central

    Hall, Benjamin A; Halim, Khairul Abd; Buyan, Amanda; Emmanouil, Beatrice; Sansom, Mark S P

    2016-01-01

    The interactions of transmembrane (TM) α-helices with the phospholipid membrane and with one another are central to understanding the structure and stability of integral membrane proteins. These interactions may be analysed via coarse-grained molecular dynamics (CGMD) simulations. To obtain statistically meaningful analysis of TM helix interactions, large (N ca. 100) ensembles of CGMD simulations are needed. To facilitate the running and analysis of such ensembles of simulations we have developed Sidekick, an automated pipeline software for performing high throughput CGMD simulations of α-helical peptides in lipid bilayer membranes. Through an end-to-end approach, which takes as input a helix sequence and outputs analytical metrics derived from CGMD simulations, we are able to predict the orientation and likelihood of insertion into a lipid bilayer of a given helix of a family of helix sequences. We illustrate this software via analysis of insertion into a membrane of short hydrophobic TM helices containing a single cationic arginine residue positioned at different positions along the length of the helix. From analysis of these ensembles of simulations we estimate apparent energy barriers to insertion which are comparable to experimentally determined values. In a second application we use CGMD simulations to examine self-assembly of dimers of TM helices from the ErbB1 receptor tyrosine kinase, and analyse the numbers of simulation repeats necessary to obtain convergence of simple descriptors of the mode of packing of the two helices within a dimer. Our approach offers a proof-of-principle platform for the further employment of automation in large ensemble CGMD simulations of membrane proteins. PMID:26580541

  11. Ensemble-based prediction of RNA secondary structures.

    PubMed

    Aghaeepour, Nima; Hoos, Holger H

    2013-04-24

    Accurate structure prediction methods play an important role for the understanding of RNA function. Energy-based, pseudoknot-free secondary structure prediction is one of the most widely used and versatile approaches, and improved methods for this task have received much attention over the past five years. Despite the impressive progress that has been achieved in this area, existing evaluations of the prediction accuracy achieved by various algorithms do not provide a comprehensive, statistically sound assessment. Furthermore, while there is increasing evidence that no prediction algorithm consistently outperforms all others, no work has been done to exploit the complementary strengths of multiple approaches. In this work, we present two contributions to the area of RNA secondary structure prediction. Firstly, we use state-of-the-art, resampling-based statistical methods together with a previously published and increasingly widely used dataset of high-quality RNA structures to conduct a comprehensive evaluation of existing RNA secondary structure prediction procedures. The results from this evaluation clarify the performance relationship between ten well-known existing energy-based pseudoknot-free RNA secondary structure prediction methods and clearly demonstrate the progress that has been achieved in recent years. Secondly, we introduce AveRNA, a generic and powerful method for combining a set of existing secondary structure prediction procedures into an ensemble-based method that achieves significantly higher prediction accuracies than obtained from any of its component procedures. Our new, ensemble-based method, AveRNA, improves the state of the art for energy-based, pseudoknot-free RNA secondary structure prediction by exploiting the complementary strengths of multiple existing prediction procedures, as demonstrated using a state-of-the-art statistical resampling approach. In addition, AveRNA allows an intuitive and effective control of the trade-off between

  12. Ensemble-based prediction of RNA secondary structures

    PubMed Central

    2013-01-01

    Background: Accurate structure prediction methods play an important role for the understanding of RNA function. Energy-based, pseudoknot-free secondary structure prediction is one of the most widely used and versatile approaches, and improved methods for this task have received much attention over the past five years. Despite the impressive progress that has been achieved in this area, existing evaluations of the prediction accuracy achieved by various algorithms do not provide a comprehensive, statistically sound assessment. Furthermore, while there is increasing evidence that no prediction algorithm consistently outperforms all others, no work has been done to exploit the complementary strengths of multiple approaches. Results: In this work, we present two contributions to the area of RNA secondary structure prediction. Firstly, we use state-of-the-art, resampling-based statistical methods together with a previously published and increasingly widely used dataset of high-quality RNA structures to conduct a comprehensive evaluation of existing RNA secondary structure prediction procedures. The results from this evaluation clarify the performance relationship between ten well-known existing energy-based pseudoknot-free RNA secondary structure prediction methods and clearly demonstrate the progress that has been achieved in recent years. Secondly, we introduce AveRNA, a generic and powerful method for combining a set of existing secondary structure prediction procedures into an ensemble-based method that achieves significantly higher prediction accuracies than obtained from any of its component procedures. Conclusions: Our new, ensemble-based method, AveRNA, improves the state of the art for energy-based, pseudoknot-free RNA secondary structure prediction by exploiting the complementary strengths of multiple existing prediction procedures, as demonstrated using a state-of-the-art statistical resampling approach. In addition, AveRNA allows an intuitive and effective

  13. Development of Ensemble Model Based Water Demand Forecasting Model

    NASA Astrophysics Data System (ADS)

    Kwon, Hyun-Han; So, Byung-Jin; Kim, Seong-Hyeon; Kim, Byung-Seop

    2014-05-01

    The Smart Water Grid (SWG) concept has emerged globally over the last decade and has gained significant recognition in South Korea. In particular, there has been growing interest in water demand forecasting and optimal pump operation, which has led to various studies on energy saving and improvement of water supply reliability. Existing water demand forecasting models fall into two groups with respect to how they model and predict the behavior of the time series. One group considers embedded patterns such as seasonality, periodicity and trends, and the other uses autoregressive models based on short-memory Markovian processes (Emmanuel et al., 2012). The main disadvantage of the latter is that the predictability of water demand at sub-daily scales is limited because the system is nonlinear. In this regard, this study aims to develop a nonlinear ensemble model for hourly water demand forecasting that allows us to estimate uncertainties across different model classes. The proposed model consists of two parts. One is a multi-model scheme based on a combination of independent prediction models. The other is a cross-validation scheme, the bagging approach introduced by Breiman (1996), used to derive weighting factors for the individual models. The individual forecasting models used in this study are linear regression, polynomial regression, multivariate adaptive regression splines (MARS) and support vector machines (SVM). The concepts are demonstrated through application to observations from water plants at several locations in South Korea. Keywords: water demand, nonlinear model, ensemble forecasting model, uncertainty. Acknowledgements: This work is supported by the Korea Ministry of Environment under the "Projects for Developing Eco-Innovation Technologies (GT-11-G-02-001-6)".
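
    A hypothetical sketch of the multi-model combination, with scikit-learn members standing in for the paper's regression, MARS and SVM models, and cross-validated skill-based weights standing in for the bagging-derived weighting factors; the demand series is synthetic.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.svm import SVR

# hypothetical hourly demand series turned into a supervised problem:
# predict the next hour from the previous 24 hours
rng = np.random.default_rng(9)
t = np.arange(2000)
demand = 10 + 3 * np.sin(2 * np.pi * t / 24) + rng.normal(scale=0.5, size=t.size)
X = np.column_stack([demand[i:-(24 - i)] for i in range(24)])
y = demand[24:]

members = {
    "linear": LinearRegression(),
    "poly": make_pipeline(PolynomialFeatures(degree=2), LinearRegression()),
    "svr": SVR(C=10.0),
}

# weight each member by its cross-validated skill (inverse MSE) as a simple
# stand-in for the bagging-derived weighting factors described in the abstract
mse = {name: -cross_val_score(m, X, y, cv=3, scoring="neg_mean_squared_error").mean()
       for name, m in members.items()}
weights = {name: 1.0 / e for name, e in mse.items()}
total = sum(weights.values())
weights = {name: w / total for name, w in weights.items()}

for m in members.values():
    m.fit(X, y)
forecast = sum(w * members[name].predict(X[-1:])[0] for name, w in weights.items())
print("weights:", {k: round(v, 2) for k, v in weights.items()})
print("next-hour forecast:", round(forecast, 2))
```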

  14. Random Coding Bounds for DNA Codes Based on Fibonacci Ensembles of DNA Sequences

    DTIC Science & Technology

    2008-07-01

    Report dates: 6 Jul 08 – 11 Jul 08. Subject terms: DNA Codes, Fibonacci Ensembles, DNA Computing, Code Optimization. A random coding bound on the rate of DNA codes is proved. To obtain the bound, we use some ensembles of DNA sequences which are generalizations of the Fibonacci sequences.

  15. Ensemble-Biased Metadynamics: A Molecular Simulation Method to Sample Experimental Distributions.

    PubMed

    Marinelli, Fabrizio; Faraldo-Gómez, José D

    2015-06-16

    We introduce an enhanced-sampling method for molecular dynamics (MD) simulations referred to as ensemble-biased metadynamics (EBMetaD). The method biases a conventional MD simulation to sample a molecular ensemble that is consistent with one or more probability distributions known a priori, e.g., experimental intramolecular distance distributions obtained by double electron-electron resonance or other spectroscopic techniques. To this end, EBMetaD adds an adaptive biasing potential throughout the simulation that discourages sampling of configurations inconsistent with the target probability distributions. The bias introduced is the minimum necessary to fulfill the target distributions, i.e., EBMetaD satisfies the maximum-entropy principle. Unlike other methods, EBMetaD does not require multiple simulation replicas or the introduction of Lagrange multipliers, and is therefore computationally efficient and straightforward in practice. We demonstrate the performance and accuracy of the method for a model system as well as for spin-labeled T4 lysozyme in explicit water, and show how EBMetaD reproduces three double electron-electron resonance distance distributions concurrently within a few tens of nanoseconds of simulation time. EBMetaD is integrated in the open-source PLUMED plug-in (www.plumed-code.org), and can be therefore readily used with multiple MD engines.

  16. Appraisal of jump distributions in ensemble-based sampling algorithms

    NASA Astrophysics Data System (ADS)

    Dejanic, Sanda; Scheidegger, Andreas; Rieckermann, Jörg; Albert, Carlo

    2017-04-01

    Sampling Bayesian posteriors of model parameters is often required for making model-based probabilistic predictions. For complex environmental models, standard Monte Carlo Markov Chain (MCMC) methods are often infeasible because they require too many sequential model runs. Therefore, we focused on ensemble methods that use many Markov chains in parallel, since they can be run on modern cluster architectures. Little is known about how to choose the best performing sampler, for a given application. A poor choice can lead to an inappropriate representation of posterior knowledge. We assessed two different jump moves, the stretch and the differential evolution move, underlying, respectively, the software packages EMCEE and DREAM, which are popular in different scientific communities. For the assessment, we used analytical posteriors with features as they often occur in real posteriors, namely high dimensionality, strong non-linear correlations or multimodality. For posteriors with non-linear features, standard convergence diagnostics based on sample means can be insufficient. Therefore, we resorted to an entropy-based convergence measure. We assessed the samplers by means of their convergence speed, robustness and effective sample sizes. For posteriors with strongly non-linear features, we found that the stretch move outperforms the differential evolution move, w.r.t. all three aspects.
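
    For reference, the stretch move underlying the EMCEE sampler can be sketched in a few lines; the serial update below follows the standard Goodman-Weare formulation and uses a toy Gaussian target for illustration only.

```python
import numpy as np

def stretch_move(walkers, log_prob, a=2.0, rng=None):
    """One serial sweep of the affine-invariant stretch move: each walker is
    moved along the line joining it to another randomly chosen walker."""
    if rng is None:
        rng = np.random.default_rng()
    n, d = walkers.shape
    out = walkers.copy()
    for k in range(n):
        j = rng.integers(n - 1)
        j = j if j < k else j + 1                      # any walker other than k
        z = ((a - 1.0) * rng.random() + 1.0) ** 2 / a  # z ~ g(z) ∝ 1/sqrt(z) on [1/a, a]
        proposal = out[j] + z * (out[k] - out[j])
        log_accept = (d - 1) * np.log(z) + log_prob(proposal) - log_prob(out[k])
        if np.log(rng.random()) < log_accept:
            out[k] = proposal
    return out

# toy target: standard normal in three dimensions
log_prob = lambda x: -0.5 * float(x @ x)
rng = np.random.default_rng(10)
walkers = rng.normal(size=(32, 3))
for _ in range(500):
    walkers = stretch_move(walkers, log_prob, rng=rng)
print("posterior sample mean:", np.round(walkers.mean(axis=0), 2))
```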

  17. Leveraging Gibbs Ensemble Molecular Dynamics and Hybrid Monte Carlo/Molecular Dynamics for Efficient Study of Phase Equilibria.

    PubMed

    Gartner, Thomas E; Epps, Thomas H; Jayaraman, Arthi

    2016-11-08

    We describe an extension of the Gibbs ensemble molecular dynamics (GEMD) method for studying phase equilibria. Our modifications to GEMD allow for direct control over particle transfer between phases and improve the method's numerical stability. Additionally, we found that the modified GEMD approach had advantages in computational efficiency in comparison to a hybrid Monte Carlo (MC)/MD Gibbs ensemble scheme in the context of the single component Lennard-Jones fluid. We note that this increase in computational efficiency does not compromise the close agreement of phase equilibrium results between the two methods. However, numerical instabilities in the GEMD scheme hamper GEMD's use near the critical point. We propose that the computationally efficient GEMD simulations can be used to map out the majority of the phase window, with hybrid MC/MD used as a follow up for conditions under which GEMD may be unstable (e.g., near-critical behavior). In this manner, we can capitalize on the contrasting strengths of these two methods to enable the efficient study of phase equilibria for systems that present challenges for a purely stochastic GEMC method, such as dense or low temperature systems, and/or those with complex molecular topologies.

  18. Ensemble Classifier Strategy Based on Transient Feature Fusion in Electronic Nose

    NASA Astrophysics Data System (ADS)

    Bagheri, Mohammad Ali; Montazer, Gholam Ali

    2011-09-01

    In this paper, we test the performance of several ensembles of classifiers and each base learner has been trained on different types of extracted features. Experimental results show the potential benefits introduced by the usage of simple ensemble classification systems for the integration of different types of transient features.

  19. Molecular dynamics in the isothermal-isobaric ensemble: the requirement of a "shell" molecule. III. Discontinuous potentials.

    PubMed

    Uline, Mark J; Corti, David S

    2008-07-07

    Based on the approach of Gruhn and Monson [Phys. Rev. E 63, 061106 (2001)], we present a new method for deriving the collision dynamics for particles that interact via discontinuous potentials. By invoking the conservation of the extended Hamiltonian, we generate molecular dynamics (MD) algorithms for simulating the hard-sphere and square-well fluids within the isothermal-isobaric (NpT) ensemble. Consistent with the recent rigorous reformulation of the NpT ensemble partition function, the equations of motion impose a constant external pressure via the introduction of a shell particle of known mass [M. J. Uline and D. S. Corti, J. Chem. Phys. 123, 164101 (2005); 123, 164102 (2005)], which serves to define uniquely the volume of the system. The particles are also connected to a temperature reservoir through the use of a chain of Nosé-Hoover thermostats, the properties of which are not affected by a hard-sphere or square-well collision. By using the Liouville operator formalism and the Trotter expansion theorem to integrate the equations of motion, the update of the thermostat variables can be decoupled from the update of the positions of the particles and the momentum changes upon a collision. Hence, once the appropriate collision dynamics for the isobaric-isenthalpic (NpH) equations of motion is known, the adaptation of the algorithm to the NpT ensemble is straightforward. Results of MD simulations for the pure component square-well fluid are presented and serve to validate our algorithm. Finally, since the mass of the shell particle is known, the system itself, and not a piston of arbitrary mass, controls the time scales for internal pressure and volume fluctuations. We therefore consider the influence of the shell particle algorithm on the dynamics of the square-well fluid.

  20. g_contacts: Fast contact search in bio-molecular ensemble data

    NASA Astrophysics Data System (ADS)

    Blau, Christian; Grubmüller, Helmut

    2013-12-01

    Short-range interatomic interactions govern many bio-molecular processes. Therefore, identifying close interaction partners in ensemble data is an essential task in structural biology and computational biophysics. A contact search can be cast as a typical range search problem for which efficient algorithms have been developed. However, none of those has yet been adapted to the context of macromolecular ensembles, particularly in a molecular dynamics (MD) framework. Here a set-decomposition algorithm is implemented which detects all contacting atoms or residues in maximum O(N log N) run-time, in contrast to the O(N^2) complexity of a brute-force approach. Catalogue identifier: AEQA_v1_0 Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEQA_v1_0.html Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html No. of lines in distributed program, including test data, etc.: 8945 No. of bytes in distributed program, including test data, etc.: 981604 Distribution format: tar.gz Programming language: C99. Computer: PC. Operating system: Linux. RAM: ≈ size of input frame Classification: 3, 4.14. External routines: Gromacs 4.6 [1] Nature of problem: Finding atoms or residues that are closer to one another than a given cut-off. Solution method: Excluding distant atoms from distance calculations by decomposing the given set of atoms into disjoint subsets. Running time: ≤ O(N log N) References: [1] S. Pronk, S. Pall, R. Schulz, P. Larsson, P. Bjelkmar, R. Apostolov, M. R. Shirts, J.C. Smith, P. M. Kasson, D. van der Spoel, B. Hess and Erik Lindahl, Gromacs 4.5: a high-throughput and highly parallel open source molecular simulation toolkit, Bioinformatics 29 (7) (2013).
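
    The program itself relies on a set-decomposition algorithm within the Gromacs framework; the short NumPy sketch below only illustrates the underlying idea of avoiding the brute-force O(N^2) pair loop, here with a simple cell-list search. The array shapes, box size and cutoff are assumptions for the example, not g_contacts' interface.

        import numpy as np
        from collections import defaultdict
        from itertools import product

        def contacts_cell_list(coords, cutoff):
            """Return index pairs (i, j), i < j, with |r_i - r_j| <= cutoff.
            Atoms are binned into cells of edge `cutoff`, so only atoms in the
            27 neighbouring cells are compared instead of all N*(N-1)/2 pairs."""
            cells = defaultdict(list)
            for idx, key in enumerate(map(tuple, np.floor(coords / cutoff).astype(int))):
                cells[key].append(idx)
            pairs = []
            for key, members in cells.items():
                for off in product((-1, 0, 1), repeat=3):
                    neigh = cells.get((key[0] + off[0], key[1] + off[1], key[2] + off[2]), [])
                    for i in members:
                        for j in neigh:
                            if i < j and np.linalg.norm(coords[i] - coords[j]) <= cutoff:
                                pairs.append((i, j))
            return pairs

        # Toy ensemble frame: 1000 atoms in a 5 nm box, 0.45 nm contact cutoff.
        frame = np.random.default_rng(1).uniform(0.0, 5.0, size=(1000, 3))
        print(len(contacts_cell_list(frame, 0.45)), "contacts found")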

  1. Canonical-ensemble extended Lagrangian Born–Oppenheimer molecular dynamics for the linear scaling density functional theory

    NASA Astrophysics Data System (ADS)

    Hirakawa, Teruo; Suzuki, Teppei; Bowler, David R.; Miyazaki, Tsuyoshi

    2017-10-01

    We discuss the development and implementation of a constant temperature (NVT) molecular dynamics scheme that combines the Nosé–Hoover chain thermostat with the extended Lagrangian Born–Oppenheimer molecular dynamics (BOMD) scheme, using a linear scaling density functional theory (DFT) approach. An integration scheme for this canonical-ensemble extended Lagrangian BOMD is developed and discussed in the context of the Liouville operator formulation. Linear scaling DFT canonical-ensemble extended Lagrangian BOMD simulations are tested on bulk silicon and silicon carbide systems to evaluate our integration scheme. The results show that the conserved quantity remains stable with no systematic drift even in the presence of the thermostat.

  2. Visualizing Confidence in Cluster-based Ensemble Weather Forecast Analyses.

    PubMed

    Kumpf, Alexander; Tost, Bianca; Baumgart, Marlene; Riemer, Michael; Westermann, Rudiger; Rautenhaus, Marc

    2017-08-29

    In meteorology, cluster analysis is frequently used to determine representative trends in ensemble weather predictions in a selected spatio-temporal region, e.g., to reduce a set of ensemble members to simplify and improve their analysis. Identified clusters (i.e., groups of similar members), however, can be very sensitive to small changes of the selected region, so that clustering results can be misleading and bias subsequent analyses. In this article, we, a team of visualization scientists and meteorologists, deliver visual analytics solutions to analyze the sensitivity of clustering results with respect to changes of a selected region. We propose an interactive visual interface that enables simultaneous visualization of a) the variation in composition of identified clusters (i.e., their robustness), b) the variability in cluster membership for individual ensemble members, and c) the uncertainty in the spatial locations of identified trends. We demonstrate that our solution shows meteorologists how representative a clustering result is, and with respect to which changes in the selected region it becomes unstable. Furthermore, our solution helps to identify those ensemble members which stably belong to a given cluster and can thus be considered similar. In a real-world application case we show how our approach is used to analyze the clustering behavior of different regions in a forecast of "Tropical Cyclone Karl", guiding the user towards the cluster robustness information required for subsequent ensemble analysis.

  3. Confidence-based ensemble for GBM brain tumor segmentation

    NASA Astrophysics Data System (ADS)

    Huo, Jing; van Rikxoort, Eva M.; Okada, Kazunori; Kim, Hyun J.; Pope, Whitney; Goldin, Jonathan; Brown, Matthew

    2011-03-01

    It is a challenging task to automatically segment glioblastoma multiforme (GBM) brain tumors on T1w post-contrast isotropic MR images. A semi-automated system using fuzzy connectedness has recently been developed for computing the tumor volume that reduces the cost of manual annotation. In this study, we propose an ensemble method that combines multiple segmentation results into a final ensemble result. The method is evaluated on a dataset of 20 cases from a multi-center pharmaceutical drug trial and compared to the fuzzy connectedness method. Three individual methods were used in the framework: fuzzy connectedness, GrowCut, and voxel classification. The combination method is a confidence map averaging (CMA) method. The CMA method shows an improved ROC curve compared to the fuzzy connectedness method (p < 0.001). The CMA ensemble result is more robust compared to the three individual methods.
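
    A minimal NumPy sketch of the confidence map averaging idea follows: per-voxel confidence maps from several segmenters are averaged and the mean is thresholded to give the ensemble mask. The random maps below are stand-ins for the outputs of fuzzy connectedness, GrowCut and voxel classification, and the threshold of 0.5 is an assumption.

        import numpy as np

        rng = np.random.default_rng(2)

        def cma_ensemble(conf_maps, threshold=0.5):
            """Confidence map averaging: average the per-voxel confidence maps and
            threshold the mean to obtain the ensemble segmentation mask."""
            mean_conf = np.mean(np.stack(conf_maps, axis=0), axis=0)
            return mean_conf, mean_conf >= threshold

        # Toy confidence maps in [0, 1] for a 64 x 64 x 24 volume from three methods.
        shape = (64, 64, 24)
        conf_fuzzy, conf_growcut, conf_voxelclf = (rng.random(shape) for _ in range(3))

        mean_conf, mask = cma_ensemble([conf_fuzzy, conf_growcut, conf_voxelclf])
        print("ensemble tumor voxels:", int(mask.sum()))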

  4. Molecular Dynamics and Monte Carlo simulations in the microcanonical ensemble: Quantitative comparison and reweighting techniques.

    PubMed

    Schierz, Philipp; Zierenberg, Johannes; Janke, Wolfhard

    2015-10-07

    Molecular Dynamics (MD) and Monte Carlo (MC) simulations are the most popular simulation techniques for many-particle systems. Although they are often applied to similar systems, it is unclear to which extent one has to expect quantitative agreement of the two simulation techniques. In this work, we present a quantitative comparison of MD and MC simulations in the microcanonical ensemble. For three test examples, we study first- and second-order phase transitions with a focus on liquid-gas like transitions. We present MD analysis techniques to compensate for conservation law effects due to linear and angular momentum conservation. Additionally, we apply the weighted histogram analysis method to microcanonical histograms reweighted from MD simulations. By this means, we are able to estimate the density of states from many microcanonical simulations at various total energies. This further allows us to compute estimates of canonical expectation values.

  5. Ensemble-Based Data Assimilation With a Martian GCM

    NASA Astrophysics Data System (ADS)

    Lawson, W.; Richardson, M. I.; McCleese, D. J.; Anderson, J. L.; Chen, Y.; Snyder, C.

    2007-12-01

    Monte Carlo approximations, "ensemble-based methods," have matured enough to be both appropriate for use in planetary problems and exploitable within the reach of planetary scientists. Capitalizing on this new class of methods, the National Center for Atmospheric Research (NCAR) has developed a framework for ensemble-based DA that is flexible and modular in its use of various forecast models and data sets. The framework is called DART, the Data Assimilation Research Testbed, and it is freely available on-line. We have begun to take advantage of this rich software infrastructure, and are on our way toward performing state-of-the-art DA in the martian atmosphere using Caltech's martian general circulation model, PlanetWRF. We have begun by testing and validating the model within DART under idealized scenarios, and we hope to address actual, available infrared remote sensing datasets from Mars orbiters in the coming year. We shall present the details of this approach and our progress to date.

  6. Ensemble-based methods for forecasting census in hospital units

    PubMed Central

    2013-01-01

    Background The ability to accurately forecast census counts in hospital departments has considerable implications for hospital resource allocation. In recent years several different methods have been proposed for forecasting census counts; however, many of these approaches do not use available patient-specific information. Methods In this paper we present an ensemble-based methodology for forecasting the census under a framework that simultaneously incorporates both (i) arrival trends over time and (ii) patient-specific baseline and time-varying information. The proposed model for predicting census has three components, namely: current census count, number of daily arrivals and number of daily departures. To model the number of daily arrivals, we use a seasonality adjusted Poisson Autoregressive (PAR) model where the parameter estimates are obtained via conditional maximum likelihood. The number of daily departures is predicted by modeling the probability of departure from the census using logistic regression models that are adjusted for the amount of time spent in the census and incorporate both patient-specific baseline and time-varying patient-specific covariate information. We illustrate our approach using neonatal intensive care unit (NICU) data collected at Women & Infants Hospital, Providence RI, which consists of 1001 consecutive NICU admissions between April 1st 2008 and March 31st 2009. Results Our results demonstrate statistically significant improved prediction accuracy for 3, 5, and 7 day census forecasts and increased precision of our forecasting model compared to a forecasting approach that ignores patient-specific information. Conclusions Forecasting models that utilize patient-specific baseline and time-varying information make the most of data typically available and have the capacity to substantially improve census forecasts. PMID:23721123

  7. Ensemble-based methods for forecasting census in hospital units.

    PubMed

    Koestler, Devin C; Ombao, Hernando; Bender, Jesse

    2013-05-30

    The ability to accurately forecast census counts in hospital departments has considerable implications for hospital resource allocation. In recent years several different methods have been proposed for forecasting census counts; however, many of these approaches do not use available patient-specific information. In this paper we present an ensemble-based methodology for forecasting the census under a framework that simultaneously incorporates both (i) arrival trends over time and (ii) patient-specific baseline and time-varying information. The proposed model for predicting census has three components, namely: current census count, number of daily arrivals and number of daily departures. To model the number of daily arrivals, we use a seasonality adjusted Poisson Autoregressive (PAR) model where the parameter estimates are obtained via conditional maximum likelihood. The number of daily departures is predicted by modeling the probability of departure from the census using logistic regression models that are adjusted for the amount of time spent in the census and incorporate both patient-specific baseline and time-varying patient-specific covariate information. We illustrate our approach using neonatal intensive care unit (NICU) data collected at Women & Infants Hospital, Providence RI, which consists of 1001 consecutive NICU admissions between April 1st 2008 and March 31st 2009. Our results demonstrate statistically significant improved prediction accuracy for 3, 5, and 7 day census forecasts and increased precision of our forecasting model compared to a forecasting approach that ignores patient-specific information. Forecasting models that utilize patient-specific baseline and time-varying information make the most of data typically available and have the capacity to substantially improve census forecasts.
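
    The three-component structure of the model (current census, daily arrivals, daily departures) can be sketched as a small Monte Carlo forecaster: arrivals are drawn from a seasonality-adjusted Poisson intensity and each current patient departs with a probability given by a logistic function of time in the unit. The rates, coefficients and toy lengths of stay below are invented for illustration and are not the fitted NICU model.

        import numpy as np

        rng = np.random.default_rng(3)

        def arrival_rate(day_of_week, base=4.0, weekend_drop=1.5):
            """Stand-in for the seasonality-adjusted Poisson autoregressive intensity."""
            return base - weekend_drop * (day_of_week >= 5)

        def departure_prob(days_in_unit, b0=-3.0, b1=0.08):
            """Logistic model for the daily probability of leaving the census."""
            return 1.0 / (1.0 + np.exp(-(b0 + b1 * days_in_unit)))

        def simulate_census(los_today, horizon=7, n_sims=2000, start_dow=0):
            """Ensemble forecast of census counts; `los_today` holds each current
            patient's length of stay in days."""
            forecasts = np.zeros((n_sims, horizon), dtype=int)
            for s in range(n_sims):
                los = list(los_today)
                for h in range(horizon):
                    los = [d + 1 for d in los if rng.random() > departure_prob(d)]   # departures
                    los += [0] * rng.poisson(arrival_rate((start_dow + h) % 7))      # new arrivals
                    forecasts[s, h] = len(los)
            return forecasts

        census_now = [2, 5, 11, 30, 1, 7, 14, 21, 3, 9]     # toy lengths of stay (days)
        f = simulate_census(census_now)
        print("7-day census forecast (mean):", f.mean(axis=0).round(1))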

  8. Nonlinear stability and ergodicity of ensemble based Kalman filters

    NASA Astrophysics Data System (ADS)

    Tong, Xin T.; Majda, Andrew J.; Kelly, David

    2016-02-01

    The ensemble Kalman filter (EnKF) and ensemble square root filter (ESRF) are data assimilation methods used to combine high dimensional, nonlinear dynamical models with observed data. Despite their widespread usage in climate science and oil reservoir simulation, very little is known about the long-time behavior of these methods and why they are effective when applied with modest ensemble sizes in large dimensional turbulent dynamical systems. By following the basic principles of energy dissipation and controllability of filters, this paper establishes a simple, systematic and rigorous framework for the nonlinear analysis of EnKF and ESRF with arbitrary ensemble size, focusing on the dynamical properties of boundedness and geometric ergodicity. The time uniform boundedness guarantees that the filter estimate will not diverge to machine infinity in finite time, which is a potential threat for EnKF and ESRF, known as catastrophic filter divergence. Geometric ergodicity ensures in addition that the filter has a unique invariant measure and that initialization errors will dissipate exponentially in time. We establish these results by introducing a natural notion of observable energy dissipation. The time uniform bound is achieved through a simple Lyapunov function argument; this result applies to systems with complete observations and strong kinetic energy dissipation, but also to concrete examples with incomplete observations. With the Lyapunov function argument established, the geometric ergodicity is obtained by verifying the controllability of the filter processes; in particular, such analysis for ESRF relies on a careful multivariate perturbation analysis of the covariance eigen-structure.
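
    For readers unfamiliar with the filters being analyzed, the following NumPy sketch shows a textbook stochastic EnKF analysis step with perturbed observations; it is a generic illustration with made-up state and observation dimensions, not the specific filter configurations studied in the paper.

        import numpy as np

        def enkf_update(X, y, H, R, rng):
            """Stochastic EnKF analysis step.
            X: (n, m) ensemble of n-dimensional states with m members,
            y: (p,) observations, H: (p, n) observation operator, R: (p, p) obs covariance."""
            m = X.shape[1]
            A = X - X.mean(axis=1, keepdims=True)              # ensemble anomalies
            P = A @ A.T / (m - 1)                              # sample forecast covariance
            K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)       # Kalman gain
            # Perturbed observations keep the analysis ensemble spread consistent.
            Y = y[:, None] + rng.multivariate_normal(np.zeros(len(y)), R, size=m).T
            return X + K @ (Y - H @ X)

        rng = np.random.default_rng(4)
        X = rng.standard_normal((10, 20))                      # 10-dim state, 20 members
        H = np.eye(4, 10)                                      # observe the first 4 components
        R = 0.1 * np.eye(4)
        Xa = enkf_update(X, rng.standard_normal(4), H, R, rng)
        print("analysis spread:", Xa.std(axis=1).mean().round(3))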

  9. Possible Room-Temperature Ferromagnetism in Self-Assembled Ensembles of Paramagnetic and Diamagnetic Molecular Semiconductors.

    PubMed

    Dhara, Barun; Tarafder, Kartick; Jha, Plawan K; Panja, Soumendra N; Nair, Sunil; Oppeneer, Peter M; Ballav, Nirmalya

    2016-12-15

    Owing to long spin-relaxation time and chemically customizable physical properties, molecule-based semiconductor materials like metal-phthalocyanines offer promising alternatives to conventional dilute magnetic semiconductors/oxides (DMSs/DMOs) to achieve room-temperature (RT) ferromagnetism. However, air-stable molecule-based materials exhibiting both semiconductivity and magnetic-order at RT have so far remained elusive. We present here the concept of supramolecular arrangement to accomplish possible RT ferromagnetism. Specifically, we observe a clear hysteresis-loop (Hc ≈ 120 Oe) at 300 K in the magnetization versus field (M-H) plot of the self-assembled ensembles of diamagnetic Zn-phthalocyanine having peripheral F atoms (ZnFPc; S = 0) and paramagnetic Fe-phthalocyanine having peripheral H atoms (FePc; S = 1). A Tauc plot of the self-assembled FePc···ZnFPc ensembles showed an optical band gap of ∼1.05 eV and temperature-dependent current-voltage (I-V) studies suggest semiconducting characteristics in the material. Using DFT+U quantum-chemical calculations, we reveal the origin of such unusual ferromagnetic exchange-interaction in the supramolecular FePc···ZnFPc system.

  10. Monte Carlo and Molecular Dynamics in the Multicanonical Ensemble: Connections between Wang-Landau Sampling and Metadynamics

    NASA Astrophysics Data System (ADS)

    Vogel, Thomas; Perez, Danny; Junghans, Christoph

    2014-03-01

    We show direct formal relationships between the Wang-Landau iteration [PRL 86, 2050 (2001)], metadynamics [PNAS 99, 12562 (2002)] and statistical temperature molecular dynamics [PRL 97, 050601 (2006)], the major Monte Carlo and molecular dynamics workhorses for sampling from a generalized, multicanonical ensemble. We aim at helping to consolidate the developments in the different areas by indicating how methodological advancements can be transferred in a straightforward way, avoiding the parallel, largely independent, development tracks observed in the past.
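
    The flat-histogram mechanics shared by these methods can be seen in a few lines of Wang-Landau sampling for a toy system whose density of states is known exactly (two dice, with "energy" equal to the sum of the faces). The flatness criterion and modification-factor schedule below are common conventions chosen for illustration.

        import numpy as np

        rng = np.random.default_rng(5)

        energies = np.arange(2, 13)          # E = d1 + d2 for two dice
        ln_g = np.zeros(len(energies))       # running estimate of ln g(E)
        hist = np.zeros(len(energies))
        state = np.array([1, 1])
        ln_f = 1.0                           # initial modification factor ln f

        def e_index(s):
            return s.sum() - 2

        while ln_f > 1e-4:
            for _ in range(10000):
                trial = state.copy()
                trial[rng.integers(2)] = rng.integers(1, 7)            # re-roll one die
                i, j = e_index(state), e_index(trial)
                if np.log(rng.random()) < ln_g[i] - ln_g[j]:           # accept with prob g(old)/g(new)
                    state = trial
                k = e_index(state)
                ln_g[k] += ln_f                                        # raise the visited level
                hist[k] += 1
            if hist.min() > 0.8 * hist.mean():                         # flat-histogram check
                hist[:] = 0
                ln_f /= 2.0                                            # refine: f -> sqrt(f)

        print("estimated g(E):", np.exp(ln_g - ln_g[0]).round(1))      # expect ~1,2,3,4,5,6,5,4,3,2,1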

  11. Accurate ensemble molecular dynamics binding free energy ranking of multidrug-resistant HIV-1 proteases.

    PubMed

    Sadiq, S Kashif; Wright, David W; Kenway, Owain A; Coveney, Peter V

    2010-05-24

    Accurate calculation of important thermodynamic properties, such as macromolecular binding free energies, is one of the principal goals of molecular dynamics simulations. However, a single long simulation frequently produces incorrectly converged quantitative results due to inadequate sampling of conformational space in a feasible wall-clock time. Multiple short (ensemble) simulations have been shown to explore conformational space more effectively than single long simulations, but the two methods have not yet been thermodynamically compared. Here we show that, for end-state binding free energy determination methods, ensemble simulations exhibit significantly enhanced thermodynamic sampling over single long simulations and result in accurate and converged relative binding free energies that are reproducible to within 0.5 kcal/mol. Completely correct ranking is obtained for six HIV-1 protease variants bound to lopinavir with a correlation coefficient of 0.89 and a mean relative deviation from experiment of 0.9 kcal/mol. Multidrug resistance to lopinavir is enthalpically driven and increases through a decrease in the protein-ligand van der Waals interaction, principally due to the V82A/I84V mutation, and an increase in net electrostatic repulsion due to water-mediated disruption of protein-ligand interactions in the catalytic region. Furthermore, we correctly rank, to within 1 kcal/mol of experiment, the substantially increased chemical potency of lopinavir binding to the wild-type protease compared to saquinavir and show that lopinavir takes advantage of a decreased net electrostatic repulsion to confer enhanced binding. Our approach is dependent on the combined use of petascale computing resources and on an automated simulation workflow to attain the required level of sampling and turnaround time to obtain the results, which can be as little as three days. This level of performance promotes integration of such methodology with clinical decision support systems for
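
    The ensemble-averaging step itself is simple to reproduce: per-replica free energy estimates from many short simulations are combined into a mean with a bootstrapped error, and variants are ranked by the difference of those means. The numbers below are synthetic and the sketch is only a schematic of that averaging, not the authors' end-state free energy workflow.

        import numpy as np

        rng = np.random.default_rng(6)

        def ensemble_dg(replica_dgs, n_boot=5000):
            """Combine per-replica binding free energies (kcal/mol) into an ensemble
            mean and a bootstrapped standard error."""
            replica_dgs = np.asarray(replica_dgs, float)
            boots = rng.choice(replica_dgs, size=(n_boot, len(replica_dgs)), replace=True)
            return replica_dgs.mean(), boots.mean(axis=1).std()

        # Synthetic per-replica Delta G estimates for two hypothetical protease variants.
        wild_type = [-14.1, -13.6, -14.4, -13.9, -14.2, -13.8]
        mutant = [-12.3, -12.9, -12.1, -12.6, -12.4, -12.8]

        dg_wt, err_wt = ensemble_dg(wild_type)
        dg_mut, err_mut = ensemble_dg(mutant)
        print(f"ddG(mutant - WT) = {dg_mut - dg_wt:.2f} "
              f"+/- {np.hypot(err_wt, err_mut):.2f} kcal/mol")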

  12. Ensemble-based docking: From hit discovery to metabolism and toxicity predictions.

    PubMed

    Evangelista, Wilfredo; Weir, Rebecca L; Ellingson, Sally R; Harris, Jason B; Kapoor, Karan; Smith, Jeremy C; Baudry, Jerome

    2016-10-15

    This paper describes and illustrates the use of ensemble-based docking, i.e., using a collection of protein structures in docking calculations for hit discovery, the exploration of biochemical pathways and toxicity prediction of drug candidates. We describe the computational engineering work necessary to enable large ensemble docking campaigns on supercomputers. We show examples where ensemble-based docking has significantly increased the number and the diversity of validated drug candidates. Finally, we illustrate how ensemble-based docking can be extended beyond hit discovery and toward providing a structural basis for the prediction of metabolism and off-target binding relevant to pre-clinical and clinical trials. Copyright © 2016 Elsevier Ltd. All rights reserved.

  13. Modeling Dynamic Systems with Efficient Ensembles of Process-Based Models

    PubMed Central

    Simidjievski, Nikola; Todorovski, Ljupčo; Džeroski, Sašo

    2016-01-01

    Ensembles are a well established machine learning paradigm, leading to accurate and robust models, predominantly applied to predictive modeling tasks. Ensemble models comprise a finite set of diverse predictive models whose combined output is expected to yield an improved predictive performance as compared to an individual model. In this paper, we propose a new method for learning ensembles of process-based models of dynamic systems. The process-based modeling paradigm employs domain-specific knowledge to automatically learn models of dynamic systems from time-series observational data. Previous work has shown that ensembles based on sampling observational data (i.e., bagging and boosting), significantly improve predictive performance of process-based models. However, this improvement comes at the cost of a substantial increase of the computational time needed for learning. To address this problem, the paper proposes a method that aims at efficiently learning ensembles of process-based models, while maintaining their accurate long-term predictive performance. This is achieved by constructing ensembles with sampling domain-specific knowledge instead of sampling data. We apply the proposed method to and evaluate its performance on a set of problems of automated predictive modeling in three lake ecosystems using a library of process-based knowledge for modeling population dynamics. The experimental results identify the optimal design decisions regarding the learning algorithm. The results also show that the proposed ensembles yield significantly more accurate predictions of population dynamics as compared to individual process-based models. Finally, while their predictive performance is comparable to the one of ensembles obtained with the state-of-the-art methods of bagging and boosting, they are substantially more efficient. PMID:27078633

  14. Modeling Dynamic Systems with Efficient Ensembles of Process-Based Models.

    PubMed

    Simidjievski, Nikola; Todorovski, Ljupčo; Džeroski, Sašo

    2016-01-01

    Ensembles are a well established machine learning paradigm, leading to accurate and robust models, predominantly applied to predictive modeling tasks. Ensemble models comprise a finite set of diverse predictive models whose combined output is expected to yield an improved predictive performance as compared to an individual model. In this paper, we propose a new method for learning ensembles of process-based models of dynamic systems. The process-based modeling paradigm employs domain-specific knowledge to automatically learn models of dynamic systems from time-series observational data. Previous work has shown that ensembles based on sampling observational data (i.e., bagging and boosting), significantly improve predictive performance of process-based models. However, this improvement comes at the cost of a substantial increase of the computational time needed for learning. To address this problem, the paper proposes a method that aims at efficiently learning ensembles of process-based models, while maintaining their accurate long-term predictive performance. This is achieved by constructing ensembles with sampling domain-specific knowledge instead of sampling data. We apply the proposed method to and evaluate its performance on a set of problems of automated predictive modeling in three lake ecosystems using a library of process-based knowledge for modeling population dynamics. The experimental results identify the optimal design decisions regarding the learning algorithm. The results also show that the proposed ensembles yield significantly more accurate predictions of population dynamics as compared to individual process-based models. Finally, while their predictive performance is comparable to the one of ensembles obtained with the state-of-the-art methods of bagging and boosting, they are substantially more efficient.

  15. Initial perturbations based on the ensemble transform (ET) technique in the NCEP global operational forecast system

    NASA Astrophysics Data System (ADS)

    Wei, Mozheng; Toth, Zoltan; Wobus, Richard; Zhu, Yuejian

    2008-01-01

    Since modern data assimilation (DA) involves the repetitive use of dynamical forecasts, errors in analyses share characteristics of those in short-range forecasts. Initial conditions for an ensemble prediction/forecast system (EPS or EFS) are expected to sample uncertainty in the analysis field. Ensemble forecasts with such initial conditions can therefore (a) be fed back to DA to reduce analysis uncertainty, as well as (b) sample forecast uncertainty related to initial conditions. Optimum performance of both DA and EFS requires a careful choice of initial ensemble perturbations. DA can be improved with an EFS that represents the dynamically conditioned part of forecast error covariance as accurately as possible, while an EFS can be improved by initial perturbations reflecting analysis error variance. Initial perturbation generation schemes that dynamically cycle ensemble perturbations reminiscent of how forecast errors are cycled in DA schemes may offer consistency between DA and EFS, and good performance for both. In this paper, we introduce an EFS based on the initial perturbations that are generated by the Ensemble Transform (ET) and ET with rescaling (ETR) methods to achieve this goal. Both ET and ETR are generalizations of the breeding method (BM). The results from ensemble systems based on BM, ET, ETR and the Ensemble Transform Kalman Filter (ETKF) method are experimentally compared in the context of ensemble forecast performance. Initial perturbations are centred around a 3D-VAR analysis, with a variance equal to that of estimated analysis errors. Of the four methods, the ETR method performed best in most probabilistic scores and in terms of the forecast error explained by the perturbations. All methods display very high time consistency between the analysis and forecast perturbations. It is expected that DA performance can be improved by the use of forecast error covariance from a dynamically cycled ensemble either with a variational DA approach (coupled

  16. Dynamic Metabolic Model Building Based on the Ensemble Modeling Approach

    SciTech Connect

    Liao, James C.

    2016-10-01

    Ensemble modeling of kinetic systems addresses the challenges of kinetic model construction, with respect to parameter value selection, and still allows for the rich insights possible from kinetic models. This project aimed to show that constructing, implementing, and analyzing such models is a useful tool for the metabolic engineering toolkit, and that they can result in actionable insights from models. Key concepts are developed and deliverable publications and results are presented.

  17. An integrated uncertainty and ensemble-based data assimilation approach for improved operational streamflow predictions

    NASA Astrophysics Data System (ADS)

    He, M.; Hogue, T. S.; Margulis, S. A.; Franz, K. J.

    2012-03-01

    The current study proposes an integrated uncertainty and ensemble-based data assimilation framework (ICEA) and evaluates its viability in providing operational streamflow predictions via assimilating snow water equivalent (SWE) data. This step-wise framework applies a parameter uncertainty analysis algorithm (ISURF) to identify the uncertainty structure of sensitive model parameters, which is subsequently formulated into an Ensemble Kalman Filter (EnKF) to generate updated snow states for streamflow prediction. The framework is coupled to the US National Weather Service (NWS) snow and rainfall-runoff models. Its applicability is demonstrated for an operational basin of a western River Forecast Center (RFC) of the NWS. Performance of the framework is evaluated against existing operational baseline (RFC predictions), the stand-alone ISURF and the stand-alone EnKF. Results indicate that the ensemble-mean prediction of ICEA considerably outperforms predictions from the other three scenarios investigated, particularly in the context of predicting high flows (top 5th percentile). The ICEA streamflow ensemble predictions capture the variability of the observed streamflow well, however the ensemble is not wide enough to consistently contain the range of streamflow observations in the study basin. Our findings indicate that the ICEA has the potential to supplement the current operational (deterministic) forecasting method in terms of providing improved single-valued (e.g., ensemble mean) streamflow predictions as well as meaningful ensemble predictions.

  18. An integrated uncertainty and ensemble-based data assimilation approach for improved operational streamflow predictions

    NASA Astrophysics Data System (ADS)

    He, M.; Hogue, T. S.; Margulis, S. A.; Franz, K. J.

    2011-08-01

    The current study proposes an integrated uncertainty and ensemble-based data assimilation framework (ICEA) and evaluates its viability in providing operational streamflow predictions via assimilating snow water equivalent (SWE) data. This step-wise framework applies a parameter uncertainty analysis algorithm (ISURF) to identify the uncertainty structure of sensitive model parameters, which is subsequently formulated into an Ensemble Kalman Filter (EnKF) to generate updated snow states for streamflow prediction. The framework is coupled to the US National Weather Service (NWS) snow and rainfall-runoff models. Its applicability is demonstrated for an operational basin of a western River Forecast Center (RFC) of the NWS. Performance of the framework is evaluated against existing operational baseline (RFC predictions), the stand-alone ISURF, and the stand-alone EnKF. Results indicate that the ensemble-mean prediction of ICEA considerably outperforms predictions from the other three scenarios investigated, particularly in the context of predicting high flows (top 5th percentile). The ICEA streamflow ensemble predictions capture the variability of the observed streamflow well, however the ensemble is not wide enough to consistently contain the range of streamflow observations in the study basin. Our findings indicate that the ICEA has the potential to supplement the current operational (deterministic) forecasting method in terms of providing improved single-valued (e.g., ensemble mean) streamflow predictions as well as meaningful ensemble predictions.

  19. Ensembl 2015

    PubMed Central

    Cunningham, Fiona; Amode, M. Ridwan; Barrell, Daniel; Beal, Kathryn; Billis, Konstantinos; Brent, Simon; Carvalho-Silva, Denise; Clapham, Peter; Coates, Guy; Fitzgerald, Stephen; Gil, Laurent; Girón, Carlos García; Gordon, Leo; Hourlier, Thibaut; Hunt, Sarah E.; Janacek, Sophie H.; Johnson, Nathan; Juettemann, Thomas; Kähäri, Andreas K.; Keenan, Stephen; Martin, Fergal J.; Maurel, Thomas; McLaren, William; Murphy, Daniel N.; Nag, Rishi; Overduin, Bert; Parker, Anne; Patricio, Mateus; Perry, Emily; Pignatelli, Miguel; Riat, Harpreet Singh; Sheppard, Daniel; Taylor, Kieron; Thormann, Anja; Vullo, Alessandro; Wilder, Steven P.; Zadissa, Amonida; Aken, Bronwen L.; Birney, Ewan; Harrow, Jennifer; Kinsella, Rhoda; Muffato, Matthieu; Ruffier, Magali; Searle, Stephen M.J.; Spudich, Giulietta; Trevanion, Stephen J.; Yates, Andy; Zerbino, Daniel R.; Flicek, Paul

    2015-01-01

    Ensembl (http://www.ensembl.org) is a genomic interpretation system providing the most up-to-date annotations, querying tools and access methods for chordates and key model organisms. This year we released updated annotation (gene models, comparative genomics, regulatory regions and variation) on the new human assembly, GRCh38, although we continue to support researchers using the GRCh37.p13 assembly through a dedicated site (http://grch37.ensembl.org). Our Regulatory Build has been revamped to identify regulatory regions of interest and to efficiently highlight their activity across disparate epigenetic data sets. A number of new interfaces allow users to perform large-scale comparisons of their data against our annotations. The REST server (http://rest.ensembl.org), which allows programs written in any language to query our databases, has moved to a full service alongside our upgraded website tools. Our online Variant Effect Predictor tool has been updated to process more variants and calculate summary statistics. Lastly, the WiggleTools package enables users to summarize large collections of data sets and view them as single tracks in Ensembl. The Ensembl code base itself is more accessible: it is now hosted on our GitHub organization page (https://github.com/Ensembl) under an Apache 2.0 open source license. PMID:25352552
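
    The REST server mentioned above can be queried from any language; a small Python example against the gene lookup endpoint is sketched below. The gene symbol is arbitrary, and the response fields printed are typical of the lookup service but may differ between releases, so treat them as assumptions.

        import requests

        SERVER = "https://rest.ensembl.org"

        def lookup_symbol(species, symbol):
            """Query the Ensembl REST lookup endpoint for a gene symbol and return
            the decoded JSON record (stable id, location, biotype, ...)."""
            r = requests.get(f"{SERVER}/lookup/symbol/{species}/{symbol}",
                             headers={"Content-Type": "application/json"}, timeout=30)
            r.raise_for_status()
            return r.json()

        if __name__ == "__main__":
            gene = lookup_symbol("homo_sapiens", "BRCA2")
            # Print the whole record if unsure which fields the current release returns.
            print(gene.get("id"), gene.get("seq_region_name"),
                  gene.get("start"), gene.get("end"), gene.get("biotype"))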

  20. An Ensemble Generator for Quantitative Precipitation Estimation Based on Censored Shifted Gamma Distributions

    NASA Astrophysics Data System (ADS)

    Wright, D.; Kirschbaum, D.; Yatheendradas, S.

    2016-12-01

    The considerable uncertainties associated with quantitative precipitation estimates (QPE), whether from satellite platforms, ground-based weather radar, or numerical weather models, suggest that such QPE should be expressed as distributions or ensembles of possible values, rather than as single values. In this research, we borrow a framework from the weather forecast verification community, to "correct" satellite precipitation and generate ensemble QPE. This approach is based on the censored shifted gamma distribution (CSGD). The probability of precipitation, central tendency (i.e. mean), and the uncertainty can be captured by the three parameters of the CSGD. The CSGD can then be applied for simulation of rainfall ensembles using a flexible nonlinear regression framework, whereby the CSGD parameters can be conditioned on one or more reference rainfall datasets and on other time-varying covariates such as modeled or measured estimates of precipitable water and relative humidity. We present the framework and initial results by generating precipitation ensembles based on the Tropical Rainfall Measuring Mission Multi-satellite Precipitation Analysis (TMPA) dataset, using both NLDAS and PERSIANN-CDR precipitation datasets as references. We also incorporate a number of covariates from MERRA2 reanalysis including model-estimated precipitation, precipitable water, relative humidity, and lifting condensation level. We explore the prospects for applying the framework and other ensemble error models globally, including in regions where high-quality "ground truth" rainfall estimates are lacking. We compare the ensemble outputs against those of an independent rain gage-based ensemble rainfall dataset. "Pooling" of regional rainfall observations is explored as one option for improving ensemble estimates of rainfall extremes. The approach has potential applications in near-realtime, retrospective, and scenario modeling of rainfall-driven hazards such as floods and landslides
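
    The core sampling step of such a generator is compact: a gamma variate is shifted so that part of its mass falls below zero and is then censored at zero, which simultaneously encodes the probability of no precipitation and the distribution of positive amounts. The parameter values below are illustrative, not fitted CSGD parameters.

        import numpy as np
        from scipy import stats

        def csgd_ensemble(shape, scale, shift, size, rng):
            """Sample a censored shifted gamma distribution: Gamma(shape, scale) + shift,
            censored at zero so that zeros represent dry outcomes."""
            return np.maximum(rng.gamma(shape, scale, size) + shift, 0.0)

        rng = np.random.default_rng(7)
        shape, scale, shift = 1.2, 4.0, -1.5          # illustrative CSGD parameters
        members = csgd_ensemble(shape, scale, shift, size=1000, rng=rng)

        print(f"P(precip) = {(members > 0).mean():.2f}, "
              f"mean = {members.mean():.2f} mm, "
              f"95th percentile = {np.percentile(members, 95):.1f} mm")

        # The analytic dry probability follows from the same parameters:
        # P(dry) = P(gamma variate <= -shift) = F_gamma(-shift).
        p_dry = stats.gamma.cdf(-shift, a=shape, scale=scale)
        print(f"empirical P(dry) = {(members <= 0).mean():.2f}, analytic P(dry) = {p_dry:.2f}")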

  1. Force Sensor Based Tool Condition Monitoring Using a Heterogeneous Ensemble Learning Model

    PubMed Central

    Wang, Guofeng; Yang, Yinwei; Li, Zhimeng

    2014-01-01

    Tool condition monitoring (TCM) plays an important role in improving machining efficiency and guaranteeing workpiece quality. In order to realize reliable recognition of the tool condition, a robust classifier needs to be constructed to depict the relationship between tool wear states and sensory information. However, because of the complexity of the machining process and the uncertainty of the tool wear evolution, it is hard for a single classifier to fit all the collected samples without sacrificing generalization ability. In this paper, heterogeneous ensemble learning is proposed to realize tool condition monitoring in which the support vector machine (SVM), hidden Markov model (HMM) and radial basis function (RBF) are selected as base classifiers and a stacking ensemble strategy is further used to reflect the relationship between the outputs of these base classifiers and tool wear states. Based on the heterogeneous ensemble learning classifier, an online monitoring system is constructed in which the harmonic features are extracted from force signals and a minimal redundancy and maximal relevance (mRMR) algorithm is utilized to select the most prominent features. To verify the effectiveness of the proposed method, a titanium alloy milling experiment was carried out and samples with different tool wear states were collected to build the proposed heterogeneous ensemble learning classifier. Moreover, the homogeneous ensemble learning model and majority voting strategy are also adopted to make a comparison. The analysis and comparison results show that the proposed heterogeneous ensemble learning classifier performs better in both classification accuracy and stability. PMID:25405514

  2. Force sensor based tool condition monitoring using a heterogeneous ensemble learning model.

    PubMed

    Wang, Guofeng; Yang, Yinwei; Li, Zhimeng

    2014-11-14

    Tool condition monitoring (TCM) plays an important role in improving machining efficiency and guaranteeing workpiece quality. In order to realize reliable recognition of the tool condition, a robust classifier needs to be constructed to depict the relationship between tool wear states and sensory information. However, because of the complexity of the machining process and the uncertainty of the tool wear evolution, it is hard for a single classifier to fit all the collected samples without sacrificing generalization ability. In this paper, heterogeneous ensemble learning is proposed to realize tool condition monitoring in which the support vector machine (SVM), hidden Markov model (HMM) and radial basis function (RBF) are selected as base classifiers and a stacking ensemble strategy is further used to reflect the relationship between the outputs of these base classifiers and tool wear states. Based on the heterogeneous ensemble learning classifier, an online monitoring system is constructed in which the harmonic features are extracted from force signals and a minimal redundancy and maximal relevance (mRMR) algorithm is utilized to select the most prominent features. To verify the effectiveness of the proposed method, a titanium alloy milling experiment was carried out and samples with different tool wear states were collected to build the proposed heterogeneous ensemble learning classifier. Moreover, the homogeneous ensemble learning model and majority voting strategy are also adopted to make a comparison. The analysis and comparison results show that the proposed heterogeneous ensemble learning classifier performs better in both classification accuracy and stability.
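
    The stacking strategy described above can be sketched with scikit-learn: several heterogeneous base learners are fitted and a meta-learner maps their outputs to the final tool-wear state. The synthetic features stand in for the mRMR-selected harmonic force features, and since scikit-learn provides no HMM, an MLP replaces the HMM and RBF network here; this is an illustration of stacking, not the paper's exact model.

        import numpy as np
        from sklearn.datasets import make_classification
        from sklearn.ensemble import StackingClassifier
        from sklearn.linear_model import LogisticRegression
        from sklearn.model_selection import train_test_split
        from sklearn.neural_network import MLPClassifier
        from sklearn.svm import SVC

        # Synthetic stand-in for harmonic force features and three tool-wear states.
        X, y = make_classification(n_samples=600, n_features=12, n_informative=6,
                                   n_classes=3, n_clusters_per_class=1, random_state=0)
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

        base_learners = [
            ("svm_linear", SVC(kernel="linear", probability=True, random_state=0)),
            ("svm_rbf", SVC(kernel="rbf", probability=True, random_state=0)),
            ("mlp", MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000, random_state=0)),
        ]

        # Stacking: a logistic-regression meta-learner combines the base outputs.
        stack = StackingClassifier(estimators=base_learners,
                                   final_estimator=LogisticRegression(max_iter=1000))
        stack.fit(X_tr, y_tr)
        print("stacked ensemble accuracy:", round(stack.score(X_te, y_te), 3))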

  3. A Gibbs-ensemble based technique for Monte Carlo simulation of electric double layer capacitors (EDLC) at constant voltage.

    PubMed

    Punnathanam, Sudeep N

    2014-05-07

    Current methods for molecular simulations of Electric Double Layer Capacitors (EDLC) have both the electrodes and the electrolyte region in a single simulation box. This necessitates simulation of the electrode-electrolyte region interface. Typical capacitors have macroscopic dimensions where the fraction of the molecules at the electrode-electrolyte region interface is very low. Hence, large system sizes are needed to minimize the electrode-electrolyte region interfacial effects. To overcome these problems, a new technique based on the Gibbs Ensemble is proposed for simulation of an EDLC. In the proposed technique, each electrode is simulated in a separate simulation box. Application of periodic boundary conditions eliminates the interfacial effects. This, in addition to the use of the constant voltage ensemble, allows for a more convenient comparison of simulation results with experimental measurements on typical EDLCs.

  4. Overlapped partitioning for ensemble classifiers of P300-based brain-computer interfaces.

    PubMed

    Onishi, Akinari; Natsume, Kiyohisa

    2014-01-01

    A P300-based brain-computer interface (BCI) enables a wide range of people to control devices that improve their quality of life. Ensemble classifiers with naive partitioning were recently applied to the P300-based BCI and these classification performances were assessed. However, they were usually trained on a large amount of training data (e.g., 15300). In this study, we evaluated ensemble linear discriminant analysis (LDA) classifiers with a newly proposed overlapped partitioning method using 900 training data. In addition, the classification performances of the ensemble classifier with naive partitioning and a single LDA classifier were compared. One of three conditions for dimension reduction was applied: the stepwise method, principal component analysis (PCA), or none. The results show that an ensemble stepwise LDA (SWLDA) classifier with overlapped partitioning achieved a better performance than the commonly used single SWLDA classifier and an ensemble SWLDA classifier with naive partitioning. This result implies that the performance of the SWLDA is improved by overlapped partitioning and the ensemble classifier with overlapped partitioning requires less training data than that with naive partitioning. This study contributes towards reducing the required amount of training data and achieving better classification performance.
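
    A simplified reading of overlapped partitioning is sketched below: the training trials are split into chunks that share part of their data with their neighbours, one LDA is trained per chunk, and the classifiers' decision scores are averaged. The partitioning function, feature dimensions and class shift are assumptions for illustration, not the paper's exact scheme.

        import numpy as np
        from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

        rng = np.random.default_rng(8)

        # Synthetic stand-in for 900 P300 feature vectors (target vs. non-target).
        n_train, n_dim = 900, 60
        X = rng.standard_normal((n_train, n_dim))
        y = rng.integers(0, 2, n_train)
        X[y == 1] += 0.4                                   # shift the target class slightly

        def overlapped_partitions(n, n_parts=6, overlap=0.5):
            """Index chunks of equal size in which each chunk shares `overlap` of its
            length with the next one."""
            size = int(n / (n_parts - (n_parts - 1) * overlap))
            step = int(size * (1 - overlap))
            return [np.arange(i, min(i + size, n)) for i in range(0, n - size + 1, step)][:n_parts]

        ensemble = [LinearDiscriminantAnalysis().fit(X[idx], y[idx])
                    for idx in overlapped_partitions(n_train)]

        X_test = rng.standard_normal((200, n_dim))
        X_test[:100] += 0.4
        scores = np.mean([clf.decision_function(X_test) for clf in ensemble], axis=0)
        print("predicted targets:", int((scores > 0).sum()))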

  5. Application of ensemble classifier in EEG-based motor imagery tasks

    NASA Astrophysics Data System (ADS)

    Liu, Bianhong; Hao, Hongwei

    2007-12-01

    Electroencephalogram (EEG) recorded during motor imagery tasks can be used to move a cursor to a target on a computer screen. Such an EEG-based brain-computer interface (BCI) can provide a new communication channel for subjects with neuromuscular disorders. To achieve higher speed and accuracy and to enhance the practical applications of BCI in computer-aided medical systems, an ensemble classifier is used for single classification. The ERDs at the electrodes C3 and C4 are calculated and then stacked together into the feature vector for the ensemble classifier. The ensemble classifier is based on Linear Discriminant Analysis (LDA) and Nearest Neighbor (NN). Furthermore, it takes feedback into account. This method was successfully applied in the 2003 international data analysis competition on BCI tasks (data set III). The results show that the ensemble classifier achieves a recognition rate of 90% on average, which is 5% and 3% higher than that of the LDA and NN used separately. Moreover, the ensemble classifier outperforms LDA and NN over the whole time course. Given its adequate recognition rate, ease of use and clear interpretation, the ensemble classifier can meet the timing requirements of single classification.

  6. Overlapped Partitioning for Ensemble Classifiers of P300-Based Brain-Computer Interfaces

    PubMed Central

    Onishi, Akinari; Natsume, Kiyohisa

    2014-01-01

    A P300-based brain-computer interface (BCI) enables a wide range of people to control devices that improve their quality of life. Ensemble classifiers with naive partitioning were recently applied to the P300-based BCI and these classification performances were assessed. However, they were usually trained on a large amount of training data (e.g., 15300). In this study, we evaluated ensemble linear discriminant analysis (LDA) classifiers with a newly proposed overlapped partitioning method using 900 training data. In addition, the classification performances of the ensemble classifier with naive partitioning and a single LDA classifier were compared. One of three conditions for dimension reduction was applied: the stepwise method, principal component analysis (PCA), or none. The results show that an ensemble stepwise LDA (SWLDA) classifier with overlapped partitioning achieved a better performance than the commonly used single SWLDA classifier and an ensemble SWLDA classifier with naive partitioning. This result implies that the performance of the SWLDA is improved by overlapped partitioning and the ensemble classifier with overlapped partitioning requires less training data than that with naive partitioning. This study contributes towards reducing the required amount of training data and achieving better classification performance. PMID:24695550

  7. AWE-WQ: Fast-Forwarding Molecular Dynamics Using the Accelerated Weighted Ensemble

    PubMed Central

    2015-01-01

    A limitation of traditional molecular dynamics (MD) is that reaction rates are difficult to compute. This is due to the rarity of observing transitions between metastable states since high energy barriers trap the system in these states. Recently the weighted ensemble (WE) family of methods has emerged, which can flexibly and efficiently sample conformational space without being trapped and allow calculation of unbiased rates. However, while WE can sample correctly and efficiently, a scalable implementation applicable to interesting biomolecular systems is not available. We provide here a GPLv2 implementation called AWE-WQ of a WE algorithm using the master/worker distributed computing WorkQueue (WQ) framework. AWE-WQ is scalable to thousands of nodes and supports dynamic allocation of computer resources, heterogeneous resource usage (such as central processing units (CPU) and graphics processing units (GPUs) concurrently), seamless heterogeneous cluster usage (i.e., campus grids and cloud providers), and support for arbitrary MD codes such as GROMACS, while ensuring that all statistics are unbiased. We applied AWE-WQ to a 34 residue protein which simulated 1.5 ms over 8 months with peak aggregate performance of 1000 ns/h. Comparison was done with a 200 μs simulation collected on a GPU over a similar timespan. The folding and unfolding rates were of comparable accuracy. PMID:25207854
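
    The resampling step at the heart of any weighted-ensemble code can be written compactly: walkers are binned along a progress coordinate, heavy walkers in sparse bins are split and light walkers in crowded bins are merged, and the total probability weight is conserved. The sketch below is a simplified generic WE resampler with made-up bins and walker counts, not AWE-WQ's implementation.

        import numpy as np

        rng = np.random.default_rng(9)

        def we_resample(positions, weights, bin_edges, walkers_per_bin=4):
            """One simplified weighted-ensemble resampling step: each occupied bin ends
            with exactly `walkers_per_bin` walkers and total weight is conserved."""
            new_pos, new_w = [], []
            bins = np.digitize(positions, bin_edges)
            for b in np.unique(bins):
                idx = np.where(bins == b)[0]
                pos, w = list(positions[idx]), list(weights[idx])
                while len(pos) < walkers_per_bin:            # split the heaviest walker
                    i = int(np.argmax(w))
                    w[i] /= 2.0
                    pos.append(pos[i]); w.append(w[i])
                while len(pos) > walkers_per_bin:            # merge the two lightest walkers
                    i, j = (int(k) for k in np.argsort(w)[:2])
                    keep = i if rng.random() < w[i] / (w[i] + w[j]) else j
                    drop = j if keep == i else i
                    w[keep] = w[i] + w[j]
                    del pos[drop], w[drop]
                new_pos += pos; new_w += w
            return np.array(new_pos), np.array(new_w)

        pos = rng.random(30) * 10.0                          # toy progress-coordinate values
        wts = np.full(30, 1.0 / 30)
        pos2, wts2 = we_resample(pos, wts, bin_edges=np.linspace(0, 10, 11))
        print("walkers:", len(pos2), " total weight:", round(wts2.sum(), 6))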

  8. Classification of lung cancer using ensemble-based feature selection and machine learning methods.

    PubMed

    Cai, Zhihua; Xu, Dong; Zhang, Qing; Zhang, Jiexia; Ngai, Sai-Ming; Shao, Jianlin

    2015-03-01

    Lung cancer is one of the leading causes of death worldwide. There are three major types of lung cancer: non-small cell lung cancer (NSCLC), small cell lung cancer (SCLC) and carcinoid. NSCLC is further classified into lung adenocarcinoma (LADC), squamous cell lung cancer (SQCLC) as well as large cell lung cancer. Many previous studies demonstrated that DNA methylation markers have emerged as potential lung cancer-specific biomarkers. However, whether there exists a set of DNA methylation markers simultaneously distinguishing these three types of lung cancers remains elusive. In the present study, ROC (Receiver Operating Characteristic), RFs (Random Forests) and mRMR (Maximum Relevancy and Minimum Redundancy) were proposed to capture the unbiased, informative as well as compact molecular signatures, followed by machine learning methods to classify LADC, SQCLC and SCLC. As a result, a panel of 16 DNA methylation markers exhibits an ideal classification power with an accuracy of 86.54%, 84.6% and a recall of 84.37%, 85.5% in the leave-one-out cross-validation (LOOCV) and independent data set test experiments, respectively. Besides, comparison results indicate that ensemble-based feature selection methods outperform individual ones when combined with the incremental feature selection (IFS) strategy in terms of the informative and compact property of features. Taken together, results obtained suggest the effectiveness of the ensemble-based feature selection approach and the possible existence of a common panel of DNA methylation markers among these three types of lung cancer tissue, which would facilitate clinical diagnosis and treatment.
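
    A hedged sketch of the ensemble feature selection plus incremental feature selection (IFS) workflow is given below using scikit-learn: three rankers (ANOVA F-score, mutual information and random-forest importance, approximating but not identical to the ROC/RFs/mRMR combination used in the study) are averaged into one ensemble ranking, and the marker panel is grown one feature at a time while tracking cross-validated accuracy on synthetic data.

        import numpy as np
        from sklearn.datasets import make_classification
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.feature_selection import f_classif, mutual_info_classif
        from sklearn.model_selection import cross_val_score
        from sklearn.neighbors import KNeighborsClassifier

        # Synthetic stand-in for methylation features over three tumour classes.
        X, y = make_classification(n_samples=300, n_features=100, n_informative=16,
                                   n_classes=3, n_clusters_per_class=1, random_state=0)

        def to_ranks(scores):
            """Convert scores (higher = better) into ranks (0 = best)."""
            return np.argsort(np.argsort(-scores))

        rank_f = to_ranks(f_classif(X, y)[0])
        rank_mi = to_ranks(mutual_info_classif(X, y, random_state=0))
        rf = RandomForestClassifier(n_estimators=300, random_state=0).fit(X, y)
        rank_rf = to_ranks(rf.feature_importances_)

        # Ensemble ranking: average the three per-method ranks of each feature.
        order = np.argsort((rank_f + rank_mi + rank_rf) / 3.0)

        # Incremental feature selection: keep the panel size with the best CV accuracy.
        best_k, best_acc = 1, 0.0
        for k in range(1, 31):
            acc = cross_val_score(KNeighborsClassifier(), X[:, order[:k]], y, cv=5).mean()
            if acc > best_acc:
                best_k, best_acc = k, acc
        print(f"best panel size: {best_k} features, CV accuracy {best_acc:.3f}")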

  9. Molecular dynamics simulation of configurational ensembles compatible with experimental FRET efficiency data through a restraint on instantaneous FRET efficiencies.

    PubMed

    Reif, Maria M; Oostenbrink, Chris

    2014-12-15

    Förster resonance energy transfer (FRET) measurements are widely used to investigate (bio)molecular interactions or/and association. FRET efficiencies, the primary data obtained from this method, give, in combination with the common assumption of isotropic chromophore orientation, detailed insight into the lengthscale of molecular phenomena. This study illustrates the application of a FRET efficiency restraint during classical atomistic molecular dynamics simulations of a mutant mastoparan X peptide in either water or 7 M aqueous urea. The restraint forces acting on the donor and acceptor chromophores ensure that the sampled peptide configurational ensemble satisfies the experimental primary data by modifying interchromophore separation and chromophore transition dipole moment orientations. By means of a conformational cluster analysis, it is seen that indeed different configurational ensembles may be sampled without and with application of the restraint. In particular, while the FRET efficiency and interchromophore distances monitored in an unrestrained simulation may differ from the experimentally-determined values, they can be brought in agreement with experimental data through usage of the FRET efficiency restraining potential. Furthermore, the present results suggest that the assumption of isotropic chromophore orientation is not always justified. The FRET efficiency restraint allows the generation of configurational ensembles that may not be accessible with unrestrained simulations, and thereby supports a meaningful interpretation of experimental FRET results in terms of the underlying molecular degrees of freedom. Thus, it offers an additional tool to connect the realms of computer and wet-lab experimentation.
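
    The quantities involved are easy to state in code: the instantaneous efficiency for one configuration follows the usual Foerster relation E = 1 / (1 + (r/R0)^6), and a harmonic penalty on the averaged efficiency is one simple way to express a restraint toward an experimental value. The sketch below uses that simplified running-average penalty with invented parameters; the published method restrains instantaneous efficiencies through its own functional form, which is not reproduced here.

        import numpy as np

        R0 = 5.4             # Foerster radius in nm (illustrative value)
        E_EXP = 0.45         # experimental FRET efficiency used as the restraint target
        K_RESTRAINT = 500.0  # restraint force constant in kJ/mol (illustrative)

        def fret_efficiency(r_donor, r_acceptor):
            """Instantaneous FRET efficiency, assuming the isotropic-average orientation
            factor (kappa^2 = 2/3) is folded into R0: E = 1 / (1 + (r / R0)^6)."""
            r = np.linalg.norm(r_acceptor - r_donor)
            return 1.0 / (1.0 + (r / R0) ** 6)

        def restraint_energy(avg_efficiency):
            """Harmonic penalty on the averaged efficiency; its gradient with respect to
            the chromophore coordinates would be added to the MD forces."""
            return 0.5 * K_RESTRAINT * (avg_efficiency - E_EXP) ** 2

        # Toy trajectory of donor and acceptor positions (nm).
        rng = np.random.default_rng(10)
        traj_d = rng.normal(0.0, 0.1, size=(1000, 3))
        traj_a = np.array([5.0, 0.0, 0.0]) + rng.normal(0.0, 0.4, size=(1000, 3))

        eff = np.array([fret_efficiency(d, a) for d, a in zip(traj_d, traj_a)])
        avg = eff.mean()
        print(f"<E> = {avg:.3f}, restraint energy = {restraint_energy(avg):.2f} kJ/mol")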

  10. Intelligent Ensemble Forecasting System of Stock Market Fluctuations Based on Symmetric and Asymmetric Wavelet Functions

    NASA Astrophysics Data System (ADS)

    Lahmiri, Salim; Boukadoum, Mounir

    2015-08-01

    We present a new ensemble system for stock market returns prediction where continuous wavelet transform (CWT) is used to analyze return series and backpropagation neural networks (BPNNs) for processing CWT-based coefficients, determining the optimal ensemble weights, and providing final forecasts. Particle swarm optimization (PSO) is used for finding optimal weights and biases for each BPNN. To capture symmetry/asymmetry in the underlying data, three wavelet functions with different shapes are adopted. The proposed ensemble system was tested on three Asian stock markets: the Hang Seng, KOSPI, and Taiwan stock market data. Three statistical metrics were used to evaluate the forecasting accuracy: mean of absolute errors (MAE), root mean of squared errors (RMSE), and mean of absolute deviations (MADs). Experimental results showed that our proposed ensemble system outperformed the individual CWT-ANN models, each with a different wavelet function. In addition, the proposed ensemble system outperformed the conventional autoregressive moving average process. As a result, the proposed ensemble system is suitable to capture symmetry/asymmetry in financial data fluctuations for better prediction accuracy.

  11. Ensembles of satellite aerosol retrievals based on three AATSR algorithms within aerosol_cci

    NASA Astrophysics Data System (ADS)

    Kosmale, Miriam; Popp, Thomas

    2016-04-01

    Ensemble techniques are widely used in the modelling community, combining different modelling results in order to reduce uncertainties. This approach could also be adapted to satellite measurements. Aerosol_cci is an ESA-funded project in which most of the European aerosol retrieval groups work together. The different algorithms are homogenized as far as it makes sense, but remain essentially different. Datasets are compared with ground-based measurements and with each other. Three AATSR algorithms (the Swansea University aerosol retrieval, the ADV aerosol retrieval by FMI, and the Oxford aerosol retrieval ORAC) provide 17-year global aerosol records within this project. Each of these algorithms also provides uncertainty information at pixel level. In the presented work, an ensemble of the three AATSR algorithms is constructed. The advantage over each single algorithm is the higher spatial coverage due to more measurement pixels per gridbox. A validation against ground-based AERONET measurements still shows a good correlation for the ensemble compared to the single algorithms. Annual mean maps show the global aerosol distribution, based on a combination of the three aerosol algorithms. In addition, pixel-level uncertainties of each algorithm are used to weight the contributions, in order to reduce the uncertainty of the ensemble. Results of different versions of the ensembles for aerosol optical depth will be presented and discussed. The results are validated against ground-based AERONET measurements. A higher spatial coverage on a daily basis allows better results in annual mean maps. The benefit of using pixel-level uncertainties is analysed.
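
    One way to use the pixel-level uncertainties when combining the algorithms is an inverse-variance weighted mean, sketched below with made-up per-pixel AOD values and uncertainties; the actual aerosol_cci ensemble versions may weight the contributions differently.

        import numpy as np

        def weighted_ensemble_aod(aod, sigma):
            """Combine per-pixel AOD retrievals from several algorithms with
            inverse-variance weights; NaN marks pixels an algorithm did not retrieve.
            Assumes every pixel has at least one valid retrieval."""
            aod = np.asarray(aod, float)
            w = 1.0 / np.asarray(sigma, float) ** 2
            w = np.where(np.isnan(aod), 0.0, w)
            aod = np.nan_to_num(aod)
            wsum = w.sum(axis=0)
            return (w * aod).sum(axis=0) / wsum, np.sqrt(1.0 / wsum)

        # Three algorithms, five grid cells; NaN = no retrieval for that cell.
        aod = [[0.21, 0.35, np.nan, 0.12, 0.50],
               [0.18, 0.30, 0.42, np.nan, 0.55],
               [0.25, np.nan, 0.45, 0.10, 0.48]]
        sig = [[0.05, 0.10, 0.08, 0.04, 0.15],
               [0.04, 0.06, 0.10, 0.05, 0.12],
               [0.08, 0.09, 0.07, 0.03, 0.10]]
        mean, unc = weighted_ensemble_aod(aod, sig)
        print("ensemble AOD:  ", np.round(mean, 3))
        print("ensemble sigma:", np.round(unc, 3))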

  12. Skin lesion computational diagnosis of dermoscopic images: Ensemble models based on input feature manipulation.

    PubMed

    Oliveira, Roberta B; Pereira, Aledir S; Tavares, João Manuel R S

    2017-10-01

    The number of deaths worldwide due to melanoma has risen in recent times, in part because melanoma is the most aggressive type of skin cancer. Computational systems have been developed to assist dermatologists in early diagnosis of skin cancer, or even to monitor skin lesions. However, there still remains a challenge to improve classifiers for the diagnosis of such skin lesions. The main objective of this article is to evaluate different ensemble classification models based on input feature manipulation to diagnose skin lesions. Input feature manipulation processes are based on feature subset selections from shape properties, colour variation and texture analysis to generate diversity for the ensemble models. Three subset selection models are presented here: (1) a subset selection model based on specific feature groups, (2) a correlation-based subset selection model, and (3) a subset selection model based on feature selection algorithms. Each ensemble classification model is generated using an optimum-path forest classifier and integrated with a majority voting strategy. The proposed models were applied on a set of 1104 dermoscopic images using a cross-validation procedure. The best results were obtained by the first ensemble classification model that generates a feature subset ensemble based on specific feature groups. The skin lesion diagnosis computational system achieved 94.3% accuracy, 91.8% sensitivity and 96.7% specificity. The input feature manipulation process based on specific feature subsets generated the greatest diversity for the ensemble classification model with very promising results. Copyright © 2017 Elsevier B.V. All rights reserved.
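
    The sketch below illustrates the general pattern of an input-feature-manipulation ensemble with majority voting: each base model sees only one feature subset (standing in for the shape, colour and texture groups), and predictions are combined by vote. The optimum-path forest classifier used in the paper is not available in scikit-learn, so a decision tree is used as a stand-in, and the data are synthetic rather than dermoscopic features.

```python
# Hedged sketch: one classifier per (hypothetical) feature group, majority vote.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=400, n_features=12, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

feature_groups = [np.arange(0, 4), np.arange(4, 8), np.arange(8, 12)]  # illustrative subsets
members = []
for cols in feature_groups:
    clf = DecisionTreeClassifier(random_state=0).fit(X_tr[:, cols], y_tr)
    members.append((cols, clf))

votes = np.stack([clf.predict(X_te[:, cols]) for cols, clf in members])
majority = (votes.sum(axis=0) > len(members) / 2).astype(int)          # majority vote (binary)
print("ensemble accuracy:", round(float((majority == y_te).mean()), 3))
```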

  13. GACEM: Genetic Algorithm Based Classifier Ensemble in a Multi-sensor System

    PubMed Central

    Xu, Rongwu; He, Lin

    2008-01-01

    Multi-sensor systems (MSS) have been increasingly applied in pattern classification, but the search for an optimal classification framework remains an open problem. The development of the classifier ensemble seems to provide a promising solution. The classifier ensemble is a learning paradigm in which many classifiers are jointly used to solve a problem, and it has proven to be an effective means of enhancing classification ability. In this paper, by introducing the concepts of the Meta-feature (MF) and the Trans-function (TF) to describe the relationship between the nature of the observed phenomenon and its measurement, classification in a multi-sensor system can be unified within the classifier ensemble framework. Then an approach called Genetic Algorithm based Classifier Ensemble in Multi-sensor system (GACEM) is presented, in which a genetic algorithm is utilized to optimize both the selection of feature subsets and the decision combination simultaneously. GACEM first trains a number of classifiers based on different combinations of feature vectors and then selects the classifiers whose weights exceed a pre-set threshold to make up the ensemble. An empirical study shows that, compared with conventional feature-level voting and decision-level voting, GACEM can not only achieve better and more robust performance but also simplify the system markedly. PMID:27873866

  14. Genetic algorithm based adaptive neural network ensemble and its application in predicting carbon flux

    USGS Publications Warehouse

    Xue, Y.; Liu, S.; Hu, Y.; Yang, J.; Chen, Q.

    2007-01-01

    To improve the accuracy in prediction, Genetic Algorithm based Adaptive Neural Network Ensemble (GA-ANNE) is presented. Intersections are allowed between different training sets based on the fuzzy clustering analysis, which ensures the diversity as well as the accuracy of individual Neural Networks (NNs). Moreover, to improve the accuracy of the adaptive weights of individual NNs, GA is used to optimize the cluster centers. Empirical results in predicting carbon flux of Duke Forest reveal that GA-ANNE can predict the carbon flux more accurately than Radial Basis Function Neural Network (RBFNN), Bagging NN ensemble, and ANNE. © 2007 IEEE.

  15. Probabilistic precipitation nowcasting based on an extrapolation of radar reflectivity and an ensemble approach

    NASA Astrophysics Data System (ADS)

    Sokol, Zbyněk; Mejsnar, Jan; Pop, Lukáš; Bližňák, Vojtěch

    2017-09-01

    A new method for the probabilistic nowcasting of instantaneous rain rates (ENS), based on the ensemble technique and extrapolation along Lagrangian trajectories of the current radar reflectivity, is presented. Assuming inaccurate forecasts of the trajectories, an ensemble of precipitation forecasts is calculated and used to estimate the probability that rain rates will exceed a given threshold in a given grid point. Although the extrapolation neglects the growth and decay of precipitation, their impact on the probability forecast is taken into account by the calibration of forecasts using the reliability component of the Brier score (BS). ENS forecasts the probability that the rain rates will exceed thresholds of 0.1, 1.0 and 3.0 mm/h in squares of 3 km by 3 km. The lead times were up to 60 min, and the forecast accuracy was measured by the BS. The ENS forecasts were compared with two other methods: the combined method (COM) and the neighbourhood method (NEI). NEI considered the extrapolated values in the square neighbourhood of 5 by 5 grid points around the point of interest as ensemble members, and the COM ensemble comprised the united ensemble members of ENS and NEI. The results showed that the calibration technique significantly reduces the bias of the probability forecasts by including additional uncertainties that correspond to the processes neglected during the extrapolation. In addition, the calibration can also be used for finding the limits of maximum lead times for which the forecasting method is useful. We found that ENS is useful for lead times up to 60 min for thresholds of 0.1 and 1 mm/h and approximately 30 to 40 min for a threshold of 3 mm/h. We also found that a reasonable size of the ensemble is 100 members, which provided better scores than ensembles with 10, 25 and 50 members. In terms of the BS, the best results were obtained by ENS and COM, which are comparable. However, ENS is better calibrated and thus preferable.
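
    The core probabilistic step described above, turning an ensemble of extrapolated rain rates into an exceedance probability and scoring it with the Brier score, can be sketched as follows; the members and observations are synthetic rather than radar-based nowcasts.

```python
# Hedged sketch: ensemble exceedance probability and Brier score on synthetic data.
import numpy as np

rng = np.random.default_rng(1)
n_members, n_points = 100, 500
forecast = rng.gamma(shape=0.8, scale=1.5, size=(n_members, n_points))   # mm/h
observed = rng.gamma(shape=0.8, scale=1.5, size=n_points)

threshold = 1.0                                           # mm/h
prob_exceed = (forecast > threshold).mean(axis=0)         # fraction of members above threshold
outcome = (observed > threshold).astype(float)

brier_score = np.mean((prob_exceed - outcome) ** 2)       # lower is better
print("Brier score:", round(float(brier_score), 4))
```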

  16. Comparative Visualization of Vector Field Ensembles Based on Longest Common Subsequence

    SciTech Connect

    Liu, Richen; Guo, Hanqi; Zhang, Jiang; Yuan, Xiaoru

    2016-04-19

    We propose a longest common subsequence (LCS) based approach to compute the distance among vector field ensembles. By measuring how many common blocks the ensemble pathlines pass through, the LCS distance defines the similarity among vector field ensembles by counting the number of shared domain data blocks. Compared to traditional methods (e.g. point-wise Euclidean distance or dynamic time warping distance), the proposed approach is robust to outliers, missing data, and the sampling rate of pathline timesteps. Taking advantage of smaller and reusable intermediate output, visualization based on the proposed LCS approach reveals temporal trends in the data at low storage cost and avoids tracing pathlines repeatedly. Finally, we evaluate our method on both synthetic data and simulation data, which demonstrates the robustness of the proposed approach.
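
    A minimal sketch of the idea follows: each pathline is reduced to the sequence of domain data blocks it passes through, the longest common subsequence is computed by the classic dynamic program, and a simple normalisation turns it into a distance. The block IDs and the normalisation are illustrative; the paper's block decomposition is not reproduced.

```python
# Hedged sketch: LCS-based distance between two pathlines given as block-ID sequences.
def lcs_length(a, b):
    # classic O(len(a) * len(b)) dynamic program
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[-1][-1]

def lcs_distance(a, b):
    # one simple normalisation: 1 - shared blocks / length of the longer pathline
    return 1.0 - lcs_length(a, b) / max(len(a), len(b))

pathline_1 = [3, 3, 7, 8, 12, 12, 13]     # blocks visited by pathline 1 (illustrative)
pathline_2 = [3, 7, 7, 8, 9, 13]          # blocks visited by pathline 2 (illustrative)
print(round(lcs_distance(pathline_1, pathline_2), 3))
```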

  17. Multimodal Degradation Prognostics Based on Switching Kalman Filter Ensemble.

    PubMed

    Lim, Pin; Goh, Chi Keong; Tan, Kay Chen; Dutta, Partha

    2017-01-01

    For accurate prognostics, users have to determine the current health of the system and predict its future degradation pattern. An increasingly popular approach toward tackling prognostic problems involves the use of switching models to represent the various degradation phases which the system undergoes. Such approaches have the advantage of determining the exact degradation phase of the system and being able to handle nonlinear degradation models through piecewise linear approximation. However, limitations of such existing methods include limited applicability due to the discretization of the predicted remaining useful life and insufficient robustness due to the use of single models. This paper circumvents these limitations by proposing a hybrid of ensemble methods with switching methods. The proposed method first implements a switching Kalman filter (SKF) to classify between the various linear degradation phases, and then predicts the future propagation of the fault dimension using appropriate Kalman filters for each phase. The proposed method achieves both continuous and discrete prediction values representing the remaining life and the degradation phase of the system, respectively. The proposed framework is demonstrated via a case study on benchmark simulated aeroengine data sets. The evaluation of the proposed framework shows that the proposed method achieves better accuracy and robustness against noise compared with other methods reported in the literature. The results also indicate the effectiveness of the SKF in detecting the switching point between various degradation modes.

  18. Optimized expanded ensembles for simulations involving molecular insertions and deletions. II. Open systems

    NASA Astrophysics Data System (ADS)

    Escobedo, Fernando A.

    2007-11-01

    In the Grand Canonical, osmotic, and Gibbs ensembles, chemical potential equilibrium is attained via transfers of molecules between the system and either a reservoir or another subsystem. In this work, the expanded ensemble (EXE) methods described in part I [F. A. Escobedo and F. J. Martínez-Veracoechea, J. Chem. Phys. 127, 174103 (2007)] of this series are extended to these ensembles to overcome the difficulties associated with implementing such whole-molecule transfers. In EXE, such moves occur via a target molecule that undergoes transitions through a number of intermediate coupling states. To minimize the tunneling time between the fully coupled and fully decoupled states, the intermediate states could be either: (i) sampled with an optimal frequency distribution (the sampling problem) or (ii) selected with an optimal spacing distribution (staging problem). The sampling issue is addressed by determining the biasing weights that would allow generating an optimal ensemble; discretized versions of this algorithm (well suited for small number of coupling stages) are also presented. The staging problem is addressed by selecting the intermediate stages in such a way that a flat histogram is the optimized ensemble. The validity of the advocated methods is demonstrated by their application to two model problems, the solvation of large hard spheres into a fluid of small and large spheres, and the vapor-liquid equilibrium of a chain system.

  19. Optimized expanded ensembles for simulations involving molecular insertions and deletions. II. Open systems.

    PubMed

    Escobedo, Fernando A

    2007-11-07

    In the Grand Canonical, osmotic, and Gibbs ensembles, chemical potential equilibrium is attained via transfers of molecules between the system and either a reservoir or another subsystem. In this work, the expanded ensemble (EXE) methods described in part I [F. A. Escobedo and F. J. Martinez-Veracoechea, J. Chem. Phys. 127, 174103 (2007)] of this series are extended to these ensembles to overcome the difficulties associated with implementing such whole-molecule transfers. In EXE, such moves occur via a target molecule that undergoes transitions through a number of intermediate coupling states. To minimize the tunneling time between the fully coupled and fully decoupled states, the intermediate states could be either: (i) sampled with an optimal frequency distribution (the sampling problem) or (ii) selected with an optimal spacing distribution (staging problem). The sampling issue is addressed by determining the biasing weights that would allow generating an optimal ensemble; discretized versions of this algorithm (well suited for small number of coupling stages) are also presented. The staging problem is addressed by selecting the intermediate stages in such a way that a flat histogram is the optimized ensemble. The validity of the advocated methods is demonstrated by their application to two model problems, the solvation of large hard spheres into a fluid of small and large spheres, and the vapor-liquid equilibrium of a chain system.

  20. Accurate eQTL prioritization with an ensemble-based framework.

    PubMed

    Zeng, Haoyang; Edwards, Matthew D; Guo, Yuchun; Gifford, David K

    2017-02-21

    We present a novel ensemble-based computational framework, EnsembleExpr, that achieved the best performance in the Fourth Critical Assessment of Genome Interpretation (CAGI4) "eQTL-causal SNPs" challenge for identifying eQTLs and prioritizing their gene expression effects. Expression quantitative trait loci (eQTLs) are genome sequence variants that result in gene expression changes and thus are prime suspects in the search for contributions to the causality of complex traits. When EnsembleExpr is trained on data from massively parallel reporter assays (MPRA) it accurately predicts reporter expression levels from unseen regulatory sequences and identifies sequence variants that exhibit significant changes in reporter expression. Compared with other state-of-the-art methods, EnsembleExpr achieved competitive performance when applied on eQTL datasets determined by other protocols. We envision EnsembleExpr to be a resource to help interpret non-coding regulatory variants and prioritize disease-associated mutations for downstream validation. This article is protected by copyright. All rights reserved.

  1. Determination of the DFN modeling domain size based on ensemble variability of equivalent permeability

    NASA Astrophysics Data System (ADS)

    Ji, S. H.; Koh, Y. K.

    2015-12-01

    Conceptualization of the fracture network in a disposal site is important for the safety assessment of a subsurface repository for radioactive waste. To consider the uncertainty of stochastically conceptualized discrete fracture networks (DFNs), the ensemble variability of equivalent permeability was evaluated by defining different network structures with various fracture densities and characterization levels, and analyzing the ensemble mean and variability of the equivalent permeability of the networks, where the characterization level was defined as the ratio of the number of deterministically conceptualized fractures to the total number of fractures in the domain. The results show that the hydraulic property of the generated fractures was similar among the ensembles when the fracture density was larger than the specific fracture density at which the domain size equals the correlation length of a given fracture network. In a sparsely fractured network where the fracture density was smaller than the specific fracture density, the ensemble variability was too large to ensure a consistent property from the stochastic DFN modeling. Deterministic information for a portion of a fracture network could reduce the uncertainty of the hydraulic property only when the fracture density was larger than the specific fracture density. Based on these results, the DFN modeling domain size for KAERI's (Korea Atomic Energy Research Institute) URT (Underground Research Tunnel) site that guarantees a less variable hydraulic property of the fracture network was determined by calculating the correlation length, and verified by evaluating the ensemble variability of the equivalent permeability.
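
    One simple way to quantify the ensemble variability discussed above is to compare the spread of (log) equivalent permeability across stochastic realizations, for example through the coefficient of variation; the sketch below uses synthetic permeability samples standing in for DFN upscaling results.

```python
# Hedged sketch: ensemble mean and variability of equivalent permeability
# across stochastic realizations (synthetic values, not actual DFN output).
import numpy as np

rng = np.random.default_rng(2)
# hypothetical equivalent permeabilities (m^2) for 50 realizations at two fracture densities
k_dense  = 10 ** rng.normal(loc=-14.0, scale=0.15, size=50)
k_sparse = 10 ** rng.normal(loc=-14.0, scale=0.80, size=50)

def variability(k):
    logk = np.log10(k)
    return {"mean_log10k": round(float(logk.mean()), 2),
            "cv_log10k": round(float(logk.std(ddof=1) / abs(logk.mean())), 3)}

print("dense network :", variability(k_dense))
print("sparse network:", variability(k_sparse))
```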

  2. Ensemble-based Regional Climate Prediction: Political Impacts

    NASA Astrophysics Data System (ADS)

    Miguel, E.; Dykema, J.; Satyanath, S.; Anderson, J. G.

    2008-12-01

    Accurate forecasts of regional climate, including temperature and precipitation, have significant implications for human activities, not just economically but socially. Sub-Saharan Africa is a region that has displayed an exceptional propensity for devastating civil wars. Recent research in political economy has revealed a strong statistical relationship between year-to-year fluctuations in precipitation and civil conflict in this region in the 1980s and 1990s. Investigating how climate change may modify the regional risk of civil conflict in the future requires a probabilistic regional forecast that explicitly accounts for the community's uncertainty in the evolution of rainfall under anthropogenic forcing. We approach the regional climate prediction aspect of this question through the application of a recently demonstrated method called generalized scalar prediction (Leroy et al. 2009), which predicts arbitrary scalar quantities of the climate system. This prediction method can predict change in any variable or linear combination of variables of the climate system averaged over a wide range of spatial scales, from regional to hemispheric to global. Generalized scalar prediction utilizes an ensemble of model predictions to represent the community's uncertainty range in climate modeling, in combination with a time series of any type of observational data that exhibits sensitivity to the scalar of interest. It is not necessary to prioritize models in deriving the final prediction. We present the results of applying generalized scalar prediction to regional forecasts of temperature and precipitation in Sub-Saharan Africa. We utilize the climate predictions along with the established statistical relationship between year-to-year rainfall variability and civil conflict in Sub-Saharan Africa to investigate the potential impact of climate change on civil conflict within that region.

  3. Improving SVDD classification performance on hyperspectral images via correlation based ensemble technique

    NASA Astrophysics Data System (ADS)

    Uslu, Faruk Sukru; Binol, Hamidullah; Ilarslan, Mustafa; Bal, Abdullah

    2017-02-01

    Support Vector Data Description (SVDD) is a nonparametric and powerful method for target detection and classification. The SVDD constructs a minimum hypersphere enclosing as many of the target objects as possible. It has the advantages of sparsity, good generalization and the use of kernel machines. In many studies, different methods have been offered to improve the performance of the SVDD. In this paper, we present ensemble methods to improve the classification performance of the SVDD on remotely sensed hyperspectral imagery (HSI) data. Among various ensemble approaches we have selected the bagging technique for training data sets with different combinations. As a novel weighting technique we propose a correlation-based assignment of weight coefficients. In this technique, the correlation between the bagged classifiers is calculated to assign coefficients for the weighted combination. To verify the improvement in performance, two hyperspectral images are processed for classification purposes. The obtained results show that the ensemble SVDD is significantly better than the conventional SVDD in terms of classification accuracy.
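
    The sketch below illustrates the flavour of the approach: one-class members are trained on bootstrap samples and each member's weight decreases with its mean correlation to the other members' decision scores, so that more diverse members contribute more. scikit-learn's OneClassSVM is used as a stand-in for SVDD, the weighting rule is only an illustrative variant of the correlation-based idea, and the data are synthetic rather than hyperspectral pixels.

```python
# Hedged sketch: bagged one-class ensemble with correlation-based weights.
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(3)
X_target = rng.normal(loc=0.0, scale=1.0, size=(200, 5))    # target-class training samples
X_test = rng.normal(loc=0.5, scale=1.2, size=(50, 5))       # samples to score

scores = []
for _ in range(10):                                          # bagging
    idx = rng.integers(0, len(X_target), size=len(X_target))
    model = OneClassSVM(gamma="scale", nu=0.1).fit(X_target[idx])
    scores.append(model.decision_function(X_test))
scores = np.stack(scores)                                    # (members, test samples)

corr = np.corrcoef(scores)                                   # member-vs-member correlation
mean_corr = (corr.sum(axis=1) - 1.0) / (len(scores) - 1.0)   # exclude self-correlation
weights = 1.0 - mean_corr                                    # diverse members weigh more
weights /= weights.sum()

ensemble_score = weights @ scores                            # weighted combination
print(ensemble_score[:5].round(3))
```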

  4. Cavity QED based on collective magnetic dipole coupling: spin ensembles as hybrid two-level systems.

    PubMed

    Imamoğlu, Atac

    2009-02-27

    We analyze the magnetic dipole coupling of an ensemble of spins to a superconducting microwave stripline structure, incorporating a Josephson junction based transmon qubit. We show that this system is described by an embedded Jaynes-Cummings model: in the strong coupling regime, collective spin-wave excitations of the ensemble of spins pick up the nonlinearity of the cavity mode, such that the two lowest eigenstates of the coupled spin wave-microwave cavity-Josephson junction system define a hybrid two-level system. The proposal described here enables new avenues for nonlinear optics using optical photons coupled to spin ensembles via Raman transitions. The possibility of strong coupling cavity QED with magnetic dipole transitions also opens up the possibility of extending quantum information processing protocols to spins in silicon or graphene, without the need for single-spin confinement.

  5. Discrimination of Metalloproteins by a Mini Sensor Array Based on Bispyrene Fluorophore/Surfactant Aggregate Ensembles.

    PubMed

    Cao, Yuan; Zhang, Lijun; Huang, Xinyan; Xin, Yunhong; Ding, Liping

    2016-12-28

    Fluorescent sensor arrays with pattern recognition ability have been widely used to detect and identify multiple chemically similar analytes. In the present work, two bispyrene fluorophores containing a hydrophilic oligo(oxyethylene) spacer, 6 and 4, were synthesized, one with and the other without a cholesterol unit. Their ensembles with cationic surfactant (CTAB) assemblies produce multiple fluorescence responses to different metalloproteins, including hemoglobin, myoglobin, ferritin, cytochrome c, and alcohol dehydrogenase. The combination of the fluorescence variation at the monomer and excimer emission of the two binary sensor ensembles enables the mini sensor array to provide a specific fingerprint pattern for each metalloprotein. Linear discriminant analysis shows that the two-ensemble-sensor-based array can well discriminate the five tested metalloproteins. The present work demonstrates the use of a mini sensor array to discriminate complex analytes such as proteins. The ensembles also display a very high sensitivity to the tested metalloproteins, with detection limits in the range of picomolar concentrations.

  6. Ensembl 2007.

    PubMed

    Hubbard, T J P; Aken, B L; Beal, K; Ballester, B; Caccamo, M; Chen, Y; Clarke, L; Coates, G; Cunningham, F; Cutts, T; Down, T; Dyer, S C; Fitzgerald, S; Fernandez-Banet, J; Graf, S; Haider, S; Hammond, M; Herrero, J; Holland, R; Howe, K; Howe, K; Johnson, N; Kahari, A; Keefe, D; Kokocinski, F; Kulesha, E; Lawson, D; Longden, I; Melsopp, C; Megy, K; Meidl, P; Ouverdin, B; Parker, A; Prlic, A; Rice, S; Rios, D; Schuster, M; Sealy, I; Severin, J; Slater, G; Smedley, D; Spudich, G; Trevanion, S; Vilella, A; Vogel, J; White, S; Wood, M; Cox, T; Curwen, V; Durbin, R; Fernandez-Suarez, X M; Flicek, P; Kasprzyk, A; Proctor, G; Searle, S; Smith, J; Ureta-Vidal, A; Birney, E

    2007-01-01

    The Ensembl (http://www.ensembl.org/) project provides a comprehensive and integrated source of annotation of chordate genome sequences. Over the past year the number of genomes available from Ensembl has increased from 15 to 33, with the addition of sites for the mammalian genomes of elephant, rabbit, armadillo, tenrec, platypus, pig, cat, bush baby, common shrew, microbat and european hedgehog; the fish genomes of stickleback and medaka and the second example of the genomes of the sea squirt (Ciona savignyi) and the mosquito (Aedes aegypti). Some of the major features added during the year include the first complete gene sets for genomes with low-sequence coverage, the introduction of new strain variation data and the introduction of new orthology/paralog annotations based on gene trees.

  7. Operational optimization of irrigation scheduling for citrus trees using an ensemble based data assimilation approach

    NASA Astrophysics Data System (ADS)

    Hendricks Franssen, H.; Han, X.; Martinez, F.; Jimenez, M.; Manzano, J.; Chanzy, A.; Vereecken, H.

    2013-12-01

    Data assimilation (DA) techniques, like the local ensemble transform Kalman filter (LETKF), not only offer the opportunity to update model predictions by assimilating new measurement data in real time, but also provide an improved basis for real-time (DA-based) control. This study focuses on the optimization of real-time irrigation scheduling for fields of citrus trees near Picassent (Spain). For three selected fields the irrigation was optimized with DA-based control, and for other fields irrigation was optimized on the basis of a more traditional approach in which the reference evapotranspiration for citrus trees was estimated using the FAO method. The performance of the two methods is compared for the year 2013. The DA-based real-time control approach is based on ensemble predictions of soil moisture profiles, using the Community Land Model (CLM). The uncertainty in the model predictions is introduced by feeding the model with weather predictions from an ensemble prediction system (EPS) and uncertain soil hydraulic parameters. The model predictions are updated daily by assimilating soil moisture data measured by capacitance probes. The measurement data are assimilated with the help of the LETKF. The irrigation need was calculated for each of the ensemble members and averaged, and logistic constraints (hydraulics, energy costs) were taken into account for the final assignment of irrigation in space and time. For the operational scheduling based on this approach, only model states and no model parameters were updated. Other, non-operational simulation experiments for the same period were carried out in which (1) neither the ensemble weather forecast nor DA was used (open loop), (2) only the ensemble weather forecast was used, (3) only DA was used, (4) soil hydraulic parameters were also updated in the data assimilation, and (5) both soil hydraulic and plant-specific parameters were updated. The FAO-based and DA-based real-time irrigation control are compared in terms of soil moisture
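
    The analysis step at the heart of such an ensemble-based assimilation can be sketched with the basic stochastic ensemble Kalman filter update below; this is not the localized LETKF/CLM configuration of the study, and the soil-moisture states, observation operator and error levels are illustrative only.

```python
# Hedged sketch: stochastic EnKF analysis step for a soil-moisture state vector.
import numpy as np

rng = np.random.default_rng(4)
n_ens, n_state = 32, 10                                  # ensemble members, soil layers
X = 0.30 + 0.05 * rng.normal(size=(n_state, n_ens))      # forecast soil-moisture ensemble

H = np.zeros((2, n_state)); H[0, 0] = 1.0; H[1, 4] = 1.0 # probes observe layers 0 and 4
y = np.array([0.27, 0.33])                               # capacitance-probe measurements
R = np.diag([0.02**2, 0.02**2])                          # observation-error covariance

Xm = X.mean(axis=1, keepdims=True)
A = X - Xm                                               # ensemble anomalies
Pf = A @ A.T / (n_ens - 1)                               # sample forecast covariance
K = Pf @ H.T @ np.linalg.inv(H @ Pf @ H.T + R)           # Kalman gain

Y = y[:, None] + 0.02 * rng.normal(size=(2, n_ens))      # perturbed observations
X_analysis = X + K @ (Y - H @ X)                         # updated ensemble
print("analysis mean:", X_analysis.mean(axis=1).round(3))
```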

  8. Recognition of multiple imbalanced cancer types based on DNA microarray data using ensemble classifiers.

    PubMed

    Yu, Hualong; Hong, Shufang; Yang, Xibei; Ni, Jun; Dan, Yuanyuan; Qin, Bin

    2013-01-01

    DNA microarray technology can measure the activities of tens of thousands of genes simultaneously, which provides an efficient way to diagnose cancer at the molecular level. Although this strategy has attracted significant research attention, most studies neglect an important problem, namely, that most DNA microarray datasets are skewed, which causes traditional learning algorithms to produce inaccurate results. Some studies have considered this problem, yet they merely focus on the binary-class problem. In this paper, we deal with the multiclass imbalanced classification problem, as encountered in cancer DNA microarrays, by using ensemble learning. We utilized a one-against-all coding strategy to transform the multiclass problem into multiple binary-class problems, each of which employs feature subspace, an evolving version of random subspace that generates multiple diverse training subsets. Next, we introduced one of two different correction technologies, namely decision threshold adjustment or random undersampling, into each training subset to alleviate the damage of class imbalance. Specifically, a support vector machine was used as the base classifier, and a novel voting rule called counter voting was presented for making the final decision. Experimental results on eight skewed multiclass cancer microarray datasets indicate that, unlike many traditional classification approaches, our methods are insensitive to class imbalance.
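
    The decomposition and undersampling steps described above can be sketched as follows: each class gets a one-against-all binary subproblem in which the majority side is randomly undersampled, a linear SVM is the base learner, and the final label is taken from the largest one-vs-rest decision score (a simple stand-in for the paper's counter-voting rule). The data are synthetic, not microarray profiles.

```python
# Hedged sketch: one-against-all decomposition with random undersampling.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=600, n_features=20, n_informative=8,
                           n_classes=3, weights=[0.7, 0.2, 0.1], random_state=0)
rng = np.random.default_rng(0)
classes = np.unique(y)

models = []
for c in classes:
    pos = np.where(y == c)[0]
    neg = np.where(y != c)[0]
    neg = rng.choice(neg, size=min(len(neg), len(pos)), replace=False)  # undersample
    idx = np.concatenate([pos, neg])
    models.append(LinearSVC(dual=False).fit(X[idx], (y[idx] == c).astype(int)))

scores = np.stack([clf.decision_function(X) for clf in models])         # (classes, samples)
pred = classes[np.argmax(scores, axis=0)]
print("per-class accuracy:", [round(float((pred[y == c] == c).mean()), 2) for c in classes])
```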

  9. Application of dynamic linear regression to improve the skill of ensemble-based deterministic ozone forecasts

    SciTech Connect

    Pagowski, M O; Grell, G A; Devenyi, D; Peckham, S E; McKeen, S A; Gong, W; Monache, L D; McHenry, J N; McQueen, J; Lee, P

    2006-02-02

    Forecasts from seven air quality models and surface ozone data collected over the eastern USA and southern Canada during July and August 2004 provide a unique opportunity to assess benefits of ensemble-based ozone forecasting and devise methods to improve ozone forecasts. In this investigation, past forecasts from the ensemble of models and hourly surface ozone measurements at over 350 sites are used to issue deterministic 24-h forecasts using a method based on dynamic linear regression. Forecasts of hourly ozone concentrations as well as maximum daily 8-h and 1-h averaged concentrations are considered. It is shown that the forecasts issued with the application of this method have reduced bias and root mean square error and better overall performance scores than any of the ensemble members and the ensemble average. Performance of the method is similar to another method based on linear regression described previously by Pagowski et al., but unlike the latter, the current method does not require measurements from multiple monitors since it operates on individual time series. Improvement in the forecasts can be easily implemented and requires minimal computational cost.
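
    The per-monitor correction described above can be approximated by a recursive (time-varying) linear regression: the coefficients mapping the raw ensemble forecast to the observation are updated each hour with a forgetting factor and then applied to the next forecast. The sketch below uses a generic recursive least-squares update as a stand-in for the paper's dynamic linear regression, with synthetic series.

```python
# Hedged sketch: recursive least-squares correction of a biased ensemble forecast.
import numpy as np

rng = np.random.default_rng(5)
T = 500
obs = 40 + 15 * np.sin(np.arange(T) * 2 * np.pi / 24) + rng.normal(scale=5, size=T)
raw = 0.8 * obs + 10 + rng.normal(scale=4, size=T)       # biased raw ensemble-mean forecast

lam = 0.98                                               # forgetting factor
theta = np.zeros(2)                                      # [intercept, slope]
P = np.eye(2) * 1e3
corrected = np.empty(T)

for t in range(T):
    x = np.array([1.0, raw[t]])
    corrected[t] = theta @ x                             # issue forecast before seeing obs[t]
    k = P @ x / (lam + x @ P @ x)                        # recursive least-squares update
    theta = theta + k * (obs[t] - theta @ x)
    P = (P - np.outer(k, x) @ P) / lam

rmse = lambda a, b: float(np.sqrt(np.mean((a - b) ** 2)))
print("raw RMSE:", round(rmse(raw[100:], obs[100:]), 2),
      "corrected RMSE:", round(rmse(corrected[100:], obs[100:]), 2))
```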

  10. An Improved Ensemble of Random Vector Functional Link Networks Based on Particle Swarm Optimization with Double Optimization Strategy.

    PubMed

    Ling, Qing-Hua; Song, Yu-Qing; Han, Fei; Yang, Dan; Huang, De-Shuang

    2016-01-01

    For ensemble learning, how to select and combine the candidate classifiers are two key issues that influence the performance of the ensemble system dramatically. Random vector functional link (RVFL) networks without direct input-to-output links are suitable base classifiers for ensemble systems because of their fast learning speed, simple structure and good generalization performance. In this paper, to obtain a more compact ensemble system with improved convergence performance, an improved ensemble of RVFL networks based on attractive and repulsive particle swarm optimization (ARPSO) with a double optimization strategy is proposed. In the proposed method, ARPSO is applied to select and combine the candidate RVFL networks. When using ARPSO to select the optimal base RVFL networks, ARPSO considers both the convergence accuracy on the validation data and the diversity of the candidate ensemble system to build the RVFL ensembles. In the process of combining the RVFL networks, the ensemble weights corresponding to the base RVFL networks are initialized by the minimum-norm least-squares method and then further optimized by ARPSO. Finally, a few redundant RVFL networks are pruned, and thus a more compact ensemble of RVFL networks is obtained. Moreover, in this paper, a theoretical analysis and justification of how to prune the base classifiers on classification problems is presented, and a simple and practically feasible strategy for pruning redundant base classifiers on both classification and regression problems is proposed. Since the double optimization is performed on the basis of the single optimization, the ensemble of RVFL networks built by the proposed method outperforms that built by some single optimization methods. Experimental results on function approximation and classification problems verify that the proposed method can improve convergence accuracy as well as reduce the complexity of the ensemble system.
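
    A single RVFL base learner of the kind described above is straightforward to sketch: fixed random input-to-hidden weights, a nonlinear hidden layer, and output weights solved by minimum-norm least squares. The ARPSO selection and weight-optimization stages of the paper are not reproduced here, and the regression data are synthetic.

```python
# Hedged sketch: one RVFL network (no direct input-to-output links).
import numpy as np

rng = np.random.default_rng(6)
X = rng.uniform(-1, 1, size=(300, 2))
y = np.sin(3 * X[:, 0]) + 0.5 * X[:, 1] ** 2 + 0.05 * rng.normal(size=300)

n_hidden = 50
W = rng.normal(size=(X.shape[1], n_hidden))      # random input-to-hidden weights (fixed)
b = rng.normal(size=n_hidden)                    # random hidden biases (fixed)
H = np.tanh(X @ W + b)                           # hidden-layer activations

beta = np.linalg.pinv(H) @ y                     # minimum-norm least-squares output weights
y_hat = H @ beta
print("training RMSE:", round(float(np.sqrt(np.mean((y_hat - y) ** 2))), 4))
```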

  11. An Improved Ensemble of Random Vector Functional Link Networks Based on Particle Swarm Optimization with Double Optimization Strategy

    PubMed Central

    Ling, Qing-Hua; Song, Yu-Qing; Han, Fei; Yang, Dan; Huang, De-Shuang

    2016-01-01

    For ensemble learning, how to select and combine the candidate classifiers are two key issues that influence the performance of the ensemble system dramatically. Random vector functional link (RVFL) networks without direct input-to-output links are suitable base classifiers for ensemble systems because of their fast learning speed, simple structure and good generalization performance. In this paper, to obtain a more compact ensemble system with improved convergence performance, an improved ensemble of RVFL networks based on attractive and repulsive particle swarm optimization (ARPSO) with a double optimization strategy is proposed. In the proposed method, ARPSO is applied to select and combine the candidate RVFL networks. When using ARPSO to select the optimal base RVFL networks, ARPSO considers both the convergence accuracy on the validation data and the diversity of the candidate ensemble system to build the RVFL ensembles. In the process of combining the RVFL networks, the ensemble weights corresponding to the base RVFL networks are initialized by the minimum-norm least-squares method and then further optimized by ARPSO. Finally, a few redundant RVFL networks are pruned, and thus a more compact ensemble of RVFL networks is obtained. Moreover, in this paper, a theoretical analysis and justification of how to prune the base classifiers on classification problems is presented, and a simple and practically feasible strategy for pruning redundant base classifiers on both classification and regression problems is proposed. Since the double optimization is performed on the basis of the single optimization, the ensemble of RVFL networks built by the proposed method outperforms that built by some single optimization methods. Experimental results on function approximation and classification problems verify that the proposed method can improve convergence accuracy as well as reduce the complexity of the ensemble system. PMID:27835638

  12. Probing dynamic conformations of the high-molecular-weight αB-crystallin heat shock protein ensemble by NMR spectroscopy.

    PubMed

    Baldwin, Andrew J; Walsh, Patrick; Hansen, D Flemming; Hilton, Gillian R; Benesch, Justin L P; Sharpe, Simon; Kay, Lewis E

    2012-09-19

    Solution- and solid-state nuclear magnetic resonance (NMR) spectroscopy are highly complementary techniques for studying supra-molecular structure. Here they are employed for investigating the molecular chaperone αB-crystallin, a polydisperse ensemble of between 10 and 40 identical subunits with an average molecular mass of approximately 600 kDa. An IxI motif in the C-terminal region of each of the subunits is thought to play a critical role in regulating the size distribution of oligomers and in controlling the kinetics of subunit exchange between them. Previously published solid-state NMR and X-ray results are consistent with a bound IxI conformation, while solution NMR studies provide strong support for a highly dynamic state. Here we demonstrate through FROSTY (freezing rotational diffusion of protein solutions at low temperature and high viscosity) MAS (magic angle spinning) NMR that both populations are present at low temperatures (<0 °C), while at higher temperatures only the mobile state is observed. Solution NMR relaxation dispersion experiments performed under physiologically relevant conditions establish that the motif interchanges between flexible (highly populated) and bound (sparsely populated) states. This work emphasizes the importance of using multiple methods in studies of supra-molecules, especially for highly dynamic ensembles where sample conditions can potentially affect the conformational properties observed.

  13. A hybrid space approach for ensemble-based 4-D variational data assimilation

    NASA Astrophysics Data System (ADS)

    Shao, Aimei; Xi, Shuang; Qiu, Chongjian; Xu, Qin

    2009-09-01

    A new scheme is developed to improve the ensemble-based 4-D variational data assimilation (En4DVar). In this scheme, leading singular vectors are extracted from 4-D ensemble perturbations in a hybrid space and then used to construct the analysis increment to fit the 4-D innovation (observation minus background) data. The hybrid space combines the 4-D observation space with only a gridded 3-D subspace at the end of each assimilation cycle, so its dimension can be much smaller than the dimension of the fully gridded 4-D space used in the original En4DVar. This improves the computational efficiency. With this hybrid space approach, the analysis increment can fit the 4-D innovation data in the observation space directly and also provide the necessary initial condition in the gridded 3-D subspace exclusively for the model integration into the next assimilation cycle, so the background covariance matrix can be and only needs to be constructed by the ensemble perturbations in the 3-D subspace. This reduces the rank deficiency of the ensemble-constructed covariance matrix and improves analysis accuracy as long as the observations are not too sparse. The potential merits of the new scheme are demonstrated by assimilation experiments performed with an imperfect shallow-water equation model and simulated observations.

  14. Three-dimensional theory of quantum memories based on Λ-type atomic ensembles

    SciTech Connect

    Zeuthen, Emil; Grodecka-Grad, Anna; Soerensen, Anders S.

    2011-10-15

    We develop a three-dimensional theory for quantum memories based on light storage in ensembles of Λ-type atoms, where two long-lived atomic ground states are employed. We consider light storage in an ensemble of finite spatial extent and we show that within the paraxial approximation the Fresnel number of the atomic ensemble and the optical depth are the only important physical parameters determining the quality of the quantum memory. We analyze the influence of these parameters on the storage of light followed by either forward or backward read-out from the quantum memory. We show that for small Fresnel numbers the forward memory provides higher efficiencies, whereas for large Fresnel numbers the backward memory is advantageous. The optimal light modes to store in the memory are presented together with the corresponding spin waves and outcoming light modes. We show that for high optical depths such Λ-type atomic ensembles allow for highly efficient backward and forward memories even for small Fresnel numbers F ≳ 0.1.

  15. A novel method for molecular dynamics simulation in the isothermal-isobaric ensemble

    NASA Astrophysics Data System (ADS)

    Huang, Cunkui; Li, Chunli; Choi, Phillip Y. K.; Nandakumar, K.; Kostiuk, Larry W.

    2011-01-01

    A novel algorithm is proposed to study fluid properties in the isothermal-isobaric (NPT) ensemble. The major feature of this approach is that the constant pressure in the NPT ensemble is created by two auto-adjusting boundaries that allow the system volume to fluctuate. Relative to other methods used to create the NPT ensemble, this approach is simpler to perform since no additional variables are introduced into the simulation system. To test this method, simulations of two systems with the same constant target pressure and temperature but different thermostats (Nosé-Hoover and Berendsen) were performed using a commonly used cut-off distance (i.e. r_c = 2.5σ). The simulation results show that the proposed method works well in terms of creating spatially uniform mean temperature, pressure and density while still allowing appropriate levels of instantaneous fluctuations in observable quantities. The fluctuations of the system volume produced by this method were compared with those calculated by the theoretical equation. To test the reliability of the proposed method, additional simulations were carried out at eight different thermodynamic states but with the use of a longer cut-off distance (r_c = 4.5σ). The results were compared with those obtained using the Nosé-Hoover barostat with an r_c of 4.5σ, as well as with experiments. The comparison shows that the results obtained using the algorithm proposed in this article agree well with those obtained using other methods.

  16. An ensemble of dissimilarity based classifiers for Mackerel gender determination

    NASA Astrophysics Data System (ADS)

    Blanco, A.; Rodriguez, R.; Martinez-Maranon, I.

    2014-03-01

    Mackerel is an undervalued fish captured by European fishing vessels. One way to add value to this species is to classify it according to its sex. Colour measurements were performed on gonads extracted from female and male Mackerel (fresh and defrosted) to obtain differences between the sexes. Several linear and non-linear classifiers such as Support Vector Machines (SVM), k Nearest Neighbors (k-NN) or Diagonal Linear Discriminant Analysis (DLDA) can be applied to this problem. However, they are usually based on Euclidean distances that fail to reflect accurately the sample proximities. Classifiers based on non-Euclidean dissimilarities misclassify a different set of patterns. We combine different kinds of dissimilarity-based classifiers. The diversity is induced by considering a set of complementary dissimilarities for each model. The experimental results suggest that our algorithm helps to improve classifiers based on a single dissimilarity.

  17. A Remodeled Hsp90 Molecular Chaperone Ensemble with the Novel Cochaperone Aarsd1 Is Required for Muscle Differentiation.

    PubMed

    Echeverría, Pablo C; Briand, Pierre-André; Picard, Didier

    2016-04-01

    Hsp90 is the ATP-consuming core component of a very abundant molecular chaperone machine that handles a substantial portion of the cytosolic proteome. Rather than one machine, it is in fact an ensemble of molecular machines, since most mammalian cells express two cytosolic isoforms of Hsp90 and a subset of up to 40 to 50 cochaperones and regulate their interactions and functions by a variety of posttranslational modifications. We demonstrate that the Hsp90 ensemble is fundamentally remodeled during muscle differentiation and that this remodeling is not just a consequence of muscle differentiation but possibly one of the drivers to accompany and to match the vast proteomic changes associated with this process. As myoblasts differentiate into myotubes, Hsp90α disappears and only Hsp90β remains, which is the only isoform capable of interacting with the novel muscle-specific Hsp90 cochaperone Aarsd1L. Artificially maintaining Hsp90α or knocking down Aarsd1L expression interferes with the differentiation of C2C12 myotubes. During muscle differentiation, Aarsd1L replaces the more ubiquitous cochaperone p23 and in doing so dampens the activity of the glucocorticoid receptor, one of the Hsp90 clients relevant to muscle functions. This cochaperone switch protects muscle cells against the inhibitory effects of glucocorticoids and may contribute to preventing muscle wasting induced by excess glucocorticoids.

  18. A Remodeled Hsp90 Molecular Chaperone Ensemble with the Novel Cochaperone Aarsd1 Is Required for Muscle Differentiation

    PubMed Central

    Echeverría, Pablo C.; Briand, Pierre-André

    2016-01-01

    Hsp90 is the ATP-consuming core component of a very abundant molecular chaperone machine that handles a substantial portion of the cytosolic proteome. Rather than one machine, it is in fact an ensemble of molecular machines, since most mammalian cells express two cytosolic isoforms of Hsp90 and a subset of up to 40 to 50 cochaperones and regulate their interactions and functions by a variety of posttranslational modifications. We demonstrate that the Hsp90 ensemble is fundamentally remodeled during muscle differentiation and that this remodeling is not just a consequence of muscle differentiation but possibly one of the drivers to accompany and to match the vast proteomic changes associated with this process. As myoblasts differentiate into myotubes, Hsp90α disappears and only Hsp90β remains, which is the only isoform capable of interacting with the novel muscle-specific Hsp90 cochaperone Aarsd1L. Artificially maintaining Hsp90α or knocking down Aarsd1L expression interferes with the differentiation of C2C12 myotubes. During muscle differentiation, Aarsd1L replaces the more ubiquitous cochaperone p23 and in doing so dampens the activity of the glucocorticoid receptor, one of the Hsp90 clients relevant to muscle functions. This cochaperone switch protects muscle cells against the inhibitory effects of glucocorticoids and may contribute to preventing muscle wasting induced by excess glucocorticoids. PMID:26884463

  19. Dual control cell reaction ensemble molecular dynamics: A method for simulations of reactions and adsorption in porous materials

    NASA Astrophysics Data System (ADS)

    Lísal, Martin; Brennan, John K.; Smith, William R.; Siperstein, Flor R.

    2004-09-01

    We present a simulation tool to study fluid mixtures that are simultaneously chemically reacting and adsorbing in a porous material. The method is a combination of the reaction ensemble Monte Carlo method and the dual control volume grand canonical molecular dynamics technique. The method, termed the dual control cell reaction ensemble molecular dynamics method, allows for the calculation of both equilibrium and nonequilibrium transport properties in porous materials such as diffusion coefficients, permeability, and mass flux. Control cells, which are in direct physical contact with the porous solid, are used to maintain the desired reaction and flow conditions for the system. The simulation setup closely mimics an actual experimental system in which the thermodynamic and flow parameters are precisely controlled. We present an application of the method to the dry reforming of methane reaction within a nanoscale reactor model in the presence of a semipermeable membrane that was modeled as a porous material similar to silicalite. We studied the effects of the membrane structure and porosity on the reaction species permeability by considering three different membrane models. We also studied the effects of an imposed pressure gradient across the membrane on the mass flux of the reaction species. Conversion of syngas (H2/CO) increased significantly in all the nanoscale membrane reactor models considered. A brief discussion of further potential applications is also presented.

  20. Dual control cell reaction ensemble molecular dynamics: a method for simulations of reactions and adsorption in porous materials.

    PubMed

    Lisal, Martin; Brennan, John K; Smith, William R; Siperstein, Flor R

    2004-09-08

    We present a simulation tool to study fluid mixtures that are simultaneously chemically reacting and adsorbing in a porous material. The method is a combination of the reaction ensemble Monte Carlo method and the dual control volume grand canonical molecular dynamics technique. The method, termed the dual control cell reaction ensemble molecular dynamics method, allows for the calculation of both equilibrium and nonequilibrium transport properties in porous materials such as diffusion coefficients, permeability, and mass flux. Control cells, which are in direct physical contact with the porous solid, are used to maintain the desired reaction and flow conditions for the system. The simulation setup closely mimics an actual experimental system in which the thermodynamic and flow parameters are precisely controlled. We present an application of the method to the dry reforming of methane reaction within a nanoscale reactor model in the presence of a semipermeable membrane that was modeled as a porous material similar to silicalite. We studied the effects of the membrane structure and porosity on the reaction species permeability by considering three different membrane models. We also studied the effects of an imposed pressure gradient across the membrane on the mass flux of the reaction species. Conversion of syngas (H2/CO) increased significantly in all the nanoscale membrane reactor models considered. A brief discussion of further potential applications is also presented.

  1. DNA based molecular motors

    NASA Astrophysics Data System (ADS)

    Michaelis, Jens; Muschielok, Adam; Andrecka, Joanna; Kügel, Wolfgang; Moffitt, Jeffrey R.

    2009-12-01

    Most of the essential cellular processes such as polymerisation reactions, gene expression and regulation are governed by mechanical processes. Controlled mechanical investigations of these processes are therefore required in order to take our understanding of molecular biology to the next level. Single-molecule manipulation and force spectroscopy have, over the last 15 years, been developed into extremely powerful techniques. Applying these techniques to the investigation of proteins and DNA molecules has led to a mechanistic understanding of protein function on the level of single molecules. As examples of DNA based molecular machines we describe single-molecule experiments on RNA polymerases as well as on the packaging of DNA into a viral capsid, a process that is driven by one of the most powerful molecular motors.

  2. Reduced-order flow modeling and geological parameterization for ensemble-based data assimilation

    NASA Astrophysics Data System (ADS)

    He, Jincong; Sarma, Pallav; Durlofsky, Louis J.

    2013-06-01

    Reduced-order modeling represents an attractive approach for accelerating computationally expensive reservoir simulation applications. In this paper, we introduce and apply such a methodology for data assimilation problems. The technique applied to provide flow simulation results, trajectory piecewise linearization (TPWL), has been used previously for production optimization problems, where it has provided large computational speedups. The TPWL model developed here represents simulation results for new geological realizations in terms of a linearization around previously simulated (training) cases. The high-dimensional representation of the states is projected into a low-dimensional subspace using proper orthogonal decomposition. The geological models are also represented in reduced terms using a Karhunen-Loève expansion of the log-transmissibility field. Thus, both the reservoir states and geological parameters are described very concisely. The reduced-order representation of flow and geology is appropriate for use with ensemble-based data assimilation procedures, and here it is incorporated into an ensemble Kalman filter (EnKF) framework to enrich the ensemble at a low cost. The method is able to reconstruct full-order states, which are required by EnKF, whenever necessary. The combined technique enables EnKF to be applied using many fewer high-fidelity reservoir simulations than would otherwise be required to avoid ensemble collapse. For two- and three-dimensional example cases, it is demonstrated that EnKF results using 50 high-fidelity simulations along with 150 TPWL simulations are much better than those using only 50 high-fidelity simulations (for which ensemble collapse is observed) and are, in fact, comparable to the results achieved using 200 high-fidelity simulations.
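
    The projection step described above, representing high-dimensional states in a low-dimensional subspace via proper orthogonal decomposition, can be sketched with an SVD of a snapshot matrix as below; the snapshots are synthetic, and the TPWL linearization and EnKF coupling are not reproduced.

```python
# Hedged sketch: POD basis from snapshots, with energy-based truncation.
import numpy as np

rng = np.random.default_rng(7)
n_cells, n_snapshots = 2000, 60
modes = np.stack([np.sin((k + 1) * np.linspace(0, np.pi, n_cells)) for k in range(5)], axis=1)
snapshots = modes @ rng.normal(size=(5, n_snapshots)) \
            + 0.01 * rng.normal(size=(n_cells, n_snapshots))      # synthetic pressure states

mean_state = snapshots.mean(axis=1, keepdims=True)
U, s, _ = np.linalg.svd(snapshots - mean_state, full_matrices=False)

energy = np.cumsum(s**2) / np.sum(s**2)
r = int(np.searchsorted(energy, 0.999)) + 1               # retain 99.9% of the energy
Phi = U[:, :r]                                            # POD basis

z = Phi.T @ (snapshots[:, [0]] - mean_state)              # reduced coordinates of one state
reconstruction = mean_state + Phi @ z
rel_err = np.linalg.norm(reconstruction - snapshots[:, [0]]) / np.linalg.norm(snapshots[:, [0]])
print("rank:", r, "relative reconstruction error:", round(float(rel_err), 4))
```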

  3. Profiles and majority voting-based ensemble method for protein secondary structure prediction.

    PubMed

    Bouziane, Hafida; Messabih, Belhadri; Chouarfia, Abdallah

    2011-01-01

    Machine learning techniques have been widely applied to solve the problem of predicting protein secondary structure from the amino acid sequence. They have gained substantial success in this research area. Many methods have been used, including k-Nearest Neighbors (k-NNs), Hidden Markov Models (HMMs), Artificial Neural Networks (ANNs) and Support Vector Machines (SVMs), which have attracted attention recently. Today, the main goal remains to improve the prediction quality of the secondary structure elements. The prediction accuracy has been continuously improved over the years, especially by using hybrid or ensemble methods and incorporating evolutionary information in the form of profiles extracted from alignments of multiple homologous sequences. In this paper, we investigate how best to combine k-NNs, ANNs and Multi-class SVMs (M-SVMs) to improve secondary structure prediction of globular proteins. An ensemble method which combines the outputs of two feed-forward ANNs, a k-NN and three M-SVM classifiers has been applied. Ensemble members are combined using two variants of the majority voting rule. A heuristic-based filter has also been applied to refine the prediction. To investigate how much improvement the ensemble method gives over the individual classifiers that make it up, we have experimented with the proposed system on the two widely used benchmark datasets RS126 and CB513 using cross-validation tests and including PSI-BLAST position-specific scoring matrix (PSSM) profiles as inputs. The experimental results reveal that the proposed system yields significant performance gains when compared with the best individual classifier.

  4. System for NIS Forecasting Based on Ensembles Analysis

    SciTech Connect

    2014-01-02

    BMA-NIS is a package/library designed to be called by a script (e.g. Perl or Python). The software itself is written in the R language. The software assists electric power delivery systems in planning resource availability and demand, based on historical data and current data variables. Net Interchange Schedule (NIS) is the algebraic sum of all energy scheduled to flow into or out of a balancing area during any interval. Accurate forecasts of NIS are important so that the Area Control Error (ACE) stays within an acceptable limit. To date, there are many approaches to forecasting NIS, but all of these are based on single models that can be sensitive to time-of-day and day-of-week effects.

  5. Ensemble method: Community detection based on game theory

    NASA Astrophysics Data System (ADS)

    Zhang, Xia; Xia, Zhengyou; Xu, Shengwu; Wang, J. D.

    2014-08-01

    Timely and cost-effective analytics over social networks has emerged as a key ingredient for success in many businesses and government endeavors. Community detection is an active research area of relevance to analyzing online social networks. The problem of selecting a particular community detection algorithm is crucial if the aim is to unveil the community structure of a network. The choice of a given methodology can affect the outcome of the experiments because different algorithms have different advantages and depend on tuning specific parameters. In this paper, we propose a community division model based on the notion of game theory, which can effectively combine the advantages of previous algorithms to obtain a better community classification result. Experiments on standard datasets verify that our community detection model based on game theory is valid and performs better.

  6. Protein Complex Detection via Weighted Ensemble Clustering Based on Bayesian Nonnegative Matrix Factorization

    PubMed Central

    Ou-Yang, Le; Dai, Dao-Qing; Zhang, Xiao-Fei

    2013-01-01

    Detecting protein complexes from protein-protein interaction (PPI) networks is a challenging task in computational biology. A vast number of computational methods have been proposed to undertake this task. However, each computational method is developed to capture one aspect of the network. The performance of different methods on the same network can differ substantially, even the same method may have different performance on networks with different topological characteristic. The clustering result of each computational method can be regarded as a feature that describes the PPI network from one aspect. It is therefore desirable to utilize these features to produce a more accurate and reliable clustering. In this paper, a novel Bayesian Nonnegative Matrix Factorization(NMF)-based weighted Ensemble Clustering algorithm (EC-BNMF) is proposed to detect protein complexes from PPI networks. We first apply different computational algorithms on a PPI network to generate some base clustering results. Then we integrate these base clustering results into an ensemble PPI network, in the form of weighted combination. Finally, we identify overlapping protein complexes from this network by employing Bayesian NMF model. When generating an ensemble PPI network, EC-BNMF can automatically optimize the values of weights such that the ensemble algorithm can deliver better results. Experimental results on four PPI networks of Saccharomyces cerevisiae well verify the effectiveness of EC-BNMF in detecting protein complexes. EC-BNMF provides an effective way to integrate different clustering results for more accurate and reliable complex detection. Furthermore, EC-BNMF has a high degree of flexibility in the choice of base clustering results. It can be coupled with existing clustering methods to identify protein complexes. PMID:23658709

  7. Protein complex detection via weighted ensemble clustering based on Bayesian nonnegative matrix factorization.

    PubMed

    Ou-Yang, Le; Dai, Dao-Qing; Zhang, Xiao-Fei

    2013-01-01

    Detecting protein complexes from protein-protein interaction (PPI) networks is a challenging task in computational biology. A vast number of computational methods have been proposed to undertake this task. However, each computational method is developed to capture one aspect of the network. The performance of different methods on the same network can differ substantially, even the same method may have different performance on networks with different topological characteristic. The clustering result of each computational method can be regarded as a feature that describes the PPI network from one aspect. It is therefore desirable to utilize these features to produce a more accurate and reliable clustering. In this paper, a novel Bayesian Nonnegative Matrix Factorization (NMF)-based weighted Ensemble Clustering algorithm (EC-BNMF) is proposed to detect protein complexes from PPI networks. We first apply different computational algorithms on a PPI network to generate some base clustering results. Then we integrate these base clustering results into an ensemble PPI network, in the form of weighted combination. Finally, we identify overlapping protein complexes from this network by employing Bayesian NMF model. When generating an ensemble PPI network, EC-BNMF can automatically optimize the values of weights such that the ensemble algorithm can deliver better results. Experimental results on four PPI networks of Saccharomyces cerevisiae well verify the effectiveness of EC-BNMF in detecting protein complexes. EC-BNMF provides an effective way to integrate different clustering results for more accurate and reliable complex detection. Furthermore, EC-BNMF has a high degree of flexibility in the choice of base clustering results. It can be coupled with existing clustering methods to identify protein complexes.

  8. Using multi-compartment ensemble modeling as an investigative tool of spatially distributed biophysical balances: application to hippocampal oriens-lacunosum/moleculare (O-LM) cells.

    PubMed

    Sekulić, Vladislav; Lawrence, J Josh; Skinner, Frances K

    2014-01-01

    Multi-compartmental models of neurons provide insight into the complex, integrative properties of dendrites. Because it is not feasible to experimentally determine the exact density and kinetics of each channel type in every neuronal compartment, an essential goal in developing models is to help characterize these properties. To address biological variability inherent in a given neuronal type, there has been a shift away from using hand-tuned models towards using ensembles or populations of models. In collectively capturing a neuron's output, ensemble modeling approaches uncover important conductance balances that control neuronal dynamics. However, conductances are never entirely known for a given neuron class in terms of its types, densities, kinetics and distributions. Thus, any multi-compartment model will always be incomplete. In this work, our main goal is to use ensemble modeling as an investigative tool of a neuron's biophysical balances, where the cycling between experiment and model is a design criterion from the start. We consider oriens-lacunosum/moleculare (O-LM) interneurons, a prominent interneuron subtype that plays an essential gating role of information flow in hippocampus. O-LM cells express the hyperpolarization-activated current (Ih). Although dendritic Ih could have a major influence on the integrative properties of O-LM cells, the compartmental distribution of Ih on O-LM dendrites is not known. Using a high-performance computing cluster, we generated a database of models that included those with or without dendritic Ih. A range of conductance values for nine different conductance types were used, and different morphologies explored. Models were quantified and ranked based on minimal error compared to a dataset of O-LM cell electrophysiological properties. Co-regulatory balances between conductances were revealed, two of which were dependent on the presence of dendritic Ih. These findings inform future experiments that differentiate between

  9. Using Multi-Compartment Ensemble Modeling As an Investigative Tool of Spatially Distributed Biophysical Balances: Application to Hippocampal Oriens-Lacunosum/Moleculare (O-LM) Cells

    PubMed Central

    Sekulić, Vladislav; Lawrence, J. Josh; Skinner, Frances K.

    2014-01-01

    Multi-compartmental models of neurons provide insight into the complex, integrative properties of dendrites. Because it is not feasible to experimentally determine the exact density and kinetics of each channel type in every neuronal compartment, an essential goal in developing models is to help characterize these properties. To address biological variability inherent in a given neuronal type, there has been a shift away from using hand-tuned models towards using ensembles or populations of models. In collectively capturing a neuron's output, ensemble modeling approaches uncover important conductance balances that control neuronal dynamics. However, conductances are never entirely known for a given neuron class in terms of its types, densities, kinetics and distributions. Thus, any multi-compartment model will always be incomplete. In this work, our main goal is to use ensemble modeling as an investigative tool of a neuron's biophysical balances, where the cycling between experiment and model is a design criterion from the start. We consider oriens-lacunosum/moleculare (O-LM) interneurons, a prominent interneuron subtype that plays an essential gating role of information flow in hippocampus. O-LM cells express the hyperpolarization-activated current (Ih). Although dendritic Ih could have a major influence on the integrative properties of O-LM cells, the compartmental distribution of Ih on O-LM dendrites is not known. Using a high-performance computing cluster, we generated a database of models that included those with or without dendritic Ih. A range of conductance values for nine different conductance types were used, and different morphologies explored. Models were quantified and ranked based on minimal error compared to a dataset of O-LM cell electrophysiological properties. Co-regulatory balances between conductances were revealed, two of which were dependent on the presence of dendritic Ih. These findings inform future experiments that differentiate between

  10. A single-ensemble-based hybrid approach to clutter rejection combining bilinear Hankel with regression.

    PubMed

    Shen, Zhiyuan; Feng, Naizhang; Lee, Chin-Hui

    2013-04-01

    Clutter, regarded as ultrasound Doppler echoes of soft tissue, interferes with the primary objective of color flow imaging (CFI): measurement and display of blood flow. Clutter filters based on multi-ensemble samples degrade the resolution or frame rate of CFI. The prevalent single-ensemble clutter rejection filter is based on a single rejection criterion and fails to achieve high accuracy for estimating both the low- and high-velocity blood flow components. The Bilinear Hankel-SVD achieved more exact signal decomposition than the conventional Hankel-SVD. Furthermore, the correlation between two arbitrary eigen-components obtained by the B-Hankel-SVD was demonstrated. In the hybrid approach, the input ultrasound Doppler signal first passes through a low-order regression filter, and the output is then properly decomposed into a collection of eigen-components under the framework of the B-Hankel-SVD. The blood flow components are finally extracted based on a frequency threshold. In a series of simulations, the proposed B-Hankel-SVD filter reduced the estimation bias of the blood flow relative to the conventional Hankel-SVD filter. The hybrid algorithm was shown to be more effective than regression or Hankel-SVD filters alone in rejecting the undesirable clutter components with single-ensemble (S-E) samples. It achieved a significant improvement in blood flow frequency estimation and estimation variance over the other competing filters.
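
    A rough single-ensemble sketch of the general pipeline described above (low-order regression filter, Hankel-SVD decomposition into eigen-components, frequency-threshold selection), using an ordinary Hankel matrix rather than the bilinear Hankel construction of the paper; parameter values are illustrative only.

      import numpy as np

      def regression_filter(x, order=1):
          """Subtract a low-order polynomial fit (slow clutter) from the complex slow-time ensemble."""
          t = np.arange(len(x))
          fit = (np.polyval(np.polyfit(t, x.real, order), t)
                 + 1j * np.polyval(np.polyfit(t, x.imag, order), t))
          return x - fit

      def hankel_svd_blood(x, prf, freq_threshold_hz):
          """Decompose into rank-1 eigen-components via Hankel-SVD and keep high-frequency (blood) ones."""
          n = len(x)
          L = n // 2 + 1
          H = np.array([x[i:i + n - L + 1] for i in range(L)])      # Hankel matrix
          U, s, Vh = np.linalg.svd(H, full_matrices=False)
          blood = np.zeros(n, dtype=complex)
          for k in range(len(s)):
              Hk = s[k] * np.outer(U[:, k], Vh[k])
              comp = np.array([np.mean(np.diag(Hk[:, ::-1], d))     # anti-diagonal averaging
                               for d in range(Hk.shape[1] - 1, -Hk.shape[0], -1)])
              # mean Doppler frequency of this eigen-component from its lag-one autocorrelation
              f = abs(np.angle(np.vdot(comp[:-1], comp[1:]))) * prf / (2 * np.pi)
              if f > freq_threshold_hz:
                  blood += comp
          return blood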

  11. Clustering-Based Ensemble Learning for Activity Recognition in Smart Homes

    PubMed Central

    Jurek, Anna; Nugent, Chris; Bi, Yaxin; Wu, Shengli

    2014-01-01

    Application of sensor-based technology within activity monitoring systems is becoming a popular technique within the smart environment paradigm. Nevertheless, the use of such an approach generates complex constructs of data, which subsequently requires the use of intricate activity recognition techniques to automatically infer the underlying activity. This paper explores a cluster-based ensemble method as a new solution for the purposes of activity recognition within smart environments. With this approach activities are modelled as collections of clusters built on different subsets of features. A classification process is performed by assigning a new instance to its closest cluster from each collection. Two different sensor data representations have been investigated, namely numeric and binary. Following the evaluation of the proposed methodology it has been demonstrated that the cluster-based ensemble method can be successfully applied as a viable option for activity recognition. Results following exposure to data collected from a range of activities indicated that the ensemble method had the ability to perform with accuracies of 94.2% and 97.5% for numeric and binary data, respectively. These results outperformed a range of single classifiers considered as benchmarks. PMID:25014095

  12. Clustering-based ensemble learning for activity recognition in smart homes.

    PubMed

    Jurek, Anna; Nugent, Chris; Bi, Yaxin; Wu, Shengli

    2014-07-10

    Application of sensor-based technology within activity monitoring systems is becoming a popular technique within the smart environment paradigm. Nevertheless, the use of such an approach generates complex constructs of data, which subsequently requires the use of intricate activity recognition techniques to automatically infer the underlying activity. This paper explores a cluster-based ensemble method as a new solution for the purposes of activity recognition within smart environments. With this approach activities are modelled as collections of clusters built on different subsets of features. A classification process is performed by assigning a new instance to its closest cluster from each collection. Two different sensor data representations have been investigated, namely numeric and binary. Following the evaluation of the proposed methodology it has been demonstrated that the cluster-based ensemble method can be successfully applied as a viable option for activity recognition. Results following exposure to data collected from a range of activities indicated that the ensemble method had the ability to perform with accuracies of 94.2% and 97.5% for numeric and binary data, respectively. These results outperformed a range of single classifiers considered as benchmarks.
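
    A minimal sketch of the cluster-based ensemble idea: each member clusters the training data on a different random feature subset, each cluster takes the majority activity label of its members, and a new instance is classified by a vote over its nearest cluster in every member. Class names and parameters are illustrative, and scikit-learn's KMeans stands in for whichever clustering algorithm the paper actually used.

      import numpy as np
      from collections import Counter
      from sklearn.cluster import KMeans

      class ClusterEnsemble:
          def __init__(self, n_members=10, n_clusters=8, subset_size=5, seed=0):
              self.n_members, self.n_clusters, self.subset_size = n_members, n_clusters, subset_size
              self.rng = np.random.default_rng(seed)
              self.members = []

          def fit(self, X, y):
              y = np.asarray(y)
              for _ in range(self.n_members):
                  feats = self.rng.choice(X.shape[1], self.subset_size, replace=False)
                  km = KMeans(n_clusters=self.n_clusters, n_init=10).fit(X[:, feats])
                  # label each cluster with the majority activity of its training instances
                  labels = [Counter(y[km.labels_ == c]).most_common(1)[0][0]
                            for c in range(self.n_clusters)]
                  self.members.append((feats, km, labels))
              return self

          def predict(self, X):
              votes = np.array([[labels[c] for c in km.predict(X[:, feats])]
                                for feats, km, labels in self.members])
              return np.array([Counter(votes[:, i]).most_common(1)[0][0]
                               for i in range(X.shape[0])])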

  13. Representing radar rainfall uncertainty with ensembles based on a time-variant geostatistical error modelling approach

    NASA Astrophysics Data System (ADS)

    Cecinati, Francesca; Rico-Ramirez, Miguel Angel; Heuvelink, Gerard B. M.; Han, Dawei

    2017-05-01

    The application of radar quantitative precipitation estimation (QPE) to hydrology and water quality models can be preferred to interpolated rainfall point measurements because of the wide coverage that radars can provide, together with good spatio-temporal resolution. Nonetheless, it is often limited by the proneness of radar QPE to a multitude of errors. Although radar errors have been widely studied and techniques have been developed to correct most of them, residual errors are still intrinsic in radar QPE. An estimation of the uncertainty of radar QPE and an assessment of uncertainty propagation in modelling applications are important to quantify the relative importance of the uncertainty associated with the radar rainfall input in the overall modelling uncertainty. A suitable tool for this purpose is the generation of radar rainfall ensembles. An ensemble is the representation of the rainfall field and its uncertainty through a collection of possible alternative rainfall fields, produced according to the observed errors, their spatial characteristics, and their probability distribution. The errors are derived from a comparison between radar QPE and ground point measurements. The novelty of the proposed ensemble generator is that it is based on a geostatistical approach that assures a fast and robust generation of synthetic error fields, based on the time-variant characteristics of errors. The method is developed to meet the requirement of operational applications to large datasets. The method is applied to a case study in Northern England, using the UK Met Office NIMROD radar composites at 1 km resolution and at 1 h accumulation on an area of 180 km by 180 km. The errors are estimated using a network of 199 tipping bucket rain gauges from the Environment Agency. 183 of the rain gauges are used for the error modelling, while 16 are kept apart for validation. The validation is done by comparing the radar rainfall ensemble with the values recorded by the validation rain

  14. Hybrid Molecular and Spin Dynamics Simulations for Ensembles of Magnetic Nanoparticles for Magnetoresistive Systems.

    PubMed

    Teich, Lisa; Schröder, Christian

    2015-11-13

    The development of magnetoresistive sensors based on magnetic nanoparticles which are immersed in conductive gel matrices requires detailed information about the corresponding magnetoresistive properties in order to obtain optimal sensor sensitivities. Here, crucial parameters are the particle concentration, the viscosity of the gel matrix and the particle structure. Experimentally, it is not possible to obtain detailed information about the magnetic microstructure, i.e., the orientations of the magnetic moments of the particles that define the magnetoresistive properties. By using numerical simulations, however, one can study the magnetic microstructure theoretically, although this requires performing classical spin dynamics and molecular dynamics simulations simultaneously. Here, we present such an approach which allows us to calculate the orientation and the trajectory of every single magnetic nanoparticle. This enables us to study not only the static magnetic microstructure, but also the dynamics of the structuring process in the gel matrix itself. With our hybrid approach, arbitrary sensor configurations can be investigated and their magnetoresistive properties can be optimized.

  15. Ensemble-Based Parameter Estimation in a Coupled GCM Using the Adaptive Spatial Average Method

    DOE PAGES

    Liu, Y.; Liu, Z.; Zhang, S.; ...

    2014-05-29

    Ensemble-based parameter estimation for a climate model is emerging as an important topic in climate research. And for a complex system such as a coupled ocean–atmosphere general circulation model, the sensitivity and response of a model variable to a model parameter could vary spatially and temporally. An adaptive spatial average (ASA) algorithm is proposed to increase the efficiency of parameter estimation. Refined from a previous spatial average method, the ASA uses the ensemble spread as the criterion for selecting “good” values from the spatially varying posterior estimated parameter values; these good values are then averaged to give the final global uniform posterior parameter. In comparison with existing methods, the ASA parameter estimation has a superior performance: faster convergence and enhanced signal-to-noise ratio.

  16. Ensemble-Based Parameter Estimation in a Coupled GCM Using the Adaptive Spatial Average Method

    SciTech Connect

    Liu, Y.; Liu, Z.; Zhang, S.; Rong, X.; Jacob, R.; Wu, S.; Lu, F.

    2014-05-29

    Ensemble-based parameter estimation for a climate model is emerging as an important topic in climate research. And for a complex system such as a coupled ocean–atmosphere general circulation model, the sensitivity and response of a model variable to a model parameter could vary spatially and temporally. An adaptive spatial average (ASA) algorithm is proposed to increase the efficiency of parameter estimation. Refined from a previous spatial average method, the ASA uses the ensemble spread as the criterion for selecting “good” values from the spatially varying posterior estimated parameter values; these good values are then averaged to give the final global uniform posterior parameter. In comparison with existing methods, the ASA parameter estimation has a superior performance: faster convergence and enhanced signal-to-noise ratio.
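
    A compact sketch of the adaptive spatial average idea as described in the abstract: keep the grid points where the ensemble spread of the estimated parameter is small, then average their posterior values into a single global parameter. The spread cutoff is a hypothetical choice, not taken from the paper.

      import numpy as np

      def adaptive_spatial_average(post_param, spread_quantile=0.3):
          """post_param: (n_members, n_gridpoints) spatially varying posterior parameter values.
          Grid points whose ensemble spread falls in the lowest `spread_quantile` fraction are
          treated as "good"; their ensemble means are averaged into one global posterior value."""
          spread = post_param.std(axis=0)
          cutoff = np.quantile(spread, spread_quantile)
          good = spread <= cutoff
          return post_param.mean(axis=0)[good].mean()

      # toy example: 20 members, 1000 grid points, true parameter value 2.5
      rng = np.random.default_rng(1)
      field = 2.5 + rng.normal(0, 0.1, (20, 1000)) * rng.uniform(0.5, 5.0, 1000)
      print(adaptive_spatial_average(field))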

  17. An ensemble-based approach for breast mass classification in mammography images

    NASA Astrophysics Data System (ADS)

    Ribeiro, Patricia B.; Papa, João. P.; Romero, Roseli A. F.

    2017-03-01

    Mammography analysis is an important tool that helps detect breast cancer at the very early stages of the disease, thus increasing the quality of life of hundreds of thousands of patients worldwide. In Computer-Aided Detection systems, the identification of mammograms with and without masses (without clinical findings) is highly needed to reduce the false positive rates regarding the automatic selection of regions of interest that may contain some suspicious content. In this work, we introduce a variant of the Optimum-Path Forest (OPF) classifier for breast mass identification, and we employ an ensemble-based approach that can enhance the effectiveness of the individual classifiers for this purpose. The experimental results also comprise the naïve OPF and a traditional neural network, with the most accurate results obtained through the ensemble of classifiers, with an accuracy of nearly 86%.

  18. Design of protein switches based on an ensemble model of allostery.

    PubMed

    Choi, Jay H; Laurent, Abigail H; Hilser, Vincent J; Ostermeier, Marc

    2015-04-22

    Switchable proteins that can be regulated through exogenous or endogenous inputs have a broad range of biotechnological and biomedical applications. Here we describe the design of switchable enzymes based on an ensemble allosteric model. First, we insert an enzyme domain into an effector-binding domain such that both domains remain functionally intact. Second, we induce the fusion to behave as a switch through the introduction of conditional conformational flexibility designed to increase the conformational entropy of the enzyme domain in a temperature- or pH-dependent fashion. We confirm the switching behaviour in vitro and in vivo. Structural and thermodynamic studies support the hypothesis that switching results from an increase in conformational entropy of the enzyme domain in the absence of effector. These results support the ensemble model of allostery and embody a strategy for the design of protein switches.
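
    A toy illustration of the ensemble view of allostery that such designs rely on, not the authors' construct: each of the two domains is treated as folded (F) or locally unfolded (U), only the fully folded fusion presents an intact effector-binding site, and the "switch" is the shift of population toward enzyme-folded states when effector is added. All free-energy values are invented for illustration.

      import numpy as np

      R, T = 1.987e-3, 298.0     # kcal/(mol K), K

      def active_fraction(dG_enz, dG_eff, dG_int, dG_bind, effector_conc):
          """Fraction of the four-state ensemble (FF, FU, UF, UU; enzyme domain first)
          with a folded, catalytically active enzyme domain."""
          beta = 1.0 / (R * T)
          states = {"FF": 0.0, "FU": dG_eff, "UF": dG_enz, "UU": dG_enz + dG_eff + dG_int}
          def weight(name, dG):
              w = np.exp(-beta * dG)
              if name == "FF":       # only the fully folded state binds the effector
                  w *= 1.0 + effector_conc * np.exp(-beta * dG_bind)
              return w
          Z = {s: weight(s, g) for s, g in states.items()}
          return (Z["FF"] + Z["FU"]) / sum(Z.values())

      # conditional flexibility destabilizes the enzyme domain (dG_enz < 0);
      # effector binding pulls the population back toward the active states
      for c in (0.0, 1e-6, 1e-3):
          print(c, active_fraction(dG_enz=-1.0, dG_eff=1.5, dG_int=2.0, dG_bind=-9.0, effector_conc=c))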

  19. An ensemble classification-based approach applied to retinal blood vessel segmentation.

    PubMed

    Fraz, Muhammad Moazam; Remagnino, Paolo; Hoppe, Andreas; Uyyanonvara, Bunyarit; Rudnicka, Alicja R; Owen, Christopher G; Barman, Sarah A

    2012-09-01

    This paper presents a new supervised method for segmentation of blood vessels in retinal photographs. This method uses an ensemble system of bagged and boosted decision trees and utilizes a feature vector based on the orientation analysis of gradient vector field, morphological transformation, line strength measures, and Gabor filter responses. The feature vector encodes information to handle the healthy as well as the pathological retinal image. The method is evaluated on the publicly available DRIVE and STARE databases, frequently used for this purpose and also on a new public retinal vessel reference dataset CHASE_DB1 which is a subset of retinal images of multiethnic children from the Child Heart and Health Study in England (CHASE) dataset. The performance of the ensemble system is evaluated in detail and the incurred accuracy, speed, robustness, and simplicity make the algorithm a suitable tool for automated retinal image analysis.
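
    A short sketch of the supervised per-pixel idea, with scikit-learn's bagged and boosted decision trees standing in for the paper's ensemble and the feature-extraction stage (Gabor responses, line strength, gradient orientation, morphological transforms) left as a precomputed input; argument names follow scikit-learn >= 1.2.

      import numpy as np
      from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
      from sklearn.tree import DecisionTreeClassifier

      def train_vessel_classifier(features, labels, boosted=True):
          """features: (n_pixels, n_features) per-pixel vectors; labels: 1 = vessel, 0 = background."""
          base = DecisionTreeClassifier(max_depth=8)
          if boosted:
              clf = AdaBoostClassifier(estimator=base, n_estimators=200)
          else:
              clf = BaggingClassifier(estimator=base, n_estimators=200, max_samples=0.5)
          return clf.fit(features, labels)

      def segment(clf, features, image_shape, threshold=0.5):
          """Turn per-pixel vessel probabilities into a binary vessel map."""
          prob = clf.predict_proba(features)[:, 1]
          return (prob >= threshold).reshape(image_shape)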

  20. Random feature subspace ensemble based Extreme Learning Machine for liver tumor detection and segmentation.

    PubMed

    Huang, Weimin; Yang, Yongzhong; Lin, Zhiping; Huang, Guang-Bin; Zhou, Jiayin; Duan, Yuping; Xiong, Wei

    2014-01-01

    This paper presents a new approach to detect and segment liver tumors. The detection and segmentation of liver tumors can be formulated as a novelty detection or two-class classification problem. Each voxel is characterized by a rich feature vector, and a classifier using a random feature subspace ensemble is trained to classify the voxels. Since the Extreme Learning Machine (ELM) has the advantages of very fast learning speed and good generalization ability, it is chosen to be the base classifier in the ensemble. Besides, majority voting is incorporated for fusion of the classification results from the ensemble of base classifiers. In order to further increase testing accuracy, an ELM autoencoder is implemented as a pre-training step. In automatic liver tumor detection, the ELM is trained as a one-class classifier with only healthy liver samples, and the performance is compared with the two-class ELM. In liver tumor segmentation, a semi-automatic approach is adopted by selecting samples in 3D space to train the classifier. The proposed method is tested and evaluated on a group of patients' CT data, and experiments show promising results.

  1. Planetary gearbox condition monitoring of ship-based satellite communication antennas using ensemble multiwavelet analysis method

    NASA Astrophysics Data System (ADS)

    Chen, Jinglong; Zhang, Chunlin; Zhang, Xiaoyan; Zi, Yanyang; He, Shuilong; Yang, Zhe

    2015-03-01

    Satellite communication antennas are key devices of a measurement ship to support voice, data, fax and video integration services. Condition monitoring of mechanical equipment from vibration measurement data is significant for guaranteeing safe operation and avoiding unscheduled breakdowns. So, a condition monitoring system for ship-based satellite communication antennas is designed and developed. Planetary gearboxes play an important role in the transmission train of a satellite communication antenna. However, condition monitoring of planetary gearboxes still faces challenges due to their complexity and weak condition features. This paper provides a possibility for planetary gearbox condition monitoring by proposing an ensemble multiwavelet analysis method. Benefiting from its multi-resolution analysis property and multiple wavelet basis functions, the multiwavelet has an advantage in characterizing non-stationary signals. In order to realize accurate detection of the condition feature and multi-resolution analysis in the whole frequency band, an adaptive multiwavelet basis function is constructed via increasing multiplicity, and the vibration signal is then processed by the ensemble multiwavelet transform. Finally, a normalized ensemble multiwavelet transform information entropy is computed to describe the condition of the planetary gearbox. The effectiveness of the proposed method is first validated through condition monitoring of an experimental planetary gearbox. Then this method is used for planetary gearbox condition monitoring of ship-based satellite communication antennas, and the results support its feasibility.
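
    The adaptive multiwavelet construction is specific to the paper; as a stand-in, the sketch below uses an ordinary wavelet packet decomposition from PyWavelets and only illustrates the final step, a normalized information entropy of the sub-band energy distribution used as a condition indicator.

      import numpy as np
      import pywt   # PyWavelets

      def wavelet_energy_entropy(signal, wavelet="db4", level=4):
          """Normalized Shannon entropy of the sub-band energies of a vibration signal.
          Values near 1 mean the energy is spread over many bands; lower values mean the
          energy is concentrated in a few bands, e.g. around a developing fault frequency."""
          wp = pywt.WaveletPacket(data=signal, wavelet=wavelet, maxlevel=level)
          energies = np.array([np.sum(np.square(node.data))
                               for node in wp.get_level(level, order="freq")])
          p = energies / energies.sum()
          p = p[p > 0]
          return float(-np.sum(p * np.log(p)) / np.log(len(energies)))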

  2. Ensemble global ocean forecasting

    NASA Astrophysics Data System (ADS)

    Brassington, G. B.

    2016-02-01

    A novel time-lagged ensemble system based on multiple independent cycles has been run operationally at the Australian Bureau of Meteorology for the past 3 years. Despite the use of only four cycles, the ensemble mean provided robustly higher skill and the ensemble variance was a reliable predictor of forecast errors. A spectral analysis comparing the ensemble mean with the members demonstrated the gradual increase in power of random errors with wavenumber up to a saturation length scale imposed by the resolution of the observing system. This system has been upgraded to a near-global 0.1 degree system in a new hybrid six-member ensemble configuration including a new data assimilation system, cycling pattern and initialisation. The hybrid system consists of two ensemble members per day, each with a 3 day cycle. We will outline the performance of both the deterministic and ensemble ocean forecast systems.
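
    A minimal sketch of the time-lagged ensemble idea: members are simply the forecasts from the most recent independent cycles that are valid at the same time, the ensemble mean is the headline product and the spread serves as a predictor of forecast error. The data structures here are hypothetical.

      import numpy as np

      def time_lagged_ensemble(forecasts_by_cycle, valid_time):
          """forecasts_by_cycle: dict mapping cycle name -> {valid_time: 2-D field}.
          Returns the ensemble mean and spread over all cycles covering this valid time."""
          members = [fc[valid_time] for fc in forecasts_by_cycle.values() if valid_time in fc]
          stack = np.stack(members)                   # (n_members, ny, nx)
          return stack.mean(axis=0), stack.std(axis=0)

      # toy example: four daily cycles all verifying at the same time
      rng = np.random.default_rng(2)
      truth = rng.normal(size=(50, 50))
      cycles = {f"cycle_{i}": {"2016-02-01T00": truth + rng.normal(0, 0.2 + 0.1 * i, (50, 50))}
                for i in range(4)}
      mean, spread = time_lagged_ensemble(cycles, "2016-02-01T00")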

  3. Coastal aquifer management under parameter uncertainty: Ensemble surrogate modeling based simulation-optimization

    NASA Astrophysics Data System (ADS)

    Janardhanan, S.; Datta, B.

    2011-12-01

    Surrogate models are widely used to develop computationally efficient simulation-optimization models to solve complex groundwater management problems. Artificial intelligence based models are most often used for this purpose where they are trained using predictor-predictand data obtained from a numerical simulation model. Most often this is implemented with the assumption that the parameters and boundary conditions used in the numerical simulation model are perfectly known. However, in most practical situations these values are uncertain. Under these circumstances the application of such approximation surrogates becomes limited. In our study we develop a surrogate model based coupled simulation optimization methodology for determining optimal pumping strategies for coastal aquifers considering parameter uncertainty. An ensemble surrogate modeling approach is used along with multiple realization optimization. The methodology is used to solve a multi-objective coastal aquifer management problem considering two conflicting objectives. Hydraulic conductivity and the aquifer recharge are considered as uncertain values. Three dimensional coupled flow and transport simulation model FEMWATER is used to simulate the aquifer responses for a number of scenarios corresponding to Latin hypercube samples of pumping and uncertain parameters to generate input-output patterns for training the surrogate models. Non-parametric bootstrap sampling of this original data set is used to generate multiple data sets which belong to different regions in the multi-dimensional decision and parameter space. These data sets are used to train and test multiple surrogate models based on genetic programming. The ensemble of surrogate models is then linked to a multi-objective genetic algorithm to solve the pumping optimization problem. Two conflicting objectives, viz, maximizing total pumping from beneficial wells and minimizing the total pumping from barrier wells for hydraulic control of

  4. Exploring the Alzheimer amyloid-β peptide conformational ensemble: A review of molecular dynamics approaches.

    PubMed

    Tran, Linh; Ha-Duong, Tâp

    2015-07-01

    Alzheimer's disease is one of the most common forms of dementia among the elderly worldwide. There are currently no therapeutic drugs that treat this disease effectively. One main reason is the poorly understood mechanism of Aβ peptide aggregation, which plays a crucial role in the development of Alzheimer's disease. It remains challenging to experimentally or theoretically characterize the secondary and tertiary structures of the Aβ monomer because of its high flexibility and aggregation propensity, and the conformations that lead to aggregation are not fully identified. In this review, we highlight various structural ensembles of the Aβ peptide revealed and characterized by computational approaches in order to find converging structures of the Aβ monomer. Understanding how the Aβ peptide forms transiently stable structures prior to aggregation will contribute to the design of new therapeutic molecules against Alzheimer's disease. Copyright © 2015 Elsevier Inc. All rights reserved.

  5. Multi-model Ensembling based on Predictor State Space: Seasonal Streamflow Forecasts and Causal Relations

    NASA Astrophysics Data System (ADS)

    Arumugam, S.; Devineni, N.; Ghosh, S.

    2006-12-01

    Seasonal streamflow forecasts contingent on climate information are essential for short-term planning and for setting up contingency measures during extreme years. Recent research shows that operational climate forecasts obtained from multiple General Circulation Models (GCMs) have improved predictability compared with climate forecasts from single GCMs. In this study, we present a new approach for multi-model ensembling by evaluating model performance from the predictor state space. By analyzing model performance using retrospective forecasts, we show that any systematic errors in model prediction with reference to the predictor state could be reduced by combining forecasts from multiple models as well as with climatology. The methodology is demonstrated for obtaining seasonal streamflow forecasts for the Neuse river basin from two different GCMs and from two statistical models. We employ the Rank Probability Score (RPS) as the basis for developing multi-model ensembles. The performance of the multi-model forecasts is compared with the individual models' performance using various forecast verification measures, including reliability diagrams and the likelihood ratio. By developing both retrospective and adaptive forecasts using this methodology, we show that evaluating model performance from the predictor state space is a good alternative for developing multi-model ensembles, instead of climatology (long-term predictability) based model performance evaluation.
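
    A small sketch of the kind of skill-based combination the abstract describes, using the Rank Probability Score over retrospective tercile forecasts to weight candidate models; the inverse-RPS weighting rule is an illustrative stand-in, not the paper's predictor-state-space scheme.

      import numpy as np

      def rps(forecast_probs, observed_category):
          """Rank probability score for one forecast over ordered categories
          (e.g. below/near/above-normal streamflow); lower is better."""
          cum_f = np.cumsum(forecast_probs)
          cum_o = np.cumsum(np.eye(len(forecast_probs))[observed_category])
          return float(np.sum((cum_f - cum_o) ** 2))

      def rps_weights(retro_probs, retro_obs):
          """retro_probs: (n_models, n_years, n_categories) retrospective forecasts;
          retro_obs: (n_years,) observed categories. Weight each model by inverse mean RPS."""
          mean_rps = np.array([np.mean([rps(p, o) for p, o in zip(model, retro_obs)])
                               for model in retro_probs])
          w = 1.0 / np.maximum(mean_rps, 1e-9)
          return w / w.sum()

      def combine(probs_by_model, weights):
          """Weighted multi-model category probabilities for a new forecast."""
          return np.tensordot(weights, probs_by_model, axes=1)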

  6. Fireball as the result of self-organization of an ensemble of diamagnetic electron-ion nanoparticles in molecular gas

    SciTech Connect

    Lopasov, V. P.

    2011-12-15

    The conditions for dissipative self-organization of a fireball (FB) in a molecular gas by means of a regular correction of elastic collisions of water and nitrogen molecules by the field of a coherent bi-harmonic light wave (BLW) are presented. The BLW field is generated due to conversion of the energy of a linear lightning discharge into light energy. A FB consists of two components: an ensemble of optically active diamagnetic electron-ion nanoparticles and a standing wave of elliptical polarization (SWEP). It is shown that the FB lifetime depends on the energies accumulated by the nanoparticles and the SWEP field and on the stability of self-oscillations of the energy between the nanoparticles and the SWEP.

  7. BODIPY-based azamacrocyclic ensemble for selective fluorescence detection and quantification of homocysteine in biological applications.

    PubMed

    Li, Zan; Geng, Zhi-Rong; Zhang, Cui; Wang, Xiao-Bo; Wang, Zhi-Lin

    2015-10-15

    Considering the significant role of plasma homocysteine in physiological processes, two ensembles (F465-Cu(2+) and F508-Cu(2+)) were constructed based on a BODIPY (4,4-difluoro-1,3,5,7-tetramethyl-4-bora-3a,4a-diaza-s-indacene) scaffold conjugated with an azamacrocyclic (1,4,7-triazacyclononane and 1,4,7,10-tetraazacyclododecane) Cu(2+) complex. The results of this effort demonstrated that the F465-Cu(2+) ensemble could be employed to detect homocysteine in the presence of other biologically relevant species, including cysteine and glutathione, under physiological conditions with high selectivity and sensitivity in the turn-on fluorescence mode, while the F508-Cu(2+) ensemble showed no fluorescence responses toward biothiols. A possible mechanism for this homocysteine specificity involving the formation of a homocysteine-induced six-membered ring sandwich structure was proposed and confirmed for the first time by time-dependent fluorescence spectra, ESI-MS and EPR. The detection limit of homocysteine in deproteinized human serum was calculated to be 241.4 nM with a linear range of 0-90.0 μM and the detection limit of F465 for Cu(2+) is 74.7 nM with a linear range of 0-6.0 μM (F508, 80.2 nM, 0-7.0 μM). We have demonstrated the application of the F465-Cu(2+) ensemble for detecting homocysteine in human serum and monitoring the activity of cystathionine β-synthase in vitro.

  8. Super Ensemble-based Aviation Turbulence Guidance (SEATG) for Air Traffic Management (ATM)

    NASA Astrophysics Data System (ADS)

    Kim, Jung-Hoon; Chan, William; Sridhar, Banavar; Sharman, Robert

    2014-05-01

    Super Ensemble (ensemble of ten turbulence metrics from time-lagged ensemble members of weather forecast data)-based Aviation Turbulence Guidance (SEATG) is developed using the Weather Research and Forecasting (WRF) model and in-situ eddy dissipation rate (EDR) observations from commercial aircraft over the contiguous United States. SEATG is a sequence of five procedures including weather modeling, calculating turbulence metrics, mapping the EDR scale, evaluating metrics, and producing the final SEATG forecast. This uses a similar methodology to the operational Graphic Turbulence Guidance (GTG) with three major improvements. First, SEATG uses a higher resolution (3-km) WRF model to capture cloud-resolving scale phenomena. Second, SEATG computes turbulence metrics for multiple forecasts that are combined at the same valid time, resulting in a time-lagged ensemble of multiple turbulence metrics. Third, SEATG provides both deterministic and probabilistic turbulence forecasts to take into account weather uncertainties and user demands. It is found that the SEATG forecasts match well with observed radar reflectivity along a surface front as well as convectively induced turbulence outside the clouds on 7-8 Sep 2012. And the overall performance skill of deterministic SEATG against the observed EDR data during this period is superior to any single turbulence metric. Finally, probabilistic SEATG is used as an example application of turbulence forecasts for air-traffic management. In this study, a simple Wind-Optimal Route (WOR) passing through the potential areas of probabilistic SEATG and a Lateral Turbulence Avoidance Route (LTAR) taking into account the SEATG are calculated at z = 35000 ft (z = 12 km) from Los Angeles to John F. Kennedy international airports. As a result, WOR takes a total of 239 minutes with 16 minutes in SEATG areas of 40% moderate turbulence potential, while LTAR takes a total of 252 minutes travel time that 5% of fuel would be additionally consumed to entirely

  9. Application of new methods based on ECMWF ensemble model for predicting severe convective weather situations

    NASA Astrophysics Data System (ADS)

    Lazar, Dora; Ihasz, Istvan

    2013-04-01

    The short- and medium-range operational forecasting, warning and alarm of severe weather are among the most important activities of the Hungarian Meteorological Service. Our study provides a comprehensive summary of newly developed methods based on ECMWF ensemble forecasts to assist successful prediction of convective weather situations. In the first part of the study a brief overview is given of the components of atmospheric convection, which are the atmospheric lifting force, convergence and vertical wind shear. Atmospheric instability is often characterized by so-called instability indices; one of the most popular and often used indices is the convective available potential energy. Heavy convective events, like intensive storms, supercells and tornadoes, require vertical instability, adequate moisture and vertical wind shear. As a first step, various statistical analyses of these three parameters were performed on a nine-year time series of the 51-member ensemble forecast model for the convective summer period. The relationship between the ratio of convective to total precipitation and the above three parameters was studied by different statistical methods. Four new visualization methods were applied to support successful forecasts of severe weather. Two of the four visualization methods, the ensemble meteogram and the ensemble vertical profiles, had been available at the beginning of our work. Both methods show the probability of the meteorological parameters for the selected location. Additionally, two new methods have been developed. The first method provides a probability map of the event exceeding predefined values, so the spatial uncertainty of the event is well defined. Since convective weather events often occur rhapsodically in space rather than where expected, the event area can be selected so that the ensemble forecasts give very good support. Another new visualization tool shows time

  10. Ensembl 2017

    PubMed Central

    Aken, Bronwen L.; Achuthan, Premanand; Akanni, Wasiu; Amode, M. Ridwan; Bernsdorff, Friederike; Bhai, Jyothish; Billis, Konstantinos; Carvalho-Silva, Denise; Cummins, Carla; Clapham, Peter; Gil, Laurent; Girón, Carlos García; Gordon, Leo; Hourlier, Thibaut; Hunt, Sarah E.; Janacek, Sophie H.; Juettemann, Thomas; Keenan, Stephen; Laird, Matthew R.; Lavidas, Ilias; Maurel, Thomas; McLaren, William; Moore, Benjamin; Murphy, Daniel N.; Nag, Rishi; Newman, Victoria; Nuhn, Michael; Ong, Chuang Kee; Parker, Anne; Patricio, Mateus; Riat, Harpreet Singh; Sheppard, Daniel; Sparrow, Helen; Taylor, Kieron; Thormann, Anja; Vullo, Alessandro; Walts, Brandon; Wilder, Steven P.; Zadissa, Amonida; Kostadima, Myrto; Martin, Fergal J.; Muffato, Matthieu; Perry, Emily; Ruffier, Magali; Staines, Daniel M.; Trevanion, Stephen J.; Cunningham, Fiona; Yates, Andrew; Zerbino, Daniel R.; Flicek, Paul

    2017-01-01

    Ensembl (www.ensembl.org) is a database and genome browser for enabling research on vertebrate genomes. We import, analyse, curate and integrate a diverse collection of large-scale reference data to create a more comprehensive view of genome biology than would be possible from any individual dataset. Our extensive data resources include evidence-based gene and regulatory region annotation, genome variation and gene trees. An accompanying suite of tools, infrastructure and programmatic access methods ensure uniform data analysis and distribution for all supported species. Together, these provide a comprehensive solution for large-scale and targeted genomics applications alike. Among many other developments over the past year, we have improved our resources for gene regulation and comparative genomics, and added CRISPR/Cas9 target sites. We released new browser functionality and tools, including improved filtering and prioritization of genome variation, Manhattan plot visualization for linkage disequilibrium and eQTL data, and an ontology search for phenotypes, traits and disease. We have also enhanced data discovery and access with a track hub registry and a selection of new REST end points. All Ensembl data are freely released to the scientific community and our source code is available via the open source Apache 2.0 license. PMID:27899575
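
    The abstract mentions a selection of new REST end points; as a brief illustration (not drawn from the article itself), the public service at rest.ensembl.org can be queried for a gene record roughly as follows, assuming the documented /lookup/symbol endpoint and JSON content type.

      import requests

      server = "https://rest.ensembl.org"
      endpoint = "/lookup/symbol/homo_sapiens/BRCA2"
      resp = requests.get(server + endpoint,
                          params={"expand": 1},
                          headers={"Content-Type": "application/json"})
      resp.raise_for_status()
      gene = resp.json()
      # basic coordinates of the gene as annotated in the current Ensembl release
      print(gene["id"], gene["seq_region_name"], gene["start"], gene["end"])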

  11. The role of ensemble-based statistics in variational assimilation of cloud-affected observations from infrared imagers

    NASA Astrophysics Data System (ADS)

    Hacker, Joshua; Vandenberghe, Francois; Jung, Byoung-Jo; Snyder, Chris

    2017-04-01

    Effective assimilation of cloud-affected radiance observations from space-borne imagers, with the aim of improving cloud analysis and forecasting, has proven to be difficult. Large observation biases, nonlinear observation operators, and non-Gaussian innovation statistics present many challenges. Ensemble-variational data assimilation (EnVar) systems offer the benefits of flow-dependent background error statistics from an ensemble, and the ability of variational minimization to handle nonlinearity. The specific benefits of ensemble statistics, relative to static background errors more commonly used in variational systems, have not been quantified for the problem of assimilating cloudy radiances. A simple experiment framework is constructed with a regional NWP model and operational variational data assimilation system, to provide the basis for understanding the importance of ensemble statistics in cloudy radiance assimilation. Restricting the observations to those corresponding to clouds in the background forecast leads to innovations that are more Gaussian. The number of large innovations is reduced compared to the more general case of all observations, but not eliminated. The Huber norm is investigated to handle the fat tails of the distributions, and allow more observations to be assimilated without the need for strict background checks that eliminate them. Comparing assimilation using only ensemble background error statistics with assimilation using only static background error statistics elucidates the importance of the ensemble statistics. Although the cost functions in both experiments converge to similar values after sufficient outer-loop iterations, the resulting cloud water, ice, and snow content are greater in the ensemble-based analysis. The subsequent forecasts from the ensemble-based analysis also retain more condensed water species, indicating that the local environment is more supportive of clouds. In this presentation we provide details that explain the

  12. Formulation of state projected centroid molecular dynamics: Microcanonical ensemble and connection to the Wigner distribution

    NASA Astrophysics Data System (ADS)

    Orr, Lindsay; Hernández de la Peña, Lisandro; Roy, Pierre-Nicholas

    2017-06-01

    A derivation of quantum statistical mechanics based on the concept of a Feynman path centroid is presented for the case of generalized density operators using the projected density operator formalism of Blinov and Roy [J. Chem. Phys. 115, 7822-7831 (2001)]. The resulting centroid densities, centroid symbols, and centroid correlation functions are formulated and analyzed in the context of the canonical equilibrium picture of Jang and Voth [J. Chem. Phys. 111, 2357-2370 (1999)]. The case where the density operator projects onto a particular energy eigenstate of the system is discussed, and it is shown that one can extract microcanonical dynamical information from double Kubo transformed correlation functions. It is also shown that the proposed projection operator approach can be used to formally connect the centroid and Wigner phase-space distributions in the zero reciprocal temperature β limit. A Centroid Molecular Dynamics (CMD) approximation to the state-projected exact quantum dynamics is proposed and proven to be exact in the harmonic limit. The state projected CMD method is also tested numerically for a quartic oscillator and a double-well potential and found to be more accurate than canonical CMD. In the case of a ground state projection, this method can resolve tunnelling splittings of the double well problem in the higher barrier regime where canonical CMD fails. Finally, the state-projected CMD framework is cast in a path integral form.

  13. A unified thermostat scheme for efficient configurational sampling for classical/quantum canonical ensembles via molecular dynamics

    NASA Astrophysics Data System (ADS)

    Zhang, Zhijun; Liu, Xinzijian; Chen, Zifei; Zheng, Haifeng; Yan, Kangyu; Liu, Jian

    2017-07-01

    We show a unified second-order scheme for constructing simple, robust, and accurate algorithms for typical thermostats for configurational sampling for the canonical ensemble. When Langevin dynamics is used, the scheme leads to the BAOAB algorithm that has been recently investigated. We show that the scheme is also useful for other types of thermostats, such as the Andersen thermostat and Nosé-Hoover chain, regardless of whether the thermostat is deterministic or stochastic. In addition to analytical analysis, two 1-dimensional models and three typical real molecular systems that range from the gas phase, clusters, to the condensed phase are used in numerical examples for demonstration. Accuracy may be increased by an order of magnitude for estimating coordinate-dependent properties in molecular dynamics (when the same time interval is used), irrespective of which type of thermostat is applied. The scheme is especially useful for path integral molecular dynamics because it consistently improves the efficiency for evaluating all thermodynamic properties for any type of thermostat.
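
    For reference, a one-particle sketch of the BAOAB splitting for Langevin dynamics mentioned above (B: half momentum kick, A: half position drift, O: exact Ornstein-Uhlenbeck momentum update), applied to a harmonic oscillator; this is the generic textbook form, not the authors' unified scheme for other thermostats.

      import numpy as np

      def baoab(x, p, force, mass, dt, gamma, kT, n_steps, rng):
          """Sample the canonical configurational distribution with the BAOAB integrator."""
          c1 = np.exp(-gamma * dt)
          c2 = np.sqrt((1.0 - c1 ** 2) * mass * kT)
          f = force(x)
          traj = np.empty(n_steps)
          for i in range(n_steps):
              p += 0.5 * dt * f                 # B: half kick
              x += 0.5 * dt * p / mass          # A: half drift
              p = c1 * p + c2 * rng.normal()    # O: exact OU update of the momentum
              x += 0.5 * dt * p / mass          # A: half drift
              f = force(x)
              p += 0.5 * dt * f                 # B: half kick
              traj[i] = x
          return traj

      # harmonic oscillator with k = 1 at kT = 1: sampled <x^2> should approach kT/k = 1
      rng = np.random.default_rng(3)
      xs = baoab(x=0.0, p=0.0, force=lambda x: -x, mass=1.0,
                 dt=0.05, gamma=1.0, kT=1.0, n_steps=200_000, rng=rng)
      print(np.mean(xs[20_000:] ** 2))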

  14. Role of different Pd/Pt ensembles in determining CO chemisorption on Au-based bimetallic alloys: A first-principles study

    NASA Astrophysics Data System (ADS)

    Ham, Hyung Chul; Manogaran, Dhivya; Hwang, Gyeong S.; Han, Jonghee; Kim, Hyoung-Juhn; Nam, Suk Woo; Lim, Tae Hoon

    2015-03-01

    Using spin-polarized density functional calculations, we investigate the role of different Pd/Pt ensembles in determining CO chemisorption on Au-based bimetallic alloys through a study of the energetics, charge transfer, geometric and electronic structures of CO on various Pd/Pt ensembles (monomer/dimer/trimer/tetramer). We find that the effect of Pd ensembles on the reduction of CO chemisorption energy is much larger than the Pt ensemble case. In particular, small-sized Pd ensembles like monomer show a substantial reduction of CO chemisorption energy compared to the pure Pd (1 1 1) surface, while there are no significant size and shape effects of Pt ensembles on CO chemisorption energy. This is related to two factors: (1) the steeper potential energy surface (PES) of CO in Pd (1 1 1) than in Pt (1 1 1), indicating that the effect of switch of binding site preference on CO chemisorption energy is much larger in Pd ensembles than in Pt ensembles, and (2) down-shift of d-band in Pd ensembles/up-shift of d-band in Pt ensembles as compared to the corresponding pure Pd (1 1 1)/Pt (1 1 1) surfaces, suggesting more reduced activity of Pd ensembles toward CO adsorption than the Pt ensemble case. We also present the different bonding mechanism of CO on Pd/Pt ensembles by the analysis of orbital resolved density of state.

  15. Characterizing the structural ensemble of γ-secretase using a multiscale molecular dynamics approach.

    PubMed

    Aguayo-Ortiz, Rodrigo; Chávez-García, Cecilia; Straub, John E; Dominguez, Laura

    2017-08-01

    γ-Secretase is an intramembrane-cleaving aspartyl protease that plays an essential role in the processing of a variety of integral membrane proteins. Its role in the ultimate cleavage step in the processing of amyloid precursor protein to form amyloid-β (Aβ) peptide makes it an important therapeutic target in Alzheimer's disease research. Significant recent advances have been made in structural studies of this critical membrane protein complex. However, details of the mechanism of activation of the enzyme complex remain unclear. Using a multiscale computational modeling approach, combining multiple coarse-grained microsecond dynamic trajectories with all-atom models, the structure and two conformational states of the γ-secretase complex were evaluated. The transition between enzymatic state 1 and state 2 is shown to critically depend on the protonation states of the key catalytic residues Asp257 and Asp385 in the active site domain. The active site formation, related to our γ-secretase state 2, is observed to involve a concerted movement of four transmembrane helices from the catalytic subunit, resulting in the required localization of the catalytic residues. Global analysis of the structural ensemble of the enzyme complex was used to identify collective fluctuations important to the mechanism of substrate recognition and demonstrate that the corresponding fluctuations observed were uncorrelated with structural changes associated with enzyme activation. Overall, this computational study provides essential insight into the role of structure and dynamics in the activation and function of γ-secretase.

  16. Nonequilibrium and generalized-ensemble molecular dynamics simulations for amyloid fibril

    SciTech Connect

    Okumura, Hisashi

    2015-12-31

    Amyloids are insoluble and misfolded fibrous protein aggregates and associated with more than 20 serious human diseases. We perform all-atom molecular dynamics simulations of amyloid fibril assembly and disassembly.

  17. A new strategy for snow-cover mapping using remote sensing data and ensemble based systems techniques

    NASA Astrophysics Data System (ADS)

    Roberge, S.; Chokmani, K.; De Sève, D.

    2012-04-01

    The snow cover plays an important role in the hydrological cycle of Quebec (Eastern Canada). Consequently, evaluating its spatial extent interests the authorities responsible for the management of water resources, especially hydropower companies. The main objective of this study is the development of a snow-cover mapping strategy using remote sensing data and ensemble-based systems techniques. Planned to be tested in a near real-time operational mode, this snow-cover mapping strategy has the advantage of providing the probability of a pixel to be snow covered and its uncertainty. Ensemble systems are made of two key components. First, a method is needed to build an ensemble of classifiers that is as diverse as possible. Second, an approach is required to combine the outputs of individual classifiers that make up the ensemble in such a way that correct decisions are amplified and incorrect ones are cancelled out. In this study, we demonstrate the potential of ensemble systems for snow-cover mapping using remote sensing data. The chosen classifier is a sequential thresholds algorithm using NOAA-AVHRR data adapted to conditions over Eastern Canada. Its special feature is the use of a combination of six sequential thresholds varying according to the day in the winter season. Two versions of the snow-cover mapping algorithm have been developed: one is specific to autumn (from October 1st to December 31st) and the other to spring (from March 16th to May 31st). In order to build the ensemble-based system, different versions of the algorithm are created by randomly varying its parameters. One hundred versions are included in the ensemble. The probability of a pixel being snow, no-snow or cloud covered corresponds to the fraction of votes for which the pixel has been classified as such by all classifiers. The overall performance of ensemble-based mapping is compared to the overall performance of the chosen classifier, and also with ground observations at meteorological

  18. A Statistical Investigation of the Sensitivity of Ensemble-Based Kalman Filters to Covariance Filtering

    DTIC Science & Technology

    2011-09-01

    averaging: A review. Mon. Wea. Rev., 138, 3693-3720. Bishop, C. H., and D. Hodyss, 2007: Flow adaptive moderation of spurious ensemble correlations... beyond a prescribed distance d. Some localization functions do not change the sample-based estimate of the covariance between the state vector components u and y when the distance |u - y| associated with the pair is smaller than d, but replace it with zero when |u - y| ≥ d (e.g., Houtekamer and Mitchell

  19. Optimal control of light storage in atomic ensemble based on photon echoes

    NASA Astrophysics Data System (ADS)

    Wu, Tingwan; Chen, Qinzhi

    2009-11-01

    This paper presents a simple quantum memory method for efficient storage and retrieval of light. The technique is based on the principle of controlled reversible inhomogeneous broadening, in which the information of the quantum state of light is imprinted in an ensemble of two-level atoms and recalled by flipping the external nonuniform electric field. In the present work, the induced Stark shift varies linearly with position, and a numerical analysis of this protocol has been carried out. It shows that the storage efficiency can nearly reach 100% with a large enough optical depth, and the optimal broadening for a given pulse width is also analyzed.

  20. Ensemble-based evaluation of extreme water levels for the eastern Baltic Sea

    NASA Astrophysics Data System (ADS)

    Eelsalu, Maris; Soomere, Tarmo

    2016-04-01

    The risks and damages associated with coastal flooding, which naturally grow with an increase in the magnitude of extreme storm surges, are one of the largest concerns of countries with extensive low-lying nearshore areas. The relevant risks are even more pronounced for semi-enclosed water bodies such as the Baltic Sea, where subtidal (weekly-scale) variations in the water volume of the sea substantially contribute to the water level and lead to a large spread of projections of future extreme water levels. We explore the options for using large ensembles of projections to more reliably evaluate return periods of extreme water levels. Single projections of the ensemble are constructed by means of fitting several sets of block maxima with various extreme value distributions. The ensemble is based on two simulated data sets produced at the Swedish Meteorological and Hydrological Institute. A hindcast by the Rossby Centre Ocean model is sampled with a resolution of 6 h, and a similar hindcast by the circulation model NEMO with a resolution of 1 h. As the annual maxima of water levels in the Baltic Sea are not always uncorrelated, we employ maxima for calendar years and for stormy seasons. As the shape parameter of the Generalised Extreme Value distribution changes its sign and substantially varies in magnitude along the eastern coast of the Baltic Sea, the use of a single distribution for the entire coast is inappropriate. The ensemble involves projections based on the Generalised Extreme Value, Gumbel and Weibull distributions. The parameters of these distributions are evaluated in three different ways: the maximum likelihood method and the method of moments based on both biased and unbiased estimates. The total number of projections in the ensemble is 40. As some of the resulting estimates contain limited additional information, the members of pairs of projections that are highly correlated are assigned weights of 0.6. A comparison of the ensemble-based projection of

  1. Interrogating Emergent Transport Properties for Molecular Motor Ensembles: A Semi-analytical Approach.

    PubMed

    Bhaban, Shreyas; Materassi, Donatello; Li, Mingang; Hays, Thomas; Salapaka, Murti

    2016-11-01

    Intracellular transport is an essential function in eucaryotic cells, facilitated by motor proteins-proteins converting chemical energy into kinetic energy. It is understood that motor proteins work in teams enabling unidirectional and bidirectional transport of intracellular cargo over long distances. Disruptions of the underlying transport mechanisms, often caused by mutations that alter single motor characteristics, are known to cause neurodegenerative diseases. For example, phosphorylation of kinesin motor domain at the serine residue is implicated in Huntington's disease, with a recent study of phosphorylated and phosphomimetic serine residues indicating lowered single motor stalling forces. In this article we report the effects of mutations of this nature on transport properties of cargo carried by multiple wild-type and mutant motors. Results indicate that mutants with altered stall forces might determine the average velocity and run-length even when they are outnumbered by wild type motors in the ensemble. It is shown that mutants gain a competitive advantage and lead to an increase in the expected run-length when the load on the cargo is in the vicinity of the mutant's stalling force or a multiple of its stalling force. A separate contribution of this article is the development of a semi-analytic method to analyze transport of cargo by multiple motors of multiple types. The technique determines transition rates between various relative configurations of motors carrying the cargo using the transition rates between various absolute configurations. This enables a computation of biologically relevant quantities like average velocity and run-length without resorting to Monte Carlo simulations. It can also be used to introduce alterations of various single motor parameters to model a mutation and to deduce effects of such alterations on the transport of a common cargo by multiple motors. Our method is easily implementable and we provide a software package for

  2. Interrogating Emergent Transport Properties for Molecular Motor Ensembles: A Semi-analytical Approach

    PubMed Central

    Materassi, Donatello; Li, Mingang; Hays, Thomas; Salapaka, Murti

    2016-01-01

    Intracellular transport is an essential function in eucaryotic cells, facilitated by motor proteins—proteins converting chemical energy into kinetic energy. It is understood that motor proteins work in teams enabling unidirectional and bidirectional transport of intracellular cargo over long distances. Disruptions of the underlying transport mechanisms, often caused by mutations that alter single motor characteristics, are known to cause neurodegenerative diseases. For example, phosphorylation of kinesin motor domain at the serine residue is implicated in Huntington’s disease, with a recent study of phosphorylated and phosphomimetic serine residues indicating lowered single motor stalling forces. In this article we report the effects of mutations of this nature on transport properties of cargo carried by multiple wild-type and mutant motors. Results indicate that mutants with altered stall forces might determine the average velocity and run-length even when they are outnumbered by wild type motors in the ensemble. It is shown that mutants gain a competitive advantage and lead to an increase in the expected run-length when the load on the cargo is in the vicinity of the mutant’s stalling force or a multiple of its stalling force. A separate contribution of this article is the development of a semi-analytic method to analyze transport of cargo by multiple motors of multiple types. The technique determines transition rates between various relative configurations of motors carrying the cargo using the transition rates between various absolute configurations. This enables a computation of biologically relevant quantities like average velocity and run-length without resorting to Monte Carlo simulations. It can also be used to introduce alterations of various single motor parameters to model a mutation and to deduce effects of such alterations on the transport of a common cargo by multiple motors. Our method is easily implementable and we provide a software package

  3. An investigation of ensemble-based assimilation of satellite altimetry and tide gauge data in storm surge prediction

    NASA Astrophysics Data System (ADS)

    Etala, Paula; Saraceno, Martín; Echevarría, Pablo

    2015-03-01

    Cyclogenesis and long-fetched winds along the southeastern coast of South America may lead to floods in populated areas, such as the Buenos Aires Province, with important economic and social impacts. A numerical model (SMARA) has already been implemented in the region to forecast storm surges. The propagation time of the surge in such an extensive and shallow area allows the detection of anomalies based on observations from several hours up to the order of a day prior to the event. Here, we investigate the impact and potential benefit of storm surge level data assimilation into the SMARA model, with the objective of improving the forecast. In the experiments, the surface wind stress from an ensemble prediction system drives a storm surge model ensemble, based on the operational 2-D depth-averaged SMARA model. A 4-D Local Ensemble Transform Kalman Filter (4D-LETKF) initializes the ensemble in a 6-h cycle, assimilating the very few tide gauge observations available along the northern coast and satellite altimeter data. The sparse coverage of the altimeters is a challenge to data assimilation; however, the 4D-LETKF evolving covariance of the ensemble perturbations provides realistic cross-track analysis increments. Improvements in the forecast ensemble mean show the potential of an effective use of the sparse satellite altimeter and tide gauge observations in the data assimilation prototype. Furthermore, the effects of the localization scale and of the observational errors of coastal altimetry and tide gauges in the data assimilation approach are assessed.
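
    A compact sketch of a plain ensemble transform Kalman filter analysis step of the family used here, following the standard Hunt et al. formulation but without localization or the 4-D extension, and with a linear observation operator assumed for simplicity.

      import numpy as np

      def etkf_analysis(Xb, yo, H, R, inflation=1.0):
          """Xb: (n_state, n_members) background ensemble; yo: (n_obs,) observations;
          H: (n_obs, n_state) observation operator; R: (n_obs, n_obs) obs-error covariance.
          Returns the analysis ensemble."""
          n = Xb.shape[1]
          xb_mean = Xb.mean(axis=1, keepdims=True)
          Xp = (Xb - xb_mean) * np.sqrt(inflation)          # inflated background perturbations
          Yp = H @ Xp                                       # perturbations in observation space
          d = yo - (H @ xb_mean).ravel()                    # innovation
          Rinv = np.linalg.inv(R)
          Pa_tilde = np.linalg.inv((n - 1) * np.eye(n) + Yp.T @ Rinv @ Yp)
          w_mean = Pa_tilde @ (Yp.T @ Rinv @ d)             # weights for the analysis mean
          evals, evecs = np.linalg.eigh((n - 1) * Pa_tilde)
          W = evecs @ np.diag(np.sqrt(np.maximum(evals, 0.0))) @ evecs.T   # symmetric square root
          return xb_mean + Xp @ (w_mean[:, None] + W)       # analysis ensemble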

  4. Subspace ensembles for classification

    NASA Astrophysics Data System (ADS)

    Sun, Shiliang; Zhang, Changshui

    2007-11-01

    Ensemble learning constitutes one of the principal current directions in machine learning and data mining. In this paper, we explore subspace ensembles for classification built by manipulating different feature subspaces. Starting from the nature of ensemble efficacy, we examine the local meaning of ensemble diversity and propose to use region partitioning and region weighting to implement effective subspace ensembles. Individual classifiers that perform well on a partitioned region, as reflected by high neighborhood accuracies, are deemed to contribute strongly to that region and are assigned large weights in determining the labels of instances in this area. A robust algorithm, "Sena", that implements this mechanism is presented; it is insensitive to the number of nearest neighbors chosen to calculate neighborhood accuracies. The algorithm exhibits improved performance over the well-known ensembles of bagging, AdaBoost and random subspace. Its effectiveness with varying base classifiers is also investigated.
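
    A minimal sketch of the neighborhood-accuracy weighting idea is given below (it is not the published Sena implementation): each random-subspace member votes on a test instance with a weight equal to its accuracy on that instance's k nearest training neighbors, so locally reliable members dominate. The dataset, member count and k are illustrative.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import NearestNeighbors
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=600, n_features=30, n_informative=10, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3, random_state=0)

n_members, subspace_size, k = 15, 10, 25
members = []
for _ in range(n_members):
    feats = rng.choice(X.shape[1], subspace_size, replace=False)   # random feature subspace
    clf = DecisionTreeClassifier(max_depth=5, random_state=0).fit(Xtr[:, feats], ytr)
    members.append((feats, clf))

nn = NearestNeighbors(n_neighbors=k).fit(Xtr)
_, neigh = nn.kneighbors(Xte)                 # each test point's k nearest training neighbors

pred = np.zeros(len(Xte), dtype=int)
for i, idx in enumerate(neigh):
    votes = np.zeros(2)
    for feats, clf in members:
        # local (neighborhood) accuracy of this member around the test point
        w = (clf.predict(Xtr[idx][:, feats]) == ytr[idx]).mean()
        votes[clf.predict(Xte[i:i + 1, feats])[0]] += w
    pred[i] = votes.argmax()

print("neighborhood-weighted subspace-ensemble accuracy:", (pred == yte).mean())
```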

  5. The Ensemble Canon

    NASA Technical Reports Server (NTRS)

    Mittman, David S

    2011-01-01

    Ensemble is an open architecture for the development, integration, and deployment of mission operations software. Fundamentally, it is an adaptation of the Eclipse Rich Client Platform (RCP), a widespread, stable, and supported framework for component-based application development. By capitalizing on the maturity and availability of the Eclipse RCP, Ensemble offers a low-risk, politically neutral path towards a tighter integration of operations tools. The Ensemble project is a highly successful, ongoing collaboration among NASA Centers. Since 2004, the Ensemble project has supported the development of mission operations software for NASA's Exploration Systems, Science, and Space Operations Directorates.

  6. CMIP5 ensemble-based spatial rainfall projection over homogeneous zones of India

    NASA Astrophysics Data System (ADS)

    Akhter, Javed; Das, Lalu; Deb, Argha

    2017-09-01

    Performances of the state-of-the-art CMIP5 models in reproducing the spatial rainfall patterns over seven homogeneous rainfall zones of India, viz. North Mountainous India (NMI), Northwest India (NWI), North Central India (NCI), Northeast India (NEI), West Peninsular India (WPI), East Peninsular India (EPI) and South Peninsular India (SPI), have been assessed using different conventional performance metrics, namely spatial correlation (R), index of agreement (d-index), Nash-Sutcliffe efficiency (NSE), ratio of RMSE to the standard deviation of the observations (RSR) and mean bias (MB). The results based on these indices revealed that the majority of the models are unable to reproduce finer-scaled spatial patterns over most of the zones. Thereafter, four bias correction methods, i.e. Scaling, Standardized Reconstruction, Empirical Quantile Mapping and Gamma Quantile Mapping, have been applied to the GCM simulations to enhance the skills of the GCM projections. It has been found that the scaling method showed better skill than the other three methods in capturing mean spatial patterns. A multi-model ensemble (MME) comprising the 25 better-performing bias-corrected (scaled) GCMs has been considered for developing future rainfall patterns over the seven zones. The models' spread from the ensemble mean (uncertainty) has been found to be larger in the RCP 8.5 than in the RCP 4.5 ensemble. In general, future rainfall projections from RCP 4.5 and RCP 8.5 revealed increasing rainfall over the seven zones during the 2020s, 2050s, and 2080s. The maximum increase has been found over the southwestern part of NWI (12-30%), the northwestern part of WPI (3-30%), the southeastern part of NEI (5-18%) and the northern and eastern parts of SPI (6-24%). However, the contiguous region comprising the southeastern part of NCI and the northeastern part of EPI may experience slightly decreasing rainfall (about 3%) during the 2020s, whereas the western part of NMI may also receive around a 3% reduction in rainfall during both the 2050s and 2080s.
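
    For illustration, the sketch below applies two of the bias-correction methods named above, mean scaling and empirical quantile mapping, to synthetic rainfall series; the gamma-distributed data and all settings are placeholders rather than CMIP5 output or observations.

```python
import numpy as np

rng = np.random.default_rng(1)
obs_hist = rng.gamma(shape=2.0, scale=3.0, size=1000)   # observed rainfall, historical period
gcm_hist = rng.gamma(shape=2.5, scale=2.0, size=1000)   # raw GCM rainfall, historical (biased)
gcm_fut = rng.gamma(shape=2.5, scale=2.4, size=1000)    # raw GCM rainfall, future scenario

# scaling: multiplicative correction of the mean bias
gcm_fut_scaled = gcm_fut * obs_hist.mean() / gcm_hist.mean()

def quantile_map(x, model_hist, obs_hist):
    # empirical quantile mapping: pass each value through the model's historical
    # CDF, then invert the observed historical CDF at that probability
    p = np.clip(np.searchsorted(np.sort(model_hist), x) / len(model_hist), 0.0, 1.0)
    return np.quantile(obs_hist, p)

gcm_fut_eqm = quantile_map(gcm_fut, gcm_hist, obs_hist)
print("future mean  raw: %.2f  scaled: %.2f  quantile-mapped: %.2f"
      % (gcm_fut.mean(), gcm_fut_scaled.mean(), gcm_fut_eqm.mean()))
```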

  7. CMIP5 ensemble-based spatial rainfall projection over homogeneous zones of India

    NASA Astrophysics Data System (ADS)

    Akhter, Javed; Das, Lalu; Deb, Argha

    2016-11-01

    Performances of the state-of-the-art CMIP5 models in reproducing the spatial rainfall patterns over seven homogeneous rainfall zones of India, viz. North Mountainous India (NMI), Northwest India (NWI), North Central India (NCI), Northeast India (NEI), West Peninsular India (WPI), East Peninsular India (EPI) and South Peninsular India (SPI), have been assessed using different conventional performance metrics, namely spatial correlation (R), index of agreement (d-index), Nash-Sutcliffe efficiency (NSE), ratio of RMSE to the standard deviation of the observations (RSR) and mean bias (MB). The results based on these indices revealed that the majority of the models are unable to reproduce finer-scaled spatial patterns over most of the zones. Thereafter, four bias correction methods, i.e. Scaling, Standardized Reconstruction, Empirical Quantile Mapping and Gamma Quantile Mapping, have been applied to the GCM simulations to enhance the skills of the GCM projections. It has been found that the scaling method showed better skill than the other three methods in capturing mean spatial patterns. A multi-model ensemble (MME) comprising the 25 better-performing bias-corrected (scaled) GCMs has been considered for developing future rainfall patterns over the seven zones. The models' spread from the ensemble mean (uncertainty) has been found to be larger in the RCP 8.5 than in the RCP 4.5 ensemble. In general, future rainfall projections from RCP 4.5 and RCP 8.5 revealed increasing rainfall over the seven zones during the 2020s, 2050s, and 2080s. The maximum increase has been found over the southwestern part of NWI (12-30%), the northwestern part of WPI (3-30%), the southeastern part of NEI (5-18%) and the northern and eastern parts of SPI (6-24%). However, the contiguous region comprising the southeastern part of NCI and the northeastern part of EPI may experience slightly decreasing rainfall (about 3%) during the 2020s, whereas the western part of NMI may also receive around a 3% reduction in rainfall during both the 2050s and 2080s.

  8. Basin-scale runoff prediction: An Ensemble Kalman Filter framework based on global hydrometeorological data sets

    NASA Astrophysics Data System (ADS)

    Lorenz, Christof; Tourian, Mohammad J.; Devaraju, Balaji; Sneeuw, Nico; Kunstmann, Harald

    2015-10-01

    In order to cope with the steady decline of the number of in situ gauges worldwide, there is a growing need for alternative methods to estimate runoff. We present an Ensemble Kalman Filter based approach that allows us to infer runoff for poorly or irregularly gauged basins. The approach focuses on the application of publicly available global hydrometeorological data sets for precipitation (GPCC, GPCP, CRU, UDEL), evapotranspiration (MODIS, FLUXNET, GLEAM, ERA interim, GLDAS), and water storage changes (GRACE, WGHM, GLDAS, MERRA LAND). Furthermore, runoff data from the GRDC and satellite altimetry derived estimates are used. We follow a least squares prediction that exploits the joint temporal and spatial auto- and cross-covariance structures of precipitation, evapotranspiration, water storage changes and runoff. We further consider time-dependent uncertainty estimates derived from all data sets. Our in-depth analysis comprises 29 large river basins of different climate regions, with which runoff is predicted for a subset of 16 basins. Six configurations are analyzed: the Ensemble Kalman Filter (Smoother) and the hard (soft) Constrained Ensemble Kalman Filter (Smoother). Comparing the predictions to observed monthly runoff shows correlations larger than 0.5, percentage biases lower than ± 20%, and NSE values larger than 0.5. A modified NSE metric, stressing the difference to the mean annual cycle, shows an improvement of runoff predictions for 14 of the 16 basins. The proposed method is able to provide runoff estimates for nearly 100 poorly gauged basins covering an area of more than 11,500,000 km² with a freshwater discharge, in volume, of more than 125,000 m³/s.
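
    For reference, the skill scores quoted above (correlation, NSE and percentage bias) can be computed as in the sketch below; the monthly runoff series are synthetic placeholders, not GRDC observations or the authors' predictions.

```python
import numpy as np

rng = np.random.default_rng(2)
obs = 100 + 30 * np.sin(np.linspace(0, 4 * np.pi, 48)) + 5 * rng.standard_normal(48)
pred = obs + 8 * rng.standard_normal(48)     # synthetic predicted monthly runoff

def nse(pred, obs):
    # Nash-Sutcliffe efficiency: 1 means perfect, 0 means no better than the mean
    return 1.0 - np.sum((obs - pred) ** 2) / np.sum((obs - obs.mean()) ** 2)

def pbias(pred, obs):
    # percentage bias of the predicted volume relative to the observed volume
    return 100.0 * np.sum(pred - obs) / np.sum(obs)

print("r = %.2f, NSE = %.2f, PBIAS = %.1f%%"
      % (np.corrcoef(pred, obs)[0, 1], nse(pred, obs), pbias(pred, obs)))
```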

  9. Basin-scale runoff prediction: An Ensemble Kalman Filter framework based on global hydrometeorological data sets

    NASA Astrophysics Data System (ADS)

    Kunstmann, Harald; Lorenz, Christof; Tourian, Mohammad; Devaraju, Balaji; Sneeuw, Nico

    2016-04-01

    In order to cope with the steady decline of the number of in situ gauges worldwide, there is a growing need for alternative methods to estimate runoff. We present an Ensemble Kalman Filter based approach that allows us to infer runoff for poorly or irregularly gauged basins. The approach focuses on the application of publicly available global hydrometeorological data sets for precipitation (GPCC, GPCP, CRU, UDEL), evapotranspiration (MODIS, FLUXNET, GLEAM, ERA interim, GLDAS), and water storage changes (GRACE, WGHM, GLDAS, MERRA LAND). Furthermore, runoff data from the GRDC and satellite altimetry derived estimates are used. We follow a least squares prediction that exploits the joint temporal and spatial auto- and cross-covariance structures of precipitation, evapotranspiration, water storage changes and runoff. We further consider time-dependent uncertainty estimates derived from all data sets. Our in-depth analysis comprises 29 large river basins of different climate regions, with which runoff is predicted for a subset of 16 basins. Six configurations are analyzed: the Ensemble Kalman Filter (Smoother) and the hard (soft) Constrained Ensemble Kalman Filter (Smoother). Comparing the predictions to observed monthly runoff shows correlations larger than 0.5, percentage biases lower than ± 20%, and NSE values larger than 0.5. A modified NSE metric, stressing the difference to the mean annual cycle, shows an improvement of runoff predictions for 14 of the 16 basins. The proposed method is able to provide runoff estimates for nearly 100 poorly gauged basins covering an area of more than 11,500,000 km² with a freshwater discharge, in volume, of more than 125,000 m³/s.

  10. An ensemble method based on uninformative variable elimination and mutual information for spectral multivariate calibration

    NASA Astrophysics Data System (ADS)

    Tan, Chao; Wang, Jinyue; Wu, Tong; Qin, Xin; Li, Menglong

    2010-12-01

    Based on the combination of uninformative variable elimination (UVE), bootstrap sampling and mutual information (MI), a simple ensemble algorithm, named ESPLS, is proposed for spectral multivariate calibration (MVC). In ESPLS, the uninformative variables are first removed; a preparatory training set is then produced by bootstrapping, on which an MI spectrum of the retained variables is calculated. The variables that exhibit higher MI than a defined threshold form a subspace on which a candidate partial least-squares (PLS) model is constructed. This process is repeated. After a number of candidate models are obtained, a small subset of the models is selected to construct an ensemble model by simple/weighted averaging. Four near/mid-infrared (NIR/MIR) spectral datasets concerning the determination of six components are used to verify the proposed ESPLS. The results indicate that ESPLS is superior to UVEPLS and to its combination with MI-based variable selection (SPLS) in terms of both accuracy and robustness. Moreover, from the perspective of end users, ESPLS does not increase the complexity of a calibration while enhancing its performance.
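
    A minimal sketch of the bootstrap-plus-MI-threshold-plus-PLS ensemble idea follows (it is not the published ESPLS code); the spectra are random stand-ins, and the MI threshold, member count and averaging rule are illustrative assumptions.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.feature_selection import mutual_info_regression

rng = np.random.default_rng(3)
n, p = 120, 200
X = rng.standard_normal((n, p))                          # synthetic "spectra"
y = X[:, :10].sum(axis=1) + 0.3 * rng.standard_normal(n)  # 10 informative channels
X_test = rng.standard_normal((30, p))
y_test = X_test[:, :10].sum(axis=1)

n_models, preds = 20, []
for _ in range(n_models):
    boot = rng.integers(0, n, n)                          # bootstrap sample of the training set
    mi = mutual_info_regression(X[boot], y[boot], random_state=0)
    keep = np.where(mi > np.median(mi) + mi.std())[0]     # illustrative MI threshold
    if len(keep) < 5:
        keep = np.argsort(mi)[-5:]                        # fall back to the top-5 channels
    pls = PLSRegression(n_components=min(5, len(keep))).fit(X[boot][:, keep], y[boot])
    preds.append(pls.predict(X_test[:, keep]).ravel())

y_hat = np.mean(preds, axis=0)                            # simple-average ensemble prediction
print("RMSEP of the ensemble: %.3f" % np.sqrt(np.mean((y_hat - y_test) ** 2)))
```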

  11. Ensemble Learning for Spatial Interpolation of Soil Potassium Content Based on Environmental Information

    PubMed Central

    Liu, Wei; Du, Peijun; Wang, Dongchen

    2015-01-01

    One important method to obtain continuous surfaces of soil properties from point samples is spatial interpolation. In this paper, we propose a method that combines ensemble learning with ancillary environmental information for improved interpolation of soil properties (hereafter, EL-SP). First, we calculated the trend value for soil potassium contents at the Qinghai Lake region in China based on measured values. Then, based on soil types, geology types, land use types, and slope data, the remaining residual was simulated with the ensemble learning model. Next, the EL-SP method was applied to interpolate soil potassium contents at the study site. To evaluate the utility of the EL-SP method, we compared its performance with other interpolation methods including universal kriging, inverse distance weighting, ordinary kriging, and ordinary kriging combined with geographic information. Results show that EL-SP had a lower mean absolute error and root mean square error than the other models tested in this paper. Notably, the EL-SP maps can describe more locally detailed information and more accurate spatial patterns for soil potassium content than the other methods because of the combined use of different types of environmental information; these maps are capable of showing abrupt boundary information for soil potassium content. Furthermore, the EL-SP method not only reduces prediction errors, but it also complements other environmental information, which makes the spatial interpolation of soil potassium content more reasonable and useful. PMID:25928138
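
    The trend-plus-residual construction can be illustrated as below: fit a smooth spatial trend to the point samples, then model the residual with an ensemble learner driven by environmental covariates. This is a schematic sketch with synthetic data and an off-the-shelf gradient-boosting learner, not the EL-SP implementation.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(4)
n = 300
xy = rng.uniform(0, 10, size=(n, 2))                  # sample coordinates (synthetic)
soil_type = rng.integers(0, 4, n)                     # categorical environmental covariate
k_obs = 2.0 + 0.3 * xy[:, 0] + 0.5 * soil_type + 0.2 * rng.standard_normal(n)

# 1) first-order polynomial trend on the coordinates (least squares)
A = np.column_stack([np.ones(n), xy])
coef, *_ = np.linalg.lstsq(A, k_obs, rcond=None)
trend = A @ coef

# 2) ensemble model of the residual using environmental information
resid_model = GradientBoostingRegressor(random_state=0)
resid_model.fit(np.column_stack([soil_type, xy]), k_obs - trend)

# prediction at a new location = trend + modelled residual
new_xy, new_soil = np.array([[5.0, 5.0]]), np.array([2])
pred = (np.column_stack([np.ones(1), new_xy]) @ coef
        + resid_model.predict(np.column_stack([new_soil, new_xy])))
print("predicted potassium content at (5, 5):", round(float(pred[0]), 2))
```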

  12. Ensemble learning for spatial interpolation of soil potassium content based on environmental information.

    PubMed

    Liu, Wei; Du, Peijun; Wang, Dongchen

    2015-01-01

    One important method to obtain continuous surfaces of soil properties from point samples is spatial interpolation. In this paper, we propose a method that combines ensemble learning with ancillary environmental information for improved interpolation of soil properties (hereafter, EL-SP). First, we calculated the trend value for soil potassium contents at the Qinghai Lake region in China based on measured values. Then, based on soil types, geology types, land use types, and slope data, the remaining residual was simulated with the ensemble learning model. Next, the EL-SP method was applied to interpolate soil potassium contents at the study site. To evaluate the utility of the EL-SP method, we compared its performance with other interpolation methods including universal kriging, inverse distance weighting, ordinary kriging, and ordinary kriging combined with geographic information. Results show that EL-SP had a lower mean absolute error and root mean square error than the other models tested in this paper. Notably, the EL-SP maps can describe more locally detailed information and more accurate spatial patterns for soil potassium content than the other methods because of the combined use of different types of environmental information; these maps are capable of showing abrupt boundary information for soil potassium content. Furthermore, the EL-SP method not only reduces prediction errors, but it also complements other environmental information, which makes the spatial interpolation of soil potassium content more reasonable and useful.

  13. Assessing the predictive capability of randomized tree-based ensembles in streamflow modelling

    NASA Astrophysics Data System (ADS)

    Galelli, S.; Castelletti, A.

    2013-07-01

    Combining randomization methods with ensemble prediction is emerging as an effective option to balance accuracy and computational efficiency in data-driven modelling. In this paper, we investigate the prediction capability of extremely randomized trees (Extra-Trees), in terms of accuracy, explanation ability and computational efficiency, in a streamflow modelling exercise. Extra-Trees are a totally randomized tree-based ensemble method that (i) alleviates the poor generalisation and tendency to overfit of traditional standalone decision trees (e.g. CART); (ii) is computationally efficient; and (iii) allows the relative importance of the input variables to be inferred, which might help in the ex-post physical interpretation of the model. The Extra-Trees potential is analysed on two real-world case studies - Marina catchment (Singapore) and Canning River (Western Australia) - representing two different morphoclimatic contexts. The evaluation is performed against other tree-based methods (CART and M5) and parametric data-driven approaches (ANNs and multiple linear regression). Results show that Extra-Trees perform comparably to the best of the benchmarks (i.e. M5) in both watersheds, while outperforming the other approaches in terms of computational requirements when adopted on large datasets. In addition, the ranking of the input variables provided by the method can be given a physically meaningful interpretation.
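
    A minimal Extra-Trees regression sketch with variable ranking, in the spirit of the study above, is shown below; the lagged-rainfall predictors are synthetic stand-ins, not the Marina or Canning River data.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor

rng = np.random.default_rng(5)
n = 500
rain_t, rain_t1, temp = rng.gamma(2, 2, n), rng.gamma(2, 2, n), rng.normal(25, 3, n)
flow = 0.6 * rain_t + 0.3 * rain_t1 + 0.05 * temp + 0.2 * rng.standard_normal(n)

X = np.column_stack([rain_t, rain_t1, temp])
model = ExtraTreesRegressor(n_estimators=300, random_state=0).fit(X[:400], flow[:400])

print("test R^2:", round(model.score(X[400:], flow[400:]), 3))
for name, imp in zip(["rain(t)", "rain(t-1)", "temp(t)"], model.feature_importances_):
    print(f"{name}: {imp:.2f}")   # relative importance of each input variable
```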

  14. Assessing the predictive capability of randomized tree-based ensembles in streamflow modelling

    NASA Astrophysics Data System (ADS)

    Galelli, S.; Castelletti, A.

    2013-02-01

    Combining randomization methods with ensemble prediction is emerging as an effective option to balance accuracy and computational efficiency in data-driven modeling. In this paper we investigate the prediction capability of extremely randomized trees (Extra-Trees), in terms of accuracy, explanation ability and computational efficiency, in a streamflow modeling exercise. Extra-Trees are a totally randomized tree-based ensemble method that (i) alleviates the poor generalization and tendency to overfit of traditional standalone decision trees (e.g. CART); (ii) is computationally very efficient; and (iii) allows the relative importance of the input variables to be inferred, which might help in the ex-post physical interpretation of the model. The Extra-Trees potential is analyzed on two real-world case studies (Marina catchment (Singapore) and Canning River (Western Australia)) representing two different morphoclimatic contexts, in comparison with other tree-based methods (CART and M5) and parametric data-driven approaches (ANNs and multiple linear regression). Results show that Extra-Trees perform comparably to the best of the benchmarks (i.e. M5) in both watersheds, while outperforming the other approaches in terms of computational requirements when adopted on large datasets. In addition, the ranking of the input variables provided by the method can be given a physically meaningful interpretation.

  15. A genetic algorithm-based weighted ensemble method for predicting transposon-derived piRNAs.

    PubMed

    Li, Dingfang; Luo, Longqiang; Zhang, Wen; Liu, Feng; Luo, Fei

    2016-08-31

    Predicting piwi-interacting RNAs (piRNAs) is an important topic in the study of small non-coding RNAs, as it provides clues for understanding the generation mechanism of gametes. To the best of our knowledge, several machine learning approaches have been proposed for piRNA prediction, but there is still room for improvement. In this paper, we develop a genetic algorithm-based weighted ensemble method for predicting transposon-derived piRNAs. We construct datasets for three species: Human, Mouse and Drosophila. For each species, we compile a balanced dataset and an imbalanced dataset, and thus obtain six datasets to build and evaluate prediction models. In the computational experiments, the genetic algorithm-based weighted ensemble method achieves 10-fold cross-validation AUCs of 0.932, 0.937 and 0.995 on the balanced Human, Mouse and Drosophila datasets, respectively, and achieves AUCs of 0.935, 0.939 and 0.996 on the imbalanced datasets of the three species. Further, we use the prediction models trained on the Mouse dataset to identify piRNAs of other species, and the models demonstrate good performance in cross-species prediction. Compared with other state-of-the-art methods, our method leads to better performance. In conclusion, the proposed method is promising for transposon-derived piRNA prediction. The source codes and datasets are available at https://github.com/zw9977129/piRNAPredictor.
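
    The weighted-ensemble idea can be sketched as below: a small genetic algorithm searches for member weights that maximize validation AUC. The member classifiers, GA settings and synthetic imbalanced dataset are illustrative assumptions, not the authors' features, models or datasets.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(6)
X, y = make_classification(n_samples=1000, n_features=40, weights=[0.9, 0.1], random_state=0)
Xtr, Xva, ytr, yva = train_test_split(X, y, test_size=0.3, stratify=y, random_state=0)

members = [LogisticRegression(max_iter=1000).fit(Xtr, ytr),
           RandomForestClassifier(n_estimators=200, random_state=0).fit(Xtr, ytr),
           GaussianNB().fit(Xtr, ytr)]
P = np.column_stack([m.predict_proba(Xva)[:, 1] for m in members])   # member scores

def fitness(w):
    # validation AUC of the weighted average of member scores
    return roc_auc_score(yva, P @ (w / w.sum()))

pop = rng.uniform(0.01, 1.0, size=(30, len(members)))                 # initial population
for _ in range(50):                                                   # generations
    fit = np.array([fitness(w) for w in pop])
    parents = pop[np.argsort(fit)[-10:]]                              # elitist selection
    children = []
    for _ in range(len(pop) - len(parents)):
        a, b = parents[rng.integers(10)], parents[rng.integers(10)]
        child = np.where(rng.random(len(members)) < 0.5, a, b)        # uniform crossover
        child += rng.normal(0, 0.05, len(members))                    # mutation
        children.append(np.clip(child, 0.01, 1.0))
    pop = np.vstack([parents, children])

best = pop[np.argmax([fitness(w) for w in pop])]
print("weights:", np.round(best / best.sum(), 2), "validation AUC:", round(fitness(best), 3))
```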

  16. Reaction Ensemble Molecular Dynamics: Direct Simulation of the Dynamic Equilibrium Properties of Chemically Reacting Mixtures

    DTIC Science & Technology

    2006-09-01

    Therefore, dynamic quantities of reaction mixtures such as the velocity autocorrelation functions and the diffusion coefficients can be accurately...using the virial expression [25]. A standard NVT molecular dynamics method was employed with the equations of motion solved using the Verlet leapfrog...configurational energy, pressure, and species concentrations) are compared to quantities calculated by the RxMC approach. Second, the dynamic quantities

  17. Molecular Simulation of Shocked Materials Using Reaction Ensemble Monte Carlo: Part 1. Application to Nitrogen Dissociation

    DTIC Science & Technology

    2006-11-01

    constituting the chemically reacting species is conserved. Thermochemical software such as the chemical equilibrium code (12) and Cheetah (13) are...stoichiometric coefficient of species i in reaction j; ξj is the molecular extent of reaction for reaction j; qint,i is the quantum partition function for the...Detonation Properties of PETN. J. Chem. Phys. 1984, 81, 1251. 13. Fried, L. E. Cheetah 3.0 User’s Manual; Lawrence Livermore National Laboratory

  18. Applications of Ensemble-based Data Assimilation Techniques for Aquifer Characterization using Tracer Data at Hanford 300 Area

    SciTech Connect

    Chen, Xingyuan; Hammond, Glenn E.; Murray, Christopher J.; Rockhold, Mark L.; Vermeul, Vincent R.; Zachara, John M.

    2013-10-31

    Subsurface aquifer characterization often involves high parameter dimensionality and requires tremendous computational resources if employing a full Bayesian approach. Ensemble-based data assimilation techniques, including filtering and smoothing, are computationally efficient alternatives. Despite the increasing number of applications of ensemble-based methods in assimilating flow- and transport-related data for subsurface aquifer characterization, most are limited to either synthetic studies or two-dimensional problems. In this study, we applied ensemble-based techniques for assimilating field tracer experimental data obtained from the Integrated Field Research Challenge (IFRC) site at the Hanford 300 Area. The forward problem was simulated using the massively parallel three-dimensional flow and transport code PFLOTRAN to effectively deal with the highly transient flow boundary conditions at the site and to meet the computational demands of ensemble-based methods. This study demonstrates the effectiveness of ensemble-based methods for characterizing a heterogeneous aquifer by sequentially assimilating multiple types of data. The necessity of employing high-performance computing is shown to enable increasingly mechanistic non-linear forward simulations to be performed within the data assimilation framework for a complex system with reasonable turnaround time.

  19. In silico prediction of toxicity of non-congeneric industrial chemicals using ensemble learning based modeling approaches

    SciTech Connect

    Singh, Kunwar P. Gupta, Shikha

    2014-03-15

    Ensemble-learning-based decision treeboost (DTB) and decision tree forest (DTF) models are introduced in order to establish quantitative structure–toxicity relationships (QSTR) for the prediction of the toxicity of 1450 diverse chemicals. Eight non-quantum mechanical molecular descriptors were derived. Structural diversity of the chemicals was evaluated using the Tanimoto similarity index. DTB and DTF models supplemented with stochastic gradient boosting and bagging algorithms were constructed for classification and function optimization problems using the toxicity end-point in T. pyriformis. Special attention was drawn to the prediction ability and robustness of the models, investigated both in external and 10-fold cross-validation processes. In the complete data, the optimal DTB and DTF models rendered accuracies of 98.90% and 98.83% in two-category and 98.14% and 98.14% in four-category toxicity classifications. Both models further yielded classification accuracies of 100% on the external toxicity data of T. pyriformis. The constructed regression models (DTB and DTF) using five descriptors yielded correlation coefficients (R²) of 0.945 and 0.944 between the measured and predicted toxicities, with mean squared errors (MSEs) of 0.059 and 0.064 in the complete T. pyriformis data. The T. pyriformis regression models (DTB and DTF) applied to the external toxicity data sets yielded R² and MSE values of 0.637, 0.655; 0.534, 0.507 (marine bacteria) and 0.741, 0.691; 0.155, 0.173 (algae). The results suggest wide applicability of the inter-species models in predicting the toxicity of new chemicals for regulatory purposes. These approaches provide a useful strategy and robust tools for the screening of the ecotoxicological risk or environmental hazard potential of chemicals. - Graphical abstract: Importance of input variables in DTB and DTF classification models for (a) two-category, and (b) four-category toxicity intervals in T. pyriformis data. Generalization and predictive abilities of the

  20. Soft sensor modeling based on variable partition ensemble method for nonlinear batch processes

    NASA Astrophysics Data System (ADS)

    Wang, Li; Chen, Xiangguang; Yang, Kai; Jin, Huaiping

    2017-01-01

    Batch processes are always characterized by nonlinear and uncertain system properties; therefore, a conventional single model may be ill-suited. A local-learning soft sensor based on a variable partition ensemble method is developed for quality prediction in nonlinear and non-Gaussian batch processes. A set of input variable subsets is obtained by bootstrapping and the PMI criterion. Multiple local GPR models are then developed, one on each local input variable set. When new test data arrive, the posterior probability of each best-performing local model is estimated by Bayesian inference and used to combine these local GPR models into the final prediction result. The proposed soft sensor is demonstrated by application to an industrial fed-batch chlortetracycline fermentation process.

  1. An empirical study of ensemble-based semi-supervised learning approaches for imbalanced splice site datasets.

    PubMed

    Stanescu, Ana; Caragea, Doina

    2015-01-01

    Recent biochemical advances have led to inexpensive, time-efficient production of massive volumes of raw genomic data. Traditional machine learning approaches to genome annotation typically rely on large amounts of labeled data. The process of labeling data can be expensive, as it requires domain knowledge and expert involvement. Semi-supervised learning approaches that can make use of unlabeled data, in addition to small amounts of labeled data, can help reduce the costs associated with labeling. In this context, we focus on the problem of predicting splice sites in a genome using semi-supervised learning approaches. This is a challenging problem, due to the highly imbalanced distribution of the data, i.e., small number of splice sites as compared to the number of non-splice sites. To address this challenge, we propose to use ensembles of semi-supervised classifiers, specifically self-training and co-training classifiers. Our experiments on five highly imbalanced splice site datasets, with positive to negative ratios of 1-to-99, showed that the ensemble-based semi-supervised approaches represent a good choice, even when the amount of labeled data consists of less than 1% of all training data. In particular, we found that ensembles of co-training and self-training classifiers that dynamically balance the set of labeled instances during the semi-supervised iterations show improvements over the corresponding supervised ensemble baselines. In the presence of limited amounts of labeled data, ensemble-based semi-supervised approaches can successfully leverage the unlabeled data to enhance supervised ensembles learned from highly imbalanced data distributions. Given that such distributions are common for many biological sequence classification problems, our work can be seen as a stepping stone towards more sophisticated ensemble-based approaches to biological sequence annotation in a semi-supervised framework.
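
    A minimal sketch of one self-training member on a synthetic, highly imbalanced dataset is given below; the stratified seed labels, iteration count and per-class pseudo-labelling quota are illustrative assumptions, and the paper's approach combines several such members (and co-training variants) into an ensemble.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(7)
X, y = make_classification(n_samples=20000, n_features=40, weights=[0.99, 0.01],
                           flip_y=0, random_state=0)          # roughly 1-to-99 imbalance
pos, neg = np.where(y == 1)[0], np.where(y == 0)[0]
labeled = np.concatenate([rng.choice(pos, 5, replace=False),   # stratified seed set,
                          rng.choice(neg, 95, replace=False)]) # 100 labels = 0.5% of the data
unlabeled = np.setdiff1d(np.arange(len(y)), labeled)

Xl, yl, Xu = X[labeled], y[labeled], X[unlabeled]
clf = RandomForestClassifier(n_estimators=200, class_weight="balanced", random_state=0)

for _ in range(5):                                   # self-training iterations
    clf.fit(Xl, yl)
    proba = clf.predict_proba(Xu)
    pseudo, conf = proba.argmax(axis=1), proba.max(axis=1)
    take = []
    for c in (0, 1):                                 # balance the pseudo-labelled additions
        idx = np.where(pseudo == c)[0]
        take.extend(idx[np.argsort(conf[idx])[-20:]])
    take = np.array(take)
    Xl, yl = np.vstack([Xl, Xu[take]]), np.concatenate([yl, pseudo[take]])
    Xu = np.delete(Xu, take, axis=0)

print("labelled set after self-training:", len(yl), "of which positives:", int(yl.sum()))
```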

  2. An empirical study of ensemble-based semi-supervised learning approaches for imbalanced splice site datasets

    PubMed Central

    2015-01-01

    Background Recent biochemical advances have led to inexpensive, time-efficient production of massive volumes of raw genomic data. Traditional machine learning approaches to genome annotation typically rely on large amounts of labeled data. The process of labeling data can be expensive, as it requires domain knowledge and expert involvement. Semi-supervised learning approaches that can make use of unlabeled data, in addition to small amounts of labeled data, can help reduce the costs associated with labeling. In this context, we focus on the problem of predicting splice sites in a genome using semi-supervised learning approaches. This is a challenging problem, due to the highly imbalanced distribution of the data, i.e., small number of splice sites as compared to the number of non-splice sites. To address this challenge, we propose to use ensembles of semi-supervised classifiers, specifically self-training and co-training classifiers. Results Our experiments on five highly imbalanced splice site datasets, with positive to negative ratios of 1-to-99, showed that the ensemble-based semi-supervised approaches represent a good choice, even when the amount of labeled data consists of less than 1% of all training data. In particular, we found that ensembles of co-training and self-training classifiers that dynamically balance the set of labeled instances during the semi-supervised iterations show improvements over the corresponding supervised ensemble baselines. Conclusions In the presence of limited amounts of labeled data, ensemble-based semi-supervised approaches can successfully leverage the unlabeled data to enhance supervised ensembles learned from highly imbalanced data distributions. Given that such distributions are common for many biological sequence classification problems, our work can be seen as a stepping stone towards more sophisticated ensemble-based approaches to biological sequence annotation in a semi-supervised framework. PMID:26356316

  3. Ensemble classification of colon biopsy images based on information rich hybrid features.

    PubMed

    Rathore, Saima; Hussain, Mutawarra; Aksam Iftikhar, Muhammad; Jalil, Abdul

    2014-04-01

    In recent years, classification of colon biopsy images has become an active research area. Traditionally, colon cancer is diagnosed using microscopic analysis. However, the process is subjective and leads to considerable inter/intra observer variation. Therefore, reliable computer-aided colon cancer detection techniques are in high demand. In this paper, we propose a colon biopsy image classification system, called CBIC, which benefits from the discriminatory capabilities of information-rich hybrid feature spaces and from performance enhancement based on an ensemble classification methodology. Normal and malignant colon biopsy images differ from each other in terms of the color distribution of different biological constituents. The colors of different constituents are sharp in normal images, whereas the colors diffuse into each other in malignant images. In order to exploit this variation, two feature types, namely color-component-based statistical moments (CCSM) and color-component-based Haralick features, have been proposed, which are variants of their traditional counterparts. Moreover, in normal colon biopsy images, epithelial cells possess sharp and well-defined edges. Histogram of oriented gradients (HOG) based features have been employed to exploit this information. Different combinations of hybrid features have been constructed from the HOG, CCSM, and Haralick features. The minimum Redundancy Maximum Relevance (mRMR) feature selection method has been employed to select meaningful features from individual and hybrid feature sets. Finally, an ensemble classifier based on majority voting has been proposed, which classifies colon biopsy images using the selected features. Linear, RBF, and sigmoid SVMs have been employed as base classifiers. The proposed system has been tested on 174 colon biopsy images, and improved performance (98.85%) has been observed compared to previously reported studies. Additionally, the use of the mRMR method has been justified by comparing the

  4. Prognostics of Proton Exchange Membrane Fuel Cells stack using an ensemble of constraints based connectionist networks

    NASA Astrophysics Data System (ADS)

    Javed, Kamran; Gouriveau, Rafael; Zerhouni, Noureddine; Hissel, Daniel

    2016-08-01

    Proton Exchange Membrane Fuel Cell (PEMFC) is considered the most versatile among available fuel cell technologies, which qualifies it for diverse applications. However, the large-scale industrial deployment of PEMFCs is limited by their short life span and high exploitation costs. Ensuring fuel cell service over a long duration is therefore of vital importance, which has led to Prognostics and Health Management of fuel cells. More precisely, prognostics of PEMFCs is a major area of focus nowadays, aiming at identifying degradation of a PEMFC stack at early stages and estimating its Remaining Useful Life (RUL) for life cycle management. This paper presents a data-driven approach for prognostics of a PEMFC stack using an ensemble of constraint-based Summation Wavelet-Extreme Learning Machine (SW-ELM) models. This development aims at improving the robustness and applicability of PEMFC prognostics for an online application with limited learning data. The proposed approach is applied to real data from two different PEMFC stacks and compared with ensembles of well-known connectionist algorithms. The comparison of results on long-term prognostics of both PEMFC stacks validates our proposition.

  5. Ensemble-Based Parameter Estimation in a Coupled General Circulation Model

    DOE PAGES

    Liu, Y.; Liu, Z.; Zhang, S.; ...

    2014-09-10

    Parameter estimation provides a potentially powerful approach to reduce model bias for complex climate models. Here, in a twin experiment framework, the authors perform the first parameter estimation in a fully coupled ocean–atmosphere general circulation model using an ensemble coupled data assimilation system facilitated with parameter estimation. The authors first perform single-parameter estimation and then multiple-parameter estimation. In the case of the single-parameter estimation, the error of the parameter [solar penetration depth (SPD)] is reduced by over 90% after ~40 years of assimilation of the conventional observations of monthly sea surface temperature (SST) and salinity (SSS). The results of multiple-parameter estimation are less reliable than those of single-parameter estimation when only the monthly SST and SSS are assimilated. Assimilating additional observations of atmospheric data of temperature and wind improves the reliability of multiple-parameter estimation. The errors of the parameters are reduced by 90% in ~8 years of assimilation. Finally, the improved parameters also improve the model climatology. With the optimized parameters, the bias of the climatology of SST is reduced by ~90%. Altogether, this study suggests the feasibility of ensemble-based parameter estimation in a fully coupled general circulation model.

  6. A Novel Computer-Based Set-Up to Study Movement Coordination in Human Ensembles.

    PubMed

    Alderisio, Francesco; Lombardi, Maria; Fiore, Gianfranco; di Bernardo, Mario

    2017-01-01

    Existing experimental works on movement coordination in human ensembles mostly investigate situations where each subject is connected to all the others through direct visual and auditory coupling, so that unavoidable social interaction affects their coordination level. Here, we present a novel computer-based set-up to study movement coordination in human groups so as to minimize the influence of social interaction among participants and implement different visual pairings between them. In so doing, players can only take into consideration the motion of a designated subset of the others. This allows the evaluation of the exclusive effects on coordination of the structure of interconnections among the players in the group and their own dynamics. In addition, our set-up enables the deployment of virtual computer players to investigate dyadic interaction between a human and a virtual agent, as well as group synchronization in mixed teams of human and virtual agents. We show how this novel set-up can be employed to study coordination both in dyads and in groups over different structures of interconnections, in the presence as well as in the absence of virtual agents acting as followers or leaders. Finally, in order to illustrate the capabilities of the architecture, we describe some preliminary results. The platform is available to any researcher who wishes to unfold the mechanisms underlying group synchronization in human ensembles and shed light on its socio-psychological aspects.

  7. A Novel Computer-Based Set-Up to Study Movement Coordination in Human Ensembles

    PubMed Central

    Alderisio, Francesco; Lombardi, Maria; Fiore, Gianfranco; di Bernardo, Mario

    2017-01-01

    Existing experimental works on movement coordination in human ensembles mostly investigate situations where each subject is connected to all the others through direct visual and auditory coupling, so that unavoidable social interaction affects their coordination level. Here, we present a novel computer-based set-up to study movement coordination in human groups so as to minimize the influence of social interaction among participants and implement different visual pairings between them. In so doing, players can only take into consideration the motion of a designated subset of the others. This allows the evaluation of the exclusive effects on coordination of the structure of interconnections among the players in the group and their own dynamics. In addition, our set-up enables the deployment of virtual computer players to investigate dyadic interaction between a human and a virtual agent, as well as group synchronization in mixed teams of human and virtual agents. We show how this novel set-up can be employed to study coordination both in dyads and in groups over different structures of interconnections, in the presence as well as in the absence of virtual agents acting as followers or leaders. Finally, in order to illustrate the capabilities of the architecture, we describe some preliminary results. The platform is available to any researcher who wishes to unfold the mechanisms underlying group synchronization in human ensembles and shed light on its socio-psychological aspects. PMID:28649217

  8. Fault Diagnosis of Rotating Machinery Based on an Adaptive Ensemble Empirical Mode Decomposition

    PubMed Central

    Lei, Yaguo; Li, Naipeng; Lin, Jing; Wang, Sizhe

    2013-01-01

    The vibration-based signal processing technique is one of the principal tools for diagnosing faults of rotating machinery. Empirical mode decomposition (EMD), as a time-frequency analysis technique, has been widely used to process vibration signals of rotating machinery. However, it has the shortcoming of mode mixing when decomposing signals. To overcome this shortcoming, ensemble empirical mode decomposition (EEMD) was proposed. EEMD is able to reduce the mode mixing to some extent. The performance of EEMD, however, depends on the parameters adopted in the EEMD algorithms. In most of the studies on EEMD, the parameters were selected artificially and subjectively. To solve the problem, a new adaptive ensemble empirical mode decomposition method is proposed in this paper. In the method, the sifting number is adaptively selected, and the amplitude of the added noise changes with the signal frequency components during the decomposition process. The simulation, experimental and application results demonstrate that the adaptive EEMD provides improved results compared with the original EEMD in diagnosing rotating machinery. PMID:24351666

  9. Terrain classification of polarimetric synthetic aperture radar imagery based on polarimetric features and ensemble learning

    NASA Astrophysics Data System (ADS)

    Huang, Chuanbo

    2017-04-01

    An evolutionary classification system for terrain classification of polarimetric synthetic aperture radar (PolSAR) imagery, based on ensemble learning with polarimetric and texture features, is proposed. Polarimetric measurements alone cannot produce sufficient identification information for PolSAR terrain classification in some complex areas. To address this issue, texture features have been successfully used in image segmentation. The classification features adopted here combine Pauli features with the last principal component of Gabor texture-feature dimensionality reduction. The resulting feature combination, chosen through experimental analysis, is well suited to describing structural and spatial information. To obtain a good integration effect, the base classifiers should be as precise as possible and the differences among the features should be as distinct as possible. We therefore construct an ensemble weighted-voting classifier comprising two support vector machine models built with radial basis function and sigmoid kernels, an extreme learning machine, a k-nearest neighbor classifier, and a discriminant analysis classifier, which avoids redundancy and bias because of their different theoretical backgrounds. An experiment was performed to estimate the proposed algorithm's performance. The results verified that the algorithm obtains better accuracy than the individual classifiers mentioned in this paper.
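
    The weighted soft-voting combination of classifiers with different theoretical backgrounds can be sketched as follows; the features are synthetic stand-ins for the Pauli-plus-Gabor combination, the voting weights are illustrative, and the extreme learning machine member is omitted because it has no standard scikit-learn implementation.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import VotingClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# synthetic 4-class "terrain" problem standing in for PolSAR feature vectors
X, y = make_classification(n_samples=2000, n_features=20, n_classes=4,
                           n_informative=10, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3, random_state=0)

ensemble = VotingClassifier(
    estimators=[("svm_rbf", SVC(kernel="rbf", probability=True, random_state=0)),
                ("svm_sig", SVC(kernel="sigmoid", probability=True, random_state=0)),
                ("knn", KNeighborsClassifier(n_neighbors=7)),
                ("lda", LinearDiscriminantAnalysis())],
    voting="soft", weights=[2, 1, 1, 1])           # illustrative member weights
ensemble.fit(Xtr, ytr)
print("weighted soft-voting ensemble accuracy:", round(ensemble.score(Xte, yte), 3))
```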

  10. Ensemble-Based Parameter Estimation in a Coupled General Circulation Model

    SciTech Connect

    Liu, Y.; Liu, Z.; Zhang, S.; Jacob, R.; Lu, F.; Rong, X.; Wu, S.

    2014-09-10

    Parameter estimation provides a potentially powerful approach to reduce model bias for complex climate models. Here, in a twin experiment framework, the authors perform the first parameter estimation in a fully coupled ocean–atmosphere general circulation model using an ensemble coupled data assimilation system facilitated with parameter estimation. The authors first perform single-parameter estimation and then multiple-parameter estimation. In the case of the single-parameter estimation, the error of the parameter [solar penetration depth (SPD)] is reduced by over 90% after ~40 years of assimilation of the conventional observations of monthly sea surface temperature (SST) and salinity (SSS). The results of multiple-parameter estimation are less reliable than those of single-parameter estimation when only the monthly SST and SSS are assimilated. Assimilating additional observations of atmospheric data of temperature and wind improves the reliability of multiple-parameter estimation. The errors of the parameters are reduced by 90% in ~8 years of assimilation. Finally, the improved parameters also improve the model climatology. With the optimized parameters, the bias of the climatology of SST is reduced by ~90%. Altogether, this study suggests the feasibility of ensemble-based parameter estimation in a fully coupled general circulation model.

  11. Ensemble of One-Class Classifiers for Personal Risk Detection Based on Wearable Sensor Data

    PubMed Central

    Rodríguez, Jorge; Barrera-Animas, Ari Y.; Trejo, Luis A.; Medina-Pérez, Miguel Angel; Monroy, Raúl

    2016-01-01

    This study introduces the One-Class K-means with Randomly-projected features Algorithm (OCKRA). OCKRA is an ensemble of one-class classifiers built over multiple projections of a dataset according to random feature subsets. Algorithms found in the literature cover a wide range of applications where ensembles of one-class classifiers have been satisfactorily applied; however, none is oriented to the area under our study: personal risk detection. OCKRA has been designed with the aim of improving the detection performance on the problem posed by the Personal RIsk DEtection (PRIDE) dataset. PRIDE was built based on 23 test subjects, where the data for each user were captured using a set of sensors embedded in a wearable band. The performance of OCKRA was compared against a support vector machine and three versions of the Parzen window classifier. On average, experimental results show that OCKRA outperformed the other classifiers by at least 0.53% in the area under the curve (AUC). In addition, OCKRA achieved an AUC above 90% for more than 57% of the users. PMID:27690054
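
    A minimal sketch of the idea behind OCKRA (not the published implementation) is shown below: each member clusters normal training data on a random feature subset, a test point is scored by its distance to the nearest centroid averaged over members, and large scores flag anomalous behaviour. The sensor readings are synthetic stand-ins for the PRIDE data.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(8)
n_features, n_members = 12, 10
X_train = rng.normal(0, 1, size=(1000, n_features))           # normal behaviour only
X_test = np.vstack([rng.normal(0, 1, size=(50, n_features)),  # normal test samples
                    rng.normal(3, 1, size=(50, n_features))]) # anomalous test samples

members = []
for _ in range(n_members):
    feats = rng.choice(n_features, size=n_features // 2, replace=False)   # random projection
    km = KMeans(n_clusters=5, n_init=10, random_state=0).fit(X_train[:, feats])
    members.append((feats, km))

def score(X):
    # higher score = farther from all learned centroids = more anomalous
    s = np.zeros(len(X))
    for feats, km in members:
        s += km.transform(X[:, feats]).min(axis=1) / n_members
    return s

s = score(X_test)
print("mean score, normal: %.2f  anomalous: %.2f" % (s[:50].mean(), s[50:].mean()))
```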

  12. Fault diagnosis of rotating machinery based on an adaptive ensemble empirical mode decomposition.

    PubMed

    Lei, Yaguo; Li, Naipeng; Lin, Jing; Wang, Sizhe

    2013-12-09

    The vibration-based signal processing technique is one of the principal tools for diagnosing faults of rotating machinery. Empirical mode decomposition (EMD), as a time-frequency analysis technique, has been widely used to process vibration signals of rotating machinery. However, it has the shortcoming of mode mixing when decomposing signals. To overcome this shortcoming, ensemble empirical mode decomposition (EEMD) was proposed. EEMD is able to reduce the mode mixing to some extent. The performance of EEMD, however, depends on the parameters adopted in the EEMD algorithms. In most of the studies on EEMD, the parameters were selected artificially and subjectively. To solve the problem, a new adaptive ensemble empirical mode decomposition method is proposed in this paper. In the method, the sifting number is adaptively selected, and the amplitude of the added noise changes with the signal frequency components during the decomposition process. The simulation, experimental and application results demonstrate that the adaptive EEMD provides improved results compared with the original EEMD in diagnosing rotating machinery.

  13. Ensemble of One-Class Classifiers for Personal Risk Detection Based on Wearable Sensor Data.

    PubMed

    Rodríguez, Jorge; Barrera-Animas, Ari Y; Trejo, Luis A; Medina-Pérez, Miguel Angel; Monroy, Raúl

    2016-09-29

    This study introduces the One-Class K-means with Randomly-projected features Algorithm (OCKRA). OCKRA is an ensemble of one-class classifiers built over multiple projections of a dataset according to random feature subsets. Algorithms found in the literature cover a wide range of applications where ensembles of one-class classifiers have been satisfactorily applied; however, none is oriented to the area under our study: personal risk detection. OCKRA has been designed with the aim of improving the detection performance on the problem posed by the Personal RIsk DEtection (PRIDE) dataset. PRIDE was built based on 23 test subjects, where the data for each user were captured using a set of sensors embedded in a wearable band. The performance of OCKRA was compared against a support vector machine and three versions of the Parzen window classifier. On average, experimental results show that OCKRA outperformed the other classifiers by at least 0.53% in the area under the curve (AUC). In addition, OCKRA achieved an AUC above 90% for more than 57% of the users.

  14. Deconvoluting Protein (Un)folding Structural Ensembles Using X-Ray Scattering, Nuclear Magnetic Resonance Spectroscopy and Molecular Dynamics Simulation.

    PubMed

    Nasedkin, Alexandr; Marcellini, Moreno; Religa, Tomasz L; Freund, Stefan M; Menzel, Andreas; Fersht, Alan R; Jemth, Per; van der Spoel, David; Davidsson, Jan

    2015-01-01

    The folding and unfolding of protein domains is an apparently cooperative process, but transient intermediates have been detected in some cases. Such (un)folding intermediates are challenging to investigate structurally as they are typically not long-lived and their role in the (un)folding reaction has often been questioned. One of the most well-studied (un)folding pathways is that of Drosophila melanogaster Engrailed homeodomain (EnHD): this 61-residue protein forms a three-helix bundle in the native state and folds via a helical intermediate. Here we used molecular dynamics simulations to derive sample conformations of EnHD in the native, intermediate, and unfolded states and selected the relevant structural clusters by comparing to small/wide angle X-ray scattering data at four different temperatures. The results are corroborated using residual dipolar couplings determined by NMR spectroscopy. Our results agree well with the previously proposed (un)folding pathway. However, they also suggest that the fully unfolded state is present at a low fraction throughout the investigated temperature interval, and that the (un)folding intermediate is highly populated at the thermal midpoint, in line with the view that this intermediate can be regarded as the denatured state under physiological conditions. Further, the combination of ensemble structural techniques with MD allows for the determination of structures and populations of multiple interconverting structures in solution.

  15. Structural insights for designed alanine-rich helices: Comparing NMR helicity measures and conformational ensembles from molecular dynamics simulation

    PubMed Central

    Song, Kun; Stewart, James M.; Fesinmeyer, R. Matthew

    2013-01-01

    The temperature dependence of helical propensities for the peptides Ac-ZGG-(KAAAA)3X-NH2 (Z = Y or G, X = A, K, and d-Arg) were studied both experimentally and by molecular dynamics simulations. Good agreement is observed in both the absolute helical propensities as well as relative helical content along the sequence; the global minimum on the calculated free energy landscape corresponds to a single α-helical conformation running from K4 – A18 with some terminal fraying, particularly at the C-terminus. Energy component analysis shows that the single helix state has favorable intramolecular electrostatic energy due to hydrogen bonds, and that less-favorable two-helix globular states have favorable solvation energy. The central lysine residues do not appear to increase helicity; however, both experimental and simulation studies show increasing helicity in the series X = Ala → Lys → d-Arg. This C-capping preference was also experimentally confirmed in Ac-(KAAAA)3X-GY-NH2 and (KAAAA)3X-GY-NH2 sequences. The roles of the C-capping groups, and of lysines throughout the sequence, in the MD-derived ensembles are analyzed in detail. PMID:18428207

  16. On the structure of crystalline and molten cryolite: Insights from the ab initio molecular dynamics in NpT ensemble

    NASA Astrophysics Data System (ADS)

    Bučko, Tomáš; Šimko, František

    2016-02-01

    Ab initio molecular dynamics simulations in isobaric-isothermal ensemble have been performed to study the low- and the high-temperature crystalline and liquid phases of cryolite. The temperature induced transitions from the low-temperature solid (α) to the high-temperature solid phase (β) and from the phase β to the liquid phase have been simulated using a series of MD runs performed at gradually increasing temperature. The structure of crystalline and liquid phases is analysed in detail and our computational approach is shown to reliably reproduce the available experimental data for a wide range of temperatures. Relatively frequent reorientations of the AlF6 octahedra observed in our simulation of the phase β explain the thermal disorder in positions of the F(-) ions observed in X-ray diffraction experiments. The isolated AlF6(3-), AlF5(2-), AlF4(-), as well as the bridged Al2Fm(6-m) ionic entities have been identified as the main constituents of cryolite melt. In accord with the previous high-temperature NMR and Raman spectroscopic experiments, the compound AlF5(2-) has been shown to be the most abundant Al-containing species formed in the melt. The characteristic vibrational frequencies for the AlFn(3-n) species in realistic environment have been determined and the computed values have been found to be in a good agreement with experiment.

  17. On the structure of crystalline and molten cryolite: Insights from the ab initio molecular dynamics in NpT ensemble.

    PubMed

    Bučko, Tomáš; Šimko, František

    2016-02-14

    Ab initio molecular dynamics simulations in isobaric-isothermal ensemble have been performed to study the low- and the high-temperature crystalline and liquid phases of cryolite. The temperature induced transitions from the low-temperature solid (α) to the high-temperature solid phase (β) and from the phase β to the liquid phase have been simulated using a series of MD runs performed at gradually increasing temperature. The structure of crystalline and liquid phases is analysed in detail and our computational approach is shown to reliably reproduce the available experimental data for a wide range of temperatures. Relatively frequent reorientations of the AlF6 octahedra observed in our simulation of the phase β explain the thermal disorder in positions of the F(-) ions observed in X-ray diffraction experiments. The isolated AlF6(3-), AlF5(2-), AlF4(-), as well as the bridged Al2Fm(6-m) ionic entities have been identified as the main constituents of cryolite melt. In accord with the previous high-temperature NMR and Raman spectroscopic experiments, the compound AlF5(2-) has been shown to be the most abundant Al-containing species formed in the melt. The characteristic vibrational frequencies for the AlFn(3-n) species in realistic environment have been determined and the computed values have been found to be in a good agreement with experiment.

  18. Massively parallel molecular-dynamics simulation of ice crystallisation and melting: the roles of system size, ensemble, and electrostatics.

    PubMed

    English, Niall J

    2014-12-21

    Ice crystallisation and melting was studied via massively parallel molecular dynamics under periodic boundary conditions, using approximately spherical ice nano-particles (both "isolated" and as a series of heterogeneous "seeds") of varying size, surrounded by liquid water and at a variety of temperatures. These studies were performed for a series of systems ranging in size from ∼1 × 10⁶ to 8.6 × 10⁶ molecules, in order to establish system-size effects upon the nano-clusters' crystallisation and dissociation kinetics. Both "traditional" four-site and "single-site" water models were used, with and without formal point charges, dipoles, and electrostatics, respectively. Simulations were carried out in the microcanonical and isothermal-isobaric ensembles, to assess the influence of "artificial" thermo- and baro-statting, and important disparities were observed, which declined upon using larger systems. It was found that there was a dependence upon system size for both ice growth and dissociation, in that larger systems favoured slower growth and more rapid melting, given the lower extent of "communication" of ice nano-crystallites with their periodic replicae in neighbouring boxes. Although the single-site model exhibited less variation with system size vis-à-vis the multiple-site representation with explicit electrostatics, its crystallisation-dissociation kinetics was artificially fast.

  19. Massively parallel molecular-dynamics simulation of ice crystallisation and melting: The roles of system size, ensemble, and electrostatics

    NASA Astrophysics Data System (ADS)

    English, Niall J.

    2014-12-01

    Ice crystallisation and melting was studied via massively parallel molecular dynamics under periodic boundary conditions, using approximately spherical ice nano-particles (both "isolated" and as a series of heterogeneous "seeds") of varying size, surrounded by liquid water and at a variety of temperatures. These studies were performed for a series of systems ranging in size from ~1 × 10⁶ to 8.6 × 10⁶ molecules, in order to establish system-size effects upon the nano-clusters' crystallisation and dissociation kinetics. Both "traditional" four-site and "single-site" water models were used, with and without formal point charges, dipoles, and electrostatics, respectively. Simulations were carried out in the microcanonical and isothermal-isobaric ensembles, to assess the influence of "artificial" thermo- and baro-statting, and important disparities were observed, which declined upon using larger systems. It was found that there was a dependence upon system size for both ice growth and dissociation, in that larger systems favoured slower growth and more rapid melting, given the lower extent of "communication" of ice nano-crystallites with their periodic replicae in neighbouring boxes. Although the single-site model exhibited less variation with system size vis-à-vis the multiple-site representation with explicit electrostatics, its crystallisation-dissociation kinetics was artificially fast.

  20. Electrical characterization of ensemble of GaN nanowires grown by the molecular beam epitaxy technique

    NASA Astrophysics Data System (ADS)

    Kolkovsky, Vl.; Zytkiewicz, Z. R.; Sobanska, M.; Klosek, K.

    2013-08-01

    High-quality Schottky contacts are formed on GaN nanowire (NW) structures grown by the molecular beam epitaxy technique on a Si(111) substrate. The current-voltage characteristics show a rectification ratio of about 10(3) and a leakage current of about 10(-4) A/cm(2) at room temperature. From the capacitance-voltage measurements, the free carrier concentration in the GaN NWs is determined to be about 10(16) cm(-3). Two deep levels (H200 and E280) are found in the structures containing GaN NWs. H200 is attributed to an extended defect located at the interface between the substrate and SiNx or near the sidewalls at the bottom of the NWs, whereas E280 is tentatively assigned to a gallium-vacancy- or nitrogen-interstitial-related defect.

  1. Single-molecule imaging of non-equilibrium molecular ensembles on the millisecond timescale

    PubMed Central

    Juette, Manuel F.; Terry, Daniel S.; Wasserman, Michael R.; Altman, Roger B.; Zhou, Zhou; Zhao, Hong; Blanchard, Scott C.

    2016-01-01

    Molecular recognition is often driven by transient processes beyond the reach of detection. Single-molecule fluorescence microscopy methods are uniquely suited for detecting such non-accumulating intermediates, yet achieving the time resolution and statistics to realize this potential has proven challenging. Here, we present a single-molecule fluorescence resonance energy transfer (smFRET) imaging and analysis platform leveraging advances in scientific complementary metal-oxide semiconductor (sCMOS) detectors that enable the imaging of more than 10,000 individual molecules simultaneously at millisecond rates. The utility of this advance is demonstrated through quantitative measurements of previously obscured processes relevant to the fidelity mechanism in protein synthesis. PMID:26878382

  2. Hidden Conformation Events in DNA Base Extrusions: A Generalized Ensemble Path Optimization and Equilibrium Simulation Study

    PubMed Central

    Cao, Liaoran; Lv, Chao; Yang, Wei

    2013-01-01

    DNA base extrusion is a crucial component of many biomolecular processes. Elucidating how bases are selectively extruded from the interiors of double-stranded DNA is pivotal to accurately understanding, and efficiently sampling, this general type of conformational transition. In this work, the on-the-path random walk (OTPRW) method, which is the first generalized ensemble sampling scheme designed for finite-temperature-string path optimizations, was improved and applied to obtain the minimum free energy path (MFEP) and the free energy profile of a classical B-DNA major-groove base extrusion pathway. Along the MFEP, an intermediate state and the corresponding transition state were located and characterized. The MFEP result suggests that a base-plane-elongation event, rather than the commonly focused base-flipping event, is dominant in the transition-state-formation portion of the pathway, and that the energetic penalty at the transition state is mainly introduced by the stretching of the Watson-Crick base pair. Moreover, to facilitate the essential base-plane-elongation dynamics, the surrounding environment of the flipped base needs to be intimately involved. Further taking advantage of the extended-dynamics nature of the OTPRW Hamiltonian, an equilibrium generalized ensemble simulation was performed along the optimized path, and based on the collected samples, several base-flipping (opening) angle collective variables were evaluated. Consistent with the MFEP result, the collective variable analysis reveals that none of these commonly employed flipping (opening) angles alone can adequately represent the base extrusion pathway, especially in the pre-transition-state portion. As further revealed by the collective variable analysis, the base-pairing partner of the extrusion target undergoes a series of in-plane rotations to facilitate the base-plane-elongation dynamics. A base-plane rotation angle is identified to be a possible reaction coordinate to represent

  3. Toward an Operational Particle Filter-Based Ensemble Data Assimilation System

    DTIC Science & Technology

    2014-09-22

    tangent linear model or adjoint). Ensemble assimilation algorithms produce a solution by generating a sample of the joint PDF of interest, but are...addressed the question of whether ensemble filters, which employ the full nonlinear model, are capable of representing quantities that are hard bounded...11. Posselt, D. J., Nonlinear Model Parameter Estimation: Comparison of Results From a Markov Chain Monte Carlo Algorithm and An Ensemble Transform

  4. Ensemble based adaptive over-sampling method for imbalanced data learning in computer aided detection of microaneurysm.

    PubMed

    Ren, Fulong; Cao, Peng; Li, Wei; Zhao, Dazhe; Zaiane, Osmar

    2017-01-01

    Diabetic retinopathy (DR) is a progressive disease, and its detection at an early stage is crucial for saving a patient's vision. An automated screening system for DR can help reduce the chances of complete blindness due to DR while lowering the workload on ophthalmologists. Among the earliest signs of DR are microaneurysms (MAs). However, current schemes for MA detection tend to report many false positives because the detection algorithms have high sensitivity; inevitably, some non-MA structures are labeled as MAs in the initial MA identification step. This is a typical "class imbalance problem", and class-imbalanced data has detrimental effects on the performance of conventional classifiers. In this work, we propose an ensemble-based adaptive over-sampling algorithm for overcoming the class imbalance problem in false positive reduction, and we use Boosting, Bagging, and Random Subspace as the ensemble frameworks to improve microaneurysm detection. The proposed ensemble-based over-sampling methods combine the strengths of adaptive over-sampling and ensemble learning. The objective of this amalgamation is to reduce the induction bias introduced by imbalanced data and to enhance the generalization performance of extreme learning machines (ELM). Experimental results show that our ASOBoost method achieves higher area under the ROC curve (AUC) and G-mean values than many existing class imbalance learning methods. Copyright © 2016 Elsevier Ltd. All rights reserved.
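
    The abstract does not spell out the exact ASOBoost formulation, so the sketch below only illustrates the general pattern it builds on: re-balancing each bootstrap sample by over-sampling the minority (true-MA) class inside a bagging-style ensemble. All names (`fit_balanced_bagging`, the decision-tree base learner, 0/1 labels) are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def oversample_minority(X, y, rng):
    """Randomly duplicate minority-class samples until both classes are equally frequent."""
    classes, counts = np.unique(y, return_counts=True)
    minority = classes[np.argmin(counts)]
    n_extra = counts.max() - counts.min()
    idx = rng.choice(np.where(y == minority)[0], size=n_extra, replace=True)
    return np.vstack([X, X[idx]]), np.concatenate([y, y[idx]])

def fit_balanced_bagging(X, y, n_members=25, seed=0):
    """Bagging ensemble; each member is trained on a re-balanced bootstrap sample."""
    rng = np.random.default_rng(seed)
    members = []
    for _ in range(n_members):
        boot = rng.integers(0, len(X), size=len(X))           # bootstrap resample
        Xb, yb = oversample_minority(X[boot], y[boot], rng)   # re-balance the classes
        members.append(DecisionTreeClassifier(max_depth=5).fit(Xb, yb))
    return members

def predict_majority(members, X):
    """Majority vote over the members' 0/1 predictions."""
    votes = np.stack([m.predict(X) for m in members])         # (n_members, n_samples)
    return (votes.mean(axis=0) >= 0.5).astype(int)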

  5. Comparison of ensemble post-processing approaches, based on empirical and dynamical error modelisation of rainfall-runoff model forecasts

    NASA Astrophysics Data System (ADS)

    Chardon, J.; Mathevet, T.; Le Lay, M.; Gailhard, J.

    2012-04-01

    In the context of a national energy company (EDF: Electricité de France), hydro-meteorological forecasts are necessary to ensure the safety and security of installations, meet environmental standards, and improve water resources management and decision making. Hydrological ensemble forecasts allow a better representation of meteorological and hydrological forecast uncertainties and improve the human expertise applied to hydrological forecasts, which is essential for synthesizing the available information coming from different meteorological and hydrological models and from human experience. An operational hydrological ensemble forecasting chain has been developed at EDF since 2008 and has been used since 2010 on more than 30 watersheds in France. This ensemble forecasting chain is characterized by ensemble pre-processing (rainfall and temperature) and post-processing (streamflow), where substantial human expertise is solicited. The aim of this paper is to compare two hydrological ensemble post-processing methods developed at EDF in order to improve ensemble forecast reliability (similar to Montanari & Brath, 2004; Schaefli et al., 2007). The aim of the post-processing methods is to dress hydrological ensemble forecasts with hydrological model uncertainties, based on perfect forecasts. The first method (called the empirical approach) is based on a statistical model of the empirical error of perfect forecasts, using streamflow sub-samples stratified by quantile class and lead time. The second method (called the dynamical approach) is based on streamflow sub-samples stratified by quantile class, streamflow variation, and lead time. On a set of 20 watersheds used for operational forecasts, results show that both approaches are necessary to ensure good post-processing of the hydrological ensemble, allowing a clear improvement in the reliability, skill, and sharpness of the ensemble forecasts. The comparison of the empirical and dynamical approaches shows the limits of the empirical approach which is not able to take into account hydrological

  6. Polypeptides Based Molecular Electronics

    DTIC Science & Technology

    2008-10-06

    [Figure 3: Dehydration synthesis reaction.] 2.1.1 Introduction to peptides: Peptides are biomolecules formed from the 20 naturally occurring amino acids. Figure 3 shows the dehydration synthesis reaction (known as condensation

  7. Signal enhancement based on complex curvelet transform and complementary ensemble empirical mode decomposition

    NASA Astrophysics Data System (ADS)

    Dong, Lieqian; Wang, Deying; Zhang, Yimeng; Zhou, Datong

    2017-09-01

    Signal enhancement is a necessary step in seismic data processing. In this paper we utilize the complementary ensemble empirical mode decomposition (CEEMD) and complex curvelet transform (CCT) methods to separate signal from random noise and thereby improve the signal-to-noise (S/N) ratio. First, the original noisy data are decomposed into a series of intrinsic mode function (IMF) profiles with the aid of CEEMD. Then the noisy IMFs are transformed into the CCT domain. By choosing different thresholds, based on the noise level of each IMF profile, the noise in the original data can be suppressed. Finally, we illustrate the effectiveness of the approach on simulated and field datasets.
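
    For illustration only, the sketch below shows the per-IMF thresholding idea in its simplest form: each IMF gets its own noise-level-dependent threshold and the thresholded IMFs are summed back. The curvelet-domain step of the paper is replaced here by plain hard thresholding, and the `imfs` array is assumed to come from some external CEEMD implementation; this is not the authors' code.

```python
import numpy as np

def denoise_from_imfs(imfs):
    """Hard-threshold each IMF with a noise-dependent threshold, then reconstruct.

    `imfs` is an (n_imfs, n_samples) array assumed to be produced by a CEEMD
    implementation (not shown). The universal threshold sigma * sqrt(2 ln N)
    is estimated per IMF from the median absolute deviation.
    """
    n = imfs.shape[1]
    cleaned = np.zeros(n)
    for imf in imfs:
        sigma = np.median(np.abs(imf)) / 0.6745          # robust noise estimate
        thr = sigma * np.sqrt(2.0 * np.log(n))           # universal threshold
        cleaned += np.where(np.abs(imf) > thr, imf, 0)   # hard thresholding
    return cleaned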

  8. A Human ECG Identification System Based on Ensemble Empirical Mode Decomposition

    PubMed Central

    Zhao, Zhidong; Yang, Lei; Chen, Diandian; Luo, Yi

    2013-01-01

    In this paper, a human electrocardiogram (ECG) identification system based on ensemble empirical mode decomposition (EEMD) is designed. A robust preprocessing method comprising noise elimination, heartbeat normalization and quality measurement is proposed to eliminate the effects of noise and heart rate variability; the system is therefore independent of the heart rate. The ECG signal is decomposed into a number of intrinsic mode functions (IMFs), and Welch spectral analysis is used to extract the significant heartbeat signal features. Principal component analysis is used to reduce the dimensionality of the feature space, and the K-nearest neighbors (K-NN) method is applied as the classifier. The proposed human ECG identification system was tested on standard MIT-BIH ECG databases: the ST change database, the long-term ST database, and the PTB database. The system achieved an identification accuracy of 95% for 90 subjects, demonstrating the effectiveness of the proposed method in terms of accuracy and robustness. PMID:23698274
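
    A minimal sketch of the downstream feature pipeline described above (Welch spectra per beat, PCA dimensionality reduction, K-NN classification). The EEMD decomposition and preprocessing stages are omitted, and the inputs `beats`, `labels`, the sampling rate and the component counts are hypothetical placeholders, not the paper's settings.

```python
import numpy as np
from scipy.signal import welch
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

def welch_features(beats, fs=360, nperseg=128):
    """Welch power spectrum of each (already normalized) heartbeat segment."""
    return np.array([welch(b, fs=fs, nperseg=nperseg)[1] for b in beats])

def train_identifier(beats, labels):
    """beats: (n_beats, n_samples) array; labels: subject IDs -- hypothetical inputs."""
    model = make_pipeline(PCA(n_components=20),
                          KNeighborsClassifier(n_neighbors=3))
    model.fit(welch_features(beats), labels)
    return model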

  9. A knowledge-based approach to generating diverse but energetically representative ensembles of ligand conformers

    NASA Astrophysics Data System (ADS)

    Dorfman, Roman J.; Smith, Karl M.; Masek, Brian B.; Clark, Robert D.

    2008-09-01

    This paper describes a new and efficient stochastic conformational sampling method for generating a range of low-energy molecule conformations. Sampling can be tailored to a specific structural domain (e.g., peptides) by extracting torsional profiles from specific datasets and subsequently applying them to target molecules outside the reference set. The programs that handle creation of the knowledge-based torsional profiles and conformer generation per se are separate and so can be used independently or sequentially, depending on the task at hand. The conformational ensembles produced are contrasted with those generated using local minimization approaches. They are also quantitatively compared with a broader range of techniques in terms of speed and the ability to reproduce bound ligand conformations found in complexes with proteins.

  10. Arch-based configurations in the volume ensemble of static granular systems

    NASA Astrophysics Data System (ADS)

    Slobinsky, D.; Pugnaloni, Luis A.

    2015-02-01

    We propose an alternative approach to count the microscopic static configurations of granular packs under gravity by considering arches. This strategy obviates the problem of filtering out configurations that are not mechanically stable, opening the way for a range of granular models to be studied via ensemble theory. Following this arch-based approach, we have obtained the exact density of states for a 2D, non-interacting rigid arch model of granular assemblies. The calculated arch size distribution and volume fluctuations show qualitative agreement with realistic simulations of tapped granular beds. We have also validated our calculations by comparing them with the analytic solution for the limiting case of a quasi-1D column of frictionless disks.

  11. Multi-faults decoupling on turbo-expander using differential-based ensemble empirical mode decomposition

    NASA Astrophysics Data System (ADS)

    Li, Hongguang; Li, Ming; Li, Cheng; Li, Fucai; Meng, Guang

    2017-09-01

    This paper is dedicated to the multi-fault decoupling of a turbo-expander rotor system using Differential-based Ensemble Empirical Mode Decomposition (DEEMD). DEEMD is an improved version of DEMD that resolves the problem of mode mixing. The nonlinear behaviors of the turbo-expander, considering a temperature gradient, with crack, rub-impact and pedestal looseness faults are investigated respectively, so that the baseline for the multi-fault decoupling can be established. DEEMD is then applied to the vibration signals of the rotor system with coupling faults acquired by numerical simulation, and the results indicate that DEEMD can successfully decouple the coupling faults and is more efficient than EEMD. DEEMD is also applied to the vibration signal of misalignment coupled with a rub-impact fault obtained during the adjustment of the experimental system. The conclusion is that DEEMD can decompose practical multi-fault signals, and the industrial prospect of DEEMD is verified as well.

  12. Compressed sensing of hyperspectral images based on scrambled block Hadamard ensemble

    NASA Astrophysics Data System (ADS)

    Wang, Li; Feng, Yan

    2016-11-01

    A fast measurement matrix based on a scrambled block Hadamard ensemble for compressed sensing (CS) of hyperspectral images (HSI) is investigated. The proposed measurement matrix offers several attractive features. First, it possesses Gaussian behavior, which indicates that the matrix is universal and requires a near-optimal number of samples for exact reconstruction. In addition, it could be easily implemented in the optical domain due to its integer-valued elements. More importantly, the measurement matrix needs only a small amount of memory for storage during the sampling process. Experimental results on HSIs reveal that the reconstruction performance of the proposed measurement matrix is comparable to or better than that of the Gaussian and Bernoulli matrices under different reconstruction algorithms, while consuming less computational time. The proposed matrix could be used in CS of HSI, which would save storage memory on board, improve the sampling efficiency, and enhance the reconstruction quality.
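
    A sketch in the spirit of a scrambled block Hadamard ensemble operator: randomly permute (scramble) the signal, apply a small orthonormal Hadamard transform block-wise, and keep a random subset of the resulting coefficients. The block size, seed handling and function name are illustrative assumptions; the paper's exact construction may differ.

```python
import numpy as np
from scipy.linalg import hadamard

def sbhe_measure(x, m, block=32, seed=0):
    """Return m compressive samples of x (len(x) must be divisible by `block`).

    Pipeline: random permutation -> block-wise Hadamard transform -> random
    selection of m coefficients.
    """
    rng = np.random.default_rng(seed)
    H = hadamard(block) / np.sqrt(block)           # orthonormal Hadamard block
    perm = rng.permutation(len(x))                 # scrambling step
    xs = x[perm].reshape(-1, block)
    coeffs = (xs @ H.T).ravel()                    # block-wise transform
    keep = rng.choice(len(x), size=m, replace=False)
    return coeffs[keep]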

  13. Inferring Alcoholism SNPs and Regulatory Chemical Compounds Based on Ensemble Bayesian Network.

    PubMed

    Chen, Huan; Sun, Jiatong; Jiang, Hong; Wang, Xianyue; Wu, Lingxiang; Wu, Wei; Wang, Qh

    2016-12-20

    The disturbance of consciousness is one of the most common symptoms in those who have alcoholism and may cause disability and mortality. Previous studies indicated that several single nucleotide polymorphisms (SNPs) increase the susceptibility to alcoholism. In this study, we utilized the Ensemble Bayesian Network (EBN) method to identify causal SNPs of alcoholism based on the verified GAW14 data. Thirteen out of eighteen SNPs directly connected with alcoholism were found to be concordant with potential risk regions of alcoholism in the OMIM database. As a number of SNPs were found to contribute to alterations in gene expression, known as expression quantitative trait loci (eQTLs), we further sought to identify chemical compounds acting as regulators of alcoholism genes captured by causal SNPs. Chloroprene and valproic acid were identified as expression regulators for the genes C11orf66 and SALL3, respectively, which were captured by alcoholism SNPs.

  14. Ensemble Methods

    NASA Astrophysics Data System (ADS)

    Re, Matteo; Valentini, Giorgio

    2012-03-01

    Ensemble methods are statistical and computational learning procedures reminiscent of the human social learning behavior of seeking several opinions before making any crucial decision. The idea of combining the opinions of different "experts" to obtain an overall "ensemble" decision is rooted in our culture at least from the classical age of ancient Greece, and it was formalized during the Enlightenment with the Condorcet Jury Theorem [45], which proved that the judgment of a committee is superior to those of individuals, provided the individuals have reasonable competence. Ensembles are sets of learning machines that combine in some way their decisions, or their learning algorithms, or different views of data, or other specific characteristics to obtain more reliable and more accurate predictions in supervised and unsupervised learning problems [48,116]. A simple example is represented by the majority vote ensemble, by which the decisions of different learning machines are combined, and the class that receives the majority of "votes" (i.e., the class predicted by the majority of the learning machines) is the class predicted by the overall ensemble [158]. In the literature, a plethora of terms other than ensembles has been used, such as fusion, combination, aggregation, and committee, to indicate sets of learning machines that work together to solve a machine learning problem [19,40,56,66,99,108,123], but in this chapter we maintain the term ensemble in its widest meaning, in order to include the whole range of combination methods. Nowadays, ensemble methods represent one of the main current research lines in machine learning [48,116], and the interest of the research community in ensemble methods is witnessed by conferences and workshops specifically devoted to ensembles, first of all the multiple classifier systems (MCS) conference organized by Roli, Kittler, Windeatt, and other researchers of this area [14,62,85,149,173]. Several theories have been
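
    The majority-vote combiner mentioned above fits in a few lines; a minimal sketch (not tied to any particular learning library), where `predictions` stacks the label outputs of the individual learning machines and ties are broken by an arbitrary convention:

```python
import numpy as np

def majority_vote(predictions):
    """Combine label predictions from several classifiers by plurality vote.

    `predictions` has shape (n_classifiers, n_samples); labels can be any integers.
    Ties are resolved in favour of the smallest label (arbitrary convention).
    """
    combined = []
    for column in predictions.T:                      # votes cast for one sample
        labels, counts = np.unique(column, return_counts=True)
        combined.append(labels[np.argmax(counts)])
    return np.array(combined)

# Example: three classifiers, four samples
votes = np.array([[0, 1, 1, 2],
                  [0, 1, 0, 2],
                  [1, 1, 0, 0]])
print(majority_vote(votes))   # -> [0 1 0 2]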

  15. Data-worth analysis through probabilistic collocation-based Ensemble Kalman Filter

    NASA Astrophysics Data System (ADS)

    Dai, Cheng; Xue, Liang; Zhang, Dongxiao; Guadagnini, Alberto

    2016-09-01

    We propose a new and computationally efficient data-worth analysis and quantification framework keyed to the characterization of target state variables in groundwater systems. We focus on dynamically evolving plumes of dissolved chemicals migrating in randomly heterogeneous aquifers. An accurate prediction of the detailed features of solute plumes requires collecting a substantial amount of data. At the same time, constraints dictated by the availability of financial resources and ease of access to the aquifer system suggest the importance of assessing the expected value of data before these are actually collected. Data-worth analysis is targeted at quantifying the impact of new potential measurements on the expected reduction of predictive uncertainty based on a given process model. Integration of the Ensemble Kalman Filter method within a data-worth analysis framework enables us to assess data worth sequentially, which is a key desirable feature for monitoring scheme design in a contaminant transport scenario. However, it is remarkably challenging because of the (typically) high computational cost involved, considering that repeated solutions of the inverse problem are required. As a computationally efficient scheme, we embed in the data-worth analysis framework a modified version of the Probabilistic Collocation Method-based Ensemble Kalman Filter proposed by Zeng et al. (2011), so that we take advantage of the ability to assimilate data sequentially in time through a surrogate model constructed via polynomial chaos expansion. We illustrate our approach on a set of synthetic scenarios involving solute migrating in a two-dimensional random permeability field. Our results demonstrate the computational efficiency of our approach and its ability to quantify the impact of the design of the monitoring network on the reduction of uncertainty associated with the characterization of a migrating contaminant plume.
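
    For readers unfamiliar with the filter underlying this framework, here is a minimal sketch of a single stochastic Ensemble Kalman Filter analysis step (Kalman gain estimated from ensemble covariances, observations perturbed per member). It does not include the polynomial-chaos surrogate of Zeng et al. (2011) or the data-worth bookkeeping; the linear observation operator `H` is an assumption made for brevity.

```python
import numpy as np

def enkf_update(X, y_obs, H, R, seed=0):
    """One stochastic EnKF analysis step.

    X     : (n_state, n_ens) forecast ensemble
    y_obs : (n_obs,) observation vector
    H     : (n_obs, n_state) linear observation operator
    R     : (n_obs, n_obs) observation-error covariance
    """
    rng = np.random.default_rng(seed)
    n_obs, n_ens = len(y_obs), X.shape[1]
    Xp = X - X.mean(axis=1, keepdims=True)            # state anomalies
    Yp = H @ Xp                                       # anomalies in observation space
    Pyy = Yp @ Yp.T / (n_ens - 1) + R                 # innovation covariance
    Pxy = Xp @ Yp.T / (n_ens - 1)                     # state-observation covariance
    K = Pxy @ np.linalg.inv(Pyy)                      # Kalman gain
    Y = y_obs[:, None] + rng.multivariate_normal(np.zeros(n_obs), R, n_ens).T
    return X + K @ (Y - H @ X)                        # analysis ensemble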

  16. Constructing Better Classifier Ensemble Based on Weighted Accuracy and Diversity Measure

    PubMed Central

    Chao, Lidia S.

    2014-01-01

    A weighted accuracy and diversity (WAD) method is presented: a novel measure used to evaluate the quality of a classifier ensemble and to assist in the ensemble selection task. The proposed measure is motivated by a commonly accepted hypothesis, namely that a robust classifier ensemble should consist of members that are not only accurate but also different from one another. In fact, accuracy and diversity are mutually constraining factors; an ensemble with high accuracy may have low diversity, and an overly diverse ensemble may negatively affect accuracy. This study proposes a method to find the balance between accuracy and diversity that enhances the predictive ability of an ensemble for unknown data. The quality assessment for an ensemble is performed such that the final score is obtained by computing the harmonic mean of accuracy and diversity, where two weight parameters are used to balance them. The measure is compared to two representative measures, Kappa-Error and GenDiv, and to two threshold measures that consider only accuracy or diversity, using two heuristic search algorithms, a genetic algorithm and a forward hill-climbing algorithm, in ensemble selection tasks performed on 15 UCI benchmark datasets. The empirical results demonstrate that the WAD measure is superior to the others in most cases. PMID:24672402
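
    The abstract specifies a weight-balanced harmonic mean of accuracy and diversity but not its exact parameterization; one plausible reading is the weighted harmonic mean below, with the two weights summing to one. This is an interpretive sketch, not the paper's definition.

```python
def wad_score(accuracy, diversity, w_acc=0.5, w_div=0.5):
    """Weighted harmonic mean of ensemble accuracy and diversity (both in (0, 1]).

    One plausible reading of the WAD measure; the exact weighting scheme in the
    paper may differ. Larger scores indicate a better accuracy/diversity balance.
    """
    if accuracy <= 0 or diversity <= 0:
        return 0.0
    return (w_acc + w_div) / (w_acc / accuracy + w_div / diversity)

print(wad_score(0.9, 0.4))   # low diversity pulls the score down to ~0.55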

  17. Development of new ensemble methods based on the performace skills of regional climate models over South Korea

    NASA Astrophysics Data System (ADS)

    Suh, M. S.; Oh, S. G.; Lee, D. K.; Cha, D. H.; Choi, S. J.; Hong, S. Y.; Kang, H. S.

    2012-04-01

    It is well known that multi-model ensembles can reduce the uncertainties in model results and increase their reliability. In this paper, the prediction skills for temperature and precipitation of five ensemble methods are discussed using 20 years of simulation results (1989 to 2008) from four regional climate models (RCMs: SNURCM, WRF, RegCM4, and RSM) driven by NCEP-DOE and ERA-Interim boundary conditions. The simulation domain is CORDEX (COordinated Regional climate Downscaling Experiment) East Asia, with 197 x 233 grid points at a 50-km horizontal resolution. The three new ensemble methods developed in this study, PEA_BRC, PEA_RAC and PEA_ROC, are performance-based ensemble averaging methods that use bias, RMSE (root mean square error) and correlation; RMSE and absolute correlation; and RMSE and original correlation, respectively. The other two ensemble methods are equal-weighted averaging (EWA) and multivariate linear regression (Mul_Reg). Fifteen years and five years of the 20-year simulation data were used to derive the weighting coefficients and to cross-validate the prediction skills of the five ensemble methods, respectively. Training and evaluation were repeated 20 times using a cyclic method over the 20 years of data. Among the five ensemble methods, Mul_Reg (EWA) shows the best (worst) skill regardless of season and variable during the training period, and PEA_RAC and PEA_ROC show skills very similar to Mul_Reg for all variables and seasons during the training period. However, the skill and stability of Mul_Reg are drastically reduced when it is applied to prediction, regardless of variable and season, whereas the skill and stability of PEA_RAC are only slightly reduced. As a result, PEA_RAC shows the best skill regardless of season and variable during the prediction period. This result confirms that the new ensemble method developed in this study, the PEA_RAC, can be used for the

  18. Ensemble Models

    EPA Science Inventory

    Ensemble forecasting has been used for operational numerical weather prediction in the United States and Europe since the early 1990s. An ensemble of weather or climate forecasts is used to characterize the two main sources of uncertainty in computer models of physical systems: ...

  20. Comparison of ensemble-based state and parameter estimation methods for soil moisture data assimilation

    NASA Astrophysics Data System (ADS)

    Chen, Weijing; Huang, Chunlin; Shen, Huanfeng; Li, Xin

    2015-12-01

    Model parameters are a source of uncertainty that can easily cause systematic deviation and significantly affect the accuracy of soil moisture generation in assimilation systems. This study addresses the issue of retrieving model parameters related to soil moisture via the simultaneous estimation of states and parameters based on the Common Land Model (CoLM). The state-parameter estimation algorithms AEnKF (Augmented Ensemble Kalman Filter), DEnKF (Dual Ensemble Kalman Filter) and SODA (Simultaneous Optimization and Data Assimilation) are implemented entirely within an EnKF framework to investigate how the three algorithms can correct model parameters and improve the accuracy of soil moisture estimation. The analysis is illustrated by assimilating surface soil moisture at varying observation intervals using data from Mongolian plateau sites. Furthermore, a radiative transfer model is introduced as an observation operator to analyze the influence of brightness temperature assimilation on the states and parameters estimated at different microwave signal frequencies. Three cases were analyzed for both soil moisture and brightness temperature assimilation, focusing on the progressive incorporation of parameter uncertainty, forcing data uncertainty and model uncertainty. It is demonstrated that EnKF is outperformed by all other methods, as it consistently maintains a bias; the state-parameter estimation algorithms provide a more accurate estimation of soil moisture than EnKF. AEnKF is the most robust method, with the lowest RMSE values for retrieving states and parameters when dealing only with parameter uncertainty, but it is at a disadvantage as the sources of uncertainty increase and the number of observations decreases. SODA performs well under the complex situations in which DEnKF shows slight disadvantages in terms of statistical indicators; however, the former consumes far more memory and time than the latter.

  1. Ensemble-based characterization of unbound and bound states on protein energy landscape

    PubMed Central

    Ruvinsky, Anatoly M; Kirys, Tatsiana; Tuzikov, Alexander V; Vakser, Ilya A

    2013-01-01

    Physicochemical description of numerous cell processes is fundamentally based on the energy landscapes of the protein molecules involved. Although the whole energy landscape is difficult to reconstruct, increased attention to particular targets has provided enough structures for mapping functionally important subspaces associated with the unbound and bound protein structures. The subspace mapping produces a discrete representation of the landscape, further called the energy spectrum. We compiled and characterized ensembles of bound and unbound conformations of six small proteins and explored their spectra in implicit solvent. First, the analysis of the unbound-to-bound changes points to conformational selection as the binding mechanism for four proteins. Second, the results show that bound and unbound spectra often significantly overlap. Moreover, the larger the overlap, the smaller the root mean square deviation (RMSD) between the bound and unbound conformational ensembles. Third, the center of the unbound spectrum has a higher energy than the center of the corresponding bound spectrum of the dimeric and multimeric states for most of the proteins. This suggests that the unbound states often have larger entropy than the bound states. Fourth, the exhaustively long minimization, making small intrarotamer adjustments (all-atom RMSD ≤ 0.7 Å), dramatically reduces the distance between the centers of the bound and unbound spectra as well as the extent of the spectra. It condenses the unbound and bound energy levels into a thin layer at the bottom of the energy landscape, with energy spacings that vary between 0.8–4.6 and 3.5–10.5 kcal/mol for the unbound and bound states, respectively. Finally, the analysis of protein energy fluctuations showed that protein vibrations themselves can excite the interstate transitions, including the unbound-to-bound ones. PMID:23526684

  2. The impact of Ensemble-based data assimilation on the predictability of landfalling Hurricane Katrina (2005)

    NASA Astrophysics Data System (ADS)

    Zhang, H.; Pu, Z.

    2012-12-01

    Accurate forecasts of the track, intensity and structure of a landfalling hurricane can save lives and mitigate social impacts. Over the last two decades, significant improvements have been achieved in hurricane forecasts. However, only a few studies have emphasized landfalling hurricanes. Specifically, there are difficulties in predicting hurricane landfall due to the uncertainties in representing the atmospheric near-surface conditions in numerical weather prediction models, the complicated interaction between the atmosphere and the ocean, and the multiple-scale dynamical and physical processes accompanying storm development. In this study, the impact of the assimilation of conventional and satellite observations on the predictability of landfalling hurricanes is examined by using the mesoscale community Weather Research and Forecasting (WRF) model and an ensemble Kalman filter developed by the NCAR Data Assimilation Research Testbed (DART). Hurricane Katrina (2005) was chosen as a case study since it was one of the deadliest disasters in US history. The minimum sea level pressure from the best track, QuikScat ocean surface wind vectors, surface mesonet observations, airborne Doppler radar derived wind components and available conventional observations are assimilated in a series of experiments to examine the data impacts on the predictability of Hurricane Katrina. The analyses and forecasts show that ensemble-based data assimilation significantly improves the forecast of Hurricane Katrina. The assimilation improves the track forecast by modifying the storm structure and related environmental fields. Cyclonic increments are clearly seen in the vorticity and wind analyses. Temperature and humidity fields are also modified by the data assimilation. The changes in the relevant fields help organize the structure of the storm, intensify the circulation, and result in a positive impact on the evolution of the storm in both analyses and forecasts. The forecasts in the

  3. An exact approach for studying cargo transport by an ensemble of molecular motors

    PubMed Central

    2013-01-01

    Background Intracellular transport is crucial for many cellular processes in which a large fraction of the cargo is transferred by motor proteins over a network of microtubules. Malfunctions in the transport mechanism underlie a number of medical maladies. Existing methods for studying how motor proteins coordinate the transfer of a shared cargo over a microtubule are either analytical or based on Monte-Carlo simulations. Approaches that yield analytical results, while providing unique insights into the transport mechanism, make simplifying assumptions, so that a detailed characterization of important transport modalities is difficult to reach. On the other hand, Monte-Carlo based simulations can incorporate detailed characteristics of the transport mechanism; however, the quality of the results depends on the number and quality of the simulation runs used in arriving at them. Here, for example, it is difficult to simulate and study rare events that can trigger abnormalities in transport. Results In this article, a semi-analytical methodology that determines the probability distribution function of motor-protein behavior in an exact manner is developed. The method utilizes a finite-dimensional projection of the underlying infinite-dimensional Markov model, which retains the Markov property, and enables the detailed and exact determination of motor configurations, from which meaningful inferences on the transport characteristics of the original model can be derived. Conclusions Under this novel probabilistic approach, new insights about the mechanisms of action of these proteins are found, suggesting hypotheses about their behavior and driving the design and realization of new experiments. The advantages provided in accuracy and efficiency make it possible to detect rare events in the motor protein dynamics that could otherwise pass undetected using standard simulation methods. In this respect, the model has allowed us to provide a possible explanation for possible mechanisms

  4. Bayesian model aggregation for ensemble-based estimates of protein pKa values

    SciTech Connect

    Gosink, Luke J.; Hogan, Emilie A.; Pulsipher, Trenton C.; Baker, Nathan A.

    2014-03-01

    This paper investigates an ensemble-based technique called Bayesian Model Averaging (BMA) to improve the performance of protein amino acid pKa predictions. Structure-based pKa calculations play an important role in the mechanistic interpretation of protein structure and are also used to determine a wide range of protein properties. A diverse set of methods currently exist for pKa prediction, ranging from empirical statistical models to ab initio quantum mechanical approaches. However, each of these methods is based on a set of assumptions that have inherent biases and sensitivities that can affect a model's accuracy and generalizability for pKa prediction in complicated biomolecular systems. We use BMA to combine eleven diverse prediction methods that each estimate pKa values of amino acids in staphylococcal nuclease. These methods are based on work conducted for the pKa Cooperative, and the pKa measurements are based on experimental work conducted by the García-Moreno lab. Our study demonstrates that the aggregated estimate obtained from BMA outperforms all individual prediction methods in our cross-validation study, with improvements of 40-70% over other method classes. This work illustrates a new possible mechanism for improving the accuracy of pKa prediction and lays the foundation for future work on aggregate models that balance computational cost with prediction accuracy.
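
    A hedged sketch of the aggregation idea: each method receives a weight proportional to its likelihood on known (training) pKa values, and the aggregate prediction is the weight-averaged combination of the individual predictions. A flat model prior and a fixed Gaussian error sigma are simplifying assumptions; the paper's BMA treatment (and any per-model error estimation, e.g. via EM) is not reproduced here, and all variable names are hypothetical.

```python
import numpy as np

def bma_weights(train_preds, train_truth, sigma=1.0):
    """Posterior model weights from Gaussian likelihoods of training residuals.

    train_preds: (n_models, n_train) pKa predictions of each method
    train_truth: (n_train,) measured pKa values
    """
    sq_err = ((train_preds - train_truth) ** 2).sum(axis=1)
    log_like = -0.5 * sq_err / sigma**2
    w = np.exp(log_like - log_like.max())     # stabilize before normalizing
    return w / w.sum()

def bma_predict(weights, test_preds):
    """Weighted average of the individual methods' test predictions."""
    return weights @ test_preds               # shape (n_test,)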

  5. Application of NARR-based NLDAS Ensemble Simulations to Continental-Scale Drought Monitoring

    NASA Astrophysics Data System (ADS)

    Alonge, C. J.; Cosgrove, B. A.

    2008-05-01

    Government estimates indicate that droughts cause billions of dollars of damage to agricultural interests each year. More effective identification of droughts would directly benefit decision makers and would allow for the more efficient allocation of resources that might mitigate the event. Land data assimilation systems, with their high quality representations of soil moisture, present an ideal platform for drought monitoring and offer many advantages over traditional modeling systems. The recently released North American Regional Reanalysis (NARR) covers the NLDAS domain and provides all fields necessary to force the NLDAS for 27 years. This presents an ideal opportunity to combine NARR and NLDAS resources into an effective real-time drought monitor. Toward this end, our project seeks to validate and explore the NARR's suitability as a base for drought monitoring applications, both in terms of data set length and accuracy. Along the same lines, the project will examine the impact of the use of different (longer) LDAS model climatologies on drought monitoring, and will explore the advantages of ensemble simulations versus single-model simulations in drought monitoring activities. We also plan to produce a NARR- and observation-based, high-quality, 27-year, 1/8th-degree, 3-hourly land surface and meteorological forcing data set. An investigation of the best way to force an LDAS-type system will also be made, with traditional NLDAS and NLDASE forcing options explored. This presentation will focus on an overview of the drought monitoring project and will include a summary of recent progress. Developments include the generation of forcing data sets, ensemble LSM output, and production of model-based drought indices over the entire NLDAS domain. Project forcing files use 32-km NARR model output as a data backbone, and include observed precipitation (blended CPC gauge, PRISM gauge, Stage II, HPD, and CMORPH) and a GOES-based bias correction of downward solar

  6. Generic Learning-Based Ensemble Framework for Small Sample Size Face Recognition in Multi-Camera Networks.

    PubMed

    Zhang, Cuicui; Liang, Xuefeng; Matsuyama, Takashi

    2014-12-08

    Multi-camera networks have gained great interest in video-based surveillance systems for security monitoring, access control, etc. Person re-identification is an essential and challenging task in multi-camera networks, which aims to determine whether a given individual has already appeared over the camera network. Individual recognition often uses faces as a trait and requires a large number of samples during the training phase. This is difficult to fulfill due to the limitations of the camera hardware system and the unconstrained image capturing conditions. Conventional face recognition algorithms often encounter the "small sample size" (SSS) problem, arising from the small number of training samples compared to the high dimensionality of the sample space. To overcome this problem, interest in the combination of multiple base classifiers has sparked research efforts in ensemble methods. However, existing ensemble methods still leave two questions open: (1) how to define diverse base classifiers from the small data; (2) how to avoid the diversity/accuracy dilemma occurring during ensemble. To address these problems, this paper proposes a novel generic learning-based ensemble framework, which augments the small data by generating new samples based on a generic distribution and introduces a tailored 0-1 knapsack algorithm to alleviate the diversity/accuracy dilemma. More diverse base classifiers can be generated from the expanded face space, and more appropriate base classifiers are selected for the ensemble. Extensive experimental results on four benchmarks demonstrate the higher ability of our system to cope with the SSS problem compared to the state-of-the-art systems.

  7. Investigating the Utility of Oblique Tree-Based Ensembles for the Classification of Hyperspectral Data

    PubMed Central

    Poona, Nitesh; van Niekerk, Adriaan; Ismail, Riyad

    2016-01-01

    Ensemble classifiers are being widely used for the classification of spectroscopic data. In this regard, the random forest (RF) ensemble has been successfully applied in an array of applications, and has proven to be robust in handling high dimensional data. More recently, several variants of the traditional RF algorithm including rotation forest (rotF) and oblique random forest (oRF) have been applied to classifying high dimensional data. In this study we compare the traditional RF, rotF, and oRF (using three different splitting rules, i.e., ridge regression, partial least squares, and support vector machine) for the classification of healthy and infected Pinus radiata seedlings using high dimensional spectroscopic data. We further test the robustness of these five ensemble classifiers to reduced spectral resolution by spectral resampling (binning) of the original spectral bands. The results showed that the three oblique random forest ensembles outperformed both the traditional RF and rotF ensembles. Additionally, the rotF ensemble proved to be the least robust of the five ensembles tested. Spectral resampling of the original bands provided mixed results. Nevertheless, the results demonstrate that using spectral resampled bands is a promising approach to classifying asymptomatic stress in Pinus radiata seedlings. PMID:27854290

  8. Uncertainty analysis of neural network based flood forecasting models: An ensemble based approach for constructing prediction interval

    NASA Astrophysics Data System (ADS)

    Kasiviswanathan, K.; Sudheer, K.

    2013-05-01

    Artificial neural network (ANN) based hydrologic models have gained a lot of attention among water resources engineers and scientists owing to their potential for accurate prediction of flood flows compared to conceptual or physics-based hydrologic models. The ANN approximates the non-linear functional relationship between complex hydrologic variables in arriving at the river flow forecast values. Despite a large number of applications, there is still criticism that an ANN's point predictions lack reliability, since the uncertainty of the predictions is not quantified, and this limits their use in practical applications. A major concern in applying traditional uncertainty analysis techniques to the neural network framework is its parallel computing architecture with large degrees of freedom, which makes the uncertainty assessment a challenging task. Very few studies have considered assessment of the predictive uncertainty of ANN-based hydrologic models. In this study, a novel method is proposed that helps construct the prediction interval of an ANN flood forecasting model during calibration itself. The method is designed to have two stages of optimization during calibration: in stage 1, the ANN model is trained with a genetic algorithm (GA) to obtain the optimal set of weights and biases, and in stage 2, the optimal variability of the ANN parameters (obtained in stage 1) is identified so as to create an ensemble of predictions. During the second stage, the optimization is performed with multiple objectives: (i) minimum residual variance for the ensemble mean, (ii) maximum number of measured data points falling within the estimated prediction interval, and (iii) minimum width of the prediction interval. The method is illustrated using a real-world case study of an Indian basin. The method was able to produce an ensemble with an average prediction interval width of 23.03 m3/s, with 97.17% of the total validation data points (measured) lying within the interval. The derived

  9. Estimating Parameters in Real-Time Under Changing Conditions Via the Ensemble Kalman Filter Based Method

    NASA Astrophysics Data System (ADS)

    Meng, S.; Xie, X.

    2014-12-01

    Hydrological model performance is usually not as acceptable as expected due to limited measurements and imperfect parameterization, which is attributable to uncertainties from model parameters and model structures. In applications, a general assumption is held that model parameters are constant under a stationary condition during the simulation period, and the parameters are generally prescribed through calibration with observed data. In reality, however, the model parameters related to the physical or conceptual characteristics of a catchment will drift under nonstationary conditions in response to climate transitions and land use alterations. Such parameter drifts or changes are especially evident in long-term hydrological simulations. Therefore, the assumption of constant parameters under nonstationary conditions is inappropriate, and it will propagate errors from the parameters to the outputs during simulation and prediction. Even though a few studies have acknowledged parameter drift or change, little attention has been paid to the estimation of changing parameters. In this study, we employ an ensemble Kalman filter (EnKF) based method to trace parameter changes in real time. Through synthetic experiments, the capability of the EnKF-based method is demonstrated by assimilating runoff observations into a rainfall-runoff model, i.e., the Xinanjiang Model. In addition to the stationary condition, three typical nonstationary conditions are considered, i.e., leap, linear and Ω-shaped transitions. To examine the robustness of the method, different errors from rainfall input, modelling and observations are investigated. The shuffled complex evolution (SCE-UA) algorithm is applied under the same conditions for comparison. The results show that the EnKF-based method is capable of capturing the general pattern of the parameter drift even for high levels of uncertainty. It provides better estimates than the SCE-UA method does by taking advantage of real

  10. Microarray data classification using the spectral-feature-based TLS ensemble algorithm.

    PubMed

    Sun, Zhan-Li; Wang, Han; Lau, Wai-Shing; Seet, Gerald; Wang, Danwei; Lam, Kin-Man

    2014-09-01

    The reliable and accurate identification of cancer categories is crucial to a successful diagnosis and a proper treatment of the disease. In most existing work, samples of gene expression data are treated as one-dimensional signals, and are analyzed by means of some statistical signal processing techniques or intelligent computation algorithms. In this paper, from an image-processing viewpoint, a spectral-feature-based Tikhonov-regularized least-squares (TLS) ensemble algorithm is proposed for cancer classification using gene expression data. In the TLS model, a test sample is represented as a linear combination of the atoms of a dictionary. Two types of dictionaries, namely singular value decomposition (SVD)-based eigenassays and independent component analysis (ICA)-based eigenassays, are proposed for the TLS model, and both are extracted via a two-stage approach. The proposed algorithm is inspired by our finding that, among these eigenassays, the categories of some of the testing samples can be assigned correctly by using the TLS models formed from some of the spectral features, but not for those formed from the original samples only. In order to retain the positive characteristics of these spectral features in making correct category assignments, a strategy of classifier committee learning (CCL) is designed to combine the results obtained from the different spectral features. Experimental results on standard databases demonstrate the feasibility and effectiveness of the proposed method.
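
    A minimal sketch of the representation step named above: a test sample is expressed as a linear combination of dictionary atoms through the closed-form Tikhonov (ridge) solution, and assigned to the class whose atoms reconstruct it with the smallest residual. The residual-based assignment is an SRC-style illustration under stated assumptions; the paper's spectral-feature dictionaries and classifier committee learning step are not reproduced here.

```python
import numpy as np

def tls_coefficients(D, y, lam=0.1):
    """Ridge (Tikhonov) solution a = argmin ||D a - y||^2 + lam * ||a||^2."""
    n_atoms = D.shape[1]
    return np.linalg.solve(D.T @ D + lam * np.eye(n_atoms), D.T @ y)

def classify_by_residual(D, atom_labels, y, lam=0.1):
    """Assign y to the class whose dictionary atoms reconstruct it best.

    D           : (n_features, n_atoms) dictionary of training samples/eigenassays
    atom_labels : (n_atoms,) class label of each atom
    """
    a = tls_coefficients(D, y, lam)
    best, best_res = None, np.inf
    for c in np.unique(atom_labels):
        mask = atom_labels == c
        res = np.linalg.norm(y - D[:, mask] @ a[mask])   # class-wise residual
        if res < best_res:
            best, best_res = c, res
    return best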

  11. Surrogate model based iterative ensemble smoother for subsurface flow data assimilation

    NASA Astrophysics Data System (ADS)

    Chang, Haibin; Liao, Qinzhuo; Zhang, Dongxiao

    2017-02-01

    Subsurface geological formation properties often involve some degree of uncertainty. Thus, for most conditions, uncertainty quantification and data assimilation are necessary for predicting subsurface flow. Surrogate-model-based methods are one common type of uncertainty quantification method, in which a surrogate model is constructed to approximate the relationship between model output and model input. Given its predictive ability, the constructed surrogate model can then be utilized for data assimilation. In this work, we develop an algorithm for implementing an iterative ensemble smoother (ES) using a surrogate model. We first derive an iterative ES scheme using a regular routine. In order to utilize surrogate models, we then borrow the idea of Chen and Oliver (2013) to modify the Hessian, and further develop an independent-parameter-based iterative ES formula. Finally, we establish the algorithm for the implementation of the iterative ES using surrogate models. Two surrogate models, the PCE surrogate and the interpolation surrogate, are introduced for illustration. The performance of the proposed algorithm is tested on synthetic cases. The results show that satisfactory data assimilation results can be obtained by using surrogate models that have sufficient accuracy.

  12. An Approach for Identifying Cytokines Based on a Novel Ensemble Classifier

    PubMed Central

    Zou, Quan; Wang, Zhen; Guan, Xinjun; Liu, Bin; Wu, Yunfeng; Lin, Ziyu

    2013-01-01

    It is biologically meaningful and important to identify cytokines and investigate their various functions and biochemical mechanisms. However, several issues remain, including the large scale of benchmark datasets, the serious imbalance of the data, and the discovery of new gene families. In this paper, we employ a machine learning approach based on a novel ensemble classifier to predict cytokines. We directly selected amino acid sequences as research objects. First, we pretreated the benchmark data accurately. Next, we analyzed the physicochemical properties and distribution of the whole amino acid sequences and then extracted a group of 120-dimensional (120D) valid features to represent the sequences. Third, in view of the serious imbalance in the benchmark datasets, we utilized a sampling approach based on the synthetic minority oversampling technique (SMOTE) algorithm and a K-means clustering undersampling algorithm to rebuild the training set. Finally, we built a library for dynamic selection and circulating combination based on clustering (LibD3C) and employed the new training set to realize cytokine classification. Experiments showed that the geometric mean of sensitivity and specificity obtained through our approach is as high as 93.3%, which proves that our approach is effective for identifying cytokines. PMID:24027761

  13. Towards the Operational Ensemble-based Data Assimilation System for the Wave Field at the National Weather Service

    NASA Astrophysics Data System (ADS)

    Flampouris, Stylianos; Penny, Steve; Alves, Henrique

    2017-04-01

    The National Centers for Environmental Prediction (NCEP) of the National Oceanic and Atmospheric Administration (NOAA) provides the operational wave forecast for the US National Weather Service (NWS). As part of continuous efforts to improve the forecast, NCEP is developing an ensemble-based data assimilation system built on the local ensemble transform Kalman filter (LETKF), the existing operational global wave ensemble system (GWES), and satellite and in-situ observations. While the LETKF was designed for atmospheric applications (Hunt et al., 2007) and has been adapted for several ocean models (e.g., Penny, 2016), this is the first time it has been applied to ocean wave assimilation. This new wave assimilation system provides a global estimate of the surface sea state and its approximate uncertainty. It achieves this by analyzing the 21-member ensemble of significant wave height provided by GWES every 6 h. Observations from four altimeters and all available in-situ measurements are used in this analysis. The analysis of significant wave height is used to initialize the next forecasting cycle; the data assimilation system is currently being tested for operational use.

  14. An efficient tree classifier ensemble-based approach for pedestrian detection.

    PubMed

    Xu, Yanwu; Cao, Xianbin; Qiao, Hong

    2011-02-01

    Classification-based pedestrian detection systems (PDSs) are currently a hot research topic in the field of intelligent transportation. A PDS detects pedestrians in real time on moving vehicles. A practical PDS demands not only high detection accuracy but also high detection speed. However, most of the existing classification-based approaches mainly seek high detection accuracy, while the detection speed is not purposely optimized for practical application. At the same time, the performance, particularly the speed, is primarily tuned based on experiments without theoretical foundations, leading to a long training procedure. This paper starts with measuring and optimizing detection speed, and then a practical classification-based pedestrian detection solution with high detection speed and training speed is described. First, an extended classification/detection speed metric, named feature-per-object (fpo), is proposed to measure the detection speed independently of execution. Then, an fpo minimization model with accuracy constraints is formulated based on a tree classifier ensemble, where the minimum fpo guarantees the highest detection speed. Finally, the minimization problem is solved efficiently by using nonlinear fitting based on radial basis function neural networks. In addition, the optimal solution is directly used to guide classifier training; thus, the training speed can be accelerated greatly. Therefore, a rapid and accurate classification-based detection technique is proposed for the PDS. Experimental results on urban traffic videos show that the proposed method has a high detection speed with an acceptable detection rate and false-alarm rate for onboard detection; moreover, the training procedure is also very fast.

  15. Multiple time step molecular dynamics in the optimized isokinetic ensemble steered with the molecular theory of solvation: Accelerating with advanced extrapolation of effective solvation forces

    NASA Astrophysics Data System (ADS)

    Omelyan, Igor; Kovalenko, Andriy

    2013-12-01

    We develop efficient handling of solvation forces in the multiscale method of multiple time step molecular dynamics (MTS-MD) of a biomolecule steered by the solvation free energy (effective solvation forces) obtained from the 3D-RISM-KH molecular theory of solvation (three-dimensional reference interaction site model complemented with the Kovalenko-Hirata closure approximation). To reduce the computational expenses, we calculate the effective solvation forces acting on the biomolecule by using advanced solvation force extrapolation (ASFE) at inner time steps while converging the 3D-RISM-KH integral equations only at large outer time steps. The idea of ASFE consists in developing a discrete non-Eckart rotational transformation of atomic coordinates that minimizes the distances between the atomic positions of the biomolecule at different time moments. The effective solvation forces for the biomolecule in a current conformation at an inner time step are then extrapolated in the transformed subspace of those at outer time steps by using a modified least square fit approach applied to a relatively small number of the best force-coordinate pairs. The latter are selected from an extended set collecting the effective solvation forces obtained from 3D-RISM-KH at outer time steps over a broad time interval. The MTS-MD integration with effective solvation forces obtained by converging 3D-RISM-KH at outer time steps and applying ASFE at inner time steps is stabilized by employing the optimized isokinetic Nosé-Hoover chain (OIN) ensemble. Compared to the previous extrapolation schemes used in combination with the Langevin thermostat, the ASFE approach substantially improves the accuracy of evaluation of effective solvation forces and in combination with the OIN thermostat enables a dramatic increase of outer time steps. We demonstrate on a fully flexible model of alanine dipeptide in aqueous solution that the MTS-MD/OIN/ASFE/3D-RISM-KH multiscale method of molecular dynamics

  16. Multiple time step molecular dynamics in the optimized isokinetic ensemble steered with the molecular theory of solvation: Accelerating with advanced extrapolation of effective solvation forces

    SciTech Connect

    Omelyan, Igor E-mail: omelyan@icmp.lviv.ua; Kovalenko, Andriy

    2013-12-28

    We develop efficient handling of solvation forces in the multiscale method of multiple time step molecular dynamics (MTS-MD) of a biomolecule steered by the solvation free energy (effective solvation forces) obtained from the 3D-RISM-KH molecular theory of solvation (three-dimensional reference interaction site model complemented with the Kovalenko-Hirata closure approximation). To reduce the computational expenses, we calculate the effective solvation forces acting on the biomolecule by using advanced solvation force extrapolation (ASFE) at inner time steps while converging the 3D-RISM-KH integral equations only at large outer time steps. The idea of ASFE consists in developing a discrete non-Eckart rotational transformation of atomic coordinates that minimizes the distances between the atomic positions of the biomolecule at different time moments. The effective solvation forces for the biomolecule in a current conformation at an inner time step are then extrapolated in the transformed subspace of those at outer time steps by using a modified least square fit approach applied to a relatively small number of the best force-coordinate pairs. The latter are selected from an extended set collecting the effective solvation forces obtained from 3D-RISM-KH at outer time steps over a broad time interval. The MTS-MD integration with effective solvation forces obtained by converging 3D-RISM-KH at outer time steps and applying ASFE at inner time steps is stabilized by employing the optimized isokinetic Nosé-Hoover chain (OIN) ensemble. Compared to the previous extrapolation schemes used in combination with the Langevin thermostat, the ASFE approach substantially improves the accuracy of evaluation of effective solvation forces and in combination with the OIN thermostat enables a dramatic increase of outer time steps. We demonstrate on a fully flexible model of alanine dipeptide in aqueous solution that the MTS-MD/OIN/ASFE/3D-RISM-KH multiscale method of molecular dynamics

  17. Plasmon-enhanced sensitivity of spin-based sensors based on a diamond ensemble of nitrogen vacancy color centers.

    PubMed

    Guo, Hao; Chen, Yulei; Wu, Dajin; Zhao, Rui; Tang, Jun; Ma, Zongmin; Xue, Chenyang; Zhang, Wendong; Liu, Jun

    2017-02-01

    A method for enhancing the sensitivity of a spin sensor based on an ensemble of nitrogen vacancy (NV) color centers was demonstrated. Gold nanoparticles (NPs) were deposited on a bulk diamond that had NV centers distributed on its surface. The experimental results demonstrate that, with this simple method, plasmon enhancement by the deposited gold NPs improves the quantum efficiency by a factor of ∼10 and the signal-to-noise ratio by a factor of ∼2.5. It was also shown that more electrons participate in the spin sensing process, leading to an improvement in sensitivity of approximately seven times; this was confirmed by Rabi oscillation and optically detected magnetic resonance (ODMR) measurements. The proposed method has proved to be a more efficient way to design sensors based on ensembles of NV centers, because the resulting increases in the number of contributing NV centers, the quantum efficiency, and the contrast ratio can greatly increase the device's sensitivity.

  18. A rapid, ensemble and free energy based method for engineering protein stabilities.

    PubMed

    Naganathan, Athi N

    2013-05-02

    Engineering the conformational stabilities of proteins through mutations has immense potential in biotechnological applications. It is, however, an inherently challenging problem given the weak noncovalent nature of the stabilizing interactions. In this regard, we present here a robust and fast strategy to engineer protein stabilities through mutations involving charged residues using a structure-based statistical mechanical model that accounts for the ensemble nature of folding. We validate the method by predicting the absolute changes in stability for 138 experimental mutations from 16 different proteins and enzymes with a correlation of 0.65 and, importantly, with a success rate of 81%. Multiple point mutants are predicted with a higher success rate (90%) that is validated further by comparing mesophile-thermophile protein pairs. In parallel, we devise a methodology to rapidly engineer mutations in silico, which we benchmark against experimental mutations of ubiquitin (correlation of 0.95) and test for feasibility on a larger therapeutic protein, DNase I. We expect the method to be of importance as a first and rapid step to screen for protein mutants with specific stability in the biotechnology industry, in the construction of stability maps at the residue level (i.e., hot spots), and as a robust tool to probe for mutations that enhance the stability of protein-based drugs.

  19. Seasonal drought ensemble predictions based on multiple climate models in the upper Han River Basin, China

    NASA Astrophysics Data System (ADS)

    Ma, Feng; Ye, Aizhong; Duan, Qingyun

    2017-03-01

    An experimental seasonal drought forecasting system is developed based on 29-year (1982-2010) seasonal meteorological hindcasts generated by the climate models from the North American Multi-Model Ensemble (NMME) project. The system makes use of a bias correction and spatial downscaling method and the distributed time-variant gain model (DTVGM) hydrologic model. DTVGM was calibrated using observed daily hydrological data, and its streamflow simulations achieved Nash-Sutcliffe efficiency values of 0.727 and 0.724 during the calibration (1978-1995) and validation (1996-2005) periods, respectively, at the Danjiangkou reservoir station. The experimental seasonal drought forecasting system (known as NMME-DTVGM) is used to generate seasonal drought forecasts, which are evaluated against reference forecasts (i.e., the persistence forecast and the climatological forecast). The NMME-DTVGM drought forecasts have higher detectability and accuracy and a lower false alarm rate than the reference forecasts at different lead times (from 1 to 4 months) during the cold-dry season. No apparent advantage is shown in drought predictions during the spring and summer seasons because of a long memory of the initial conditions in spring and a lower predictive skill for precipitation in summer. Overall, the NMME-based seasonal drought forecasting system has meaningful skill in predicting drought several months in advance, which can provide critical information for drought preparedness and response planning as well as the sustainable practice of water resource conservation over the basin.
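
    The Nash-Sutcliffe efficiency used above to judge the DTVGM streamflow simulations is a standard skill score. As a minimal sketch (with hypothetical observed and simulated arrays, not the study's data), it can be computed as follows in Python:

        import numpy as np

        def nash_sutcliffe_efficiency(observed, simulated):
            """NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2).

            A value of 1 indicates a perfect fit; values near 0 mean the simulation
            is no better than using the observed mean as a predictor.
            """
            observed = np.asarray(observed, dtype=float)
            simulated = np.asarray(simulated, dtype=float)
            residual_ss = np.sum((observed - simulated) ** 2)
            observed_ss = np.sum((observed - observed.mean()) ** 2)
            return 1.0 - residual_ss / observed_ss

        # Hypothetical daily streamflow series (m3/s), for illustration only.
        obs = np.array([120.0, 135.0, 150.0, 90.0, 80.0, 75.0])
        sim = np.array([118.0, 140.0, 145.0, 95.0, 78.0, 80.0])
        print(f"NSE = {nash_sutcliffe_efficiency(obs, sim):.3f}")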

  20. A single-ensemble clutter rejection method based on the analytic geometry for ultrasound color flow imaging.

    PubMed

    You, Wei; Wang, Yuanyuan

    2011-11-01

    In ultrasound color flow imaging (CFI), the single-ensemble eigen-based filters can reject clutter components using each slow-time ensemble individually. They have shown excellent spatial adaptability. This article proposes a novel clutter rejection method called the single-ensemble geometry filter (SGF), which is derived from an analytic geometry perspective. If the transmitted pulse number M equals two, the clutter component distribution on a two-dimensional (2-D) plane will be similar to a tilted ellipse. Therefore, the direction of the major axis of the ellipse can be used as the first principal component of the autocorrelation matrix estimated from multiple ensembles. Then the algorithm is generalized from 2-D to a higher dimensional space by using linear algebra representations of the ellipse. Comparisons have been made with the high-pass filter (HPF), the Hankel-singular value decomposition (SVD) filter and the recursive eigen-decomposition (RED) method using both simulated and human carotid data. Results show that compared with HPF and Hankel-SVD, the proposed filter causes less bias on the velocity estimation when the clutter velocity is close to that of the blood flow. On the other hand, the proposed filter does not need to update the autocorrelation matrix and can achieve better spatial adaptability than the RED. Copyright © 2011 World Federation for Ultrasound in Medicine & Biology. Published by Elsevier Inc. All rights reserved.
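
    The geometric picture described above, in which the dominant clutter component lies along the major axis of the ellipse traced by the slow-time samples (i.e. along the leading eigenvector of the autocorrelation matrix), can be illustrated with a small synthetic sketch. This is a generic eigen-based projection on made-up data, not the authors' single-ensemble implementation:

        import numpy as np

        rng = np.random.default_rng(0)

        # Synthetic slow-time ensembles (M = 2 pulses): strong, slowly varying clutter
        # plus a weak blood-flow component, stacked as rows of a (samples x M) matrix.
        n_samples, M = 500, 2
        clutter = rng.normal(scale=10.0, size=(n_samples, 1)) * np.array([[1.0, 0.95]])
        blood = rng.normal(scale=1.0, size=(n_samples, M))
        ensembles = clutter + blood

        # Autocorrelation matrix estimated from multiple ensembles; its leading
        # eigenvector points along the major axis of the "tilted ellipse".
        R = ensembles.T @ ensembles / n_samples
        eigvals, eigvecs = np.linalg.eigh(R)
        clutter_dir = eigvecs[:, -1]          # eigenvector with the largest eigenvalue

        # Reject clutter by projecting each ensemble onto the subspace orthogonal
        # to the clutter direction (the essence of eigen-based clutter filtering).
        projector = np.eye(M) - np.outer(clutter_dir, clutter_dir)
        filtered = ensembles @ projector
        print("clutter direction:", np.round(clutter_dir, 3))
        print("mean power before/after filtering:",
              round(float(np.mean(ensembles ** 2)), 2),
              round(float(np.mean(filtered ** 2)), 2))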

  1. MRF-Based Spatial Expert Tracking of the Multi-Model Ensemble

    NASA Astrophysics Data System (ADS)

    McQuade, S.; Monteleoni, C.

    2013-12-01

    We consider the problem of adaptively combining the 'multi-model ensemble' of General Circulation Models (GCMs) that inform the Intergovernmental Panel on Climate Change (IPCC), drawn from major laboratories around the world. This problem can be treated as an expert tracking problem in the online setting, where an algorithm maintains a set of weights over the experts (here the GCMs are the experts). At each time interval these weights are used to make a combined projection, and then the weights can be updated based on the performance of experts. In this work we focus on tracking the GCMs at different geographic locations and effectively incorporating spatial influence and correlations between these locations. We approach this multi-model ensemble problem using a pairwise Markov Random Field (MRF), where the state of each hidden variable is the identity of the best GCM at a specific location. Our MRF takes the form of a lattice over the Earth, with links between neighboring locations. To establish reasonable energy functions for the MRF, we first show that the Fixed-Share algorithm for expert tracking over time can be expressed as a simple MRF. By expressing Fixed-Share as an MRF, we identify the energy function that corresponds to the switching dynamics (how the best expert switches over time). Since an MRF is an undirected graph, this 'switching' energy function can be naturally applied to spatial links between variables as well. To calculate the marginal probabilities of the hidden variables (i.e. our new beliefs over GCMs), we apply Loopy Belief Propagation (LBP) to the MRF. In LBP, each node sends messages to neighboring nodes about the sender's 'belief' of the neighbor's state. Figure 1 shows our initial results from an online evaluation of GCM temperature hindcasts from the IPCC Phase 3 Coupled Model Intercomparison Project (CMIP3) archive. The red line shows the mean loss of our method versus the spatial switching rate. The right-most point on the graph
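
    The Fixed-Share expert-tracking update that the authors re-express as an MRF has a compact multiplicative form. A minimal sketch of the temporal (non-spatial) version, with a generic loss vector and a hypothetical switching rate, is:

        import numpy as np

        def fixed_share_update(weights, losses, eta=1.0, alpha=0.05):
            """One Fixed-Share step for tracking the best expert (here, the best GCM).

            weights : current probability vector over experts
            losses  : per-expert loss on the latest observation
            eta     : learning rate; alpha : switching rate
            """
            weights = np.asarray(weights, dtype=float)
            losses = np.asarray(losses, dtype=float)

            # Loss update: exponentially downweight experts that predicted poorly.
            v = weights * np.exp(-eta * losses)
            v /= v.sum()

            # Share update: redistribute a fraction alpha of each expert's weight,
            # which allows the identity of the best expert to switch over time.
            n = len(v)
            return (1.0 - alpha) * v + alpha * (v.sum() - v) / (n - 1)

        # Hypothetical example: three GCMs, the third fits the latest observation best.
        w = np.array([1 / 3, 1 / 3, 1 / 3])
        w = fixed_share_update(w, losses=np.array([0.9, 0.7, 0.1]))
        print(np.round(w, 3))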

  2. Development of web-based services for an ensemble flood forecasting and risk assessment system

    NASA Astrophysics Data System (ADS)

    Yaw Manful, Desmond; He, Yi; Cloke, Hannah; Pappenberger, Florian; Li, Zhijia; Wetterhall, Fredrik; Huang, Yingchun; Hu, Yuzhong

    2010-05-01

    Flooding is a widespread and devastating natural disaster worldwide. Floods that took place in the last decade in China were ranked the worst amongst recorded floods worldwide in terms of the number of human fatalities and economic losses (Munich Re-Insurance). Rapid economic development and population expansion into low-lying flood plains has worsened the situation. Current conventional flood prediction systems in China are suited neither to the perceptible climate variability nor to the rapid pace of urbanization sweeping the country. Flood prediction, from short-term (a few hours) to medium-term (a few days), needs to be revisited and adapted to changing socio-economic and hydro-climatic realities. The latest technology requires implementation of multiple numerical weather prediction systems. The availability of twelve global ensemble weather prediction systems through the 'THORPEX Interactive Grand Global Ensemble' (TIGGE) offers a good opportunity for an effective state-of-the-art early forecasting system. A prototype of a Novel Flood Early Warning System (NEWS) using the TIGGE database is tested in the Huai River basin in east-central China. It is the first early flood warning system in China that uses the massive TIGGE database cascaded with river catchment models, the Xinanjiang hydrologic model and a 1-D hydraulic model, to predict river discharge and flood inundation. The NEWS algorithm is also designed to provide web-based services to a broad spectrum of end-users. The latter presents challenges, as the databases and proprietary codes reside in different locations and converge at dissimilar times. NEWS will thus make use of a ready-to-run grid system that makes distributed computing and data resources available in a seamless and secure way. The ability to run on different operating systems and to provide a front end accessible to a broad spectrum of end-users is an additional requirement. The aim is to achieve robust interoperability

  3. Effective Visualization of Temporal Ensembles.

    PubMed

    Hao, Lihua; Healey, Christopher G; Bass, Steffen A

    2016-01-01

    An ensemble is a collection of related datasets, called members, built from a series of runs of a simulation or an experiment. Ensembles are large, temporal, multidimensional, and multivariate, making them difficult to analyze. Another important challenge is visualizing ensembles that vary both in space and time. Initial visualization techniques displayed ensembles with a small number of members, or presented an overview of an entire ensemble, but without potentially important details. Recently, researchers have suggested combining these two directions, allowing users to choose subsets of members to visualize. This manual selection process places the burden on the user to identify which members to explore. We first introduce a static ensemble visualization system that automatically helps users locate interesting subsets of members to visualize. We next extend the system to support analysis and visualization of temporal ensembles. We employ 3D shape comparison, cluster tree visualization, and glyph-based visualization to represent different levels of detail within an ensemble. This strategy is used to provide two approaches for temporal ensemble analysis: (1) segment-based ensemble analysis, which captures important shape transition time-steps, clusters groups of similar members, and identifies common shape changes over time across multiple members; and (2) time-step-based ensemble analysis, which assumes ensemble members are aligned in time and combines similar shapes at common time-steps. Both approaches enable users to interactively visualize and analyze a temporal ensemble from different perspectives at different levels of detail. We demonstrate our techniques on an ensemble studying matter transition from hadronic gas to quark-gluon plasma during gold-on-gold particle collisions.

  4. An Efficient Data-worth Analysis Framework via Probabilistic Collocation Method Based Ensemble Kalman Filter

    NASA Astrophysics Data System (ADS)

    Xue, L.; Dai, C.; Zhang, D.; Guadagnini, A.

    2015-12-01

    It is critical to predict the contaminant plume in an aquifer under uncertainty, which can help assess environmental risk and design rational management strategies. An accurate prediction of the contaminant plume requires the collection of data to help characterize the system. Because financial resources are limited, one should estimate the expected worth of the data collected under each candidate monitoring scheme before it is carried out. Data-worth analysis is an effective approach to quantify the value of data: it measures the uncertainty reduction achieved under the assumption that the plausible data have been collected. However, it is difficult to apply data-worth analysis to a dynamic contaminant transport model owing to the large number of inverse-modeling runs it requires. In this study, a novel, efficient data-worth analysis framework is proposed by developing the Probabilistic Collocation Method based Ensemble Kalman Filter (PCKF). The PCKF constructs a polynomial chaos expansion surrogate model to replace the original complex numerical model. Consequently, the inverse modeling can be performed on the proxy rather than on the original model. An illustrative example, considering the dynamic change of the contaminant concentration, is employed to demonstrate the proposed approach. The results reveal that schemes with different sampling frequencies, monitoring network locations, and prior data content have a significant impact on the uncertainty reduction of the contaminant plume estimate. The proposed framework is shown to provide reasonable estimates of the worth of data from the various schemes.
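
    The ensemble Kalman filter analysis step on which such frameworks rest follows a standard formula. The sketch below shows a generic stochastic-EnKF update with a linear observation operator on toy data; it is not tied to the authors' polynomial chaos surrogate:

        import numpy as np

        def enkf_update(ensemble, obs, obs_error_std, H):
            """Stochastic ensemble Kalman filter analysis step.

            ensemble      : (n_state, n_members) forecast ensemble
            obs           : (n_obs,) observation vector
            obs_error_std : observation error standard deviation
            H             : (n_obs, n_state) linear observation operator
            """
            rng = np.random.default_rng(42)
            n_state, n_members = ensemble.shape
            n_obs = len(obs)

            X = ensemble - ensemble.mean(axis=1, keepdims=True)   # state anomalies
            Y = H @ X                                              # observation-space anomalies
            R = (obs_error_std ** 2) * np.eye(n_obs)

            # Kalman gain K = P H^T (H P H^T + R)^-1, with covariances from the ensemble.
            K = (X @ Y.T) @ np.linalg.inv(Y @ Y.T + (n_members - 1) * R)

            # Perturbed observations, one realization per member.
            obs_perturbed = obs[:, None] + rng.normal(scale=obs_error_std,
                                                      size=(n_obs, n_members))
            return ensemble + K @ (obs_perturbed - H @ ensemble)

        # Hypothetical toy problem: 3 state variables, the first one observed.
        rng = np.random.default_rng(0)
        forecast = rng.normal(loc=5.0, scale=2.0, size=(3, 50))
        H = np.array([[1.0, 0.0, 0.0]])
        analysis = enkf_update(forecast, obs=np.array([4.2]), obs_error_std=0.5, H=H)
        print("analysis mean:", np.round(analysis.mean(axis=1), 2))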

  5. Validation and Parameter Sensitivity Tests for Reconstructing Swell Field Based on an Ensemble Kalman Filter.

    PubMed

    Wang, Xuan; Tandeo, Pierre; Fablet, Ronan; Husson, Romain; Guan, Lei; Chen, Ge

    2016-11-25

    The swell propagation model built on geometric optics is known to work well when simulating radiated swells from a far-located storm. Based on this simple approximation, satellites have acquired plenty of large samples of basin-traversing swells induced by fierce storms situated in mid-latitudes. How to routinely reconstruct swell fields with these irregularly sampled observations from space via the known swell propagation principle requires more examination. In this study, we apply 3-h interval pseudo SAR observations in the ensemble Kalman filter (EnKF) to reconstruct a swell field in an ocean basin, and compare it with buoy swell partitions and polynomial regression results. As validated against in situ measurements, EnKF works well in terms of spatial-temporal consistency in far-field swell propagation scenarios. Using this framework, we further address the influence of EnKF parameters, and perform a sensitivity analysis to evaluate estimations made under different sets of parameters. Such analysis is of key interest with respect to future multiple-source routinely recorded swell field data. Satellite-derived swell data can serve as a valuable complementary dataset to in situ or wave re-analysis datasets.

  6. Validation and Parameter Sensitivity Tests for Reconstructing Swell Field Based on an Ensemble Kalman Filter

    PubMed Central

    Wang, Xuan; Tandeo, Pierre; Fablet, Ronan; Husson, Romain; Guan, Lei; Chen, Ge

    2016-01-01

    The swell propagation model built on geometric optics is known to work well when simulating radiated swells from a far-located storm. Based on this simple approximation, satellites have acquired plenty of large samples of basin-traversing swells induced by fierce storms situated in mid-latitudes. How to routinely reconstruct swell fields with these irregularly sampled observations from space via the known swell propagation principle requires more examination. In this study, we apply 3-h interval pseudo SAR observations in the ensemble Kalman filter (EnKF) to reconstruct a swell field in an ocean basin, and compare it with buoy swell partitions and polynomial regression results. As validated against in situ measurements, EnKF works well in terms of spatial–temporal consistency in far-field swell propagation scenarios. Using this framework, we further address the influence of EnKF parameters, and perform a sensitivity analysis to evaluate estimations made under different sets of parameters. Such analysis is of key interest with respect to future multiple-source routinely recorded swell field data. Satellite-derived swell data can serve as a valuable complementary dataset to in situ or wave re-analysis datasets. PMID:27898005

  7. A global perspective of the limits of prediction skill based on the ECMWF ensemble

    NASA Astrophysics Data System (ADS)

    Zagar, Nedjeljka

    2016-04-01

    This talk presents a new model of global forecast error growth, applied to the forecast errors simulated by the ensemble prediction system (ENS) of the ECMWF. The proxy for forecast errors is the total spread of the ECMWF operational ensemble forecasts obtained by the decomposition of the wind and geopotential fields in the normal-mode functions. In this way, the ensemble spread can be quantified separately for the balanced and inertio-gravity (IG) modes for every forecast range. Ensemble reliability is defined for the balanced and IG modes by comparing the ensemble spread with the control analysis in each scale. The results show that initial uncertainties in the ECMWF ENS are largest in the tropical large-scale modes and their spatial distribution is similar to the distribution of the short-range forecast errors. Initially the ensemble spread grows most in the smallest scales and in the synoptic range of the IG modes, but the overall growth is dominated by the increase of spread in balanced modes in synoptic and planetary scales in the midlatitudes. During the forecasts, the distribution of spread in the balanced and IG modes grows towards the climatological spread distribution characteristic of the analyses. The ENS system is found to be somewhat under-dispersive, which is associated with the lack of tropical variability, primarily the Kelvin waves. The new model of forecast error growth has three fitting parameters that parameterize the initial fast growth and a slower exponential error growth later on. The asymptotic values of forecast errors are independent of the exponential growth rate. It is found that the errors due to unbalanced dynamics saturate in around 10 days, while the balanced and total errors saturate in 3 to 4 weeks. Reference: Žagar, N., R. Buizza, and J. Tribbia, 2015: A three-dimensional multivariate modal analysis of atmospheric predictability with application to the ECMWF ensemble. J. Atmos. Sci., 72, 4423-4444.
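
    The abstract does not give the functional form of the three-parameter error-growth model, so the sketch below fits a generic three-parameter logistic curve (initial error, asymptotic error, growth rate) to a hypothetical spread series, purely to illustrate how such a fit can be done:

        import numpy as np
        from scipy.optimize import curve_fit

        def error_growth(t, e0, e_inf, rate):
            """Generic logistic growth from initial error e0 toward the asymptotic
            (saturation) error e_inf; not necessarily the authors' exact form."""
            return e_inf / (1.0 + (e_inf / e0 - 1.0) * np.exp(-rate * t))

        # Hypothetical ensemble-spread proxy for forecast error (arbitrary units)
        # at daily forecast ranges, for illustration only.
        t_days = np.arange(0, 15)
        spread = np.array([0.5, 0.9, 1.5, 2.3, 3.2, 4.0, 4.8, 5.4, 5.9, 6.2,
                           6.5, 6.7, 6.8, 6.9, 6.95])
        (e0, e_inf, rate), _ = curve_fit(error_growth, t_days, spread, p0=(0.5, 7.0, 0.5))
        print(f"initial error {e0:.2f}, asymptotic error {e_inf:.2f}, growth rate {rate:.2f}/day")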

  8. Evaluating the effect of disturbed ensemble distributions on SCFG based statistical sampling of RNA secondary structures

    PubMed Central

    2012-01-01

    Background Over the past years, statistical and Bayesian approaches have become increasingly appreciated to address the long-standing problem of computational RNA structure prediction. Recently, a novel probabilistic method for the prediction of RNA secondary structures from a single sequence has been studied which is based on generating statistically representative and reproducible samples of the entire ensemble of feasible structures for a particular input sequence. This method samples the possible foldings from a distribution implied by a sophisticated (traditional or length-dependent) stochastic context-free grammar (SCFG) that mirrors the standard thermodynamic model applied in modern physics-based prediction algorithms. Specifically, that grammar represents an exact probabilistic counterpart to the energy model underlying the Sfold software, which employs a sampling extension of the partition function (PF) approach to produce statistically representative subsets of the Boltzmann-weighted ensemble. Although both sampling approaches have the same worst-case time and space complexities, it has been indicated that they differ in performance (both with respect to prediction accuracy and quality of generated samples), where neither of these two competing approaches generally outperforms the other. Results In this work, we will consider the SCFG based approach in order to perform an analysis on how the quality of generated sample sets and the corresponding prediction accuracy changes when different degrees of disturbances are incorporated into the needed sampling probabilities. This is motivated by the fact that if the results prove to be resistant to large errors on the distinct sampling probabilities (compared to the exact ones), then it will be an indication that these probabilities do not need to be computed exactly, but it may be sufficient and more efficient to approximate them. Thus, it might then be possible to decrease the worst-case time requirements of

  9. Evaluating the effect of disturbed ensemble distributions on SCFG based statistical sampling of RNA secondary structures.

    PubMed

    Scheid, Anika; Nebel, Markus E

    2012-07-09

    Over the past years, statistical and Bayesian approaches have become increasingly appreciated to address the long-standing problem of computational RNA structure prediction. Recently, a novel probabilistic method for the prediction of RNA secondary structures from a single sequence has been studied which is based on generating statistically representative and reproducible samples of the entire ensemble of feasible structures for a particular input sequence. This method samples the possible foldings from a distribution implied by a sophisticated (traditional or length-dependent) stochastic context-free grammar (SCFG) that mirrors the standard thermodynamic model applied in modern physics-based prediction algorithms. Specifically, that grammar represents an exact probabilistic counterpart to the energy model underlying the Sfold software, which employs a sampling extension of the partition function (PF) approach to produce statistically representative subsets of the Boltzmann-weighted ensemble. Although both sampling approaches have the same worst-case time and space complexities, it has been indicated that they differ in performance (both with respect to prediction accuracy and quality of generated samples), where neither of these two competing approaches generally outperforms the other. In this work, we will consider the SCFG based approach in order to perform an analysis on how the quality of generated sample sets and the corresponding prediction accuracy changes when different degrees of disturbances are incorporated into the needed sampling probabilities. This is motivated by the fact that if the results prove to be resistant to large errors on the distinct sampling probabilities (compared to the exact ones), then it will be an indication that these probabilities do not need to be computed exactly, but it may be sufficient and more efficient to approximate them. Thus, it might then be possible to decrease the worst-case time requirements of such an SCFG based

  10. A demonstration of ensemble-based assimilation methods with a layered OGCM from the perspective of operational ocean forecasting systems

    NASA Astrophysics Data System (ADS)

    Brusdal, K.; Brankart, J. M.; Halberstadt, G.; Evensen, G.; Brasseur, P.; van Leeuwen, P. J.; Dombrowsky, E.; Verron, J.

    2003-04-01

    A demonstration study of three advanced, sequential data assimilation methods, applied with the nonlinear Miami Isopycnic Coordinate Ocean Model (MICOM), has been performed within the European Commission-funded DIADEM project. The data assimilation techniques considered are the Ensemble Kalman Filter (EnKF), the Ensemble Kalman Smoother (EnKS) and the Singular Evolutive Extended Kalman (SEEK) Filter, which all in different ways resemble the original Kalman Filter. In the EnKF and EnKS an ensemble of model states is integrated forward in time according to the model dynamics, and the statistical moments needed at analysis time are calculated from the ensemble of model states. The EnKS, as opposed to the EnKF, also updates the analysis backward in time whenever new observations are available, thereby improving the estimated states at the previous analysis times. The SEEK filter reduces the computational burden of the error propagation by representing the errors in a subspace which is initially calculated from a truncated EOF analysis. A hindcast experiment, where sea-level anomaly and sea-surface temperature data are assimilated, has been conducted in the North Atlantic for the time period July until September 1996. In this paper, we describe the implementation of ensemble-based assimilation methods within a common theoretical framework, we present results from hindcast experiments obtained with the EnKF, EnKS and SEEK filter, and we discuss the relative merits of these methods from the perspective of operational marine monitoring and forecasting systems. We found that the three systems have similar performances, and they can be considered technologically feasible for building preoperational prototypes.

  11. Determining optimal clothing ensembles based on weather forecasts, with particular reference to outdoor winter military activities

    NASA Astrophysics Data System (ADS)

    Morabito, Marco; Pavlinic, Daniela Z.; Crisci, Alfonso; Capecchi, Valerio; Orlandini, Simone; Mekjavic, Igor B.

    2011-07-01

    Military and civil defense personnel are often involved in complex activities in a variety of outdoor environments. The choice of appropriate clothing ensembles represents an important strategy to establish the success of a military mission. The main aim of this study was to compare the known clothing insulation of the garment ensembles worn by soldiers during two winter outdoor field trials (hike and guard duty) with the estimated optimal clothing thermal insulations recommended to maintain thermoneutrality, assessed by using two different biometeorological procedures. The overall aim was to assess the applicability of such biometeorological procedures to weather forecast systems, thereby developing a comprehensive biometeorological tool for military operational forecast purposes. Military trials were carried out during winter 2006 in Pokljuka (Slovenia) by Slovene Armed Forces personnel. Gastrointestinal temperature, heart rate and environmental parameters were measured with portable data acquisition systems. The thermal characteristics of the clothing ensembles worn by the soldiers, namely thermal resistance, were determined with a sweating thermal manikin. Results showed that the clothing ensemble worn by the military was appropriate during guard duty but generally inappropriate during the hike. A general under-estimation of the biometeorological forecast model in predicting the optimal clothing insulation value was observed and an additional post-processing calibration might further improve forecast accuracy. This study represents the first step in the development of a comprehensive personalized biometeorological forecast system aimed at improving recommendations regarding the optimal thermal insulation of military garment ensembles for winter activities.

  12. A new ensemble-based consistency test for the Community Earth System Model

    NASA Astrophysics Data System (ADS)

    Baker, A. H.; Hammerling, D. M.; Levy, M. N.; Xu, H.; Dennis, J. M.; Eaton, B. E.; Edwards, J.; Hannay, C.; Mickelson, S. A.; Neale, R. B.; Nychka, D.; Shollenberger, J.; Tribbia, J.; Vertenstein, M.; Williamson, D.

    2015-05-01

    Climate simulation codes, such as the Community Earth System Model (CESM), are especially complex and continually evolving. Their ongoing state of development requires frequent software verification in the form of quality assurance both to preserve the quality of the code and to instill model confidence. To formalize and simplify this previously subjective and computationally expensive aspect of the verification process, we have developed a new tool for evaluating climate consistency. Because an ensemble of simulations allows us to gauge the natural variability of the model's climate, our new tool uses an ensemble approach for consistency testing. In particular, an ensemble of CESM climate runs is created, from which we obtain a statistical distribution that can be used to determine whether a new climate run is statistically distinguishable from the original ensemble. The CESM Ensemble Consistency Test, referred to as CESM-ECT, is objective in nature and accessible to CESM developers and users. The tool has proven its utility in detecting errors in software and hardware environments and providing rapid feedback to model developers.

  13. Determining optimal clothing ensembles based on weather forecasts, with particular reference to outdoor winter military activities.

    PubMed

    Morabito, Marco; Pavlinic, Daniela Z; Crisci, Alfonso; Capecchi, Valerio; Orlandini, Simone; Mekjavic, Igor B

    2011-07-01

    Military and civil defense personnel are often involved in complex activities in a variety of outdoor environments. The choice of appropriate clothing ensembles represents an important strategy to establish the success of a military mission. The main aim of this study was to compare the known clothing insulation of the garment ensembles worn by soldiers during two winter outdoor field trials (hike and guard duty) with the estimated optimal clothing thermal insulations recommended to maintain thermoneutrality, assessed by using two different biometeorological procedures. The overall aim was to assess the applicability of such biometeorological procedures to weather forecast systems, thereby developing a comprehensive biometeorological tool for military operational forecast purposes. Military trials were carried out during winter 2006 in Pokljuka (Slovenia) by Slovene Armed Forces personnel. Gastrointestinal temperature, heart rate and environmental parameters were measured with portable data acquisition systems. The thermal characteristics of the clothing ensembles worn by the soldiers, namely thermal resistance, were determined with a sweating thermal manikin. Results showed that the clothing ensemble worn by the military was appropriate during guard duty but generally inappropriate during the hike. A general under-estimation of the biometeorological forecast model in predicting the optimal clothing insulation value was observed and an additional post-processing calibration might further improve forecast accuracy. This study represents the first step in the development of a comprehensive personalized biometeorological forecast system aimed at improving recommendations regarding the optimal thermal insulation of military garment ensembles for winter activities.

  14. Constructing prediction interval for artificial neural network rainfall runoff models based on ensemble simulations

    NASA Astrophysics Data System (ADS)

    Kasiviswanathan, K. S.; Cibin, R.; Sudheer, K. P.; Chaubey, I.

    2013-08-01

    This paper presents a method of constructing prediction intervals for artificial neural network (ANN) rainfall runoff models during calibration, with a consideration of generating ensemble predictions. A two-stage optimization procedure is envisaged in this study for construction of the prediction interval for the ANN output. In Stage 1, the ANN model is trained with a genetic algorithm (GA) to obtain the optimal set of weights and biases. In Stage 2, the possible variability of the ANN parameters (obtained in Stage 1) is optimized so as to create an ensemble of models, minimizing the residual variance of the ensemble mean while ensuring that a maximum of the measured data fall within the estimated prediction interval. The width of the prediction interval is also minimized simultaneously. The method is demonstrated using a real-world case study of rainfall runoff data for an Indian basin. The method was able to produce ensembles with an average prediction interval width of 26.49 m3/s, with 97.17% of the total observed data points lying within the interval in validation. One specific advantage of the method is that, when the ensemble mean is taken as the forecast, peak flows are predicted with improved accuracy compared to traditional single-point ANN forecasts.
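
    The two validation quantities quoted above, the average width of the prediction interval and the percentage of observations falling inside it, are straightforward to compute from an ensemble of model outputs. A minimal sketch on hypothetical arrays is:

        import numpy as np

        def interval_metrics(ensemble_preds, observed, lower_q=2.5, upper_q=97.5):
            """Average width and coverage of a prediction interval built from an ensemble.

            ensemble_preds : (n_members, n_times) predictions from the model ensemble
            observed       : (n_times,) measured values
            """
            lower = np.percentile(ensemble_preds, lower_q, axis=0)
            upper = np.percentile(ensemble_preds, upper_q, axis=0)
            avg_width = float(np.mean(upper - lower))
            coverage = float(np.mean((observed >= lower) & (observed <= upper)) * 100.0)
            return avg_width, coverage

        # Hypothetical ensemble of streamflow forecasts (m3/s) from perturbed ANN weights.
        rng = np.random.default_rng(1)
        obs = rng.gamma(shape=2.0, scale=50.0, size=200)
        ens = obs[None, :] + rng.normal(scale=15.0, size=(40, 200))
        width, cov = interval_metrics(ens, obs)
        print(f"average interval width = {width:.1f} m3/s, coverage = {cov:.1f}%")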

  15. Accelerating Monte Carlo molecular simulations by reweighting and reconstructing Markov chains: Extrapolation of canonical ensemble averages and second derivatives to different temperature and density conditions

    SciTech Connect

    Kadoura, Ahmad; Sun, Shuyu; Salama, Amgad

    2014-08-01

    Accurate determination of thermodynamic properties of petroleum reservoir fluids is of great interest to many applications, especially in petroleum engineering and chemical engineering. Molecular simulation has many appealing features, especially its requirement of fewer tuned parameters yet better predictive capability; however, it is well known that molecular simulation is very CPU expensive compared to equation of state approaches. We have recently introduced an efficient thermodynamically consistent technique to rapidly regenerate Monte Carlo Markov Chains (MCMCs) at different thermodynamic conditions from existing data points that have been pre-computed with expensive classical simulation. This technique can speed up the simulation more than a million times, making the regenerated molecular simulation almost as fast as equation of state approaches. In this paper, this technique is first briefly reviewed and then numerically investigated in its capability of predicting ensemble averages of primary quantities at thermodynamic conditions neighboring the original simulated MCMCs. Moreover, this extrapolation technique is extended to predict second derivative properties (e.g. heat capacity and fluid compressibility). The method works by reweighting and reconstructing generated MCMCs in the canonical ensemble for Lennard-Jones particles. In this paper, the system's potential energy, pressure, isochoric heat capacity and isothermal compressibility were extrapolated along isochors, isotherms and paths of changing temperature and density from the original simulated points. Finally, an optimized set of Lennard-Jones parameters (ε, σ) for single-site models was proposed for methane, nitrogen and carbon monoxide.
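
    The reweighting idea at the core of the technique, namely re-using configurations sampled at one temperature to estimate canonical averages at a neighboring temperature, follows the standard Boltzmann reweighting identity with per-configuration weights proportional to exp[-(beta' - beta) U]. A generic sketch on made-up data (not the authors' full reconstruction scheme, and ignoring density changes) is:

        import numpy as np

        def reweight_average(quantity, potential_energy, beta_old, beta_new):
            """Estimate a canonical average at beta_new from samples drawn at beta_old.

            Boltzmann reweighting: w_i proportional to exp(-(beta_new - beta_old) * U_i).
            Reliable only for neighboring conditions where the weights do not degenerate.
            """
            quantity = np.asarray(quantity, dtype=float)
            U = np.asarray(potential_energy, dtype=float)
            log_w = -(beta_new - beta_old) * U
            log_w -= log_w.max()                 # guard against numerical overflow
            w = np.exp(log_w)
            return float(np.sum(w * quantity) / np.sum(w))

        # Hypothetical pre-computed Markov chain data: per-configuration potential
        # energies and pressures sampled at reduced temperature T* = 1.0.
        rng = np.random.default_rng(2)
        U = rng.normal(loc=-500.0, scale=20.0, size=10_000)
        P = -0.002 * U + rng.normal(scale=0.05, size=U.size)
        print("pressure at T* = 0.95:", round(reweight_average(P, U, 1.0, 1.0 / 0.95), 3))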

  16. Accelerating Monte Carlo molecular simulations by reweighting and reconstructing Markov chains: Extrapolation of canonical ensemble averages and second derivatives to different temperature and density conditions

    NASA Astrophysics Data System (ADS)

    Kadoura, Ahmad; Sun, Shuyu; Salama, Amgad

    2014-08-01

    Accurate determination of thermodynamic properties of petroleum reservoir fluids is of great interest to many applications, especially in petroleum engineering and chemical engineering. Molecular simulation has many appealing features, especially its requirement of fewer tuned parameters yet better predictive capability; however, it is well known that molecular simulation is very CPU expensive compared to equation of state approaches. We have recently introduced an efficient thermodynamically consistent technique to rapidly regenerate Monte Carlo Markov Chains (MCMCs) at different thermodynamic conditions from existing data points that have been pre-computed with expensive classical simulation. This technique can speed up the simulation more than a million times, making the regenerated molecular simulation almost as fast as equation of state approaches. In this paper, this technique is first briefly reviewed and then numerically investigated in its capability of predicting ensemble averages of primary quantities at thermodynamic conditions neighboring the original simulated MCMCs. Moreover, this extrapolation technique is extended to predict second derivative properties (e.g. heat capacity and fluid compressibility). The method works by reweighting and reconstructing generated MCMCs in the canonical ensemble for Lennard-Jones particles. In this paper, the system's potential energy, pressure, isochoric heat capacity and isothermal compressibility were extrapolated along isochors, isotherms and paths of changing temperature and density from the original simulated points. Finally, an optimized set of Lennard-Jones parameters (ε, σ) for single-site models was proposed for methane, nitrogen and carbon monoxide.

  17. The free energy of the metastable supersaturated vapor via restricted ensemble simulations. II. Effects of constraints and comparison with molecular dynamics simulations.

    PubMed

    Nie, Chu; Geng, Jun; Marlow, W H

    2008-06-21

    Extensive restricted canonical ensemble Monte Carlo simulations [D. S. Corti and P. Debenedetti, Chem. Eng. Sci. 49, 2717 (1994)] were performed. Pressure, excess chemical potential, and excess free energy with respect to ideal gas data were obtained at different densities of the supersaturated Lennard-Jones (LJ) vapor at reduced temperatures from 0.7 to 1.0. Among different constraints imposed on the system studied, the one with the local minimum of the excess free energy was taken to be the approximated equilibrium state of the metastable LJ vapor. Also, a comparison of our results with molecular dynamic simulations [A. Linhart et al., J. Chem. Phys. 122, 144506 (2005)] was made.

  18. The free energy of the metastable supersaturated vapor via restricted ensemble simulations. II. Effects of constraints and comparison with molecular dynamics simulations

    NASA Astrophysics Data System (ADS)

    Nie, Chu; Geng, Jun; Marlow, W. H.

    2008-06-01

    Extensive restricted canonical ensemble Monte Carlo simulations [D. S. Corti and P. Debenedetti, Chem. Eng. Sci. 49, 2717 (1994)] were performed. Pressure, excess chemical potential, and excess free energy with respect to ideal gas data were obtained at different densities of the supersaturated Lennard-Jones (LJ) vapor at reduced temperatures from 0.7 to 1.0. Among different constraints imposed on the system studied, the one with the local minimum of the excess free energy was taken to be the approximated equilibrium state of the metastable LJ vapor. Also, a comparison of our results with molecular dynamic simulations [A. Linhart et al., J. Chem. Phys. 122, 144506 (2005)] was made.

  19. Ensemble Tractography

    PubMed Central

    Wandell, Brian A.

    2016-01-01

    Tractography uses diffusion MRI to estimate the trajectory and cortical projection zones of white matter fascicles in the living human brain. There are many different tractography algorithms and each requires the user to set several parameters, such as curvature threshold. Choosing a single algorithm with specific parameters poses two challenges. First, different algorithms and parameter values produce different results. Second, the optimal choice of algorithm and parameter value may differ between different white matter regions or different fascicles, subjects, and acquisition parameters. We propose using ensemble methods to reduce algorithm and parameter dependencies. To do so we separate the processes of fascicle generation and evaluation. Specifically, we analyze the value of creating optimized connectomes by systematically combining candidate streamlines from an ensemble of algorithms (deterministic and probabilistic) and systematically varying parameters (curvature and stopping criterion). The ensemble approach leads to optimized connectomes that provide better cross-validated prediction error of the diffusion MRI data than optimized connectomes generated using a single-algorithm or parameter set. Furthermore, the ensemble approach produces connectomes that contain both short- and long-range fascicles, whereas single-parameter connectomes are biased towards one or the other. In summary, a systematic ensemble tractography approach can produce connectomes that are superior to standard single parameter estimates both for predicting the diffusion measurements and estimating white matter fascicles. PMID:26845558

  20. iACP-GAEnsC: Evolutionary genetic algorithm based ensemble classification of anticancer peptides by utilizing hybrid feature space.

    PubMed

    Akbar, Shahid; Hayat, Maqsood; Iqbal, Muhammad; Jan, Mian Ahmad

    2017-06-01

    Cancer is a fatal disease, responsible for one-quarter of all deaths in developed countries. Traditional anticancer therapies, such as chemotherapy and radiation, are highly expensive, error-prone, and often ineffective, and these conventional techniques induce severe side-effects on human cells. Due to the perilous impact of cancer, the development of an accurate and highly efficient intelligent computational model is desirable for the identification of anticancer peptides. In this paper, an evolutionary genetic algorithm-based ensemble model, 'iACP-GAEnsC', is proposed for the identification of anticancer peptides. In this model, the protein sequences are formulated using three different discrete feature representation methods, i.e., amphiphilic pseudo amino acid composition, g-gap dipeptide composition, and reduced amino acid alphabet composition. The performance of each extracted feature space is investigated separately, and the spaces are then merged to exhibit the significance of hybridization. In addition, the predicted results of the individual classifiers are combined using an optimized genetic algorithm and a simple majority voting technique in order to enhance the true classification rate. It is observed that the genetic algorithm-based ensemble classification outperforms the individual classifiers as well as the simple majority voting ensemble. The genetic algorithm-based ensemble classifier performs best on the hybrid feature space, with an accuracy of 96.45%. In comparison to existing techniques, the 'iACP-GAEnsC' model has achieved remarkable improvement in terms of various performance metrics. Based on the simulation results, it is observed that the 'iACP-GAEnsC' model might be a leading tool in the field of drug design and proteomics for researchers. Copyright © 2017 Elsevier B.V. All rights reserved.
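
    The combination step described above assigns classifier weights with a genetic algorithm. As a loosely analogous sketch, the snippet below combines three hypothetical base classifiers by weighted voting and uses a crude random search over the weights as a stand-in for the GA:

        import numpy as np

        def weighted_vote(predictions, weights, n_classes=2):
            """Combine base-classifier predictions by weighted voting.

            predictions : (n_classifiers, n_samples) integer class labels
            weights     : (n_classifiers,) non-negative classifier weights
            """
            n_samples = predictions.shape[1]
            scores = np.zeros((n_classes, n_samples))
            for preds, w in zip(predictions, weights):
                for c in range(n_classes):
                    scores[c] += w * (preds == c)
            return np.argmax(scores, axis=0)

        # Hypothetical training labels (1 = anticancer peptide) and three imperfect
        # base classifiers with accuracies of roughly 0.85, 0.80, and 0.75.
        rng = np.random.default_rng(5)
        y_true = rng.integers(0, 2, size=200)
        base_preds = np.stack([np.where(rng.random(200) < acc, y_true, 1 - y_true)
                               for acc in (0.85, 0.80, 0.75)])

        # Random search over weight vectors (a simple stand-in for the genetic algorithm).
        best_w, best_acc = None, -1.0
        for _ in range(500):
            w = rng.random(3)
            acc = float(np.mean(weighted_vote(base_preds, w) == y_true))
            if acc > best_acc:
                best_w, best_acc = w, acc
        print("best weights:", np.round(best_w, 2), "ensemble accuracy:", round(best_acc, 3))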

  1. Ensemble models of proteins and protein domains based on distance distribution restraints.

    PubMed

    Jeschke, Gunnar

    2016-04-01

    Conformational ensembles of intrinsically disordered peptide chains are not fully determined by experimental observations. Uncertainty due to lack of experimental restraints and due to intrinsic disorder can be distinguished if distance distribution restraints are available. Such restraints can be obtained from pulsed dipolar electron paramagnetic resonance (EPR) spectroscopy applied to pairs of spin labels. Here, we introduce a Monte Carlo approach for generating conformational ensembles that are consistent with a set of distance distribution restraints, backbone dihedral angle statistics in known protein structures, and, optionally, secondary structure propensities or membrane immersion depths. The approach is tested with simulated restraints for a terminal and an internal loop and for a protein with 69 residues by using sets of sparse restraints for underlying well-defined conformations and for published ensembles of a premolten globule-like and a coil-like intrinsically disordered protein. © 2016 Wiley Periodicals, Inc.

  2. Self-Adaptive Prediction of Cloud Resource Demands Using Ensemble Model and Subtractive-Fuzzy Clustering Based Fuzzy Neural Network

    PubMed Central

    Chen, Zhijia; Zhu, Yuanchang; Di, Yanqiang; Feng, Shaochong

    2015-01-01

    In an IaaS (infrastructure as a service) cloud environment, users are provisioned with virtual machines (VMs). To allocate resources for users dynamically and effectively, accurate prediction of resource demands is essential. For this purpose, this paper proposes a self-adaptive prediction method using an ensemble model and a subtractive-fuzzy clustering based fuzzy neural network (ESFCFNN). We analyze the characteristics of user preferences and demands. Then the architecture of the prediction model is constructed. We adopt several base predictors to compose the ensemble model. Then the structure and learning algorithm of the fuzzy neural network are studied. To obtain the number of fuzzy rules and the initial values of the premise and consequent parameters, this paper proposes the fuzzy c-means algorithm combined with subtractive clustering, that is, subtractive-fuzzy clustering. Finally, we adopt different criteria to evaluate the proposed method. The experimental results show that the method is accurate and effective in predicting resource demands. PMID:25691896

  3. Self-adaptive prediction of cloud resource demands using ensemble model and subtractive-fuzzy clustering based fuzzy neural network.

    PubMed

    Chen, Zhijia; Zhu, Yuanchang; Di, Yanqiang; Feng, Shaochong

    2015-01-01

    In an IaaS (infrastructure as a service) cloud environment, users are provisioned with virtual machines (VMs). To allocate resources for users dynamically and effectively, accurate prediction of resource demands is essential. For this purpose, this paper proposes a self-adaptive prediction method using an ensemble model and a subtractive-fuzzy clustering based fuzzy neural network (ESFCFNN). We analyze the characteristics of user preferences and demands. Then the architecture of the prediction model is constructed. We adopt several base predictors to compose the ensemble model. Then the structure and learning algorithm of the fuzzy neural network are studied. To obtain the number of fuzzy rules and the initial values of the premise and consequent parameters, this paper proposes the fuzzy c-means algorithm combined with subtractive clustering, that is, subtractive-fuzzy clustering. Finally, we adopt different criteria to evaluate the proposed method. The experimental results show that the method is accurate and effective in predicting resource demands.

  4. A novel signal compression method based on optimal ensemble empirical mode decomposition for bearing vibration signals

    NASA Astrophysics Data System (ADS)

    Guo, Wei; Tse, Peter W.

    2013-01-01

    Today, remote machine condition monitoring is popular due to the continuous advancement in wireless communication. Bearings are the most frequently and easily failed components in many rotating machines. To accurately identify the type of bearing fault, large amounts of vibration data need to be collected. However, the volume of transmitted data cannot be too high because the bandwidth of wireless communication is limited. To solve this problem, the data are usually compressed before transmission to a remote maintenance center. This paper proposes a novel signal compression method that can substantially reduce the amount of data that need to be transmitted without sacrificing the accuracy of fault identification. The proposed signal compression method is based on ensemble empirical mode decomposition (EEMD), which is an effective method for adaptively decomposing the vibration signal into different bands of signal components, termed intrinsic mode functions (IMFs). An optimization method was designed to automatically select appropriate EEMD parameters for the analyzed signal, and in particular to select the appropriate level of the added white noise in the EEMD method. An index termed the relative root-mean-square error was used to evaluate the decomposition performance under different noise levels to find the optimal level. After applying the optimal EEMD method to a vibration signal, the IMF relating to the bearing fault can be extracted from the original vibration signal. Compressing this signal component means that a much smaller proportion of data samples needs to be retained for transmission and subsequent reconstruction. The proposed compression method was also compared with the popular wavelet compression method. Experimental results demonstrate that the optimization of EEMD parameters can automatically find appropriate EEMD parameters for the analyzed signals, and the IMF-based compression method provides a higher compression ratio, while retaining the bearing defect

  5. Endowing a Content-Based Medical Image Retrieval System with Perceptual Similarity Using Ensemble Strategy.

    PubMed

    Bedo, Marcos Vinicius Naves; Pereira Dos Santos, Davi; Ponciano-Silva, Marcelo; de Azevedo-Marques, Paulo Mazzoncini; Ferreira de Carvalho, André Ponce de León; Traina, Caetano

    2016-02-01

    Content-based medical image retrieval (CBMIR) is a powerful resource to improve differential computer-aided diagnosis. The major problem with CBMIR applications is the semantic gap, a situation in which the system does not follow the users' sense of similarity. This gap can be bridged by the adequate modeling of similarity queries, which ultimately depends on the combination of feature extractor methods and distance functions. In this study, such combinations are referred to as perceptual parameters, as they impact on how images are compared. In a CBMIR, the perceptual parameters must be manually set by the users, which imposes a heavy burden on the specialists; otherwise, the system will follow a predefined sense of similarity. This paper presents a novel approach to endow a CBMIR with a proper sense of similarity, in which the system defines the perceptual parameter depending on the query element. The method employs ensemble strategy, where an extreme learning machine acts as a meta-learner and identifies the most suitable perceptual parameter according to a given query image. This parameter defines the search space for the similarity query that retrieves the most similar images. An instance-based learning classifier labels the query image following the query result set. As the concept implementation, we integrated the approach into a mammogram CBMIR. For each query image, the resulting tool provided a complete second opinion, including lesion class, system certainty degree, and set of most similar images. Extensive experiments on a large mammogram dataset showed that our proposal achieved a hit ratio up to 10% higher than the traditional CBMIR approach without requiring external parameters from the users. Our database-driven solution was also up to 25% faster than content retrieval traditional approaches.

  6. Multi-model ensemble-based probabilistic prediction of tropical cyclogenesis using TIGGE model forecasts

    NASA Astrophysics Data System (ADS)

    Jaiswal, Neeru; Kishtawal, C. M.; Bhomia, Swati; Pal, P. K.

    2016-10-01

    An extended-range tropical cyclogenesis forecast model has been developed using the forecasts of global models available from the TIGGE portal. A scheme has been developed to detect the signatures of cyclogenesis in the global model forecast fields [i.e., the mean sea level pressure and surface winds (10 m horizontal winds)]. For this, a wind matching index was determined between the synthetic cyclonic wind fields and the forecast wind fields. Thresholds of 0.4 for the wind matching index and 1005 hPa for pressure were determined to detect cyclonic systems. The detected cyclonic systems in the study region are classified into different cyclone categories based on their intensity (maximum wind speed). The forecasts of up to 15 days from three global models, viz. ECMWF, NCEP and UKMO, have been used to predict cyclogenesis based on a multi-model ensemble approach. The occurrence of cyclonic events of different categories at all the forecast steps in the gridded region (10 × 10 km2) was used to estimate the probability of the formation of cyclogenesis. The probability of cyclogenesis was estimated by computing the grid score from the wind matching index for each model and at each forecast step and convolving it with a Gaussian filter. The proposed method is used to predict the cyclogenesis of five named tropical cyclones formed during the year 2013 in the north Indian Ocean. The cyclogenesis of these systems was predicted 6-8 days in advance using the above approach. The mean lead time of the proposed model for predicting cyclogenesis events was found to be 7 days.
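
    The detection rule quoted above (wind matching index above 0.4 and mean sea level pressure below 1005 hPa, followed by Gaussian smoothing of the grid scores) can be sketched as follows; the cosine-like index function is only a placeholder, since the abstract does not spell out its exact definition:

        import numpy as np
        from scipy.ndimage import gaussian_filter

        WIND_INDEX_THRESHOLD = 0.4     # threshold from the abstract
        MSLP_THRESHOLD_HPA = 1005.0    # threshold from the abstract

        def wind_matching_index(u, v, u_syn, v_syn):
            """Placeholder similarity between forecast winds and a synthetic cyclonic
            wind field (normalized projection of the two vector fields); the
            operational definition used by the authors may differ."""
            num = np.sum(u * u_syn + v * v_syn)
            den = np.sqrt(np.sum(u ** 2 + v ** 2) * np.sum(u_syn ** 2 + v_syn ** 2)) + 1e-12
            return num / den

        def cyclogenesis_grid_score(index_field, mslp_field, sigma=2.0):
            """Binary detections where both thresholds are met, convolved with a
            Gaussian filter to turn counts into a smooth probability-like score."""
            detected = (index_field > WIND_INDEX_THRESHOLD) & (mslp_field < MSLP_THRESHOLD_HPA)
            return gaussian_filter(detected.astype(float), sigma=sigma)

        # Hypothetical 50 x 50 forecast grid, for illustration only.
        rng = np.random.default_rng(3)
        index_field = rng.uniform(0.0, 0.6, size=(50, 50))
        mslp_field = rng.normal(loc=1010.0, scale=4.0, size=(50, 50))
        score = cyclogenesis_grid_score(index_field, mslp_field)
        print("maximum smoothed grid score:", round(float(score.max()), 3))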

  7. Texture Descriptors Ensembles Enable Image-Based Classification of Maturation of Human Stem Cell-Derived Retinal Pigmented Epithelium

    PubMed Central

    Caetano dos Santos, Florentino Luciano; Skottman, Heli; Juuti-Uusitalo, Kati; Hyttinen, Jari

    2016-01-01

    Aims A fast, non-invasive and observer-independent method to analyze the homogeneity and maturity of human pluripotent stem cell (hPSC) derived retinal pigment epithelial (RPE) cells is warranted to assess the suitability of hPSC-RPE cells for implantation or in vitro use. The aim of this work was to develop and validate methods to create ensembles of state-of-the-art texture descriptors and to provide a robust classification tool to separate three different maturation stages of RPE cells by using phase contrast microscopy images. The same methods were also validated on a wide variety of biological image classification problems, such as histological or virus image classification. Methods For image classification we used different texture descriptors, descriptor ensembles and preprocessing techniques. Also, three new methods were tested. The first approach was an ensemble of preprocessing methods, to create an additional set of images. The second was the region-based approach, where saliency detection and wavelet decomposition divide each image in two different regions, from which features were extracted through different descriptors. The third method was an ensemble of Binarized Statistical Image Features, based on different sizes and thresholds. A Support Vector Machine (SVM) was trained for each descriptor histogram and the set of SVMs combined by sum rule. The accuracy of the computer vision tool was verified in classifying the hPSC-RPE cell maturation level. Dataset and Results The RPE dataset contains 1862 subwindows from 195 phase contrast images. The final descriptor ensemble outperformed the most recent stand-alone texture descriptors, obtaining, for the RPE dataset, an area under ROC curve (AUC) of 86.49% with the 10-fold cross validation and 91.98% with the leave-one-image-out protocol. The generality of the three proposed approaches was ascertained with 10 more biological image datasets, obtaining an average AUC greater than 97%. Conclusions Here we
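
    The fusion step described above, with one SVM trained per texture descriptor and the posteriors combined by the sum rule, can be sketched with scikit-learn. The descriptor histograms are assumed to be precomputed feature matrices, and the toy data below are hypothetical:

        import numpy as np
        from sklearn.svm import SVC

        def sum_rule_ensemble(train_feature_sets, labels, test_feature_sets):
            """Train one SVM per descriptor and combine class posteriors by the sum rule.

            train_feature_sets : list of (n_train, d_k) matrices, one per descriptor
            test_feature_sets  : list of (n_test, d_k) matrices in the same order
            """
            summed = None
            for X_train, X_test in zip(train_feature_sets, test_feature_sets):
                clf = SVC(kernel="rbf", probability=True).fit(X_train, labels)
                proba = clf.predict_proba(X_test)
                summed = proba if summed is None else summed + proba
            return np.argmax(summed, axis=1)   # class with the largest summed posterior

        # Hypothetical toy data: two "descriptors" of different dimensionality and
        # three maturation classes (e.g., early, intermediate, mature RPE).
        rng = np.random.default_rng(4)
        y_train = np.repeat([0, 1, 2], 30)
        train_sets = [rng.normal(loc=y_train[:, None], size=(90, 16)),
                      rng.normal(loc=y_train[:, None], size=(90, 32))]
        y_test = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2])
        test_sets = [rng.normal(loc=y_test[:, None], size=(9, 16)),
                     rng.normal(loc=y_test[:, None], size=(9, 32))]
        print(sum_rule_ensemble(train_sets, y_train, test_sets))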

  8. Nucleic acid based molecular devices.

    PubMed

    Krishnan, Yamuna; Simmel, Friedrich C

    2011-03-28

    In biology, nucleic acids are carriers of molecular information: DNA's base sequence stores and imparts genetic instructions, while RNA's sequence plays the role of a messenger and a regulator of gene expression. As biopolymers, nucleic acids also have exciting physicochemical properties, which can be rationally influenced by the base sequence in myriad ways. Consequently, in recent years nucleic acids have also become important building blocks for bottom-up nanotechnology: as molecules for the self-assembly of molecular nanostructures and also as a material for building machinelike nanodevices. In this Review we will cover the most important developments in this growing field of nucleic acid nanodevices. We also provide an overview of the biochemical and biophysical background of this field and the major "historical" influences that shaped its development. Particular emphasis is laid on DNA molecular motors, molecular robotics, molecular information processing, and applications of nucleic acid nanodevices in biology. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  9. Probabilistic regional wind power forecasts based on calibrated Numerical Weather Forecast ensembles

    NASA Astrophysics Data System (ADS)

    Späth, Stephan; von Bremen, Lueder; Junk, Constantin; Heinemann, Detlev

    2014-05-01

    With increasing shares of installed wind power in Germany, accurate forecasts of wind speed and power become increasingly important for the grid integration of renewable energies. Applications like grid management and trading also benefit from uncertainty information. This uncertainty information can be provided by ensemble forecasts. These forecasts often exhibit systematic errors such as biases and spread deficiencies. The errors can be reduced by statistical post-processing. We use forecast data from the regional Numerical Weather Prediction model COSMO-DE EPS as input to regional wind power forecasts. In order to enhance the power forecast, we first calibrate the wind speed forecasts against the model analysis, so that some of the model's systematic errors can be removed. Wind measurements at every grid point are usually not available, and as we want to conduct grid zone forecasts, the model analysis is the best target for calibration. We use forecasts from the COSMO-DE EPS, a high-resolution ensemble prediction system with 20 forecast members. The model covers the region of Germany and surroundings with a vertical resolution of 50 model levels and a horizontal resolution of 0.025 degrees (approximately 2.8 km). The forecast range is 21 hours with model output available on an hourly basis. Thus, we use it for shortest-term wind power forecasts. The COSMO-DE EPS was originally designed with a focus on forecasts of convective precipitation. The COSMO-DE EPS wind speed forecasts at hub height were post-processed by nonhomogeneous Gaussian regression (NGR; Thorarinsdottir and Gneiting, 2010), a calibration method that fits a truncated normal distribution to the ensemble wind speed forecasts. As calibration target, the model analysis was used. The calibration is able to remove some deficits of the COSMO-DE EPS. In contrast to the raw ensemble members, the calibrated ensemble members no longer show the strong correlations with each other, and the spread-skill relationship
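
    As an illustration of the NGR idea, the sketch below fits a zero-truncated normal whose location and scale are linear in the ensemble mean and variance. It is a simplified stand-in for the cited method: synthetic data replace the COSMO-DE EPS forecasts and analysis, and the coefficients are fit by maximum likelihood rather than by the CRPS minimization commonly used in NGR.

```python
# Minimal NGR-style calibration sketch: truncated-normal predictive distribution
# with location a + b*ensemble_mean and variance c + d*ensemble_variance.
import numpy as np
from scipy.stats import truncnorm
from scipy.optimize import minimize

rng = np.random.default_rng(1)
ens = np.clip(rng.normal(8.0, 2.0, size=(500, 20)), 0, None)   # 20-member wind speed ensemble
obs = np.clip(0.9 * ens.mean(axis=1) + rng.normal(0.0, 1.0, size=500), 0.01, None)  # "analysis"

m, v = ens.mean(axis=1), ens.var(axis=1)

def nll(p):
    a, b, c, d = p
    mu = a + b * m
    sigma = np.sqrt(np.maximum(c + d * v, 1e-6))
    lo = (0.0 - mu) / sigma                  # standardized lower bound (truncation at 0 m/s)
    return -truncnorm.logpdf(obs, lo, np.inf, loc=mu, scale=sigma).sum()

res = minimize(nll, x0=[0.0, 1.0, 1.0, 0.1], method="Nelder-Mead")
print("fitted NGR coefficients (a, b, c, d):", res.x)
```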

  10. Towards the knowledge-based design of universal influenza epitope ensemble vaccines.

    PubMed

    Sheikh, Qamar M; Gatherer, Derek; Reche, Pedro A; Flower, Darren R

    2016-11-01

    Influenza A viral heterogeneity remains a significant threat due to unpredictable antigenic drift in seasonal influenza and antigenic shifts caused by the emergence of novel subtypes. Annual review of multivalent influenza vaccines targets strains of influenza A and B likely to be predominant in future influenza seasons. This does not induce broad, cross-protective immunity against emergent subtypes. Better strategies are needed to prevent future pandemics. Cross-protection can be achieved by activating CD8+ and CD4+ T cells against highly conserved regions of the influenza genome. We combine available experimental data with informatics-based immunological predictions to help design vaccines potentially able to induce cross-protective T-cells against multiple influenza subtypes. To exemplify our approach we designed two epitope ensemble vaccines comprising highly conserved and experimentally verified immunogenic influenza A epitopes as putative non-seasonal influenza vaccines; one specifically targets the US population and the other is a universal vaccine. The USA-specific vaccine comprised 6 CD8+ T cell epitopes (GILGFVFTL, FMYSDFHFI, GMDPRMCSL, SVKEKDMTK, FYIQMCTEL, DTVNRTHQY) and 3 CD4+ epitopes (KGILGFVFTLTVPSE, EYIMKGVYINTALLN, ILGFVFTLTVPSERG). The universal vaccine comprised 8 CD8+ epitopes (FMYSDFHFI, GILGFVFTL, ILRGSVAHK, FYIQMCTEL, ILKGKFQTA, YYLEKANKI, VSDGGPNLY, YSHGTGTGY) and the same 3 CD4+ epitopes. Our USA-specific vaccine has a population protection coverage (portion of the population potentially responsive to one or more component epitopes of the vaccine, PPC) of over 96% and 95% coverage of observed influenza subtypes. The universal vaccine has a PPC value of over 97% and 88% coverage of observed subtypes. Availability: http://imed.med.ucm.es/Tools/episopt.html Contact: d.r.flower@aston.ac.uk. © The Author 2016. Published by Oxford University Press.

  11. Towards the knowledge-based design of universal influenza epitope ensemble vaccines

    PubMed Central

    Sheikh, Qamar M.; Gatherer, Derek; Reche, Pedro A; Flower, Darren R.

    2016-01-01

    Motivation: Influenza A viral heterogeneity remains a significant threat due to unpredictable antigenic drift in seasonal influenza and antigenic shifts caused by the emergence of novel subtypes. Annual review of multivalent influenza vaccines targets strains of influenza A and B likely to be predominant in future influenza seasons. This does not induce broad, cross-protective immunity against emergent subtypes. Better strategies are needed to prevent future pandemics. Cross-protection can be achieved by activating CD8+ and CD4+ T cells against highly conserved regions of the influenza genome. We combine available experimental data with informatics-based immunological predictions to help design vaccines potentially able to induce cross-protective T-cells against multiple influenza subtypes. Results: To exemplify our approach we designed two epitope ensemble vaccines comprising highly conserved and experimentally verified immunogenic influenza A epitopes as putative non-seasonal influenza vaccines; one specifically targets the US population and the other is a universal vaccine. The USA-specific vaccine comprised 6 CD8+ T cell epitopes (GILGFVFTL, FMYSDFHFI, GMDPRMCSL, SVKEKDMTK, FYIQMCTEL, DTVNRTHQY) and 3 CD4+ epitopes (KGILGFVFTLTVPSE, EYIMKGVYINTALLN, ILGFVFTLTVPSERG). The universal vaccine comprised 8 CD8+ epitopes (FMYSDFHFI, GILGFVFTL, ILRGSVAHK, FYIQMCTEL, ILKGKFQTA, YYLEKANKI, VSDGGPNLY, YSHGTGTGY) and the same 3 CD4+ epitopes. Our USA-specific vaccine has a population protection coverage (portion of the population potentially responsive to one or more component epitopes of the vaccine, PPC) of over 96% and 95% coverage of observed influenza subtypes. The universal vaccine has a PPC value of over 97% and 88% coverage of observed subtypes. Availability and Implementation: http://imed.med.ucm.es/Tools/episopt.html. Contact: d.r.flower@aston.ac.uk PMID:27402904

  12. [Estimation and forecast of chlorophyll a concentration in Taihu Lake based on ensemble square root filters].

    PubMed

    Li, Yuan; Li, Yun-Mei; Wang, Qiao; Zhang, Zhuo; Guo, Fei; Lü, Heng; Bi, Kun; Huang, Chang-Chun; Guo, Yu-Long

    2013-01-01

    Chlorophyll a concentration is one of the important parameters for the characterization of water quality, which reflects the degree of eutrophication and algae content in the water body. It is also an important factor in determining water spectral reflectance. Chlorophyll a concentration is an important water quality parameter in water quality remote sensing. Remote sensing quantitative retrieval of chlorophyll a concentration can provide new ideas and methods for the monitoring and evaluation of lake water quality. In this work, we developed a data assimilation scheme based on ensemble square root filters and three-dimensional numerical modeling for wind-driven circulation and pollutant transport to assimilate the concentration of chlorophyll a. We also conducted some assimilation experiments using buoy observation data on May 20, 2010. We estimated the concentration of chlorophyll a in Taihu Lake, and then used this result to forecast the concentration of chlorophyll a. During the assimilation stage, the root mean square error reduced from 1.58, 1.025, and 2.76 to 0.465, 0.276, and 1.01, respectively, and the average relative error reduced from 0.2 to 0.05, 0.046, and 0.069, respectively. During the prediction stage, the root mean square error reduced from 1.486, 1.143, and 2.38 to 0.017, 0.147, and 0.23, respectively, and the average relative error reduced from 0.2 to 0.002, 0.025, and 0.019, respectively. The final results indicate that the method of data assimilation can significantly improve the accuracy in the estimation and prediction of chlorophyll a concentration in Taihu Lake.
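
    The record reports results but not the filter equations. The sketch below shows a generic serial ensemble square root filter update (after Whitaker and Hamill, 2002) for a single scalar observation; the hydrodynamic and ecological model and the real buoy data of the study are replaced by a toy state ensemble.

```python
# Minimal serial EnSRF update for one scalar observation (toy data).
import numpy as np

def ensrf_update(X, H, y, r):
    """X: (n_state, n_ens) ensemble; H: (n_state,) observation operator row;
    y: scalar observation; r: observation-error variance."""
    xm = X.mean(axis=1)
    Xp = X - xm[:, None]                      # ensemble perturbations
    hx = H @ X
    hxm, hxp = hx.mean(), hx - hx.mean()
    phT = Xp @ hxp / (X.shape[1] - 1)         # P_f H^T        (n_state,)
    hph = hxp @ hxp / (X.shape[1] - 1)        # H P_f H^T      (scalar)
    K = phT / (hph + r)                       # Kalman gain
    alpha = 1.0 / (1.0 + np.sqrt(r / (hph + r)))
    xm_a = xm + K * (y - hxm)                 # mean update
    Xp_a = Xp - alpha * np.outer(K, hxp)      # square-root perturbation update
    return xm_a[:, None] + Xp_a

rng = np.random.default_rng(2)
X = rng.normal(5.0, 1.5, size=(3, 30))        # e.g. chlorophyll a at 3 grid points, 30 members
H = np.array([1.0, 0.0, 0.0])                 # buoy observes the first grid point
Xa = ensrf_update(X, H, y=6.2, r=0.2)
print("prior/posterior mean at obs point:", X[0].mean(), Xa[0].mean())
```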

  13. Are Charge-State Distributions a Reliable Tool Describing Molecular Ensembles of Intrinsically Disordered Proteins by Native MS?

    NASA Astrophysics Data System (ADS)

    Natalello, Antonino; Santambrogio, Carlo; Grandori, Rita

    2017-01-01

    Native mass spectrometry (MS) has become a central tool of structural proteomics, but its applicability to the peculiar class of intrinsically disordered proteins (IDPs) is still a subject of debate. IDPs lack an ordered tridimensional structure and are characterized by high conformational plasticity. Since they represent valuable targets for cancer and neurodegeneration research, there is an urgent need for methodological advances to describe the conformational ensembles populated by these proteins in solution. However, structural rearrangements during electrospray-ionization (ESI) or after the transfer to the gas phase could affect data obtained by native ESI-MS. In particular, charge-state distributions (CSDs) are affected by protein conformation inside ESI droplets, while ion mobility (IM) reflects protein conformation in the gas phase. This review focuses on the available evidence relating IDP solution ensembles with CSDs, trying to summarize cases of apparent consistency or discrepancy. The protein-specificity of ionization patterns and their responses to ligands and buffer conditions suggest that CSDs are imprinted with protein structural features also in the case of IDPs. Nevertheless, it seems that these proteins are more easily affected by electrospray conditions, leading in some cases to rearrangements of the conformational ensembles.

  14. Forecasting European cold waves based on subsampling strategies of CMIP5 and Euro-CORDEX ensembles

    NASA Astrophysics Data System (ADS)

    Cordero-Llana, Laura; Braconnot, Pascale; Vautard, Robert; Vrac, Mathieu; Jezequel, Aglae

    2016-04-01

    Forecasting future extreme events under the present changing climate represents a difficult task. Currently there are a large number of ensembles of simulations for climate projections that take into account different models and scenarios. However, there is a need for reducing the size of the ensemble to make the interpretation of these simulations more manageable for impact studies or climate risk assessment. This can be achieved by developing subsampling strategies to identify a limited number of simulations that best represent the ensemble. In this study, cold waves are chosen to test different approaches for subsampling available simulations. The definition of cold waves depends on the criteria used, but they are generally defined using a minimum temperature threshold, the duration of the cold spell, as well as their geographical extent. These climate indicators are not universal, highlighting the difficulty of directly comparing different studies. As part of the CLIPC European project, we use daily surface temperature data obtained from CMIP5 outputs as well as Euro-CORDEX simulations to predict future cold wave events in Europe. From these simulations a clustering method is applied to minimise the number of ensemble members required. Furthermore, we analyse the different uncertainties that arise from the different model characteristics and definitions of climate indicators. Finally, we will test whether the same subsampling strategy can be used for different climate indicators. This will facilitate the use of the subsampling results for a wide range of impact assessment studies.
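
    The record does not specify the clustering method. The sketch below shows one plausible subsampling strategy, assuming k-means on a member-by-indicator matrix and selection of the member closest to each cluster centre; the data are synthetic placeholders for CMIP5/Euro-CORDEX output.

```python
# Minimal ensemble-subsampling sketch: cluster members by a cold-wave indicator
# and keep the most representative member per cluster.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(3)
members = rng.normal(-5.0, 3.0, size=(40, 30))    # 40 members x 30 winters of Tmin anomalies

k = 5
km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(members)
selected = []
for c in range(k):
    idx = np.where(km.labels_ == c)[0]
    d = np.linalg.norm(members[idx] - km.cluster_centers_[c], axis=1)
    selected.append(idx[d.argmin()])              # member closest to the cluster centre
print("representative members:", sorted(selected))
```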

  15. Ensemble-Based Instrumental Music Instruction: Dead-End Tradition or Opportunity for Socially Enlightened Teaching

    ERIC Educational Resources Information Center

    Heuser, Frank

    2011-01-01

    Public school music education in the USA remains wedded to large ensemble performance. Instruction tends to be teacher directed, relies on styles from the Western canon and exhibits little concern for musical interests of students. The idea that a fundamental purpose of education is the creation of a just society is difficult for many music…

  17. A new data assimilation technique based on ensemble Kalman filter and Brownian bridges: An application to Richards' equation

    NASA Astrophysics Data System (ADS)

    Berardi, Marco; Andrisani, Andrea; Lopez, Luciano; Vurro, Michele

    2016-11-01

    In this paper a new data assimilation technique is proposed which is based on the ensemble Kalman filter (EnKF). Such a technique will be effective if few observations of a dynamical system are available and a large model error occurs. The idea is to acquire a fine grid of synthetic observations in two steps: (1) first we interpolate the real observations with suitable polynomial curves; (2) then we estimate the relative measurement errors by means of Brownian bridges. This technique has been tested on the Richards' equation, which governs the water flow in unsaturated soils, where a large model error has been introduced by solving the Richards' equation by means of an explicit numerical scheme. The application of this technique to some synthetic experiments has shown improvements with respect to the classical ensemble Kalman filter, in particular for problems with a large model error.
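
    A minimal sketch of the two-step idea, under simplifying assumptions: the interpolating curve is taken as linear rather than a higher-order polynomial, and the measurement-error model is a single Brownian bridge pinned to zero at the two real observation times. Values are illustrative and not taken from the study.

```python
# Fine grid of synthetic observations between two real observations:
# interpolant plus a Brownian bridge that vanishes at both endpoints.
import numpy as np

rng = np.random.default_rng(4)

def brownian_bridge(n, sigma):
    """Brownian bridge on n points, pinned to 0 at both ends."""
    dt = 1.0 / (n - 1)
    w = np.concatenate([[0.0], np.cumsum(rng.normal(0.0, np.sqrt(dt), n - 1))])
    t = np.linspace(0.0, 1.0, n)
    return sigma * (w - t * w[-1])

t0, t1, y0, y1 = 0.0, 6.0, 0.32, 0.27            # two real soil-moisture observations (toy values)
n = 25                                           # size of the synthetic fine grid
t = np.linspace(t0, t1, n)
interp = y0 + (y1 - y0) * (t - t0) / (t1 - t0)   # interpolating curve (linear here)
synthetic = interp + brownian_bridge(n, sigma=0.01)
print(synthetic[:5])
```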

  18. Comparison of Ensemble Kalman Filter groundwater-data assimilation methods based on stochastic moment equations and Monte Carlo simulation

    NASA Astrophysics Data System (ADS)

    Panzeri, M.; Riva, M.; Guadagnini, A.; Neuman, S. P.

    2014-04-01

    Traditional Ensemble Kalman Filter (EnKF) data assimilation requires computationally intensive Monte Carlo (MC) sampling, which suffers from filter inbreeding unless the number of simulations is large. Recently we proposed an alternative EnKF groundwater-data assimilation method that obviates the need for sampling and is free of inbreeding issues. In our new approach, theoretical ensemble moments are approximated directly by solving a system of corresponding stochastic groundwater flow equations. Like MC-based EnKF, our moment equations (ME) approach allows Bayesian updating of system states and parameters in real-time as new data become available. Here we compare the performances and accuracies of the two approaches on two-dimensional transient groundwater flow toward a well pumping water in a synthetic, randomly heterogeneous confined aquifer subject to prescribed head and flux boundary conditions.

  19. Ensembl comparative genomics resources

    PubMed Central

    Muffato, Matthieu; Beal, Kathryn; Fitzgerald, Stephen; Gordon, Leo; Pignatelli, Miguel; Vilella, Albert J.; Searle, Stephen M. J.; Amode, Ridwan; Brent, Simon; Spooner, William; Kulesha, Eugene; Yates, Andrew; Flicek, Paul

    2016-01-01

    Evolution provides the unifying framework with which to understand biology. The coherent investigation of genic and genomic data often requires comparative genomics analyses based on whole-genome alignments, sets of homologous genes and other relevant datasets in order to evaluate and answer evolutionary-related questions. However, the complexity and computational requirements of producing such data are substantial: this has led to only a small number of reference resources that are used for most comparative analyses. The Ensembl comparative genomics resources are one such reference set that facilitates comprehensive and reproducible analysis of chordate genome data. Ensembl computes pairwise and multiple whole-genome alignments from which large-scale synteny, per-base conservation scores and constrained elements are obtained. Gene alignments are used to define Ensembl Protein Families, GeneTrees and homologies for both protein-coding and non-coding RNA genes. These resources are updated frequently and have a consistent informatics infrastructure and data presentation across all supported species. Specialized web-based visualizations are also available including synteny displays, collapsible gene tree plots, a gene family locator and different alignment views. The Ensembl comparative genomics infrastructure is extensively reused for the analysis of non-vertebrate species by other projects including Ensembl Genomes and Gramene and much of the information here is relevant to these projects. The consistency of the annotation across species and the focus on vertebrates makes Ensembl an ideal system to perform and support vertebrate comparative genomic analyses. We use robust software and pipelines to produce reference comparative data and make it freely available. Database URL: http://www.ensembl.org. PMID:26896847

  20. Ensembl comparative genomics resources.

    PubMed

    Herrero, Javier; Muffato, Matthieu; Beal, Kathryn; Fitzgerald, Stephen; Gordon, Leo; Pignatelli, Miguel; Vilella, Albert J; Searle, Stephen M J; Amode, Ridwan; Brent, Simon; Spooner, William; Kulesha, Eugene; Yates, Andrew; Flicek, Paul

    2016-01-01

    Evolution provides the unifying framework with which to understand biology. The coherent investigation of genic and genomic data often requires comparative genomics analyses based on whole-genome alignments, sets of homologous genes and other relevant datasets in order to evaluate and answer evolutionary-related questions. However, the complexity and computational requirements of producing such data are substantial: this has led to only a small number of reference resources that are used for most comparative analyses. The Ensembl comparative genomics resources are one such reference set that facilitates comprehensive and reproducible analysis of chordate genome data. Ensembl computes pairwise and multiple whole-genome alignments from which large-scale synteny, per-base conservation scores and constrained elements are obtained. Gene alignments are used to define Ensembl Protein Families, GeneTrees and homologies for both protein-coding and non-coding RNA genes. These resources are updated frequently and have a consistent informatics infrastructure and data presentation across all supported species. Specialized web-based visualizations are also available including synteny displays, collapsible gene tree plots, a gene family locator and different alignment views. The Ensembl comparative genomics infrastructure is extensively reused for the analysis of non-vertebrate species by other projects including Ensembl Genomes and Gramene and much of the information here is relevant to these projects. The consistency of the annotation across species and the focus on vertebrates makes Ensembl an ideal system to perform and support vertebrate comparative genomic analyses. We use robust software and pipelines to produce reference comparative data and make it freely available. Database URL: http://www.ensembl.org. © The Author(s) 2016. Published by Oxford University Press.

  1. Conservative strategy-based ensemble surrogate model for optimal groundwater remediation design at DNAPLs-contaminated sites

    NASA Astrophysics Data System (ADS)

    Ouyang, Qi; Lu, Wenxi; Lin, Jin; Deng, Wenbing; Cheng, Weiguo

    2017-08-01

    The surrogate-based simulation-optimization techniques are frequently used for optimal groundwater remediation design. When this technique is used, surrogate errors caused by surrogate-modeling uncertainty may lead to generation of infeasible designs. In this paper, a conservative strategy that pushes the optimal design into the feasible region was used to address surrogate-modeling uncertainty. In addition, chance-constrained programming (CCP) was adopted to compare with the conservative strategy in addressing this uncertainty. Three methods, multi-gene genetic programming (MGGP), Kriging (KRG) and support vector regression (SVR), were used to construct surrogate models for a time-consuming multi-phase flow model. To improve the performance of the surrogate model, ensemble surrogates were constructed based on combinations of different stand-alone surrogate models. The results show that: (1) the surrogate-modeling uncertainty was successfully addressed by the conservative strategy, which means that this method is promising for addressing surrogate-modeling uncertainty. (2) The ensemble surrogate model that combines MGGP with KRG showed the most favorable performance, which indicates that this ensemble surrogate can utilize both stand-alone surrogate models to improve the performance of the surrogate model.
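
    The sketch below illustrates one common way to build such an ensemble surrogate, not necessarily the weighting used by the authors: Kriging and SVR stand-alone surrogates are combined with weights inversely proportional to their cross-validated errors, an analytic toy function stands in for the expensive multi-phase flow simulator, and the MGGP surrogate is omitted.

```python
# Minimal ensemble-surrogate sketch: CV-error-weighted combination of Kriging and SVR.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.svm import SVR
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(5)
X = rng.uniform(0, 1, size=(60, 3))                     # e.g. remediation design variables
y = np.sin(3 * X[:, 0]) + X[:, 1] ** 2 + 0.5 * X[:, 2]  # toy stand-in for the simulator output

models = [GaussianProcessRegressor(), SVR(C=10.0)]
mse = [-cross_val_score(m, X, y, cv=5, scoring="neg_mean_squared_error").mean()
       for m in models]
w = 1.0 / np.array(mse)
w /= w.sum()                                            # weights from cross-validated error
for m in models:
    m.fit(X, y)

x_new = rng.uniform(0, 1, size=(5, 3))
pred = sum(wi * m.predict(x_new) for wi, m in zip(w, models))
print("ensemble surrogate prediction:", pred)
```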

  2. Ligand-biased ensemble receptor docking (LigBEnD): a hybrid ligand/receptor structure-based approach

    NASA Astrophysics Data System (ADS)

    Lam, Polo C.-H.; Abagyan, Ruben; Totrov, Maxim

    2017-09-01

    Ligand docking to flexible protein molecules can be efficiently carried out through ensemble docking to multiple protein conformations, either from experimental X-ray structures or from in silico simulations. The success of ensemble docking often requires the careful selection of complementary protein conformations, through docking and scoring of known co-crystallized ligands. False positives, in which a ligand in a wrong pose achieves a better docking score than that of native pose, arise as additional protein conformations are added. In the current study, we developed a new ligand-biased ensemble receptor docking method and composite scoring function which combine the use of ligand-based atomic property field (APF) method with receptor structure-based docking. This method helps us to correctly dock 30 out of 36 ligands presented by the D3R docking challenge. For the six mis-docked ligands, the cognate receptor structures prove to be too different from the 40 available experimental Pocketome conformations used for docking and could be identified only by receptor sampling beyond experimentally explored conformational subspace.

  3. Ensembl 2016

    PubMed Central

    Yates, Andrew; Akanni, Wasiu; Amode, M. Ridwan; Barrell, Daniel; Billis, Konstantinos; Carvalho-Silva, Denise; Cummins, Carla; Clapham, Peter; Fitzgerald, Stephen; Gil, Laurent; Girón, Carlos García; Gordon, Leo; Hourlier, Thibaut; Hunt, Sarah E.; Janacek, Sophie H.; Johnson, Nathan; Juettemann, Thomas; Keenan, Stephen; Lavidas, Ilias; Martin, Fergal J.; Maurel, Thomas; McLaren, William; Murphy, Daniel N.; Nag, Rishi; Nuhn, Michael; Parker, Anne; Patricio, Mateus; Pignatelli, Miguel; Rahtz, Matthew; Riat, Harpreet Singh; Sheppard, Daniel; Taylor, Kieron; Thormann, Anja; Vullo, Alessandro; Wilder, Steven P.; Zadissa, Amonida; Birney, Ewan; Harrow, Jennifer; Muffato, Matthieu; Perry, Emily; Ruffier, Magali; Spudich, Giulietta; Trevanion, Stephen J.; Cunningham, Fiona; Aken, Bronwen L.; Zerbino, Daniel R.; Flicek, Paul

    2016-01-01

    The Ensembl project (http://www.ensembl.org) is a system for genome annotation, analysis, storage and dissemination designed to facilitate the access of genomic annotation from chordates and key model organisms. It provides access to data from 87 species across our main and early access Pre! websites. This year we introduced three newly annotated species and released numerous updates across our supported species with a concentration on data for the latest genome assemblies of human, mouse, zebrafish and rat. We also provided two data updates for the previous human assembly, GRCh37, through a dedicated website (http://grch37.ensembl.org). Our tools, in particular the VEP, have been improved significantly through integration of additional third party data. REST is now capable of larger-scale analysis and our regulatory data BioMart can deliver faster results. The website is now capable of displaying long-range interactions such as those found in cis-regulated datasets. Finally we have launched a website optimized for mobile devices providing views of genes, variants and phenotypes. Our data is made available without restriction and all code is available from our GitHub organization site (http://github.com/Ensembl) under an Apache 2.0 license. PMID:26687719

  4. Input Decimated Ensembles

    NASA Technical Reports Server (NTRS)

    Tumer, Kagan; Oza, Nikunj C.; Clancy, Daniel (Technical Monitor)

    2001-01-01

    Using an ensemble of classifiers instead of a single classifier has been shown to improve generalization performance in many pattern recognition problems. However, the extent of such improvement depends greatly on the amount of correlation among the errors of the base classifiers. Therefore, reducing those correlations while keeping the classifiers' performance levels high is an important area of research. In this article, we explore input decimation (ID), a method which selects feature subsets for their ability to discriminate among the classes and uses them to decouple the base classifiers. We provide a summary of the theoretical benefits of correlation reduction, along with results of our method on two underwater sonar data sets, three benchmarks from the Proben1/UCI repositories, and two synthetic data sets. The results indicate that input decimated ensembles (IDEs) outperform ensembles whose base classifiers use all the input features; randomly selected subsets of features; and features created using principal components analysis, on a wide range of domains.
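
    A minimal sketch of input decimation on synthetic data: each base classifier is trained on the features most correlated with one class's indicator, and the decimated classifiers' probability outputs are averaged. Logistic regression and a simple correlation ranking are assumptions of this sketch, not details taken from the article.

```python
# Input decimation: one class-specific feature subset per base classifier, outputs averaged.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=400, n_features=30, n_informative=8,
                           n_classes=3, n_clusters_per_class=1, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

n_keep = 10
ensemble = []
for c in np.unique(y):
    corr = np.abs(np.corrcoef(Xtr.T, (ytr == c).astype(float))[-1, :-1])
    feats = np.argsort(corr)[-n_keep:]               # features most correlated with class c
    clf = LogisticRegression(max_iter=1000).fit(Xtr[:, feats], ytr)
    ensemble.append((feats, clf))

proba = np.mean([clf.predict_proba(Xte[:, f]) for f, clf in ensemble], axis=0)
print("input-decimated ensemble accuracy:", (proba.argmax(axis=1) == yte).mean())
```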

  5. Illumination correction of dyed fabrics approach using Bagging-based ensemble particle swarm optimization-extreme learning machine

    NASA Astrophysics Data System (ADS)

    Zhou, Zhiyu; Xu, Rui; Wu, Dichong; Zhu, Zefei; Wang, Haiyan

    2016-09-01

    Changes in illumination will result in serious color difference evaluation errors during the dyeing process. A Bagging-based ensemble extreme learning machine (ELM) mechanism hybridized with particle swarm optimization (PSO), namely Bagging-PSO-ELM, is proposed to develop an accurate illumination correction model for dyed fabrics. The model adopts PSO algorithm to optimize the input weights and hidden biases for the ELM neural network called PSO-ELM, which enhances the performance of ELM. Meanwhile, to further increase the prediction accuracy, a Bagging ensemble scheme is used to construct an independent PSO-ELM learning machine by taking bootstrap replicates of the training set. Then, the obtained multiple different PSO-ELM learners are aggregated to establish the prediction model. The proposed prediction model is evaluated with real dyed fabric images and discussed in comparison with several related methods. Experimental results show that the ensemble color constancy method is able to generate a more robust illuminant estimation model with better generalization performance.
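
    The sketch below shows only the Bagging layer of the scheme: bootstrap replicates of the training set, one small neural-network regressor per replicate (standing in for the PSO-tuned ELMs, with the PSO step omitted), and aggregation by averaging. Data are synthetic placeholders for the dyed-fabric colour features.

```python
# Minimal bagging sketch: bootstrap replicates, one regressor each, averaged predictions.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(6)
X = rng.uniform(0, 1, size=(300, 6))                  # illumination/colour features (toy)
y = X @ rng.normal(size=6) + 0.05 * rng.normal(size=300)

bags = []
for _ in range(10):
    idx = rng.integers(0, len(X), len(X))             # bootstrap replicate of the training set
    bags.append(MLPRegressor(hidden_layer_sizes=(20,), max_iter=2000,
                             random_state=0).fit(X[idx], y[idx]))

x_new = rng.uniform(0, 1, size=(3, 6))
print("bagged prediction:", np.mean([m.predict(x_new) for m in bags], axis=0))
```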

  6. Hydrologic ensembles based on convection-permitting precipitation nowcasts for flash flood warnings

    NASA Astrophysics Data System (ADS)

    Demargne, Julie; Javelle, Pierre; Organde, Didier; de Saint Aubin, Céline; Ramos, Maria-Helena

    2017-04-01

    In order to better anticipate flash flood events and provide timely warnings to communities at risk, the French national service in charge of flood forecasting (SCHAPI) is implementing a national flash flood warning system for small-to-medium ungauged basins. Based on a discharge-threshold flood warning method called AIGA (Javelle et al. 2014), the current version of the system runs a simplified hourly distributed hydrologic model with operational radar-gauge QPE grids from Météo-France at a 1-km2 resolution every 15 minutes. This produces real-time peak discharge estimates along the river network, which are subsequently compared to regionalized flood frequency estimates to provide warnings according to the AIGA-estimated return period of the ongoing event. To further extend the effective warning lead time while accounting for hydrometeorological uncertainties, the flash flood warning system is being enhanced to include Météo-France's AROME-NWC high-resolution precipitation nowcasts as time-lagged ensembles and multiple sets of hydrological regionalized parameters. The operational deterministic precipitation forecasts, from the nowcasting version of the AROME convection-permitting model (Auger et al. 2015), were provided at a 2.5-km resolution for a 6-hr forecast horizon for 9 significant rain events from September 2014 to June 2016. The time-lagged approach is a practical choice of accounting for the atmospheric forecast uncertainty when no extensive forecast archive is available for statistical modelling. The evaluation on 781 French basins showed significant improvements in terms of flash flood event detection and effective warning lead-time, compared to warnings from the current AIGA setup (without any future precipitation). We also discuss how to effectively communicate verification information to help determine decision-relevant warning thresholds for flood magnitude and probability. Javelle, P., Demargne, J., Defrance, D., Arnaud, P., 2014. Evaluating

  7. Comparison of Probabilistic Coastal Inundation Maps Based on Historical Storms and Statistically Modeled Storm Ensemble

    NASA Astrophysics Data System (ADS)

    Feng, X.; Sheng, Y.; Condon, A. J.; Paramygin, V. A.; Hall, T.

    2012-12-01

    which had been used for Western North Pacific (WNP) tropical cyclone (TC) genesis (Hall 2011) as well as North Atlantic tropical cyclone genesis (Hall and Jewson 2007). The introduction of these tracks complements the shortage of the historical samples and allows for more reliable PDFs required for implementation of JPM-OS. Using the 33,731 tracks and JPM-OS, an optimal storm ensemble is determined. This approach results in different storms/winds for storm surge and inundation modeling, and produces different Base Flood Elevation maps for coastal regions. Coastal inundation maps produced by the two different methods will be discussed in detail in the poster paper.

  8. Vision-based posture recognition using an ensemble classifier and a vote filter

    NASA Astrophysics Data System (ADS)

    Ji, Peng; Wu, Changcheng; Xu, Xiaonong; Song, Aiguo; Li, Huijun

    2016-10-01

    Posture recognition is an important mode of Human-Robot Interaction (HRI). To segment an effective posture from an image, we propose an improved region-growing algorithm combined with a single-Gaussian color model. The experiments show that the improved region-growing algorithm can extract a more complete and accurate posture than the traditional single-Gaussian model and region-growing algorithm, and it can eliminate similar regions from the background at the same time. In the posture recognition stage, in order to improve the recognition rate, we propose a CNN ensemble classifier, and in order to reduce misjudgments during continuous gesture control, a vote filter is proposed and applied to the sequence of recognition results. The proposed CNN ensemble classifier yields a 96.27% recognition rate, which is better than that of a single CNN classifier, and the proposed vote filter improves the recognition results and reduces misjudgments during consecutive gesture switches.
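
    The vote filter is not specified in detail here; the sketch below assumes a sliding-window majority vote over the frame-by-frame posture labels, which is one simple way to suppress isolated misclassifications during a continuous gesture. Window length and labels are illustrative.

```python
# Sliding-window majority vote over a sequence of per-frame recognition results.
from collections import Counter

def vote_filter(labels, window=5):
    half = window // 2
    out = []
    for i in range(len(labels)):
        win = labels[max(0, i - half): i + half + 1]
        out.append(Counter(win).most_common(1)[0][0])   # majority label in the window
    return out

raw = ["stop", "stop", "left", "stop", "stop", "go", "go", "stop", "go", "go"]
print(vote_filter(raw))   # isolated glitches such as the single 'left' are suppressed
```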

  9. Rotaxane-based molecular muscles.

    PubMed

    Bruns, Carson J; Stoddart, J Fraser

    2014-07-15

    CONSPECTUS: More than two decades of investigating the chemistry of bistable mechanically interlocked molecules (MIMs), such as rotaxanes and catenanes, has led to the advent of numerous molecular switches that express controlled translational or circumrotational movement on the nanoscale. Directed motion at this scale is an essential feature of many biomolecular assemblies known as molecular machines, which carry out essential life-sustaining functions of the cell. It follows that the use of bistable MIMs as artificial molecular machines (AMMs) has been long anticipated. This objective is rarely achieved, however, because of challenges associated with coupling the directed motions of mechanical switches with other systems on which they can perform work. A natural source of inspiration for designing AMMs is muscle tissue, since it is a material that relies on the hierarchical organization of molecular machines (myosin) and filaments (actin) to produce the force and motion that underpin locomotion, circulation, digestion, and many other essential life processes in humans and other animals. Muscle is characterized at both microscopic and macroscopic length scales by its ability to generate forces that vary the distance between two points at the expense of chemical energy. Artificial muscles that mimic this ability are highly sought for applications involving the transduction of mechanical energy. Rotaxane-based molecular switches are excellent candidates for artificial muscles because their architectures intrinsically possess movable filamentous molecular components. In this Account, we describe (i) the different types of rotaxane "molecular muscle" architectures that express contractile and extensile motion, (ii) the molecular recognition motifs and corresponding stimuli that have been used to actuate them, and (iii) the progress made on integrating and scaling up these motions for potential applications. We identify three types of rotaxane muscles, namely, "daisy

  10. Ensemble-Based Estimates of the Predictability of Wind-Driven Coastal Ocean Flow Over Topography

    DTIC Science & Technology

    2008-01-01

    [Only OCR fragments of this record's full text are available: reference entries — Anthes, R. A., Y. H. Kuo, D. P. Baumhefner, R. P. Errico, and T. W. Bettge, 1985: Predictability of mesoscale atmospheric motions, Adv. Geophys.; "...amplitude instabilities on a coastal upwelling front," J. Phys. Oceanogr., 37, 837-854; Errico, R. and D. Baumhefner, 1987: Predictability experiments using a... — and a figure caption describing RMS error in density (a-c; kg m−3 − 1000) and vector velocity (d-f; m s−1) for the persistence (thick solid line), control (dashed), and ensemble-mean forecasts.]

  11. GPU-Based Interactive Exploration and Online Probability Maps Calculation for Visualizing Assimilated Ocean Ensembles Data

    NASA Astrophysics Data System (ADS)

    Hoteit, I.; Hollt, T.; Hadwiger, M.; Knio, O. M.; Gopalakrishnan, G.; Zhan, P.

    2016-02-01

    Ocean reanalyses and forecasts are nowadays generated by combining ensemble simulations with data assimilation techniques. Most of these techniques resample the ensemble members after each assimilation cycle. Tracking behavior over time, such as all possible paths of a particle in an ensemble vector field, becomes very difficult, as the number of combinations rises exponentially with the number of assimilation cycles. In general a single possible path is not of interest but only the probabilities that any point in space might be reached by a particle at some point in time. We present an approach using probability-weighted piecewise particle trajectories to allow for interactive probability mapping. This is achieved by binning the domain and splitting up the tracing process into the individual assimilation cycles, so that particles that fall into the same bin after a cycle can be treated as a single particle with a larger probability as input for the next cycle. As a result we lose the ability to track individual particles, but can create probability maps for any desired seed at interactive rates. The technique is integrated in an interactive visualization system that enables the visual analysis of the particle traces side by side with other forecast variables, such as the sea surface height, and their corresponding behavior over time. By harnessing the power of modern graphics processing units (GPUs) for visualization as well as computation, our system allows the user to browse through the simulation ensembles in real-time, view specific parameter settings or simulation models and move between different spatial or temporal regions without delay. In addition our system provides advanced visualizations to highlight the uncertainty, or show the complete distribution of the simulations at user-defined positions over the complete time series of the domain.
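
    A minimal one-dimensional sketch of the probability-weighted piecewise-trajectory idea: after each assimilation cycle, particles that land in the same bin are merged into one particle whose probability is the sum, so the particle count stays bounded across cycles. Bin counts, member displacements and cycle counts are illustrative, not taken from the ocean ensemble.

```python
# Probability map over bins, with per-cycle merging of particles that share a bin.
import numpy as np

rng = np.random.default_rng(7)
n_bins, n_members, n_cycles = 50, 4, 3
prob = {10: 1.0}                                      # seed: all probability mass in bin 10

for _ in range(n_cycles):
    new_prob = {}
    displacements = rng.integers(-2, 3, size=n_members)  # bins moved by each ensemble member
    for b, p in prob.items():
        for d in displacements:                       # each member carries p / n_members
            nb = int(np.clip(b + d, 0, n_bins - 1))
            new_prob[nb] = new_prob.get(nb, 0.0) + p / n_members
    prob = new_prob                                   # merged per bin: bounded particle count

print({b: round(p, 3) for b, p in sorted(prob.items())})
```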

  12. Assessment of probability density function based on POD reduced-order model for ensemble-based data assimilation

    NASA Astrophysics Data System (ADS)

    Kikuchi, Ryota; Misaka, Takashi; Obayashi, Shigeru

    2015-10-01

    An integrated method of a proper orthogonal decomposition based reduced-order model (ROM) and data assimilation is proposed for the real-time prediction of an unsteady flow field. In this paper, a particle filter (PF) and an ensemble Kalman filter (EnKF) are compared for data assimilation and the difference in the predicted flow fields is evaluated focusing on the probability density function (PDF) of the model variables. The proposed method is demonstrated using identical twin experiments of an unsteady flow field around a circular cylinder at a Reynolds number of 1000. The PF and EnKF are employed to estimate temporal coefficients of the ROM based on the observed velocity components in the wake of the circular cylinder. The prediction accuracy of ROM-PF is significantly better than that of ROM-EnKF due to the flexibility of PF for representing a PDF compared to EnKF. Furthermore, the proposed method reproduces the unsteady flow field several orders of magnitude faster than the reference numerical simulation based on the Navier-Stokes equations.

  13. Development of web-based services for a novel ensemble flood forecasting and risk assessment system

    NASA Astrophysics Data System (ADS)

    He, Y.; Manful, D. Y.; Cloke, H. L.; Wetterhall, F.; Li, Z.; Bao, H.; Pappenberger, F.; Wesner, S.; Schubert, L.; Yang, L.; Hu, Y.

    2009-12-01

    Flooding is a widespread and devastating natural disaster worldwide. Floods that took place in the last decade in China were ranked the worst amongst recorded floods worldwide in terms of the number of human fatalities and economic losses (Munich Re-Insurance). Rapid economic development and population expansion into low-lying flood plains have worsened the situation. Current conventional flood prediction systems in China are neither suited to the perceptible climate variability nor the rapid pace of urbanization sweeping the country. Flood prediction, from short-term (a few hours) to medium-term (a few days), needs to be revisited and adapted to changing socio-economic and hydro-climatic realities. The latest technology requires implementation of multiple numerical weather prediction systems. The availability of twelve global ensemble weather prediction systems through the ‘THORPEX Interactive Grand Global Ensemble’ (TIGGE) offers a good opportunity for an effective state-of-the-art early forecasting system. A prototype of a Novel Flood Early Warning System (NEWS) using the TIGGE database is tested in the Huai River basin in east-central China. It is the first early flood warning system in China that uses the massive TIGGE database cascaded with river catchment models, the Xinanjiang hydrologic model and a 1-D hydraulic model, to predict river discharge and flood inundation. The NEWS algorithm is also designed to provide web-based services to a broad spectrum of end-users. The latter presents challenges as both databases and proprietary codes reside in different locations and converge at dissimilar times. NEWS will thus make use of a ready-to-run grid system that makes distributed computing and data resources available in a seamless and secure way. The ability to run on different operating systems and to provide an interface that is accessible to a broad spectrum of end-users is an additional requirement. The aim is to achieve robust

  14. Ensemble: a web-based system for psychology survey and experiment management.

    PubMed

    Tomic, Stefan T; Janata, Petr

    2007-08-01

    We provide a description of Ensemble, a suite of Web-integrated modules for managing and analyzing data associated with psychology experiments in a small research lab. The system delivers interfaces via a Web browser for creating and presenting simple surveys without the need to author Web pages and with little or no programming effort. The surveys may be extended by selecting and presenting auditory and/or visual stimuli with MATLAB and Flash to enable a wide range of psychophysical and cognitive experiments which do not require the recording of precise reaction times. Additionally, one is provided with the ability to administer and present experiments remotely. The software technologies employed by the various modules of Ensemble are MySQL, PHP, MATLAB, and Flash. The code for Ensemble is open source and available to the public, so that its functions can be readily extended by users. We describe the architecture of the system, the functionality of each module, and provide basic examples of the interfaces.

  15. Genre-based image classification using ensemble learning for online flyers

    NASA Astrophysics Data System (ADS)

    Pourashraf, Payam; Tomuro, Noriko; Apostolova, Emilia

    2015-07-01

    This paper presents an image classification model developed to classify images embedded in commercial real estate flyers. It is a component in a larger, multimodal system which uses texts as well as images in the flyers to automatically classify them by the property types. The role of the image classifier in the system is to provide the genres of the embedded images (map, schematic drawing, aerial photo, etc.), which are then combined with the texts in the flyer to perform the overall classification. In this work, we used an ensemble learning approach and developed a model where the outputs of an ensemble of support vector machines (SVMs) are combined by a k-nearest neighbor (KNN) classifier. In this model, the classifiers in the ensemble are strong classifiers, each of which is trained to predict a given/assigned genre. Our model is not only intuitive, taking advantage of the mutual distinctness of the image genres, but also scalable. We tested the model using over 3000 images extracted from online real estate flyers. The results showed that our model outperformed the baseline classifiers by a large margin.

  16. Maximum Likelihood Ensemble Filter-based Data Assimilation with HSPF for Improving Water Quality Forecasting

    NASA Astrophysics Data System (ADS)

    Kim, S.; Riazi, H.; Shin, C.; Seo, D.

    2013-12-01

    Due to the large dimensionality of the state vector and the sparsity of observations, the initial conditions (IC) of water quality models are subject to large uncertainties. To reduce the IC uncertainties in operational water quality forecasting, an ensemble data assimilation (DA) procedure for the Hydrologic Simulation Program - Fortran (HSPF) model has been developed and evaluated for the Kumho River Subcatchment of the Nakdong River Basin in Korea. The procedure, referred to herein as MLEF-HSPF, uses the maximum likelihood ensemble filter (MLEF), which combines strengths of variational assimilation (VAR) and the ensemble Kalman filter (EnKF). The control variables involved in the DA procedure include the bias correction factors for mean areal precipitation and mean areal potential evaporation, the hydrologic state variables, and the water quality state variables such as water temperature, dissolved oxygen (DO), biochemical oxygen demand (BOD), ammonium (NH4), nitrate (NO3), phosphate (PO4) and chlorophyll a (CHL-a). Due to the very large dimensionality of the inverse problem, accurately specifying the parameters for the DA procedure is a challenge. Systematic sensitivity analysis is carried out to identify the optimal parameter settings. To evaluate the robustness of MLEF-HSPF, we use multiple subcatchments of the Nakdong River Basin. In evaluation, we focus on the performance of MLEF-HSPF on the prediction of extreme water quality events.

  17. High teleportation rates using cold-atom-ensemble-based quantum repeaters with Rydberg blockade

    NASA Astrophysics Data System (ADS)

    Solmeyer, Neal; Li, Xiao; Quraishi, Qudsia

    2016-04-01

    We present a simplified version of a repeater protocol in a cold neutral-atom ensemble with Rydberg excitations optimized for two-node entanglement generation and describe a protocol for quantum teleportation. Our proposal draws from previous proposals [B. Zhao et al., Phys. Rev. A 81, 052329 (2010), 10.1103/PhysRevA.81.052329; Y. Han et al., Phys. Rev. A 81, 052311 (2010), 10.1103/PhysRevA.81.052311] that described efficient and robust protocols for long-distance entanglement with many nodes. Using realistic experimental values, we predict an entanglement generation rate of ˜25 Hz and a teleportation rate of ˜5 Hz . Our predicted rates match the current state-of-the-art experiments for entanglement generation and teleportation between quantum memories. With improved efficiencies we predict entanglement generation and teleportation rates of ˜7.8 and ˜3.6 kHz, respectively, representing a two-order-of-magnitude improvement over the currently realized values. Cold-atom ensembles with Rydberg excitations are promising candidates for repeater nodes because collective effects in the ensemble can be used to deterministically generate a long-lived ground-state memory which may be efficiently mapped onto a directionally emitted single photon.

  18. Ensembl 2014

    PubMed Central

    Flicek, Paul; Amode, M. Ridwan; Barrell, Daniel; Beal, Kathryn; Billis, Konstantinos; Brent, Simon; Carvalho-Silva, Denise; Clapham, Peter; Coates, Guy; Fitzgerald, Stephen; Gil, Laurent; Girón, Carlos García; Gordon, Leo; Hourlier, Thibaut; Hunt, Sarah; Johnson, Nathan; Juettemann, Thomas; Kähäri, Andreas K.; Keenan, Stephen; Kulesha, Eugene; Martin, Fergal J.; Maurel, Thomas; McLaren, William M.; Murphy, Daniel N.; Nag, Rishi; Overduin, Bert; Pignatelli, Miguel; Pritchard, Bethan; Pritchard, Emily; Riat, Harpreet S.; Ruffier, Magali; Sheppard, Daniel; Taylor, Kieron; Thormann, Anja; Trevanion, Stephen J.; Vullo, Alessandro; Wilder, Steven P.; Wilson, Mark; Zadissa, Amonida; Aken, Bronwen L.; Birney, Ewan; Cunningham, Fiona; Harrow, Jennifer; Herrero, Javier; Hubbard, Tim J.P.; Kinsella, Rhoda; Muffato, Matthieu; Parker, Anne; Spudich, Giulietta; Yates, Andy; Zerbino, Daniel R.; Searle, Stephen M.J.

    2014-01-01

    Ensembl (http://www.ensembl.org) creates tools and data resources to facilitate genomic analysis in chordate species with an emphasis on human, major vertebrate model organisms and farm animals. Over the past year we have increased the number of species that we support to 77 and expanded our genome browser with a new scrollable overview and improved variation and phenotype views. We also report updates to our core datasets and improvements to our gene homology relationships from the addition of new species. Our REST service has been extended with additional support for comparative genomics and ontology information. Finally, we provide updated information about our methods for data access and resources for user training. PMID:24316576

  19. Loss of conformational entropy in protein folding calculated using realistic ensembles and its implications for NMR-based calculations

    PubMed Central

    Baxa, Michael C.; Haddadian, Esmael J.; Jumper, John M.; Freed, Karl F.; Sosnick, Tobin R.

    2014-01-01

    The loss of conformational entropy is a major contribution in the thermodynamics of protein folding. However, accurate determination of the quantity has proven challenging. We calculate this loss using molecular dynamic simulations of both the native protein and a realistic denatured state ensemble. For ubiquitin, the total change in entropy is TΔSTotal = 1.4 kcal⋅mol−1 per residue at 300 K with only 20% from the loss of side-chain entropy. Our analysis exhibits mixed agreement with prior studies because of the use of more accurate ensembles and contributions from correlated motions. Buried side chains lose only a factor of 1.4 in the number of conformations available per rotamer upon folding (ΩU/ΩN). The entropy loss for helical and sheet residues differs due to the smaller motions of helical residues (TΔShelix−sheet = 0.5 kcal⋅mol−1), a property not fully reflected in the amide N-H and carbonyl C=O bond NMR order parameters. The results have implications for the thermodynamics of folding and binding, including estimates of solvent ordering and microscopic entropies obtained from NMR. PMID:25313044
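
    As an illustrative back-of-envelope check (not part of the cited work's analysis), the reported factor of 1.4 in conformations per buried rotamer can be converted to an entropic free-energy term with the Boltzmann relation, assuming T = 300 K and R = 1.987e-3 kcal mol−1 K−1:

```latex
% Boltzmann relation applied to the reported factor of 1.4 per buried rotamer
% (T = 300 K and R = 1.987e-3 kcal mol^-1 K^-1 are assumed values).
\[
  T\,\Delta S_{\mathrm{rotamer}}
  \;=\; R\,T \,\ln\!\frac{\Omega_U}{\Omega_N}
  \;\approx\; (0.596\ \mathrm{kcal\,mol^{-1}})\,\ln 1.4
  \;\approx\; 0.20\ \mathrm{kcal\,mol^{-1}} .
\]
```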

  20. Single-ensemble-based eigen-processing methods for color flow imaging--Part I. The Hankel-SVD filter.

    PubMed

    Yu, Alfred C H; Cobbold, Richard S C

    2008-03-01

    Because of their adaptability to the slow-time signal contents, eigen-based filters have shown potential in improving the flow detection performance of color flow images. This paper proposes a new eigen-based filter called the Hankel-SVD filter that is intended to process each slow-time ensemble individually. The new filter is derived using the notion of principal Hankel component analysis, and it achieves clutter suppression by retaining only the principal components whose order is greater than the clutter eigen-space dimension estimated from a frequency-based analysis algorithm. To assess its efficacy, the Hankel-SVD filter was first applied to synthetic slow-time data (ensemble size: 10) simulated from two different sets of flow parameters that model: 1) arterial imaging (blood velocity: 0 to 38.5 cm/s, tissue motion: up to 2 mm/s, transmit frequency: 5 MHz, pulse repetition period: 0.4 ms) and 2) deep vessel imaging (blood velocity: 0 to 19.2 cm/s, tissue motion: up to 2 cm/s, transmit frequency: 2 MHz, pulse repetition period: 2.0 ms). In the simulation analysis, the post-filter clutter-to-blood signal ratio (CBR) was computed as a function of blood velocity. Results show that for the same effective stopband size (50 Hz), the Hankel-SVD filter has a narrower transition region in the post-filter CBR curve than that of another type of adaptive filter called the clutter-downmixing filter. The practical efficacy of the proposed filter was tested by application to in vivo color flow data obtained from the human carotid arteries (transmit frequency: 4 MHz, pulse repetition period: 0.333 ms, ensemble size: 10). The resulting power images show that the Hankel-SVD filter can better distinguish between blood and moving-tissue regions (about 9 dB separation in power) than the clutter-downmixing filter and a fixed-rank multi-ensemble-based eigen-filter (which showed a 2 to 3 dB separation).
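
    The sketch below is a minimal numpy illustration of the Hankel-SVD filtering step on a single synthetic slow-time ensemble: build the Hankel matrix, discard the dominant singular components (assumed to span the clutter subspace), and recover the filtered signal by anti-diagonal averaging. The clutter rank is fixed by hand here rather than estimated by the frequency-based algorithm described in the paper, and the signal parameters are illustrative.

```python
# Hankel-SVD clutter filtering of one slow-time ensemble (toy complex data).
import numpy as np

def hankel_svd_filter(x, L, clutter_rank):
    N = len(x)
    H = np.array([x[i:i + N - L + 1] for i in range(L)])    # L x (N-L+1) Hankel matrix
    U, s, Vh = np.linalg.svd(H, full_matrices=False)
    Hf = sum(s[k] * np.outer(U[:, k], Vh[k]) for k in range(clutter_rank, len(s)))
    y = np.zeros(N, dtype=complex)
    counts = np.zeros(N)
    for i in range(L):                                       # anti-diagonal averaging
        for j in range(N - L + 1):
            y[i + j] += Hf[i, j]
            counts[i + j] += 1
    return y / counts

n = np.arange(10)                                            # ensemble size 10
clutter = 1.0 * np.exp(1j * 2 * np.pi * 0.01 * n)            # slow, strong clutter
blood = 0.05 * np.exp(1j * 2 * np.pi * 0.3 * n)              # faster, weak blood signal
x = clutter + blood
print(np.round(hankel_svd_filter(x, L=5, clutter_rank=1), 4))
```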

  1. An automatic electroencephalography blinking artefact detection and removal method based on template matching and ensemble empirical mode decomposition.

    PubMed

    Bizopoulos, Paschalis A; Al-Ani, Tarik; Tsalikakis, Dimitrios G; Tzallas, Alexandros T; Koutsouris, Dimitrios D; Fotiadis, Dimitrios I

    2013-01-01

    Electrooculographic (EOG) artefacts are one of the most common causes of Electroencephalogram (EEG) distortion. In this paper, we propose a method for EOG Blinking Artefact (BA) detection and removal from EEG. A Normalized Correlation Coefficient (NCC), based on a predetermined BA template library, was used to detect the BAs. Ensemble Empirical Mode Decomposition (EEMD) was applied to the contaminated region and a statistical algorithm determined which Intrinsic Mode Functions (IMFs) correspond to the BA. The proposed method was applied to simulated EEG signals, which were contaminated with artificially created EOG BAs, increasing the Signal-to-Error Ratio (SER) of the EEG Contaminated Region (CR) by 35 dB on average.
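
    A minimal sketch of the detection step only: a blink-artefact template is slid over an EEG trace and segments whose normalized correlation coefficient exceeds a threshold are flagged. The template shape, signal and threshold are synthetic placeholders, and the EEMD-based removal step is not shown.

```python
# Template matching with the normalized correlation coefficient (NCC).
import numpy as np

def ncc(a, b):
    a = a - a.mean()
    b = b - b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

fs = 250                                                     # sampling rate in Hz
t = np.arange(0, 4, 1 / fs)
template = np.exp(-((np.arange(0.4 * fs) - 0.2 * fs) ** 2) / (0.03 * fs) ** 2)  # toy blink shape
eeg = 5e-3 * np.random.default_rng(8).normal(size=t.size)
eeg[300:300 + template.size] += template                     # one injected blink artefact

hits = [i for i in range(0, eeg.size - template.size)
        if ncc(eeg[i:i + template.size], template) > 0.8]    # flagged start samples
print("blink detected near samples:", hits[:3], "...", hits[-3:] if hits else [])
```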

  2. Ensemble-based analysis of extreme precipitation events from 2007-2011

    NASA Astrophysics Data System (ADS)

    Lynch, Samantha

    From 2007 to 2011, 22 widespread, multiday rain events occurred across the United States. This study makes use of the European Centre for Medium-Range Weather Forecasts (ECMWF), the National Centers for Environmental Prediction (NCEP), and the United Kingdom Met Office (UKMET) ensemble prediction systems (EPS) in order to assess their forecast skill for these 22 widespread precipitation events. Overall, the ECMWF had a skillful forecast for almost every event, with the exception of the 25-30 June 2007 event, the mesoscale convective vortex (MCV) over the southern plains of the United States. Additionally, the ECMWF EPS generally outperformed both the NCEP and UKMET EPS. To better evaluate the ECMWF, two widespread, multiday precipitation events were selected for closer examination: 29 April-4 May 2010 and 23-28 April 2011. The 29 April-4 May 2010 case study used ECMWF ensemble forecasts to explore the processes responsible for the development and maintenance of a multiday precipitation event that occurred in early May 2010, due to two successive quasi-stationary mesoscale convective systems. Locations in central Tennessee accumulated more than 483 millimeters (19 inches) of rain and the city of Nashville experienced a historic flash flood. Differences between ensemble members that correctly predicted heavy precipitation and those that did not were examined in order to determine the processes that were favorable or detrimental to the system's development. Statistical analysis was used to determine how synoptic-scale flows were correlated with area-averaged precipitation. For this particular case, the distribution of precipitation was found to be closely related to the strength of an upper-level trough in the central United States and an associated surface cyclone, with a weaker trough and cyclone being associated with more precipitation in the area of interest. The 23-28 April 2011 case study also used ECMWF ensemble forecasts to explore the processes

  3. Evaluation and Sensitivity Analysis of An Ensemble-based Coupled Flash Flood and Landslide Modelling System Using Remote Sensing Forcing

    NASA Astrophysics Data System (ADS)

    Zhang, K.; Hong, Y.; Gourley, J. J.; Xue, X.; He, X.

    2015-12-01

    Heavy rainfall-triggered landslides are often associated with flood events and cause additional loss of life and property. It is pertinent to build a robust coupled flash flood and landslide disaster early warning system for disaster preparedness and hazard management. In this study, we built an ensemble-based coupled flash flood and landslide disaster early warning system, intended for operational use by the US National Weather Service, by integrating the Coupled Routing and Excess STorage (CREST) model and the Sacramento Soil Moisture Accounting Model (SAC-SMA) with the physically based SLope-Infiltration-Distributed Equilibrium (SLIDE) landslide prediction model. We further evaluated this ensemble-based prototype warning system by conducting multi-year simulations driven by the Multi-Radar Multi-Sensor (MRMS) rainfall estimates in North Carolina and Oregon. We comprehensively evaluated the predictive capabilities of this system against observed and reported flood and landslide events. We then evaluated the sensitivity of the coupled system to the simulated hydrological processes. Our results show that the system is generally capable of making accurate predictions of flash flood and landslide events in terms of their locations and time of occurrence. The occurrence of predicted landslides shows high sensitivity to total infiltration and soil water content, highlighting the importance of accurately simulating the hydrological processes for the accurate forecasting of rainfall-triggered landslide events.

  4. Synthetic molecular systems based on porphyrins as models for the study of energy transfer in photosynthesis

    NASA Astrophysics Data System (ADS)

    Konovalova, Nadezhda V.; Evstigneeva, Rima P.; Luzgina, Valentina N.

    2001-11-01

    The published data on the synthesis and photochemical properties of porphyrin-based molecular ensembles which represent models of natural photosynthetic light-harvesting complexes are generalised and systematised. The dependence of the transfer of excitation energy on the distance between donor and acceptor components, their mutual arrangement, electronic and environmental factors are discussed. Two mechanisms of energy transfer reactions, viz., 'through space' and 'through bond', are considered. The bibliography includes 96 references.

  5. Selective ensemble modeling load parameters of ball mill based on multi-scale frequency spectral features and sphere criterion

    NASA Astrophysics Data System (ADS)

    Tang, Jian; Yu, Wen; Chai, Tianyou; Liu, Zhuo; Zhou, Xiaojie

    2016-01-01

    It is difficult to model multi-frequency signals, such as the mechanical vibration and acoustic signals of a wet ball mill in the mineral grinding process. In this paper, these signals are decomposed into multi-scale intrinsic mode functions (IMFs) by the empirical mode decomposition (EMD) technique. A new adaptive multi-scale spectral features selection approach based on the sphere criterion (SC) is applied to the frequency spectra of these IMFs. The candidate sub-models are constructed by the partial least squares (PLS) with the selected features. Finally, the branch and bound based selective ensemble (BBSEN) algorithm is applied to select and combine these ensemble sub-models. This method can be easily extended to regression and classification problems with multi-time-scale signals. We successfully apply this approach to a laboratory-scale ball mill. The shell vibration and acoustic signals are used to model mill load parameters. The experimental results demonstrate that this novel approach is more effective than the other modeling methods based on multi-scale frequency spectral features.
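
    A minimal sketch of the 'candidate sub-models plus selective ensemble' idea is given below, using scikit-learn's PLSRegression on synthetic feature blocks that stand in for multi-scale frequency-spectral features. The EMD decomposition, the sphere-criterion feature selection and the branch-and-bound selection of the actual method are not reproduced; the median-error selection rule used here is only a placeholder.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(0)

# Synthetic stand-in for multi-scale spectral features: one feature block per "IMF scale".
n_samples, scales, feats_per_scale = 120, 4, 10
X_blocks = [rng.normal(size=(n_samples, feats_per_scale)) for _ in range(scales)]
y = X_blocks[0][:, 0] + 0.5 * X_blocks[2][:, 1] + 0.1 * rng.normal(size=n_samples)

# One candidate PLS sub-model per scale (per decomposed frequency band).
sub_models = []
for X in X_blocks:
    model = PLSRegression(n_components=2)
    model.fit(X, y)
    sub_models.append(model)

# Simple selective ensemble: keep the sub-models whose training error is below the median,
# then average their predictions (the paper uses a branch-and-bound selection instead).
errors = [np.mean((m.predict(X).ravel() - y) ** 2) for m, X in zip(sub_models, X_blocks)]
keep = [i for i, e in enumerate(errors) if e <= np.median(errors)]
ensemble_pred = np.mean([sub_models[i].predict(X_blocks[i]).ravel() for i in keep], axis=0)
print("kept sub-models:", keep, " ensemble MSE:", np.mean((ensemble_pred - y) ** 2))
```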

  6. Temperature fluctuations in a changing climate: an ensemble-based experimental approach.

    PubMed

    Vincze, Miklós; Borcia, Ion Dan; Harlander, Uwe

    2017-03-21

    There is an ongoing debate in the literature about whether the present global warming is increasing local and global temperature variability. The central methodological issues of this debate relate to the proper treatment of normalised temperature anomalies and trends in the studied time series, which may be difficult to separate from time-evolving fluctuations. Some argue that temperature variability is indeed increasing globally, whereas others conclude it is decreasing or remains practically unchanged. Meanwhile, a consensus appears to emerge that local variability in certain regions (e.g. Western Europe and North America) has indeed been increasing in the past 40 years. Here we investigate the nature of connections between external forcing and climate variability conceptually by using a laboratory-scale minimal model of mid-latitude atmospheric thermal convection subject to continuously decreasing 'equator-to-pole' temperature contrast ΔT, mimicking climate change. The analysis of temperature records from an ensemble of experimental runs ('realisations') all driven by identical time-dependent external forcing reveals that the collective variability of the ensemble and that of individual realisations may be markedly different - a property to be considered when interpreting climate records.

  7. Streamflow Simulations for the Mississippi River Basin Based on Ensemble Regional Climate Model Simulations

    NASA Astrophysics Data System (ADS)

    Arritt, R. W.; Jha, M.; Takle, E. S.; Gu, R.

    2004-12-01

    Ensemble simulations provide a useful tool for studying uncertainties in climate projections and for deriving probabilistic information from deterministic forecasts. Although a number of studies have examined variability within climate models, fewer have quantified the extent to which variability and uncertainty in climate simulations then propagates through impacts models. Here we evaluate the variability in simulated streamflow that results from taking the streamflow model's inputs from different members of an ensemble of simulations by a decadal-scale nested regional climate model. The regional climate model, RegCM3, simulated a domain covering the continental U.S. and most of Mexico for the period 1986-2003 using initial and lateral boundary conditions from the NCEP-DOE Reanalysis 2. Three RegCM3 realizations were created, each initialized one month apart but otherwise identical in configuration so that their collective behavior provides a measure of internal variability of the climate model. RegCM3 output for daily precipitation, temperature, and radiation was then used as input to the Soil and Water Assessment Tool (SWAT) over the upper Mississippi River basin. Seasonal and interannual variability of SWAT-predicted streamflow indicates that the internal variability of the RegCM3 climate model carries through to produce spread in simulated streamflow from SWAT.

  8. The diagnostics of diabetes mellitus based on ensemble modeling and hair/urine element level analysis.

    PubMed

    Chen, Hui; Tan, Chao; Lin, Zan; Wu, Tong

    2014-07-01

    The present work explores the feasibility of analyzing the relationship between diabetes mellitus and several element levels in hair/urine specimens by chemometrics. A dataset involving 211 specimens and eight element concentrations was used. The control group was divided into three age subsets in order to analyze the influence of age. It was found that the most obvious difference was the effect of age on the level of zinc and iron. The decline of iron concentration with age in hair was exactly mirrored by the opposite trend in urine. Principal component analysis (PCA) was used as a tool for a preliminary evaluation of the data. Both ensemble and single support vector machine (SVM) algorithms were used as the classification tools. On average, the accuracy, sensitivity and specificity of ensemble SVM models were 99%, 100%, 99% and 97%, 89%, 99% for hair and urine samples, respectively. The findings indicate that hair samples are superior to urine samples. Even so, simultaneously analyzing hair and urine samples can provide more valuable information for the prevention, diagnosis, treatment and research of diabetes.
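
    A minimal sketch of an ensemble of SVM classifiers is shown below, using scikit-learn's BaggingClassifier (the keyword is 'estimator' in recent scikit-learn releases) on synthetic data standing in for the 211 specimens with eight element concentrations; it illustrates the general ensemble-versus-single-SVM comparison, not the authors' exact ensemble scheme.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for 211 specimens x 8 element concentrations.
X, y = make_classification(n_samples=211, n_features=8, n_informative=5,
                           weights=[0.6, 0.4], random_state=0)

single_svm = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10, gamma="scale"))
ensemble_svm = BaggingClassifier(estimator=single_svm, n_estimators=25, random_state=0)

# Compare a single SVM with the bagged SVM ensemble by cross-validated accuracy.
for name, model in [("single SVM", single_svm), ("ensemble SVM", ensemble_svm)]:
    acc = cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    print(f"{name:12s} 5-fold accuracy: {acc:.3f}")
```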

  9. Motion based markerless gait analysis using standard events of gait and ensemble Kalman filtering.

    PubMed

    Vishnoi, Nalini; Mitra, Anish; Duric, Zoran; Gerber, Naomi Lynn

    2014-01-01

    We present a novel approach to gait analysis using ensemble Kalman filtering which permits markerless determination of segmental movement. We use image flow analysis to reliably compute temporal and kinematic measures including the translational velocity of the torso and rotational velocities of the lower leg segments. Detecting the instances where velocity changes direction also determines the standard events of a gait cycle (double-support, toe-off, mid-swing and heel-strike). In order to determine the kinematics of lower limbs, we model the synergies between the lower limb motions (thigh-shank, shank-foot) by building a nonlinear dynamical system using CMU's 3D motion capture database. This information is fed into the ensemble Kalman filter framework to estimate the unobserved limb (upper leg and foot) motion from the measured lower leg rotational velocity. Our approach does not require calibrated cameras or special markers to capture movement. We have tested our method on different gait sequences collected in the sagittal plane and presented the estimated kinematics overlaid on the original image frames. We have also validated our approach by manually labeling the videos and comparing our results against them.
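
    For readers unfamiliar with the filter itself, the following sketch shows one stochastic ensemble Kalman filter analysis step on a toy two-variable state; it is a generic EnKF update, not the gait-specific model used in the paper.

```python
import numpy as np

def enkf_update(ensemble, obs, obs_operator_H, obs_var, rng):
    """One stochastic EnKF analysis step.

    ensemble       : (n_members, n_state) array of forecast states
    obs            : (n_obs,) observation vector
    obs_operator_H : (n_obs, n_state) linear observation operator
    obs_var        : observation error variance (scalar, assumed uncorrelated)
    """
    n_members = ensemble.shape[0]
    X = ensemble - ensemble.mean(axis=0)                 # state anomalies
    HX = ensemble @ obs_operator_H.T                     # ensemble mapped to observation space
    HA = HX - HX.mean(axis=0)
    P_xy = X.T @ HA / (n_members - 1)                    # state-observation cross covariance
    P_yy = HA.T @ HA / (n_members - 1) + obs_var * np.eye(len(obs))
    K = P_xy @ np.linalg.inv(P_yy)                       # Kalman gain
    # Perturb the observation for each member (stochastic EnKF variant).
    perturbed = obs + rng.normal(scale=np.sqrt(obs_var), size=(n_members, len(obs)))
    return ensemble + (perturbed - HX) @ K.T

rng = np.random.default_rng(1)
ens = rng.normal(loc=[1.0, 0.0], scale=0.5, size=(50, 2))   # 50 members, 2 state variables
H = np.array([[1.0, 0.0]])                                   # only the first variable is observed
updated = enkf_update(ens, obs=np.array([1.4]), obs_operator_H=H, obs_var=0.05, rng=rng)
print("prior mean:", ens.mean(axis=0), " posterior mean:", updated.mean(axis=0))
```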

  10. Comment on ``Preserving the Boltzmann ensemble in replica-exchange molecular dynamics'' [J. Chem. Phys. 129, 164112 (2008)

    NASA Astrophysics Data System (ADS)

    Fukuda, Ikuo

    2010-03-01

    A brief discussion of the ergodic description of constant temperature molecular dynamics (MD) is provided; the discussion is based on the analysis of criticisms raised in a recent paper [B. Cooke and S. C. Schmidler, J. Chem. Phys. 129, 164112 (2008)]. In the paper, the following criticisms relating to the basic concepts of constant temperature MD are made in a mathematical manner: (I) the Nosé-Hoover (NH) equation is not measure-preserving; (II) the NH system and the NH chain system are not ergodic under the Boltzmann measure; and (III) the Nosé Hamiltonian system as well as the Nosé-Poincaré Hamiltonian system is not ergodic. In this comment, I show the need to reconsider these criticisms. The NH equation is measure-preserving, where the measure carries the Boltzmann-Gibbs density; this fact provides the compatibility between the MD equation and the Boltzmann-Gibbs distribution. The arguments advanced in support of the above criticisms are unsound; the ergodicities of those systems are still not theoretically judged. I discuss exact ergodic-theoretical expressions appropriate for constant temperature MD, and explain the reasons behind these misconceptions.
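
    As background to the thermostat equations discussed above, the sketch below integrates the Nosé-Hoover equations of motion for a single one-dimensional harmonic oscillator with a simple explicit scheme; the parameters are arbitrary, and the single-oscillator case is precisely the textbook example for which ergodicity is questionable, which is why Nosé-Hoover chains were introduced.

```python
import numpy as np

def nose_hoover_oscillator(steps=200000, dt=1e-3, kT=1.0, Q=1.0, m=1.0, k=1.0):
    """Nosé-Hoover thermostatted 1D harmonic oscillator (simple explicit scheme).

    Equations of motion: dq/dt = p/m, dp/dt = -k q - xi p, dxi/dt = (p^2/m - kT)/Q.
    """
    q, p, xi = 1.0, 0.0, 0.0
    kinetic = []
    for _ in range(steps):
        force = -k * q
        p += dt * (force - xi * p)
        q += dt * p / m
        xi += dt * (p * p / m - kT) / Q
        kinetic.append(0.5 * p * p / m)
    return np.mean(kinetic)

# For a 1D system the average kinetic energy should approach kT/2 if sampling were ergodic;
# the single harmonic oscillator is the classic case where a plain NH thermostat may fail to
# visit the full canonical distribution.
print("mean kinetic energy:", nose_hoover_oscillator())
```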

  11. Multi-objective Optimization Based Calibration of Hydrologic Model and Ensemble Hydrologic Forecast for Java Island, Indonesia

    NASA Astrophysics Data System (ADS)

    Yanto, M.; Kasprzyk, J. R.; Rajagopalan, B.; Livneh, B.

    2016-12-01

    This study explores the benefits of multi-objective optimization of the Variable Infiltration Capacity (VIC) model for five watersheds in Java, the most populous island in Indonesia. Six objective functions: Nash-Sutcliffe Efficiency (NSE), percent bias (PBIAS), logarithmic function of root mean square error (Log-RMSE), predictive efficiency (Pe), percent errors in peak (PEP) and slope of flow duration curve error (SFDCE) were selected as evaluation metrics. These metrics were optimized by tuning four VIC model parameters: infiltration shape parameter (b), fraction of maximum baseflow where nonlinear baseflow begins (Ds), thickness of soil layer 2 (thick2) and thickness of soil layer 3 (thick3). We employed the Borg Multiobjective Evolutionary Algorithm (Borg MOEA), an automatic simulation-optimization algorithm, to search for non-dominated solutions. We identified the redundancy between NSE and Log-RMSE, Pe, and PEP through visual inspection of their sensitivity to parameters b and Ds of the VIC model and to the baseflow index (BFI). Accordingly, we proposed NSE, PBIAS and SFDCE as critical objective functions to represent hydrologic processes in the tropical region of Java, Indonesia. Using these three objective functions, we culled the solutions with thresholds of NSE > 0.75, PBIAS < 15% and SFDCE < 15%, and showed that the number of optimized solutions, as well as model parameter sets in the ensemble, is reduced while the value ranges are maintained. We used the culled model parameters to run the VIC model using an ensemble of conditioned seasonal climate forecasts to generate an ensemble streamflow prediction in the period 2001 - 2010, the time window when the seasonal climate forecasts and observed streamflow records overlap. We measured the skill of this seasonal forecast by computing the rank probability skill score (RPSS) of seasonal total flows and extremes at three different thresholds, for the dry and wet seasons. We showed that the RPSS of seasonal flows
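
    Two of the calibration metrics named above, NSE and PBIAS, can be computed with a few lines of code, as sketched below; the sign convention for PBIAS and the toy flow values are assumptions for illustration, and the flow-duration-curve metric (SFDCE) is not reproduced.

```python
import numpy as np

def nse(sim, obs):
    """Nash-Sutcliffe Efficiency: 1 - sum((sim-obs)^2) / sum((obs-mean(obs))^2)."""
    sim, obs = np.asarray(sim, float), np.asarray(obs, float)
    return 1.0 - np.sum((sim - obs) ** 2) / np.sum((obs - obs.mean()) ** 2)

def pbias(sim, obs):
    """Percent bias: 100 * sum(sim - obs) / sum(obs); positive means overestimation here."""
    sim, obs = np.asarray(sim, float), np.asarray(obs, float)
    return 100.0 * np.sum(sim - obs) / np.sum(obs)

# Toy observed and simulated flows.
obs = np.array([12.0, 30.0, 55.0, 41.0, 18.0, 9.0])
sim = np.array([10.0, 33.0, 50.0, 45.0, 20.0, 8.0])
print(f"NSE = {nse(sim, obs):.3f}, PBIAS = {pbias(sim, obs):.1f}%")
# A behavioural-solution filter such as the one in the study would keep parameter sets
# with, for example, NSE > 0.75 and |PBIAS| < 15%.
```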

  12. Multi-Conformer Ensemble Docking to Difficult Protein Targets

    SciTech Connect

    Ellingson, Sally R.; Miao, Yinglong; Baudry, Jerome; Smith, Jeremy C.

    2014-09-08

    We investigate large-scale ensemble docking using five proteins from the Directory of Useful Decoys (DUD, dud.docking.org) for which docking to crystal structures has proven difficult. Molecular dynamics trajectories are produced for each protein and an ensemble of representative conformational structures extracted from the trajectories. Docking calculations are performed on these selected simulation structures and ensemble-based enrichment factors compared with those obtained using docking in crystal structures of the same protein targets or random selection of compounds. We also found simulation-derived snapshots with improved enrichment factors that increased the chemical diversity of docking hits for four of the five selected proteins. A combination of all the docking results obtained from molecular dynamics simulation followed by selection of top-ranking compounds appears to be an effective strategy for increasing the number and diversity of hits when using docking to screen large libraries of chemicals against difficult protein targets.

  13. Multi-conformer ensemble docking to difficult protein targets.

    PubMed

    Ellingson, Sally R; Miao, Yinglong; Baudry, Jerome; Smith, Jeremy C

    2015-01-22

    Large-scale ensemble docking is investigated using five proteins from the Directory of Useful Decoys (DUD, dud.docking.org ) for which docking to crystal structures has proven difficult. Molecular dynamics trajectories are produced for each protein and an ensemble of representative conformational structures extracted from the trajectories. Docking calculations are performed on these selected simulation structures and ensemble-based enrichment factors compared with those obtained using docking in crystal structures of the same protein targets or random selection of compounds. Simulation-derived snapshots are found with improved enrichment factors that increase the chemical diversity of docking hits for four of the five selected proteins. A combination of all the docking results obtained from molecular dynamics simulation followed by selection of top-ranking compounds appears to be an effective strategy for increasing the number and diversity of hits when using docking to screen large libraries of chemicals against difficult protein targets.
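
    The enrichment factor used to score these ensemble docking experiments can be illustrated with a short sketch; the 'top-x% of the ranked database' definition used below is one common variant and may differ in detail from the one used in the study.

```python
def enrichment_factor(ranked_is_active, fraction=0.01):
    """Enrichment factor at a given fraction of the ranked database.

    ranked_is_active : list of booleans, best-scoring compound first
    EF = (actives found in top fraction / compounds in top fraction)
         / (total actives / total compounds)
    """
    n_total = len(ranked_is_active)
    n_top = max(1, int(round(fraction * n_total)))
    hits_top = sum(ranked_is_active[:n_top])
    hits_total = sum(ranked_is_active)
    return (hits_top / n_top) / (hits_total / n_total)

# Toy ranked screen: 1000 compounds, 50 actives, 10 of which land in the top 1%.
ranking = [True] * 10 + [False] * 950 + [True] * 40
print(f"EF(1%) = {enrichment_factor(ranking, 0.01):.1f}")   # (10/10)/(50/1000) = 20
```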

  14. Fluid trajectory evaluation based on an ensemble-averaged cross-correlation in time-resolved PIV

    NASA Astrophysics Data System (ADS)

    Jeon, Young Jin; Chatellier, Ludovic; David, Laurent

    2014-07-01

    A novel multi-frame particle image velocimetry (PIV) method, able to evaluate a fluid trajectory by means of an ensemble-averaged cross-correlation, is introduced. The method integrates the advantages of the state-of-the-art time-resolved PIV (TR-PIV) methods to further enhance both robustness and dynamic range. The fluid trajectory follows a polynomial model with a prescribed order. A set of polynomial coefficients, which maximizes the ensemble-averaged cross-correlation value across the frames, is regarded as the most appropriate solution. To achieve a convergence of the trajectory in terms of polynomial coefficients, an ensemble-averaged cross-correlation map is constructed by sampling cross-correlation values near the predictor trajectory with respect to an imposed change of each polynomial coefficient. A relation between the given change and corresponding cross-correlation maps, which could be calculated from the ordinary cross-correlation, is derived. A disagreement between computational domain and corresponding physical domain is compensated by introducing the Jacobian matrix based on the image deformation scheme in accordance with the trajectory. An increased cost of the convergence calculation, associated with the nonlinearity of the fluid trajectory, is moderated by means of a V-cycle iteration. To validate enhancements of the present method, quantitative comparisons with the state-of-the-art TR-PIV methods, e.g., the adaptive temporal interval, the multi-frame pyramid correlation and the fluid trajectory correlation, were carried out by using synthetically generated particle image sequences. The performances of the tested methods are discussed in algorithmic terms. A high-rate TR-PIV experiment of a flow over an airfoil demonstrates the effectiveness of the present method. It is shown that the present method is capable of reducing random errors in both velocity and material acceleration while suppressing spurious temporal fluctuations due to measurement noise.
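
    The core idea of averaging cross-correlation maps over several frame pairs before locating the displacement peak can be sketched as below; the polynomial trajectory model, image deformation and V-cycle iteration of the actual method are omitted, and the synthetic particle images are purely illustrative.

```python
import numpy as np
from scipy.signal import correlate

rng = np.random.default_rng(2)

def make_frame(particles, shift):
    """Render a small synthetic particle image with a uniform shift (pixels)."""
    img = np.zeros((64, 64))
    ys, xs = (particles + shift).T
    for y, x in zip(ys.astype(int) % 64, xs.astype(int) % 64):
        img[y, x] += 1.0
    return img + 0.05 * rng.normal(size=img.shape)    # add sensor-like noise

particles = rng.uniform(0, 64, size=(150, 2))
true_shift = np.array([3.0, 5.0])                      # (dy, dx) per frame interval
frames = [make_frame(particles, k * true_shift) for k in range(4)]

# Ensemble-average the cross-correlation maps of successive frame pairs,
# then read the displacement from the averaged peak (more robust than a single pair).
acc = np.zeros((127, 127))
for a, b in zip(frames[:-1], frames[1:]):
    acc += correlate(b - b.mean(), a - a.mean(), mode="full")
peak = np.unravel_index(np.argmax(acc), acc.shape)
dy, dx = peak[0] - 63, peak[1] - 63
print("estimated displacement per frame:", (dy, dx), " true:", tuple(true_shift))
```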

  15. BWM*: A Novel, Provable, Ensemble-based Dynamic Programming Algorithm for Sparse Approximations of Computational Protein Design.

    PubMed

    Jou, Jonathan D; Jain, Swati; Georgiev, Ivelin S; Donald, Bruce R

    2016-06-01

    Sparse energy functions that ignore long range interactions between residue pairs are frequently used by protein design algorithms to reduce computational cost. Current dynamic programming algorithms that fully exploit the optimal substructure produced by these energy functions only compute the GMEC. This disproportionately favors the sequence of a single, static conformation and overlooks better binding sequences with multiple low-energy conformations. Provable, ensemble-based algorithms such as A* avoid this problem, but A* cannot guarantee better performance than exhaustive enumeration. We propose a novel, provable, dynamic programming algorithm called Branch-Width Minimization* (BWM*) to enumerate a gap-free ensemble of conformations in order of increasing energy. Given a branch-decomposition of branch-width w for an n-residue protein design with at most q discrete side-chain conformations per residue, BWM* returns the sparse GMEC in O([Formula: see text]) time and enumerates each additional conformation in merely O([Formula: see text]) time. We define a new measure, Total Effective Search Space (TESS), which can be computed efficiently a priori before BWM* or A* is run. We ran BWM* on 67 protein design problems and found that TESS discriminated between BWM*-efficient and A*-efficient cases with 100% accuracy. As predicted by TESS and validated experimentally, BWM* outperforms A* in 73% of the cases and computes the full ensemble or a close approximation faster than A*, enumerating each additional conformation in milliseconds. Unlike A*, the performance of BWM* can be predicted in polynomial time before running the algorithm, which gives protein designers the power to choose the most efficient algorithm for their particular design problem.

  16. Carbon Nanotube Based Molecular Electronics

    NASA Technical Reports Server (NTRS)

    Srivastava, Deepak; Saini, Subhash; Menon, Madhu

    1998-01-01

    Carbon nanotubes and the nanotube heterojunctions have recently emerged as excellent candidates for nanoscale molecular electronic device components. Experimental measurements on the conductivity, rectifying behavior and conductivity-chirality correlation have also been made. While quasi-one-dimensional simple heterojunctions between nanotubes with different electronic behavior can be generated by introduction of a pair of heptagon-pentagon defects in an otherwise all-hexagon graphene sheet, other complex 3- and 4-point junctions may require other mechanisms. Structural stability as well as local electronic density of states of various nanotube junctions are investigated using a generalized tight-binding molecular dynamics (GDBMD) scheme that incorporates non-orthogonality of the orbitals. The junctions investigated include straight and small angle heterojunctions of various chiralities and diameters; as well as more complex 'T' and 'Y' junctions which do not always obey the usual pentagon-heptagon pair rule. The study of local density of states (LDOS) reveals many interesting features, most prominent among them being the defect-induced states in the gap. The proposed three- and four-point junctions are among the smallest possible tunnel junctions made entirely of carbon atoms. Furthermore, the electronic behavior of the nanotube based device components can be tailored by doping with group III-V elements such as B and N, and BN nanotubes, a wide-band-gap semiconductor, have also been realized in experiments. Structural properties of heteroatomic nanotubes comprising C, B and N will be discussed.

  17. A modified Shockley equation taking into account the multi-element nature of light emitting diodes based on nanowire ensembles

    NASA Astrophysics Data System (ADS)

    Musolino, M.; Tahraoui, A.; van Treeck, D.; Geelhaar, L.; Riechert, H.

    2016-07-01

    In this work we study how the multi-element nature of light emitting diodes (LEDs) based on nanowire (NW) ensembles influences their current voltage (I-V) characteristics. We systematically address critical issues of the fabrication process that can result in significant fluctuations of the electrical properties among the individual NWs in such LEDs, paying particular attention to the planarization step. Electroluminescence (EL) maps acquired for two nominally identical NW-LEDs reveal that small processing variations can result in a large difference in the number of individual nano-devices emitting EL. The lower number of EL spots in one of the LEDs is caused by its inhomogeneous electrical properties. The I-V characteristics of this LED cannot be described well by the classical Shockley model. We are able to take into account the multi-element nature of such LEDs and fit the I-V characteristics in the forward bias regime by employing an ad hoc adjusted version of the Shockley equation. More specifically, we introduce a bias dependence of the ideality factor. The basic considerations of our model should remain valid also for other types of devices based on ensembles of interconnected p-n junctions with inhomogeneous electrical properties, regardless of the employed material system.
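
    The idea of a bias-dependent ideality factor can be sketched as below; the linear form n(V) = n0 + alpha*V and all parameter values are illustrative assumptions, not the functional form fitted by the authors.

```python
import numpy as np

Q_E = 1.602176634e-19   # elementary charge [C]
K_B = 1.380649e-23      # Boltzmann constant [J/K]

def diode_current(v, i0=1e-12, n=2.0, temp=300.0):
    """Classical Shockley equation with a constant ideality factor n."""
    vt = K_B * temp / Q_E
    return i0 * (np.exp(v / (n * vt)) - 1.0)

def nw_ensemble_current(v, i0=1e-12, n0=1.5, alpha=0.8, temp=300.0):
    """Shockley-like equation with a bias-dependent ideality factor n(V) = n0 + alpha*V.

    The linear form of n(V) is only an illustration of letting the ideality factor vary
    with forward bias, mimicking an ensemble of inhomogeneous p-n junctions that turn on
    at different voltages.
    """
    vt = K_B * temp / Q_E
    n_of_v = n0 + alpha * v
    return i0 * (np.exp(v / (n_of_v * vt)) - 1.0)

for v in (0.5, 1.0, 1.5, 2.0):
    print(f"V = {v:.1f} V  classical: {diode_current(v):.3e} A   "
          f"bias-dependent n: {nw_ensemble_current(v):.3e} A")
```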

  18. Assessing a robust ensemble-based Kalman filter for efficient ecosystem data assimilation of the Cretan Sea

    NASA Astrophysics Data System (ADS)

    Triantafyllou, G.; Hoteit, I.; Luo, X.; Tsiaras, K.; Petihakis, G.

    2013-09-01

    An application of an ensemble-based robust filter for data assimilation into an ecosystem model of the Cretan Sea is presented and discussed. The ecosystem model comprises two on-line coupled sub-models: the Princeton Ocean Model (POM) and the European Regional Seas Ecosystem Model (ERSEM). The filtering scheme is based on the Singular Evolutive Interpolated Kalman (SEIK) filter which is implemented with a time-local H∞ filtering strategy to enhance robustness and performance during periods of strong ecosystem variability. Assimilation experiments in the Cretan Sea indicate that robustness can be achieved in the SEIK filter by introducing an adaptive inflation scheme of the modes of the filter error covariance matrix. Twin-experiments are performed to evaluate the performance of the assimilation system and to study the benefits of using robust filtering in an ensemble filtering framework. Pseudo-observations of surface chlorophyll, extracted from a model reference run, were assimilated every two days. Simulation results suggest that the adaptive inflation scheme significantly improves the behavior of the SEIK filter during periods of strong ecosystem variability.

  19. A novel computer-aided diagnosis system for breast MRI based on feature selection and ensemble learning.

    PubMed

    Lu, Wei; Li, Zhe; Chu, Jinghui

    2017-03-06

    Breast cancer is a common cancer among women. With the development of modern medical science and information technology, medical imaging techniques have an increasingly important role in the early detection and diagnosis of breast cancer. In this paper, we propose an automated computer-aided diagnosis (CADx) framework for magnetic resonance imaging (MRI). The scheme consists of an ensemble of several machine learning-based techniques, including ensemble under-sampling (EUS) for imbalanced data processing, the Relief algorithm for feature selection, the subspace method for providing data diversity, and Adaboost for improving the performance of base classifiers. We extracted morphological, various texture, and Gabor features. To clarify the feature subsets' physical meaning, subspaces are built by combining morphological features with each kind of texture or Gabor feature. We tested our proposal using a manually segmented Region of Interest (ROI) data set, which contains 438 images of malignant tumors and 1898 images of normal tissues or benign tumors. Our proposal achieves an area under the ROC curve (AUC) value of 0.9617, which outperforms most other state-of-the-art breast MRI CADx systems. Compared with other methods, our proposal significantly reduces the false-positive classification rate.

  1. A modified Shockley equation taking into account the multi-element nature of light emitting diodes based on nanowire ensembles.

    PubMed

    Musolino, M; Tahraoui, A; Treeck, D van; Geelhaar, L; Riechert, H

    2016-07-08

    In this work we study how the multi-element nature of light emitting diodes (LEDs) based on nanowire (NW) ensembles influences their current voltage (I-V) characteristics. We systematically address critical issues of the fabrication process that can result in significant fluctuations of the electrical properties among the individual NWs in such LEDs, paying particular attention to the planarization step. Electroluminescence (EL) maps acquired for two nominally identical NW-LEDs reveal that small processing variations can result in a large difference in the number of individual nano-devices emitting EL. The lower number of EL spots in one of the LEDs is caused by its inhomogeneous electrical properties. The I-V characteristics of this LED cannot be described well by the classical Shockley model. We are able to take into account the multi-element nature of such LEDs and fit the I-V characteristics in the forward bias regime by employing an ad hoc adjusted version of the Shockley equation. More specifically, we introduce a bias dependence of the ideality factor. The basic considerations of our model should remain valid also for other types of devices based on ensembles of interconnected p-n junctions with inhomogeneous electrical properties, regardless of the employed material system.

  2. A novel hybrid decomposition-and-ensemble model based on CEEMD and GWO for short-term PM2.5 concentration forecasting

    NASA Astrophysics Data System (ADS)

    Niu, Mingfei; Wang, Yufang; Sun, Shaolong; Li, Yongwu

    2016-06-01

    To enhance prediction reliability and accuracy, a hybrid model based on the promising principle of "decomposition and ensemble" and a recently proposed meta-heuristic called grey wolf optimizer (GWO) is introduced for daily PM2.5 concentration forecasting. Compared with existing PM2.5 forecasting methods, this proposed model has improved the prediction accuracy and hit rates of directional prediction. The proposed model involves three main steps, i.e., decomposing the original PM2.5 series into several intrinsic mode functions (IMFs) via complementary ensemble empirical mode decomposition (CEEMD) for simplifying the complex data; individually predicting each IMF with support vector regression (SVR) optimized by GWO; integrating all predicted IMFs for the ensemble result as the final prediction by another SVR optimized by GWO. Seven benchmark models, including single artificial intelligence (AI) models, other decomposition-ensemble models with different decomposition methods and models with the same decomposition-ensemble method but optimized by different algorithms, are considered to verify the superiority of the proposed hybrid model. The empirical study indicates that the proposed hybrid decomposition-ensemble model is remarkably superior to all considered benchmark models for its higher prediction accuracy and hit rates of directional prediction.
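
    A minimal 'decompose, predict each component, recombine' sketch is given below using scikit-learn's SVR; a crude moving-average split stands in for CEEMD, and fixed hyperparameters stand in for the GWO optimization, so this only illustrates the overall decomposition-and-ensemble workflow on a synthetic series.

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(3)

# Synthetic daily "PM2.5" series: slow seasonal swing + fast fluctuations + noise.
t = np.arange(400)
series = (60 + 25 * np.sin(2 * np.pi * t / 120)
          + 10 * np.sin(2 * np.pi * t / 7) + 5 * rng.normal(size=t.size))

# Crude two-component decomposition (a stand-in for CEEMD): slow part via moving average,
# fast part as the residual.
window = 15
slow = np.convolve(series, np.ones(window) / window, mode="same")
components = [slow, series - slow]

def lagged_matrix(x, lags=7):
    """Build (samples, lags) inputs and next-step targets for one component."""
    X = np.column_stack([x[i:len(x) - lags + i] for i in range(lags)])
    return X, x[lags:]

# Predict each component separately with an SVR, then sum the component forecasts
# ("decomposition and ensemble"); hyperparameters are fixed rather than GWO-optimized.
split = 350
forecast = np.zeros(len(series) - split)
for comp in components:
    X, y = lagged_matrix(comp)
    model = SVR(C=10.0, epsilon=0.1).fit(X[:split - 7], y[:split - 7])
    forecast += model.predict(X[split - 7:])
truth = series[split:]
print("RMSE of recombined one-step forecast:", np.sqrt(np.mean((forecast - truth) ** 2)))
```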

  3. An ensemble-based algorithm for optimizing the configuration of an in situ soil moisture monitoring network

    NASA Astrophysics Data System (ADS)

    De Vleeschouwer, Niels; Verhoest, Niko E. C.; Gobeyn, Sacha; De Baets, Bernard; Verwaeren, Jan; Pauwels, Valentijn R. N.

    2015-04-01

    factors that will influence the outcome of the algorithm are the following: the choice of the hydrological model, the uncertainty model applied for ensemble generation, the general wetness of the catchment during which the error covariance is computed, etc. In this research, the influence of the latter two is examined in more depth. Furthermore, the optimal network configuration resulting from the newly developed algorithm is compared to network configurations obtained by two other algorithms. The first algorithm is based on a temporal stability analysis of the modeled soil moisture in order to identify catchment representative monitoring locations with regard to average conditions. The second algorithm involves the clustering of available spatially distributed data (e.g. land cover and soil maps) that is not obtained by hydrological modeling.

  4. Hybrid MPI/OpenMP Implementation of the ORAC Molecular Dynamics Program for Generalized Ensemble and Fast Switching Alchemical Simulations.

    PubMed

    Procacci, Piero

    2016-06-27

    We present a new release (6.0β) of the ORAC program [Marsili et al. J. Comput. Chem. 2010, 31, 1106-1116] with a hybrid OpenMP/MPI (open multiprocessing message passing interface) multilevel parallelism tailored for generalized ensemble (GE) and fast switching double annihilation (FS-DAM) nonequilibrium technology aimed at evaluating the binding free energy in drug-receptor systems on high-performance computing platforms. The production of the GE or FS-DAM trajectories is handled using a weak scaling parallel approach on the MPI level only, while a strong scaling force decomposition scheme is implemented for intranode computations with shared memory access at the OpenMP level. The efficiency, simplicity, and inherent parallel nature of the ORAC implementation of the FS-DAM algorithm position the code as a potentially effective tool for second-generation high-throughput virtual screening in drug discovery and design. The code, along with documentation, testing, and ancillary tools, is distributed under the provisions of the General Public License and can be freely downloaded at www.chim.unifi.it/orac .

  5. Dynamic State Estimation and Parameter Calibration of DFIG based on Ensemble Kalman Filter

    SciTech Connect

    Fan, Rui; Huang, Zhenyu; Wang, Shaobu; Diao, Ruisheng; Meng, Da

    2015-07-30

    With the growing interest in the application of wind energy, the doubly fed induction generator (DFIG) plays an essential role in the industry nowadays. To deal with the increasing stochastic variations introduced by intermittent wind resources and responsive loads, dynamic state estimation (DSE) is introduced in power systems associated with DFIGs. However, this dynamic analysis sometimes cannot work because the parameters of DFIGs are not accurate enough. To solve the problem, an ensemble Kalman filter (EnKF) method is proposed for the state estimation and parameter calibration tasks. In this paper, a DFIG is modeled and implemented with the EnKF method. Sensitivity analysis is demonstrated regarding the measurement noise, initial state errors and parameter errors. The results indicate that this EnKF method has robust performance in the state estimation and parameter calibration of DFIGs.

  6. Bearing fault detection based on hybrid ensemble detector and empirical mode decomposition

    NASA Astrophysics Data System (ADS)

    Georgoulas, George; Loutas, Theodore; Stylios, Chrysostomos D.; Kostopoulos, Vassilis

    2013-12-01

    Aiming at more efficient fault diagnosis, this research work presents an integrated anomaly detection approach for seeded bearing faults. Vibration signals from normal bearings and bearings with three different fault locations, as well as different fault sizes and loading conditions are examined. The Empirical Mode Decomposition and the Hilbert-Huang transform are employed for the extraction of a compact feature set. Then, a hybrid ensemble detector is trained using data coming only from the normal bearings and it is successfully applied for the detection of any deviation from the normal condition. The results prove the potential use of the proposed scheme as a first stage of an alarm signalling system for the detection of bearing faults irrespective of their loading condition.

  7. [Simulation of cropland soil moisture based on an ensemble Kalman filter].

    PubMed

    Liu, Zhao; Zhou, Yan-Lian; Ju, Wei-Min; Gao, Ping

    2011-11-01

    By using an ensemble Kalman filter (EnKF) to assimilate the observed soil moisture data, the modified boreal ecosystem productivity simulator (BEPS) model was adopted to simulate the dynamics of soil moisture in winter wheat root zones at Xuzhou Agro-meteorological Station, Jiangsu Province of China during the growth seasons in 2000-2004. After the assimilation of observed data, the determination coefficient, root mean square error, and average absolute error of simulated soil moisture were in the ranges of 0.626-0.943, 0.018-0.042, and 0.021-0.041, respectively, with the simulation precision improved significantly, as compared with that before assimilation, indicating the applicability of data assimilation in improving the simulation of soil moisture. The experimental results at single point showed that the errors in the forcing data and observations and the frequency and soil depth of the assimilation of observed data all had obvious effects on the simulated soil moisture.

  8. Ensemble-based hybrid probabilistic sampling for imbalanced data learning in lung nodule CAD.

    PubMed

    Cao, Peng; Yang, Jinzhu; Li, Wei; Zhao, Dazhe; Zaiane, Osmar

    2014-04-01

    Classification plays a critical role in false positive reduction (FPR) in lung nodule computer aided detection (CAD). The difficulty of FPR lies in the variation of the appearances of the nodules, and the imbalance distribution between the nodule and non-nodule class. Moreover, the presence of inherent complex structures in data distribution, such as within-class imbalance and high-dimensionality are other critical factors of decreasing classification performance. To solve these challenges, we proposed a hybrid probabilistic sampling combined with diverse random subspace ensemble. Experimental results demonstrate the effectiveness of the proposed method in terms of geometric mean (G-mean) and area under the ROC curve (AUC) compared with commonly used methods. Copyright © 2013 Elsevier Ltd. All rights reserved.

  9. Stability evaluation of short-circuiting gas metal arc welding based on ensemble empirical mode decomposition

    NASA Astrophysics Data System (ADS)

    Huang, Yong; Wang, Kehong; Zhou, Zhilan; Zhou, Xiaoxiao; Fang, Jimi

    2017-03-01

    The arc of gas metal arc welding (GMAW) contains abundant information about its stability and droplet transition, which can be effectively characterized by extracting the arc electrical signals. In this study, ensemble empirical mode decomposition (EEMD) was used to evaluate the stability of electrical current signals. The welding electrical signals were first decomposed by EEMD, and then transformed to a Hilbert–Huang spectrum and a marginal spectrum. The marginal spectrum is an approximate distribution of amplitude with frequency of signals, and can be described by a marginal index. Analysis of various welding process parameters showed that the marginal index of current signals increased when the welding process was more stable, and vice versa. Thus EEMD combined with the marginal index can effectively uncover the stability and droplet transition of GMAW.
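
    A Hilbert-Huang style marginal spectrum can be sketched as below, assuming the EEMD step has already produced the intrinsic mode functions; the two synthetic components and the simple low-frequency 'marginal index' used here are illustrative stand-ins, not the paper's definition of the marginal index.

```python
import numpy as np
from scipy.signal import hilbert

fs = 2000.0                       # sampling rate [Hz]
t = np.arange(0, 1.0, 1.0 / fs)

# Stand-ins for two IMFs obtained from EEMD of a welding-current signal:
# a 50 Hz component with drifting amplitude and a 300 Hz burst-like component.
imfs = [
    (1.0 + 0.3 * np.sin(2 * np.pi * 2 * t)) * np.sin(2 * np.pi * 50 * t),
    0.4 * np.exp(-((t - 0.5) ** 2) / 0.01) * np.sin(2 * np.pi * 300 * t),
]

# Hilbert-Huang style marginal spectrum: accumulate instantaneous amplitude
# into bins of instantaneous frequency, summed over all IMFs and time.
freq_bins = np.arange(0, 500, 5.0)
marginal = np.zeros(len(freq_bins) - 1)
for imf in imfs:
    analytic = hilbert(imf)
    amplitude = np.abs(analytic)[1:]
    inst_freq = np.diff(np.unwrap(np.angle(analytic))) * fs / (2 * np.pi)
    hist, _ = np.histogram(inst_freq, bins=freq_bins, weights=amplitude)
    marginal += hist

# A simple illustrative index: fraction of amplitude concentrated below 100 Hz,
# loosely analogous to summarizing arc stability from the marginal spectrum.
index = marginal[freq_bins[:-1] < 100].sum() / marginal.sum()
print(f"low-frequency marginal index: {index:.2f}")
```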

  10. An efficient ensemble of radial basis functions method based on quadratic programming

    NASA Astrophysics Data System (ADS)

    Shi, Renhe; Liu, Li; Long, Teng; Liu, Jian

    2016-07-01

    Radial basis function (RBF) surrogate models have been widely applied in engineering design optimization problems to approximate computationally expensive simulations. Ensemble of radial basis functions (ERBF) using the weighted sum of stand-alone RBFs improves the approximation performance. To achieve a good trade-off between the accuracy and efficiency of the modelling process, this article presents a novel efficient ERBF method to determine the weights through solving a quadratic programming subproblem, denoted ERBF-QP. Several numerical benchmark functions are utilized to test the performance of the proposed ERBF-QP method. The results show that ERBF-QP can significantly improve the modelling efficiency compared with several existing ERBF methods. Moreover, ERBF-QP also provides satisfactory performance in terms of approximation accuracy. Finally, the ERBF-QP method is applied to a satellite multidisciplinary design optimization problem to illustrate its practicality and effectiveness for real-world engineering applications.
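
    The weighted-sum-of-RBFs idea with weights obtained from a small constrained quadratic problem can be sketched as below, using scipy's RBFInterpolator for the stand-alone surrogates and a general-purpose SLSQP solve in place of a dedicated QP solver; the kernels, test function and validation design are illustrative assumptions, not the ERBF-QP formulation itself.

```python
import numpy as np
from scipy.interpolate import RBFInterpolator
from scipy.optimize import minimize

rng = np.random.default_rng(4)

def expensive_model(x):
    """Toy 'expensive simulation' to be approximated."""
    return np.sin(3 * x[:, 0]) + 0.5 * x[:, 1] ** 2

# Training and validation designs (two design variables).
X_train = rng.uniform(-1, 1, size=(30, 2))
X_val = rng.uniform(-1, 1, size=(15, 2))
y_train, y_val = expensive_model(X_train), expensive_model(X_val)

# Stand-alone RBF surrogates with different kernels.
kernels = ["thin_plate_spline", "cubic", "gaussian"]
surrogates = [RBFInterpolator(X_train, y_train, kernel=k,
                              epsilon=1.0 if k == "gaussian" else None)
              for k in kernels]
P = np.column_stack([s(X_val) for s in surrogates])    # member predictions on validation set

# Ensemble weights from a small quadratic problem:
# minimize ||P w - y_val||^2  subject to  sum(w) = 1, w >= 0.
objective = lambda w: np.sum((P @ w - y_val) ** 2)
result = minimize(objective, x0=np.full(3, 1 / 3), method="SLSQP",
                  bounds=[(0, 1)] * 3,
                  constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1}])
weights = result.x
print("ensemble weights:", np.round(weights, 3))

# The ensemble prediction is the weighted sum of the member surrogates.
X_new = rng.uniform(-1, 1, size=(5, 2))
ensemble_pred = np.column_stack([s(X_new) for s in surrogates]) @ weights
print("ensemble prediction:", np.round(ensemble_pred, 3))
print("true values:       ", np.round(expensive_model(X_new), 3))
```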

  11. A WRF-based ensemble data assimilation system for dynamic downscaling of satellite precipitation information (Invited)

    NASA Astrophysics Data System (ADS)

    Zhang, S. Q.; Hou, A. Y.; Zupanski, M.; Cheung, S.

    2010-12-01

    For many hydrological applications, dynamic downscaling from global analyses has been used to provide local scale information on spatial and temporal distribution of precipitation and other associated environmental parameters. In the near future the NASA Global Precipitation Measurement (GPM) Mission will provide new sources of precipitation observations with unprecedented spatial and temporal coverage for better understanding and prediction of climate, weather and hydro-meteorological processes. However, in terms of using precipitation observations in global analyses and forecasts, the capability of current operational systems is generally limited by the global model resolution, the requirement of linearization of parameterized cloud physics, and the static forecast error statistics often with no distinction for clear sky or storm. In order to maximize the utilization of satellite precipitation observations in dynamic downscaling for hydrological applications, an ensemble data assimilation system (Goddard-WRF-EDAS) has been developed jointly by NASA Goddard and Colorado State University (CSU). The system takes advantage of the cloud-resolving high resolution of the Weather Research and Forecasting (WRF) model with NASA Goddard microphysics and the flow-dependent estimation of forecast error covariance from the Maximum Likelihood Ensemble Filter (MLEF). Satellite observed radiances in precipitation regions are assimilated using the Goddard Satellite Data Simulator Unit (SDSU) as the observation operator. Experimental results using currently available satellite precipitation data (AMSR-E and TRMM-TMI) are presented to investigate the ability of the assimilation system in ingesting information from in-situ and satellite observations to produce dynamically downscaled precipitation. The results from the assimilation of precipitation-affected microwave radiances in a storm case and in a heavy rainfall event demonstrate the data impact to down-scaled precipitation and

  12. Effects of ensembles on methane hydrate nucleation kinetics.

    PubMed

    Zhang, Zhengcai; Liu, Chan-Juan; Walsh, Matthew R; Guo, Guang-Jun

    2016-06-21

    By performing molecular dynamics simulations to form a hydrate with a methane nano-bubble in liquid water at 250 K and 50 MPa, we report how different ensembles, such as the NPT, NVT, and NVE ensembles, affect the nucleation kinetics of the methane hydrate. The nucleation trajectories are monitored using the face-saturated incomplete cage analysis (FSICA) and the mutually coordinated guest (MCG) order parameter (OP). The nucleation rate and the critical nucleus are obtained using the mean first-passage time (MFPT) method based on the FS cages and the MCG-1 OPs, respectively. The fitting results of MFPT show that hydrate nucleation and growth are coupled together, consistent with the cage adsorption hypothesis which emphasizes that the cage adsorption of methane is a mechanism for both hydrate nucleation and growth. For the three different ensembles, the hydrate nucleation rate is quantitatively ordered as follows: NPT > NVT > NVE, while the sequence of hydrate crystallinity is exactly reversed. However, the largest size of the critical nucleus appears in the NVT ensemble, rather than in the NVE ensemble. These results are helpful for choosing a suitable ensemble when studying hydrate formation via computer simulations, and emphasize the importance of the degree of order of the critical nucleus.

  13. Molecular dynamics in the isothermal-isobaric ensemble: the requirement of a "shell" molecule. I. Theory and phase-space analysis.

    PubMed

    Uline, Mark J; Corti, David S

    2005-10-22

    Current constant pressure molecular-dynamics (MD) algorithms are not consistent with the recent reformulation of the isothermal-isobaric (NpT) ensemble. The NpT ensemble partition function requires the use of a "shell" molecule to identify uniquely the volume of the system, thereby avoiding the redundant counting of configurations [e.g., G. J. M. Koper and H. Reiss, J. Phys. Chem. 100, 422 (1996); D. S. Corti, Phys. Rev. E, 64, 016128 (2001)]. So far, only the NpT Monte Carlo method has been updated to allow the system volume to be defined by a shell particle [D. S. Corti, Mol. Phys. 100, 1887 (2002)]. A shell particle has yet to be incorporated into MD simulations. The proper modification of the NpT MD algorithm is therefore the subject of this paper. Unlike Andersen's method [H. C. Andersen, J. Chem. Phys. 72, 2384 (1980)] where a piston of unknown mass serves to control the response time of volume fluctuations, the newly proposed equations of motion impose a constant external pressure via the introduction of a shell particle of known mass. Hence, the system itself sets the time scales for pressure and volume fluctuations. The new algorithm is subject to a number of fundamentally rigorous tests to ensure that the equations of motion sample phase space correctly. We also show that the Hoover NpT algorithm [W. G. Hoover, Phys. Rev. A. 31, 1695 (1985); 34, 2499 (1986)] does sample phase space correctly, but only when periodic boundary conditions are employed.

  14. The Ensembl gene annotation system

    PubMed Central

    Aken, Bronwen L.; Ayling, Sarah; Barrell, Daniel; Clarke, Laura; Curwen, Valery; Fairley, Susan; Fernandez Banet, Julio; Billis, Konstantinos; García Girón, Carlos; Hourlier, Thibaut; Howe, Kevin; Kähäri, Andreas; Kokocinski, Felix; Martin, Fergal J.; Murphy, Daniel N.; Nag, Rishi; Ruffier, Magali; Schuster, Michael; Tang, Y. Amy; Vogel, Jan-Hinnerk; White, Simon; Zadissa, Amonida; Flicek, Paul

    2016-01-01

    The Ensembl gene annotation system has been used to annotate over 70 different vertebrate species across a wide range of genome projects. Furthermore, it generates the automatic alignment-based annotation for the human and mouse GENCODE gene sets. The system is based on the alignment of biological sequences, including cDNAs, proteins and RNA-seq reads, to the target genome in order to construct candidate transcript models. Careful assessment and filtering of these candidate transcripts ultimately leads to the final gene set, which is made available on the Ensembl website. Here, we describe the annotation process in detail. Database URL: http://www.ensembl.org/index.html PMID:27337980

  15. Conformational Ensemble of hIAPP Dimer: Insight into the Molecular Mechanism by which a Green Tea Extract inhibits hIAPP Aggregation

    NASA Astrophysics Data System (ADS)

    Mo, Yuxiang; Lei, Jiangtao; Sun, Yunxiang; Zhang, Qingwen; Wei, Guanghong

    2016-09-01

    Small oligomers formed early during human islet amyloid polypeptide (hIAPP) aggregation are responsible for cell death in Type II diabetes. The epigallocatechin gallate (EGCG), a green tea extract, was found to inhibit hIAPP fibrillation. However, the inhibition mechanism and the conformational distribution of the smallest hIAPP oligomer, the dimer, are mostly unknown. Herein, we performed extensive replica exchange molecular dynamics simulations on the hIAPP dimer with and without EGCG molecules. Extended hIAPP dimer conformations, with a collision cross section value similar to that observed by ion mobility-mass spectrometry, were observed in our simulations. Notably, these dimers adopt a three-stranded antiparallel β-sheet and contain the previously reported β-hairpin amyloidogenic precursor. We find that EGCG binding strongly blocks both the inter-peptide hydrophobic and aromatic-stacking interactions responsible for inter-peptide β-sheet formation and the intra-peptide interaction crucial for β-hairpin formation, thus abolishing the three-stranded β-sheet structures and leading to the formation of coil-rich conformations. Hydrophobic, aromatic-stacking, cation-π and hydrogen-bonding interactions jointly contribute to the EGCG-induced conformational shift. This study provides, at the atomic level, the conformational ensemble of the hIAPP dimer and the molecular mechanism by which EGCG inhibits hIAPP aggregation.

  16. Conformational Ensemble of hIAPP Dimer: Insight into the Molecular Mechanism by which a Green Tea Extract inhibits hIAPP Aggregation

    PubMed Central

    Mo, Yuxiang; Lei, Jiangtao; Sun, Yunxiang; Zhang, Qingwen; Wei, Guanghong

    2016-01-01

    Small oligomers formed early during human islet amyloid polypeptide (hIAPP) aggregation are responsible for cell death in Type II diabetes. The epigallocatechin gallate (EGCG), a green tea extract, was found to inhibit hIAPP fibrillation. However, the inhibition mechanism and the conformational distribution of the smallest hIAPP oligomer, the dimer, are mostly unknown. Herein, we performed extensive replica exchange molecular dynamics simulations on the hIAPP dimer with and without EGCG molecules. Extended hIAPP dimer conformations, with a collision cross section value similar to that observed by ion mobility-mass spectrometry, were observed in our simulations. Notably, these dimers adopt a three-stranded antiparallel β-sheet and contain the previously reported β-hairpin amyloidogenic precursor. We find that EGCG binding strongly blocks both the inter-peptide hydrophobic and aromatic-stacking interactions responsible for inter-peptide β-sheet formation and the intra-peptide interaction crucial for β-hairpin formation, thus abolishing the three-stranded β-sheet structures and leading to the formation of coil-rich conformations. Hydrophobic, aromatic-stacking, cation-π and hydrogen-bonding interactions jointly contribute to the EGCG-induced conformational shift. This study provides, at the atomic level, the conformational ensemble of the hIAPP dimer and the molecular mechanism by which EGCG inhibits hIAPP aggregation. PMID:27620620

  17. Motor-motor interactions in ensembles of muscle myosin: using theory to connect single molecule to ensemble measurements

    NASA Astrophysics Data System (ADS)

    Walcott, Sam

    2013-03-01

    Interactions between the proteins actin and myosin drive muscle contraction. Properties of a single myosin interacting with an actin filament are largely known, but a trillion myosins work together in muscle. We are interested in how single-molecule properties relate to ensemble function. Myosin's reaction rates depend on force, so ensemble models keep track of both molecular state and force on each molecule. These models make subtle predictions, e.g. that myosin, when part of an ensemble, moves actin faster than when isolated. This acceleration arises because forces between molecules speed reaction kinetics. Experiments support this prediction and allow parameter estimates. A model based on this analysis describes experiments from single molecule to ensemble. In vivo, actin is regulated by proteins that, when present, cause the binding of one myosin to speed the binding of its neighbors; binding becomes cooperative. Although such interactions preclude the mean field approximation, a set of linear ODEs describes these ensembles under simplified experimental conditions. In these experiments cooperativity is strong, with the binding of one molecule affecting ten neighbors on either side. We progress toward a description of myosin ensembles under physiological conditions.

  18. Pattern Recognition of Momentary Mental Workload Based on Multi-Channel Electrophysiological Data and Ensemble Convolutional Neural Networks.

    PubMed

    Zhang, Jianhua; Li, Sunan; Wang, Rubin

    2017-01-01

    In this paper, we deal with the Mental Workload (MWL) classification problem based on the measured physiological data. First we discussed the optimal depth (i.e., the number of hidden layers) and parameter optimization algorithms for the Convolutional Neural Networks (CNN). The base CNNs designed were tested according to five classification performance indices, namely Accuracy, Precision, F-measure, G-mean, and required training time. Then we developed an Ensemble Convolutional Neural Network (ECNN) to enhance the accuracy and robustness of the individual CNN model. For the ECNN design, three model aggregation approaches (weighted averaging, majority voting and stacking) were examined and a resampling strategy was used to enhance the diversity of individual CNN models. The results of the MWL classification performance comparison indicated that the proposed ECNN framework can effectively improve MWL classification performance and, compared with traditional machine learning methods, features entirely automatic feature extraction and MWL classification.
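
    The three aggregation strategies mentioned above can be sketched as below on placeholder class-probability outputs; the CNN members themselves and the physiological data are not reproduced, so the numbers only illustrate how weighted averaging, majority voting and stacking combine member predictions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(5)
n_members, n_samples, n_classes = 5, 200, 3

# Placeholder for the class-probability outputs of an ensemble of CNNs
# (each member would normally be trained on a resampled version of the data).
true_labels = rng.integers(0, n_classes, size=n_samples)
member_probs = np.full((n_members, n_samples, n_classes), 0.2)
for m in range(n_members):
    correct = rng.random(n_samples) < 0.7          # each member is roughly 70% accurate
    noisy = np.where(correct, true_labels, rng.integers(0, n_classes, n_samples))
    member_probs[m, np.arange(n_samples), noisy] = 0.6
member_probs /= member_probs.sum(axis=2, keepdims=True)

# 1) Weighted averaging of probabilities (uniform weights here).
avg_pred = member_probs.mean(axis=0).argmax(axis=1)

# 2) Majority voting on each member's hard decision.
votes = member_probs.argmax(axis=2)
maj_pred = np.array([np.bincount(votes[:, i], minlength=n_classes).argmax()
                     for i in range(n_samples)])

# 3) Stacking: a meta-classifier trained on the concatenated member probabilities.
meta_X = member_probs.transpose(1, 0, 2).reshape(n_samples, -1)
half = n_samples // 2
meta = LogisticRegression(max_iter=1000).fit(meta_X[:half], true_labels[:half])
stack_pred = meta.predict(meta_X[half:])

print("weighted averaging accuracy:", (avg_pred == true_labels).mean())
print("majority voting accuracy:   ", (maj_pred == true_labels).mean())
print("stacking accuracy (held-out):", (stack_pred == true_labels[half:]).mean())
```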

  19. Progress in Multi-Center Probabilistic Wave Forecasting and Ensemble-Based Data Assimilation using LETKF at the US National Weather Service

    NASA Astrophysics Data System (ADS)

    Alves, Jose-Henrique; Bernier, Natacha; Etala, Paula; Wittmann, Paul

    2015-04-01

    The combination of ensemble predictions of Hs made by the US National Weather Service (NWS) and the US Navy Fleet Numerical Meteorological and Oceanography Center (FNMOC) has established the NFCENS, a probabilistic wave forecast system in operations at NCEP since 2011. Computed from 41 combined wave ensemble members, the new product outperforms deterministic and probabilistic forecasts and nowcasts of Hs issued separately at each forecast center, at all forecast ranges. The successful implementation of the NFCENS has brought new opportunities for collaboration with Environment Canada (EC). EC is in the process of adding new global wave model ensemble products to its existing suite of operational regional products. The planned upgrade to the current NFCENS wave multi-center ensemble includes the addition of 20 members from the Canadian WES. With this upgrade, the NFCENS will be renamed North American Wave Ensemble System (NAWES). As part of the new system implementation, new higher-resolution grids and upgrades to model physics using recent advances in source-term parameterizations are being tested. We provide results of a first validation of NAWES relative to global altimeter data, and buoy measurements of waves, as well as its ability to forecast waves during the 2012 North Atlantic hurricane Sandy. A second line of research involving wave ensembles at the NWS is the implementation of a LETKF-based data assimilation system developed in collaboration with the Argentinian Navy Meteorological Service. The project involves an implementation of the 4D-LETKF in the NWS global wave ensemble forecast system GWES. The 4-D scheme initializes a full 81-member ensemble in a 6-hour cycle. The LETKF determines the analysis ensemble locally in the space spanned by the ensemble, as a linear combination of the background perturbations. Observations from three altimeters and one scatterometer were used. Preliminary results for a prototype system running at the NWS, including

  20. Ensemble Kalman Filter vs Particle Filter in a Physically Based Coupled Model of Surface-Subsurface Flow (Invited)

    NASA Astrophysics Data System (ADS)

    Putti, M.; Camporese, M.; Pasetto, D.

    2010-12-01

    Data assimilation (DA) has recently received growing interest from the hydrological modeling community due to its capability to merge observations into model prediction. Among the many DA methods available, the Ensemble Kalman Filter (EnKF) and the Particle Filter (PF) are suitable alternatives for applications to detailed physically-based hydrological models. For each assimilation period, both methods use a Monte Carlo approach to approximate the state probability distribution (in terms of mean and covariance matrix) by a finite number of independent model trajectories, also called particles or realizations. The two approaches differ in the way the filtering distribution is evaluated. EnKF implements the classical Kalman filter, optimal only for linear dynamics and Gaussian error statistics. Particle filters, instead, use directly the recursive formula of the sequential Bayesian framework and approximate the posterior probability distributions by means of appropriate weights associated to each realization. We use the Sequential Importance Resampling (SIR) technique, which retains only the most probable particles, in practice the trajectories closest in a statistical sense to the observations, and duplicates them when needed. In contrast to EnKF, particle filters make no assumptions on the form of the prior distribution of the model state, and convergence to the true state is ensured for large enough ensemble size. In this study EnKF and PF have been implemented in a physically based catchment simulator that couples a three-dimensional finite element Richards equation solver with a finite difference diffusion wave approximation based on digital elevation data for surface water dynamics. We report on the retrieval performance of the two schemes using a three-dimensional tilted v-catchment synthetic test case in which multi-source observations are assimilated (pressure head, soil moisture, and streamflow data). The comparison between the results of the two approaches

  1. Efficient ensemble system based on the copper binding motif for highly sensitive and selective detection of cyanide ions in 100% aqueous solutions by fluorescent and colorimetric changes.

    PubMed

    Jung, Kwan Ho; Lee, Keun-Hyeung

    2015-09-15

    A peptide-based ensemble for the detection of cyanide ions in 100% aqueous solutions was designed on the basis of the copper binding motif. 7-Nitro-2,1,3-benzoxadiazole-labeled tripeptide (NBD-SSH, NBD-SerSerHis) formed the ensemble with Cu(2+), leading to a change in the color of the solution from yellow to orange and a complete decrease of fluorescence emission. The ensemble (NBD-SSH-Cu(2+)) sensitively and selectively detected a low concentration of cyanide ions in 100% aqueous solutions by a colorimetric change as well as a fluorescent change. The addition of cyanide ions instantly removed Cu(2+) from the ensemble (NBD-SSH-Cu(2+)) in 100% aqueous solutions, resulting in a color change of the solution from orange to yellow and a "turn-on" fluorescent response. The detection limits for cyanide ions were lower than the maximum allowable level of cyanide ions in drinking water set by the World Health Organization. The peptide-based ensemble system is expected to be a potential and practical way for the detection of submicromolar concentrations of cyanide ions in 100% aqueous solutions.

  2. Future changes to drought characteristics over the Canadian Prairie Provinces based on NARCCAP multi-RCM ensemble

    NASA Astrophysics Data System (ADS)

    Masud, M. B.; Khaliq, M. N.; Wheater, H. S.

    2016-06-01

    This study assesses projected changes to drought characteristics in Alberta, Saskatchewan and Manitoba, the prairie provinces of Canada, using a multi-regional climate model (RCM) ensemble available through the North American Regional Climate Change Assessment Program. Simulations considered include those performed with six RCMs driven by National Center for Environmental Prediction reanalysis II for the 1981-2003 period and those driven by four Atmosphere-Ocean General Circulation Models for the 1970-1999 and 2041-2070 periods (i.e. eleven current and the same number of corresponding future period simulations). Drought characteristics are extracted using two drought indices, namely the Standardized Precipitation Index (SPI) and the Standardized Precipitation Evapotranspiration Index (SPEI). Regional frequency analysis is used to project changes to selected 20- and 50-year regional return levels of drought characteristics for fifteen homogeneous regions, covering the study area. In addition, multivariate analyses of drought characteristics, derived on the basis of 6-month SPI and SPEI values, are developed using the copula approach for each region. Analysis of multi-RCM ensemble-averaged projected changes to mean and selected return levels of drought characteristics show increases over the southern and south-western parts of the study area. Based on bi- and trivariate joint occurrence probabilities of drought characteristics, the southern regions along with the central regions are found highly drought vulnerable, followed by the southwestern and southeastern regions. Compared to the SPI-based analysis, the results based on SPEI suggest drier conditions over many regions in the future, indicating potential effects of rising temperatures on drought risks. These projections will be useful in the development of appropriate adaptation strategies for the water and agricultural sectors, which play an important role in the economy of the study area.
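
    The SPI computation underlying these drought indices can be sketched as below: fit a gamma distribution to an (aggregated) precipitation series and map its cumulative probabilities to standard-normal quantiles. Zero-precipitation handling, the 6-month aggregation window and the evapotranspiration term of the SPEI are omitted, so this is only an illustrative skeleton.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)

# Synthetic precipitation totals [mm] standing in for an aggregated 6-month series.
precip = rng.gamma(shape=2.0, scale=30.0, size=360)

# Fit a gamma distribution to the series (location fixed at zero),
# convert each value to its cumulative probability, then to a standard-normal quantile.
shape, loc, scale = stats.gamma.fit(precip, floc=0)
cdf = stats.gamma.cdf(precip, shape, loc=loc, scale=scale)
spi = stats.norm.ppf(cdf)

# SPI < -1 is commonly read as moderate drought, SPI < -2 as extreme drought.
print("fraction of periods with SPI < -1:", np.mean(spi < -1).round(3))
print("fraction of periods with SPI < -2:", np.mean(spi < -2).round(3))
```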

  3. Rolling bearing fault detection and diagnosis based on composite multiscale fuzzy entropy and ensemble support vector machines

    NASA Astrophysics Data System (ADS)

    Zheng, Jinde; Pan, Haiyang; Cheng, Junsheng

    2017-02-01

    To timely detect the incipient failure of rolling bearings and find the accurate fault location, a novel rolling bearing fault diagnosis method is proposed based on composite multiscale fuzzy entropy (CMFE) and ensemble support vector machines (ESVMs). Fuzzy entropy (FuzzyEn), as an improvement of sample entropy (SampEn), is a new nonlinear method for measuring the complexity of time series. Since FuzzyEn (or SampEn) at a single scale cannot reflect the complexity effectively, multiscale fuzzy entropy (MFE) is developed by defining the FuzzyEns of coarse-grained time series, which represent the system dynamics on different scales. However, the MFE values will be affected by the data length, especially when the data are not long enough. By combining the information of multiple coarse-grained time series at the same scale, the CMFE algorithm is proposed in this paper to enhance MFE, as well as FuzzyEn. Compared with MFE, with increasing scale factor, CMFE obtains much more stable and consistent values for a short-term time series. In this paper CMFE is employed to measure the complexity of vibration signals of rolling bearings and is applied to extract the nonlinear features hidden in the vibration signals. The physical meaning of CMFE and its suitability for rolling bearing fault diagnosis are also explored. On this basis, to achieve automatic fault diagnosis, an ensemble SVM-based multi-classifier is constructed for the intelligent classification of fault features. Finally, the proposed fault diagnosis method for rolling bearings is applied to experimental data analysis and the results indicate that the proposed method can effectively distinguish different fault categories and severities of rolling bearings.
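
    The composite multiscale fuzzy entropy described above can be sketched in a few lines of Python. The snippet below is a simplified illustration, not the authors' implementation: it uses one common "composite" variant that averages the fuzzy entropies of the shifted coarse-grained series at each scale, and the parameter values (m, r, the membership exponent) are illustrative defaults.

```python
import numpy as np

def fuzzy_entropy(x, m=2, r=0.15, n=2):
    """Fuzzy entropy of a 1-D signal: templates of length m vs m+1 compared with
    an exponential membership exp(-(d**n)/r) on Chebyshev distances."""
    x = np.asarray(x, dtype=float)
    r = r * np.std(x)

    def phi(dim):
        # Build all templates of length `dim` and remove their means (as in FuzzyEn).
        templ = np.array([x[i:i + dim] for i in range(len(x) - dim)])
        templ = templ - templ.mean(axis=1, keepdims=True)
        # Chebyshev distance between every pair of templates.
        d = np.max(np.abs(templ[:, None, :] - templ[None, :, :]), axis=2)
        mu = np.exp(-(d ** n) / r)
        np.fill_diagonal(mu, 0.0)                     # exclude self-matches
        return mu.sum() / (len(templ) * (len(templ) - 1))

    return -np.log(phi(m + 1) / phi(m))

def cmfe(x, m=2, r=0.15, max_scale=5):
    """Composite multiscale fuzzy entropy: at each scale tau, average the fuzzy
    entropies of the tau coarse-grained series obtained with different offsets."""
    x = np.asarray(x, dtype=float)
    out = []
    for tau in range(1, max_scale + 1):
        vals = []
        for k in range(tau):
            n_seg = (len(x) - k) // tau
            coarse = x[k:k + n_seg * tau].reshape(n_seg, tau).mean(axis=1)
            vals.append(fuzzy_entropy(coarse, m, r))
        out.append(np.mean(vals))
    return np.array(out)

# Example: CMFE of a noisy vibration-like signal.
rng = np.random.default_rng(1)
sig = np.sin(2 * np.pi * 0.05 * np.arange(1000)) + 0.5 * rng.standard_normal(1000)
print(cmfe(sig, max_scale=5))
```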

  4. Future changes to drought characteristics over the Canadian Prairie Provinces based on NARCCAP multi-RCM ensemble

    NASA Astrophysics Data System (ADS)

    Masud, M. B.; Khaliq, M. N.; Wheater, H. S.

    2017-04-01

    This study assesses projected changes to drought characteristics in Alberta, Saskatchewan and Manitoba, the prairie provinces of Canada, using a multi-regional climate model (RCM) ensemble available through the North American Regional Climate Change Assessment Program. Simulations considered include those performed with six RCMs driven by National Centers for Environmental Prediction reanalysis II for the 1981-2003 period and those driven by four Atmosphere-Ocean General Circulation Models for the 1970-1999 and 2041-2070 periods (i.e. eleven current and the same number of corresponding future period simulations). Drought characteristics are extracted using two drought indices, namely the Standardized Precipitation Index (SPI) and the Standardized Precipitation Evapotranspiration Index (SPEI). Regional frequency analysis is used to project changes to selected 20- and 50-year regional return levels of drought characteristics for fifteen homogeneous regions covering the study area. In addition, multivariate analyses of drought characteristics, derived on the basis of 6-month SPI and SPEI values, are developed using the copula approach for each region. Analysis of multi-RCM ensemble-averaged projected changes to mean and selected return levels of drought characteristics shows increases over the southern and south-western parts of the study area. Based on bi- and trivariate joint occurrence probabilities of drought characteristics, the southern regions along with the central regions are found to be highly drought vulnerable, followed by the southwestern and southeastern regions. Compared to the SPI-based analysis, the results based on SPEI suggest drier conditions over many regions in the future, indicating potential effects of rising temperatures on drought risks. These projections will be useful in the development of appropriate adaptation strategies for the water and agricultural sectors, which play an important role in the economy of the study area.

  5. Cardiopulmonary Resuscitation Pattern Evaluation Based on Ensemble Empirical Mode Decomposition Filter via Nonlinear Approaches

    PubMed Central

    Ma, Matthew Huei-Ming

    2016-01-01

    Good quality cardiopulmonary resuscitation (CPR) is the mainstay of treatment for managing patients with out-of-hospital cardiac arrest (OHCA). Assessment of the quality of the CPR delivered is now possible through the electrocardiography (ECG) signal that can be collected by an automated external defibrillator (AED). This study evaluates a nonlinear analysis of the CPR given to asystole patients. The raw ECG signal is filtered using ensemble empirical mode decomposition (EEMD), and the CPR-related intrinsic mode functions (IMFs) are chosen for evaluation. In addition, sample entropy (SE), the complexity index (CI), and detrended fluctuation analysis (DFA) are computed, and statistical analysis is performed using ANOVA. The primary outcome measure assessed is the patient survival rate after two hours. The CPR patterns of 951 asystole patients were analyzed for the quality of CPR delivered. No significant difference was observed in the peak-to-peak interval analysis of the CPR-related IMFs between patients younger and older than 60 years of age, nor in the amplitude-difference evaluation for SE and DFA. However, a difference was noted for the CI (p < 0.05). The results show that the patient group younger than 60 years has a higher survival rate, associated with higher complexity of the CPR-IMF amplitude differences. PMID:27529068
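
    Of the three nonlinear measures mentioned (SE, CI, DFA), detrended fluctuation analysis is the simplest to illustrate. The sketch below is a minimal DFA implementation with linear detrending and non-overlapping windows; the window sizes and other details are illustrative choices, not those of the study.

```python
import numpy as np

def dfa(x, window_sizes=None):
    """Detrended fluctuation analysis: returns the scaling exponent alpha from a
    log-log fit of the fluctuation function F(n) against window size n."""
    x = np.asarray(x, dtype=float)
    y = np.cumsum(x - x.mean())                           # integrated (profile) series
    if window_sizes is None:
        window_sizes = np.unique(
            np.logspace(np.log10(16), np.log10(len(x) // 4), 12).astype(int))

    fluctuations = []
    for n in window_sizes:
        n_win = len(y) // n
        segs = y[:n_win * n].reshape(n_win, n)
        t = np.arange(n)
        f2 = []
        for seg in segs:
            coeffs = np.polyfit(t, seg, 1)                # local linear detrending
            f2.append(np.mean((seg - np.polyval(coeffs, t)) ** 2))
        fluctuations.append(np.sqrt(np.mean(f2)))

    alpha = np.polyfit(np.log(window_sizes), np.log(fluctuations), 1)[0]
    return alpha, np.array(window_sizes), np.array(fluctuations)

# Sanity check: white noise should give an exponent close to 0.5.
rng = np.random.default_rng(2)
alpha, _, _ = dfa(rng.standard_normal(5000))
print(f"DFA exponent: {alpha:.2f}")
```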

  6. Predictive Skill of Meteorological Drought Based on Multi-Model Ensemble Forecasts: A Real-Time Assessment

    NASA Astrophysics Data System (ADS)

    Chen, L. C.; Mo, K. C.; Zhang, Q.; Huang, J.

    2014-12-01

    Drought prediction from monthly to seasonal time scales is of critical importance to disaster mitigation, agricultural planning, and multi-purpose reservoir management. Starting in December 2012, the NOAA Climate Prediction Center (CPC) has been providing operational Standardized Precipitation Index (SPI) Outlooks using the North American Multi-Model Ensemble (NMME) forecasts, to support CPC's monthly drought outlooks and briefing activities. The current NMME system consists of six model forecasts from U.S. and Canadian modeling centers, including the CFSv2, CM2.1, GEOS-5, CCSM3.0, CanCM3, and CanCM4 models. In this study, we conduct an assessment of the predictive skill of meteorological drought using real-time NMME forecasts for the period from May 2012 to May 2014. The ensemble SPI forecasts are the equally weighted mean of the six model forecasts. Two performance measures, the anomaly correlation coefficient and the root-mean-square error against the observations, are used to evaluate forecast skill. Similar to the assessment based on NMME retrospective forecasts, the predictive skill of monthly-mean precipitation (P) forecasts is generally low after the second month and errors vary among models. Although P forecast skill is not large, SPI predictive skill is high and the differences among models are small. The skill mainly comes from the P observations appended to the model forecasts. This factor also contributes to the similarity of SPI prediction among the six models. Still, NMME SPI ensemble forecasts have higher skill than those based on individual models or persistence, and the 6-month SPI forecasts are skillful out to four months. The three major drought events that occurred during the 2012-2014 period, namely the 2012 Central Great Plains drought, the 2013 Upper Midwest flash drought, and the 2013-2014 California drought, are used as examples to illustrate the system's strengths and limitations. For precipitation-driven drought events, such as the 2012 Central Great Plains drought
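
    The two performance measures used above are straightforward to compute once the forecasts, observations, and a reference climatology are in hand. A minimal sketch (the climatology array is a hypothetical stand-in for whatever reference defines the anomalies):

```python
import numpy as np

def acc_and_rmse(forecast, observed, climatology):
    """Anomaly correlation coefficient and root-mean-square error of a forecast
    against observations, with anomalies defined relative to a climatology."""
    f_anom = np.asarray(forecast) - climatology
    o_anom = np.asarray(observed) - climatology
    acc = np.sum(f_anom * o_anom) / np.sqrt(np.sum(f_anom ** 2) * np.sum(o_anom ** 2))
    rmse = np.sqrt(np.mean((np.asarray(forecast) - np.asarray(observed)) ** 2))
    return acc, rmse

# Hypothetical example: SPI-like forecasts and observations over 100 grid points.
rng = np.random.default_rng(0)
obs = rng.standard_normal(100)
fcst = 0.7 * obs + 0.3 * rng.standard_normal(100)   # a forecast correlated with the obs
print(acc_and_rmse(fcst, obs, climatology=0.0))     # SPI anomalies are already climatology-relative
```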

  7. Ensemble flood forecasting: A review

    NASA Astrophysics Data System (ADS)

    Cloke, H. L.; Pappenberger, F.

    2009-09-01

    Operational medium range flood forecasting systems are increasingly moving towards the adoption of ensembles of numerical weather predictions (NWP), known as ensemble prediction systems (EPS), to drive their predictions. We review the scientific drivers of this shift towards such 'ensemble flood forecasting' and discuss several of the questions surrounding best practice in using EPS in flood forecasting systems. We also review the literature evidence of the 'added value' of flood forecasts based on EPS and point to remaining key challenges in using EPS successfully.

  8. Ensemble flood forecasting: A review

    NASA Astrophysics Data System (ADS)

    Cloke, Hannah; Pappenberger, Florian

    2010-05-01

    Operational medium range flood forecasting systems are increasingly moving towards the adoption of ensembles of numerical weather predictions (NWP), known as ensemble prediction systems (EPS), to drive their predictions. We review the scientific drivers of this shift towards such 'ensemble flood forecasting' and discuss several of the questions surrounding best practice in using EPS in flood forecasting systems. We also review the literature evidence of the 'added value' of flood forecasts based on EPS and point to remaining key challenges in using EPS successfully. A continuous review can be found on the website: http://www.floodrisk.net/.

  9. Fault identification of rotor-bearing system based on ensemble empirical mode decomposition and self-zero space projection analysis

    NASA Astrophysics Data System (ADS)

    Jiang, Fan; Zhu, Zhencai; Li, Wei; Zhou, Gongbo; Chen, Guoan

    2014-07-01

    Accurately identifying faults in rotor-bearing systems by analyzing vibration signals, which are nonlinear and nonstationary, is challenging. To address this issue, a new approach based on ensemble empirical mode decomposition (EEMD) and self-zero space projection analysis is proposed in this paper. This method seeks to identify faults appearing in a rotor-bearing system using simple algebraic calculations and projection analyses. First, EEMD is applied to decompose the collected vibration signals into a set of intrinsic mode functions (IMFs) that serve as features. Second, these extracted features under various mechanical health conditions are used to design a self-zero space matrix according to space projection analysis. Finally, the so-called projection indicators are calculated to identify the rotor-bearing system's faults with simple decision logic. Experiments are implemented to test the reliability and effectiveness of the proposed approach. The results show that this approach can accurately identify faults in rotor-bearing systems.

  10. DIME: R-package for identifying differential ChIP-seq based on an ensemble of mixture models

    PubMed Central

    Taslim, Cenny; Huang, Tim; Lin, Shili

    2011-01-01

    Summary: Differential Identification using Mixtures Ensemble (DIME) is a package for identification of biologically significant differential binding sites between two conditions using ChIP-seq data. It considers a collection of finite mixture models combined with a false discovery rate (FDR) criterion to find statistically significant regions. This leads to a more reliable assessment of differential binding sites based on a statistical approach. In addition to ChIP-seq, DIME is also applicable to data from other high-throughput platforms. Availability and implementation: DIME is implemented as an R-package, which is available at http://www.stat.osu.edu/~statgen/SOFTWARE/DIME. It may also be downloaded from http://cran.r-project.org/web/packages/DIME/. Contact: shili@stat.osu.edu PMID:21471015

  11. A novel approach for baseline correction in 1H-MRS signals based on ensemble empirical mode decomposition.

    PubMed

    Parto Dezfouli, Mohammad Ali; Dezfouli, Mohsen Parto; Rad, Hamidreza Saligheh

    2014-01-01

    Proton magnetic resonance spectroscopy ((1)H-MRS) is a non-invasive diagnostic tool for measuring biochemical changes in the human body. Acquired (1)H-MRS signals may be corrupted by a wideband baseline signal generated by macromolecules. Recently, several methods have been developed for the correction of such baseline signals; however, most of them are unable to estimate the baseline in complex, overlapped signals. In this study, a novel automatic baseline correction method is proposed for (1)H-MRS spectra based on ensemble empirical mode decomposition (EEMD). The method was applied to both simulated data and in-vivo (1)H-MRS signals of the human brain. The results demonstrate the efficiency of the proposed method in removing the baseline from (1)H-MRS signals.

  12. Interference-based molecular transistors

    PubMed Central

    Li, Ying; Mol, Jan A.; Benjamin, Simon C.; Briggs, G. Andrew D.

    2016-01-01

    Molecular transistors have the potential for switching with lower gate voltages than conventional field-effect transistors. We have calculated the performance of a single-molecule device in which there is interference between electron transport through the highest occupied molecular orbital and the lowest unoccupied molecular orbital of a single molecule. Quantum interference results in a subthreshold slope that is independent of temperature. For realistic parameters the change in gate potential required for a change in source-drain current of two decades is 20 mV, which is a factor of six smaller than the theoretical limit for a metal-oxide-semiconductor field-effect transistor. PMID:27646692
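
    The factor of six quoted above can be checked against the thermionic (Boltzmann) limit on the subthreshold swing of a conventional transistor, (kT/q) ln 10 per decade of current, assuming room temperature:

```python
import numpy as np

k_B = 1.380649e-23      # Boltzmann constant, J/K
q   = 1.602176634e-19   # elementary charge, C
T   = 300.0             # K (room temperature, assumed)

# Thermionic (Boltzmann) limit on the subthreshold swing of a MOSFET.
swing_limit = (k_B * T / q) * np.log(10) * 1e3   # mV per decade of current
two_decades = 2 * swing_limit                    # mV for a 100x current change

print(f"MOSFET limit: {swing_limit:.1f} mV/decade, {two_decades:.0f} mV for two decades")
print(f"Ratio to the reported 20 mV: {two_decades / 20:.1f}")   # roughly 6, as stated above
```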

  13. Tangent-linear and ensemble-based four-dimensional data assimilation strategies applied for assimilating conventional data and field observations for Hurricane Karl (2010)

    NASA Astrophysics Data System (ADS)

    Poterjoy, J.; Zhang, F.

    2014-12-01

    Two advanced four-dimensional ensemble data assimilation systems are applied for studying the genesis of Hurricane Karl (2010) using conventional observations and measurements collected during the Pre-Depression Investigation of Cloud Systems in the Tropics (PREDICT) field campaign. Both methods combine strategies from four-dimensional variational (4DVar) and Ensemble Kalman filter (EnKF) data assimilation techniques that have been developed for the Weather Research and Forecasting model. The first method, denoted E4DVar, operates in a manner similar to the traditional 4DVar data assimilation system, but with hybrid climate/ensemble background errors. The second method, denoted 4DEnVar, uses an ensemble of nonlinear model trajectories to replace the function of tangent linear and adjoint model operators in 4DVar, thus improving the parallelization of the data assimilation. Simulations initialized from E4DVar and 4DEnVar analyses provide track, genesis and intensity forecasts for Karl that are more accurate than an ensemble hybrid data assimilation method based on 3DVar (E3DVar). The two 4-D data assimilation methods are applied for studying Karl's genesis, while comparing their theoretical advantages and disadvantages for an application where the system dynamics evolve quickly in time, and are constrained by an unusually high number of in situ observations.

  14. Multi-model ensemble forecasts of tropical cyclones in 2010 and 2011 based on the Kalman Filter method

    NASA Astrophysics Data System (ADS)

    He, Chengfei; Zhi, Xiefei; You, Qinglong; Song, Bin; Fraedrich, Klaus

    2015-08-01

    This study conducted 24- to 72-h multi-model ensemble forecasts to explore the tracks and intensities (central mean sea level pressure) of tropical cyclones (TCs). Forecast data for the northwestern Pacific basin in 2010 and 2011 were selected from the China Meteorological Administration, European Centre for Medium-Range Weather Forecasts (ECMWF), Japan Meteorological Agency, and National Centers for Environmental Prediction datasets of the Observing System Research and Predictability Experiment Interactive Grand Global Ensemble project. The Kalman Filter was employed to conduct the TC forecasts, along with the ensemble mean and super-ensemble for comparison. The following results were obtained: (1) The statistical-dynamic Kalman Filter, in which recent observations are given more importance and model weighting coefficients are adjusted over time, produced quite different results from that of the super-ensemble. (2) The Kalman Filter reduced the TC mean absolute track forecast error by approximately 50, 80 and 100 km in the 24-, 48- and 72-h forecasts, respectively, compared with the best individual model (ECMWF). Also, the intensity forecasts were improved by the Kalman Filter to some extent in terms of average intensity deviation (AID) and correlation coefficients with reanalysis intensity data. Overall, the Kalman Filter technique performed better compared to multi-models, the ensemble mean, and the super-ensemble in 3-day forecasts. The implication of this study is that this technique appears to be a very promising statistical-dynamic method for multi-model ensemble forecasts of TCs.
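
    One common way to realize such a statistical-dynamical combination is to treat the model weights as the state of a Kalman filter, so that recent observations dominate and the weights adjust over time. The sketch below is a toy scalar version of that idea, not the forecast system used in the study; the process and observation variances are illustrative.

```python
import numpy as np

def kalman_weight_update(w, P, forecasts, obs, q=1e-3, r=1.0):
    """One Kalman-filter update of multi-model combination weights.

    State: weight vector w, evolving as a random walk with process variance q.
    Observation: obs = forecasts . w + noise (variance r).
    Returns the updated (w, P)."""
    H = np.asarray(forecasts, dtype=float).reshape(1, -1)   # observation operator
    P = P + q * np.eye(len(w))                               # predict step (random walk)
    S = H @ P @ H.T + r                                      # innovation variance
    K = (P @ H.T) / S                                        # Kalman gain
    w = w + (K * (obs - H @ w)).ravel()
    P = (np.eye(len(w)) - K @ H) @ P
    return w, P

# Toy example: combine three model forecasts of a scalar quantity over time.
rng = np.random.default_rng(3)
truth = np.cumsum(rng.standard_normal(300))
models = truth[:, None] + rng.standard_normal((300, 3)) * np.array([1.0, 2.0, 0.5])

w, P = np.full(3, 1 / 3), np.eye(3)          # start from equal weights (ensemble mean)
errs_kf, errs_mean = [], []
for t in range(300):
    errs_kf.append(abs(models[t] @ w - truth[t]))      # forecast made before seeing obs at t
    errs_mean.append(abs(models[t].mean() - truth[t]))
    w, P = kalman_weight_update(w, P, models[t], truth[t])
print("mean abs error: Kalman weights %.2f vs equal weights %.2f"
      % (np.mean(errs_kf), np.mean(errs_mean)))
```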

  15. Gold price analysis based on ensemble empirical model decomposition and independent component analysis

    NASA Astrophysics Data System (ADS)

    Xian, Lu; He, Kaijian; Lai, Kin Keung

    2016-07-01

    In recent years, the increasing volatility of the gold price has received increasing attention from academia and industry alike. Due to the complexity and significant fluctuations observed in the gold market, however, most current approaches have failed to produce robust and consistent modeling and forecasting results. Ensemble Empirical Mode Decomposition (EEMD) and Independent Component Analysis (ICA) are novel data analysis methods that can deal with nonlinear and non-stationary time series. This study introduces a new methodology which combines the two methods and applies it to gold price analysis. It comprises three steps: first, the original gold price series is decomposed into several Intrinsic Mode Functions (IMFs) by EEMD. Second, the IMFs are further processed, with unimportant ones re-grouped, and a new set of data called Virtual Intrinsic Mode Functions (VIMFs) is reconstructed. Finally, ICA is used to decompose the VIMFs into statistically Independent Components (ICs). The decomposition results reveal that the gold price series can be represented by a linear combination of the ICs. Furthermore, the economic meanings of the ICs are analyzed and discussed in detail, according to the change trends and the ICs' transformation coefficients. The analyses not only explain the inner driving factors and their impacts but also provide an in-depth analysis of how these factors affect the gold price. At the same time, regression analysis has been conducted to verify our analysis. Results from the empirical studies in the gold markets show that EEMD-ICA serves as an effective technique for gold price analysis from a new perspective.
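
    A minimal version of the EEMD-ICA pipeline can be put together from off-the-shelf components, assuming the PyEMD package (installed as EMD-signal) and scikit-learn are available. The regrouping rule used here (merging low-variance IMFs) is only a placeholder for the paper's selection of "unimportant" IMFs, and the synthetic series stands in for the gold price data.

```python
import numpy as np
from PyEMD import EEMD                      # pip install EMD-signal (assumed available)
from sklearn.decomposition import FastICA

rng = np.random.default_rng(4)
t = np.linspace(0, 1, 1024)
# Stand-in for a gold-price series: trend + cycle + noise.
price = 0.5 * t ** 2 + 0.1 * np.sin(2 * np.pi * 12 * t) + 0.02 * rng.standard_normal(t.size)

# Step 1: decompose the series into IMFs with EEMD.
imfs = EEMD(trials=50).eemd(price, t)

# Step 2: regroup "unimportant" (here: low-variance) IMFs into a single component,
# giving a reduced set of virtual IMFs (VIMFs).
variances = imfs.var(axis=1)
keep = variances >= np.median(variances)
vimfs = np.vstack([imfs[keep], imfs[~keep].sum(axis=0)])

# Step 3: extract statistically independent components from the VIMFs with ICA.
ics = FastICA(n_components=min(3, vimfs.shape[0]), random_state=0).fit_transform(vimfs.T)
print("IMFs:", imfs.shape[0], "VIMFs:", vimfs.shape[0], "ICs:", ics.shape[1])
```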

  16. Ensemble Empirical Mode Decomposition based methodology for ultrasonic testing of coarse grain austenitic stainless steels.

    PubMed

    Sharma, Govind K; Kumar, Anish; Jayakumar, T; Purnachandra Rao, B; Mariyappa, N

    2015-03-01

    A signal processing methodology is proposed in this paper for effective reconstruction of ultrasonic signals in coarse-grained, highly scattering austenitic stainless steel. The proposed methodology comprises Ensemble Empirical Mode Decomposition (EEMD) processing of the ultrasonic signals and application of a signal minimisation algorithm to selected Intrinsic Mode Functions (IMFs) obtained by EEMD. The methodology is applied to ultrasonic signals obtained from austenitic stainless steel specimens of different grain size, with and without defects. The influence of probe frequency and the data length of a signal on the EEMD decomposition is also investigated. For a particular sampling rate and probe frequency, the same range of IMFs can be used to reconstruct the ultrasonic signal, irrespective of the grain size in the range of 30-210 μm investigated in this study. This methodology is successfully employed for the detection of defects in 50 mm thick coarse-grained austenitic stainless steel specimens. A signal-to-noise ratio improvement of better than 15 dB is observed for the ultrasonic signal obtained from a 25 mm deep flat-bottom hole in a 200 μm grain size specimen. For ultrasonic signals obtained from defects at different depths, a minimum of 7 dB additional enhancement in SNR is achieved compared to the sum-of-selected-IMFs approach. The application of the minimisation algorithm to the EEMD-processed signal in the proposed methodology proves to be effective for adaptive signal reconstruction with improved signal-to-noise ratio. The methodology was further employed for successful imaging of defects in a B-scan.

  17. Ensemble pharmacophore meets ensemble docking: a novel screening strategy for the identification of RIPK1 inhibitors

    NASA Astrophysics Data System (ADS)

    Fayaz, S. M.; Rajanikant, G. K.

    2014-07-01

    Programmed cell death has been a fascinating area of research since it throws up new challenges and questions in spite of the tremendous ongoing research in this field. Recently, necroptosis, a programmed form of necrotic cell death, has been implicated in many diseases including neurological disorders. Receptor interacting serine/threonine protein kinase 1 (RIPK1) is an important regulatory protein involved in necroptosis, and inhibition of this protein is essential to stop the necroptotic process and eventually cell death. Current structure-based virtual screening methods involve a wide range of strategies, and recently the use of multiple protein structures for pharmacophore extraction has been emphasized as a way to improve the outcome. However, it is important to use the pharmacophoric information fully during docking. Further, in such methods, using the appropriate protein structures for docking is desirable. If not, potential compound hits obtained through pharmacophore-based screening may not have correct ranks and scores after docking. Therefore, a comprehensive integration of different ensemble methods is essential, which may provide better virtual screening results. In this study, dual ensemble screening, a novel computational strategy, was used to identify diverse and potent inhibitors against RIPK1. All the pharmacophore features present in the binding site were captured using both the apo and holo protein structures, and an ensemble pharmacophore was built by combining these features. This ensemble pharmacophore was employed in pharmacophore-based screening of the ZINC database. The compound hits thus obtained were subjected to ensemble docking. The leads acquired through docking were further validated through feature evaluation and molecular dynamics simulation.

  18. Multiple time scale molecular dynamics for fluids with orientational degrees of freedom. II. Canonical and isokinetic ensembles

    NASA Astrophysics Data System (ADS)

    Omelyan, Igor P.; Kovalenko, Andriy

    2011-12-01

    We have developed several multiple time stepping techniques to overcome the limitations on the efficiency of molecular dynamics simulations of complex fluids. They include the modified canonical and isokinetic schemes, as well as the extended isokinetic Nosé-Hoover chain approach. The latter generalizes the method of Minary, Tuckerman, and Martyna for translational motion [Phys. Rev. Lett. 93, 150201 (2004)], 10.1103/PhysRevLett.93.150201 to systems with both translational and orientational degrees of freedom. Although the microcanonical integrators are restricted to relatively small outer time steps of the order of 16 fs, we show on the basis of molecular dynamics simulations of ambient water that in the canonical and isokinetic thermostats the size of these steps can be increased to 50 and 75 fs, respectively (at the same inner time step of 4 fs). Within the generalized isokinetic Nosé-Hoover chain algorithm we have derived, huge outer time steps of the order of 500 fs can be used without losing numerical stability or affecting equilibrium properties.

  19. Large-Scale Hybrid Density Functional Theory Calculations in the Condensed-Phase: Ab Initio Molecular Dynamics in the Isobaric-Isothermal Ensemble

    NASA Astrophysics Data System (ADS)

    Ko, Hsin-Yu; Santra, Biswajit; Distasio, Robert A., Jr.; Wu, Xifan; Car, Roberto

    Hybrid functionals are known to alleviate the self-interaction error in density functional theory (DFT) and provide a more accurate description of the electronic structure of molecules and materials. However, hybrid DFT in the condensed phase has a prohibitively high computational cost, which limits its applicability to large systems of interest. In this work, we present a general-purpose order-N implementation of hybrid DFT in the condensed phase using maximally localized Wannier functions; this implementation is optimized for massively parallel computing architectures. This algorithm is used to perform large-scale ab initio molecular dynamics simulations of liquid water, ice, and aqueous ionic solutions. We have performed simulations in the isothermal-isobaric ensemble to quantify the effects of exact exchange on the equilibrium density properties of water at different thermodynamic conditions. We find that the anomalous density difference between ice Ih and liquid water at ambient conditions, as well as the enthalpy differences between the ice Ih, II, and III phases at the experimental triple point (238 K and 20 kbar), are significantly improved using hybrid DFT over previous estimates obtained with the lower rungs of DFT. This work has been supported by the Department of Energy under Grants No. DE-FG02-05ER46201 and DE-SC0008626.

  20. Interplay of hole transfer and host-guest interaction in a molecular dyad and triad: ensemble and single-molecule spectroscopy and sensing applications.

    PubMed

    Wu, Xiangyang; Liu, Fang; Wells, Kym L; Tan, Serena L J; Webster, Richard D; Tan, Howe-Siang; Zhang, Dawei; Xing, Bengang; Yeow, Edwin K L

    2015-02-16

    A new molecular dyad consisting of a Cy5 chromophore and ferrocene (Fc) and a triad consisting of Cy5, Fc, and β-cyclodextrin (CD) are synthesized and their photophysical properties investigated at both the ensemble and single-molecule levels. Hole transfer efficiency from Cy5 to Fc in the dyad is reduced upon addition of CD. This is due to an increase in the Cy5-Fc separation (r) when the Fc is encapsulated in the macrocyclic host. On the other hand, the triad adopts either a Fc-CD inclusion complex conformation in which hole transfer quenching of the Cy5 by Fc is minimal or a quasi-static conformation with short r and rapid charge transfer. Single-molecule fluorescence measurements reveal that r is lengthened when the triad molecules are deposited on a glass substrate. By combining intramolecular charge transfer and competitive supramolecular interaction, the triad acts as an efficient chemical sensor to detect different bioactive analytes such as amantadine hydrochloride and sodium lithocholate in aqueous solution and synthetic urine.

  1. A mapping of an ensemble of mitochondrial sequences for various organisms into 3D space based on the word composition.

    PubMed

    Aita, Takuyo; Nishigaki, Koichi

    2012-11-01

    To visualize a bird's-eye view of an ensemble of mitochondrial genome sequences for various species, we recently developed a novel method of mapping a biological sequence ensemble into Three-Dimensional (3D) vector space. First, we represented a biological sequence of a species s by a word-composition vector x(s), where its length |x(s)| represents the sequence length, its unit vector x(s)/|x(s)| represents the relative composition of the K-tuple words through the sequence, and the dimension N = 4^K is the number of all possible words of length K. Second, we mapped the vector x(s) to the 3D position vector y(s), based on the two following simple principles: (1) |y(s)| = |x(s)|, and (2) the angle between y(s) and y(t) maximally correlates with the angle between x(s) and x(t). The mitochondrial genome sequences for 311 species, including 177 Animalia, 85 Fungi and 49 Green plants, were mapped into 3D space by using K=7. The mapping was successful because the angles between vectors before and after the mapping highly correlated with each other (correlation coefficients were 0.92-0.97). Interestingly, the Animalia kingdom is distributed along a single arc belt (just like the Milky Way on a Celestial Globe), and the Fungi and Green plant kingdoms are distributed in a similar arc belt. These two arc belts intersect at their respective middle regions and form a cross structure just like a jet aircraft fuselage and its wings. This new mapping method will allow researchers to intuitively interpret the visual information presented in the maps in a highly effective manner.
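
    The word-composition vector x(s) is easy to reproduce; the 3D mapping itself is an optimization in the original work, for which the sketch below substitutes a simple PCA of the unit composition vectors followed by rescaling so that |y(s)| = |x(s)|. This preserves the norms exactly and the angles only approximately, so it illustrates the two principles rather than reproducing the authors' method; the toy sequences are random stand-ins for mitochondrial genomes.

```python
import numpy as np
from itertools import product
from sklearn.decomposition import PCA

def composition_vector(seq, k=3):
    """Word-composition vector x(s): its norm equals the sequence length and its
    unit vector gives the relative k-tuple composition (4**k dimensions)."""
    words = ["".join(p) for p in product("ACGT", repeat=k)]
    index = {w: i for i, w in enumerate(words)}
    counts = np.zeros(len(words))
    for i in range(len(seq) - k + 1):
        w = seq[i:i + k]
        if w in index:
            counts[index[w]] += 1
    unit = counts / (np.linalg.norm(counts) or 1.0)
    return len(seq) * unit

def map_to_3d(vectors):
    """Project composition vectors to 3D, preserving each vector's norm and
    approximating the original angles via PCA of the unit vectors."""
    X = np.asarray(vectors, dtype=float)
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Y = PCA(n_components=3).fit_transform(X / norms)
    # Rescale so that |y(s)| = |x(s)|, as required by the first mapping principle.
    return Y / (np.linalg.norm(Y, axis=1, keepdims=True) + 1e-12) * norms

# Toy sequences standing in for mitochondrial genomes.
rng = np.random.default_rng(5)
seqs = ["".join(rng.choice(list("ACGT"), size=n)) for n in (1500, 1600, 1700, 1800)]
print(map_to_3d([composition_vector(s, k=3) for s in seqs]))
```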

  2. Quantifying the Usefulness of Ensemble-Based Precipitation Forecasts with Respect to Water Use and Yield during a Field Trial

    NASA Astrophysics Data System (ADS)

    Christ, E.; Webster, P. J.; Collins, G.; Byrd, S.

    2014-12-01

    Recent droughts and the continuing water wars between the states of Georgia, Alabama and Florida have made agricultural producers more aware of the importance of managing their irrigation systems more efficiently. Many southeastern states are beginning to consider laws that will require monitoring and regulation of water used for irrigation. Recently, Georgia suspended issuing irrigation permits in some areas of the southwestern portion of the state to try to limit the amount of water being used for irrigation. However, even in southern Georgia, which receives on average between 23 and 33 inches of rain during the growing season, irrigation can significantly impact crop yields. In fact, studies have shown that when fields do not receive rainfall at the most critical stages in the life of cotton, yields for irrigated fields can be up to twice those of non-irrigated fields. This leads to the motivation for this study, which is to produce a forecast tool that will enable producers to make more efficient irrigation management decisions. We will use the ECMWF (European Centre for Medium-Range Weather Forecasts) vars EPS (Ensemble Prediction System) model precipitation forecasts for the grid points included in the 1° x 1° lat/lon square surrounding the point of interest. We will then apply q-to-q bias corrections to the forecasts. Once we have applied the bias corrections, we will use the checkbook method of irrigation scheduling to determine the probability of receiving the required amount of rainfall for each week of the growing season. These forecasts will be used during a field trial conducted at the CM Stripling Irrigation Research Park in Camilla, Georgia. This research will compare differences in yield and water use among the standard checkbook method of irrigation, which uses no precipitation forecast knowledge, the weather.com forecast, a dry land plot, and the ensemble-based forecasts mentioned above.

  3. Representing Color Ensembles.

    PubMed

    Chetverikov, Andrey; Campana, Gianluca; Kristjánsson, Árni

    2017-10-01

    Colors are rarely uniform, yet little is known about how people represent color distributions. We introduce a new method for studying color ensembles based on intertrial learning in visual search. Participants looked for an oddly colored diamond among diamonds with colors taken from either uniform or Gaussian color distributions. On test trials, the targets had various distances in feature space from the mean of the preceding distractor color distribution. Targets on test trials therefore served as probes into probabilistic representations of distractor colors. Test-trial response times revealed a striking similarity between the physical distribution of colors and their internal representations. The results demonstrate that the visual system represents color ensembles in a more detailed way than previously thought, coding not only mean and variance but, most surprisingly, the actual shape (uniform or Gaussian) of the distribution of colors in the environment.

  4. Tailored Random Graph Ensembles

    NASA Astrophysics Data System (ADS)

    Roberts, E. S.; Annibale, A.; Coolen, A. C. C.

    2013-02-01

    Tailored graph ensembles are a developing bridge between biological networks and statistical mechanics. The aim is to use this concept to generate a suite of rigorous tools that can be used to quantify and compare the topology of cellular signalling networks, such as protein-protein interaction networks and gene regulation networks. We calculate exact and explicit formulae for the leading orders in the system size of the Shannon entropies of random graph ensembles constrained with degree distribution and degree-degree correlation. We also construct an ergodic detailed balance Markov chain with non-trivial acceptance probabilities which converges to a strictly uniform measure and is based on edge swaps that conserve all degrees. The acceptance probabilities can be generalized to define Markov chains that target any alternative desired measure on the space of directed or undirected graphs, in order to generate graphs with more sophisticated topological features.
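
    The basic move underlying such constrained ensembles is the degree-preserving edge swap (a,b),(c,d) -> (a,d),(c,b). The sketch below applies the naive version of this move, rejecting swaps that would create self-loops or duplicate edges; the paper's contribution, the non-trivial acceptance probabilities that make the sampling converge to a strictly uniform measure, is not reproduced here.

```python
import random

def degree_preserving_swap(edges, n_swaps=1000, seed=0):
    """Randomize an undirected simple graph by repeated edge swaps
    (a,b),(c,d) -> (a,d),(c,b), which conserve every node's degree.

    `edges` is a set of frozensets. Swaps creating self-loops or duplicate
    edges are rejected, so the graph stays simple."""
    rng = random.Random(seed)
    edge_set = set(frozenset(e) for e in edges)
    edge_list = [tuple(e) for e in edge_set]
    for _ in range(n_swaps):
        (a, b), (c, d) = rng.sample(edge_list, 2)
        if len({a, b, c, d}) < 4:
            continue                       # would create a self-loop
        new1, new2 = frozenset((a, d)), frozenset((c, b))
        if new1 in edge_set or new2 in edge_set:
            continue                       # would create a duplicate edge
        edge_set -= {frozenset((a, b)), frozenset((c, d))}
        edge_set |= {new1, new2}
        edge_list = [tuple(e) for e in edge_set]
    return edge_set

# Small example: a 6-node ring keeps all degrees equal to 2 after shuffling.
ring = {frozenset((i, (i + 1) % 6)) for i in range(6)}
shuffled = degree_preserving_swap(ring, n_swaps=200)
degrees = {}
for e in shuffled:
    for v in e:
        degrees[v] = degrees.get(v, 0) + 1
print(sorted(degrees.values()))   # -> [2, 2, 2, 2, 2, 2]
```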

  5. Molecular simulation of aqueous electrolyte solubility. 2. Osmotic ensemble Monte Carlo methodology for free energy and solubility calculations and application to NaCl.

    PubMed

    Moučka, Filip; Lísal, Martin; Škvor, Jiří; Jirsák, Jan; Nezbeda, Ivo; Smith, William R

    2011-06-23

    We present a new and computationally efficient methodology using osmotic ensemble Monte Carlo (OEMC) simulation to calculate chemical potential-concentration curves and the solubility of aqueous electrolytes. The method avoids calculations for the solid phase, incorporating readily available data from thermochemical tables that are based on well-defined reference states. It performs simulations of the aqueous solution at a fixed number of water molecules, pressure, temperature, and specified overall electrolyte chemical potential. Insertion/deletion of ions to/from the system is implemented using fractional ions, which are coupled to the system via a coupling parameter λ that varies between 0 (no interaction between the fractional ions and the other particles in the system) and 1 (full interaction between the fractional ions and the other particles of the system). Transitions between λ-states are accepted with a probability following from the osmotic ensemble partition function. Biasing weights associated with the λ-states are used in order to efficiently realize transitions between them; these are determined by means of the Wang-Landau method. We also propose a novel scaling procedure for λ, which can be used for both nonpolarizable and polarizable models of aqueous electrolyte systems. The approach is readily extended to involve other solvents, multiple electrolytes, and species complexation reactions. The method is illustrated for NaCl, using SPC/E water and several force field models for NaCl from the literature, and the results are compared with experiment at ambient conditions. Good agreement is obtained for the chemical potential-concentration curve and the solubility prediction is reasonable. Future improvements to the predictions will require improved force field models.

  6. Residue-level global and local ensemble-ensemble comparisons of protein domains

    PubMed Central

    Clark, Sarah A; Tronrud, Dale E; Andrew Karplus, P

    2015-01-01

    Many methods of protein structure generation such as NMR-based solution structure determination and template-based modeling do not produce a single model, but an ensemble of models consistent with the available information. Current strategies for comparing ensembles lose information because they use only a single representative structure. Here, we describe the ENSEMBLATOR and its novel strategy to directly compare two ensembles containing the same atoms to identify significant global and local backbone differences between them on per-atom and per-residue levels, respectively. The ENSEMBLATOR has four components: eePREP (ee for ensemble-ensemble), which selects atoms common to all models; eeCORE, which identifies atoms belonging to a cutoff-distance dependent common core; eeGLOBAL, which globally superimposes all models using the defined core atoms and calculates for each atom the two intraensemble variations, the interensemble variation, and the closest approach of members of the two ensembles; and eeLOCAL, which performs a local overlay of each dipeptide and, using a novel measure of local backbone similarity, reports the same four variations as eeGLOBAL. The combination of eeGLOBAL and eeLOCAL analyses identifies the most significant differences between ensembles. We illustrate the ENSEMBLATOR's capabilities by showing how using it to analyze NMR ensembles and to compare NMR ensembles with crystal structures provides novel insights compared to published studies. One of these studies leads us to suggest that a “consistency check” of NMR-derived ensembles may be a useful analysis step for NMR-based structure determinations in general. The ENSEMBLATOR 1.0 is available as a first generation tool to carry out ensemble-ensemble comparisons. PMID:26032515

  7. Structural Ensembles of Membrane-bound α-Synuclein Reveal the Molecular Determinants of Synaptic Vesicle Affinity

    PubMed Central

    Fusco, Giuliana; De Simone, Alfonso; Arosio, Paolo; Vendruscolo, Michele; Veglia, Gianluigi; Dobson, Christopher M.

    2016-01-01

    A detailed characterisation of the molecular determinants of membrane binding by α-synuclein (αS), a 140-residue protein whose aggregation is associated with Parkinson’s disease, is of fundamental significance to clarify the manner in which the balance between functional and dysfunctional processes is regulated for this protein. Despite its biological relevance, the structural nature of the membrane-bound state of αS remains elusive, in part because of the intrinsically dynamic nature of the protein and also because of the difficulties in studying this state in a physiologically relevant environment. In the present study we have used solid-state NMR and restrained MD simulations to refine the structure and topology of the N-terminal region of αS bound to the surface of synaptic-like membranes. This region has fundamental importance in the binding mechanism of αS as it acts to anchor the protein to lipid bilayers. The results enabled the identification of the key elements for the biological properties of αS in its membrane-bound state.

  8. Evaluation of WRF-based convection-permitting multi-physics ensemble forecasts over China for an extreme rainfall event on 21 July 2012 in Beijing

    NASA Astrophysics Data System (ADS)

    Zhu, Kefeng; Xue, Ming

    2016-11-01

    On 21 July 2012, an extreme rainfall event, with a maximum 24-h rainfall of 460 mm, occurred in Beijing, China. Most operational models failed to predict such an extreme amount. In this study, a convection-permitting ensemble forecast system (CEFS), at 4-km grid spacing and covering the entire mainland of China, is applied to this extreme rainfall case. CEFS consists of 22 members and uses multiple physics parameterizations. For the event, the predicted maximum is 415 mm/day in the probability-matched ensemble mean. The predicted high-probability heavy rain region is located in southwest Beijing, as was observed. Ensemble-based verification scores are then investigated. For a small verification domain covering Beijing and its surrounding areas, the precipitation rank histogram of CEFS is much flatter than that of a reference global ensemble. CEFS has a lower (better) Brier score and a higher resolution than the global ensemble for precipitation, indicating more reliable probabilistic forecasting by CEFS. Additionally, forecasts of different ensemble members are compared and discussed. Most of the extreme rainfall comes from convection in the warm sector east of an approaching cold front. A few members of CEFS successfully reproduce such precipitation, and orographic lift of highly moist low-level flows with a significant southeasterly component is suggested to have played an important role in producing the initial convection. Comparisons between good and bad forecast members indicate a strong sensitivity of the extreme rainfall to the mesoscale environmental conditions and, to a lesser extent, to the model physics.
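
    The ensemble verification measures mentioned above are simple to compute once forecasts and observations are in hand. The sketch below shows a Brier score for a binary rainfall-exceedance event and a rank histogram, with synthetic data standing in for the CEFS forecasts; the 20 mm threshold is an arbitrary illustrative choice.

```python
import numpy as np

def brier_score(prob_forecasts, outcomes):
    """Brier score for probabilistic forecasts of a binary event
    (e.g. 24-h rainfall exceeding a threshold); lower is better."""
    p = np.asarray(prob_forecasts, dtype=float)
    o = np.asarray(outcomes, dtype=float)
    return np.mean((p - o) ** 2)

def rank_histogram(ensemble_forecasts, observations):
    """Counts of where each observation ranks within its ensemble; a flat
    histogram indicates a statistically reliable ensemble spread."""
    ranks = [int(np.searchsorted(np.sort(members), obs))
             for members, obs in zip(ensemble_forecasts, observations)]
    n_members = np.asarray(ensemble_forecasts).shape[1]
    return np.bincount(ranks, minlength=n_members + 1)

# Toy check: a 22-member ensemble (as in CEFS) verified against synthetic "observations".
rng = np.random.default_rng(8)
ens = rng.gamma(2.0, 10.0, size=(500, 22))      # ensemble rainfall forecasts (mm)
obs = rng.gamma(2.0, 10.0, size=500)            # observations from the same distribution
event = obs > 20.0
prob = (ens > 20.0).mean(axis=1)
print("Brier score:", round(brier_score(prob, event), 3))
print("Rank histogram:", rank_histogram(ens, obs))
```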

  9. The Protein Ensemble Database.

    PubMed

    Varadi, Mihaly; Tompa, Peter

    2015-01-01

    The scientific community's major conceptual notion of structural biology has recently shifted in emphasis from the classical structure-function paradigm due to the emergence of intrinsically disordered proteins (IDPs). As opposed to their folded cousins, these proteins are defined by the lack of a stable 3D fold and a high degree of inherent structural heterogeneity that is closely tied to their function. Due to their flexible nature, solution techniques such as small-angle X-ray scattering (SAXS), nuclear magnetic resonance (NMR) spectroscopy and fluorescence resonance energy transfer (FRET) are particularly well-suited for characterizing their biophysical properties. Computationally derived structural ensembles based on such experimental measurements provide models of the conformational sampling displayed by these proteins, and they may offer valuable insights into the functional consequences of inherent flexibility. The Protein Ensemble Database (http://pedb.vib.be) is the first openly accessible, manually curated online resource storing the ensemble models, protocols used during the calculation procedure, and underlying primary experimental data derived from SAXS and/or NMR measurements. By making this previously inaccessible data freely available to researchers, this novel resource is expected to promote the development of more advanced modelling methodologies, facilitate the design of standardized calculation protocols, and consequently lead to a better understanding of how function arises from the disordered state.

  10. One-day-ahead streamflow forecasting via super-ensembles of several neural network architectures based on the Multi-Level Diversity Model

    NASA Astrophysics Data System (ADS)

    Brochero, Darwin; Hajji, Islem; Pina, Jasson; Plana, Queralt; Sylvain, Jean-Daniel; Vergeynst, Jenna; Anctil, Francois

    2015-04-01

    Theories about generalization error with ensembles are mainly based on the diversity concept, which promotes resorting to many members of different properties to support mutually agreeable decisions. Kuncheva (2004) proposed the Multi Level Diversity Model (MLDM) to promote diversity in model ensembles, combining different data subsets, input subsets, models, parameters, and including a combiner level in order to optimize the final ensemble. This work tests the hypothesis about the minimisation of the generalization error with ensembles of Neural Network (NN) structures. We used the MLDM to evaluate two different scenarios: (i) ensembles from a same NN architecture, and (ii) a super-ensemble built by a combination of sub-ensembles of many NN architectures. The time series used correspond to the 12 basins of the MOdel Parameter Estimation eXperiment (MOPEX) project that were used by Duan et al. (2006) and Vos (2013) as benchmark. Six architectures are evaluated: FeedForward NN (FFNN) trained with the Levenberg Marquardt algorithm (Hagan et al., 1996), FFNN trained with SCE (Duan et al., 1993), Recurrent NN trained with a complex method (Weins et al., 2008), Dynamic NARX NN (Leontaritis and Billings, 1985), Echo State Network (ESN), and leak integrator neuron (L-ESN) (Lukosevicius and Jaeger, 2009). Each architecture performs separately an Input Variable Selection (IVS) according to a forward stepwise selection (Anctil et al., 2009) using mean square error as objective function. Post-processing by Predictor Stepwise Selection (PSS) of the super-ensemble has been done following the method proposed by Brochero et al. (2011). IVS results showed that the lagged stream flow, lagged precipitation, and Standardized Precipitation Index (SPI) (McKee et al., 1993) were the most relevant variables. They were respectively selected as one of the firsts three selected variables in 66, 45, and 28 of the 72 scenarios. A relationship between aridity index (Arora, 2002) and NN

  11. Emotion Recognition of Weblog Sentences Based on an Ensemble Algorithm of Multi-label Classification and Word Emotions

    NASA Astrophysics Data System (ADS)

    Li, Ji; Ren, Fuji

    Weblogs have greatly changed the ways in which people communicate. Affective analysis of blog posts is found valuable for many applications such as text-to-speech synthesis or computer-assisted recommendation. Traditional emotion recognition in text based on single-label classification cannot satisfy the higher requirements of affective computing. In this paper, the automatic identification of sentence emotion in weblogs is modeled as a multi-label text categorization task. Experiments are carried out on 12273 blog sentences from the Chinese emotion corpus Ren_CECps with 8-dimension emotion annotation. The ensemble algorithm RAKEL is used to recognize dominant emotions from the writer's perspective. Our emotion feature, using a detailed intensity representation for word emotions, outperforms the other main features such as the word frequency feature and the traditional lexicon-based feature. In order to deal with relatively complex sentences, we integrate the grammatical characteristics of punctuation, disjunctive connectives, modification relations and negation into the features. This achieves increases of 13.51% and 12.49% in Micro-averaged F1 and Macro-averaged F1, respectively, compared to the traditional lexicon-based feature. The results show that a multiple-dimension emotion representation with grammatical features can efficiently classify sentence emotion in a multi-label problem.

  12. Assessing an ensemble Kalman filter inference of Manning's n coefficient of an idealized tidal inlet against a polynomial chaos-based MCMC

    NASA Astrophysics Data System (ADS)

    Siripatana, Adil; Mayo, Talea; Sraj, Ihab; Knio, Omar; Dawson, Clint; Le Maitre, Olivier; Hoteit, Ibrahim

    2017-08-01

    Bayesian estimation/inversion is commonly used to quantify and reduce modeling uncertainties in coastal ocean models, especially in the framework of parameter estimation. Based on Bayes' rule, the posterior probability distribution function (pdf) of the estimated quantities is obtained conditioned on available data. It can be computed either directly, using a Markov chain Monte Carlo (MCMC) approach, or by sequentially processing the data following a data assimilation approach, which is heavily exploited in large dimensional state estimation problems. The advantage of data assimilation schemes over MCMC-type methods arises from the ability to algorithmically accommodate a large number of uncertain quantities without a significant increase in the computational requirements. However, only approximate estimates are generally obtained by this approach due to the restricted Gaussian prior and noise assumptions that are generally imposed in these methods. This contribution aims at evaluating the effectiveness of utilizing an ensemble Kalman-based data assimilation method for parameter estimation of a coastal ocean model against an MCMC polynomial chaos (PC)-based scheme. We focus on quantifying the uncertainties of a coastal ocean ADvanced CIRCulation (ADCIRC) model with respect to the Manning's n coefficients. Based on a realistic framework of observation system simulation experiments (OSSEs), we apply an ensemble Kalman filter and the MCMC method employing a surrogate of ADCIRC constructed by a non-intrusive PC expansion for evaluating the likelihood, and test both approaches under identical scenarios. We study the sensitivity of the estimated posteriors with respect to the parameters of the inference methods, including ensemble size, inflation factor, and PC order. A full analysis of both methods, in the context of coastal ocean modeling, suggests that an ensemble Kalman filter with appropriate ensemble size and well-tuned inflation provides reliable mean estimates and
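
    The parameter-estimation step of an ensemble Kalman filter can be illustrated with a stochastic (perturbed-observation) update, in which the parameter ensemble is corrected through its sample covariance with the predicted observations. The sketch below is a toy version with a made-up scalar forward model standing in for ADCIRC; the repeated updates are only a crude way of handling the nonlinearity, and the inflation factor and ensemble size are the tuning knobs discussed above.

```python
import numpy as np

def enkf_update(params, forecasts, obs, obs_err_std, inflation=1.05, seed=0):
    """Stochastic (perturbed-observation) EnKF update of a parameter ensemble.

    params    : (N_ens, N_par) ensemble of parameters (e.g. Manning's n values)
    forecasts : (N_ens, N_obs) model-predicted observations for each member
    obs       : (N_obs,) observed values; obs_err_std : observation error std."""
    rng = np.random.default_rng(seed)
    X = np.asarray(params, dtype=float)
    Y = np.asarray(forecasts, dtype=float)

    # Multiplicative inflation of parameter anomalies to counter under-dispersion.
    X = X.mean(axis=0) + inflation * (X - X.mean(axis=0))

    Xa, Ya = X - X.mean(axis=0), Y - Y.mean(axis=0)
    n = X.shape[0]
    Pxy = Xa.T @ Ya / (n - 1)                                  # parameter-observation covariance
    Pyy = Ya.T @ Ya / (n - 1) + obs_err_std ** 2 * np.eye(Y.shape[1])
    K = Pxy @ np.linalg.inv(Pyy)                               # Kalman gain

    obs_pert = obs + obs_err_std * rng.standard_normal(Y.shape)  # perturbed observations
    return X + (obs_pert - Y) @ K.T

# Toy example: pull a prior ensemble centred on 0.05 toward a "true" roughness of 0.03,
# using a made-up forward model h(n) = 10 / n as a stand-in for ADCIRC water levels.
rng = np.random.default_rng(6)
true_n = 0.03
obs = np.array([10.0 / true_n]) + rng.normal(0.0, 5.0, size=1)
ens = rng.normal(0.05, 0.01, size=(50, 1))                     # prior parameter ensemble
for i in range(5):                                             # crude iteration for the nonlinearity
    ens = enkf_update(ens, 10.0 / ens, obs, obs_err_std=5.0, inflation=1.2, seed=i)
    ens = np.clip(ens, 1e-3, None)                             # keep roughness positive
print("posterior mean Manning's n:", round(float(ens.mean()), 4))
```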

  13. Niobate-based octahedral molecular sieves

    DOEpatents

    Nenoff, Tina M.; Nyman, May D.

    2006-10-17

    Niobate-based octahedral molecular sieves having significant activity for multivalent cations and a method for synthesizing such sieves are disclosed. The sieves have a net negatively charged octahedral framework, comprising niobium, oxygen, and octahedrally coordinated lower valence transition metals. The framework can be charge balanced by the occluded alkali cation from the synthesis method. The alkali cation can be exchanged for other contaminant metal ions. The ion-exchanged niobate-based octahedral molecular sieve can be backexchanged in acidic solutions to yield a solution concentrated in the contaminant metal. Alternatively, the ion-exchanged niobate-based octahedral molecular sieve can be thermally converted to a durable perovskite phase waste form.

  14. Niobate-based octahedral molecular sieves

    DOEpatents

    Nenoff, Tina M.; Nyman, May D.

    2003-07-22

    Niobate-based octahedral molecular sieves having significant activity for multivalent cations and a method for synthesizing such sieves are disclosed. The sieves have a net negatively charged octahedral framework, comprising niobium, oxygen, and octahedrally coordinated lower valence transition metals. The framework can be charge balanced by the occluded alkali cation from the synthesis method. The alkali cation can be exchanged for other contaminant metal ions. The ion-exchanged niobate-based octahedral molecular sieve can be backexchanged in acidic solutions to yield a solution concentrated in the contaminant metal. Alternatively, the ion-exchanged niobate-based octahedral molecular sieve can be thermally converted to a durable perovskite phase waste form.

  15. An Ensemble System Based on Hybrid EGARCH-ANN with Different Distributional Assumptions to Predict S&P 500 Intraday Volatility

    NASA Astrophysics Data System (ADS)

    Lahmiri, S.; Boukadoum, M.

    2015-10-01

    Accurate forecasting of stock market volatility is an important issue in portfolio risk management. In this paper, an ensemble system for stock market volatility is presented. It is composed of three different models that hybridize the exponential generalized autoregressive conditional heteroscedasticity (EGARCH) process and an artificial neural network trained with the backpropagation algorithm (BPNN) to forecast stock market volatility under normal, Student's t, and generalized error distribution (GED) assumptions separately. The goal is to design an ensemble system in which each single hybrid model is capable of capturing normality, excess skewness, or excess kurtosis in the data, so as to achieve complementarity. The performance of each EGARCH-BPNN model and of the ensemble system is evaluated by the closeness of the volatility forecasts to realized volatility. Based on the mean absolute error and the mean squared error, the experimental results show that the proposed ensemble model, which captures normality, skewness, and kurtosis in the data, is more accurate than the individual EGARCH-BPNN models in forecasting the S&P 500 intra-day volatility based on one- and five-minute time horizons.

  16. Parameter estimation in physically-based integrated hydrological models with the ensemble Kalman filter: a practical application.

    NASA Astrophysics Data System (ADS)

    Botto, Anna; Camporese, Matteo

    2017-04-01

    Hydrological models allow scientists to predict the response of water systems under varying forcing conditions. In particular, many physically-based integrated models were recently developed in order to understand the fundamental hydrological processes occurring at the catchment scale. However, the use of this class of hydrological models is still relatively limited, as their prediction skills heavily depend on reliable parameter estimation, an operation that is never trivial, being normally affected by large uncertainty and requiring huge computational effort. The objective of this work is to test the potential of data assimilation to be used as an inverse modeling procedure for the broad class of integrated hydrological models. To pursue this goal, a Bayesian data assimilation (DA) algorithm based on a Monte Carlo approach, namely the ensemble Kalman filter (EnKF), is combined with the CATchment HYdrology (CATHY) model. In this approach, input variables (atmospheric forcing, soil parameters, initial conditions) are statistically perturbed providing an ensemble of realizations aimed at taking into account the uncertainty involved in the process. Each realization is propagated forward by the CATHY hydrological model within a parallel R framework, developed to reduce the computational effort. When measurements are available, the EnKF is used to update both the system state and soil parameters. In particular, four different assimilation scenarios are applied to test the capability of the modeling framework: first only pressure head or water content are assimilated, then, the combination of both, and finally both pressure head and water content together with the subsurface outflow. To demonstrate the effectiveness of the approach in a real-world scenario, an artificial hillslope was designed and built to provide real measurements for the DA analyses. The experimental facility, located in the Department of Civil, Environmental and Architectural Engineering of the

  17. [MicroRNA Target Prediction Based on Support Vector Machine Ensemble Classification Algorithm of Under-sampling Technique].

    PubMed

    Chen, Zhiru; Hong, Wenxue

    2016-02-01

    Considering the low prediction accuracy for positive samples and the poor overall classification performance caused by the unbalanced sample data of microRNA (miRNA) targets, we propose a support vector machine-integration of under-sampling and weight (SVM-IUSM) algorithm in this paper, an under-sampling method based on ensemble learning. The algorithm adopts the SVM as the learning algorithm and AdaBoost as the integration framework, and embeds clustering-based under-sampling into the iterative process, aiming at reducing the degree of imbalance between positive and negative samples. Meanwhile, in the process of adaptive weight adjustment of the samples, the SVM-IUSM algorithm eliminates abnormal negative samples with a robust sample-weight smoothing mechanism so as to avoid over-learning. Finally, the prediction of the miRNA target integrated classifier is achieved by combining multiple weak classifiers through a voting mechanism. The experiments revealed that the SVM-IUSM, compared with other algorithms on unbalanced data sets, could not only improve the accuracy for positive targets and the overall classification performance, but also enhance the generalization ability of the miRNA target classifier.
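
    The general idea of combining clustering-based under-sampling with an SVM ensemble can be sketched with scikit-learn. The snippet below is a simplified stand-in for the SVM-IUSM algorithm: it keeps one negative sample per cluster to balance each training subset and combines the members by majority vote, without the AdaBoost weighting or the sample-weight smoothing described above; the synthetic data merely mimics an imbalanced miRNA-target feature set.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

def train_undersampled_svm_ensemble(X, y, n_members=5, seed=0):
    """Train an ensemble of SVMs, each on all positives plus a cluster-based
    under-sample of the negatives, to cope with class imbalance."""
    pos, neg = X[y == 1], X[y == 0]
    members = []
    for m in range(n_members):
        # Cluster the negatives and keep one representative per cluster, so the
        # kept negatives roughly match the number of positives (assumes pos < neg).
        km = KMeans(n_clusters=len(pos), n_init=5, random_state=seed + m).fit(neg)
        reps = np.array([neg[km.labels_ == c][0] for c in range(len(pos))
                         if np.any(km.labels_ == c)])
        Xb = np.vstack([pos, reps])
        yb = np.concatenate([np.ones(len(pos)), np.zeros(len(reps))])
        members.append(SVC(kernel="rbf", C=1.0).fit(Xb, yb))
    return members

def predict_vote(members, X):
    votes = np.mean([m.predict(X) for m in members], axis=0)
    return (votes >= 0.5).astype(int)

# Imbalanced toy data standing in for miRNA-target features.
X, y = make_classification(n_samples=1000, n_features=20, weights=[0.9, 0.1], random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, stratify=y, random_state=0)
ensemble = train_undersampled_svm_ensemble(Xtr, ytr, n_members=5)
print(classification_report(yte, predict_vote(ensemble, Xte), digits=2))
```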

  18. A Cutting Pattern Recognition Method for Shearers Based on Improved Ensemble Empirical Mode Decomposition and a Probabilistic Neural Network

    PubMed Central

    Xu, Jing; Wang, Zhongbin; Tan, Chao; Si, Lei; Liu, Xinhua

    2015-01-01

    In order to guarantee the stable operation of shearers and promote construction of an automatic coal mining working face, an online cutting pattern recognition method with high accuracy and speed based on Improved Ensemble Empirical Mode Decomposition (IEEMD) and Probabilistic Neural Network (PNN) is proposed. An industrial microphone is installed on the shearer and the cutting sound is collected as the recognition criterion to overcome the disadvantages of giant size, contact measurement and low identification rate of traditional detectors. To avoid end-point effects and get rid of undesirable intrinsic mode function (IMF) components in the initial signal, IEEMD is conducted on the sound. The end-point continuation based on the practical storage data is performed first to overcome the end-point effect. Next, the average correlation coefficient, which is calculated by the correlation of the first IMF with others, is introduced to select essential IMFs. Then, the energy and standard deviation of the remaining IMFs are extracted as features and PNN is applied to classify the cutting patterns. Finally, a simulation example, with an accuracy of 92.67%, and an industrial application prove the efficiency and correctness of the proposed method. PMID:26528985

  19. A Cutting Pattern Recognition Method for Shearers Based on Improved Ensemble Empirical Mode Decomposition and a Probabilistic Neural Network.

    PubMed

    Xu, Jing; Wang, Zhongbin; Tan, Chao; Si, Lei; Liu, Xinhua

    2015-10-30

    In order to guarantee the stable operation of shearers and promote construction of an automatic coal mining working face, an online cutting pattern recognition method with high accuracy and speed based on Improved Ensemble Empirical Mode Decomposition (IEEMD) and Probabilistic Neural Network (PNN) is proposed. An industrial microphone is installed on the shearer and the cutting sound is collected as the recognition criterion to overcome the disadvantages of giant size, contact measurement and low identification rate of traditional detectors. To avoid end-point effects and get rid of undesirable intrinsic mode function (IMF) components in the initial signal, IEEMD is conducted on the sound. The end-point continuation based on the practical storage data is performed first to overcome the end-point effect. Next, the average correlation coefficient, which is calculated by the correlation of the first IMF with others, is introduced to select essential IMFs. Then the energy and standard deviation of the remaining IMFs are extracted as features and PNN is applied to classify the cutting patterns. Finally, a simulation example, with an accuracy of 92.67%, and an industrial application prove the efficiency and correctness of the proposed method.

  20. A Compound fault diagnosis for rolling bearings method based on blind source separation and ensemble empirical mode decomposition.

    PubMed

    Wang, Huaqing; Li, Ruitong; Tang, Gang; Yuan, Hongfang; Zhao, Qingliang; Cao, Xi

    2014-01-01

    A compound fault signal usually contains multiple characteristic signals and strong confounding noise, which makes it difficult to separate weak fault signals using conventional approaches such as FFT-based envelope detection, wavelet transform or empirical mode decomposition alone. In order to improve the diagnosis of compound faults in rolling bearings via signal separation, the present paper proposes a new method to identify compound faults from measured mixed signals, based on the ensemble empirical mode decomposition (EEMD) method and the independent component analysis (ICA) technique. In this approach, a vibration signal is first decomposed into intrinsic mode functions (IMFs) by EEMD to obtain multichannel signals. Then, according to a cross-correlation criterion, the corresponding IMFs are selected as the input matrix for ICA. Finally, the compound faults can be separated effectively by applying ICA, which makes the fault features easier to extract and identify. Experimental results validate the effectiveness of the proposed method in compound fault separation, which works not only for the outer race defect, but also for the roller defect and the unbalance fault of the experimental system.
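
    A minimal sketch of the EEMD-then-ICA idea follows, assuming the PyEMD package for EEMD and scikit-learn's FastICA as the ICA implementation; the selection of IMFs by correlation with the raw signal is a simplification of the paper's cross-correlation criterion, and all names are illustrative.

```python
import numpy as np
from PyEMD import EEMD                       # assumes the PyEMD package
from sklearn.decomposition import FastICA

def separate_compound_faults(signal, n_sources=2):
    """Decompose a single-channel vibration signal into IMFs, keep the IMFs most
    correlated with the raw signal as a pseudo-multichannel input, and unmix
    them with FastICA to obtain candidate fault components."""
    imfs = EEMD(trials=100).eemd(signal)                          # (n_imfs, n_samples)
    corr = [abs(np.corrcoef(imf, signal)[0, 1]) for imf in imfs]
    selected = imfs[np.argsort(corr)[::-1][:max(n_sources, 2)]]   # top-correlated IMFs

    ica = FastICA(n_components=n_sources, random_state=0)
    sources = ica.fit_transform(selected.T)                       # (n_samples, n_sources)
    return sources.T                                              # separated components
```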

  1. Comparison of future and base precipitation anomalies by SimCLIM statistical projection through ensemble approach in Pakistan

    NASA Astrophysics Data System (ADS)

    Amin, Asad; Nasim, Wajid; Mubeen, Muhammad; Kazmi, Dildar Hussain; Lin, Zhaohui; Wahid, Abdul; Sultana, Syeda Refat; Gibbs, Jim; Fahad, Shah

    2017-09-01

    Unpredictable precipitation trends are largely influenced by climate change, which has prolonged droughts and floods in South Asia. Statistical analysis of monthly, seasonal, and annual precipitation trends was carried out for different temporal (1996-2015 and 2041-2060) and spatial scales (39 meteorological stations) in Pakistan. The statistical downscaling model SimCLIM was used for future precipitation projection (2041-2060) and the results were analyzed statistically. An ensemble approach combined with representative concentration pathways (RCPs) at the medium level was used for future projections. The magnitude and slope of trends were derived by applying the Mann-Kendall and Sen's slope statistical approaches. Geo-statistical applications were used to generate precipitation trend maps. Base and projected precipitation were compared through statistical analysis and represented by maps and graphical visualization, which facilitates trend detection. Results of this study show that the precipitation trend was increasing at more than 70% of weather stations for February, March, April, August, and September in the base years. The precipitation trend decreased from February to April but increased from July to October in the projected years. The strongest decreasing trend was reported in January for the base years and also decreased in the projected years. Greater variation in precipitation trends between projected and base years was reported from February to April. Variations in the projected precipitation trend for Punjab and Baluchistan were most pronounced in March and April. Seasonal analysis shows large variation in winter, with an increasing trend at more than 30% of weather stations, approaching 40% for projected precipitation. High risk was reported in the base-year pre-monsoon season, where 90% of weather stations showed an increasing trend, but in the projected years this proportion decreased to 33%. Finally, the annual precipitation trend has increased for more than 90% of meteorological stations in the base years (1996-2015), which
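
    The trend statistics named above are standard and can be computed as in the following sketch (Mann-Kendall S statistic and Sen's slope, without the tie correction); this is a generic illustration, not the study's SimCLIM or geo-statistical workflow.

```python
import numpy as np
from scipy.stats import norm

def mann_kendall_sen(x):
    """Mann-Kendall trend test (no tie correction) and Sen's slope for a 1-D series."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    # S statistic: sum of the signs of all pairwise differences
    s = sum(np.sign(x[j] - x[i]) for i in range(n - 1) for j in range(i + 1, n))
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    z = (s - np.sign(s)) / np.sqrt(var_s) if s != 0 else 0.0   # continuity-corrected
    p_value = 2 * (1 - norm.cdf(abs(z)))
    # Sen's slope: median of all pairwise slopes
    slopes = [(x[j] - x[i]) / (j - i) for i in range(n - 1) for j in range(i + 1, n)]
    return s, z, p_value, np.median(slopes)
```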

  2. GA(M)E-QSAR: a novel, fully automatic genetic-algorithm-(meta)-ensembles approach for binary classification in ligand-based drug design.

    PubMed

    Pérez-Castillo, Yunierkis; Lazar, Cosmin; Taminau, Jonatan; Froeyen, Mathy; Cabrera-Pérez, Miguel Ángel; Nowé, Ann

    2012-09-24

    Computer-aided drug design has become an important component of the drug discovery process. Despite the advances in this field, there is not a unique modeling approach that can be successfully applied to solve the whole range of problems faced during QSAR modeling. Feature selection and ensemble modeling are active areas of research in ligand-based drug design. Here we introduce the GA(M)E-QSAR algorithm that combines the search and optimization capabilities of Genetic Algorithms with the simplicity of the Adaboost ensemble-based classification algorithm to solve binary classification problems. We also explore the usefulness of Meta-Ensembles trained with Adaboost and Voting schemes to further improve the accuracy, generalization, and robustness of the optimal Adaboost Single Ensemble derived from the Genetic Algorithm optimization. We evaluated the performance of our algorithm using five data sets from the literature and found that it is capable of yielding similar or better classification results to what has been reported for these data sets with a higher enrichment of active compounds relative to the whole actives subset when only the most active chemicals are considered. More important, we compared our methodology with state of the art feature selection and classification approaches and found that it can provide highly accurate, robust, and generalizable models. In the case of the Adaboost Ensembles derived from the Genetic Algorithm search, the final models are quite simple since they consist of a weighted sum of the output of single feature classifiers. Furthermore, the Adaboost scores can be used as ranking criterion to prioritize chemicals for synthesis and biological evaluation after virtual screening experiments.

  3. Evaluating Model Performance of an Ensemble-based Chemical Data Assimilation System During INTEX-B Field Mission

    NASA Technical Reports Server (NTRS)

    Arellano, A. F., Jr.; Raeder, K.; Anderson, J. L.; Hess, P. G.; Emmons, L. K.; Edwards, D. P.; Pfister, G. G.; Campos, T. L.; Sachse, G. W.

    2007-01-01

    We present a global chemical data assimilation system using a global atmosphere model, the Community Atmosphere Model (CAM3) with simplified chemistry and the Data Assimilation Research Testbed (DART) assimilation package. DART is a community software facility for assimilation studies using the ensemble Kalman filter approach. Here, we apply the assimilation system to constrain global tropospheric carbon monoxide (CO) by assimilating meteorological observations of temperature and horizontal wind velocity and satellite CO retrievals from the Measurement of Pollution in the Troposphere (MOPITT) satellite instrument. We verify the system performance using independent CO observations taken on board the NSF/NCAR C-130 and NASA DC-8 aircraft during the April 2006 part of the Intercontinental Chemical Transport Experiment (INTEX-B). Our evaluations show that MOPITT data assimilation provides significant improvements in terms of capturing the observed CO variability relative to no MOPITT assimilation (i.e. the correlation improves from 0.62 to 0.71, significant at 99% confidence). The assimilation provides evidence of median CO loading of about 150 ppbv at 700 hPa over the NE Pacific during April 2006. This is marginally higher than the modeled CO with no MOPITT assimilation (~140 ppbv). Our ensemble-based estimates of model uncertainty also show model overprediction over the source region (i.e. China) and underprediction over the NE Pacific, suggesting model errors that cannot be readily explained by emissions alone. These results have important implications for improving regional chemical forecasts and for inverse modeling of CO sources and further demonstrate the utility of the assimilation system in comparing non-coincident measurements, e.g. comparing satellite retrievals of CO with in-situ aircraft measurements. The work described above also brought to light several shortcomings of the data assimilation approach for CO profiles. Because of the limited vertical

  4. Evaluating Model Performance of an Ensemble-based Chemical Data Assimilation System During INTEX-B Field Mission

    NASA Technical Reports Server (NTRS)

    Arellano, A. F., Jr.; Raeder, K.; Anderson, J. L.; Hess, P. G.; Emmons, L. K.; Edwards, D. P.; Pfister, G. G.; Campos, T. L.; Sachse, G. W.

    2007-01-01

    We present a global chemical data assimilation system using a global atmosphere model, the Community Atmosphere Model (CAM3) with simplified chemistry and the Data Assimilation Research Testbed (DART) assimilation package. DART is a community software facility for assimilation studies using the ensemble Kalman filter approach. Here, we apply the assimilation system to constrain global tropospheric carbon monoxide (CO) by assimilating meteorological observations of temperature and horizontal wind velocity and satellite CO retrievals from the Measurement of Pollution in the Troposphere (MOPITT) satellite instrument. We verify the system performance using independent CO observations taken on board the NSF/NCAR C-130 and NASA DC-8 aircraft during the April 2006 part of the Intercontinental Chemical Transport Experiment (INTEX-B). Our evaluations show that MOPITT data assimilation provides significant improvements in terms of capturing the observed CO variability relative to no MOPITT assimilation (i.e. the correlation improves from 0.62 to 0.71, significant at 99% confidence). The assimilation provides evidence of median CO loading of about 150 ppbv at 700 hPa over the NE Pacific during April 2006. This is marginally higher than the modeled CO with no MOPITT assimilation (~140 ppbv). Our ensemble-based estimates of model uncertainty also show model overprediction over the source region (i.e. China) and underprediction over the NE Pacific, suggesting model errors that cannot be readily explained by emissions alone. These results have important implications for improving regional chemical forecasts and for inverse modeling of CO sources and further demonstrate the utility of the assimilation system in comparing non-coincident measurements, e.g. comparing satellite retrievals of CO with in-situ aircraft measurements. The work described above also brought to light several shortcomings of the data assimilation approach for CO profiles. Because of the limited vertical

  5. Exploring ensemble visualization

    NASA Astrophysics Data System (ADS)

    Phadke, Madhura N.; Pinto, Lifford; Alabi, Oluwafemi; Harter, Jonathan; Taylor, Russell M., II; Wu, Xunlei; Petersen, Hannah; Bass, Steffen A.; Healey, Christopher G.

    2012-01-01

    An ensemble is a collection of related datasets. Each dataset, or member, of an ensemble is normally large, multidimensional, and spatio-temporal. Ensembles are used extensively by scientists and mathematicians, for example, by executing a simulation repeatedly with slightly different input parameters and saving the results in an ensemble to see how parameter choices affect the simulation. To draw inferences from an ensemble, scientists need to compare data both within and between ensemble members. We propose two techniques to support ensemble exploration and comparison: a pairwise sequential animation method that visualizes locally neighboring members simultaneously, and a screen door tinting method that visualizes subsets of members using screen space subdivision. We demonstrate the capabilities of both techniques, first using synthetic data, then with simulation data of heavy ion collisions in high-energy physics. Results show that both techniques are capable of supporting meaningful comparisons of ensemble data.

  6. Exploring Ensemble Visualization

    PubMed Central

    Phadke, Madhura N.; Pinto, Lifford; Alabi, Femi; Harter, Jonathan; Taylor, Russell M.; Wu, Xunlei; Petersen, Hannah; Bass, Steffen A.; Healey, Christopher G.

    2012-01-01

    An ensemble is a collection of related datasets. Each dataset, or member, of an ensemble is normally large, multidimensional, and spatio-temporal. Ensembles are used extensively by scientists and mathematicians, for example, by executing a simulation repeatedly with slightly different input parameters and saving the results in an ensemble to see how parameter choices affect the simulation. To draw inferences from an ensemble, scientists need to compare data both within and between ensemble members. We propose two techniques to support ensemble exploration and comparison: a pairwise sequential animation method that visualizes locally neighboring members simultaneously, and a screen door tinting method that visualizes subsets of members using screen space subdivision. We demonstrate the capabilities of both techniques, first using synthetic data, then with simulation data of heavy ion collisions in high-energy physics. Results show that both techniques are capable of supporting meaningful comparisons of ensemble data. PMID:22347540

  7. Adaptive correction of ensemble forecasts

    NASA Astrophysics Data System (ADS)

    Pelosi, Anna; Battista Chirico, Giovanni; Van den Bergh, Joris; Vannitsem, Stephane

    2017-04-01

    Forecasts from numerical weather prediction (NWP) models often suffer from both systematic and non-systematic errors. These are present in both deterministic and ensemble forecasts, and originate from various sources such as model error and subgrid variability. Statistical post-processing techniques can partly remove such errors, which is particularly important when NWP outputs concerning surface weather variables are employed for site-specific applications. Many different post-processing techniques have been developed. For deterministic forecasts, adaptive methods such as the Kalman filter are often used, which sequentially post-process the forecasts by continuously updating the correction parameters as new ground observations become available. These methods are especially valuable when long training data sets do not exist. For ensemble forecasts, well-known techniques are ensemble model output statistics (EMOS), and so-called "member-by-member" approaches (MBM). Here, we introduce a new adaptive post-processing technique for ensemble predictions. The proposed method is a sequential Kalman filtering technique that fully exploits the information content of the ensemble. One correction equation is retrieved and applied to all members; however, the parameters of the regression equations are retrieved by exploiting the second-order statistics of the forecast ensemble. We compare our new method with two other techniques: a simple method that makes use of a running bias correction of the ensemble mean, and an MBM post-processing approach that rescales the ensemble mean and spread, based on minimization of the Continuous Ranked Probability Score (CRPS). We perform a verification study for the region of Campania in southern Italy. We use two years (2014-2015) of daily meteorological observations of 2-meter temperature and 10-meter wind speed from 18 ground-based automatic weather stations distributed across the region, comparing them with the corresponding COSMO
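
    For illustration, a much-simplified sequential bias corrector of the kind used as the running-bias baseline above can be written as a scalar Kalman filter that tracks the bias of the ensemble mean and removes it from every member. This sketch does not exploit the second-order ensemble statistics of the proposed method; all parameter values are hypothetical.

```python
import numpy as np

class AdaptiveBiasCorrector:
    """Minimal sequential (Kalman-filter-like) bias corrector: models the bias of
    the ensemble mean as a random walk and subtracts it from all members."""

    def __init__(self, process_var=0.05, obs_var=1.0):
        self.bias = 0.0          # current bias estimate
        self.var = 1.0           # error variance of the bias estimate
        self.q = process_var     # random-walk (process) variance
        self.r = obs_var         # effective observation-error variance

    def update(self, ensemble_mean_fc, observation):
        """Update the bias estimate when a new ground observation arrives."""
        self.var += self.q                                   # predict step
        innovation = (ensemble_mean_fc - observation) - self.bias
        gain = self.var / (self.var + self.r)
        self.bias += gain * innovation                       # analysis step
        self.var *= (1.0 - gain)

    def correct(self, ensemble_fc):
        """Apply the same correction to all ensemble members."""
        return np.asarray(ensemble_fc) - self.bias
```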

  8. Faults Diagnostics of Railway Axle Bearings Based on IMF’s Confidence Index Algorithm for Ensemble EMD

    PubMed Central

    Yi, Cai; Lin, Jianhui; Zhang, Weihua; Ding, Jianming

    2015-01-01

    As train loads and travel speeds have increased over time, railway axle bearings have become critical elements which require more efficient non-destructive inspection and fault diagnostics methods. This paper presents a novel and adaptive procedure based on ensemble empirical mode decomposition (EEMD) and the Hilbert marginal spectrum for multi-fault diagnostics of axle bearings. EEMD overcomes the restrictive assumptions about the data and the computational effort that often limit the application of conventional signal processing techniques. The outputs of this adaptive approach are the intrinsic mode functions, which are treated with the Hilbert transform in order to obtain the Hilbert instantaneous frequency spectrum and marginal spectrum. However, not all of the IMFs obtained by the decomposition should be included in the Hilbert marginal spectrum. The IMF confidence index algorithm proposed in this paper is fully autonomous, overcoming the major limitation of selection by an experienced user, and allows the development of on-line tools. The effectiveness of the improvement is proven by the successful diagnosis of an axle bearing with a single fault or multiple composite faults, e.g., outer ring fault, cage fault and pin roller fault. PMID:25970256
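
    The Hilbert marginal spectrum mentioned above can be approximated from a set of already-selected IMFs as in the sketch below: instantaneous amplitude is accumulated into frequency bins and summed over time. This is a generic illustration (not the paper's confidence-index selection), with hypothetical bin counts and sampling rate.

```python
import numpy as np
from scipy.signal import hilbert

def marginal_spectrum(imfs, fs, n_bins=256):
    """Approximate Hilbert marginal spectrum from selected IMFs sampled at fs Hz."""
    f_edges = np.linspace(0.0, fs / 2.0, n_bins + 1)
    spectrum = np.zeros(n_bins)
    for imf in np.atleast_2d(imfs):
        analytic = hilbert(imf)
        amp = np.abs(analytic)
        phase = np.unwrap(np.angle(analytic))
        inst_freq = np.diff(phase) * fs / (2.0 * np.pi)        # Hz, length N-1
        idx = np.clip(np.digitize(inst_freq, f_edges) - 1, 0, n_bins - 1)
        np.add.at(spectrum, idx, amp[:-1])                     # accumulate amplitude
    return 0.5 * (f_edges[:-1] + f_edges[1:]), spectrum        # bin centres, spectrum
```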

  9. Risk assessment of agricultural water requirement based on a multi-model ensemble framework, southwest of Iran

    NASA Astrophysics Data System (ADS)

    Zamani, Reza; Akhond-Ali, Ali-Mohammad; Roozbahani, Abbas; Fattahi, Rouhollah

    2016-06-01

    Water shortage and climate change are the most important issues of sustainable agricultural and water resources development. Given the importance of water availability in crop production, the present study focused on risk assessment of climate change impact on agricultural water requirement in the southwest of Iran, under two emission scenarios (A2 and B1) for the future period (2025-2054). A multi-model ensemble framework based on the mean observed temperature-precipitation (MOTP) method and a combined probabilistic approach of the Long Ashton Research Station Weather Generator (LARS-WG) and change factor (CF) have been used for downscaling to manage the uncertainty of outputs of 14 general circulation models (GCMs). The results showed an increasing temperature in all months and irregular changes of precipitation (either increasing or decreasing) in the future period. In addition, the results of the calculated annual net water requirement for all crops affected by climate change indicated an increase between 4 and 10 %. Furthermore, an increasing trend is also expected in the required water demand volume. The largest and smallest expected increases in the water demand volume are about 13% and 5% for the A2 and B1 scenarios, respectively. Considering the results and the limited water resources in the study area, it is crucial to provide water resources planning in order to reduce the negative effects of climate change. Therefore, adaptation scenarios for climate change related to crop patterns and water consumption should be taken into account.

  10. Modification of input datasets for the Ensemble Streamflow Prediction based on large-scale climatic indices and weather generator

    NASA Astrophysics Data System (ADS)

    Šípek, Václav; Daňhelka, Jan

    2015-09-01

    Ensemble Streamflow Prediction (ESP) provides an efficient tool for seasonal hydrological forecasts. In this study, we propose a new modification of the input data series for the ESP system used for runoff volume prediction with a lead time of one month. These series are not represented by short historical weather datasets but by longer generated synthetic weather data series. Before their submission to the hydrological model, their number is restricted by relations among observed meteorological variables (average monthly precipitation and temperature) and large-scale climatic patterns and indices (e.g. North Atlantic Oscillation, sea level pressure values and two geopotential heights). This modification was tested over a four-year testing period using a river basin in central Europe. The LARS-WG weather generator proved to be a suitable tool for the extension of the historical weather records. The modified ESP approach proved to be more efficient in the majority of months compared both to the original ESP method and the reference forecast (based on the probability distribution of historical discharges). The improvement over traditional ESP was most obvious in the narrower forecast interval of the expected runoff volume. The inefficient forecasts of the modified ESP scheme (compared to traditional ESP) were conditioned by an insufficient restriction of the input synthetic weather datasets by the climate forecast.

  11. Risk assessment of agricultural water requirement based on a multi-model ensemble framework, southwest of Iran

    NASA Astrophysics Data System (ADS)

    Zamani, Reza; Akhond-Ali, Ali-Mohammad; Roozbahani, Abbas; Fattahi, Rouhollah

    2017-08-01

    Water shortage and climate change are the most important issues of sustainable agricultural and water resources development. Given the importance of water availability in crop production, the present study focused on risk assessment of climate change impact on agricultural water requirement in the southwest of Iran, under two emission scenarios (A2 and B1) for the future period (2025-2054). A multi-model ensemble framework based on the mean observed temperature-precipitation (MOTP) method and a combined probabilistic approach of the Long Ashton Research Station Weather Generator (LARS-WG) and change factor (CF) have been used for downscaling to manage the uncertainty of outputs of 14 general circulation models (GCMs). The results showed an increasing temperature in all months and irregular changes of precipitation (either increasing or decreasing) in the future period. In addition, the results of the calculated annual net water requirement for all crops affected by climate change indicated an increase between 4 and 10 %. Furthermore, an increasing trend is also expected in the required water demand volume. The largest and smallest expected increases in the water demand volume are about 13% and 5% for the A2 and B1 scenarios, respectively. Considering the results and the limited water resources in the study area, it is crucial to provide water resources planning in order to reduce the negative effects of climate change. Therefore, adaptation scenarios for climate change related to crop patterns and water consumption should be taken into account.

  12. Identifying outliers of non-Gaussian groundwater state data based on ensemble estimation for long-term trends

    NASA Astrophysics Data System (ADS)

    Jeong, Jina; Park, Eungyu; Han, Weon Shik; Kim, Kueyoung; Choung, Sungwook; Chung, Il Moon

    2017-05-01

    A hydrogeological dataset often includes substantial deviations that need to be inspected. In the present study, three outlier identification methods - the three sigma rule (3σ), interquartile range (IQR), and median absolute deviation (MAD) - that take advantage of the ensemble regression method are proposed by considering the non-Gaussian characteristics of groundwater data. For validation purposes, the performance of the methods is compared using simulated and actual groundwater data under a few hypothetical conditions. In the validations using simulated data, all of the proposed methods reasonably identify outliers at a 5% outlier level; whereas, only the IQR method performs well for identifying outliers at a 30% outlier level. When applying the methods to real groundwater data, the outlier identification performance of the IQR method is found to be superior to the other two methods. However, the IQR method shows a limitation by identifying excessive false outliers, which may be overcome by its joint application with other methods (for example, the 3σ rule and MAD methods). The proposed methods can also be applied as potential tools for the detection of future anomalies by model training based on currently available data.
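
    The three rules named above are standard and can be applied to residuals from an ensemble-estimated long-term trend, as in the following sketch; the trend input and thresholds are hypothetical and this is not the study's implementation.

```python
import numpy as np

def flag_outliers(values, trend, method="iqr"):
    """Flag groundwater observations deviating from an ensemble-estimated trend.
    `trend` is the estimated long-term trend at each observation time; the three
    rules (3-sigma, IQR, MAD) are applied to the residuals."""
    resid = np.asarray(values, dtype=float) - np.asarray(trend, dtype=float)
    if method == "3sigma":
        return np.abs(resid - resid.mean()) > 3.0 * resid.std()
    if method == "iqr":
        q1, q3 = np.percentile(resid, [25, 75])
        iqr = q3 - q1
        return (resid < q1 - 1.5 * iqr) | (resid > q3 + 1.5 * iqr)
    if method == "mad":
        med = np.median(resid)
        mad = np.median(np.abs(resid - med))
        return np.abs(resid - med) > 3.0 * 1.4826 * mad   # 1.4826: Gaussian consistency
    raise ValueError("method must be '3sigma', 'iqr' or 'mad'")
```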

  13. Identifying Outliers of Non-Gaussian Groundwater State Data Based on Ensemble Estimation for Long-Term Trends

    NASA Astrophysics Data System (ADS)

    Park, E.; Jeong, J.; Choi, J.; Han, W. S.; Yun, S. T.

    2016-12-01

    Three modified outlier identification methods: the three sigma rule (3σ), interquartile range (IQR) and median absolute deviation (MAD), which take advantage of the ensemble regression method, are proposed. For validation purposes, the performance of the methods is compared using simulated and actual groundwater data under a few hypothetical conditions. In the validations using simulated data, all of the proposed methods reasonably identify outliers at a 5% outlier level; whereas, only the IQR method performs well for identifying outliers at a 30% outlier level. When applying the methods to real groundwater data, the outlier identification performance of the IQR method is found to be superior to the other two methods. However, the IQR method is found to have a limitation in falsely identifying excessive outliers, which may be addressed by joint application with the other methods (i.e., the 3σ rule and MAD methods). The proposed methods can also be applied as a potential tool for future anomaly detection by model training based on currently available data.

  14. Automatic categorization of anatomical landmark-local appearances based on diffeomorphic demons and spectral clustering for constructing detector ensembles.

    PubMed

    Hanaoka, Shouhei; Masutani, Yoshitaka; Nemoto, Mitsutaka; Nomura, Yukihiro; Yoshikawa, Takeharu; Hayashi, Naoto; Ohtomo, Kuni

    2012-01-01

    A method for categorizing landmark-local appearances extracted from computed tomography (CT) datasets is presented. Anatomical landmarks in the human body inevitably have inter-individual variations that cause difficulty in automatic landmark detection processes. The goal of this study is to categorize subjects (i.e., training datasets) according to local shape variations of such a landmark so that each subgroup has less shape variation and thus the machine learning of each landmark detector is much easier. The similarity between each subject pair is measured based on the non-rigid registration result between them. These similarities are used by the spectral clustering process. After the clustering, all training datasets in each cluster, as well as synthesized intermediate images calculated from all subject-pairs in the cluster, are used to train the corresponding subgroup detector. All of these trained detectors compose a detector ensemble to detect the target landmark. Evaluation with clinical CT datasets showed great improvement in the detection performance.
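
    The clustering stage described above, grouping subjects from a precomputed pairwise similarity matrix, can be sketched with scikit-learn's spectral clustering; the similarity matrix, cluster count and variable names are hypothetical, and the registration and detector-training steps are not shown.

```python
import numpy as np
from sklearn.cluster import SpectralClustering

def cluster_subjects(similarity, n_clusters=3):
    """Group training subjects by landmark-local appearance using a precomputed
    pairwise similarity matrix (e.g., derived from non-rigid registration)."""
    sc = SpectralClustering(n_clusters=n_clusters, affinity="precomputed",
                            random_state=0)
    labels = sc.fit_predict(np.asarray(similarity))
    return labels   # cluster index per subject; one detector is trained per cluster
```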

  15. BgN-Score and BsN-Score: bagging and boosting based ensemble neural networks scoring functions for accurate binding affinity prediction of protein-ligand complexes.

    PubMed

    Ashtawy, Hossam M; Mahapatra, Nihar R

    2015-01-01

    Accurately predicting the binding affinities of large sets of protein-ligand complexes is a key challenge in computational biomolecular science, with applications in drug discovery, chemical biology, and structural biology. Since a scoring function (SF) is used to score, rank, and identify drug leads, the fidelity with which it predicts the affinity of a ligand candidate for a protein's binding site has a significant bearing on the accuracy of virtual screening. Despite intense efforts in developing conventional SFs, which are either force-field based, knowledge-based, or empirical, their limited predictive power has been a major roadblock toward cost-effective drug discovery. Therefore, in this work, we present novel SFs employing a large ensemble of neural networks (NN) in conjunction with a diverse set of physicochemical and geometrical features characterizing protein-ligand complexes to predict binding affinity. We assess the scoring accuracies of two new ensemble NN SFs based on bagging (BgN-Score) and boosting (BsN-Score), as well as those of conventional SFs in the context of the 2007 PDBbind benchmark that encompasses a diverse set of high-quality protein families. We find that BgN-Score and BsN-Score have more than 25% better Pearson's correlation coefficient (0.804 and 0.816 vs. 0.644) between predicted and measured binding affinities compared to that achieved by a state-of-the-art conventional SF. In addition, these ensemble NN SFs are also at least 19% more accurate (0.804 and 0.816 vs. 0.675) than SFs based on a single neural network that has been traditionally used in drug discovery applications. We further find that ensemble models based on NNs surpass SFs based on the decision-tree ensemble technique Random Forests. Ensemble neural network SFs, BgN-Score and BsN-Score, are the most accurate in predicting binding affinity of protein-ligand complexes among the considered SFs. Moreover, their accuracies are even higher when they are used to predict
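
    The bagging idea behind BgN-Score can be illustrated with a small hand-rolled ensemble of neural network regressors, each trained on a bootstrap sample and averaged at prediction time. This is a generic sketch (not the published model); the feature extraction from protein-ligand complexes is assumed to have produced NumPy arrays X (descriptors) and y (measured affinities), and the network size is arbitrary.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def fit_bagged_nn(X, y, n_estimators=50, rng=None):
    """Bagging-style ensemble of small neural networks for affinity regression:
    each network is trained on a bootstrap sample; predictions are averaged."""
    rng = np.random.default_rng(0) if rng is None else rng
    n = len(y)
    models = []
    for _ in range(n_estimators):
        idx = rng.integers(0, n, size=n)                      # bootstrap sample
        net = MLPRegressor(hidden_layer_sizes=(64,), max_iter=2000, random_state=0)
        models.append(net.fit(X[idx], y[idx]))
    return lambda X_new: np.mean([m.predict(X_new) for m in models], axis=0)
```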

  16. Definition of Ensemble Error Statistics for Optimal Ensemble Data Assimilation

    NASA Astrophysics Data System (ADS)

    Frehlich, R.

    2009-09-01

    Next generation data assimilation methods must include the state-dependent observation errors, i.e., the spatial and temporal variations produced by the atmospheric turbulent field. A rigorous analysis of optimal data assimilation algorithms and ensemble forecast systems requires a definition of model "truth" or perfect measurement, which then defines the total observation error and forecast error. Truth is defined as the spatial average of the continuous atmospheric state variables centered on the model grid locations. To be consistent with the climatology of turbulence, the spatial average is chosen as the effective spatial filter of the numerical model. The observation errors then consist of two independent components: an instrument error and an observation sampling error which describes the mismatch of the spatial average of the observation and the spatial average of the perfect measurement or "truth". The observation sampling error is related to the "error of representativeness" but is defined only in terms of the local statistics of the atmosphere and the sampling pattern of the observation. Optimal data assimilation requires an estimate of the local background error correlation as well as the local observation error correlation. Both of these local correlations can be estimated from ensemble assimilation techniques where each member of the ensemble is produced by generating and assimilating random observations consistent with the estimates of the local sampling errors based on estimates of the local turbulent statistics. A rigorous evaluation of these optimal ensemble data assimilation techniques requires a definition of the ensemble members and the ensemble average that describes the error correlations. A new formulation is presented that is consistent with the climatology of atmospheric turbulence, and the implications of this formulation for ensemble forecast systems are discussed.

  17. World Music Ensemble: Kulintang

    ERIC Educational Resources Information Center

    Beegle, Amy C.

    2012-01-01

    As instrumental world music ensembles such as steel pan, mariachi, gamelan and West African drums are becoming more the norm than the exception in North American school music programs, there are other world music ensembles just starting to gain popularity in particular parts of the United States. The kulintang ensemble, a drum and gong ensemble…

  18. World Music Ensemble: Kulintang

    ERIC Educational Resources Information Center

    Beegle, Amy C.

    2012-01-01

    As instrumental world music ensembles such as steel pan, mariachi, gamelan and West African drums are becoming more the norm than the exception in North American school music programs, there are other world music ensembles just starting to gain popularity in particular parts of the United States. The kulintang ensemble, a drum and gong ensemble…

  19. Complex catalysts from self-repairing ensembles to highly reactive air-based oxidation systems

    Treesearch

    Craig L. Hill; Laurent Delannoy; Dean C. Duncan; Ira A. Weinstock; Roman F. Renneke; Richard S. Reiner; Rajai H. Atalla; Jong Woo Han; Daniel A. Hillesheim; Rui Cao; Travis M. Anderson; Nelya M. Okun; Djamaladdin G. Musaev; Yurii V. Geletii

    2007-01-01

    Progress in four interrelated catalysis research efforts in our laboratory is summarized: (1) catalytic photochemical functionalization of unactivated C-H bonds by polyoxometalates (POMs); (2) self-repairing catalysts; (3) catalysts for air-based oxidations under ambient conditions; and (4) terminal oxo complexes of the late-transition metal elements and their...

  20. An Integrated Ensemble-Based Operational Framework to Predict Urban Flooding: A Case Study of Hurricane Sandy in the Passaic and Hackensack River Basins

    NASA Astrophysics Data System (ADS)

    Saleh, F.; Ramaswamy, V.; Georgas, N.; Blumberg, A. F.; Wang, Y.

    2016-12-01

    Advances in computational resources and modeling techniques are opening the path to effectively integrate existing complex models. In the context of flood prediction, recent extreme events have demonstrated the importance of integrating components of the hydrosystem to better represent the interactions amongst different physical processes and phenomena. As such, there is a pressing need to develop holistic and cross-disciplinary modeling frameworks that effectively integrate existing models and better represent the operative dynamics. This work presents a novel Hydrologic-Hydraulic-Hydrodynamic Ensemble (H3E) flood prediction framework that operationally integrates existing predictive models representing coastal (New York Harbor Observing and Prediction System, NYHOPS), hydrologic (US Army Corps of Engineers Hydrologic Modeling System, HEC-HMS) and hydraulic (2-dimensional River Analysis System, HEC-RAS) components. The state-of-the-art framework is forced with 125 ensemble meteorological inputs from numerical weather prediction models including the Global Ensemble Forecast System, the European Centre for Medium-Range Weather Forecasts (ECMWF), the Canadian Meteorological Centre (CMC), the Short Range Ensemble Forecast (SREF) and the North American Mesoscale Forecast System (NAM). The framework produces, within a 96-hour forecast horizon, on-the-fly Google Earth flood maps that provide critical information for decision makers and emergency preparedness managers. The utility of the framework was demonstrated by retrospectively forecasting an extreme flood event, Hurricane Sandy, in the Passaic and Hackensack watersheds (New Jersey, USA). Hurricane Sandy caused significant damage to a number of critical facilities in this area, including New Jersey Transit's main storage and maintenance facility. The results of this work demonstrate that ensemble-based frameworks provide improved flood predictions and useful information about associated uncertainties, thus

  1. A new ensemble-based consistency test for the Community Earth System Model (pyCECT v1.0)

    NASA Astrophysics Data System (ADS)

    Baker, A. H.; Hammerling, D. M.; Levy, M. N.; Xu, H.; Dennis, J. M.; Eaton, B. E.; Edwards, J.; Hannay, C.; Mickelson, S. A.; Neale, R. B.; Nychka, D.; Shollenberger, J.; Tribbia, J.; Vertenstein, M.; Williamson, D.

    2015-09-01

    Climate simulation codes, such as the Community Earth System Model (CESM), are especially complex and continually evolving. Their ongoing state of development requires frequent software verification in the form of quality assurance to both preserve the quality of the code and instill model confidence. To formalize and simplify this previously subjective and computationally expensive aspect of the verification process, we have developed a new tool for evaluating climate consistency. Because an ensemble of simulations allows us to gauge the natural variability of the model's climate, our new tool uses an ensemble approach for consistency testing. In particular, an ensemble of CESM climate runs is created, from which we obtain a statistical distribution that can be used to determine whether a new climate run is statistically distinguishable from the original ensemble. The CESM ensemble consistency test, referred to as CESM-ECT, is objective in nature and accessible to CESM developers and users. The tool has proven its utility in detecting errors in software and hardware environments and providing rapid feedback to model developers.

  2. Physics Based Protein Structure Refinement through Multiple Molecular Dynamics Trajectories and Structure Averaging

    PubMed Central

    Mirjalili, Vahid; Noyes, Keenan; Feig, Michael

    2014-01-01

    We used molecular dynamics (MD) simulations for structure refinement of CASP10 targets. Refinement was achieved by selecting structures from the MD-based ensembles followed by structural averaging. The overall performance of this method in CASP10 is described and specific aspects are analyzed in detail to provide insight into key components. In particular, the use of different restraint types, sampling from multiple short simulations vs. a single long simulation, the success of a quality assessment criterion, the application of scoring vs. averaging, and the impact of a final refinement step are discussed in detail. PMID:23737254

  3. Girsanov reweighting for path ensembles and Markov state models

    NASA Astrophysics Data System (ADS)

    Donati, L.; Hartmann, C.; Keller, B. G.

    2017-06-01

    The sensitivity of molecular dynamics on changes in the potential energy function plays an important role in understanding the dynamics and function of complex molecules. We present a method to obtain path ensemble averages of a perturbed dynamics from a set of paths generated by a reference dynamics. It is based on the concept of path probability measure and the Girsanov theorem, a result from stochastic analysis to estimate a change of measure of a path ensemble. Since Markov state models (MSMs) of the molecular dynamics can be formulated as a combined phase-space and path ensemble average, the method can be extended to reweight MSMs by combining it with a reweighting of the Boltzmann distribution. We demonstrate how to efficiently implement the Girsanov reweighting in a molecular dynamics simulation program by calculating parts of the reweighting factor "on the fly" during the simulation, and we benchmark the method on test systems ranging from a two-dimensional diffusion process and an artificial many-body system to alanine dipeptide and valine dipeptide in implicit and explicit water. The method can be used to study the sensitivity of molecular dynamics on external perturbations as well as to reweight trajectories generated by enhanced sampling schemes to the original dynamics.
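
    For reference, the change of measure underlying this kind of path reweighting is the general continuous-time Girsanov formula below, written for an SDE with drift b and noise amplitude σ and a drift perturbation Δb; the discretized "on-the-fly" factor used in simulations and any integrator-specific prefactors are not shown, so this should be read as the generic form rather than the paper's exact expression.

```latex
% Girsanov change of measure between a reference path measure P (drift b) and a
% perturbed measure Q (drift b + \Delta b) for dX_t = b(X_t)\,dt + \sigma\,dW_t:
\[
  M_T \;=\; \frac{dQ}{dP}\Big|_{X_{0:T}}
      \;=\; \exp\!\left(
         \int_0^T \sigma^{-1}\,\Delta b(X_t)\, dW_t
         \;-\; \frac{1}{2}\int_0^T \left\| \sigma^{-1}\,\Delta b(X_t) \right\|^2 dt
      \right),
\]
% so that perturbed path-ensemble averages follow from reference trajectories:
\[
  \mathbb{E}_Q\!\left[f(X_{0:T})\right] \;=\; \mathbb{E}_P\!\left[M_T\, f(X_{0:T})\right].
\]
```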

  4. Ensemble Flow Forecasts for Risk Based Reservoir Operations of Lake Mendocino in Mendocino County, California

    NASA Astrophysics Data System (ADS)

    Delaney, C.; Hartman, R. K.; Mendoza, J.; Evans, K. M.; Evett, S.

    2016-12-01

    Forecast informed reservoir operations (FIRO) is a methodology that incorporates short- to mid-range precipitation or flow forecasts to inform the flood operations of reservoirs. Previous research and modeling for flood control reservoirs have shown that FIRO can reduce flood risk and increase water supply for many reservoirs. The risk-based method of FIRO presents a unique approach that incorporates flow forecasts made by NOAA's California-Nevada River Forecast Center (CNRFC) to model and assess the risk of meeting or exceeding identified management targets or thresholds. Forecasted risk is evaluated against set risk tolerances to set reservoir flood releases. A water management model was developed for Lake Mendocino, a 116,500 acre-foot reservoir located near Ukiah, California. Lake Mendocino is a dual-use reservoir, which is owned and operated for flood control by the United States Army Corps of Engineers and is operated by the Sonoma County Water Agency for water supply. Due to recent changes in the operations of an upstream hydroelectric facility, this reservoir has been plagued with water supply reliability issues since 2007. FIRO is applied to Lake Mendocino by simulating daily hydrologic conditions from 1985 to 2010 in the Upper Russian River from Lake Mendocino to the City of Healdsburg approximately 50 miles downstream. The risk-based method is simulated using a 15-day, 61-member streamflow hindcast by the CNRFC. Model simulation results of risk-based flood operations demonstrate a 23% increase in average end-of-water-year (September 30) storage levels over current operations. Model results show no increase in the occurrence of flood damages for points downstream of Lake Mendocino. This investigation demonstrates that FIRO may be a viable flood control operations approach for Lake Mendocino and warrants further investigation through additional modeling and analysis.

  5. Carbon-based ion and molecular channels

    NASA Astrophysics Data System (ADS)

    Sint, Kyaw; Wang, Boyang; Kral, Petr

    2008-03-01

    We design ion and molecular channels based on layered carbonaceous materials, with chemically-functionalized pore entrances. Our molecular dynamics simulations demonstrate that these ultra-narrow pores, with diameters around 1 nm, are highly selective to the charges and sizes of the passing (Na^+ and Cl^-) ions and short alkanes. We demonstrate that the molecular flows through these pores can be easily controlled by electrical and mechanical means. These artificial pores could be integrated in fluidic nanodevices and lab-on-a-chip techniques with numerous potential applications. [1] Kyaw Sint, Boyang Wang and Petr Kral, submitted. [2] Boyang Wang and Petr Kral, JACS 128, 15984 (2006).

  6. A semiautomatic CT-based ensemble segmentation of lung tumors: comparison with oncologists’ delineations and with the surgical specimen

    PubMed Central

    Velazquez, Emmanuel Rios; Aerts, Hugo J. W. L.; Gu, Yuhua; Goldgof, Dmitry B.; De Ruysscher, Dirk; Dekker, Andre; Korn, René; Gillies, Robert J.; Lambin, Philippe

    2013-01-01

    Purpose To assess the clinical relevance of a semiautomatic CT-based ensemble segmentation method, by comparing it to pathology and to CT/PET manual delineations by five independent radiation oncologists in non-small cell lung cancer (NSCLC). Materials and Methods For twenty NSCLC patients (stage Ib – IIIb) the primary tumor was delineated manually on CT/PET scans by five independent radiation oncologists and segmented using a CT based semi-automatic tool. Tumor volume and overlap fractions between manual and semiautomatic-segmented volumes were compared. All measurements were correlated with the maximal diameter on macroscopic examination of the surgical specimen. Imaging data is available on www.cancerdata.org. Results High overlap fractions were observed between the semi-automatically segmented volumes and the intersection (92.5 ± 9.0, mean ± SD) and union (94.2 ± 6.8) of the manual delineations. No statistically significant differences in tumor volume were observed between the semiautomatic segmentation (71.4 ± 83.2 cm3, mean ± SD) and manual delineations (81.9 ± 94.1 cm3; p = 0.57). The maximal tumor diameter of the semiautomatic-segmented tumor correlated strongly with the macroscopic diameter of the primary tumor (r = 0.96). Conclusion Semiautomatic segmentation of the primary tumor on CT demonstrated high agreement with CT/PET manual delineations and strongly correlated with the macroscopic diameter considered the “gold standard”. This method may be used routinely in clinical practice and could be employed as a starting point for treatment planning, target definition in multi-center clinical trials or for high throughput data mining research. This method is particularly suitable for peripherally located tumors. PMID:23157978

  7. Design of a Satellite Observational Operator (SOO) for ensemble-based data assimilation to improve volcanic plume forecasts

    NASA Astrophysics Data System (ADS)

    Fu, Guangliang; Prata, Fred; Lin, Hai Xiang; Heemink, Arnold; Segers, Arjo; Jin, Jianbing; Lu, Sha

    2017-04-01

    Data assimilation (DA) is an efficient way to improve the accuracy of volcanic model forecasts. Infrared satellite measurements of volcanic ash mass loadings are often used as input observations for the assimilation scheme. However, because these primary satellite-retrieved data are often 2D and the ash plume is usually vertically narrow, directly assimilating the 2D ash mass loadings in a 3D volcanic ash model (with an integral observational operator) can introduce large spurious vertical correlations. In this study, we look at an approach that avoids the spurious vertical correlations by not involving the integral operator. (We focus on the case study of the 2010 Eyjafjallajökull volcanic ash plume.) By integrating available data on ash mass loadings and cloud heights with data-based thickness assumptions, a Satellite Observational Operator (SOO) is proposed that translates satellite-retrieved 2D mass loadings into 3D concentrations. The SOO makes the analysis step of the assimilation comparable in the 3D model space. Ensemble-based data assimilation is then used to assimilate the extracted measurements of ash concentrations. The results show that satellite data assimilation with the SOO improves the estimate of the volcanic ash state more than the standard assimilation without the SOO. Comparison with both satellite-retrieved data and aircraft in situ measurements shows that effective volcanic ash forecasts can be obtained after assimilation with the SOO. In addition, this study illustrates an approach to incorporating many available measurements. We expect the SOO can be further improved by incorporating more data, but at present DA with the SOO has shown an advantage over the standard approach (without SOO) in dealing with passive satellite data assimilation.
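
    A minimal sketch of translating a retrieved 2D mass loading into 3D concentrations is given below, assuming a uniform ash layer of fixed thickness just below the retrieved cloud-top height; the array names, the uniform-layer assumption and the thickness value are illustrative, not the study's SOO implementation.

```python
import numpy as np

def mass_loading_to_3d(mass_loading, cloud_top_height, z_levels, plume_thickness=1500.0):
    """Spread a 2-D ash column mass loading (kg m^-2) uniformly over an assumed
    plume thickness below the retrieved cloud-top height, giving 3-D
    concentrations (kg m^-3) on the model levels.

    mass_loading     : (ny, nx) column mass loading
    cloud_top_height : (ny, nx) retrieved cloud-top height (m)
    z_levels         : (nz,) model level heights (m)
    plume_thickness  : assumed vertical extent of the ash layer (m)
    """
    z = np.asarray(z_levels)
    conc = np.zeros((len(z),) + mass_loading.shape)
    layer_bottom = cloud_top_height - plume_thickness
    for k, zk in enumerate(z):
        inside = (zk >= layer_bottom) & (zk <= cloud_top_height)
        conc[k] = np.where(inside, mass_loading / plume_thickness, 0.0)
    return conc   # (nz, ny, nx), comparable with the 3-D model state in the EnKF
```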

  8. An OSSE for a Local Ensemble Transform Kalman Filter - Based Now-casting System of Biwa Lake, Japan

    NASA Astrophysics Data System (ADS)

    Auger, G.; Wells, J. C.

    2016-02-01

    Fresh water bodies provide drinking water for inhabitants living in their vicinity. However, short-lived extreme events, geophysical or anthropogenic, can worsen the water quality. In the Kinki region of Japan, fourteen million people receive drinking water from Biwa Lake. The fact that water treatment plants surround the lake and tropical cyclones hit the region every year makes the mitigation of water-quality-worsening events an important matter. Having real-time information about the three-dimensional circulation of the lake will facilitate the mitigation of extreme events. To obtain such information, we are developing a now-casting system for tracking Biwa Lake's flow, the first in a limnological environment in Japan. We based our system on the coastal ocean simulator SUNTANS and added the LETKF scheme to assimilate available and future data streams. The system generates the ensemble of state vectors using six bred vectors and one unperturbed state vector. We will present an assessment of the performance of the now-casting system during an extreme event. To analyse the performance, we first carried out a fine-scale simulation of the effect of typhoon Man-Yi (September 2013) on Biwa Lake's circulation. We chose this specific event because of the strong wind and biomaterial discharge associated with it. The consistency of the simulation was assessed using in-situ temperature data at six depth levels for the vertical consistency and spaceborne SST for the horizontal consistency. We also used near-infrared satellite data to analyse the propagation of biomaterial after the typhoon. Because the original simulation was consistent with observations, artificial data streams from the simulation are assimilated into the now-casting system. We show the results of the hindcast of typhoon Man-Yi using the now-casting system. We also discuss the presence of instabilities during and after the typhoon that are highlighted by the bred vectors, used in the

  9. Similarity-based multi-model ensemble approach for 1-15-day advance prediction of monsoon rainfall over India

    NASA Astrophysics Data System (ADS)

    Jaiswal, Neeru; Kishtawal, C. M.; Bhomia, Swati

    2017-04-01

    The southwest (SW) monsoon season (June, July, August and September) is the major period of rainfall over the Indian region. The present study focuses on the development of a new multi-model ensemble approach based on the similarity criterion (SMME) for the prediction of SW monsoon rainfall in the extended range. This approach is based on the assumption that training with similar conditions may provide better forecasts than the sequential training used in the conventional MME approaches. In this approach, the training dataset is selected by matching the present-day conditions to the archived dataset; the days with the most similar conditions are identified and used for training the model. The coefficients thus generated were used for the rainfall prediction. The precipitation forecasts from four general circulation models (GCMs), viz. European Centre for Medium-Range Weather Forecasts (ECMWF), United Kingdom Meteorological Office (UKMO), National Centre for Environment Prediction (NCEP) and China Meteorological Administration (CMA) have been used for developing the SMME forecasts. The forecasts of 1-5, 6-10 and 11-15 days were generated using the newly developed approach for each pentad of June-September during the years 2008-2013 and the skill of the model was analysed using verification scores, viz. equitable skill score (ETS), mean absolute error (MAE), Pearson's correlation coefficient and Nash-Sutcliffe model efficiency index. Statistical analysis of SMME forecasts shows superior forecast skill compared to the conventional MME and the individual models for all the pentads, viz. 1-5, 6-10 and 11-15 days.
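
    The similarity-based training step can be sketched as follows: analog days are selected by distance in a predictor space and combination weights are fitted by least squares on those days only. The distance metric, analog count and array names are hypothetical simplifications of the SMME procedure described above.

```python
import numpy as np

def smme_forecast(current_state, hist_states, hist_model_fc, hist_obs, new_model_fc, k=50):
    """Similarity-based multi-model ensemble: pick the k most similar archived days,
    fit combination weights on them, and apply the weights to today's forecasts.

    current_state : (n_features,) descriptors of today's conditions
    hist_states   : (n_days, n_features) the same descriptors for archived days
    hist_model_fc : (n_days, n_models) archived forecasts of the individual GCMs
    hist_obs      : (n_days,) verifying observed rainfall
    new_model_fc  : (n_models,) today's forecasts from the individual GCMs
    """
    dist = np.linalg.norm(hist_states - current_state, axis=1)
    analog = np.argsort(dist)[:k]                              # most similar days
    A = np.column_stack([hist_model_fc[analog], np.ones(k)])   # add an intercept
    weights, *_ = np.linalg.lstsq(A, hist_obs[analog], rcond=None)
    return np.append(new_model_fc, 1.0) @ weights              # combined forecast
```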

  10. A general framework for multivariate multi-index drought prediction based on Multivariate Ensemble Streamflow Prediction (MESP)

    NASA Astrophysics Data System (ADS)

    Hao, Zengchao; Hao, Fanghua; Singh, Vijay P.

    2016-08-01

    Drought is among the costliest natural hazards worldwide and extreme drought events in recent years have caused huge losses to various sectors. Drought prediction is therefore critically important for providing early warning information to aid decision making to cope with drought. Due to the complicated nature of drought, it has been recognized that the univariate drought indicator may not be sufficient for drought characterization and hence multivariate drought indices have been developed for drought monitoring. Alongside the substantial effort in drought monitoring with multivariate drought indices, it is of equal importance to develop a drought prediction method with multivariate drought indices to integrate drought information from various sources. This study proposes a general framework for multivariate multi-index drought prediction that is capable of integrating complementary prediction skills from multiple drought indices. The Multivariate Ensemble Streamflow Prediction (MESP) is employed to sample from historical records for obtaining statistical prediction of multiple variables, which is then used as inputs to achieve multivariate prediction. The framework is illustrated with a linearly combined drought index (LDI), which is a commonly used multivariate drought index, based on climate division data in California and New York in the United States with different seasonality of precipitation. The predictive skill of LDI (represented with persistence) is assessed by comparison with the univariate drought index and results show that the LDI prediction skill is less affected by seasonality than the meteorological drought prediction based on SPI. Prediction results from the case study show that the proposed multivariate drought prediction outperforms the persistence prediction, implying a satisfactory performance of multivariate drought prediction. The proposed method would be useful for drought prediction to integrate drought information from various sources

  11. Projected changes to winter temperature characteristics over Canada based on an RCM ensemble

    NASA Astrophysics Data System (ADS)

    Jeong, Dae Il; Sushama, Laxmi; Diro, Gulilat Tefera; Khaliq, M. Naveed

    2016-09-01

    Cold temperature and associated extremes often impact adversely human health and environment and bring disruptions in economic activities during winter over Canada. This study investigates projected changes in winter (December to March) period cold extreme days (i.e., cold nights, cold days, frost days, and ice days) and cold spells over Canada based on 11 regional climate model (RCM) simulations for the future 2040-2069 period with respect to the current 1970-1999 period. These simulations, available from the North American Regional Climate Change Assessment Program, were obtained with six different RCMs, when driven by four different Atmosphere-Ocean General Circulation Models, under the Special Report on Emissions Scenarios A2 scenario. Based on the reanalysis boundary conditions, the RCM simulations reproduce spatial patterns of observed mean values of the daily minimum and maximum temperatures and inter-annual variability of the number of cold nights over different Canadian climatic regions considered in the study. A comparison of current and future period simulations suggests decreases in the frequency of cold extreme events (i.e., cold nights, cold days and cold spells) and in selected return levels of maximum duration of cold spells over the entire study domain. Important regional differences are noticed as the simulations generally indicate smaller decreases in the characteristics of extreme cold events over western Canada compared to the other regions. The analysis also suggests an increase in the frequency of midwinter freeze-thaw events, due mainly to a decrease in the number of frost days and ice days for all Canadian regions. Especially, densely populated southern and coastal Canadian regions will require in depth studies to facilitate appropriate adaptation strategies as these regions are clearly expected to experience large increases in the frequency of freeze-thaw events.

  12. Ultrathin inorganic molecular nanowire based on polyoxometalates

    PubMed Central

    Zhang, Zhenxin; Murayama, Toru; Sadakane, Masahiro; Ariga, Hiroko; Yasuda, Nobuhiro; Sakaguchi, Norihito; Asakura, Kiyotaka; Ueda, Wataru

    2015-01-01

    The development of metal oxide-based molecular wires is important for fundamental research and potential practical applications. However, examples of these materials are rare. Here we report an all-inorganic transition metal oxide molecular wire prepared by disassembly of larger crystals. The wires are comprised of molybdenum(VI) with either tellurium(IV) or selenium(IV): {(NH4)2[XMo6O21]}n (X=tellurium(IV) or selenium(IV)). The ultrathin molecular nanowires with widths of 1.2 nm grow to micrometre-scale crystals and are characterized by single-crystal X-ray analysis, Rietveld analysis, scanning electron microscopy, X-ray photoelectron spectroscopy, ultraviolet–visible spectroscopy, thermal analysis and elemental analysis. The crystals can be disassembled into individual molecular wires through cation exchange and subsequent ultrasound treatment, as visualized by atomic force microscopy and transmission electron microscopy. The ultrathin molecular wire-based material exhibits high activity as an acid catalyst, and the band gap of the molecular wire-based crystal is tunable by heat treatment. PMID:26139011

  13. Bioassays Based on Molecular Nanomechanics

    DOE PAGES

    Majumdar, Arun

    2002-01-01

    Recent experiments have shown that when specific biomolecular interactions are confined to one surface of a microcantilever beam, changes in intermolecular nanomechanical forces provide sufficient differential torque to bend the cantilever beam. This has been used to detect single base pair mismatches during DNA hybridization, as well as prostate specific antigen (PSA) at concentrations and conditions that are clinically relevant for prostate cancer diagnosis. Since cantilever motion originates from the free energy change induced by specific biomolecular binding, this technique is now offering a common platform for label-free quantitative analysis of protein-protein binding, DNA hybridization, DNA-protein interactions, and, in general, receptor-ligand interactions. Current work is focused on developing "universal microarrays" of microcantilever beams for high-throughput multiplexed bioassays.

  14. Algorithms on ensemble quantum computers.

    PubMed

    Boykin, P Oscar; Mor, Tal; Roychowdhury, Vwani; Vatan, Farrokh

    2010-06-01

    In ensemble (or bulk) quantum computation, all computations are performed on an ensemble of computers rather than on a single computer. Measurements of qubits in an individual computer cannot be performed; instead, only expectation values (over the complete ensemble of computers) can be measured. As a result of this limitation on the model of computation, many algorithms cannot be processed directly on such computers, and must be modified, as the common strategy of delaying the measurements usually does not resolve this ensemble-measurement problem. Here we present several new strategies for resolving this problem. Based on these strategies we provide new versions of some of the most important quantum algorithms, versions that are suitable for implementing on ensemble quantum computers, e.g., on liquid NMR quantum computers. These algorithms are Shor's factorization algorithm, Grover's search algorithm (with several marked items), and an algorithm for quantum fault-tolerant computation. The first two algorithms are simply modified using randomizing and sorting strategies. For the last algorithm, we develop a classical-quantum hybrid strategy for removing measurements. We use it to present a novel quantum fault-tolerant scheme. More explicitly, we present schemes for fault-tolerant measurement-free implementation of the Toffoli and σ_z^(1/4) gates, as these operations cannot be implemented "bitwise", and their standard fault-tolerant implementations require measurement.
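
    The ensemble-measurement restriction can be illustrated with a single-qubit density-matrix toy example: the bulk readout gives only expectation values such as Tr(ρ σz), never the ±1 outcome of any individual computer. This is a generic illustration, not code from the cited work.

    ```python
    import numpy as np

    # Pauli-Z observable and a single-qubit ensemble state (density matrix).
    sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)
    psi = np.array([np.sqrt(0.7), np.sqrt(0.3)], dtype=complex)   # |psi> = a|0> + b|1>
    rho = np.outer(psi, psi.conj())                               # pure-state density matrix

    # Ensemble (bulk) readout: only the average <sigma_z> over all computers is observed.
    expectation = np.real(np.trace(rho @ sigma_z))
    print("ensemble readout <sigma_z> =", expectation)            # 0.7 - 0.3 = 0.4

    # A projective measurement on one computer would instead yield +1 or -1 at random;
    # that per-copy information is unavailable in the bulk model.
    ```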

  15. Improved predictive mapping of indoor radon concentrations using ensemble regression trees based on automatic clustering of geological units.

    PubMed

    Kropat, Georg; Bochud, Francois; Jaboyedoff, Michel; Laedermann, Jean-Pascal; Murith, Christophe; Palacios Gruson, Martha; Baechler, Sébastien

    2015-09-01

    According to estimates, around 230 people die as a result of radon exposure in Switzerland. This public health concern makes reliable indoor radon prediction and mapping methods necessary in order to improve risk communication to the public. The aim of this study was to develop an automated method to classify lithological units according to their radon characteristics and to develop mapping and predictive tools in order to improve local radon prediction. About 240 000 indoor radon concentration (IRC) measurements in about 150 000 buildings were available for our analysis. The automated classification of lithological units was based on k-medoids clustering via pair-wise Kolmogorov distances between IRC distributions of lithological units. For IRC mapping and prediction we used random forests and Bayesian additive regression trees (BART). The automated classification groups lithological units well in terms of their IRC characteristics. In particular, the IRC differences in metamorphic rocks such as gneiss are well revealed by this method. The maps produced by random forests soundly represent the regional differences of IRCs in Switzerland and improve the spatial detail compared to existing approaches. We could explain 33% of the variations in IRC data with random forests. Additionally, the variable importance evaluated by random forests shows that building characteristics are less important predictors of IRCs than spatial/geological influences. BART could explain 29% of IRC variability and produced maps that indicate the prediction uncertainty. Ensemble regression trees are a powerful tool to model and understand the multidimensional influences on IRCs. Automatic clustering of lithological units complements this method by facilitating the interpretation of radon properties of rock types. This study provides an important element for radon risk communication. Future approaches should consider taking into account further variables like soil gas radon measurements as
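
    A minimal sketch of the regression-tree step, using scikit-learn's RandomForestRegressor on entirely synthetic predictors (coordinates, altitude, a lithology cluster label, a building category); the feature set, the synthetic response, and all parameter values are illustrative assumptions, not the study's actual data or configuration.

    ```python
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(2)
    n = 5000
    # Hypothetical predictors: easting, northing, altitude, lithology cluster, building type.
    X = np.column_stack([
        rng.uniform(0, 350, n),          # easting [km]
        rng.uniform(0, 220, n),          # northing [km]
        rng.uniform(200, 2500, n),       # altitude [m]
        rng.integers(0, 8, n),           # lithology cluster label
        rng.integers(0, 4, n),           # building category
    ])
    # Synthetic log indoor radon concentration with some spatial/geological signal.
    y = 4.0 + 0.3 * (X[:, 3] == 5) + 0.0005 * X[:, 2] + rng.normal(0, 0.6, n)

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    rf = RandomForestRegressor(n_estimators=300, min_samples_leaf=5, random_state=0)
    rf.fit(X_tr, y_tr)
    print("explained variance (R^2):", round(rf.score(X_te, y_te), 2))
    print("feature importances:", np.round(rf.feature_importances_, 2))
    ```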

  16. A hybrid model for PM₂.₅ forecasting based on ensemble empirical mode decomposition and a general regression neural network.

    PubMed

    Zhou, Qingping; Jiang, Haiyan; Wang, Jianzhou; Zhou, Jianling

    2014-10-15

    Exposure to high concentrations of fine particulate matter (PM₂.₅) can cause serious health problems because PM₂.₅ contains microscopic solid or liquid droplets that are sufficiently small to be ingested deep into human lungs. Thus, daily prediction of PM₂.₅ levels is notably important for regulatory plans that inform the public and restrict social activities in advance when harmful episodes are foreseen. A hybrid EEMD-GRNN (ensemble empirical mode decomposition-general regression neural network) model based on data preprocessing and analysis is first proposed in this paper for one-day-ahead prediction of PM₂.₅ concentrations. The EEMD part is utilized to decompose original PM₂.₅ data into several intrinsic mode functions (IMFs), while the GRNN part is used for the prediction of each IMF. The hybrid EEMD-GRNN model is trained using input variables obtained from a principal component regression (PCR) model to remove redundancy. These input variables accurately and succinctly reflect the relationships between PM₂.₅ and both air quality and meteorological data. The model is trained with data from January 1 to November 1, 2013 and is validated with data from November 2 to November 21, 2013 in Xi'an, China. The experimental results show that the developed hybrid EEMD-GRNN model outperforms a single GRNN model without EEMD, a multiple linear regression (MLR) model, a PCR model, and a traditional autoregressive integrated moving average (ARIMA) model. The hybrid model with fast and accurate results can be used to develop rapid air quality warning systems.
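
    The decompose-then-predict structure of the hybrid model can be sketched as below, assuming the PyEMD package (installed as EMD-signal) for the EEMD step and using scikit-learn's KernelRidge as a simple stand-in for the GRNN; the synthetic series, lag features, and parameter values are illustrative only.

    ```python
    import numpy as np
    from PyEMD import EEMD                         # assumed available: pip install EMD-signal
    from sklearn.kernel_ridge import KernelRidge   # stand-in for the GRNN used in the paper

    rng = np.random.default_rng(3)
    pm25 = 60 + 25 * np.sin(np.arange(300) / 12.0) + rng.normal(0, 8, 300)  # synthetic series

    # 1) Decompose the series into intrinsic mode functions (IMFs).
    imfs = EEMD(trials=50).eemd(pm25)

    def lagged(x, n_lags=5):
        """Build (samples, n_lags) inputs and next-step targets from one IMF."""
        X = np.column_stack([x[i:len(x) - n_lags + i] for i in range(n_lags)])
        return X, x[n_lags:]

    # 2) Fit one regressor per IMF and 3) sum the one-step-ahead predictions.
    forecast = 0.0
    for imf in imfs:
        X, y = lagged(imf)
        model = KernelRidge(kernel="rbf", alpha=1.0).fit(X[:-1], y[:-1])
        forecast += model.predict(imf[-5:].reshape(1, -1))[0]

    print("one-day-ahead PM2.5 forecast:", round(float(forecast), 1))
    ```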

  17. All-optical integrated logic operations based on chemical communication between molecular switches.

    PubMed

    Silvi, Serena; Constable, Edwin C; Housecroft, Catherine E; Beves, Jonathon E; Dunphy, Emma L; Tomasulo, Massimiliano; Raymo, Françisco M; Credi, Alberto

    2009-01-01

    Molecular logic gates process physical or chemical "inputs" to generate "outputs" based on a set of logical operators. We report the design and operation of a chemical ensemble in solution that behaves as integrated AND, OR, and XNOR gates with optical input and output signals. The ensemble is composed of a reversible merocyanine-type photoacid and a ruthenium polypyridine complex that functions as a pH-controlled three-state luminescent switch. The light-triggered release of protons from the photoacid is used to control the state of the transition-metal complex. Therefore, the two molecular switching devices communicate with one another through the exchange of ionic signals. By means of such a double (optical-chemical-optical) signal-transduction mechanism, inputs of violet light modulate a luminescence output in the red/far-red region of the visible spectrum. Nondestructive reading is guaranteed because the green light used for excitation in the photoluminescence experiments does not affect the state of the gate. The reset is thermally driven and, thus, does not involve the addition of chemicals and accumulation of byproducts. Owing to its reversibility and stability, this molecular device can afford many cycles of digital operation.

  18. An assessment of a North American Multi-Model Ensemble (NMME) based global drought early warning forecast system

    NASA Astrophysics Data System (ADS)

    Wood, E. F.; Yuan, X.; Sheffield, J.; Pan, M.; Roundy, J.

    2013-12-01

    One of the key recommendations of the WCRP Global Drought Information System (GDIS) workshop is to develop an experimental real-time global monitoring and prediction system. While great advances have been made in global drought monitoring based on satellite observations and model reanalysis data, global drought forecasting has lagged behind, in part due to limited skill in both climate forecast models and global hydrologic predictions. Having been working on drought monitoring and forecasting over the USA for more than a decade, the Princeton land surface hydrology group is now developing an experimental global drought early warning system that is based on multiple climate forecast models and a calibrated global hydrologic model. In this presentation, we will test its capability in seasonal forecasting of meteorological, agricultural and hydrologic droughts over global major river basins, using precipitation, soil moisture and streamflow forecasts respectively. Based on the joint probability distribution between observations using Princeton's global drought monitoring system and model hindcasts and real-time forecasts from the North American Multi-Model Ensemble (NMME) project, we (i) bias correct the monthly precipitation and temperature forecasts from multiple climate forecast models, (ii) downscale them to a daily time scale, and (iii) use them to drive the calibrated VIC model to produce global drought forecasts at a 1-degree resolution. A parallel run using the ESP forecast method, which is based on resampling historical forcings, is also carried out for comparison. Analysis is being conducted over global major river basins, with multiple drought indices that have different time scales and characteristics. The meteorological drought forecast does not have uncertainty from hydrologic models and can be validated directly against observations - making the validation an 'apples-to-apples' comparison. Preliminary results for the evaluation of meteorological drought onset
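
    One common way to implement the joint-distribution bias-correction step (i) is empirical quantile mapping, sketched below on synthetic monthly precipitation; the arrays and the simple empirical-CDF mapping are assumptions for illustration, not the Princeton system's exact procedure.

    ```python
    import numpy as np

    def quantile_map(forecast, hindcast, observed):
        """Empirical quantile mapping: replace each forecast value with the observed
        value at the same empirical quantile it occupies in the model hindcast."""
        hind_sorted = np.sort(hindcast)
        obs_sorted = np.sort(observed)
        # Quantile of each forecast value within the hindcast climatology.
        q = np.searchsorted(hind_sorted, forecast) / len(hind_sorted)
        q = np.clip(q, 0.0, 1.0)
        # Map that quantile onto the observed climatology.
        return np.quantile(obs_sorted, q)

    rng = np.random.default_rng(4)
    hindcast = rng.gamma(2.0, 40.0, 600)      # model monthly precipitation [mm], retrospective
    observed = rng.gamma(2.0, 55.0, 600)      # observed monthly precipitation [mm]
    forecast = rng.gamma(2.0, 40.0, 10)       # raw real-time ensemble members

    print("raw:      ", np.round(forecast, 1))
    print("corrected:", np.round(quantile_map(forecast, hindcast, observed), 1))
    ```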

  19. Diurnal Ensemble Surface Meteorology Statistics

    EPA Pesticide Factsheets

    Excel file containing diurnal ensemble statistics of 2-m temperature, 2-m mixing ratio and 10-m wind speed. This Excel file contains figures for Figure 2 in the paper and worksheets containing all statistics for the 14 members of the ensemble and a base simulation. This dataset is associated with the following publication: Gilliam, R., C. Hogrefe, J. Godowitch, S. Napelenok, R. Mathur, and S.T. Rao. Impact of inherent meteorology uncertainty on air quality model predictions. JOURNAL OF GEOPHYSICAL RESEARCH-ATMOSPHERES. American Geophysical Union, Washington, DC, USA, 120(23): 12,259–12,280, (2015).

  20. A stochastic ensemble-based model to predict crop water requirements from numerical weather forecasts and VIS-NIR high resolution satellite images in Southern Italy

    NASA Astrophysics Data System (ADS)

    Pelosi, Anna; Falanga Bolognesi, Salvatore; De Michele, Carlo; Medina Gonzalez, Hanoi; Villani, Paolo; D'Urso, Guido; Battista Chirico, Giovanni

    2015-04-01

    Irrigated agriculture is one of the biggest consumers of water in Europe, especially in southern regions, where it accounts for up to 70% of the total water consumption. The EU Common Agricultural Policy, combined with the Water Framework Directive, requires farmers and irrigation managers to substantially increase the efficiency of agricultural water use over the next decade. Ensemble numerical weather predictions can be valuable data for developing operational advisory irrigation services. We propose a stochastic ensemble-based model providing spatial and temporal estimates of crop water requirements, implemented within an advisory service offering detailed maps of irrigation water requirements and crop water consumption estimates, to be used by water irrigation managers and farmers. The stochastic model combines estimates of crop potential evapotranspiration retrieved from ensemble numerical weather forecasts (COSMO-LEPS, 16 members, 7 km resolution) and canopy parameters (LAI, albedo, fractional vegetation cover) derived from high resolution satellite images in the visible and near infrared wavelengths. The service provides users with daily estimates of crop water requirements for lead times up to five days. The temporal evolution of the crop potential evapotranspiration is simulated with autoregressive models. An ensemble Kalman filter is employed for updating model states by assimilating both ground-based meteorological variables (where available) and numerical weather forecasts. The model has been applied in the Campania region (Southern Italy), where a satellite-assisted irrigation advisory service has been operating since 2006. This work presents the results of the system performance for one year of experimental service. The results suggest that the proposed model can be an effective support for a sustainable use and management of irrigation water, under conditions of water scarcity and drought. Since the evapotranspiration term represents a staple
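
    The ensemble Kalman filter analysis step mentioned above can be written compactly as a stochastic (perturbed-observation) update; the state layout, dimensions, and numbers below are illustrative assumptions, not the advisory service's actual configuration.

    ```python
    import numpy as np

    def enkf_update(X, y, H, R, rng):
        """Stochastic EnKF analysis step.
        X : (n_state, n_members) forecast ensemble
        y : (n_obs,)             observation vector
        H : (n_obs, n_state)     observation operator
        R : (n_obs, n_obs)       observation-error covariance
        """
        n_members = X.shape[1]
        A = X - X.mean(axis=1, keepdims=True)                 # ensemble anomalies
        P = A @ A.T / (n_members - 1)                         # sample covariance
        K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)          # Kalman gain
        Y = y[:, None] + rng.multivariate_normal(
            np.zeros(len(y)), R, size=n_members).T            # perturbed observations
        return X + K @ (Y - H @ X)                            # analysis ensemble

    rng = np.random.default_rng(5)
    X = rng.normal(4.0, 1.0, size=(3, 16))   # e.g. ET0 state for 3 zones, 16 members [mm/day]
    H = np.array([[1.0, 0.0, 0.0]])          # only zone 1 is observed
    R = np.array([[0.2 ** 2]])
    y = np.array([5.1])                      # ground-station ET0 estimate [mm/day]

    Xa = enkf_update(X, y, H, R, rng)
    print("prior mean:", X.mean(axis=1).round(2), "-> posterior mean:", Xa.mean(axis=1).round(2))
    ```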

  1. Fragment Molecular Orbital method-based Molecular Dynamics (FMO-MD) as a simulator for chemical reactions in explicit solvation.

    PubMed

    Komeiji, Yuto; Ishikawa, Takeshi; Mochizuki, Yuji; Yamataka, Hiroshi; Nakano, Tatsuya

    2009-01-15

    Fragment Molecular Orbital-based Molecular Dynamics (FMO-MD, Komeiji et al., Chem Phys Lett 2003, 372, 342) is an ab initio MD method suitable for large molecular systems. Here, FMO-MD was implemented to conduct full quantum simulations of chemical reactions in explicit solvation. Several FMO-MD simulations were performed for a sphere of water to find a suitable simulation protocol. It was found that annealing of the initial configuration by classical MD brought the subsequent FMO-MD trajectory to faster stabilization, and also that the use of bond constraints in the FMO-MD heating stage effectively reduced the computation time. Then, the blue moon ensemble method (Sprik and Ciccotti, J Chem Phys 1998, 109, 7737) was implemented and was tested by calculating free energy profiles of the Menschutkin reaction (H3N + CH3Cl → [H3NCH3]+ + Cl-) in the presence and absence of the solvent water via FMO-MD. The obtained free energy profiles were consistent with the Hammond postulate in that stabilization of the product by the solvent, namely hydration of Cl-, shifted the transition state toward the reactant side. Based on these FMO-MD results, plans for further improvement of the method are discussed. Copyright 2008 Wiley Periodicals, Inc.
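
    The blue moon free-energy construction amounts to estimating the mean force at a series of fixed reaction-coordinate values and integrating it along the coordinate. The sketch below uses synthetic mean-force values and the simple potential-of-mean-force relation F(ξ) = -∫⟨f⟩ dξ', omitting the metric-tensor corrections of the full formalism.

    ```python
    import numpy as np

    # Hypothetical fixed values of the reaction coordinate xi (e.g. a bond-distance difference, Å).
    xi = np.linspace(-1.0, 1.0, 11)

    # Mean force along xi estimated from each constrained simulation window (synthetic values).
    rng = np.random.default_rng(6)
    mean_force = 40.0 * xi * (xi - 0.6) * (xi + 0.8) + rng.normal(0, 0.5, xi.size)  # kcal/mol/Å

    # Free energy profile by thermodynamic integration, F(xi) = -∫ <f> dxi' (up to a constant),
    # accumulated with the trapezoid rule between windows.
    free_energy = -np.concatenate(([0.0], np.cumsum(
        0.5 * (mean_force[1:] + mean_force[:-1]) * np.diff(xi))))

    for x, f in zip(xi, free_energy):
        print(f"xi = {x:+.1f}  F = {f:7.2f} kcal/mol")
    ```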

  2. A probabilistic approach of the Flash Flood Early Warning System (FF-EWS) in Catalonia based on radar ensemble generation

    NASA Astrophysics Data System (ADS)

    Velasco, David; Sempere-Torres, Daniel; Corral, Carles; Llort, Xavier; Velasco, Enrique

    2010-05-01

    probabilistic component to the FF-EWS. As a first step, we have incorporated the uncertainty in rainfall estimates and forecasts based on an ensemble of equiprobable rainfall scenarios. The presented study has focused on a number of rainfall events and the performance of the FF-EWS evaluated in terms of its ability to produce probabilistic hazard warnings for decision-making support.

  3. Coordination-Cluster-Based Molecular Magnetic Refrigerants.

    PubMed

    Zhang, Shaowei; Cheng, Peng

    2016-08-01

    Coordination polymers serving as molecular magnetic refrigerants have been attracting great interest. In particular, coordination cluster compounds, which show clear advantages as cryogenic magnetic refrigerants, have attracted growing attention in the last five years. Herein, we mainly focus on depicting aspects of the syntheses, structures, and magnetothermal properties of coordination clusters that serve as magnetic refrigerants on account of the magnetocaloric effect. The documented molecular magnetic refrigerants are classified into two primary categories according to the types of metal centers, namely, homo- and heterometallic clusters. Each section is further divided into several subgroups based on the metal nuclearity and dimensionality, including discrete molecular clusters and those with extended structures constructed from molecular clusters. The objective is to present a rough overview of recent progress in coordination-cluster-based molecular magnetic refrigerants and provide a tutorial for researchers who are interested in the field. © 2016 The Chemical Society of Japan & Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.

  4. A fuzzy integral method based on the ensemble of neural networks to analyze fMRI data for cognitive state classification across multiple subjects.

    PubMed

    Cacha, L A; Parida, S; Dehuri, S; Cho, S-B; Poznanski, R R

    2016-12-01

    The huge number of voxels in fMRI over time poses a major challenge for effective analysis. Fast, accurate, and reliable classifiers are required for estimating the decoding accuracy of brain activities. Although machine-learning classifiers seem promising, individual classifiers have their own limitations. To address this limitation, the present paper proposes a method based on an ensemble of neural networks to analyze fMRI data for cognitive state classification across multiple subjects. In addition, the fuzzy integral (FI) approach has been employed as an efficient tool for combining the different classifiers. The FI approach led to the development of a classifier ensemble technique that performs better than any single classifier by reducing misclassification, bias, and variance. The proposed method successfully classified the different cognitive states for multiple subjects with high classification accuracy. Comparison of the performance of the ensemble neural network method with that of the individual neural networks strongly points toward the usefulness of the proposed method.
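
    A minimal sketch of fusing classifier confidences with a fuzzy (Choquet) integral: the fuzzy measure over classifier subsets is written out by hand here purely for illustration, whereas in practice (and in the paper) it would be derived from the classifiers' performance.

    ```python
    def choquet(scores, mu):
        """Choquet fuzzy integral of classifier scores w.r.t. fuzzy measure `mu`.
        `scores` maps classifier name -> confidence in [0, 1];
        `mu` maps frozensets of classifier names -> measure in [0, 1]."""
        names = sorted(scores, key=scores.get)            # ascending by score
        total, prev = 0.0, 0.0
        for i, name in enumerate(names):
            coalition = frozenset(names[i:])              # classifiers scoring at least this high
            total += (scores[name] - prev) * mu[coalition]
            prev = scores[name]
        return total

    # Hypothetical per-class confidences of three neural networks for one fMRI trial,
    # and a hand-specified fuzzy measure expressing how much each subset is trusted.
    scores = {"nn1": 0.9, "nn2": 0.6, "nn3": 0.4}
    mu = {
        frozenset(): 0.0,
        frozenset({"nn1"}): 0.5, frozenset({"nn2"}): 0.4, frozenset({"nn3"}): 0.3,
        frozenset({"nn1", "nn2"}): 0.8, frozenset({"nn1", "nn3"}): 0.7,
        frozenset({"nn2", "nn3"}): 0.6, frozenset({"nn1", "nn2", "nn3"}): 1.0,
    }
    print("fused confidence:", round(choquet(scores, mu), 3))    # 0.4*1.0 + 0.2*0.8 + 0.3*0.5
    ```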

  5. Graphene-based nanoprobes for molecular diagnostics.

    PubMed

    Chen, Shixing; Li, Fuwu; Fan, Chunhai; Song, Shiping

    2015-10-07

    In recent years, graphene has received widespread attention owing to its extraordinary electrical, chemical, optical, mechanical and structural properties. Lately, considerable interest has been focused on exploring the potential applications of graphene in life sciences, particularly in disease-related molecular diagnostics. In particular, the coupling of functional molecules with graphene as a nanoprobe offers an excellent platform to realize the detection of biomarkers, such as nucleic acids, proteins and other bioactive molecules, with high performance. This article reviews emerging graphene-based nanoprobes in electrical, optical and other assay methods and their application in various strategies of molecular diagnostics. In particular, this review focuses on the construction of graphene-based nanoprobes and their special advantages for the detection of various bioactive molecules. Properties of graphene-based materials and their functionalization are also comprehensively discussed in view of the development of nanoprobes. Finally, future challenges and perspectives of graphene-based nanoprobes are discussed.

  6. A simulation study of the ensemble-based data assimilation of satellite-borne lidar aerosol observations

    NASA Astrophysics Data System (ADS)

    Sekiyama, T. T.; Tanaka, T. Y.; Miyoshi, T.

    2012-07-01

    A four-dimensional ensemble-based data assimilation system was assessed by observing system simulation experiments (OSSEs), in which the CALIPSO satellite was emulated via simulated satellite-borne lidar aerosol observations. Its performance over a three-month period was validated according to the Method for Object-based Diagnostic Evaluation (MODE), using aerosol optical thickness (AOT) distributions in East Asia as the objects of analysis. Consequently, this data assimilation system demonstrated the ability to produce better analyses of sulfate and dust aerosols in comparison to a free-running simulation model. For example, the mean centroid distance (from the truth) over a three-month collection period of aerosol plumes was improved from 2.15 grids (≈ 600 km) to 1.45 grids (≈ 400 km) for sulfate aerosols and from 2.59 grids (≈ 750 km) to 1.14 grids (≈ 330 km) for dust aerosols; the mean area ratio (to the truth) over a three-month collection period of aerosol plumes was improved from 0.49 to 0.76 for sulfate aerosols and from 0.51 to 0.72 for dust aerosols. The satellite-borne lidar data assimilation successfully improved the aerosol plume analysis and the dust emission estimation in the OSSEs. These results present great possibilities for the beneficial use of lidar data, whose distribution is vertically/temporally dense but horizontally sparse, when coupled with a four-dimensional data assimilation system. In addition, sensitivity tests were conducted, and their results indicated that the degree of freedom to control the aerosol variables was probably limited in the data assimilation because the meteorological field in the system was constrained to weather reanalysis using Newtonian relaxation. Further improvements to the aerosol analysis can be performed through the simultaneous assimilation of aerosol observations with meteorological observations. The OSSE results strongly suggest that the use of real CALIPSO data will have a beneficial effect on
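
    The two object-based scores quoted above (centroid distance and area ratio) can be computed from thresholded AOT fields as in the sketch below; the threshold value and the synthetic plumes are illustrative assumptions.

    ```python
    import numpy as np

    def plume_object(aot, threshold=0.3):
        """Return the centroid (row, col) and area of the AOT 'object' above threshold."""
        mask = aot > threshold
        rows, cols = np.nonzero(mask)
        centroid = np.array([rows.mean(), cols.mean()])
        return centroid, mask.sum()

    rng = np.random.default_rng(7)
    yy, xx = np.mgrid[0:40, 0:40]
    truth    = np.exp(-((xx - 20) ** 2 + (yy - 18) ** 2) / 50.0)          # "nature run" plume
    analysis = np.exp(-((xx - 22) ** 2 + (yy - 19) ** 2) / 60.0) + rng.normal(0, 0.02, (40, 40))

    c_true, a_true = plume_object(truth)
    c_ana, a_ana = plume_object(analysis)

    print("centroid distance [grids]:", round(float(np.linalg.norm(c_ana - c_true)), 2))
    print("area ratio (analysis/truth):", round(float(a_ana) / float(a_true), 2))
    ```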

  7. Probabilistic multimodel ensemble prediction of decadal variability of East Asian surface air temperature based on IPCC-AR5 near-term climate simulations

    NASA Astrophysics Data System (ADS)

    Wang, Jia; Zhi, Xiefei; Chen, Yuwen

    2013-07-01

    Based on near-term climate simulations for IPCC-AR5 (The Fifth Assessment Report), probabilistic multimodel ensemble prediction (PMME) of decadal variability of surface air temperature in East Asia (20°-50°N, 100°-145°E) was conducted using the multivariate Gaussian ensemble kernel dressing (GED) methodology. The ensemble system exhibited high performance in hindcasting the decadal (1981-2010) mean and trend of temperature anomalies with respect to 1961-90, with an RPS of 0.94 and 0.88, respectively. The interpretation of PMME for future decades (2006-35) over East Asia was made on the basis of the bivariate probability density of the mean and trend. The results showed that, under the RCP4.5 (Representative Concentration Pathway 4.5 W m^-2) scenario, the annual mean temperature increases on average by about 1.1-1.2 K and the temperature trend reaches 0.6-0.7 K (30 yr)^-1. For both quantities, the spatial pattern indicated that the temperature increase will be less intense in the south. While the temperature increase in terms of the 30-yr mean was found to be virtually certain, the results for the 30-yr trend showed an almost 25% chance of a negative value. This indicated that, using a multimodel ensemble system, even if a longer-term warming exists for 2006-35 over East Asia, the trend for temperature may still take a negative value. The temperature increase was also found to be affected by seasonal variability, being more intense (mainly) in autumn, faster in summer to the west of 115°E, and faster still in autumn to the east of 115°E.
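
    A minimal illustration of the kernel-dressing idea behind GED: each ensemble member is dressed with a Gaussian kernel and the equal-weight mixture gives a continuous predictive distribution from which event probabilities are read. The member values and the simple Silverman-type kernel width are assumptions, not the paper's fitted dressing.

    ```python
    import numpy as np
    from scipy.stats import norm

    # Hypothetical multi-model projections of the 2006-35 mean temperature anomaly [K].
    members = np.array([0.8, 1.0, 1.1, 1.2, 1.25, 1.3, 1.45, 1.6])

    # Dress each member with a Gaussian kernel (width from a simple Silverman-type rule).
    sigma = 1.06 * members.std(ddof=1) * members.size ** (-1 / 5)

    def predictive_cdf(x):
        """CDF of the dressed mixture: equal-weight average of the member kernels."""
        return norm.cdf(x, loc=members, scale=sigma).mean()

    print("P(mean anomaly > 1.2 K):", round(1 - predictive_cdf(1.2), 2))
    print("P(mean anomaly < 1.0 K):", round(predictive_cdf(1.0), 2))
    ```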

  8. A benchmark for reaction coordinates in the transition path ensemble

    PubMed Central

    2016-01-01

    The molecular mechanism of a reaction is embedded in its transition path ensemble, the complete collection of reactive trajectories. Utilizing the information in the transition path ensemble alone, we developed a novel metric, which we termed the emergent potential energy, for distinguishing reaction coordinates from the bath modes. The emergent potential energy can be understood as the average energy cost for making a displacement of a coordinate in the transition path ensemble. Whereas displacing a bath mode incurs essentially no cost, moving the reaction coordinate is energetically costly. Based on some general assumptions about the behaviors of reaction and bath coordinates in the transition path ensemble, we proved theoretically with statistical mechanics that the emergent potential energy could serve as a benchmark of reaction coordinates and demonstrated its effectiveness by applying it to a prototypical system of biomolecular dynamics. Using the emergent potential energy as guidance, we developed a committor-free and intuition-independent method for identifying reaction coordinates in complex systems. We expect this method to be applicable to a wide range of reaction processes in complex biomolecular systems. PMID:27059559

  9. Incorporating abundance information and guiding variable selection for climate-based ensemble forecasting of species' distributional shifts.

    PubMed

    Tanner, Evan P; Papeş, Monica; Elmore, R Dwayne; Fuhlendorf, Samuel D; Davis, Craig A

    2017-01-01

    Ecological niche models (ENMs) have increasingly been used to estimate the potential effects of climate change on species' distributions worldwide. Recently, predictions of species abundance have also been obtained with such models, though knowledge about the climatic variables affecting species abundance is often lacking. To address this, we used a well-studied guild (temperate North American quail) and the Maxent modeling algorithm to compare model performance of three variable selection approaches: correlation/variable contribution (CVC), biological (i.e., variables known to affect species abundance), and random. We then applied the best approach to forecast potential distributions, under future climatic conditions, and analyze future potential distributions in light of available abundance data and presence-only occurrence data. To estimate species' distributional shifts we generated ensemble forecasts using four global circulation models, four representative concentration pathways, and two time periods (2050 and 2070). Furthermore, we present distributional shifts where 75%, 90%, and 100% of our ensemble models agreed. The CVC variable selection approach outperformed our biological approach for four of the six species. Model projections indicated species-specific effects of climate change on future distributions of temperate North American quail. The Gambel's quail (Callipepla gambelii) was the only species predicted to gain area in climatic suitability across all three scenarios of ensemble model agreement. Conversely, the scaled quail (Callipepla squamata) was the only species predicted to lose area in climatic suitability across all three scenarios of ensemble model agreement. Our models projected future loss of areas for the northern bobwhite (Colinus virginianus) and scaled quail in portions of their distributions which are currently areas of high abundance. Climatic variables that influence local abundance may not always scale up to influence species
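
    The ensemble-agreement step can be sketched as a per-cell vote over stacked binary suitability maps, keeping cells where at least 75%, 90%, or 100% of the projections agree; the synthetic maps below stand in for the Maxent outputs.

    ```python
    import numpy as np

    rng = np.random.default_rng(8)
    n_models, ny, nx = 32, 50, 60                      # e.g. 4 GCMs x 4 RCPs x 2 periods
    current = rng.random((ny, nx)) > 0.5               # present-day binary suitability
    future = rng.random((n_models, ny, nx)) > 0.45     # per-model future binary suitability

    gain = future & ~current                           # newly suitable under each model
    loss = ~future & current                           # suitability lost under each model

    gain_frac = gain.mean(axis=0)                      # fraction of models predicting gain
    loss_frac = loss.mean(axis=0)

    for level in (0.75, 0.90, 1.00):
        print(f"agreement >= {int(level * 100)}%: "
              f"gain cells = {(gain_frac >= level).sum()}, "
              f"loss cells = {(loss_frac >= level).sum()}")
    ```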

  10. Incorporating abundance information and guiding variable selection for climate-based ensemble forecasting of species' distributional shifts

    PubMed Central

    2017-01-01

    Ecological niche models (ENMs) have increasingly been used to estimate the potential effects of climate change on species’ distributions worldwide. Recently, predictions of species abundance have also been obtained with such models, though knowledge about the climatic variables affecting species abundance is often lacking. To address this, we used a well-studied guild (temperate North American quail) and the Maxent modeling algorithm to compare model performance of three variable selection approaches: correlation/variable contribution (CVC), biological (i.e., variables known to affect species abundance), and random. We then applied the best approach to forecast potential distributions, under future climatic conditions, and analyze future potential distributions in light of available abundance data and presence-only occurrence data. To estimate species’ distributional shifts we generated ensemble forecasts using four global circulation models, four representative concentration pathways, and two time periods (2050 and 2070). Furthermore, we present distributional shifts where 75%, 90%, and 100% of our ensemble models agreed. The CVC variable selection approach outperformed our biological approach for four of the six species. Model projections indicated species-specific effects of climate change on future distributions of temperate North American quail. The Gambel’s quail (Callipepla gambelii) was the only species predicted to gain area in climatic suitability across all three scenarios of ensemble model agreement. Conversely, the scaled quail (Callipepla squamata) was the only species predicted to lose area in climatic suitability across all three scenarios of ensemble model agreement. Our models projected future loss of areas for the northern bobwhite (Colinus virginianus) and scaled quail in portions of their distributions which are currently areas of high abundance. Climatic variables that influence local abundance may not always scale up to influence

  11. Instantaneous phase difference analysis between thoracic and abdominal movement signals based on complementary ensemble empirical mode decomposition.

    PubMed

    Chen, Ya-Chen; Hsiao, Tzu-Chien

    2016-10-06

    Thoracoabdominal asynchrony is often used to discriminate respiratory diseases in clinical practice. Conventionally, Lissajous figure analysis is the most frequently used method for estimating the phase difference in thoracoabdominal asynchrony. However, the temporal resolution of the produced results is low and the estimation error increases when the signals are not sinusoidal. Other previous studies have reported time-domain procedures with the use of band-pass filters for phase-angle estimation. Nevertheless, the band-pass filters need calibration for phase-delay elimination. To improve the estimation, we propose a novel method (named instantaneous phase difference) that is based on complementary ensemble empirical mode decomposition for estimating the instantaneous phase relation between measured thoracic wall movement and abdominal wall movement. To validate the proposed method, experiments on simulated time series and human-subject respiratory data with two breathing types (i.e., thoracic breathing and abdominal breathing) were conducted. The latest version of Lissajous figure analysis and an automatic phase-estimation procedure were used for comparison. The simulation results show that the standard deviations of the proposed method were lower than those of the two other conventional methods. The proposed method performed more accurately than the two conventional methods. For the human-subject respiratory data, the results of the proposed method are in line with those in the literature, and the correlation analysis reveals that they were positively correlated with the results generated by the two conventional methods. Furthermore, the standard deviation of the proposed method was also the smallest. To summarize, this study proposes a novel method for estimating instantaneous phase differences. According to the findings from both the simulation and human-subject data, our approach was demonstrated to be effective. The method offers the following advantages: (1) improves the temporal
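
    A compact sketch of the instantaneous-phase-difference computation: once the respiratory component of each signal is isolated (by CEEMD in the paper, by a clean simulated sinusoid here), the Hilbert transform yields an analytic signal whose unwrapped angle is the instantaneous phase, and the thoracoabdominal lag is the phase difference. The sampling rate, breathing frequency, and 40-degree lag are assumptions for illustration.

    ```python
    import numpy as np
    from scipy.signal import hilbert

    fs, seconds = 50, 30
    t = np.arange(fs * seconds) / fs

    # Simulated thoracic and abdominal wall movements, breathing at ~0.25 Hz,
    # with the abdomen lagging the thorax by 40 degrees plus a little noise.
    rng = np.random.default_rng(9)
    thorax = np.sin(2 * np.pi * 0.25 * t)
    abdomen = np.sin(2 * np.pi * 0.25 * t - np.deg2rad(40)) + 0.05 * rng.normal(size=t.size)

    # Instantaneous phase of each signal from the analytic (Hilbert-transformed) signal.
    phase_thorax = np.unwrap(np.angle(hilbert(thorax)))
    phase_abdomen = np.unwrap(np.angle(hilbert(abdomen)))

    ipd = np.rad2deg(phase_thorax - phase_abdomen)      # instantaneous phase difference [deg]
    print("median IPD:", round(float(np.median(ipd)), 1), "deg")   # close to 40 deg
    ```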

  12. Probabilistic Forecast for 21st Century Climate Based on an Ensemble of Simulations using a Business-As-Usual Scenario

    NASA Astrophysics Data System (ADS)

    Scott, J. R.; Forest, C. E.; Sokolov, A. P.; Dutkiewicz, S.

    2011-12-01

    The behavior of the climate system is examined in an ensemble of runs using an Earth System Model of intermediate complexity. Climate "parameters" varied are the climate sensitivity, the aerosol forcing, and the strength of ocean heat uptake. Variations in the latter were accomplished by changing the strength of the oceans' background vertical mixing. While climate sensitivity and aerosol forcing can be varied over rather wide ranges, it is more difficult to create such variation in heat uptake while maintaining a realistic overturning ocean circulation. Therefore, separate ensembles were carried out for a few values of the vertical diffusion coefficient. Joint probability distributions for climate sensitivity and aerosol forcing are constructed by comparing results from 20th century simulations with available observational data. These distributions are then used to generate ensembles of 21st century simulations; results allow us to construct probabilistic distributions for changes in important climate change variables such as surface air temperature, sea level rise, and magnitude of the AMOC. Changes in the rate of air-sea flux of CO2 and the export of carbon into the deep ocean are also examined.